I've been fascinated with the idea of running AI tooling at home, so I set up a home server to do that. I use Proxmox VE as my virtualization platform and pass an Nvidia GPU through to an Ubuntu VM. Here are the steps I took to set up my machine.
BIOS Settings
The first thing I had to do was make sure my BIOS was ready to support this configuration. I use an AMD AM4 CPU with a Gigabyte motherboard, so your requirements may differ, but here are the four settings I had to enable:
- AMD-V (SVM Mode)
This enables AMD's hardware virtualization extensions, which are essential for running virtual machines efficiently. It lets the hypervisor (in this case, Proxmox) use the CPU's virtualization features, which improves VM performance.
- IOMMU (Input/Output Memory Management Unit)
IOMMU is critical for GPU passthrough. It allows direct mapping of I/O devices (like your Nvidia GPU) to guest VMs, isolating them from the host system. This enables the VM to access the GPU directly, providing near-native performance.
- ACS (Access Control Services)
ACS is part of the PCIe specification that helps with device isolation. It's crucial in multi-GPU setups or complex PCIe configurations to ensure that devices can be isolated appropriately for passthrough.
- Above 4G Decoding
This setting allows the system to map PCIe devices above the 4GB memory address space. High-end GPUs and other PCIe devices often need it to function properly, especially in systems with large amounts of RAM. Enabling it can prevent issues with GPU passthrough on some systems.
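As a first sanity check after changing the BIOS settings, you can look for the virtualization flag the CPU advertises to Linux. The sketch below is my own helper, not part of Proxmox: the function name `virt_flag` and its optional file argument are made up for illustration. It only confirms the CPU advertises the feature, so treat a positive result as a hint rather than proof that the firmware switch is on.

```bash
# Report whether the CPU advertises AMD-V ("svm") or Intel VT-x ("vmx").
# The optional argument (a cpuinfo-style file) is a hypothetical convenience
# for trying the function against a saved copy; it defaults to /proc/cpuinfo.
virt_flag() {
  cpuinfo=${1:-/proc/cpuinfo}
  if grep -qw svm "$cpuinfo"; then
    echo "AMD-V (svm) available"
  elif grep -qw vmx "$cpuinfo"; then
    echo "Intel VT-x (vmx) available"
  else
    echo "no hardware virtualization flag found"
    return 1
  fi
}

# Usage on the Proxmox host:
#   virt_flag
```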
Proxmox Configurations
Enable IOMMU
- Edit the GRUB configuration
vim /etc/default/grub
- Modify the GRUB_CMDLINE_LINUX_DEFAULT line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
- Update GRUB and reboot:
update-grub
reboot
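After the reboot, it is worth confirming that the kernel actually built IOMMU groups before going further. Here is a minimal sketch that walks the standard sysfs directory `/sys/kernel/iommu_groups`. The function name `list_iommu_groups` and its optional directory argument are my own additions for illustration.

```bash
# Print "group <n>: <pci-address>" for every device the kernel placed in an
# IOMMU group. An empty result usually means the IOMMU is not active.
# The optional argument is the sysfs directory to scan.
list_iommu_groups() {
  root=${1:-/sys/kernel/iommu_groups}
  for dev in "$root"/*/devices/*; do
    [ -e "$dev" ] || continue          # glob did not match anything
    group=${dev%/devices/*}            # strip "/devices/<address>"
    group=${group##*/}                 # keep only the group number
    echo "group $group: ${dev##*/}"
  done
}

# Usage on the Proxmox host:
#   list_iommu_groups
```

Ideally the GPU sits in a group of its own, or shares one only with its own audio function. If it shares a group with unrelated devices, passthrough tends to get more complicated.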
Identify GPU PCI IDs
- Run the following command to identify the PCI IDs for your GPU:
lspci -nn | grep NVIDIA
0b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102 [GeForce RTX 3090] [10de:2204] (rev a1)
0b:00.1 Audio device [0403]: NVIDIA Corporation GA102 High Definition Audio Controller [10de:1aef] (rev a1)
- Note the vendor:device ID for your GPU, shown in square brackets near the end of the line (e.g. 10de:2204)
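If you would rather not copy the IDs by hand, the bracketed vendor:device pairs can be pulled out of the lspci output with a small filter. This is a sketch under my own assumptions: the function name `nvidia_ids` is made up, and it collects every NVIDIA function it sees, including the GPU's audio controller, which goes beyond the single ID used in the steps here.

```bash
# Read `lspci -nn` output on stdin and print the vendor:device IDs of all
# NVIDIA functions as one comma-separated list.
nvidia_ids() {
  grep -i nvidia |
    sed -nE 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/p' |  # last [xxxx:xxxx] on each line
    paste -sd, -                                          # join the lines with commas
}

# Usage on the Proxmox host:
#   lspci -nn | nvidia_ids
```

With the two lines of lspci output shown above, this prints 10de:2204,10de:1aef.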
Load Required Modules
- Edit the modules file:
vim /etc/modules
- Add these lines:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
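If you script your host setup, the same edit can be made without opening an editor. The helper below is a hypothetical sketch (the name `add_modules` is mine). It appends each module name only when that exact line is missing, so running it twice does not create duplicates.

```bash
# Append each module name to the given file unless an identical line is
# already there.
add_modules() {
  file=$1
  shift
  for mod in "$@"; do
    # -x matches whole lines only, -F treats the name as a fixed string
    grep -qxF "$mod" "$file" 2>/dev/null || echo "$mod" >> "$file"
  done
}

# Usage on the Proxmox host (as root):
#   add_modules /etc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd
```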
Blacklist NVIDIA Drivers
- Create a new blacklist file:
vim /etc/modprobe.d/blacklist.conf
- Add these lines:
blacklist nvidia
blacklist nouveau
Configure VFIO
- Create a new VFIO configuration file:
vim /etc/modprobe.d/vfio.conf
- Add this line, replacing 10de:2204 with your GPU's vendor and device IDs:
options vfio-pci ids=10de:2204
Update Initramfs
- Rebuild the initramfs and reboot so the module, blacklist, and VFIO changes take effect:
update-initramfs -u
reboot
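Once the host is back up after a reboot, you can check that the GPU is claimed by vfio-pci rather than an Nvidia driver. The sketch below filters the "Kernel driver in use" line that `lspci -k` prints for a bound device. The function name `driver_in_use` is my own, and 0b:00 is the GPU address from my lspci output, so yours will likely differ.

```bash
# Read `lspci -nnk` output on stdin and print the kernel driver bound to
# each device, one per line. Devices with no driver bound print nothing.
driver_in_use() {
  sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Usage on the Proxmox host:
#   lspci -nnk -s 0b:00 | driver_in_use
```

If this prints nouveau or nvidia instead of vfio-pci, the blacklist or the IDs in vfio.conf are the first things I would recheck.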
Configure the VM in Proxmox
Create a new VM in Proxmox
- Configure and install for Ubuntu 24.04
- Set the machine type to q35 so the VM can use PCI-Express devices
Configure the VM's hardware configuration
- Add PCI Device
- Enable the ROM-Bar and PCI Express
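The same hardware settings can also be applied from the Proxmox shell with `qm set`. As a cautious sketch, the helper below only prints the command so you can review it first. The function name `passthrough_cmd` and the VM ID 100 are hypothetical, and the address 0000:0b:00.0 assumes the GPU sits where my lspci output showed it.

```bash
# Print the qm command that mirrors the GUI steps: q35 machine type plus a
# PCI passthrough entry with PCI-Express and ROM-Bar enabled.
# $1 = VM ID, $2 = host PCI address of the GPU
passthrough_cmd() {
  echo "qm set $1 --machine q35 --hostpci0 $2,pcie=1,rombar=1"
}

# Example with a hypothetical VM ID of 100:
#   passthrough_cmd 100 0000:0b:00.0
# Review the printed command, then run it on the Proxmox host.
```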
Boot the machine up
Configure the Ubuntu Virtual Machine
Install the Nvidia drivers
sudo apt update && sudo apt dist-upgrade -y
sudo apt install nvidia-driver-535 nvidia-utils-535
Reboot the VM
Verify the GPU Passthrough
Check if the GPU is recognized
nvidia-smi
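You should see a table that lists the GPU (in my case the GeForce RTX 3090) along with the driver version and CUDA version. If the command reports that it cannot communicate with the driver, the passthrough or the driver install did not work.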
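For a scripted check instead of reading the table by eye, nvidia-smi can print selected fields as CSV. The wrapper below is a sketch: the function name `gpu_summary` and its optional argument (the name of the binary to call) are my own additions, while the `--query-gpu` and `--format` options are standard nvidia-smi flags.

```bash
# Print one CSV line per GPU: name, driver version, total memory.
# The optional argument names the nvidia-smi binary to call.
gpu_summary() {
  smi=${1:-nvidia-smi}
  if ! command -v "$smi" >/dev/null 2>&1; then
    echo "$smi not found: is the driver installed?" >&2
    return 1
  fi
  "$smi" --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

# Usage inside the Ubuntu VM:
#   gpu_summary
```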
Install Nvidia machine learning libraries
sudo apt install nvidia-cuda-toolkit