Install Nvidia driver, CUDA, and CuDNN

3 minute read

This post is to record the best way I found to install Nvidia driver, CUDA and CuDNN on Ubuntu. It is probably more useful for updating those softwares than installing them for the first time. From time to time, we find ourselves in a situation where we need to update them because some other dependencies have moved to the next versions. For example, I just found out the nightly builds of Tensorflow 2.0 RC have moved to CUDA 10-0-0 from 9 as in the official release versions. And there are other times I would like to update the libOpenCL library. So this is just to help me remember how to do it.

I originally learned these things as part of building my own Linux desktop workstation. I followed the excellent tutorial at here, though my actual build is pretty different from the one therein. Lately I found the part of installing Nvidia drivers and CUDA softwares isn’t quite the best in my opinion. Instead of downloading files and installing them manually, a better approach would be to use apt. It is not only easier to install, but also more convenient to manage. Different versions of CUDA can be installed in /usr/local/ and a symlink is created for you to the one installed last. You can update the symlink the one you want to use and always have the environment variables like PATH and LD_LIBRARY_PATH point to the same symlink. However, I would recommend only keeping one version on your system to avoid unexpected/buggy behavior.

Remove existing drivers/CUDA

So to begin with, remove all the previously installed Nvidia softwares if you have installed any. I tried to install new versions of CUDA without removing older versions and got errors like

dpkg: error processing archive /var/cache/apt/archives/nvidia-418_418.87.00-0ubuntu1_amd64.deb (--unpack)

You can probably get away with this particular error using the following command

sudo dpkg -i --force-overwrite /tmp/apt-dpkg-install-IyEgEg/45-nvidia-418_418.87.00-0ubuntu1_amd64.deb

In fact, it is necessary to use it when you do get the above error since at that point the apt install system is stuck in a bad install. Using

sudo apt --fix-broken install

as suggested by apt won’t fix the problem. Once you fix this problem (or better still, directly proceed to the following uninstallation), you can use the following to remoe all existing Nvidia softwares

sudo apt remove --purge "nvidia*"
sudo apt autoremove

and reboot the system.

Install new versions of drivers/CUDA

If it is not in your APT repository yet, add the following source for drivers

sudo add-apt-repository ppa:graphics-drivers

and then

sudo apt update

You can search for the drivers available by

apt search nvidia

The correct/latest version can be checked here. For me, the current long lived branch version is 430 so I installed

sudo apt install nvidia-driver-430

I first tried nvidia-418 since that seems to be the binary proprietary version while nvidia-driver-430 is the open source metapackage version, but 418 makes the system graphics clicky so I changed to 430, which seems to be much better. After that, reboot the system and make sure that the new driver is running by checking with

nvidia-smi

Then we can install CUDA and CuDNN. Make sure that you know the right version you want to install. For example, the latest CUDA version is 10-1 while the current version Tensorflow 2.0 RC looks for 10-0. Then instal them using

sudo apt install cuda-10-0
sudo apt install libcudnn7

Edit 09/24/2019: In a recent reinstall of cuda-10-0, I had trouble installing cuda-10-0, it always complains that the --configure step for some libraries such as cuda-runtime-10-0, cuda-libraries-dev-10-0 cannot be completed. The reason provided was that cuda-npp-10-0 was not installed. But it is installed. The way I solved it is to simply clean the cache and reinstall it. After that we can manually trigue the configuration process and everything becomes sane again.

sudo apt clean
sudo apt install --reinstall cuda-npp-10-0
sudo dpkg --configure -a

Last but not least, don’t forget to add the right paths the your PATH and LD_LIBRARY_PATH environment variables, such /usr/loca/cuda/bin to PATH and /usr/local/cuda/lib64 to LD_LIBRARY_PATH. Then you are good to go ;-)

Tags:

Categories:

Updated:

Comments