Running ollama on RX 6600

Why Local LLMs?

I believe local LLMs are the better future, primarily for privacy reasons.

Setup

  • OS: NixOS Unstable
  • CPU: R9 7950X
  • RAM: 64 GB @ 6000 MHz
  • GPU: RX 6600

Configuration

As seen in my configuration.nix, I have ollama enabled as a service. The problem is that the service does not automatically use my GPU. I could dig up how to pass the required environment variable through the NixOS module, but it is easier to just run ollama from the command line:

OLLAMA_HOST="127.0.0.1:11444" HSA_OVERRIDE_GFX_VERSION=10.3.0 ollama serve
  • OLLAMA_HOST moves this instance to port 11444 so it does not conflict with the service already listening on the default port (11434).
  • HSA_OVERRIDE_GFX_VERSION=10.3.0 is the key variable: it makes ROCm treat the RX 6600 as a supported gfx1030 card, which is what lets ollama use the GPU.
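
As a quick sanity check that the server came up on the non-default port, you can hit ollama's standard /api/tags endpoint, which just lists locally pulled models (the port is the one chosen above):

curl http://127.0.0.1:11444/api/tags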

Usage

Simply run

OLLAMA_HOST="127.0.0.1:11444" ollama run llama2:latest

Substitute your model of choice, of course. If you have radeontop installed, you should see VRAM usage spike to around 60%.
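
You can also talk to the server over its HTTP API on the same port, which is handy for scripts. A minimal sketch using ollama's standard /api/generate endpoint (the prompt is just an example; the response streams back as JSON lines):

curl http://127.0.0.1:11444/api/generate -d '{"model": "llama2:latest", "prompt": "Why is the sky blue?"}'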

Models that fit in VRAM

Here are the models I tested that fit in the 8 GB of VRAM on the RX 6600 as of [2024-04-13 Sat]:

  • codegemma:7b
  • llama2:7b
  • zephyr:7b
  • gemma:instruct
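
If you want to grab all of these in one go, a small shell loop over ollama pull does the trick (purely a convenience sketch; adjust the model list to taste):

for m in codegemma:7b llama2:7b zephyr:7b gemma:instruct; do
  OLLAMA_HOST="127.0.0.1:11444" ollama pull "$m"
done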

Models that do not work

These models do not seem to work, even though they fit in the GPU's VRAM:

  • phi:2.7b
  • wizardcoder:7b-python
  • wizardcoder:7b-python-q4_0

Have fun!