Building RAG based on Ollama
June 19, 2024 by Kinh Nguyen
This was done on macOS Sequoia.
Install Ollama; see the guide here:
https://github.com/ollama/ollama
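Note that the llama3 model used later in settings.py also needs to be pulled once, for example with ollama pull llama3.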
Install conda/miniconda
https://docs.anaconda.com/miniconda/miniconda-install/
Create a new environment with
conda create -n ollama
conda activate ollama
Install Python and the packages (note that everything afterwards was installed via pip):
conda search python
conda install python=3.12.4  # install your latest version
Install the supporting libraries:
pip install llama-index \
    llama-index-llms-ollama \
    langchain-community \
    llama-index-embeddings-huggingface \
    ipywidgets \
    chromadb \
    llama-index-vector-stores-chroma
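Optionally, before wiring everything together, a quick smoke test can confirm that the Ollama server is reachable from Python. This is a small sketch (not part of the original post) assuming the llama3 model used below has already been pulled:

from llama_index.llms.ollama import Ollama

# assumes `ollama serve` is running locally and llama3 has been pulled
llm = Ollama(model="llama3", request_timeout=360.0)
print(llm.complete("Reply with one word: hello"))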
Make a settings.py as below (change "Extra guides." to your own customised directive):
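# settings.py
# working_collection, working_dir and run_time are expected to be defined in the
# notebook that runs this file (see the notebook cells further below)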
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
# persistent store
from llama_index.core.storage.docstore import SimpleDocumentStore
from llama_index.core.storage.index_store import SimpleIndexStore
from llama_index.core.vector_stores import SimpleVectorStore
from llama_index.core import StorageContext
# chrome store
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
# streaming
from llama_index.core import get_response_synthesizer
# bge-base embedding model
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
# ollama
Settings.llm = Ollama(model="llama3", request_timeout=360.0)
# initialize client, setting path to save data
db = chromadb.PersistentClient(path="chroma_db")
# create collection
chroma_collection = db.get_or_create_collection(working_collection)
# assign chroma as the vector_store to the context
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
# load some documents
documents = SimpleDirectoryReader(working_dir).load_data()
# create your index
index = VectorStoreIndex([])
if run_time == 0:
    # first run: build the index from the documents and persist it
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
    index.storage_context.persist(persist_dir="persist_data")
else:
    # later runs: reuse the persisted vector store
    storage_context = StorageContext.from_defaults(vector_store=vector_store, persist_dir="persist_data")
    # index = VectorStoreIndex.from_documents(vector_store, storage_context=storage_context)
    index = VectorStoreIndex([], storage_context=storage_context)
query_engine = index.as_query_engine(streaming=True, similarity_top_k=1)
def ask(question):
    guides = "Extra guides."
    response = query_engine.query(guides + question)
    response.print_response_stream()
def update_doc():
    # insert the loaded documents into the existing index (and its Chroma collection)
    for doc in documents:
        index.insert(doc)
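As a side note, once the Chroma collection has been populated, the index can also be attached directly to the vector store. The following is a rough equivalent of the else branch above, not part of the original settings.py, using VectorStoreIndex.from_vector_store:

# alternative to the else branch: rebuild the index straight from the existing collection
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine(streaming=True, similarity_top_k=1)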
Save settings.py and your data in the following structure:

folder
|- data
|  |- text_data.txt
|- settings.py
The RAG is now ready. Then, in a notebook, we can use it as
working_collection = "quickstart"
working_dir = "./data"
run_time = 1
where run_time
should be set to 0 on the first run to generate the vector store, working_collection
should point to your collection, and working_dir
to your data folder.
Run the next cell
%run -i settings.py
and then query as
ask("what is the main point of the text?")