# In[ ]:
# -*- coding: utf-8 -*-
'''
Created on Tue Feb 10 13:38:04 2026
# This program will import and show the versions of existing libraries used in this Conda Environment.
## This is the method of setting up the Conda environment for semantic search and reasoning.
### This will get you close; it highlights the needed steps and potential issues. However, you will likely encounter issues on setup as libraries
change ... be patient and use library documentation, Wiki, release notes, AI, etc. to help resolve bugs.
'''
#####################################################################################################################################
#
# 1) Use Anaconda Prompt
# 2a) Clean Cache: conda clean --all -y
# 2b) pip cache purge
# 3) Create environment: conda create -n llama_RAG4 python=3.12 -y
# 4) Activate environment: conda activate llama_RAG4
# 5a) Install Pytorch and related libraries (ran into conflicts w/ dlls): pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130
# 5b) Install Pytorch and related libraries with conda (preferred for stability, installs pytorch 2.5.1 and CUDA 12.4): conda install pytorch pytorch-cuda -c pytorch -c nvidia
# 6a) Verify Pytorch version: python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
# 6b) Results: 2.5.1 and True
# 7a) Verify Torch CUDA Version and Availability: python -c "import torch; print(f'Torch CUDA Version: {torch.version.cuda}'); print(f'Is GPU Available: {torch.cuda.is_available()}')"
# 7b) Results: Torch CUDA Version: 12.4 and Is GPU Available: True
# 8) Verify CUDA version: nvidia-smi
# 9) Results: 13.1 (This is fine as a higher version works because backwards compatible; nvidia-smi reports the maximum CUDA API version your NVIDIA driver supports (forward compatibility indicator))
# 10a) Verify version of the NVIDIA CUDA Compiler currently active on your system: nvcc --version --> Results Compiler is 12.6 (still pointing to old version)
# 10b) Windows 10 Press Windows key → type edit the system environment variables
# 10c) Edit and move to top (lower box System Variables): C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\libnvvp
# 10d) Edit and move to top (lower box System Variables): C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\x64
# 10e) Edit and move to top (lower box System Variables): C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin
# 10f) Restart Anaconda Prompt
# 10g) Verify version of the NVIDIA CUDA Compiler currently active on your system: nvcc --version --> Results Compiler is 13.0
# 11) CUDA_PATH: Edit this variable to be C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0.
# 12) CUDA_PATH_V13.0: If this variable does not exist, create it. Set its value to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0
# 13) To compile llama-cpp-python you must have the full Visual Studio Build Tools (full IDE not required) installed with the "Desktop development with C++" workload
# Alternatively you can install a precompiled wheel see below
# - https://visualstudio.microsoft.com/downloads/?q=build+tools#build-tools-for-visual-studio-2026
# - When the "Visual Studio Installer" opens, it will show you a list of "Workloads."
# - You MUST check: "Desktop development with C++".
# - On the right side, under Installation details, make sure "C++ CMake tools for Windows" is also checked.
# 14) Install C++ compiler and files
#       - This link from Microsoft: https://aka.ms/vc14/vc_redist.x64.exe
# 15) Install Numpy, Pandas, scikit-learn, matplotlib, and spyder: conda install -c conda-forge cmake numpy pandas scikit-learn matplotlib dateparser spyder -y
# 16a) Install transformers (see notes first below), evaluate, datasets accelerate: conda install -c conda-forge transformers evaluate accelerate -y
# 16b) Install datasets (doing with line above caused conflict): conda install -c huggingface -c conda-forge datasets
# 17) Install SentenceTransformers: conda install -c conda-forge sentence-transformers -y
# 18) Install FAISS IMPORTANT NOTE: conda install -c conda-forge faiss-gpu -y downgrades torch and turns off GPU instead: conda install -c pytorch faiss-cpu
# - For personal RAG system on windows, FAISS initially longer on setup but on searches reasonable. More a concern with more users.
# Unless you are searching through millions of vectors (e.g., a massive enterprise database), the latency difference between CPU and GPU
# for a single query is measured in milliseconds and I will save on VRAM for LLM.
# 19a) After the c++ compiler is installed, install Llama-cpp-python (first run all four commands, this takes 20 - 40 minutes)
# 19b) set CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0
# set PATH=%CUDA_PATH%\bin;%CUDA_PATH%\libnvvp;%PATH%
# set CMAKE_ARGS=-DGGML_CUDA=on
# set FORCE_CMAKE=1
# pip install llama-cpp-python --no-cache-dir --verbose
# 19c) Verify installation (in Anaconda Terminal): python -c "import llama_cpp; print(llama_cpp.__version__)"
# 19d) Download Small Model and save to folder: https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/blob/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
# 19e) Verify CUDA Availability (in Anaconda Terminal): python -c "from llama_cpp import Llama; llm = Llama(model_path='E:/Python/AI/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf', n_gpu_layers=-1, verbose=True)"
# 19f) ALTERNATIVE COMPILED WHEEL llama-cpp-python CUDA 12.4: pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
# 20) Install Llama-index: pip install llama-index
# 21) Install Llama-index-llms-llama-cpp: pip install llama-index-llms-llama-cpp
# 22) Install Llama-index-embeddings-huggingface: pip install llama-index-embeddings-huggingface
# 23) Install pyarrow (if conflict prevented install): conda install -c conda-forge pyarrow==20.0.0 -y
# 24) Save Conda Environment (once everything is working): conda env export > C:\Your\File\Location\llama_RAG_backup.yml
#
###################################################################################################################################
'''
Notes:
- If the C++ compiler installation was successful, you should see a copyright message from Microsoft, the compiler version (Microsoft (R) C/C++ Optimizing Compiler Version...), and basic usage information.
- If you see an error like 'cl' is not recognized..., the C++ workload was likely not installed correctly, and you must return to the Visual Studio Installer to ensure "Desktop development with C++" is selected.
- Note on nvidia-smi: This shows the maximum version of CUDA your NVIDIA Driver supports. Since I installed CUDA 13.1 for DaVinci Resolve,
my driver was updated to handle up to 13.1. It shows 13.1 because that is what your hardware is capable of running.
- If you want extreme detail on pip installation especially "llama-cpp-python" you can use --verbose at end of installation string.
- If you want to do a dry run to see what the command will install before you do it, put this at end of command string: --dry-run
@author: David DiPaola
'''
# Imports for Llama RAG and data analysis
import os
import random
# Must run BEFORE "import torch": globally disables torch.compile.
# NOTE(review): author marks this as likely unnecessary — kept as a safety switch.
os.environ["TORCH_COMPILE_DISABLE"] = "1"  # ← global disable, likely this is not needed
import torch
import datasets
from datasets import load_dataset, load_from_disk
import transformers
from transformers import AutoTokenizer, AutoModel, TrainingArguments, TextStreamer
from transformers.modeling_utils import PreTrainedModel
import sentence_transformers
from sentence_transformers import SentenceTransformer, CrossEncoder
import faiss  # vector similarity search (CPU build — see step 18 above)
import llama_cpp
from llama_cpp import Llama  # local GGUF model runner
import numpy as np
import sqlite3
import pandas as pd
import json
import gc  # manual garbage collection
# Imports for date manipulation and search
import dateparser
from datetime import datetime, timedelta
from collections import defaultdict
import re  # used when locating dates in text
# Import for debugging
import sys  # used in debugging sys.exit(0) to stop program
# Verify Version of Import
# Print the versions of the key libraries imported above; compare against the
# known-good results recorded in the transcript at the bottom of this file to
# confirm the Conda environment was built correctly.
print("\nVersions of Specific Libraries\n")  # plain string: no placeholders, f-prefix was unneeded
print(f"Datasets: {datasets.__version__}")
print(f"Transformers: {transformers.__version__}")
print(f"Sentence_Transformers: {sentence_transformers.__version__}")
print(f"FAISS: {faiss.__version__}")
print(f"Llama_cpp: {llama_cpp.__version__}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Numpy: {np.__version__}")
print(f"Pandas: {pd.__version__}")
print(f"dateparser: {dateparser.__version__}")
# re.__version__ and json.__version__ are legacy, undocumented attributes and
# could disappear in a future CPython release — fall back instead of crashing.
print(f"re: {getattr(re, '__version__', 'unknown')}")
print(f"Json: {getattr(json, '__version__', 'unknown')}")
print(f"\nGPU Availability for Torch\n")
# 1. Probe CUDA through PyTorch and dump the device details; warn when the
# GPU-enabled build of PyTorch is not active.
if not torch.cuda.is_available():
    print("WARNING: PyTorch cannot see the GPU. Check your 'pytorch-cuda' install.")
else:
    device_index = 0
    device_props = torch.cuda.get_device_properties(device_index)
    allocated_mb = torch.cuda.memory_allocated(device_index) / 1024**2
    print(f"GPU: {torch.cuda.get_device_name(device_index)}")
    print(f"Memory Allocated: {allocated_mb:.2f} MB")
    print("PyTorch version:", torch.__version__)  # Expect 2.5.1 or similar
    print("Torch CUDA version:", torch.version.cuda)
    print("Is GPU available:", torch.cuda.is_available())  # MUST be True
    print("CUDA device count:", torch.cuda.device_count())  # At least 1
    print("Max memory:", device_props.total_memory / 1e9, "GB")  # Your VRAM
# 1. Check if the Llama-cpp was built with CUDA support
print(f"\nGPU Availability for llama-cpp\n")
# Attempt to load a small GGUF model fully onto the GPU (n_gpu_layers=-1);
# any failure (missing model file, no CUDA build, etc.) is reported instead
# of crashing the script.
gguf_model_path = "C:/File/Location/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"
try:
    llm = Llama(model_path=gguf_model_path, n_gpu_layers=-1, verbose=False)
except Exception as e:
    print(f"FAILURE: Could not initialize GPU: {e}")
else:
    print("SUCCESS: Llama-cpp is using the GPU!")
'''
# Results of code
Versions of Specific Libraries
Datasets: 4.5.0
Transformers: 4.57.6
Sentence_Transformers: 5.2.2
FAISS: 1.12.0
Llama_cpp: 0.3.16
PyTorch version: 2.5.1
CUDA available: True
Numpy: 1.26.4
Pandas: 2.3.3
dateparser: 1.3.0
re: 2.2.1
Json: 2.0.9
GPU Availability for Torch
GPU: NVIDIA GeForce RTX 3070
Memory Allocated: 0.00 MB
PyTorch version: 2.5.1
Torch CUDA version: 12.4
Is GPU available: True
CUDA device count: 1
Max memory: 8.589410304 GB
GPU Availability for llama-cpp
llama_context: n_ctx_per_seq (512) < n_ctx_train (2048) -- the full capacity of the model will not be utilized
SUCCESS: Llama-cpp is using the GPU!
'''