04 — LLM Providers & Account Management

Connect your AI accounts, set up failover, never hit rate limits.

Quick Start — Which provider should I use?

If you want...	Use this	Cost
Cheapest possible	Claude CLI (Max subscription)	$20/month flat, unlimited
Best quality + speed	Anthropic API	~$3-15/MTok
Access to everything	OpenRouter	Varies by model
Free + private	Ollama (local)	$0 (runs on your hardware)
Mix of all	All of the above	Arvis auto-rotates between them

The recommended setup: Claude CLI as primary (cheap, unlimited) + one API key as backup.

Provider Setup Guides

1. Claude CLI (Max Subscription — $20/month, unlimited)

Use your Claude Pro/Max subscription with Arvis. No per-token cost — just your monthly subscription. This is the cheapest way to run Arvis.

Setup (all platforms — Windows, Mac, Linux)

Step 1: Install Claude Code

npm install -g @anthropic-ai/claude-code

Step 2: Add an account

npm run add-account

A browser opens — log in with your Claude account. Auth files save to data/accounts/acc1/. Done.

Step 3: Add more accounts (optional — for zero rate limits)

npm run add-account work
npm run add-account personal

That's it. No .env changes needed — Arvis auto-detects all accounts in data/accounts/ on startup.

Session expired? Re-login:

# Just run add-account again with the same name — it detects existing auth and re-authenticates
npm run add-account acc1

Advanced: Manual env var setup

If you prefer managing account directories yourself:

CLAUDE_CLI_HOME=/path/to/claude-home-dir
CLAUDE_CLI_HOME_1=/path/to/second-account
CLAUDE_CLI_HOME_2=/path/to/third-account

Good to know

Each account = one Claude Pro/Max subscription
npm run add-account auto-names accounts acc1, acc2, acc3... or you can pass a custom name
3 accounts = practically unlimited messages (rate limits rotate between them)
Auth folders contain tokens — don't share them or commit them to git
On a headless VPS (no browser), the login prints a URL — open it on your phone/laptop to complete

2. Anthropic API (pay-per-token)

Best for: fast responses, full tool support, precise cost control.

Get your key: console.anthropic.com → API Keys → Create Key

ANTHROPIC_API_KEY=sk-ant-api03-xxxxx

Multiple keys (for higher rate limits):

ANTHROPIC_API_KEY=sk-ant-main-key
ANTHROPIC_API_KEY_1=sk-ant-backup-key
ANTHROPIC_API_KEY_2=sk-ant-team-key

Available models:

Model	Speed	Quality	Cost (input/output per MTok)
`claude-opus-4-6`	Slow	Best	$15 / $75
`claude-sonnet-4-6`	Fast	Great	$3 / $15
`claude-haiku-4-5-20251001`	Fastest	Good	$0.80 / $4

Agent model spec: claude-sonnet-4-6 or anthropic/claude-sonnet-4-6

3. OpenAI (pay-per-token)

Get your key: platform.openai.com → Create new secret key

OPENAI_API_KEY=sk-xxxxx

Multiple keys:

OPENAI_API_KEY=sk-main
OPENAI_API_KEY_1=sk-backup

Available models:

Model	Speed	Quality	Cost (input/output per MTok)
`gpt-4.1`	Fast	Great	$2 / $8
`gpt-4.1-mini`	Fastest	Good	$0.40 / $1.60
`gpt-4.1-nano`	Instant	Basic	$0.10 / $0.40
`o4-mini`	Slow	Reasoning	$1.10 / $4.40

Agent model spec: openai/gpt-4.1-mini

4. OpenRouter (one key, all models)

Best for: trying different models without managing multiple API keys.

Get your key: openrouter.ai → Create Key

OPENROUTER_API_KEY=sk-or-xxxxx

Access to 200+ models: Claude, GPT-4, Gemini, Llama, DeepSeek, Qwen, Mistral, and more. Pricing varies per model.

Agent model spec: openrouter/claude-sonnet-4-6 or openrouter/meta-llama/llama-4-maverick

5. Google Gemini (pay-per-token)

Get your key: aistudio.google.com → Create API Key

GOOGLE_API_KEY=AIzaSyxxxxx

Available models:

Model	Speed	Quality	Cost (input/output per MTok)
`gemini-2.5-pro`	Medium	Great	$1.25 / $10
`gemini-2.5-flash`	Fast	Good	$0.30 / $2.50
`gemini-2.0-flash-lite`	Fastest	Basic	Free tier available

Agent model spec: google/gemini-2.5-flash

6. Ollama (free, runs locally)

Best for: privacy, offline use, zero cost. Requires a decent GPU.

Install: ollama.com/download

# Pull a model
ollama pull llama3
# or
ollama pull qwen2.5-coder:7b

OLLAMA_BASE_URL=http://localhost:11434

Agent model spec: ollama/llama3

7. Custom OpenAI-Compatible Provider

Any service with an OpenAI-compatible API (Together AI, Groq, Fireworks, etc.):

CUSTOM_BASE_URL=https://api.together.xyz/v1
CUSTOM_API_KEY=your-key

How Arvis Picks Which Account to Use

You don't have to think about this — it's automatic. But here's what happens:

Priority order (tries top to bottom)

Priority	Provider	Why
10	Claude CLI	Free with subscription
20	Anthropic API	Fast + best Claude models
50	OpenAI	Good alternative
60	OpenRouter	Access to everything
70	Google Gemini	Solid backup
200	Ollama	Free, local, last resort

Within the same provider, the account with the fewest messages used is picked (load balancing).

When one gets rate limited

Arvis automatically switches to the next available account. You don't see anything — maybe a slightly delayed response.

Retry	Cooldown before retrying that account
1st	1 minute
2nd	5 minutes
3rd	25 minutes
4th+	60 minutes (cap)

If ALL accounts are exhausted, you'll see: "All AI accounts are rate-limited. Retrying automatically in ~X minutes."

Smart model selection

When Arvis has to use a backup provider, it picks the right size model:

Short/simple messages ("hi", "what time is it") → cheap model (haiku/mini/flash)
Long/complex messages ("analyze this code", "design a system") → powerful model (sonnet/opus/pro)

Recommended Setups

Budget setup (free or almost free)

# Just one Claude CLI account — $20/month, covers most usage
npm run add-account

# Optional: free local backup
OLLAMA_BASE_URL=http://localhost:11434

Standard setup (reliable, low cost)

# Primary: 2 Claude CLI accounts (auto-detected, no env vars needed)
npm run add-account
npm run add-account backup

# Backup: Anthropic API with cheap model
ANTHROPIC_API_KEY=sk-ant-xxxxx

Power setup (never rate limited, best quality)

# 3 Claude CLI accounts
npm run add-account
npm run add-account work
npm run add-account personal

# API backup
ANTHROPIC_API_KEY=sk-ant-main
ANTHROPIC_API_KEY_1=sk-ant-backup

# Fallback
OPENAI_API_KEY=sk-xxxxx

# Last resort
OLLAMA_BASE_URL=http://localhost:11434

Setting Models Per Agent

In Dashboard → Agents → click your agent → Config tab:

Primary model: anthropic/claude-sonnet-4-6
Fallbacks: openrouter/claude-sonnet-4-6, openai/gpt-4.1-mini

Or tell the Conductor: "Set dev-agent's model to claude-sonnet-4-6 with openai fallback"

Cost Tracking

Every API call is logged automatically. View in Dashboard → Usage page.

CLI accounts always show $0 cost (subscription model). API accounts show exact per-token costs.

You can disable expensive accounts in Dashboard → Settings → Accounts without removing them from .env.

Advanced: How Failover Works Internally

Request comes in for agent with model="claude-sonnet-4-6"
│
├─ STAGE 1: Try preferred provider (anthropic)
│   → Pick account with lowest usage that isn't rate-limited
│   → If found: use it. Done.
│
├─ STAGE 2: Try fallback chain from agent config
│   → agent.modelFallbacks = ["openrouter/claude-sonnet-4-6", "openai/gpt-4.1"]
│   → Try each in order until one works
│
└─ STAGE 3: Any account at all
    → classifyComplexity(prompt) → pick appropriate model size
    → If nothing available: queue for retry with backoff

Advanced: Per-Provider API Details

Provider	Endpoint	Auth	Tool format
Anthropic	`api.anthropic.com/v1/messages`	`x-api-key` header	`tool_use` content blocks
OpenAI / OpenRouter / Ollama	`/v1/chat/completions`	`Bearer` token	`function_call` messages
Google Gemini	`generativelanguage.googleapis.com`	`key=` query param	`functionCall` / `functionResponse`

Advanced: Tool Call Loop

When the LLM wants to use a tool (web_search, http_fetch, etc.), the Provider Runner automatically handles multi-turn tool calls — up to 5 rounds per request. The CLI runner delegates tool use to Claude Code's built-in tools (bash, file access, etc.).