📖 API Documentation

OpenAI-compatible interface · One-line integration · Streaming support

🔑 Authentication

All API requests require an API key. Include it in the Authorization header:

Authorization: Bearer tk-your-api-key

Get your API key from the Dashboard after registration. Keys have the prefix tk- and are shown only once upon creation.

⚠️ Keep your API key secret! Do not share it or commit it to version control. Use environment variables.
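One way to follow this advice is to read the key from the environment at startup. The sketch below assumes a variable named FLINT_API_KEY, which is a hypothetical name chosen for illustration and not something the API mandates. The tk- prefix check is based on the key format described above.

```python
import os


def load_api_key(env_var: str = "FLINT_API_KEY") -> str:
    """Read the API key from the environment (the variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running this script.")
    if not key.startswith("tk-"):
        raise RuntimeError(f"{env_var} does not look like an API key (expected a tk- prefix).")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header in the documented Bearer format."""
    return {"Authorization": f"Bearer {api_key}"}
```

The result of load_api_key() can be passed as api_key when constructing an SDK client, so the literal key never appears in source code.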

📋 Available Models

Model ID	Type	Context	Input /M tok	Output /M tok
16 models available across 6 model families

DeepSeek V4 · V3.2 · Qwen3.5 · Kimi K2 · GLM-5 · MiniMax M2 · Llama 3.3

GET /v1/models — List all available models

curl https://www.flintapi.ai/v1/models
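The same listing can be fetched from Python with only the standard library. This is a minimal sketch that assumes the response follows the OpenAI list shape (an object with a "data" array whose items carry an "id"); that shape is not shown in this page, so treat it as an assumption. The helper names are illustrative. The key is sent only when one is supplied, since the curl example above omits it.

```python
import json
import urllib.request

MODELS_URL = "https://www.flintapi.ai/v1/models"


def extract_model_ids(payload: dict) -> list:
    """Return sorted model IDs from an OpenAI-style list response (assumed shape)."""
    return sorted(item["id"] for item in payload.get("data", []) if "id" in item)


def fetch_model_ids(api_key: str = "", timeout: float = 10.0) -> list:
    """GET /v1/models and return the model IDs it reports."""
    request = urllib.request.Request(MODELS_URL)
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_model_ids(json.load(response))
```

Checking a model ID against this list before sending a chat request catches typos early.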

💬 Chat Completions

POST /v1/chat/completions

Send a conversation and get a model-generated response. Supports multi-turn dialogue and system prompts.

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://www.flintapi.ai/v1",
    api_key="tk-your-api-key"
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=500,
    temperature=0.7
)

print(response.choices[0].message.content)
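Multi-turn dialogue works by resending the earlier messages with each request. The sketch below assumes the endpoint is stateless, like the OpenAI API it mirrors; the Conversation class is a hypothetical helper, not part of any SDK.

```python
class Conversation:
    """Keeps the message history that is resent on every request."""

    def __init__(self, system_prompt: str = ""):
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def add_user(self, text: str) -> None:
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})


chat = Conversation("You are a helpful assistant.")
chat.add_user("Explain quantum computing in simple terms.")
# With a real client, the reply would come from the API:
#   reply = client.chat.completions.create(model="deepseek-v4-flash", messages=chat.messages)
#   chat.add_assistant(reply.choices[0].message.content)
chat.add_assistant("Quantum computing uses qubits...")  # stand-in for a real reply
chat.add_user("Can you give an everyday analogy?")
```

Because the full history is sent each time, prompt token usage grows with every turn.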

cURL

curl https://www.flintapi.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tk-your-api-key" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [
      {"role": "system", "content": "You are an expert programmer."},
      {"role": "user", "content": "Write a Python function to find prime numbers up to N."}
    ],
    "max_tokens": 300,
    "temperature": 0.3,
    "stream": false
  }'

Response Format

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1717891234,
  "model": "deepseek-v4-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum computing uses qubits..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
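When working with the raw JSON rather than an SDK object, the fields of interest can be pulled out as sketched below. The function name is illustrative, and treating a finish_reason of "length" as truncation is an assumption carried over from the OpenAI convention.

```python
import json

SAMPLE_RESPONSE = json.loads("""
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1717891234,
  "model": "deepseek-v4-flash",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Quantum computing uses qubits..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}
""")


def summarize_completion(body: dict) -> dict:
    """Extract the reply text, finish reason, and token usage from a response body."""
    choice = body["choices"][0]
    usage = body.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        # Assumption: "length" means the reply was cut off at max_tokens.
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_completion(SAMPLE_RESPONSE)
print(summary["text"], summary["total_tokens"])
```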

Streaming (SSE)

# Python streaming example
stream = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Tell me a story about AI"}],
    stream=True
)

for chunk in stream:
    # Some chunks may carry no choices or an empty delta; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Set stream: true (stream=True in the Python SDK) to receive the response incrementally via Server-Sent Events (SSE) instead of as a single JSON body.
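For clients that read the SSE stream directly instead of through an SDK, the parsing can be sketched as follows. The wire format here is an assumption based on the OpenAI streaming convention: each event is a "data:" line holding a JSON chunk, and the stream ends with "data: [DONE]". The sample lines are made up for illustration.

```python
import json


def iter_sse_text(lines):
    """Yield text fragments from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Once"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": " upon"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample_lines)))  # prints "Once upon"
```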

⚙️ Parameters

Parameter	Type	Default	Description
model	string	Required	Model ID, e.g. `deepseek-v4-flash` or `deepseek-v4-pro`; list all IDs with GET /v1/models
messages	array	Required	Conversation messages with `role` and `content`
max_tokens	int	1024	Maximum tokens to generate
temperature	float	0.7	Sampling temperature (0 = most deterministic, 2 = most random)
top_p	float	1.0	Nucleus sampling threshold (0–1)
stream	bool	false	Enable SSE streaming response
stop	array	null	Stop sequences to halt generation
presence_penalty	float	0	Penalty on tokens that have already appeared; positive values encourage new topics (-2.0 to 2.0)
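The table above can be turned into a small client-side check that builds the request body. This is a sketch, not a definitive implementation: the function name is illustrative, the defaults and ranges are copied from the table, and the 0 to 2 range for temperature is inferred from its description. The server remains the final authority on what it accepts.

```python
def build_chat_request(model, messages, *, max_tokens=1024, temperature=0.7,
                       top_p=1.0, stream=False, stop=None, presence_penalty=0.0):
    """Validate arguments against the documented ranges and return the JSON body."""
    if not model:
        raise ValueError("model is required")
    if not messages or any("role" not in m or "content" not in m for m in messages):
        raise ValueError("messages must be a non-empty list of {role, content} objects")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0 and 1")
    if not -2.0 <= presence_penalty <= 2.0:
        raise ValueError("presence_penalty must be between -2.0 and 2.0")

    body = {
        "model": model,
        "messages": list(messages),
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
        "presence_penalty": presence_penalty,
    }
    if stop is not None:
        body["stop"] = list(stop)  # omit the field entirely when no stop sequences are set
    return body


request_body = build_chat_request(
    "deepseek-v4-flash",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.3,
)
```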

🚦 Rate Limits

To ensure fair resource allocation across all users, FlintAPI enforces the following rate limits:

Tier	Requests / Minute	Tokens / Minute	Concurrent Requests
Standard	60	100,000	5
Enterprise	Custom	Custom	Custom

When you exceed a limit, the API returns HTTP 429 with a Retry-After header giving the number of seconds to wait before retrying. Need higher limits? Contact support@flintapi.ai.
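A client can honor Retry-After with a helper like the one sketched below. The function name and the exponential backoff fallback (used when the header is missing or unparseable) are assumptions for illustration, not documented behavior.

```python
def retry_delay(headers: dict, attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Return how many seconds to wait before retrying a rate-limited request."""
    value = None
    for name, header_value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            value = header_value
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Fallback (assumption): exponential backoff, capped at `cap` seconds.
        return min(cap, base * (2 ** attempt))
```

For example, retry_delay({"Retry-After": "7"}, 0) returns 7.0, while a response without the header on the fourth attempt (attempt=3) falls back to 8.0 seconds.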

⚠️ Error Codes

Code	Error	Description
401	Missing or Invalid API Key	No Authorization header provided, or the key is invalid.
402	Insufficient Balance	Your account balance cannot cover the estimated cost of this request.
403	Access Denied	Your API key or account has been disabled. Contact support.
429	Rate Limit Exceeded	Too many requests. Check the Retry-After header and slow down.
500	Internal Server Error	Something went wrong on our end. Please try again later.
503	Service Unavailable	The inference backend is temporarily unavailable. Retry shortly.
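The codes above fall into two groups: 401, 402, and 403 need a fix on the caller's side, while 429, 500, and 503 may succeed on a later attempt. The retry loop below is a sketch built on that reading. The send callable, the FatalAPIError class, and the backoff schedule are hypothetical; send is assumed to return a (status, headers, body) tuple.

```python
import time

RETRYABLE = {429, 500, 503}
FATAL = {
    401: "missing or invalid API key",
    402: "insufficient balance",
    403: "API key or account disabled",
}


class FatalAPIError(Exception):
    """Raised for errors that retrying cannot fix."""


def call_with_retries(send, max_attempts: int = 3, sleep=time.sleep):
    """Call send() until it succeeds, a fatal error occurs, or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if 200 <= status < 300:
            return body
        if status in FATAL:
            raise FatalAPIError(f"HTTP {status}: {FATAL[status]}")
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with HTTP {status}")
        # Prefer the server's Retry-After value; otherwise back off 1s, 2s, 4s, ...
        sleep(float(headers.get("Retry-After", 2 ** attempt)))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the loop easy to test without real delays.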

🔌 SDK Examples

JavaScript / Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://www.flintapi.ai/v1",
  apiKey: "tk-your-api-key"
});

const response = await client.chat.completions.create({
  model: "deepseek-v4-flash",
  messages: [
    { role: "user", content: "Hello!" }
  ]
});

console.log(response.choices[0].message.content);

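Go

package main

import (
	"context"
	"fmt"
	"log"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	config := openai.DefaultConfig("tk-your-api-key")
	config.BaseURL = "https://www.flintapi.ai/v1" // FlintAPI instead of the default OpenAI endpoint
	client := openai.NewClientWithConfig(config)
	resp, err := client.CreateChatCompletion(
		context.Background(),
		openai.ChatCompletionRequest{
			Model: "deepseek-v4-pro",
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "Hello"},
			},
		},
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}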