
Boilerplate to set up your own LLM infrastructure

Keep data on your servers without missing out on GPT-like AI power.

Our boilerplate is the framework for your LLM automation

Local translation API endpoint comes with the boilerplate

Content plan generation comes with the boilerplate

Go from 0 to 1 to N

Seamless LLM production for every use case — business, development, or passion projects.

What's included

Prepare

Any LLM you want

All open-source models from 🤗 HF are supported. Quantized or not. With reasoning or without.

Task-specific models

We suggest models for every use case (e.g. Qwen2.5-7B-Instruct for following instructions).
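
As a minimal sketch of pulling a suggested model from the Hub (assuming the transformers library; the boilerplate's own loading code may differ):

```python
from transformers import pipeline

# Load an instruction-following model straight from the Hugging Face Hub.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-7B-Instruct")

result = generator("List three uses for a local LLM.", max_new_tokens=100)
print(result[0]["generated_text"])
```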

Extended context

Use RAG for both more relevant and extended context via the Qdrant engine.
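
A minimal sketch of the retrieval step, assuming the qdrant-client Python package; the collection name, vector size, and payloads below are illustrative:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

# In-memory instance for illustration; point at your own server in production.
client = QdrantClient(":memory:")

client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# Store document chunks alongside their embeddings (vectors shortened here).
client.upsert(
    collection_name="docs",
    points=[
        PointStruct(id=1, vector=[0.1] * 384, payload={"text": "Refund policy ..."}),
        PointStruct(id=2, vector=[0.2] * 384, payload={"text": "Shipping terms ..."}),
    ],
)

# At query time, embed the user question, fetch the nearest chunks,
# and prepend them to the LLM prompt as extra context.
hits = client.search(collection_name="docs", query_vector=[0.1] * 384, limit=2)
context = "\n".join(hit.payload["text"] for hit in hits)
```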

Research

Prompt engineering

Write your own templates and prompts via a templating engine. Keep the logic inside the prompt.
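
For illustration, here is how prompt logic can live inside the template itself; Jinja2 stands in for whichever templating engine you prefer:

```python
from jinja2 import Template

# Keep conditional logic inside the prompt itself, not in application code.
prompt = Template(
    "Translate the following text to {{ target_lang }}.\n"
    "{% if formal %}Use a formal register.{% endif %}\n"
    "Text: {{ text }}"
)

print(prompt.render(target_lang="German", formal=True, text="Hello, world!"))
```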

Templates

15 handpicked templates right out of the box. Not hundreds - only the ones you will actually use.

Experimentation

Change your prompts and templates until you get the best solution on your data. Track it in Evidently.
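
A sketch of what such an experiment loop covers; `run_experiment`, `generate`, and `score` are hypothetical stand-ins for your model call and task metric, and the per-variant scores are what you would track in Evidently:

```python
def run_experiment(prompt_variants, dataset, generate, score):
    # Evaluate each prompt variant over the same dataset; return mean scores.
    results = {}
    for name, template in prompt_variants.items():
        outputs = [generate(template.format(**row)) for row in dataset]
        results[name] = sum(
            score(out, row["expected"]) for out, row in zip(outputs, dataset)
        ) / len(dataset)
    return results

scores = run_experiment(
    prompt_variants={
        "v1": "Is this text safe? Answer yes or no: {text}",
        "v2": "You are a strict moderator. Label the text safe/unsafe: {text}",
    },
    dataset=[{"text": "hello there", "expected": "yes"}],
    generate=lambda prompt: "yes",        # stub; replace with a real model call
    score=lambda out, expected: float(out == expected),
)
print(scores)
```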

Deploy

Privacy

Your data and your users' data stay secure. Our boilerplate lets LLMs run on your servers only.

Backend

With our boilerplate you can use vLLM, Ollama, Llama.cpp, or other backends for inference.
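
Because vLLM and Ollama both expose OpenAI-compatible endpoints, one client works against either backend; a sketch, where the base URL and model tag are assumptions for a local Ollama setup:

```python
from openai import OpenAI

# Ollama serves an OpenAI-compatible API on port 11434; vLLM does the same
# on port 8000, so switching backends is just a base_url change.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused-locally")

response = client.chat.completions.create(
    model="qwen2.5:7b-instruct",  # the model tag depends on your backend
    messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
)
print(response.choices[0].message.content)
```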

Metrics

Collect telemetry, visualize it on charts or export reports to see how LLMs impact your business.
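
As one possible shape for that telemetry (the helper below is a hypothetical sketch, not the boilerplate's actual pipeline), each request can be appended as a structured record and charted or exported later:

```python
import json
import time

def log_llm_call(model, usage, latency_s, path="llm_metrics.jsonl"):
    # Append one structured record per request; chart or export these later.
    record = {
        "ts": time.time(),
        "model": model,
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "latency_s": round(latency_s, 3),
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```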

Success stories

Our knowledge comes from previously completed projects.
This is the foundation of our boilerplate.

+65% gain in text safety

A client had an F1 score of 0.51 using the OpenAI moderation API.

After switching to a Qwen2.5-based local API, their F1 score soared to 0.84 across 27 languages.

Replicate this success

Save weeks on research and coding

From side-projects to organizations without LLM engineers - we've got you covered.

Personal license
Organization license
Starter
$290 / lifetime license
Single payment. Endless projects
License for current version
Best LLMs available
Extended LLM memory
Prepared prompts & templates
Experimentation tracking
Privacy
Easy production deploy
Production metrics collection
Basic support
Pro
$390 / lifetime license
Endless projects with support
License for current version
Best LLMs available
Extended LLM memory
Prepared prompts & templates
Experimentation tracking
Privacy
Easy production deploy
Production metrics collection
Slack community
12 months of updates
Support response under 1 week
Premium
Talk to sales
Solutions tailored to your needs
License for current version
Best LLMs available
Extended LLM memory
Prepared prompts & templates
Experimentation tracking
Privacy
Easy production deploy
Production metrics collection
Slack community
All + custom updates
Support response under 72 hours

Frequently Asked Questions

What is the boilerplate?

It is an organized collection of prompts with connected RAG and a vector database, an experimentation platform to evaluate models and prompts, and code that helps you not only ship your LLM service to production easily, but also evaluate and improve it.

Is my data private?

Yes. The boilerplate itself doesn't send your data anywhere. It can be stored privately on your machine or servers. You can also process the data on your servers if you choose open-source LLMs. But if you choose to use OpenAI, Anthropic, or another proprietary API, then your data will be processed by them, at the cost of privacy.

Do I need to know Python?

It's not a prerequisite, because you can always use Continue.dev + Ollama to code in Python using prompts in English. But knowing Python basics certainly helps.

Can I build multi-step workflows?

Yes. You can create multi-step workflows with this boilerplate.
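
For instance, a two-step chain where the output of one call feeds the next; `call_llm` here is a hypothetical wrapper around whichever backend you configured:

```python
def call_llm(prompt: str) -> str:
    return "stubbed response"  # replace with a request to your inference endpoint

# Step 1 drafts, step 2 refines the draft.
draft = call_llm("Write a three-bullet content plan about local LLMs.")
final = call_llm(f"Polish this plan and add one example per bullet:\n{draft}")
print(final)
```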

Can I use proprietary APIs instead of local models?

You can use the OpenAI or Anthropic APIs with this boilerplate, but at the cost of privacy.

Does the boilerplate include the compute to run LLMs?

This boilerplate helps you launch LLMs but doesn't provide the computational power to run them. There are several ways to get that compute:

  1. Use the OpenAI / Anthropic APIs so that the LLM requests are processed on their side.
  2. Rent a GPU server on AWS, Google Cloud, vast.ai, Hetzner, etc.
  3. Buy and host your own GPU server.
  4. Run it on your device locally for research purposes (a Mac with Apple Silicon is enough to run a few great models).

Inside the boilerplate we provide models for all budgets.

How often is the boilerplate updated?

We use it in our LLM agency to ship models faster, so we update it regularly. Not to mention how fast the industry is moving.

Can I get a refund?

Once you've got access to the boilerplate, it is yours forever, so it can't be refunded.

Have another question? Contact me on LinkedIn, Twitter, or by email.

0 to 1 to N

Set up your own LLM

Don't waste time choosing the right stack or figuring out how to evaluate prompts and models.