Beekeeper
Beekeeper is a lightweight web app for running AI training on a remote server in your home lab. At its core, it handles:
- Cloning a repository.
- Setting up the Python environment (based on your requirements.txt)
- Remote log streaming
- Tensorboard Display
- File downloads

Critical missing features (mostly security-related):
- Authentication - Beekeeper has no authentication, and it does allow access to files you’ve cloned or generated in your training run. For now, I would strongly recommend running Beekeeper only in a home lab scenario, where the server is sitting safely on your local network, and avoiding any sensitive data.
- GitHub auth - Beekeeper has no method of authenticating with your remote repo. It only works on repos you’ve made public.
- HTTPS - For HTTPS, you’ll need to put Beekeeper behind a reverse proxy and, again, it’s not ready to do anything secure anyway.
- Multi-server support - Eventually, I’d like to have a central Hive server managing multiple workers, and farming jobs out. Today is not that day. This is a single server product.
Getting Started
Requirements
- Python 3.10+
- Git
- A Linux server with systemd (currently tested on Ubuntu)
- One or more GPUs (optional but recommended)
Installation
Clone the repo and run the setup script. It creates a virtual environment, installs dependencies, and sets up a systemd service.
git clone https://github.com/bobcowher/beekeeper.git
cd beekeeper
bash setup.sh
The setup script will:
- Detect your Python version (3.12, 3.11, 3.10, or python3)
- Create a venv and install dependencies
- Generate and install a systemd service file (requires sudo)
- Enable and start the service
Once complete, Beekeeper is running on port 5000. Open http://your-server:5000 in a browser.
Managing the Service
# Check status
sudo systemctl status beekeeper
# View logs
journalctl -u beekeeper -f
# Restart
sudo systemctl restart beekeeper
# Stop
sudo systemctl stop beekeeper
Development Mode
If you prefer to run Beekeeper without systemd for development or testing:
cd beekeeper
source venv/bin/activate
python app.py
This runs Flask’s development server on port 5000 with auto-reload.
Creating a Project
From the dashboard, click + New Project and fill in:
| Field | Description | Default |
|---|---|---|
| Project Name | No spaces, used as the directory name | — |
| Git URL | Public Git repository URL | — |
| Branch | Git branch to clone and pull before each run | main |
| Python Version | Detected from system and conda | auto |
| Environment Type | venv or conda | venv |
| Training Script | Python file to execute when training starts | train.py |
| Tensorboard Log Dir | Where your script writes TB event files | runs |
| Requirements File | Pip requirements file installed at setup and before each run | requirements.txt |
| Setup Script | Optional shell script run at setup and before each training run | — |
| Data Dir (local) | Local path in the repo to symlink to your data volume | data |
| Data Dir (system) | Absolute path on the server to a persistent data volume | — |
Every field has a tooltip — hover the ? icon for a description.
Once you submit, Beekeeper runs the following in the background:
- Git clone — clones the repository at the specified branch into a workspace/ directory
- Create environment — creates a venv or conda env with the selected Python version
- Data dir symlink — if enabled, creates a symlink from workspace/<local path> to the system data directory
- Setup script — if configured and the file exists, runs it from the workspace root
- Pip install — installs packages from the requirements file
The project page refreshes automatically and shows the current step. If any step fails, the error is displayed and a Retry Setup button appears. Retry is smart — it skips the clone and environment creation if they already completed successfully, and picks up from the failed step.
Running Training
Once setup completes, hit Start Training on the project page. Beekeeper runs the following sequence before launching your script:
- Git pull — pulls the latest code from your configured branch
- Data dir symlink — verifies or creates the symlink if a data directory is configured
- Setup script — runs your setup script if configured and present
- Pip install — installs/updates packages from the requirements file
- Launch — starts the training script as a detached subprocess
If any step fails, training is aborted and the error is shown on the project page.
Closing the browser tab has no effect on the running process.
The project page shows:
- Status — running, stopped, crashed, or idle
- PID and elapsed time while running
- Live logs — expand the Logs section to stream stdout/stderr in real time. Each run starts with a header showing the timestamp, hostname, git commit SHA, branch, Python version, training script, and GPU info, and ends with a footer showing elapsed time and exit status.
- Tensorboard — auto-starts alongside training with a dynamic port, embedded as an iframe with an option to open in a new tab
Hit Stop Training to send SIGTERM (with a SIGKILL fallback after 5 seconds).
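The stop sequence described above (SIGTERM first, SIGKILL if the process is still alive after 5 seconds) is a common pattern that can be sketched with the standard library. This shows the general technique, not Beekeeper’s own code; `stop_process` is a hypothetical helper name.

```python
import subprocess


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask the process to exit, then force it if it ignores the request.

    Returns the exit code; on POSIX a negative value is the signal number
    that ended the process.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        return proc.wait()
```

If your training script saves a checkpoint on shutdown, the grace period is the time it has to finish writing before it is killed.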
Environment Variables
Training scripts often need environment variables — API keys, config flags, hyperparameters. Click Edit on the project info card to add key-value pairs. These are passed to the training process at startup.
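On the training-script side, these values arrive as ordinary environment variables. A minimal sketch of reading them with defaults follows; the variable names (`LEARNING_RATE`, `EPOCHS`, `DEBUG`) and the `read_config` helper are made up for illustration and are not something Beekeeper defines.

```python
import os
from typing import Mapping


def read_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Read hypothetical hyperparameters from the environment, with defaults."""
    return {
        "learning_rate": float(environ.get("LEARNING_RATE", "1e-3")),
        "epochs": int(environ.get("EPOCHS", "10")),
        "debug": environ.get("DEBUG", "0") == "1",
    }
```

Environment variables are always strings, so the script is responsible for converting them to numbers or booleans.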
Setup Script
If your project needs system-level setup beyond pip — downloading a dataset, linking shared weights, generating config files — you can point Beekeeper at a shell script.
#!/bin/bash
# Example setup.sh (place this in your repo root)
set -e  # stop at the first failing command

# Download the dataset only if it is not already present
mkdir -p data
if [ ! -f data/iris.csv ]; then
    echo "Downloading dataset..."
    curl -fsSL https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv \
        -o data/iris.csv
fi
Set Setup Script to setup.sh (or whatever your script is named) when creating or editing a project. Beekeeper will run it from the repository root:
- Once during initial project setup (after the environment is created, before pip install)
- Again before every training run (after git pull, before pip install)
The script is silently skipped if the file doesn’t exist, so you can set it once and it won’t cause errors on environments that don’t have it yet.
Data Directory
For projects that need access to a large persistent dataset stored elsewhere on the server — a mounted NAS share, a shared /data volume, or any local path — use the Data Directory fields.
| Field | Purpose |
|---|---|
| Data Dir (local) | Path within the repo to create as a symlink (default: data) |
| Data Dir (system) | Absolute path on the server to link to |
Beekeeper creates a symlink at workspace/<local> → <system path> during project setup, and ensures it exists again before each training run. Your training script just reads from data/ as if the dataset lived inside the repo.
Leave the system path blank if you don’t need this feature.
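The “create once, verify before each run” behaviour can be sketched as an idempotent helper. This is an illustration under assumptions, not Beekeeper’s code: `ensure_data_symlink` is a hypothetical name, and what happens when the link points somewhere else (it is replaced) or a real directory is in the way (an error is raised) is my own choice for the sketch.

```python
from pathlib import Path


def ensure_data_symlink(repo_root: Path, local: str, system_path: Path) -> Path:
    """Make repo_root/<local> a symlink to system_path; safe to call repeatedly."""
    link = repo_root / local
    if link.is_symlink():
        if link.resolve() == system_path.resolve():
            return link  # already correct, nothing to do
        link.unlink()  # assumption: a stale link is replaced
    elif link.exists():
        # assumption: never overwrite real files or directories
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(system_path, target_is_directory=True)
    return link
```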
Editing Project Settings
Click Edit on the project page to change:
- Git branch
- Training script path
- Tensorboard log directory
- Requirements file
- Setup script
- Data directory (local and system paths)
- Environment variables
Name, Git URL, Python version, and environment type are fixed after creation.
Viewing and Downloading Files
Expand the Files section on the project page to browse the project’s workspace directory. You can preview files inline, download individual files, or download entire directories as zip archives.
Inline Viewer
Click any viewable filename or the view button to open it in a modal without leaving the page.
| File type | Extensions | Behavior |
|---|---|---|
| Images | png, jpg, jpeg, gif, webp, svg, bmp, ico | Rendered inline. Auto-refreshes every 2 seconds — useful for monitoring debug images written during training. |
| Text / code | py, log, json, yaml, md, sh, csv, toml, js, ts, html, xml, and more | Displayed in a monospace viewer. Files over 1 MB fall back to download. |
Close the viewer with the × button, by clicking the backdrop, or by pressing Escape.
Using curl
The same endpoints that power the UI work with curl:
# List files in the project root
curl http://your-server:5000/projects/my-project/files/
# Download a specific file
curl -O http://your-server:5000/projects/my-project/files/checkpoints/model.pt
# Download a directory as a zip
curl -o checkpoints.zip 'http://your-server:5000/projects/my-project/files/checkpoints/?zip=1'
The JSON listing includes file names, sizes, and types — useful for scripting downloads of specific checkpoints or outputs.
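If you would rather script downloads in Python than in curl, the URL patterns from the examples above are easy to wrap. This is a sketch, not an official client: `file_url` and `download` are hypothetical helpers, and only the URL shapes shown in the curl commands are assumed.

```python
from urllib.parse import quote
from urllib.request import urlretrieve


def file_url(base_url: str, project: str, path: str = "", as_zip: bool = False) -> str:
    """Build a files URL in the same shape as the curl examples."""
    url = f"{base_url.rstrip('/')}/projects/{quote(project)}/files/{quote(path)}"
    return url + "?zip=1" if as_zip else url


def download(base_url: str, project: str, path: str, dest: str) -> str:
    """Download a single file to `dest` and return the local path."""
    local_path, _headers = urlretrieve(file_url(base_url, project, path), dest)
    return local_path
```

For example, `download("http://your-server:5000", "my-project", "checkpoints/model.pt", "model.pt")` fetches the same file as the second curl command.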
Tensorboard
Tensorboard starts automatically when training starts and stops when training stops. It runs from the project’s own environment, so it uses whatever version of Tensorboard is in the project’s requirements.
The port is allocated dynamically starting at 6006. You can:
- View it inline in the iframe on the project page
- Expand the iframe to full height
- Open it directly in a new browser tab
- Clear accumulated Tensorboard logs with the Clear Tensorboard Logs button
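Dynamic port allocation starting at 6006 usually means “try 6006, and move up if it is taken”. A minimal sketch of that idea follows; how Beekeeper actually picks the port is not documented here, and `first_free_port` is a hypothetical helper.

```python
import socket


def first_free_port(start: int = 6006, attempts: int = 100, host: str = "127.0.0.1") -> int:
    """Return the first port >= start that can currently be bound on host."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use, try the next one
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

The practical consequence is that a second project training at the same time will not get 6006, so use the link on the project page rather than bookmarking a fixed port.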
REST API
Beekeeper exposes a REST API for programmatic control of projects. All endpoints return JSON with a consistent format:
{"success": true, "data": {...}}
{"success": false, "error": {"code": "...", "message": "..."}}
Key Endpoints
| Action | Method | Endpoint |
|---|---|---|
| List projects | GET | /api/v1/projects |
| Start training | POST | /api/v1/projects/<name>/training/start |
| Stop training | POST | /api/v1/projects/<name>/training/stop |
| Check status | GET | /api/v1/projects/<name>/training/status |
| Get logs | GET | /api/v1/projects/<name>/logs?tail=100 |
| Get metrics | GET | /api/v1/projects/<name>/tensorboard/latest |
| List files | GET | /api/v1/projects/<name>/files |
| Download file | GET | /api/v1/projects/<name>/files/<path> |
| System stats | GET | /api/v1/stats |
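For scripting against these endpoints, a thin wrapper around the response envelope keeps error handling in one place. The sketch below uses only the standard library; `api_url`, `unwrap`, `call`, and `BeekeeperAPIError` are hypothetical names, and the only things assumed about the API are the paths in the table and the `success` / `data` / `error` envelope shown earlier.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


class BeekeeperAPIError(Exception):
    """Raised when a response envelope has success == false."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def api_url(base_url: str, *parts: str) -> str:
    """Join path segments under /api/v1, URL-quoting each segment."""
    path = "/".join(quote(part, safe="") for part in parts)
    return f"{base_url.rstrip('/')}/api/v1/{path}"


def unwrap(envelope: dict) -> dict:
    """Return the `data` payload, or raise if the envelope reports an error."""
    if envelope.get("success"):
        return envelope.get("data", {})
    error = envelope.get("error") or {}
    raise BeekeeperAPIError(error.get("code", "unknown"), error.get("message", ""))


def call(method: str, url: str, timeout: float = 10.0) -> dict:
    """Send a request with no body and return the unwrapped JSON payload."""
    with urlopen(Request(url, method=method), timeout=timeout) as response:
        return unwrap(json.load(response))
```

Usage would look like `call("GET", api_url(base, "projects", "my-project", "training", "status"))` followed by `call("POST", api_url(base, "projects", "my-project", "training", "start"))`. Note that `urlopen` raises `urllib.error.HTTPError` for non-2xx responses, so errors returned with an HTTP error status will surface as that exception rather than as `BeekeeperAPIError`.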
TensorBoard Metrics Analysis
The /tensorboard/latest endpoint analyzes your training metrics and returns insights:
curl 'http://your-server:5000/api/v1/projects/my-project/tensorboard/latest?detail=medium'
Which run is analyzed?
- If training is running, it analyzes the current active run
- If training is idle, it analyzes the most recent completed run
- The response includes is_active: true/false to indicate which
To compare with past runs:
# List all runs
curl http://your-server:5000/api/v1/projects/my-project/runs
# Get metrics for a specific past run
curl http://your-server:5000/api/v1/projects/my-project/runs/3/metrics
The response includes trend analysis, convergence detection, and anomaly detection for each metric:
- trend: improving, stable, worsening, or unstable
- converged: boolean indicating if the metric has stabilized
- anomalies: array of unusual spikes or drops
- summary: human-readable interpretation
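A script or agent can use these fields to decide which metrics deserve a closer look. The sketch below assumes the per-metric results are available as a mapping from metric name to a dict with the `trend`, `converged`, and `anomalies` fields listed above; the exact nesting of the real response is not documented here, so treat the input shape and the `metrics_needing_attention` helper as hypothetical.

```python
def metrics_needing_attention(metrics: dict[str, dict]) -> list[str]:
    """Return the names of metrics that are worsening, unstable, or have anomalies."""
    flagged = []
    for name, info in metrics.items():
        bad_trend = info.get("trend") in {"worsening", "unstable"}
        has_anomalies = bool(info.get("anomalies"))
        if bad_trend or has_anomalies:
            flagged.append(name)
    return sorted(flagged)
```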
Agent Integration
Beta Feature — Available in the develop branch. Not yet in stable release.
Beekeeper can be controlled by AI agents (like Claude Code). Each project page has an API section with two subsections:
- Human: curl examples for command-line use
- Agent: downloadable instructions file
Setting Up an Agent
- Open your project in Beekeeper
- Expand API → Agent
- Click Download BEEKEEPER_<project>.md (or use the curl command shown)
- Add the file to your project’s root directory or ~/.claude/
The downloaded file contains:
- Quick reference table of all endpoints
- Pre-flight check guidance (always check status before start/stop)
- Terminology mapping (“check logs” → which endpoint to use)
- Detailed metrics interpretation guide
- Common workflows
Agents can then control training runs, monitor progress, analyze metrics, and download results via HTTP requests.
Run History
Each project tracks its training runs. Expand the Run History section on the project page to see:
- Start time and duration
- Status (completed, crashed, canceled)
- Git commit at the time of the run
- Download link for archived logs
Run logs are automatically archived when training completes or is stopped. The history is pruned to keep the last 20 runs.
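Pruning to the last 20 runs amounts to keeping the tail of the list. A minimal sketch, assuming the history is stored oldest first; `prune_history` and `MAX_RUNS` are hypothetical names and not taken from Beekeeper’s code.

```python
MAX_RUNS = 20  # matches the limit described above


def prune_history(runs: list[dict], keep: int = MAX_RUNS) -> list[dict]:
    """Keep only the most recent `keep` runs (input ordered oldest first)."""
    return runs[-keep:] if keep > 0 else []
```

If you need logs from older runs, download them from Run History before they age out.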
Organizing Projects
As your project list grows, the dashboard gives you two tools to stay organized.
Sort Order
A toggle in the Projects header switches between:
- Last Run (default) — projects you’ve trained most recently float to the top. Projects that have never been run sink to the bottom.
- A–Z — alphabetical order.
Your preference is saved in the browser and remembered across sessions.
Pinning
Click the 📌 icon on any project row to pin it. Pinned projects always appear above the sorted list, regardless of sort order. Click again to unpin. Pin state is saved in project.json on the server.
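The combined ordering (pinned first, then either Last Run or A–Z) can be expressed as a single sort key. This is a sketch of the rule as described, not the dashboard’s actual code: the `pinned`, `last_run`, and `name` fields, the `"az"` mode string, and the choice to sort pinned projects by the same rule among themselves are all assumptions.

```python
def order_projects(projects: list[dict], mode: str = "last_run") -> list[dict]:
    """Sort pinned projects first, then by last run (newest first) or by name."""

    def key(project: dict):
        pinned_rank = 0 if project.get("pinned") else 1
        name = project["name"].lower()
        if mode == "az":
            return (pinned_rank, name)
        last_run = project.get("last_run")  # hypothetical: epoch seconds, or None if never run
        never_run = last_run is None  # False sorts before True, so never-run projects sink
        return (pinned_rank, never_run, -(last_run or 0), name)

    return sorted(projects, key=key)
```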
System Stats
The dashboard shows live GPU, CPU, and memory stats, updated every 2 seconds. GPU monitoring uses nvitop for detailed per-GPU utilization, temperature, VRAM, and power draw.
Project Structure
Beekeeper organizes everything under its install directory:
- projects/ — one subdirectory per project
  - project.json — config and state
  - workspace/ — cloned git repo
  - venv/ — Python environment
  - train.log — training output
- app.py — Flask app
- routes/ — HTTP endpoints
- services/ — business logic
- models/ — data models
- templates/ — Jinja2 templates
- static/ — CSS and JS
- setup.sh — installation script
Notes
- Beekeeper runs with a single Gunicorn worker (-w 1) because training state is tracked in memory. The setup script configures this automatically.
- Training processes are fully detached — they survive browser disconnects, but not server reboots. After a reboot, the systemd service restarts Beekeeper, but any previously running training jobs will need to be restarted from the UI.
- There’s no authentication yet. Don’t expose Beekeeper to the public internet without putting it behind a reverse proxy with auth.