About WareMax & Skelf Research

A Rust-core discrete-event simulator and Gymnasium environment for warehouse-robotics dispatching — built so that (seed, action sequence) ⇒ trajectory is a property, not a hope.

What it is

WareMax is an open-source project from Skelf Research. It is a Cargo workspace with two surfaces: a Rust CLI (waremax) for deterministic simulations, parameter sweeps, A/B tests using Welch’s t-test, and benchmarking with regression detection; and a Python extension, built with maturin, that exposes a Gymnasium environment (WaremaxAllocEnv) usable from stable-baselines3 and sb3-contrib.
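For readers unfamiliar with the statistic behind the A/B comparison, here is a minimal Python sketch of Welch’s t-test. It is not the waremax CLI’s code: the function name welch_t and the sample numbers are hypothetical. It computes only the t statistic and the Welch-Satterthwaite degrees of freedom; turning those into a p-value needs a Student-t CDF, which the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples.

    Hypothetical helper for illustration; not part of WareMax.
    """
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / na
    vb = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up throughput samples from two policies, one value per seed.
baseline = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(baseline, candidate)
```

Unlike the pooled-variance t-test, Welch’s variant does not assume the two arms share a variance, which is why it suits comparing policies whose run-to-run spread may differ.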

What it simulates

WareMax models Robotic Mobile Fulfillment Systems — pod-to-person warehouses in the style of Kiva / Amazon Robotics. Robots are AMRs on a graph topology; stations are pick stations with concurrency and lognormal service times; orders arrive by a Poisson process with negative-binomial line counts and a Zipf SKU popularity model. The primary lever it studies is task allocation: which robot handles which pick task.
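The order model above can be sketched in a few lines of Python. This illustrates the three distributions only and is not WareMax’s generator: every name and default value (generate_orders, rate_per_s, zipf_s, nb_r, nb_p) is an assumption, the "1 +" shift that guarantees at least one line per order is a choice made here, and Python’s random module uses a Mersenne Twister rather than ChaCha8.

```python
import random
from itertools import accumulate


def neg_binomial(rng, r, p):
    """Failures observed before the r-th success, success probability p."""
    failures = successes = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures


def generate_orders(seed, n_orders, rate_per_s=0.5, n_skus=100,
                    zipf_s=1.0, nb_r=2, nb_p=0.5):
    """Return a list of (order_id, arrival_time, [sku, ...]) tuples."""
    rng = random.Random(seed)
    # Zipf popularity: the SKU at rank k has weight proportional to 1 / k**s.
    cum_weights = list(accumulate(1.0 / k ** zipf_s
                                  for k in range(1, n_skus + 1)))
    t = 0.0
    orders = []
    for order_id in range(n_orders):
        # Exponential inter-arrival gaps give a Poisson arrival process.
        t += rng.expovariate(rate_per_s)
        # Shift by one so every order has at least one line (assumption).
        n_lines = 1 + neg_binomial(rng, nb_r, nb_p)
        skus = rng.choices(range(n_skus), cum_weights=cum_weights, k=n_lines)
        orders.append((order_id, t, skus))
    return orders
```

Calling generate_orders twice with the same seed yields the same order stream, the property that makes a workload replayable across policy comparisons.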

It does not simulate AS/RS cranes, conveyor sortation, fork-AGV tugger trains, or human pickers walking aisles, and it does not import warehouse CAD or DWG files. Scenarios are YAML files describing a graph topology, station list, robot count, traffic capacities, and policy stack.

Determinism

Reproducibility is enforced, not asserted. The core is single-threaded per scenario, uses a ChaCha8 RNG seeded from a u64, and applies canonical (id-based) tie-breaking throughout. The RL control loop wraps the simulator in a strict crossbeam ping-pong handshake, so exactly one side runs at a time. The determinism tests live in waremax-rl/tests/determinism.rs. Along the way, the project fixed several latent HashMap-iteration-order bugs that had made earlier “seeded” results silently irreproducible. See “Verifying determinism” for the full mechanism.
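To make the tie-breaking and iteration-order points concrete, here is a toy Python event loop. It is a sketch under stated assumptions, not WareMax’s Rust core: the function simulate, the robot ids, and the integer timestamps are all hypothetical. It shows two habits: iterating containers in sorted-id order before drawing random numbers, and keying the event queue on (time, id) so simultaneous events pop in a fixed order.

```python
import heapq
import random


def simulate(seed, robot_ids, n_events=20):
    """Toy event loop returning the (time, robot_id) pop order.

    Hypothetical illustration of seeded, id-ordered event handling.
    """
    rng = random.Random(seed)
    queue = []
    # Sort before iterating so each RNG draw goes to the same robot on
    # every run, whatever order the input container happens to yield.
    for rid in sorted(robot_ids):
        # Coarse integer timestamps make ties between robots common.
        heapq.heappush(queue, (rng.randrange(5), rid))
    trajectory = []
    while queue and len(trajectory) < n_events:
        # Tuples compare by time first, then id: a canonical tie-break.
        t, rid = heapq.heappop(queue)
        trajectory.append((t, rid))
        heapq.heappush(queue, (t + 1 + rng.randrange(3), rid))
    return trajectory


# Same seed, different container type and order: identical trajectory.
run_a = simulate(7, {"r2", "r0", "r1"})
run_b = simulate(7, ["r1", "r0", "r2"])
```

Dropping the sorted() call and passing a set of strings would let the result depend on hash ordering, which is the kind of silent irreproducibility described above.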

A finding, not a sales pitch

On the built-in presets, the trained RL dispatchers match the nearest-robot and round-robin heuristics but do not surpass them. The system is bound by capacity and destination contention, so even state-blind round-robin is near-optimal. WareMax reports this as one of its findings and provides the tunable structure (load, congestion, replicas, inventory SKU count) to let you find the regimes where the choice of dispatcher does have leverage.
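For reference, the two baseline heuristics can be sketched as follows. This is an illustrative Python version, not WareMax’s implementation: the function names are hypothetical, and Manhattan distance on a grid stands in for whatever path distance the simulator’s graph topology provides.

```python
from itertools import cycle


def make_round_robin(robot_ids):
    """State-blind dispatcher: cycle through robots in id order."""
    order = cycle(sorted(robot_ids))

    def dispatch(task_pos, robot_pos):
        # Ignores both the task and the robot positions entirely.
        return next(order)

    return dispatch


def nearest_robot(task_pos, robot_pos):
    """Pick the closest robot, breaking distance ties by smallest id."""
    def dist(rid):
        x, y = robot_pos[rid]
        # Manhattan distance is an assumption made for this sketch.
        return abs(x - task_pos[0]) + abs(y - task_pos[1])

    return min(robot_pos, key=lambda rid: (dist(rid), rid))


# Hypothetical robot positions on a grid, keyed by robot id.
positions = {0: (0, 0), 1: (5, 5), 2: (9, 0)}
rr = make_round_robin(positions)
rr_picks = [rr((8, 1), positions) for _ in range(4)]
nearest_pick = nearest_robot((8, 1), positions)
```

Round-robin never looks at the state, so when it performs as well as a state-aware rule, that suggests the bottleneck lies elsewhere, such as station capacity or contention at destinations.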

License and source

MIT-licensed. Source of truth: github.com/Skelf-Research/waremax. Authoritative docs: docs.skelfresearch.com/waremax/.

Who built it

Skelf Research is a research group that publishes practical, narrowly-scoped tooling. WareMax backs an ongoing research effort on warehouse dispatching and reward design under controllability constraints. Issues and pull requests on the GitHub repository are the canonical channel.