Question 1

What is Umbra?

Accepted Answer

Umbra is a marketplace for private-prompt inference on attested Apple Silicon. It is an OpenAI- and Anthropic-compatible API for open-weight, uncensored, and community GGUF models the incumbents will not host. Providers run the models on their own Apple Silicon Macs; developers reach them by swapping their SDK base URL.

Question 2

How is my prompt kept private?

Accepted Answer

Your prompt is decrypted in memory only, used to generate the response, then zeroized: never logged, written to disk, or stored, not even by the machine owner. The design assumes the provider is adversarial and closes the software paths rather than relying on a promise.

Question 3

Is the hardware attestation live in alpha?

Accepted Answer

Partly. The coordinator runs in an attested SEV-SNP confidential VM, and Apple Managed Device Attestation of provider hardware is live, so the reference provider runs at the hardware tier. Independent runtime code identity (code_attested) and signed per-response receipts are still rolling out during alpha.

Question 4

What are the trust levels?

Accepted Answer

From weakest to strongest: none < self_signed < hardware < code_attested. Here, hardware means Secure Enclave identity plus Apple Managed Device Attestation (MDA) validation, freshness, and serial and prompt-key binding; code_attested adds an independent proof of the running code identity. The verifier currently caps real providers at hardware.
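
The ordering above can be sketched as a comparable enum. This is only an illustration: the tier names come from the answer, but the integer values, the `effective_level` helper, and the `meets` helper are hypothetical and not part of any Umbra API.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    """Trust tiers, ordered weakest to strongest as listed in the answer."""
    NONE = 0
    SELF_SIGNED = 1
    HARDWARE = 2
    CODE_ATTESTED = 3


# Tier names as written in the text; the numeric values are illustrative only.
_BY_NAME = {
    "none": TrustLevel.NONE,
    "self_signed": TrustLevel.SELF_SIGNED,
    "hardware": TrustLevel.HARDWARE,
    "code_attested": TrustLevel.CODE_ATTESTED,
}


def effective_level(claimed: str, cap: TrustLevel = TrustLevel.HARDWARE) -> TrustLevel:
    """Hypothetical helper: apply a verifier cap to a claimed tier."""
    return min(_BY_NAME[claimed], cap)


def meets(level: TrustLevel, required: TrustLevel) -> bool:
    """True if `level` is at least as strong as `required`."""
    return level >= required
```

With the cap at hardware, a provider claiming code_attested is treated as hardware, so a check that requires code_attested does not pass.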

Question 5

How much does it cost, and what is the free credit?

Accepted Answer

Pricing is pay-per-token and platform-set: the same per-model rate for every provider, served live from GET /v1/models, with free signup credit and no card required. Any "free tokens" figure is an estimate; the real token count depends on the model and your input/output mix.
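
To see why a "free tokens" figure can only be an estimate, here is a rough sketch of the arithmetic. It assumes separate input and output prices quoted per million tokens; the function name, parameters, and the numbers in the usage note are hypothetical, and the real rates are whatever GET /v1/models reports.

```python
def estimate_tokens(credit_usd: float,
                    input_price_per_mtok: float,
                    output_price_per_mtok: float,
                    output_fraction: float) -> int:
    """Estimate how many total tokens a credit balance buys.

    output_fraction is the share of tokens that are output (0.0 to 1.0).
    Prices are assumed to be in USD per one million tokens.
    """
    if not 0.0 <= output_fraction <= 1.0:
        raise ValueError("output_fraction must be between 0 and 1")
    # Blended price per million tokens for this input/output mix.
    blended = ((1.0 - output_fraction) * input_price_per_mtok
               + output_fraction * output_price_per_mtok)
    if blended <= 0:
        raise ValueError("blended price must be positive")
    return int(credit_usd / blended * 1_000_000)
```

For example, with made-up prices of 0.25 (input) and 0.75 (output) per million tokens, 1.00 of credit buys about 4,000,000 tokens if everything is input but about 2,000,000 at an even input/output split, which is why the same credit maps to different token counts.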

Question 6

Can I actually pay, or is everything simulated?

Accepted Answer

Developer top-up is live: you buy prepaid inference credit with a card via Stripe, and the charges are real. Credit is spendable on inference only, non-refundable, non-withdrawable, and has no cash value.

Question 7

When can providers get paid out?

Accepted Answer

Not yet: cross-border payout support is still being worked out, so no money is paid to providers during alpha. Earnings accrue in the meantime, but accrued amounts are projected, non-binding, and not withdrawable, and future payouts would be governed by separate terms.

Question 8

How do provider earnings work?

Accepted Answer

Earnings combine base pay (for being attested, online, and ready, scaled by resident model footprint) with an on-demand per-token component based on catalog prices: a majority share of metered on-demand revenue. Withdrawable earnings require the code_attested tier, which is why payouts are gated during alpha.
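
As a rough sketch of how the two components add up, the function below is illustrative only. The answer does not state the actual formula, so the linear scaling by footprint, the rate, and the share value are all assumptions; the only constraint taken from the text is that the provider share is a majority (above one half).

```python
def projected_earnings(hours_ready: float,
                       footprint_gb: float,
                       base_rate_per_gb_hour: float,
                       on_demand_revenue: float,
                       provider_share: float) -> float:
    """Hypothetical projection: base pay plus a share of on-demand revenue.

    Assumes base pay scales linearly with resident model footprint and
    ready time; the real scaling is not specified in the source.
    """
    if not 0.5 < provider_share <= 1.0:
        raise ValueError("provider_share must be a majority share (> 0.5)")
    base_pay = hours_ready * footprint_gb * base_rate_per_gb_hour
    on_demand_pay = on_demand_revenue * provider_share
    return base_pay + on_demand_pay
```

Any figure this produces is a projection in the same sense as the accrued amounts described above: non-binding and not withdrawable during alpha.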

Question 9

What Mac do I need to host?

Accepted Answer

An Apple Silicon Mac (M1 or newer) running macOS; Intel is not supported. Models are matched to unified memory (the console shows each model's minimum), and machines with roughly 16 GB or more can host models that qualify for base pay. Install with one command: curl -fsSL https://tryumbra.dev/setup | sh.

Question 10

Where do the models come from?

Accepted Answer

Every model is a public, open-weight GGUF on Hugging Face, selected and pulled by a provider with their own HF key; there are no buyer uploads or secret weights. For each one, /v1/models and the console surface the HF repo, pinned revision, license, GGUF SHA-256, architecture, quant, and minimum memory.

Question 11

Does Umbra allow uncensored models?

Accepted Answer

Yes. Umbra is content-neutral with provider approval: many models are uncensored, and you may use them for any lawful purpose, including legitimate uses mainstream providers decline to serve. Clearly illegal models and uses (CSAM, fraud, malware, attacks on the network) are prohibited under the Terms of Service.

Question 12

How do I verify a model is the real one?

Accepted Answer

Each catalog entry pins a Hugging Face revision and a GGUF SHA-256 digest, verified on every request, and you should review the model's HF page and license before using it. Attestation proves the machine is genuine Apple Silicon but does not warrant outputs: verifying the model is your responsibility.
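
If you have a local copy of a GGUF file, you can compare its digest against the SHA-256 listed in the catalog yourself. The sketch below uses only the Python standard library; the function names are hypothetical, and how you obtain the file and the expected digest is up to you.

```python
import hashlib
import hmac


def gguf_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_catalog(path: str, expected_hex: str) -> bool:
    """Compare a local file's digest with the digest published in the catalog."""
    return hmac.compare_digest(gguf_sha256(path), expected_hex.strip().lower())
```

A match shows the bytes are the ones the catalog pins; it says nothing about the model's quality or license, which still need your own review.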

Question 13

Can Umbra hand my prompts to anyone?

Accepted Answer

No: prompts and outputs are never retained, so there is nothing to produce, sell, or hand over. Umbra keeps only content-free account and metering data (email, a one-way hash of each API key, request and token counts, model used, wallet entries), never prompt or response text.

Question 14

How do I get started as a developer?

Accepted Answer

Sign in, mint a scoped API key in the console, then point your OpenAI or Anthropic client at https://api.tryumbra.dev/v1 with your umbra- key. Call GET /v1/models first to pick a model id that is live now rather than hardcoding one, since the catalog is small during alpha.
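
A minimal sketch of that first call, using only the Python standard library. The base URL and the umbra- key prefix come from the answer; the Bearer authorization header and the OpenAI-style `{"data": [{"id": ...}]}` response shape are assumptions based on the stated OpenAI compatibility, and the function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.tryumbra.dev/v1"  # base URL given in the answer


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request (assumes Bearer authentication)."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model ids, assuming an OpenAI-style list response."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(api_key: str, timeout: float = 10.0) -> list[str]:
    """Fetch the ids of models that are live right now."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=timeout) as resp:
        return model_ids(json.load(resp))
```

Picking an id from `fetch_model_ids(...)` at startup, rather than hardcoding one, keeps a client working as the alpha catalog changes.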

Frequently asked questions