Key Takeaways
- Stockholm-based Redpine raised a €6.8 million seed round led by NordicNinja, with Luminar Ventures and node.vc also participating, bringing total funding to about €9 million.
- The startup provides a real-time API that lets AI agents and AI companies query and pay for premium licensed datasets, with a strong focus on scientific and other non-public data.
- Angels from OpenAI, Perplexity and Spotify, along with other notable AI and data founders, back the company, positioning Redpine as infrastructure for high-stakes AI use cases in healthcare, law, finance and research.
- The new capital will fund international expansion and deepen exclusive data partnerships so agents can access more proprietary, domain-specific content via a token-based model.
Quick Recap
Redpine, a Stockholm-based AI data infrastructure startup, has closed a €6.8 million seed round led by NordicNinja, with Luminar Ventures and node.vc joining the round. The funding brings its total capital raised to about €9 million. The round was first reported by specialist SaaS and EU tech news outlets, including The SaaS News and Tech.eu, and on social channels, where it was flagged as breaking funding news for the European AI ecosystem.
Building a Licensed Data API for AI Agents
Redpine’s core product is a headless API that gives AI agents and AI builders on-demand access to licensed, high-quality datasets across sensitive verticals like healthcare, law, finance, and scientific research. Instead of scraping or relying on public web data, agents can query Redpine’s catalogue in real time and pay on a token-based usage model, similar to how developers pay for model inference today.
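To make the token-based usage idea concrete, here is a minimal sketch of how metered access to a licensed catalogue could work in principle. It is not Redpine's API: the class names, the `price_per_1k_tokens` field, the sample records and the word-count "tokenizer" are all hypothetical, and the catalogue is simulated in memory rather than called over a network.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A licensed dataset with an illustrative (made-up) price."""
    name: str
    records: list[str]
    price_per_1k_tokens: float  # hypothetical price in EUR


@dataclass
class MeteredCatalogue:
    """In-memory stand-in for a licensed data API that meters usage."""
    datasets: dict[str, Dataset]
    # Tokens served, keyed by (agent_id, dataset_name).
    usage: dict[tuple[str, str], int] = field(default_factory=dict)

    def query(self, agent_id: str, dataset: str, keyword: str) -> list[str]:
        """Return matching records and record the tokens served to the agent."""
        ds = self.datasets[dataset]
        hits = [r for r in ds.records if keyword.lower() in r.lower()]
        # Crude token count: whitespace-separated words.
        # A real service would presumably use a proper tokenizer.
        tokens = sum(len(r.split()) for r in hits)
        key = (agent_id, dataset)
        self.usage[key] = self.usage.get(key, 0) + tokens
        return hits

    def bill(self, agent_id: str) -> float:
        """Sum the agent's charges across every dataset it has queried."""
        total = 0.0
        for (agent, name), tokens in self.usage.items():
            if agent == agent_id:
                total += tokens / 1000 * self.datasets[name].price_per_1k_tokens
        return round(total, 6)


catalogue = MeteredCatalogue(
    datasets={
        "clinical-notes": Dataset(
            name="clinical-notes",
            records=[
                "Aspirin reduces fever in adults",
                "Ibuprofen dosage guidance for adults",
                "Trial results for a new statin",
            ],
            price_per_1k_tokens=2.0,
        )
    }
)

hits = catalogue.query("agent-1", "clinical-notes", "adults")
print(hits)
print(catalogue.bill("agent-1"))
```

The point of the sketch is the shape of the model: the agent pays for what it retrieves, much like inference billing, and the per-agent, per-dataset usage log is what would let a data owner be compensated and audited.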
The company positions itself as “Spotify for data,” arguing that better licensed access can outperform informal or grey-area data use in both quality and compliance. Earlier disclosures indicated Redpine already offers tens of billions of tokens of premium, multimodal data, spanning text, images, video, audio and code, which can be used for pre-training, fine-tuning and retrieval-augmented generation pipelines. The new seed capital will support further platform development, additional data connectors and security controls, and expansion of its network of proprietary and exclusive data partners.
Why This Round Matters Now
The funding lands at a moment when AI agents are moving from prototypes to production, exposing a bottleneck around access to trustworthy, rights-cleared data. Enterprises in regulated industries are under pressure to adopt AI while avoiding hallucinations and IP risk, which increases demand for curated, licensed datasets that can be audited and governed. Redpine’s model aligns with this shift by offering an infrastructure layer that sits between foundation models and domain-specific content owners.
Redpine is entering a field that includes data annotation and platform players like Scale AI, Appen and Defined.ai. Those incumbents are historically annotation-first and service-heavy, whereas Redpine is API-native and tuned for autonomous agents rather than human-in-the-loop labeling at scale. If agentic frameworks continue to mature and more labs and enterprises look for “data as a service” contracts, this kind of programmable access layer could become a core part of AI stacks in Europe and beyond.
Competitive Landscape and Feature Comparison
Below, Redpine is compared with two data infrastructure competitors in the AI training and agent data space: Scale AI and Defined.ai. Public sources do not disclose like-for-like metrics such as exact context window or per-token pricing for these data providers, so cells are marked as not applicable or not publicly disclosed where appropriate.
AI Data Infrastructure Feature Snapshot
| Feature/Metric | Redpine (subject) | Competitor A: Scale AI | Competitor B: Defined.ai |
| --- | --- | --- | --- |
| Core positioning | Licensed data API for AI agents and labs. | Data annotation and RLHF platform for AI teams. | Data collection and annotation marketplace for AI. |
| Context Window | Not applicable as a data provider rather than a model provider; context depends on downstream models. | Not applicable as a data and labeling provider, not a base model; context depends on client models. | Not applicable as a data provider; context depends on client models. |
| Pricing per 1M tokens | Token-based access to datasets; exact price per 1M tokens not publicly disclosed. | Project- and volume-based pricing; no standard per-token list rate disclosed. | Project-based pricing; no public per-token pricing. |
| Multimodal support | Yes, offers text, images, video, audio and code across domains. | Yes for data collection and labeling across text, image, video, audio. | Yes for multiple data types including speech, text and image. |
| Agentic capabilities | Designed explicitly as an API layer for autonomous AI agents to query licensed data in real time. | Indirect; supports training models used by agents but not built as an agent query API. | Indirect; focuses on supplying datasets rather than powering live agent queries. |
| Primary customers | AI labs, autonomous agent platforms, enterprises in regulated sectors. | Big tech, model labs, enterprises needing labeled data. | Enterprises and AI teams sourcing domain-specific datasets. |
| Geographic focus | Europe-headquartered, expanding globally. | US-headquartered, global operations. | US-headquartered (Seattle area) with European offices, global operations. |
From a strategic standpoint, Redpine appears to “win” on agent-centric design, since its product is built as an always-on licensed data API rather than a services-led annotation shop. Scale AI and Defined.ai remain stronger choices for teams that primarily need custom data collection and labeling projects, but they are less directly focused on powering real-time agent queries into proprietary content.
TechnoTrenz’s Takeaway
In my view, this round is a bullish signal for the emerging “data rails” layer of the AI stack, especially for anyone serious about deploying agents in regulated or high-stakes domains. I think this is a big deal because it validates the idea that access to clean, licensed, domain-specific data is becoming as strategic as the choice of foundation model itself.
For builders, Redpine’s approach could make it easier to stay compliant while still tapping into non-public datasets that actually move the needle on quality. I generally prefer infrastructure plays that align economic incentives between data owners, model builders and end users, and this seed round suggests investors see the same alignment here for the next generation of AI agents.