Prototyping an MCP server for OKO: an R&D deep dive

A highlight of working at Magnopus is the dedicated time and space engineers receive for R&D and experimentation (read more about that here). This allows us to use our creativity and explore ideas in focused sessions outside our day-to-day work. The following is the journey of a recent R&D session of mine.

The idea

In recent months, you may have seen articles and social media posts about "MCP servers" (a type of server that allows AI tools to access external systems) and about connecting "AI assistants" to applications to automate tasks using Large Language Models (LLMs), the powerful AI models trained on vast amounts of data to predict and generate human-like text. Looking at these integrations and the cool things people could do with them, I wanted to explore what integrating with an LLM could do for our platform, OKO, and its users in the future.

Why could this be useful?

One of our aims with OKO is to make 3D experiences available to everyone. This is especially true of the web version, which you can use by simply opening app.oko.live in a web browser. No installation needed!

We want to empower our users to create the cross-reality, connected spaces they imagine with easy-to-use tools, allowing them to express their creativity and share it with others. However, there will always be a learning curve when getting to grips with a new application. As simple as we try to make things, understanding the concepts behind the platform and how to use the tools takes time and can be daunting for some.

So, what if you could simply ask someone more experienced how to get started, or better yet, get them to work alongside you? Integrating an AI assistant into OKO would enable users to take their first steps, create a first draft of a space, or automate repetitive tasks. Beyond that, it creates pathways to improve accessibility for disabled or visually impaired users.

An AI assistant could also be useful when exploring a space. Imagine a virtual museum experience where an AI guide could take you on a personalised tour, answering your questions along the way. Applied responsibly, AI could unlock a whole new set of possibilities for those who wish to use it. The potential is there, and so begins our journey to a prototype!

But first, let’s dive into what an MCP server is.

MCP and MCP servers

Model Context Protocol (MCP) is an open-source standard created by Anthropic that enables AI assistants to connect with external data sources and tools in a standardised way. It defines a communication model that allows AI models to interact with various applications, data sources and services in a controlled manner.

The MCP architecture follows a client-server model in which a client, for example an AI assistant, can connect to one or more servers, each exposing specific capabilities to the client. An MCP server is typically focused on a specific application or service, such as OKO.

In particular, an MCP server can expose a set of tools performing specific operations within an external system. These are the actions the AI assistant will be able to perform via the server – in our case, this will be creating or deleting objects inside OKO. 

A tool consists of three parts:

  • A name and description giving the AI model some context about what the tool is for.

  • A schema that unambiguously describes the data the tool accepts.

  • A function with the code to carry out the operation.

Creating the prototype

This work was partly inspired by the efforts of the PlayCanvas team to create an MCP server for the PlayCanvas editor. We already use the PlayCanvas engine for the web version of OKO, so we're huge fans of their open-source tech and follow them closely.

To get started quickly with a prototype, I leveraged the infrastructure of PlayCanvas’ MCP server repository for our implementation. Remember, the goal was to get a prototype up and running, not to get bogged down in the details of getting the components to communicate.  

Happily, it turns out we can do something very similar. The PlayCanvas version consists of two main parts: the MCP server itself, which plugs into an LLM-powered desktop application (for example, Claude desktop), and a browser extension that runs commands within the editor via an API. These communicate over a WebSocket, effectively allowing Claude desktop to interact with the PlayCanvas editor.

For the OKO version, we can use the same architecture. We'll create:

  • An OKO MCP server that Claude will connect to.

  • A browser extension that will run commands using an OKO console API.

We don't currently have an OKO console API, so we'll need to create one to interface with the extension. The API is a set of commands we expose on the global window object in the browser that can be accessed within the browser JavaScript console. A console API will also be useful for advanced OKO users, which is another benefit of this work. 
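As a sketch of the idea, the console API attaches a small set of functions to the global window object. The okoConsoleApi and createEntity names below match what our browser extension expects later in this post; the function body is OKO-specific and elided here.

export {}; // make this file a module so `declare global` works

// A minimal sketch of exposing a console API on the global window object.
// The real implementation calls into OKO's internals to create the entity.
declare global {
  interface Window {
    okoConsoleApi: {
      createEntity(props: unknown): Promise<{ id: string } | null>;
    };
  }
}

window.okoConsoleApi = {
  async createEntity(props) {
    // ...create the entity inside the running OKO space...
    return null; // placeholder: the real implementation returns the new entity
  },
};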

OKO MCP server tools

As mentioned, an MCP server exposes a set of tools to the LLM-powered host application. We need to think about what tools would be useful for the prototype and could produce an interesting demo. 

An OKO space consists of entities and components, concepts that will be familiar to anyone who has worked with a game engine using an entity-component system. In short, an entity is an object that exists at a location in a space and is made of components that give the object an appearance and behaviours.

Entities and components are therefore key to allowing the LLM to act within a space. To start with, we'll create the following tools:

  • Query what entities and components are in the space.

  • List what primitives are available to create in a space.

  • Add a new entity to the space.

  • Add a new model component to an entity.

Model components in OKO allow you to add a 3D model to an entity. In OKO, users can upload their own 3D models or search Sketchfab for suitable assets. For our prototype, we'll expose OKO's built-in primitive assets (sphere, cube, etc). This minimal set of tools should give us enough to prove the concept.

Creating an MCP server tool

To create a tool for our MCP server, we need to do a few things. Let's take creating a new entity in OKO as an example. An entity has the following properties:

  • A name.

  • A position, rotation, and scale in 3D space.

  • A set of components belonging to the entity.

First, we define the shape of the data we expect as a Zod schema. Zod (https://zod.dev/) is a popular schema validation library used for exactly this purpose.

import { z } from "zod";

// Shapes for vectors and quaternions (field names assumed here for illustration).
const Vec3Schema = z.object({ x: z.number(), y: z.number(), z: z.number() });
const QuaternionSchema = z.object({
  x: z.number(), y: z.number(), z: z.number(), w: z.number(),
});

// Stand-in for the union of component schemas, defined elsewhere in the server.
const ComponentPropsTypesSchema = z.unknown();

export const EntitySchema = z.object({
  name: z.string().optional().describe("The name of the entity."),
  position: Vec3Schema.describe(
    "The position of the entity in local space (x, y, z)."
  ),
  rotation: QuaternionSchema.describe(
    "The rotation of the entity in local space (qx, qy, qz, qw as a quaternion)."
  ),
  scale: Vec3Schema.describe(
    "The scale of the entity in local space (sx, sy, sz)."
  ),
  components: z
    .array(ComponentPropsTypesSchema)
    .describe("An array of components belonging to the entity.")
});

The descriptions here are important as they will provide information to the LLM about what each property is for.

Secondly, we register a tool that uses our schema.

import { type McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { type WSS } from "../wss";
import { EntitySchema } from "./schemas/entity";

export const register = (server: McpServer, wss: WSS) => {
  server.tool(
    // The name and description give the model context about the tool.
    "create_entity",
    "Create a single entity",
    // The schema describing the input data the tool accepts.
    {
      entity: EntitySchema,
    },
    // The function executed when the AI assistant invokes the tool.
    ({ entity }) => {
      return wss.call("entities:create_entity", entity);
    }
  );
};

Here, "create_entity" is our tool’s name, followed by a short description. This registers a tool that takes input data following our schema and creates a new entity in the space.

The last argument in the server.tool definition is a function that gets executed when the AI assistant invokes the tool. Here we are simply sending a command over a WebSocket with the entity creation data, following a custom protocol.
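We won't go into the WSS wrapper in detail, but conceptually its call method sends a named command to the browser and waits for a matching response. Here's a minimal sketch, assuming a simple JSON message format with request ids; the actual protocol differs in the details:

import { WebSocketServer, type WebSocket } from "ws";

// A minimal sketch of a request/response wrapper over a WebSocket.
export class WSS {
  private socket?: WebSocket;
  private nextId = 0;
  private pending = new Map<number, (result: unknown) => void>();

  constructor(port: number) {
    // Accept a connection from the browser extension.
    const server = new WebSocketServer({ port });
    server.on("connection", (socket) => {
      this.socket = socket;
      socket.on("message", (raw) => {
        // Match each response to its pending request by id.
        const { id, result } = JSON.parse(raw.toString());
        this.pending.get(id)?.(result);
        this.pending.delete(id);
      });
    });
  }

  // Send a named command with its payload and await the browser's response.
  call(name: string, params: unknown): Promise<unknown> {
    const id = this.nextId++;
    this.socket?.send(JSON.stringify({ id, name, params }));
    return new Promise((resolve) => this.pending.set(id, resolve));
  }
}

On the other side, our browser extension will receive the command and invoke the appropriate call in OKO.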

wsc.method("entities:create_entity", async (entityProps) => {
  console.log(`entities:create_entity`);

  const okoApi = window.okoConsoleApi;

  const entity = await okoApi.createEntity(entityProps);

  if (!entity) {
    return { error: "Failed to create entity" };
  }

  console.log(`Created entity ${entity.id}`);

  return { data: toJson(entity) };
});

This function will be called when the browser extension receives an "entities:create_entity" message. Here we retrieve the OKO console API from the global window object and call its entity creation function. I mentioned earlier that we don't have a console API yet, so we'll also need to create one (as sketched earlier). Following a similar process, we can implement the remaining tools from our list.
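For example, the tool that adds a model component to an entity could be registered inside the same register function, following the same pattern. This is a sketch only; the parameter names and the exact set of primitives in the prototype may differ:

// Sketch: add a model component with a built-in primitive to an entity.
server.tool(
  "add_model_component",
  "Add a model component with a primitive shape to an entity",
  {
    entityId: z.string().describe("The id of the entity to modify."),
    primitive: z
      .enum(["cube", "sphere", "cylinder", "cone", "plane"])
      .describe("The built-in primitive asset to use for the model."),
  },
  ({ entityId, primitive }) => {
    return wss.call("components:add_model_component", { entityId, primitive });
  }
);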

Does it work?

To test our MCP server, we'll use Claude desktop. Because we're using MCP, our prototype will be compatible with many AI assistants and tools, but Claude desktop is among the most widely used and the easiest to set up.

Having first installed our browser extension, we open Claude desktop's config file (via File > Settings > Developer > Edit Config) and add our server.

{
  "mcpServers": {
    "oko": {
      "command": "cmd",
      "args": [
        "/c",
        "npx",
        "tsx",
        "C:\\path\\to\\mcp\\server.ts"
      ]
    }
  }
}
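The cmd /c wrapper is Windows-specific. On macOS or Linux, the equivalent entry would point command directly at npx, something like:

{
  "mcpServers": {
    "oko": {
      "command": "npx",
      "args": ["tsx", "/path/to/mcp/server.ts"]
    }
  }
}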

Success! We can see our server is running.

Let's try a few examples. First, we try with the prompt: "Create a 3x3 grid of cubes in OKO. Make them different colours".

It's working! You can see it’s adding each cube into the space one by one. An improvement here would be to implement a tool for batch-creating entities to speed up the process.
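Such a batch tool could reuse our existing schema. Here's a sketch of what it might look like (we didn't implement this in the prototype):

// Sketch: create several entities in one tool call to avoid round trips.
server.tool(
  "create_entities",
  "Create multiple entities in a single call",
  {
    entities: z.array(EntitySchema).describe("The entities to create."),
  },
  ({ entities }) => {
    return wss.call("entities:create_entities", entities);
  }
);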

Let's try something a bit more ambitious: “Create a level with platforms and ramps to jump on using primitive shapes. Add some decoration”. Here’s what Claude created. We can even try it out!

Going even further: "Make a rendition of the Manhattan skyline using primitive shapes".

I suppose you can only go so far with primitive shapes, but you can see what Claude is trying to do here. Notice how it names all the objects in the space. These examples look quite basic, but we can start to see the possibilities. Not bad for a first go!

Learnings

With R&D and prototyping in general, it's important to focus on the actual creation rather than getting bogged down by time-consuming infrastructure setup or boilerplate work.

By learning from and adapting an existing project, we were able to reduce the setup and get straight into the interesting part: the OKO MCP server itself. Don't be afraid to lean on open-source and build on other projects to try out your idea!

That said, we did end up spending a chunk of time on creating a console API for OKO, which we didn't expect going into the project. In hindsight, perhaps there was a simpler way of achieving the same thing? That time spent also limited the number of tools we ended up creating for our prototype.

This mini-project also shows the value of just jumping in sometimes. Do we understand all the ins and outs of MCP servers? No, but we have learnt a lot! By simply doing, we can get far, often further than we think.

What's next?

This made for an interesting prototype, but it remains a prototype. The setup is a little clunky with many moving parts, so we need to consider simplifying it and making it more user-friendly. It's not ideal to need a browser extension, for example, so we need to think of alternatives.

We've exposed only a very minimal subset of OKO's functionality in our MCP server. By exposing more tools, we can greatly expand what the assistant can do and make it far more compelling to use.

Finally, if we did proceed with this as a product feature, we'd need to consider integration. Ideally, we'd want an AI assistant directly inside the OKO UI, so you could chat with it within OKO and ask it to create a first draft of a space, or help you figure out a problem you're having. That kind of direct assistance could change how users build and explore in OKO.
