Throughout most of 2024 I was part of the team responsible for building the Early Access version of Replit Agent ↗.
Replit ↗ is an online software development platform, accessible through the browser, and backed by virtual machines running in the cloud. This cloud-native architecture provides unique advantages over browser-based IDEs that run the code in the browser itself. Almost coincidentally, these advantages can also be useful to LLMs, which created a foundation for developing an autonomous Agent capable of creating software on the platform.
The target user for the Replit Agent is not a professional software engineer, but a "local developer" ↗ (after Bonnie Nardi), or a "barefoot" one ↗ (after Maggie Appleton). This, in many ways, makes the work part of the End-User Programming ↗ canon.
For a long time, my personal hope was that we would create or discover a magical programming system ↗ (a system, not a language) with clear heuristics, and an interface that makes programming not feel like programming ↗.
For a while after LLMs started getting better at outputting code, I thought that none of that mattered anymore: if code can be generated from requests stated in plain English, why would we care about anything other than the results of executing it?
After working on the Agent, I believe those original learnings are as important as ever. We still need a programming system, one where humans and LLMs can collaborate on creating software. The interfaces matter even more now that there's one more "player" in the mix: in addition to the user interacting with the environment, there's the LLM, which interacts with both.
LLMs are pretty good at generating (specific kinds of) code, and only getting better. But, maybe unsurprisingly, just throwing code at the problem doesn't lead to the best results: the LLM operates blindly, with no feedback from the environment. It turns out that LLMs can be much more powerful when given proper tools.
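To make that concrete, here is a minimal sketch of such a feedback loop. The function names and message shapes are my own illustration, not Replit's implementation: the model proposes an action, the environment actually runs it, and the observation is fed back so the next step isn't blind.

```python
# A minimal sketch of an environment feedback loop; `call_llm` is a
# hypothetical stand-in for any model API, passed in as a callable.
import subprocess

def run_in_environment(command: str) -> str:
    """Execute a shell command and return its combined output,
    so the model can see what actually happened."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=60
    )
    return result.stdout + result.stderr

def agent_step(call_llm, history: list[dict]) -> list[dict]:
    """One turn: the model proposes an action, the environment responds,
    and the observation is appended for the next turn."""
    action = call_llm(history)  # e.g. {"tool": "shell", "input": "pytest -q"}
    observation = run_in_environment(action["input"])
    history.append({"action": action, "observation": observation})
    return history
```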
A lot of work on this project centered on giving the Agent well-designed interfaces to the environment (and to the user), for example:

- editing files and reading the resulting diagnostics,
- executing code and observing its output,
- installing packages,
- asking the user for feedback or clarification.
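As an illustration of what one of these interfaces might look like from the model's side, a tool can be described to the LLM declaratively, as a name, a purpose, and a typed input shape. The schema below is a hypothetical sketch, not Replit's actual format:

```python
# Hypothetical tool description for a package-installation interface.
# The name, fields, and enum values are assumptions for this sketch.
INSTALL_PACKAGE_TOOL = {
    "name": "install_package",
    "description": (
        "Install a package into the project's environment. "
        "Returns the installer's textual output, including any errors."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "language": {"type": "string", "enum": ["python", "nodejs"]},
            "package": {"type": "string"},
        },
        "required": ["language", "package"],
    },
}
```

The design pressure is the same as in any UI work: the description has to make the tool's purpose and failure modes legible to its user, except the user here is a model choosing between tools.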
Interestingly, we built almost none of these tools from scratch: our IDE already had them. We only had to surface and package them for the Agent. In many ways this is a UI problem; it's just that the user in question is an LLM, and the interface has to be synchronous and textual.
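For instance, a capability the IDE already computes, like language-server diagnostics rendered as squiggly underlines, can be re-packaged for the LLM as a blocking call that returns plain text. A sketch, assuming a hypothetical `Diagnostic` record and an existing collection step:

```python
# Re-packaging an IDE capability for an LLM: same data the editor shows
# as decorations, delivered synchronously as one textual block.
from dataclasses import dataclass

@dataclass
class Diagnostic:
    path: str
    line: int
    message: str

def diagnostics_as_text(diagnostics: list[Diagnostic]) -> str:
    """Render diagnostics in a form an LLM can consume directly."""
    if not diagnostics:
        return "No problems found."
    return "\n".join(f"{d.path}:{d.line}: {d.message}" for d in diagnostics)
```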
In this project, I was responsible for early research, design, and prototyping, both for the user-facing interface to the Agent, and for the Agent-facing interfaces to the environment and to the user.
The prototypes had a lot of influence on the final product, where I also contributed production code, ranging from implementing some of the Agent's tools to building parts of the user interface. * ↗
Published on 2024-09-05.