awaBerry Version 2: A New Dimension of Device Automation

Version 2 marks the transition from a remote access platform to a full AI-native automation platform. Harald Hagen walks through what changed, what is new, and why the combination of the Smart Automation Framework and the Agentic API represents something fundamentally different.

Every version release is a milestone, but version 2 is something different in kind. It is the moment where awaBerry stops being a platform that gives you access to your devices and becomes a platform that makes your devices work for you — autonomously, on a schedule, across the whole fleet, without human intervention for each run.

I want to walk through what is actually in version 2, and more importantly, why we built it the way we did.

The Context: Why Automation Had to Come Next

When we shipped the first versions of awaBerry, the problem we were solving was access. Getting to a device without a VPN, without open ports, from anywhere in the world — that problem was not solved well by anything that existed, especially for the kinds of devices our users care about: edge hardware, SoC boards, machines behind NAT, containers without public IPs. We built awaBerry Anywhere to solve that.

But access is not an end in itself. It is a prerequisite. What users actually want is not to be able to reach a device — they want the device to do something. And doing something repeatedly, reliably, on a schedule, without a person initiating each run, is automation.

So version 2 is the automation layer. And because we had already built the access layer, we could build the automation layer correctly — not as a separate product, but as a natural extension of the infrastructure that was already there.

The Smart Automation Framework

The centrepiece of version 2 is the Smart Automation Framework. It is an AI-native approach to automation that I think is architecturally quite different from what most people expect when they hear "AI automation".

Here is the key design decision: we strictly separate the writing phase from the running phase. When you create a new automation project, a capable reasoning model — Gemini 2.5 Pro — analyses your plain-English instruction, explores the local environment on the target device, and writes a deterministic, executable script (Python, JavaScript, or shell, depending on what makes sense). This happens once. The AI tokens are consumed once. The script is saved.

Every subsequent scheduled or triggered run executes that script directly: local CPU only, zero AI tokens consumed. If you need AI summarisation of the output at runtime, a cheaper model such as Gemini 2.5 Flash Lite is called selectively, at roughly 100 to 500 tokens per run. But the core logic runs at zero token cost, every time, indefinitely.
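To make that arithmetic concrete, here is a small sketch of the token budget for the optional summary step. The 100 to 500 tokens per run is the rough figure quoted above; the hourly schedule and the function name are illustrative assumptions, not awaBerry defaults.

```python
def summary_token_range(runs, low_per_run=100, high_per_run=500):
    """Return (min, max) total tokens spent on optional runtime summaries.

    The per-run bounds are the rough figures quoted in the text; `runs`
    depends on the schedule you choose. Executing the saved script itself
    adds zero tokens, so it does not appear in this calculation.
    """
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return runs * low_per_run, runs * high_per_run


# Illustrative assumption: one run per hour for 30 days = 720 runs.
low, high = summary_token_range(24 * 30)
print(f"720 runs: between {low:,} and {high:,} summary tokens")
```

If the summary step is switched off, the range collapses to zero, which is the case the core design is built around.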

This is not how most AI automation tools work. Most of them re-invoke the LLM on every execution. We thought carefully about that tradeoff and decided against it: calling a model on every run is slow, expensive, and non-deterministic. Your automation should behave the same way on run 1 and run 1,000. AI reasoning is for writing the logic. Execution should be a fast, reliable script.
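The write-once, run-many split can be sketched in a few lines. This is a minimal illustration of the pattern, not the framework's actual implementation: `write_script_with_model` is a hypothetical stub standing in for the reasoning model, and the file name `automation.py` is an assumption.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

model_calls = 0  # counts how often the (stubbed) reasoning model is invoked


def write_script_with_model(instruction: str) -> str:
    """Stand-in for the one-time AI writing phase (hypothetical stub).

    A real system would call a reasoning model here; this stub returns a
    fixed script so the example runs offline.
    """
    global model_calls
    model_calls += 1
    return 'print("disk check ok")\n'


def ensure_script(project_dir: Path, instruction: str) -> Path:
    """Write the script once; reuse the saved file on every later call."""
    script = project_dir / "automation.py"
    if not script.exists():
        script.write_text(write_script_with_model(instruction))
    return script


def run_once(script: Path) -> str:
    """Running phase: execute the saved script directly, no model involved."""
    result = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    outputs = []
    for _ in range(3):  # three "scheduled" runs of the same project
        script = ensure_script(project, "check free disk space")
        outputs.append(run_once(script))

print(f"runs: {len(outputs)}, model calls: {model_calls}")
```

Three runs produce three identical outputs while the model stub is called only once, which is the property the design is after.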

What the Framework Can Automate

The framework handles a wide range of automation categories:

Intelligent web automation — scraping dynamic single-page applications via headless browsers (Playwright / Puppeteer) generated automatically from a plain-English description. Price monitoring, competitor tracking, content aggregation, form automation.
Local system management — parsing log files, managing local databases, categorising and routing documents, file system operations on the device that owns the data.
API and service orchestration — fetching data from external endpoints, transforming payloads, pushing sanitised data to dashboards or CRMs on a schedule.
Legacy system bridging — generating OS-level automation scripts that interact with older software UIs that lack modern APIs.
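To give a feel for the local system management category, a deterministic script of this kind could be as plain as the following. This is a hand-written sketch rather than actual framework output; the log format and the function name are assumptions.

```python
import re
from collections import Counter

# Assumed log format: "<date> <time> <LEVEL> <message>"
LINE_RE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def count_levels(lines):
    """Count log lines per severity level, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("level")] += 1
    return counts


sample = [
    "2025-01-01 10:00:00 INFO service started",
    "2025-01-01 10:00:05 ERROR disk nearly full",
    "2025-01-01 10:00:09 ERROR write failed",
    "not a log line",
]
levels = count_levels(sample)
print(dict(levels))
```

Nothing in it needs a model at runtime: the same input always yields the same counts.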

Scheduling for Remote Desktop and Web-to-Local

Version 2 also extends the scheduling system to cover two of the most-used awaBerry Anywhere features: remote desktop (VNC and RDP) and web-to-local port forwarding. You can now define time-based schedules to automatically activate these connection types on registered devices.

This is particularly useful when automation workflows depend on a visual interface or a locally-running service being available at a predictable time. Instead of a person initiating the connection before the workflow starts, the system handles it automatically.
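One way to picture a time-based schedule is as a daily window attached to a connection type. The sketch below is a hypothetical data model for illustration only; it is not awaBerry's schedule format, and every name in it is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ConnectionSchedule:
    """Hypothetical schedule: keep a connection type active in a daily window."""
    connection_type: str   # e.g. "vnc", "rdp", "web-to-local"
    start: time
    end: time
    weekdays: frozenset    # 0 = Monday ... 6 = Sunday

    def is_active(self, moment: datetime) -> bool:
        """True if `moment` falls inside the window on a scheduled weekday.

        Windows that cross midnight are not handled in this sketch.
        """
        if moment.weekday() not in self.weekdays:
            return False
        return self.start <= moment.time() < self.end


nightly_rdp = ConnectionSchedule(
    connection_type="rdp",
    start=time(1, 45),
    end=time(3, 0),
    weekdays=frozenset(range(5)),  # Monday to Friday
)

# Monday 6 January 2025, 02:00 falls inside the window.
print(nightly_rdp.is_active(datetime(2025, 1, 6, 2, 0)))
```

Here the window opens a quarter of an hour before an assumed 02:00 workflow, so the desktop session is already available when the automation starts.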

Near-Real-Time Connection Speed

There is a third change in version 2 that is less visible but arguably just as impactful: connection speed. Infrastructure improvements in this release reduce the time from initiating a connection to having a usable session down to near-real-time. This matters for interactive use, but it matters even more for automated workflows — where a slow connection startup time multiplies across dozens of devices and hundreds of scheduled runs.

How It All Fits Together

The way I think about the version 2 architecture is as two tightly coupled layers. The Agentic API is the access fabric — it can reach any registered device, anywhere, over a zero-trust HTTPS tunnel, authenticated by a Project Key with precisely scoped permissions. The Smart Automation Framework is the intelligence layer — it knows what to do when it gets there.

Neither is complete without the other. The Smart Automation Framework, on its own, can only automate things on the local machine it is running on. The Agentic API, on its own, gives you access but requires you to write all the logic. Together, they form a closed loop: describe what you want to happen across your fleet, and it happens — on a schedule, on every device, without your ongoing involvement.
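The closed loop can be sketched as a simple fan-out over the fleet. This is an illustration under stated assumptions, not the Agentic API itself: `run_on_device` is a placeholder for whatever call actually reaches a device, and `fake_transport` and the device names are invented so the example runs offline.

```python
from typing import Callable, Dict, Iterable


def run_across_fleet(
    device_ids: Iterable[str],
    run_on_device: Callable[[str], str],
) -> Dict[str, str]:
    """Run one automation on every device and collect a per-device status.

    `run_on_device` is a placeholder for whatever actually reaches the
    device; a failure on one device does not stop the others.
    """
    results = {}
    for device_id in device_ids:
        try:
            results[device_id] = run_on_device(device_id)
        except Exception as exc:  # record the failure and keep going
            results[device_id] = f"failed: {exc}"
    return results


def fake_transport(device_id: str) -> str:
    """Offline stand-in for a remote call, used only for this demo."""
    if device_id == "edge-02":
        raise ConnectionError("device offline")
    return "ok"


report = run_across_fleet(["edge-01", "edge-02", "edge-03"], fake_transport)
print(report)
```

Isolating failures per device matters at fleet scale: one unreachable machine should show up in the report, not abort the run for everyone else.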

That is the platform we set out to build. Version 2 is where it becomes real.