Migrate from Ephemeral to Durable Streaming

Move an existing AI chat app from brittle, single-connection streaming to durable, reconnectable streaming with Workflow DevKit.

If your AI app streams responses over a single HTTP connection, a page reload or network interruption kills the response. The user starts over. The server may still be generating, but the client has no way back in.

This guide shows how to move from ephemeral streaming to durable streaming with Workflow DevKit. After this migration, the workflow keeps running when the client disconnects, and the client reconnects to the same in-progress run.

What changes

|                  | Ephemeral streaming                        | Durable streaming                            |
| ---------------- | ------------------------------------------ | -------------------------------------------- |
| Connection model | Response tied to a single HTTP connection  | Response tied to a durable workflow run      |
| Page refresh     | Response lost, user starts over            | Client reconnects to the same run            |
| Network drop     | Response lost                              | Workflow continues on server, client resumes |
| Retries          | Manual implementation required             | Built into workflow steps                    |
| Observability    | Custom logging                             | Built-in step tracing and Web UI             |
| Local debugging  | Console logs                               | Step debugger with execution trace           |

Example

We will migrate a chat app that streams AI responses using the AI SDK. The app currently uses a standard route handler with streamText.

We will start by wrapping the generation in a workflow, then expose the runId, and finally swap in WorkflowChatTransport for reconnectable client streaming.

Move generation into a workflow

The existing app streams directly from a route handler. The response lives and dies with the HTTP connection.

app/api/chat/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai("gpt-4o"),
    messages,
  });

  return result.toDataStreamResponse();
}

Add the "use workflow" directive and move the generation into a workflow function using DurableAgent. DurableAgent internally executes each LLM call as a durable step, so you do not need to wrap it in a separate step function.

workflows/chat/workflow.ts
import { DurableAgent } from "@workflow/ai/agent"; 
import { getWritable } from "workflow"; 
import type { ModelMessage, UIMessageChunk } from "ai";

export async function chatWorkflow(messages: ModelMessage[]) { 
  "use workflow"; 

  const writable = getWritable<UIMessageChunk>(); 

  const agent = new DurableAgent({
    model: "anthropic/claude-haiku-4.5",
    system: "You are a helpful assistant.",
  });

  await agent.stream({ 
    messages,
    writable,
  }); 
}
app/api/chat/route.ts
import type { UIMessage } from "ai";
import { convertToModelMessages, createUIMessageStreamResponse } from "ai";
import { start } from "workflow/api"; 
import { chatWorkflow } from "@/workflows/chat/workflow";

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();
  const modelMessages = convertToModelMessages(messages); 

  const run = await start(chatWorkflow, [modelMessages]); 

  return createUIMessageStreamResponse({ 
    stream: run.readable, 
  }); 
}

The generation now runs inside a durable workflow. DurableAgent executes each LLM call as a step, so you get automatic retries and observability for every call.
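For comparison, this is roughly the retry logic you would otherwise hand-roll around each LLM call in the ephemeral setup. It is a sketch only: `withRetry` is a hypothetical helper name, not part of any SDK.

```typescript
// A rough sketch of hand-rolled retry logic that DurableAgent's durable
// steps make unnecessary. "withRetry" is a hypothetical helper.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn(); // succeed on any attempt and we're done
    } catch (err) {
      lastError = err; // remember the failure and try the next attempt
    }
  }
  throw lastError; // all attempts exhausted
}
```

With durable steps, this bookkeeping (and persisting progress across process restarts, which a helper like this cannot do) is handled for you.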

Verify by running the app locally and opening the Workflow Web UI:

npx workflow inspect runs --web

Confirm the step appears in the execution trace.

Return the run ID for reconnection

Each workflow execution gets a runId. The client needs this ID to reconnect after a disconnect.

Return the runId in a response header so the client can store it:

app/api/chat/route.ts
// ... existing imports and workflow ...

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();
  const modelMessages = convertToModelMessages(messages);

  const run = await start(chatWorkflow, [modelMessages]);

  return createUIMessageStreamResponse({
    stream: run.readable,
    headers: { 
      "x-workflow-run-id": run.runId, 
    }, 
  });
}

Add a reconnection endpoint that returns the stream for an existing run:

app/api/chat/[id]/stream/route.ts
import { createUIMessageStreamResponse } from "ai";
import { getRun } from "workflow/api"; 

export async function GET(
  request: Request,
  { params }: { params: Promise<{ id: string }> }
) {
  const { id } = await params;
  const { searchParams } = new URL(request.url);

  const startIndexParam = searchParams.get("startIndex"); 
  const startIndex = startIndexParam
    ? parseInt(startIndexParam, 10)
    : undefined;

  const run = getRun(id); 
  const stream = run.getReadable({ startIndex }); 

  return createUIMessageStreamResponse({ stream });
}

The startIndex parameter lets the client resume from the last chunk it received, so no data is duplicated or lost.
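To make the mechanics concrete, here is a minimal sketch of how a client could construct the reconnect URL from a stored run ID and a count of chunks already received. `buildReconnectUrl` is illustrative, not an SDK export; the transport introduced in the next section does this for you.

```typescript
// Illustrative only: build the reconnect URL for an in-progress run,
// resuming after the chunks the client has already received.
function buildReconnectUrl(runId: string, receivedChunks: number): string {
  const params = new URLSearchParams({ startIndex: String(receivedChunks) });
  return `/api/chat/${encodeURIComponent(runId)}/stream?${params.toString()}`;
}
```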

Use WorkflowChatTransport on the client

Replace the default transport with WorkflowChatTransport. This transport handles storing the run ID and reconnecting to in-progress runs automatically.

app/page.tsx
"use client";

import { useChat } from "@ai-sdk/react";
import { WorkflowChatTransport } from "@workflow/ai"; 
import { useMemo, useState } from "react";

export default function ChatPage() {
  const activeRunId = useMemo(() => {
    if (typeof window === "undefined") return;
    return localStorage.getItem("active-workflow-run-id") ?? undefined;
  }, []);

  const { messages, sendMessage, status } = useChat({
    resume: Boolean(activeRunId), 
    transport: new WorkflowChatTransport({ 
      api: "/api/chat",
      onChatSendMessage: (response) => {
        const workflowRunId = response.headers.get("x-workflow-run-id");
        if (workflowRunId) {
          localStorage.setItem("active-workflow-run-id", workflowRunId);
        }
      },
      onChatEnd: () => {
        localStorage.removeItem("active-workflow-run-id");
      },
      prepareReconnectToStreamRequest: ({ api, ...rest }) => {
        const runId = localStorage.getItem("active-workflow-run-id");
        if (!runId) throw new Error("No active workflow run ID found");
        return {
          ...rest,
          api: `/api/chat/${encodeURIComponent(runId)}/stream`,
        };
      },
    }), 
  });

  // ... render your chat UI
}

Verify reconnection by starting a long response, refreshing the page mid-stream, and confirming the client picks up where it left off. Open the Workflow Web UI locally to inspect the step trace and confirm the run continued through the refresh.

Common gotchas

WorkflowChatTransport request body shape

WorkflowChatTransport shapes its POST request body differently from the default AI SDK transport. If you need custom fields in the request body, use the prepareSendMessagesRequest hook:

new WorkflowChatTransport({
  prepareSendMessagesRequest: async (config) => ({
    ...config,
    body: JSON.stringify({
      ...JSON.parse(config.body as string),
      customField: "value",
    }),
  }),
})

See the WorkflowChatTransport API reference for all configuration options.

Streaming must live inside a step

You cannot read from or write to streams directly within a workflow function. All stream operations must happen in step functions. This constraint enables Workflow to track, retry, and observe the streaming operation as a discrete unit. See Streaming for details.
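The constraint is about where the read loop lives, not the loop itself. Framework aside, a stream-consuming unit looks like this plain Web Streams sketch (no Workflow APIs involved); inside a workflow, the entire loop would sit in one step function rather than in the workflow body.

```typescript
// Plain Web Streams (not Workflow APIs): the whole read loop is one
// discrete operation -- the kind of unit that belongs in a single step.
async function drainStream(readable: ReadableStream<string>): Promise<string[]> {
  const chunks: string[] = [];
  const reader = readable.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break; // stream finished
    chunks.push(value);
  }
  return chunks;
}
```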

What you get after migrating

  • Retries built into workflow steps, without custom retry logic
  • Observability through the Workflow Web UI and CLI, without wiring a separate system
  • Local debugging with the step debugger to inspect runs, traces, and step state on your machine
  • Reconnectable streams that survive page refreshes, network drops, and function timeouts