Production YouTube Cookie Auth
How Summry added shared yt-dlp cookie configuration for production YouTube metadata extraction and audio downloads.
Problem
During production testing, yt-dlp failed against a YouTube URL with a bot-check message:
Sign in to confirm you're not a bot.
That failure matters because YouTube access sits near the front of Summry's core workflow. A user submits a URL, the API extracts metadata, the app creates or reuses a video record, and the worker later downloads audio for transcription. If either server-side yt-dlp call gets challenged, the product flow stops before summarization can begin.
Local development can make this feel less urgent because a laptop has a normal browser environment, familiar network reputation, and easier manual debugging. Production is different. The backend runs in a Railway container, and yt-dlp traffic comes from server egress without a browser profile.
Context
Summry uses yt-dlp in two places:
- integrations/youtube_service.py extracts metadata before video creation.
- services/video_processing_service.py downloads audio before transcription.
The first instinct could be to patch only the failing line. That would be brittle. If metadata extraction used cookies but audio download did not, users could still create chat rooms that later fail in the worker. If the worker used cookies but metadata extraction did not, the app might never create the video record.
The more useful fix was to treat yt-dlp configuration as one production integration concern.
Implementation
I added a small shared helper:
integrations/ytdlp_options.py
It builds yt-dlp options from a caller's base options and adds production downloader configuration when secret-managed credentials are available.
The preferred production path is to store downloader credentials as a managed deployment secret and materialize them only at runtime. Raw cookie files are awkward and risky to handle directly, so the app keeps the credential payload out of source control, writes it to an ephemeral runtime location with restrictive permissions, and passes that generated file to yt-dlp.
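A minimal sketch of what such a helper could look like follows. It is not the actual contents of integrations/ytdlp_options.py. The environment variable names, the function names, and the choice of base64 as the encoding are all assumptions made for illustration. The only yt-dlp-specific piece is the cookiefile option, which yt-dlp's Python API accepts as a path to a Netscape-format cookie file.

```python
import base64
import binascii
import os
import stat
import tempfile

# Hypothetical variable names; the post does not say what Summry calls them.
COOKIE_FILE_ENV = "YTDLP_COOKIE_FILE"
COOKIES_B64_ENV = "YTDLP_COOKIES_B64"


def _materialize_cookies(encoded: str) -> str:
    """Decode base64 cookie content into a private temp file and return its path."""
    try:
        payload = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise RuntimeError(f"{COOKIES_B64_ENV} is not valid base64") from exc
    if not payload.strip():
        raise RuntimeError(f"{COOKIES_B64_ENV} decoded to an empty cookie file")
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(prefix="ytdlp-cookies-", suffix=".txt")
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
    return path


def build_ytdlp_options(base_options: dict, env=None) -> dict:
    """Return a copy of base_options, adding `cookiefile` only when credentials exist."""
    env = os.environ if env is None else env
    options = dict(base_options)  # never mutate the caller's dict
    cookie_file = env.get(COOKIE_FILE_ENV)
    encoded = env.get(COOKIES_B64_ENV)
    if cookie_file:
        options["cookiefile"] = cookie_file
    elif encoded:
        options["cookiefile"] = _materialize_cookies(encoded)
    return options
```

With neither variable set, the caller's options come back unchanged, which keeps anonymous access working. This sketch writes a new temporary file on every call. A longer-lived helper would probably cache the generated path per process.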
Both yt-dlp call sites now use the helper:
- metadata extraction wraps its existing quiet, skip_download, and noplaylist options
- audio download wraps its existing format, output template, and FFmpeg postprocessor options
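To illustrate the two call sites, here is a rough sketch of how each one might pass its own base options through the same function. The function names, the stand-in helper, the format string, the output template, and the mp3 codec are assumptions. The option keys themselves (quiet, skip_download, noplaylist, format, outtmpl, postprocessors, cookiefile) are real yt-dlp options.

```python
from typing import Optional


def with_shared_auth(base_options: dict, cookie_file: Optional[str]) -> dict:
    """Stand-in for the shared helper: copy the caller's options, add cookies if set."""
    options = dict(base_options)
    if cookie_file:
        options["cookiefile"] = cookie_file
    return options


def metadata_options(cookie_file: Optional[str] = None) -> dict:
    """Options for the API path: read metadata only, never download."""
    base = {"quiet": True, "skip_download": True, "noplaylist": True}
    return with_shared_auth(base, cookie_file)


def audio_options(output_dir: str, cookie_file: Optional[str] = None) -> dict:
    """Options for the worker path: fetch audio and convert it with FFmpeg."""
    base = {
        "format": "bestaudio/best",  # assumed format selector
        "outtmpl": f"{output_dir}/%(id)s.%(ext)s",  # assumed output template
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}  # assumed codec
        ],
    }
    return with_shared_auth(base, cookie_file)
```

Either dictionary would then be handed to yt_dlp.YoutubeDL(...). Because both paths go through one function, a cookie change cannot reach the API and miss the worker.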
The patch also updates production docs so the deployment behavior is visible:
- docs/deployment.md lists the optional downloader environment variables.
- docs/production-readiness.md explains why cookies belong on both API and worker services.
Why Not Cookies From Browser
yt-dlp suggests --cookies-from-browser for many local workflows, but that is not the right production default here.
The Railway container does not have a stable logged-in Chrome or Firefox profile. Even if a browser profile could be added, it would make the production runtime harder to reason about and harder to rotate safely. A secret-managed cookie file is a cleaner fit for the current Docker-based API and worker setup.
This also keeps app authentication separate from downloader authentication. Summry users sign in with Google to access their own workspace. Those user sessions should not be mixed with server-side YouTube downloader credentials.
Tradeoffs
This improves reliability against one common class of YouTube production failures, but it is still a workaround.
Cookie-based access has operational costs:
- cookies can expire or rotate
- the YouTube account can be challenged or rate-limited
- the Railway egress IP can still develop poor reputation
- secrets need careful handling because YouTube cookies are account credentials
The implementation intentionally does not make cookies required in production. Some public videos may still work anonymously, and forcing the secret would make initial deployment more brittle. The downloader auth is optional, but ready when production traffic needs it.
Tests
The test updates cover the behavior that matters:
- config exposes the new optional yt-dlp environment values
- metadata extraction still uses noplaylist and skip_download
- shared yt-dlp options add a configured cookie file
- encoded credential content is materialized into the generated runtime file
- invalid credential content fails with a clear runtime error
Lessons Learned
Production integrations need to be traced through the whole workflow, not just the line that threw the error.
In this case, yt-dlp was not a single metadata utility. It was part of both the API path and the worker path. Making downloader configuration shared reduced the chance of a partial fix where one stage succeeds and the next stage fails.
This was also a reminder that authentication boundaries should stay explicit. The app's Google session authenticates a Summry user. The yt-dlp cookie file authenticates server-side access to YouTube. Keeping those separate makes the system easier to secure, document, and operate.
Next Steps
After deploying this, the practical follow-up is operational:
- keep downloader credentials isolated from normal user authentication
- store and rotate them through the hosting provider's secret manager
- configure the API and worker consistently
- smoke test metadata extraction and worker audio download with the same video
- watch worker failures for repeated bot-check or rate-limit messages
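For the last item, one lightweight option is to bucket worker error messages so that repeated bot-checks stand out. The sketch below is hypothetical and is not part of the patch. The bot-check match is based on the message quoted earlier in this post. The rate-limit markers (HTTP 429 wording) are an assumption, and real yt-dlp messages can vary by version.

```python
from collections import Counter
from typing import Iterable


def classify_failure(message: str) -> str:
    """Map a downloader error message to a coarse failure category."""
    text = message.lower()
    # Match around the apostrophe so straight and curly quotes both work.
    if "sign in to confirm" in text and "not a bot" in text:
        return "bot_check"
    # Assumed markers for rate limiting; adjust to what the logs actually show.
    if "http error 429" in text or "too many requests" in text:
        return "rate_limit"
    return "other"


def summarize_failures(messages: Iterable[str]) -> Counter:
    """Count failures per category across a batch of worker errors."""
    return Counter(classify_failure(message) for message in messages)
```

A rising bot_check count after a period of clean runs would suggest expired cookies or degraded egress reputation, rather than a problem with one video.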
If YouTube access remains unstable under real usage, the next decision is infrastructure-level: dedicated egress/proxy strategy, cookie rotation, PO-token support, or a third-party transcript/download provider.
Follow-Up: EJS Challenge Solver
After the cookie configuration was accepted, production hit a second YouTube failure:
n challenge solving failed
Cannot read properties of undefined (reading 'origin')
Requested format is not available
That moved the issue from authentication to extraction. yt-dlp now needs a JavaScript runtime and matching EJS challenge solver scripts for some YouTube requests. The production image was updated to install Deno, and the Python dependency was changed from yt-dlp to yt-dlp[default] so the companion EJS package is installed.
The pinned yt-dlp version also needs to stay current. The "origin" failure under Deno was fixed upstream after the February release that Summry had pinned, so the dependency was bumped to a newer stable release.
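Both requirements can be checked at startup. The following is a hypothetical preflight sketch, not something the patch includes. It assumes yt-dlp is expected to find deno on PATH, and it takes the minimum version as a parameter because this post does not name the exact release.

```python
import shutil
from importlib import metadata
from typing import Callable, List, Optional


def parse_version(version: str) -> tuple:
    """Turn a date-style version string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def check_runtime(
    min_version: str,
    which: Callable[[str], Optional[str]] = shutil.which,
    installed_version: Optional[str] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if which("deno") is None:
        problems.append("deno not found on PATH")
    if installed_version is None:
        try:
            installed_version = metadata.version("yt-dlp")
        except metadata.PackageNotFoundError:
            installed_version = None
    if installed_version is None:
        problems.append("yt-dlp is not installed")
    elif parse_version(installed_version) < parse_version(min_version):
        problems.append(
            f"yt-dlp {installed_version} is older than required {min_version}"
        )
    return problems
```

Running this in both the API and worker containers would catch an image that is missing Deno, or a stale pin, before a user request does.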