[ case study ]
Smart school radio - System for the school
We put the school PA in students' hands. .NET runs the bell schedule (Hangfire), Redis and SignalR power live voting on the LAN, and a Python AI guardrail blocks profanity. A lean on-prem deployment wired into real audio hardware.
Full-Stack & DevOps Lead
3-person team · 400+ active users
Goal
Community-driven break music with a safety guarantee. AI keeps trolls off the PA, and on-prem infra stays locked to real school bell timing.
Shipped and running at my former vocational school: ZSZ Gostyń
01 - From the cloud to the server room
From the cloud to the school server room
A lot of junior case studies end at a cloud deploy and a Vercel link. My previous project - GroupNote - was a beautiful architectural beast that fell to over-engineering and never met the market. The lesson: real engineering is not perfect systems in a vacuum - it is solving messy, physical problems.
Smart Radiowęzeł is the opposite of a startup fantasy: born and shipped in the real world - inside the closed network of ZSZ im. Powstańców Wielkopolskich in Gostyń.
Proof of work: this did not live on “my machine.” It ran on a school bare-metal server, serving hundreds of students the moment the bell rang - all trying to push favorite (and often forbidden) tracks.
Problem: how do you hand students the PA without a disaster?
A school public-address stack is high risk. Sound hits corridors, classrooms, and the yard. The old setup was a PC and a static playlist. We wanted phone-first control - with eyes open about what could go wrong:
Trolling and profanity: the obvious first move is explicit lyrics or “ironic” anthems. The system needed strict zero trust on content.
Sync with real-world time: voting and playback had to track the bell schedule - no mystery latency from the cloud.
Thundering herd: when the bell hits, hundreds of phones hammer the API in the same second - the database and realtime layer had to absorb it.
Team and ~18 months of iteration
For about a year and a half, our three-person team (me plus two classmates) worked with our IT teacher to connect modern web software to real hardware.
I led backend, DevOps, and architecture: database design, .NET, SignalR, Redis, Docker, the student-facing React app, and an admin app in Flutter. Teammates delivered the critical Python AI Guardrail microservice, which scraped lyrics and context from the web and scored suitability with LLMs.
This case study is about code meeting hundreds of concurrent users and about wiring modern software to the copper running out of the school amplifier.
02 - Architecture
Architecture - modular monolith on bare metal
[ design principle ]
Pragmatism over hype. Carving this into microservices for ~400 users on a school LAN would have been textbook over-engineering - and an operational nightmare. Pure layered spaghetti would have made safe production updates between breaks impossible. I chose a modular monolith.
The system had to run reliably in Docker on a local server on the ODN_Uczniowie network. Instead of scattering services across machines, I shipped one orchestrating process - logically partitioned, operationally unified.
Vertical slices and in-process messaging
The ASP.NET Core backend was split into vertical modules: Modules.Users, Modules.Votings, Modules.Admin, Modules.Feedback.
Instead of REST calls inside the monolith, I used explicit in-process messaging with MediatR: each use case is a command + handler. Cross-module contracts live in Shared/Events.
Result: one server process, but clear domain boundaries (bounded contexts) in code.
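The command + handler idea can be sketched in a few lines of TypeScript. This is an illustration of the pattern only, not MediatR's API and not the project's C# code: `InProcessMediator`, `GetPendingSongsQuery`, and the example URLs are hypothetical names, and the handler is synchronous for brevity where MediatR handlers are async.

```typescript
// Hypothetical names: this mimics the command + handler idea, not MediatR's actual API.
type Handler<TRequest, TResult> = (request: TRequest) => TResult;

class InProcessMediator {
  // `any` is confined to the registry; callers stay typed through the generics below.
  private readonly handlers = new Map<string, Handler<any, any>>();

  register<TRequest, TResult>(requestName: string, handler: Handler<TRequest, TResult>): void {
    if (this.handlers.has(requestName)) {
      throw new Error(`Handler already registered for ${requestName}`);
    }
    this.handlers.set(requestName, handler);
  }

  send<TRequest, TResult>(requestName: string, request: TRequest): TResult {
    const handler = this.handlers.get(requestName);
    if (handler === undefined) {
      throw new Error(`No handler registered for ${requestName}`);
    }
    return handler(request) as TResult;
  }
}

// Cross-module contract (the role Shared/Events plays in the text).
interface GetPendingSongsQuery { limit: number; }
interface PendingSong { url: string; }

const mediator = new InProcessMediator();

// The Votings module owns the data and registers the handler.
const votingsPending: PendingSong[] = [
  { url: "https://example.test/a" },
  { url: "https://example.test/b" },
];
mediator.register<GetPendingSongsQuery, PendingSong[]>(
  "GetPendingSongs",
  (query) => votingsPending.slice(0, query.limit),
);

// The Admin module never touches Votings data directly; it only sends the query.
function adminListPending(limit: number): PendingSong[] {
  return mediator.send<GetPendingSongsQuery, PendingSong[]>("GetPendingSongs", { limit });
}
```

The point of the indirection is that Admin depends on a contract name and two small types, not on the Votings module's storage.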
Database: one server, four contexts
A single PostgreSQL instance, but strict separation in code: each module owned its DbContext and migrations - the usual monolith trap (shared tables, blurred boundaries) was ruled out upfront.
using Microsoft.EntityFrameworkCore;
public class VotingDbContext : DbContext
{
    public DbSet<Song> Songs { get; set; }
    public DbSet<VotingSession> Votings { get; set; }
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.HasDefaultSchema("votings"); // Isolation at the DB schema level
        // Configuration specific to the voting module
    }
}
Admin could not “shortcut” reads from Votings tables - only through an explicit contract (e.g. a MediatR query).
Docker Compose as the operational backbone
On bare metal there is no managed database or autoscaling - our contract with the box was docker-compose.yml. It had to come back clean after every reboot: .NET API, PostgreSQL, Redis (sessions + SignalR), telemetry (Aspire Dashboard).
# Infrastructure excerpt - a single LAN
services:
  radiowezelapi:
    build:
      context: .
      dockerfile: RadiowezelAPI/Dockerfile
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://radiowezel.dashboard:18889
      - PYTHON_URL=${PYTHON_URL}
      - TZ=Europe/Warsaw # Critical: stay in sync with school time
    ports:
      - "8080:8080"
    depends_on:
      - radiowezel.cache
      - radiowezel.postgres
The critical line: TZ=Europe/Warsaw. While much of the cloud-native world defaults to UTC, we drove physical school bells in Poland - Hangfire had to respect local time, DST, and the exact moment 8:35 fires.
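To make the UTC trap concrete, here is a minimal TypeScript sketch (an illustration, not the project's .NET code). It assumes a schedule meant for 8:35 is evaluated in UTC instead of local time and shows which wall-clock time the bell-side listener in Gostyń would actually see. The helper name `wallClock` and the two sample dates are my own.

```typescript
// Format an instant as HH:mm wall-clock time in a given IANA time zone.
function wallClock(instant: Date, timeZone: string): string {
  return new Intl.DateTimeFormat("en-GB", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).format(instant);
}

// A cron of "35 8 * * 1-5" evaluated in UTC fires at 08:35 UTC all year round.
const winterFire = new Date("2024-01-15T08:35:00Z"); // a Monday, Poland on CET (UTC+1)
const summerFire = new Date("2024-07-15T08:35:00Z"); // a Monday, Poland on CEST (UTC+2)

console.log(wallClock(winterFire, "Europe/Warsaw")); // 09:35
console.log(wallClock(summerFire, "Europe/Warsaw")); // 10:35
```

Both results miss the 8:35 break, and by a different amount depending on the season, which is why the schedule has to be evaluated in the school's local zone.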
Product pipeline in short: YouTube link → validation → queue → voting → playback on speakers.
03 - Hardware bridge & time
Hardware bridge & physical time (Hangfire)
Many web apps treat time as DateTime.UtcNow. We had to track real school bells. A few seconds late meant music during class - and an instant shutdown order from the office.
Hangfire and the timezone trap
We used Hangfire on PostgreSQL to open and close voting windows per break. In Docker we hit a mismatch: the container clock defaults to UTC, while the bells follow Europe/Warsaw with its DST shifts.
[ time zone ]
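We set TZ=Europe/Warsaw on the container and forced the .NET scheduler to respect local server time. Without that, a schedule evaluated in UTC would fire one hour late in winter (CET, UTC+1) and two hours late in summer (CEST, UTC+2), so the whole break timetable would shift at every DST change.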
services.AddHangfire(cfg => cfg
    .SetDataCompatibilityLevel(CompatibilityLevel.Version_170)
    .UseSimpleAssemblyNameTypeSerializer()
    .UseRecommendedSerializerSettings()
    .UsePostgreSqlStorage(connectionString));
// Critical detail: the schedule must follow physical local time
var jobsOptions = new RecurringJobOptions { TimeZone = TimeZoneInfo.Local };
// Register the jobs that start voting (cron)
RecurringJob.AddOrUpdate<VotingJobHandler>(
    "StartVoting_Przerwa1",
    x => x.StartVotingAsync(CancellationToken.None),
    "35 8 * * 1-5", // Mon-Fri, 8:35
    jobsOptions);
Hardware bridge: from .NET to the copper cable
Closing a vote in the database does not make sound - we had to push winners to real speakers. The rig was a PC wired into an amp running AIMP. With no budget for a proprietary audio API, we wrapped a CLI tool with a small service.
Teammates found an AIMP CLI; on the machine next to the PA rack we ran a thin FastAPI app that periodically called the main backend:
GET /voting/songs-to-play - winning playlist (URL + duration), then map to AIMP commands (play, pause, volume).
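The mapping step can be pictured as a small pure function. The sketch below is TypeScript rather than the team's Python controller, and everything in it is an assumption for illustration: the `SongToPlay` shape is inferred from "URL + duration", the `PlayerCommand` objects are abstract and are not real AIMP CLI flags, and skipping tracks that would overrun the break is my own rule, not something the source confirms.

```typescript
// Shape assumed from the text: each winning song carries a URL and a duration in seconds.
interface SongToPlay { url: string; durationSec: number; }

// Hypothetical, abstract player commands. These are NOT real AIMP CLI flags.
type PlayerCommand =
  | { kind: "play"; url: string; atSec: number }
  | { kind: "stop"; atSec: number };

// Turn the winning playlist into a timeline that fits inside one break.
function planBreak(songs: SongToPlay[], breakLengthSec: number): PlayerCommand[] {
  const commands: PlayerCommand[] = [];
  let cursor = 0; // seconds since the start of the break
  for (const song of songs) {
    // Assumed rule: skip any track that would still be playing when class starts.
    if (cursor + song.durationSec > breakLengthSec) continue;
    commands.push({ kind: "play", url: song.url, atSec: cursor });
    cursor += song.durationSec;
  }
  // Always end with an explicit stop at the end of the last scheduled track.
  commands.push({ kind: "stop", atSec: cursor });
  return commands;
}
```

A real controller would translate each command into whatever the player tool accepts and run them on a timer; keeping the planning pure makes that part easy to test away from the hardware.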
Closed loop: Now Playing
The system was bidirectional: when AIMP started playback, the Python controller hit POST /voting/playing-song. The backend wrote short-lived state to Redis and immediately broadcast via SignalR - so “Now playing” on phones lined up with the moment bass hit the hallway.
POST /voting/playing-song
{
  "songId": "guid-utworu",
  "duration": 215
}
04 - Break-time traffic
Break-time traffic (Redis & SignalR)
In a typical app, load spreads across the day. For us, 45 minutes of class meant near-zero traffic - then the bell dropped and hundreds of students opened the app in the same moment. Textbook thundering herd.
Hitting PostgreSQL on every refresh to ask “is voting live?” and recount likes would have drained the connection pool in seconds.
Redis as a database shield
Redis held hot, break-scoped state in RAM:
voting window open/closed,
current track (Now playing),
session tokens to limit multi-account abuse,
a fast cache of cast votes.
The API answered from memory instead of hammering Postgres on every list interaction.
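The shielding effect is easiest to see with a read-through cache and a counter. The TypeScript sketch below uses an in-memory `TtlStore` as a stand-in for Redis; the key name, the 30-second TTL, and the `loadVotingOpenFromDatabase` function are hypothetical, chosen only to show the pattern.

```typescript
// In-memory stand-in for Redis with per-key expiry; the clock is injectable for tests.
class TtlStore<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: T, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired state
      return undefined;
    }
    return entry.value;
  }
}

// Counts how often the "database" is actually hit.
let databaseReads = 0;
function loadVotingOpenFromDatabase(): boolean {
  databaseReads += 1;
  return true; // stand-in for a Postgres query
}

const VOTING_OPEN_KEY = "voting:open"; // hypothetical key name
const VOTING_OPEN_TTL_MS = 30000; // hypothetical TTL

// Read-through: answer from memory, fall back to the database only on a miss.
function isVotingOpen(store: TtlStore<boolean>): boolean {
  const cached = store.get(VOTING_OPEN_KEY);
  if (cached !== undefined) return cached;
  const fresh = loadVotingOpenFromDatabase();
  store.set(VOTING_OPEN_KEY, fresh, VOTING_OPEN_TTL_MS);
  return fresh;
}
```

With this shape, 400 phones asking "is voting live?" inside the TTL cost one database read instead of 400.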
// Modules.Users/Auth - fast session check in Redis (anti-spam)
var sessionExists = await cache.GetStringAsync($"session:{userId}");
if (sessionExists is null)
{
    await cache.SetStringAsync($"session:{userId}", "true",
        options: new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(15)
        });
    return false;
}
return true; // Block: the student already has an active session
A pragmatic SignalR event bus
Polling would have melted the WLAN. The React client needed realtime updates, so we used SignalR - without dozens of hub methods per action.
One ReceiveMessage channel: the server pushed short strings; the client branched on content and updated UI.
// React - a single ReceiveMessage channel instead of an army of hub methods
const connection = new signalR.HubConnectionBuilder()
  .withUrl(`${API_BASE_URL}/voting-hub`)
  .withAutomaticReconnect()
  .build();
connection.on("ReceiveMessage", (message: string) => {
  if (isPlayingSongDto(message)) {
    onPlayingSongUpdate?.(message);
  } else if (message === "Like added to song.") {
    onLikeUpdate?.();
  } else if (message === "Voting started.") {
    onVotingStarted?.();
  } else if (message === "Voting ended.") {
    onVotingEnded?.();
  }
});
Engineering trade-off: we gave up strongly typed per-event hub contracts to keep integration dead simple and a single message stream. Shipping a new backend signal (e.g. an outage banner) did not require hub signature churn - only another branch on the client.
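One way to keep the single channel and still get some type safety back on the client is to parse each raw string into a discriminated union once, at the edge. This is a sketch, not the project's code: `HubEvent` and `parseHubMessage` are hypothetical names, and it assumes the now-playing payload arrives as a JSON string with the same `songId` and `duration` fields as the POST /voting/playing-song body. The guard here checks a parsed value, which may differ from how the original `isPlayingSongDto` works.

```typescript
interface PlayingSongDto { songId: string; duration: number; }

type HubEvent =
  | { kind: "playing"; song: PlayingSongDto }
  | { kind: "like-added" }
  | { kind: "voting-started" }
  | { kind: "voting-ended" }
  | { kind: "unknown"; raw: string };

// Runtime check that an arbitrary parsed value has the now-playing shape.
function isPlayingSongDto(value: unknown): value is PlayingSongDto {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate["songId"] === "string" && typeof candidate["duration"] === "number";
}

// Map one raw ReceiveMessage string to a typed event.
function parseHubMessage(raw: string): HubEvent {
  switch (raw) {
    case "Like added to song.":
      return { kind: "like-added" };
    case "Voting started.":
      return { kind: "voting-started" };
    case "Voting ended.":
      return { kind: "voting-ended" };
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    if (isPlayingSongDto(parsed)) return { kind: "playing", song: parsed };
  } catch {
    // Not JSON: fall through and report the message as unknown.
  }
  return { kind: "unknown", raw };
}
```

New backend signals still need only one more `case`, but a typo in a message string now surfaces as an `unknown` event instead of a silently skipped branch.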
05 - AI Guardrail
AI Guardrail - shield against trolls
Hand hundreds of students a music queue and the first moves are explicit lyrics or “ironic” anthems. Teacher review on every track would have killed the product on day one - we needed an automated guardrail.
The AI Guardrail scored content before anything entered the vote pool.
Teamwork & API boundaries: I did not build the Python FastAPI + LLM module alone - two teammates owned scraping and model calls (Gemini). On the .NET side I was the orchestrator: I defined the HTTP contracts, applied domain rules after each response, and mapped outcomes into real tables - RejectedSongs and SongsToCheck - so their service stayed a black box with a crisp boundary.
Split: orchestrator vs worker
We kept lyric scraping and Gemini calls out of the main C# codebase - responsibilities were explicit:
Python (worker): YouTube URL, metadata, lyrics from external sources, LLM call, standardized label (Positive / Negative / Neutral).
.NET (orchestrator): domain flow, database, final decision - and persistence the Python service never had to understand.
From the backend’s point of view their module was a black box behind a small contract:
var result = await $"{PythonApiUrl}/sentiment"
    .PostJsonAsync(new { URL = request.Url }) // Contract agreed with Python team
    .ReceiveJson<ValidateSongResponse>();
Domain logic in .NET
“Call the model” alone is not a product. I implemented a three-way branch on the Python result:
Positive - auto-accept into the voting pool.
Negative - hard block, row in RejectedSongs, error surfaced to the client.
Neutral - safety buffer: enqueue in SongsToCheck for one-tap review in the Flutter admin app when the model is unsure or misses irony.
// AddSong handler excerpt (.NET)
if (isInRejectedSongs)
{
    return Result.Failure<AddSongResponse>(
        Error.Conflict("Track blocked by AI Guardrail.")
    );
}
if (aiResult.Sentiment == Sentiment.Neutral)
{
    await context.SongsToCheck.AddAsync(new SongToCheck { Url = request.Url });
    await context.SaveChangesAsync();
    return Result.Success(new AddSongResponse("Song pending moderation."));
}
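The full three-way branch, including the Positive path the excerpt leaves out, fits in one pure function. This TypeScript sketch is illustrative: the `Decision` type and `decide` function are hypothetical names, and only the labels (Positive / Negative / Neutral) and table names (RejectedSongs, SongsToCheck) come from the text.

```typescript
type Sentiment = "Positive" | "Negative" | "Neutral";

type Decision =
  | { action: "accept" } // goes straight into the voting pool
  | { action: "reject"; table: "RejectedSongs" } // hard block, error to the client
  | { action: "review"; table: "SongsToCheck" }; // waits for an admin decision

// Decide what happens to a submitted track, given the guardrail label
// and whether the URL was already blocked earlier.
function decide(sentiment: Sentiment, alreadyRejected: boolean): Decision {
  if (alreadyRejected) {
    // A previously blocked track stays blocked, whatever the model says now.
    return { action: "reject", table: "RejectedSongs" };
  }
  switch (sentiment) {
    case "Positive":
      return { action: "accept" };
    case "Negative":
      return { action: "reject", table: "RejectedSongs" };
    case "Neutral":
      return { action: "review", table: "SongsToCheck" };
  }
}
```

Keeping the decision free of I/O means the policy can change (for example, treating Neutral more strictly) without touching the HTTP call or the persistence code.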
06 - Auth pragmatism
Security & auth pragmatism
Enterprise defaults push full IAM - OAuth2, OpenID Connect, Azure AD. The school had Microsoft 365 for every student, so “Sign in with your school account” was the textbook-correct path.
A product-minded engineer still knows when “best practices” would ship a dead product.
[ UX vs security ]
Picture ~400 students with ~10 minutes of break. Forcing long school emails and passwords on phones would have killed adoption on day one. I dropped Azure AD to keep login friction as low as possible.
Zero-friction login: four characters
Instead of corporate SSO, each student received a generated, unique four-character alphanumeric code. Typing it on a phone keyboard took seconds.
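A rough sense of the numbers helps here. Assuming the alphabet is uppercase letters plus digits (the source only says "alphanumeric"), four characters give 36^4 = 1,679,616 possible codes, so with about 400 issued codes a single random guess hits a live one roughly 1 time in 4,200. The TypeScript sketch below shows one way to issue unique codes; the function names are hypothetical, and the injectable `rng` defaults to `Math.random` only to keep the example self-contained, where a real issuer should use a cryptographically secure source.

```typescript
// Assumption: uppercase letters and digits. The source only says "alphanumeric".
const CODE_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
const CODE_LENGTH = 4;
const CODE_KEYSPACE = Math.pow(CODE_ALPHABET.length, CODE_LENGTH); // 1,679,616

// Build one code by picking CODE_LENGTH characters from the alphabet.
// `rng` must return a number in [0, 1); swap in a secure source in production.
function generateCode(rng: () => number = Math.random): string {
  let code = "";
  for (let i = 0; i < CODE_LENGTH; i++) {
    code += CODE_ALPHABET.charAt(Math.floor(rng() * CODE_ALPHABET.length));
  }
  return code;
}

// Issue `count` distinct codes, retrying on collisions.
function issueUniqueCodes(count: number, rng: () => number = Math.random): string[] {
  if (count > CODE_KEYSPACE) {
    throw new Error("Requested more codes than the keyspace can hold");
  }
  const issued = new Set<string>();
  while (issued.size < count) {
    issued.add(generateCode(rng)); // the Set silently drops duplicates
  }
  return Array.from(issued);
}
```

Usage would be a one-off `issueUniqueCodes(400)` when accounts are created, with the codes stored next to the student records.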
Simple codes sound like a security nightmare, but we were not protecting bank data - only access to the PA. To limit abuse (one student, many tabs, vote spam), I added a fast Redis session lock tied to the user for the break window:
// Modules.Users/Auth - Redis session lock
var sessionExists = await cache.GetStringAsync($"session:{userId}");
if (sessionExists is not null)
{
    return Result.Failure(Error.Conflict("Session already active."));
}
await cache.SetStringAsync(
    $"session:{userId}",
    "true",
    new DistributedCacheEntryOptions
    {
        AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(15)
    });
That kept lightweight auth usable under thundering-herd load and made it harder to stretch one identity across many devices during voting.
Admin auth: pragmatic OTP
The same “no heavy identity” stance applied to the admin panel (used to clear the Neutral buffer). I did not stand up IdentityServer - I shipped a simple OTP over SMTP to predefined inboxes. Codes lived ~25 minutes in Redis and were enforced by middleware on selected routes:
// Middleware excerpt - admin routes
if (context.Request.Path.StartsWithSegments("/admin/songs"))
{
    if (!context.Request.Headers.TryGetValue("X-Admin-OTP", out var providedOtp))
    {
        context.Response.StatusCode = StatusCodes.Status401Unauthorized;
        return;
    }
    var cachedOtp = await cache.GetStringAsync("OTP");
    if (string.IsNullOrEmpty(cachedOtp) || cachedOtp != providedOtp)
    {
        context.Response.StatusCode = StatusCodes.Status403Forbidden;
        return;
    }
}
No role matrices, no sprawling claim policies - a raw header check and cache, sized right for a LAN deployment.
07 - Observability
Observability in the rack - OpenTelemetry
On Vercel you get a polished error UI. On a school bare-metal box you usually start with flat Docker logs and a black terminal.
When break music dies, there is no time to SSH-grep - you need to know fast: did Postgres fall over, did Python time out, or did someone ship a bad link?
[ minimum viable ops ]
I did not build a war room with Grafana, Prometheus, and ELK - our tiny server could not carry that (another anti-over-engineering datapoint). Instead: plain OpenTelemetry in .NET and a lightweight Aspire Dashboard container.
Structured telemetry (.NET + OTLP)
In the main API I registered instrumentation that collected metrics and traces from the hot paths:
// RadiowezelAPI/Program.cs - OpenTelemetry registration
services.AddOpenTelemetry()
    .ConfigureResource(resource => resource.AddService("radiowezel"))
    .WithMetrics(metrics =>
    {
        metrics.AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation(); // Correlate calls to Python
        metrics.AddOtlpExporter();
    })
    .WithTracing(tracing =>
    {
        tracing.AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddEntityFrameworkCoreInstrumentation(); // SQL insight
        tracing.AddOtlpExporter();
    });
AddEntityFrameworkCoreInstrumentation surfaced Postgres query timings - when a break “felt slow,” the panel showed whether the database or something like a Redis lock was the culprit.
AddHttpClientInstrumentation gave the full picture of how long the Python AI Guardrail spent processing lyrics.
Aspire Dashboard in Docker Compose
I added Microsoft's Aspire Dashboard image to docker-compose.yml - on the same LAN it ingested telemetry over OTLP:
# docker-compose.yml (excerpt)
services:
  radiowezel.dashboard:
    image: mcr.microsoft.com/dotnet/nightly/aspire-dashboard:latest
    ports:
      - "15677:18888" # Local access for developers
  radiowezelapi:
    environment:
      # API pushes OpenTelemetry to the dashboard (OTLP)
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://radiowezel.dashboard:18889
Result: one URL on the school server with traces tying the student request, database time, and AI hop together - near-zero ops cost versus a full observability platform.
08 - Real users, day one
Colliding with real users - day one
Localhost tests with a handful of rows are one thing. Letting ~400 students loose - whose main hobby is finding holes and making hallway drama - is another. Day one was a harsh live ops lesson.
Chinese characters and a missing MaxLength
Within hours the database ballooned. Someone noticed the add-song form took arbitrary-length text and flooded the endpoint with thousands of characters in the title field - script or paste.
[ live hotfix ]
Classic missing domain validation at the API edge. Fix: FluentValidation length rules, garbage cleanup in Postgres from the CLI, and a Docker container restart on the school box - before the bell for the next break.
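The shape of that fix, sketched in TypeScript rather than FluentValidation: reject oversized or malformed input at the edge, before anything is written. The field names, both length limits, and the YouTube URL pattern are assumptions for illustration; the source only says that length rules were added to the add-song form.

```typescript
interface AddSongInput { url: string; title: string; }

const MAX_TITLE_LENGTH = 200; // hypothetical limit
const MAX_URL_LENGTH = 2048; // hypothetical limit

// Rough pattern for YouTube links; an assumption, not the project's actual rule.
const YOUTUBE_URL_PATTERN = /^https:\/\/(www\.)?(youtube\.com|youtu\.be)\//;

// Return a list of human-readable problems; an empty list means the input is valid.
function validateAddSong(input: AddSongInput): string[] {
  const errors: string[] = [];
  if (input.title.trim().length === 0) {
    errors.push("title is required");
  }
  if (input.title.length > MAX_TITLE_LENGTH) {
    errors.push(`title must be at most ${MAX_TITLE_LENGTH} characters`);
  }
  if (input.url.length > MAX_URL_LENGTH) {
    errors.push(`url must be at most ${MAX_URL_LENGTH} characters`);
  }
  if (!YOUTUBE_URL_PATTERN.test(input.url)) {
    errors.push("url must be a YouTube link");
  }
  return errors;
}
```

A handler would call this first and return a 400 with the error list, so a pasted wall of text never reaches the database.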
Stress-testing the AI Guardrail
Once students saw tracks were not instant, they probed the Python model: Soviet anthem, profanity hidden in lyrics, edge-case content.
The Neutral buffer (SongsToCheck) paid off: the LLM rejected obvious junk (Negative), and when people got clever with tracks that had no official lyrics online, it returned Neutral instead of letting them through. We drained the queue from the Flutter admin app with one-tap reviews.
Product: dev-to-user comms
People rage when “my song never plays” with no explanation. Rather than silence, we shipped thin modules: Modules.Feedback and announcements.
Dev announcements - a React banner for everyone: when Hangfire drifted from the bell or the API was down for a hotfix, a clear “paused, back next break” message.
POST /feedback - a simple inbox from the app so students felt heard.
We built a system, but hitting real users turned it into an actual product.
09 - Takeaways & handover
Takeaways, handover, and a last word
Most school IT projects die the day the year ends. Smart Radiowęzeł still runs. As we finished vocational school, we did a real handover to younger IT cohorts.
We handed over docs, GitHub repos, and infra access (including Supabase in prod). The fact that other students could spin up dev and keep shipping - that is my biggest engineering win: a system mature enough to outlive its authors.
What the trenches taught me - three lessons
01 - UX beats the enterprise-security checklist. Auth has to fit context: skipping Azure AD for four-character codes saved adoption. The “safest” system nobody uses is a failed system.
02 - Pragmatic architecture. A modular monolith for ~400 users on a LAN: multiple DbContexts on one database gave clean boundaries without a microservice ops tax. On bare metal with one docker-compose.yml, deploy simplicity matters.
03 - AI as guardrail, not a toy. Instead of centering the LLM in the UX like my GroupNote side project, here the Python model is a shield: a sentiment label, the SongsToCheck buffer, and hours of moderation saved. That is utility AI.
Closing thought: GroupNote taught me advanced code and cloud scale. Radiowęzeł taught me shipping a product: code is a tool for messy physical problems - bells, hallways, speakers.
In GroupNote, the same modular pattern invited scope creep - modules stacked frictionlessly, and “one more feature” delayed real users (the classic founder trap: coding feels like progress while you avoid the market). Here, similar boundaries in code worked the other way: we held a scope we could operate as one stack on a school LAN.
A small set that had to survive the bell
Not a feature army - voting, queue, playback; an AI Guardrail with a Neutral buffer instead of a perfect model on day one; auth without Azure AD; OpenTelemetry + Aspire instead of a Grafana farm on day one.
“Just one more thing” - grounded in reality
We still shipped hotfixes: field limits, announcements after migrations, feedback. The difference: the next iteration came from real student behavior, not a wishlist; the bar was “works next break,” not “complete for a slide deck.”
Live ops has a cost - but it buys something you cannot fake in an IDE: hundreds of phones in the same second the bell rings. Not a factory with no customers - a product that rolled on real hallway asphalt.
I end this case study where GroupNote ended in an archive: at a deployment that outlasted us.