“Engineering Leader and mentor sharing practical stories and lessons on system design at scale, backend engineering, and career growth in tech—from small-town beginnings to leading modern teams.”
Introduction

Hello! My name is Baidyanath Prasad, and I'm an Engineering Manager based in India. I'm passionate about solving problems using technologies that make a meaningful difference in people's lives. With 10 years of experience in the IT industry, I have worked on several products across various tech stacks and successfully delivered them to market. Every day brings new challenges and opportunities for learning, which is what makes this industry truly exciting.

Early Life and Education

I come from a lower-middle-class family in Madhubani, Bihar, India. I completed my primary education through the seventh standard in my village. In 2009, I completed my matriculation from Rama Prasad Dutt Janta High School, Jitwarpur (Bihar School Examination Board, Patna). With a strong aptitude for Mathematics, I chose the Science stream and moved to Patna, Bihar, where I studied Mathematics, Physics, and Chemistry. After completing my intermediate studies at Thakur Prasad Singh Col...
While working on a backend system supporting millions of users, Redis was chosen as the go-to solution for real-time data—sessions, counters, recommendations, you name it. The Redis setup ran on Google Cloud Memorystore’s lowest configuration, with best-practice TTLs, eviction, and well-designed keys baked in. However, as user traffic increased, a subtle bottleneck emerged. Surprisingly, it was not memory or dataset size that held us back—our entire hot data set was under 5GB and always fresh. Instead, the challenge was the enormous number of direct requests: every microservice and API call was reaching out to Redis in real time, leading to network congestion, latency, and a stretched-thin Redis instance.

Problem: More Calls, Not More Data

We looked at options. Scaling hardware felt excessive since memory and CPU were already sufficient for the modest data set. It wasn’t the amount of information, but the pattern of access—thousands of ...
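One common answer to this access-pattern problem is a small in-process cache in front of Redis, so hot keys are served from local memory and only misses cross the network. A minimal sketch of that idea follows; the class and method names are illustrative, not a library API, and a production version would also bound the cache size and handle invalidation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal in-process TTL cache placed in front of a remote store (e.g. Redis).
// Hot keys are served from local memory, so only misses and expiries travel
// over the network. Purely a sketch: no size bound, no active invalidation.
class LocalTtlCache<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    LocalTtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    // Returns the locally cached value, or loads it from the remote store
    // (represented here by the 'remoteLookup' function) on a miss or expiry.
    V get(K key, Function<K, V> remoteLookup) {
        long now = System.currentTimeMillis();
        Entry<V> e = entries.get(key);
        if (e != null && e.expiresAtMillis() > now) {
            return e.value();               // local hit: no network call
        }
        V value = remoteLookup.apply(key);  // miss: one call to Redis
        entries.put(key, new Entry<>(value, now + ttlMillis));
        return value;
    }
}
```

With a short local TTL, thousands of identical reads per second collapse into roughly one remote call per key per TTL window, which directly attacks the "more calls, not more data" bottleneck.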
Introduction:

Recently, during a year-end break of more than a week (thanks to my employer), I finally cut off the daily work—spent quality time with family, travelled a bit, and, importantly, sat with my own thoughts (yes, intentionally). Next month I'll mark ten years (a full decade) of working in tech, and that milestone has pushed me to question what the next decade should look like. This post is part of my self-reflection and an open note to my network about what is changing and how that’s reshaping my plans.

In 2025, a fundamental shift occurred in the tech job market. The traditional narrative of keeping your head down, climbing the corporate ladder, and retiring quietly began to break down as layoffs reached into the hundreds of thousands globally. This transformation reshaped the very meaning of "stability" within the industry. While many positions vanished, new opportunities surged for those who were easy to find, easy to trust, and visibly skilled at the...
"Is your system designed for Teamwork or Broadcast? The difference is just one line of code."

Imagine you’re debugging a critical production issue. You’re staring at the logs, coffee in hand, trying to track an order through your system. You have a Kafka topic named orders with 2 partitions (P1 & P2). You publish two messages: Order #101 (lands on partition P1) and Order #102 (lands on partition P2). You fire up two pods of your microservice—let’s call them Consumer A and Consumer B—to process these orders. You watch the terminal, waiting for them to light up. Consumer A picks up Order #101. Consumer B picks up Order #102. But then you notice something unsettling: Consumer A never saw Order #102, and Consumer B completely ignored Order #101. If you come from a traditional pub-sub world (like JMS or ActiveMQ), the panic starts to set in. "Did the message get lost? Why didn't Consumer A see both orders? Is the partition broken?" The short answer to ...
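The "one line" in question is the consumer's `group.id`. Consumers that share a `group.id` form a team: Kafka splits the topic's partitions among them, so each order is processed once per group. Consumers with different `group.id`s each receive every message, which is the broadcast behavior JMS veterans expect. A hedged sketch of the two configurations, using plain `java.util.Properties` with standard Kafka config keys so it stands alone without a running broker (the broker address and group names are placeholders):

```java
import java.util.Properties;

// Two ways to configure a Kafka consumer. The ONLY meaningful difference
// is group.id: shared => partitions are divided (teamwork); unique per
// service => every service sees every message (broadcast).
class ConsumerConfigSketch {
    static Properties teamworkConfig(String memberName) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "order-processors");        // SAME group for all pods
        props.put("client.id", memberName);
        return props;
    }

    static Properties broadcastConfig(String serviceName) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", serviceName); // UNIQUE group per service => full copy of the stream
        props.put("client.id", serviceName);
        return props;
    }
}
```

So Consumer A and Consumer B "ignoring" each other's orders is not message loss; it is the consumer group doing exactly the load-sharing it was designed for.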
Introduction

The marker is in your hand. The interviewer says, "Design Spotify." For most engineers, this is the moment panic sets in. You start drawing random boxes—a Load Balancer here, a Database there—hoping to stumble upon the right answer. You throw in buzzwords like "Sharding" and "Microservices" to fill the silence. Ten minutes later, you have a messy whiteboard and a skeptical interviewer. You have just demonstrated the classic "Junior Trap": focusing on components instead of architecture. To pass a Senior or Principal (SDE4 & above) interview, you need to stop guessing and start structuring. I use a method called the S.C.A.L.E. Framework. It turns the chaos of an open-ended question into a systematic engineering defense. Here is how to use S.C.A.L.E. to design a system that actually works.
Introduction

In modern backend engineering, it is easy to treat a managed database like a black box: you write a row, it saves it, and the cloud handles the rest. But when you transition from building side projects to scaling systems for millions of users, you discover a harsh physical reality: if you don't understand how your specific storage engine physically uses RAM and disk, a single architectural decision can quietly cut your performance in half. A Primary Key is not just a logical identifier to prevent collisions; it is a physical routing mechanism. As I learned while scaling a global backend platform for 20M+ daily active users, what works perfectly for one database engine will absolutely destroy another. Before we talk about distributed clusters or millions of queries per second, we have to understand the fundamental physics of how a single byte is saved to a disk.

The Fundamental Divide: The 90% Rule

If you look under the hood of almost every major database in the world—f...
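The "primary key as physical routing" point can be made concrete with a toy model. On a B-Tree engine, monotonically increasing keys always append at the tail of the sorted structure (one hot right-most page), while random keys such as UUIDv4 land anywhere in the sorted order and dirty pages across the whole tree. The sketch below models the B-Tree with a sorted `TreeMap` and counts how often a new key is not the current maximum, as a rough proxy for scattered page writes; it is an illustration of the effect, not a benchmark of any real engine.

```java
import java.util.TreeMap;
import java.util.UUID;

// Toy model: how often does a new primary key land in the MIDDLE of the
// sorted key space (forcing a B-Tree to touch an interior page) rather
// than appending at the tail? Illustrative only.
class KeyOrderSketch {
    static int middleInserts(Iterable<String> keys) {
        TreeMap<String, Boolean> tree = new TreeMap<>();
        int middle = 0;
        for (String k : keys) {
            if (!tree.isEmpty() && k.compareTo(tree.lastKey()) < 0) {
                middle++; // insert lands inside the tree, not at the tail
            }
            tree.put(k, true);
        }
        return middle;
    }

    public static void main(String[] args) {
        java.util.List<String> sequential = new java.util.ArrayList<>();
        java.util.List<String> random = new java.util.ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            sequential.add(String.format("%08d", i)); // auto-increment style key
            random.add(UUID.randomUUID().toString()); // UUIDv4 style key
        }
        System.out.println("sequential middle inserts: " + middleInserts(sequential)); // 0
        System.out.println("random middle inserts:     " + middleInserts(random));
    }
}
```

Sequential keys produce zero middle inserts (pure appends); random UUIDs produce middle inserts on nearly every write. On an LSM-Tree engine the trade-off differs, which is exactly why the same key strategy behaves so differently across engines.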
Introduction

In my experience scaling systems to 3M+ concurrent users, I’ve learned that the most difficult challenges aren't found in the code—they are found in the Ambiguity of the requirements. Most Senior Engineering Managers (EMs) are experts at managing Complexity. We know how to handle distributed deadlocks, gRPC migrations, and database partitioning. Complexity is a known quantity; it follows the laws of logic. But Ambiguity is a "twisted" game. It’s where the requirements shift mid-stream, and if you don't catch the pivot, you end up building a perfectly engineered solution for the wrong problem.

The Challenge: A "Simple" Notification Dashboard

I recently participated in a design session for a Head of Engineering role. The prompt seemed straightforward: “We have 30 microservices sending notifications with zero visibility. Build a centralized dashboard to visualize the flow.” My immediate technical response was to solve for Observability: Inges...
Introduction

Building an AI application as a backend developer no longer requires pivoting to a new language or managing complex cloud infrastructure. By leveraging Spring AI, you can treat a Large Language Model (LLM) as just another service in your ecosystem. Matcha, the local-first recruitment tool described here, was prototyped and polished in just 3–4 hours. This speed is possible because Spring AI abstracts the "AI complexity" into familiar POJO-based patterns, allowing for rapid iteration—tuning prompts and refining logic in minutes rather than days. To ensure a systematic engineering defense of the architecture, I applied the S.C.A.L.E. Framework. This framework turns the chaos of open-ended design into a structured, defensible plan by focusing on trade-offs rather than just components.

S: Scope and Size

Let's begin by defining the Requirements (the MVP) and then calculating the Constraints. This defines the project boundary for a local-first recruitment tool. Functional Requirement (FR): A user can uplo...
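The essence of the "LLM as just another service" claim can be shown without any Spring dependency: the model sits behind an ordinary interface, and the rest of the backend stays plain Java. The sketch below is a hypothetical distillation of that pattern (the names `ResumeScreener` and `ChatService` are mine, not Spring AI's API; in a real Spring AI app the interface would delegate to the framework's chat client), which also makes prompt tuning plain string templating.

```java
// Hypothetical sketch: the LLM behind a plain interface, so it can be
// injected, stubbed, and unit-tested like any other backend dependency.
class ResumeScreener {
    // Stand-in for the model call; a Spring AI implementation would
    // delegate to the framework's chat client here.
    interface ChatService {
        String ask(String prompt);
    }

    private final ChatService chat;

    ResumeScreener(ChatService chat) { this.chat = chat; }

    // Prompt construction is ordinary string templating -- easy to iterate on.
    String screen(String resumeText, String jobDescription) {
        String prompt = "Given this job description:\n" + jobDescription
                + "\nRate the following resume for fit:\n" + resumeText;
        return chat.ask(prompt);
    }
}
```

Because the model is behind an interface, the screening logic can be tested with a stubbed `ChatService`, which is a large part of why iteration takes minutes rather than days.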
Introduction

In Part-1 and Part-2, we mastered the transactional heavyweights. We learned how B-Trees and LSM-Trees manage the "Two-Layer Problem" of disk and network. But what happens when your data isn't just a row, but a relationship, a search term, or a high-dimensional concept? When general-purpose tools become your biggest bottleneck, you must enter the world of Specialized Physics.

1. The Inverted Index: The Physics of Search (Elasticsearch)

Traditional databases are "Forward Indexes" ($Key \rightarrow Row$). If you want to find every log entry containing the word CRITICAL, a B-Tree must perform a Full Table Scan, reading every byte of every row ($O(N)$). The Mechanic: The Inverted Index. During ingestion, the engine (Lucene) tokenizes text into "terms." It builds a sorted map where the "Key" is the word and the "Value" is a Posting List (a compressed list of IDs where that word appears). Practical Example: Searc...
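The term-to-posting-list mechanic described above fits in a few lines. A minimal sketch, assuming whitespace/punctuation tokenization and integer document IDs; Lucene layers compression, scoring, and segment merging on top of this same core idea:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Minimal inverted index: tokenize each document, then map each term to the
// sorted set of document IDs containing it (a "posting list"). Looking up a
// term becomes a map read instead of an O(N) full scan.
class InvertedIndex {
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    void addDocument(int docId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Returns the posting list for a term: every doc ID where it appears.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), new TreeSet<>());
    }
}
```

Finding every log entry containing CRITICAL is now one hash lookup returning the posting list, no matter how many rows exist, which is the physics behind Elasticsearch's speed.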
Introduction

In Part-1, we explored how the physical storage engine (B-Trees vs. LSM-Trees) dictates your primary key strategy and single-node performance. But when you scale a database across multiple machines or global regions, the physical disk is only half the battle. One of the biggest mistakes engineers make is confusing the storage engine with the distributed protocol. If both Apache Cassandra and Google Cloud Spanner use LSM-Trees underneath, why is Cassandra eventually consistent while Spanner is strictly consistent? To choose the right database, you must evaluate the Two-Layer Problem.

1. The Two-Layer Database Architecture

A distributed database is actually built of two completely separate architectural layers.

Layer 1: The Local Storage Engine (The Disk)

The Goal: Write bytes to a specific SSD as fast as mathematically possible. The Tech: B-Trees (PostgreSQL, MySQL) or LSM-Trees (Cassandra, Spanner, DynamoDB). This layer has absolutely no concept of "Consistency...
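To keep Layer 1 concrete, the LSM-Tree write path that Cassandra and Spanner share can be sketched in miniature: writes land in a sorted in-memory memtable, and when it grows past a threshold it is flushed, in order, as an immutable sorted run (an "SSTable"). The sketch below is a toy under those assumptions (no write-ahead log, no compaction, linear SSTable reads) purely to show why this layer is fast at writing bytes and silent on consistency:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy LSM-Tree write path: sorted in-memory memtable, flushed as immutable
// sorted runs ("SSTables"). Reads check the memtable first, then SSTables
// from newest to oldest. No WAL, no compaction -- illustration only.
class MiniLsm {
    private final NavigableMap<String, String> memtable = new TreeMap<>();
    private final List<List<Map.Entry<String, String>>> sstables = new ArrayList<>();
    private final int flushThreshold;

    MiniLsm(int flushThreshold) { this.flushThreshold = flushThreshold; }

    void put(String key, String value) {
        memtable.put(key, value);               // fast, in-memory, sorted write
        if (memtable.size() >= flushThreshold) {
            List<Map.Entry<String, String>> run = new ArrayList<>();
            memtable.forEach((k, v) -> run.add(Map.entry(k, v))); // flush sorted run
            sstables.add(run);
            memtable.clear();
        }
    }

    String get(String key) {
        String v = memtable.get(key);           // newest data lives in the memtable
        if (v != null) return v;
        for (int i = sstables.size() - 1; i >= 0; i--) { // then newest SSTable first
            for (Map.Entry<String, String> e : sstables.get(i)) {
                if (e.getKey().equals(key)) return e.getValue();
            }
        }
        return null;
    }

    int sstableCount() { return sstables.size(); }
}
```

Notice that nothing here knows about replicas, quorums, or clocks: whether this node's data agrees with any other node's is entirely the job of the second layer, the distributed protocol.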