Murphy's Law is inevitable. Chaos isn't.

Katherin Schuppener, Inbound Supply Chain Lead at Picnic, and Luis Almeida Santos, Domain Tech Lead at Picnic, share how Picnic grew from scrappy start-up chaos to scaled-up resilience - supporting millions of customers, 30+ warehouses, and 400+ microservices.

  • Katherin Schuppener, Inbound Supply Chain Lead, Picnic
  • Luis Almeida Santos, Domain Tech Lead, Picnic
The transcript below has been generated using AI and may not fully match the audio.
Luis: Hello everyone. We are now going to talk about scaling incident response at Picnic, because only Steven wants more incidents to train AI. We want to handle them better. I'm Luis, and with me is my friend and colleague Katherin.
Katherin: I'm in charge of the inbound supply chain, so not really in the tech world. We're on the business side, annoying engineers, a bit like Adrian said, but very excited to be here.
Luis: What is Picnic?
Katherin: Yes, so let's go. First, one step on what Picnic actually is. Yes, video.
Luis: Oh, there's audio.
Katherin: There's audio on the video, but that's fine. Luckily I can explain what Picnic is, because I've been with them for nine years. At Picnic we've been trying to revolutionize the concept of food logistics and food shopping for our customers, not only in the sense that we provide goods for you (yes, maybe your bananas, your bread, and your milk) but also by providing solutions. We'll go a bit more into what kind of solutions we provide, but it is definitely an exciting journey, one that we started 10 years ago this year. So it's also a very fitting moment for us to share our journey with you.
Picture a world where you don't need to decide between being sustainable and having great prices, with a unique customer proposition that provides a flywheel of growth. At the beginning we didn't have marketing, for example. I am sorry if Chris is listening, but we believed marketing was not necessary. You basically had word of mouth and that was enough, and that was a bit of the flywheel we had at the beginning. Now we do have a marketing team. And picture a world where logistics operating at high excellence really becomes the true customer-centric innovator.
So, as mentioned, this is Picnic for us. It is an online supermarket, if you want to be very basic in the explanation. But we are in essence a tech company, and that's also how we approach our way of operating. We started with a very key and perhaps obvious insight: don't focus on the latest technology and hype, the thing everyone is already jumping on, where you can only be better if you're stronger and faster. That's not necessarily easy if you're starting up.
So what we did was take a concept that is quite well known and perhaps slightly boring, which is the milkman. Back in the day (not in my days, I'm not that old) there were milkmen here. There were bottles, they would come, you knew the person, there was some sort of relationship, and they provided a service that was reliable, close, and friendly.
We took this concept and reinvented it using the latest technology. Of course, you don't land at the perfect solution from the start, but we have really taken the approach of improving 1%, a little bit, every single day. One of our founders, Frederik, has this on Slack: "One improvement a day keeps Frederik away." That has really been the modus operandi for the last years.
But how do we make this happen? It starts, of course, with our growers and producers. We have advanced AI models that have enabled us to have highly adaptive supply chains. We have very short lead times in our supply chain, so we really need to be adaptive and to have very close collaboration and data sharing with our partners. I cannot tell you the number of times we have run out of stock because of a TikTok trend. I don't have TikTok, but it's really mind-boggling. This is the type of data and insight we have that perhaps our growers and suppliers don't have, so we do a lot of forecast sharing, for instance.
Then the orders are actually assembled in our fulfillment centers. We have three types: manual, hybrid, and fully automated.
But what is quite unique about us is that we own our full software stack, all of it, for all three types, including hardware controllers, which was also quite a challenge. We'll go into that a bit later.
Then we have the inner-city logistics. We have developed our own electric vehicle, together with the company that manufactures these. On top of this, we have built solutions for adaptive route planning, including lots of control and en-route monitoring, but also things that you might not realize make a difference, like a safety coach for the drivers that gives them real feedback as they drive down the roads. We started in the Netherlands, so the roads are tight. It gives them feedback like "Hey, you took that turn too fast, minus one point," or "Hey, you did a great safe drive, happy smiley face," things like this.
And all of it, of course, would not mean anything if we didn't have an amazing app. We don't have a website; we only have an app. It's highly intuitive, and it really enables our customers to explore the assortment and more. The benefit of the supermarket is that you're forced to walk through the aisles, and as you walk through you're exploring, you're seeing things: things that are on promo, balloons, I don't know, marketing things. But we have built an app that enables the customer to explore on a tiny screen and also discover a recipe. We have over a thousand recipes and meal planners, really solving the question of "what am I going to cook this week?"
At the moment we're already in the Netherlands, Germany, and France. We have over 400 locations. So it's been an extremely exciting journey. But the fun in this talk is actually how we got there, and Luis will tell you a little bit about our first phase, which is the honeymoon phase.
Luis: Yes. So hopefully you're now sold on how great Picnic is, but how did we get there? We are going to divide our story into three phases.
The first one is the honeymoon phase, which covers roughly the beginning, 2015 to 2019. It all started in the Netherlands, in a town called Nike, I hope I said it correctly, with a handful of delivery vehicles. Very small city, very small operations. But it was also the beginning of how we dealt with incidents, because things just go up, as you can see: from zero to one delivery, one to 10, 10 to thousands. A lot more people got involved meanwhile, and incidents happened, but it was basically a group of close friends. This is the typical startup vibe: everyone knows each other, everyone goes out with each other, everyone knows each other's phone numbers. And we always tried to solve for the customers with a bunch of heroes who could pick up anything at any time.
There were, as you can imagine, enormous scaling challenges. All the things in the pictures Katherin showed are difficult to manage, but we dealt with them one day after the other. At the beginning there was basically no stability; that's crazy talk. No, there was chaos all the time. Incidents all the time. Actually, we didn't call them incidents. We called them emergencies, which is fitting. It could be things like the system is down, or there's a storm coming up and we need to deal with something, or people just didn't show up to work. That's an actual incident that happened. At this point it was all hands on deck, like in this picture.
Katherin: Yeah, so my background is in fulfillment, so I may or may not have been the one taking this picture. This was a Saturday where 30% of the workforce didn't show up. Basically you cannot finish picking, you cannot finish the day. I know we're in an incident. I like the emergency channel: it's nice, everyone's in it, you post, all eyes on it. And this is actually one of the founders, with his family. They came in on a Saturday to pick the orders because we needed all hands on deck. So there was literally no blame.
So it was literally no blame on.Was I a bit too ambitious on how many orders we could pick? Maybe should I have planned maybe better than no-show percentage? Yes. but it was really all hands on deck and me, focused on getting the orders out.Luis: Yeah, so it was a really, a blameless culture at the time, but we needed to organize somehow.So we did create this obviously Slack channel called Emergency Obvious name, super pragmatic. Everyone was there. You just shout, someone will come up. It works. but, it. It takes a bit of effort there, so we needed something to make a bit more structured, solution to this. at this point, we were already operating in 50 different, localities, 50 different locations.Which is already a difficult thing to manage, and there was a thousands, maybe thousands of people already involved. So we did build something. We did build our own Slack bot. Why, not? We can do this thing with no code. This was before ai, so we actually coded it. this, this, call, we call this Mr.Murphy. Perfect name, the play on words. and it brings some structure to what was the emergency channel, some structure to the incident management processes. Katherin knows all about this.Katherin: I know a little bit more than I want to admit about this, early chaos. but, we were already pumped.It was now structured, input was provided in a way that. Meant hopefully the support groups, meaning in this case, tech didn't have to ask 10 questions to actually understand the context of what was happening. and you could create separate Slack channels for the incident management and the communication also having like this dedicated channels like Adrian mentioned.So we were really like, okay, this is it. This is everything we need. Bring on the future. We're prepared for anything. Then COVID came. And that was a bit like out of scope, in our plans. 
As you can imagine, we had been growing at a high pace at this point, but whatever growth we had achieved and were very proud of was not enough anymore. The deliveries picked up. The systems could not cope with the workload. We had to rush to hire tons of people. Basically we were all trying to navigate this environment while serving as many customers as we could. As you can see, deliveries now also show a very clear jump up. You can even see the hiring peak there in 2021.
It was, of course, slightly positive for us, because all of a sudden we had a new subset of customers who had not thought of ordering online before and now rushed to the app. But it also meant we had no margin for error anymore. This graph, for instance, displays the orders and our capacity during the shutdown announcement in the Netherlands. They announced the shutdown, and within two hours our capacity for the next 14 days, which by the way is everything we showed the customer, was 100% full.
We had no margin for error. So it was also, to be honest, quite scary. We were now essential workers. We couldn't disappoint customers, because otherwise they would have to go to the store themselves, and no one wanted that. So these were really scary times, but we pushed through and tried to make the best of it.
We grew our engineering team from 100 to 180 engineers in only one year. The releases doubled. The lines of code that were released also doubled, to 4 million. But what's most important is that we were also able to serve many more customers, and not just at double the ratio: we also managed to improve efficiency, so we were able to reach fourfold, reaching 2 million customers.
But still, we were trying to make the best of this tailwind. We were operating at max capacity. We were expanding, because we still had our business plans, right?
Those we also needed to push forward, because you cannot just say, "It's a pandemic, I'm not going to make the targets." So we were also expanding ambitiously in the Netherlands, in Germany, and in France. We started running, then designing, our first distribution center, in that order. So you can imagine, it was not a great sequence. We were forced into this a little bit by the toilet paper debacle.
Then we were also designing and building our first-ever automated warehouse, which opened in [unclear]. We are in the UK, so it's quite fitting that I have a picture of a king in my deck. So this is actually the prince, and he came to open our FC, and he has a custom orange box there. This was a huge milestone, but I think it's also nice to show you a bit of how it looks from the inside.
Luis: Yep.
Katherin: Which also has no sound, but I can describe it. Some of you may know this: it is a goods-to-person concept. In our manual warehouses, the people go to the goods. Here we have a fully automated warehouse: 50 kilometers of conveyor belts, high automation, and we not only manage and own the software that runs this, but also the hardware control system.
So we were super pumped. The king came, we opened this warehouse. This is the next level in our growth. Big new toy, and a whole new category of incidents that we were really not prepared to cope with. It's a custom robot: 42,000 square meters, 50 kilometers of conveyor belts, operating 24/7. Not so nice for on-call, when on-call is a group of heroes and you just call the people.
So we had many big incidents. The chill cell shutting down in the middle of the night: the fail-safe failed, the monitoring failed. You can imagine integrations between hardware and warehouse management not working and the FC being blocked. Really painful outages, with real customer impact. We were by now spread very thin.
We were reaching scaling limits, both in technology and in people. We had many incidents across the organization: ordering multiple trucks of mangoes, which actually happened; almost paying suppliers three times for the same invoice (we did not pay them, but almost). So lots of crazy incidents. And we had to come to a realization, which is that we were not really working together. Yes, we were collaborating very closely, but we were not realizing that instability in one part was having a ripple effect in another part, again and again. So we had to do some self-reflection and realize that we needed to stop solving only for today by applying fixes, and actually spend time on developing future-proof solutions that would enable us to be more flexible in the future. That was a big shift in our mentality and in how we approached challenges.
Luis: So that brings us to the last chapter in our story, where we go to structured scale-up, from 2023-ish to now, and where we want to do less firefighting and more future-proofing. We are no longer a five-vehicle company; we are now operating, as Katherin said, almost everywhere. Deliveries continue to rise exponentially. The number of people started to flatten out, so that's good for automation. But what got us here probably will not get us where we want to go next.
What we realized is that alignment is not guaranteed. There are more people. It's harder to connect, harder to share a vision, to share agreements, to share processes that were unwritten at this point. There's this dichotomy between tech debt and new features always happening. It was not great, but we actually realized that we can improve together, and that there is a flywheel effect of improvement: improvements in one team make other teams actually want to improve and continue this cycle, which builds momentum to mutually grow each other. We also want to do more learning and less firefighting.
Don't get me wrong, we still want to get things fixed as fast as possible. That's one of our core beliefs: to solve for our customers, to get them what they ordered at the right time, at the right place. But we want to have fewer repeated incidents, and more energy spent on long-term solutions and not only on firefighting.
Mr. Murphy wasn't super opinionated. There was some process, but we had to think more about that. We actually did not have a lot of metrics, so we couldn't really measure how we were doing, and we wanted to do better at that. Not a lot of postmortems existed, and we also wanted to learn from things. So this was a critical area that we saw we were not covering. Mr. Murphy, as I said, was not really fit for purpose anymore. We couldn't really find support in the organization to improve it, or space among the many things that we do. So now we faced the question: do we accept Murphy's Law, or do we turn it around and make it a strength?
So, what we actually did. First, 24/7 on-call. Before, it was this group of heroes, best effort. Now we have a formalized on-call process. Everybody participates in it: engineers, product managers, product owners, operations, directors, et cetera. It brings back a bit of the old spirit of "we are in this together," right? So that is really good.
Next, Mr. Murphy. Given this conference, of course: we bought incident.io and decommissioned Mr. Murphy. I'm personally very happy about this. It gives us all the structure for on-call, incident management, and status pages. So I guess we are one of those companies that use many products from incident.io, and we are very happy.
We formalized processes. We defined SLOs and SLAs with our customers (my customers, Katherin). We have a library of accessible documentation, runbooks, et cetera. We keep improving on those things and, I think more importantly, we gave the teams a mandate to improve the situation.
Teams are now empowered to go and propose things and do things that make their lives and our lives better. The post-incident process that I mentioned, we were not really good at that. We took some best practices and learnings from the industry on how to do this. We provided training for the engineers on how to run incident management. Actually not...
Katherin: No, they didn't train the business. So incident.io was rolled out, but yeah, within a couple of weeks we were trained. So then we got rolling.
Luis: And I think the testament to how easy it is, is that the first incident created in incident.io came even before we announced the proof of concept. Before it was even fully configured, people just started using it and ran an incident. It just worked. I was very amazed.
And now we have some metrics with which we can steer the process and learn what we can improve. We also kept this blameless culture, this culture of improving together, of being together in the same boat. The effect of this is that our product roadmaps are now much more influenced by future-proofing, by fixing the things that happen today so they don't happen in the future.
Is this the perfect world? No. And listening to what people have already said at this conference, we need to improve a lot more, and we will. But I feel that now we have the groundwork in place to keep iterating and keep improving. And we want to leave you with a few... whoops.
Katherin: Sorry, I'll click.
Luis: ...takeaways.
Katherin: Collaboration. So we want indeed to leave you with some of the takeaways from what we have learned so far. The first one would be: structure growth and nurture the culture. We have a high-energy, can-do attitude. At the beginning I asked about standards or processes, and those were like forbidden words; you just go and do. So it doesn't come naturally to us.
But we have realized in the last years that by structuring our growth, we're also able to funnel our energy into really building the future.
Luis: Let go of what no longer serves you. Mr. Murphy served us for a long time, but question the things you are using to see if they are still fit for purpose. Don't be afraid to let go of things from the past and go for the new things that can propel you even further in the future.
Katherin: Build trust to push through the hard times. We always work together in a humble and open way. We own up to our mistakes, and there is a blameless culture, but it's really key to build that trust so that you can push through the hard times. So assume good intent, and that will unlock the flywheel.
Luis: Tools are the engine, not the journey. No tool will solve all your problems, that's for sure. But pick the tool that gets you the baseline, that gets things going, and that can grow with you toward a better future. Also, don't think that all your problems are unique, that you are very special and need to build your own things. You're probably not, and you can learn from the tools and the industry.
Katherin: And last, don't forget to have fun. This is a silly one, of course, but we are very serious when it comes to our values and our data-driven decision making, for instance. But we also don't take ourselves too seriously, and we try to celebrate every win together. I think that also makes a big difference.
So this is where the storytelling of our first 10 years ends. We're of course still writing the story as we speak, but we hope that by openly sharing our struggles and hiccups, but also our wins, we enable you to take some learnings and build your own future. Thank you very much.

London 2025 Sessions