Fabrizio De Cicco
IoT Infrastructure Architect & Lead DevOps Engineer
I build scalable, reliable, and secure cloud foundations for connected products — bridging development and operations across Azure and Kubernetes through Infrastructure as Code, CI/CD, observability, and SRE.
About
I lead the IoT infrastructure domain at Culligan International — designing and evolving our global IoT Platform's cloud architecture, automation, and observability so software and data teams can ship faster with confidence.
My focus is bridging development and operations into systems that are efficient, maintainable, and resilient. Experienced across Terraform, Azure DevOps, Kubernetes, and cloud cost optimization, I care as much about reliability and security baselines as I do about developer experience.
I enjoy mentoring engineers, simplifying complexity, and driving collaboration across teams — turning infrastructure into something the whole organization can build on and trust.
How I work
Reliability is a feature.
Design for failure, measure what users actually feel, and treat error budgets as real budgets.
Automate the toil.
If I've done it by hand twice, it becomes code. Humans are for judgment, not repetition.
You can't operate what you can't see.
Observability isn't an add-on — metrics, logs, and SLOs ship with the system, not after it.
Boring and predictable wins.
Simple, well-understood infrastructure beats clever and fragile every single time.
Infrastructure as code, or it didn't happen.
Everything reproducible, reviewable, and versioned. No snowflakes, no surprises.
Lead by enabling.
Mentor, document, and remove friction so the team moves faster than any one person could.
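To make "treat error budgets as real budgets" concrete, here is a minimal sketch of the arithmetic behind an error budget. The function name, the field names, and the example numbers are all assumptions chosen for illustration; they are not figures from any real service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    allowed_failures: float  # failures the SLO tolerates in the window
    remaining: float         # failures still available before the SLO is breached
    consumed: float          # fraction of the budget already spent (1.0 = fully spent)
    exhausted: bool          # True once more failures occurred than the SLO allows


def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> ErrorBudget:
    """Compute a request-based error budget for one measurement window.

    slo_target is the success-rate objective as a fraction, e.g. 0.999 for 99.9%.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests < 0 or not 0 <= failed_requests <= total_requests:
        raise ValueError("request counts are inconsistent")

    # The budget is the share of traffic the SLO permits to fail.
    allowed = (1.0 - slo_target) * total_requests
    remaining = allowed - failed_requests
    consumed = failed_requests / allowed if allowed > 0 else 0.0
    return ErrorBudget(allowed, remaining, consumed, remaining < 0)


if __name__ == "__main__":
    # Hypothetical window: 99.9% target, one million requests, 250 failures.
    budget = error_budget(0.999, 1_000_000, 250)
    print(f"consumed {budget.consumed:.0%} of the budget, exhausted={budget.exhausted}")
```

Under these assumed numbers about a quarter of the budget is spent, which is the kind of signal that can inform whether to keep shipping or to slow down and invest in reliability.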
Experience
- Promoted to lead IoT infrastructure architecture and reliability across the Culligan IoT Platform organization.
- Own end-to-end infrastructure design and operations, ensuring scalability, security, and cost efficiency.
- Architect and evolve the platform's CI/CD, IaC, and observability frameworks supporting global IoT solutions.
- Automate provisioning with Terraform and Azure DevOps pipelines; implement Datadog monitoring that reduced incident MTTR.
- Mentor and guide the DevOps/SRE team, driving automation, observability, and reliability practices.
- Built and maintained scalable CI/CD pipelines on Azure DevOps using YAML across a large application portfolio.
- Automated infrastructure provisioning with Ansible, Bicep, and Rundeck, cutting manual configuration work.
- Wrote automation tooling in PowerShell and Bash that accelerated deployment cycles and reduced errors.
- Rolled out observability and logging with Datadog, improving issue detection and resolution.
- Supervised and mentored a 12-person application support team, improving SLA compliance and productivity.
- Collaborated across developers, analysts, and vendors to accelerate incident resolution and customer satisfaction.
- Developed backend and web applications in C#, SQL, ASP.NET, and Angular using Agile and TDD.
- Maintained high availability of microservices on Azure with Docker and Kubernetes.
Skills & stack
Cloud & platform
IaC & automation
CI/CD & delivery
Containers & orchestration
Observability & SRE
Languages & data
Selected work
Multi-agent engineering assistant
A multi-agent AI system that helps a platform team review, test, ship, and operate software with more consistency and less toil — adopted across the lead engineering team.
Problem
Across a growing set of repositories, code review, QA, release coordination, and reliability monitoring were largely manual — slow to perform, hard to keep consistent, and a constant pull on senior engineers' time.
Approach
I designed and built a set of specialist AI agents in Claude Code, each owning a single workflow and sharing a common library of reusable skills. They integrate with the team's existing toolchain through MCP, sit behind safety guardrails that gate anything destructive, and feed a weekly self-analysis loop that surfaces gaps. Two decisions shaped it: organizing agents by workflow rather than by role, so each carries only the context it needs, and pushing shared conventions into a skills layer to avoid duplicating knowledge.
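The shape of that design can be sketched in a few lines of plain Python. This is an illustrative toy only: it does not use or reflect the Claude Code or MCP APIs, and every name in it (`SKILLS`, `DESTRUCTIVE`, `Agent`, `dispatch`, the skill names) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Shared skills layer: each convention is defined once and reused by every agent.
SKILLS: Dict[str, Callable[[str], str]] = {
    "summarize_diff": lambda target: f"diff summary for {target}",
    "run_tests": lambda target: f"tests executed for {target}",
    "delete_branch": lambda target: f"deleted merged branch for {target}",
}

# Guardrail: skills listed here never run without explicit approval.
DESTRUCTIVE = {"delete_branch"}


@dataclass(frozen=True)
class Agent:
    workflow: str            # the single workflow this agent owns
    skills: Tuple[str, ...]  # only the skills (and context) that workflow needs

    def run(self, target: str, approved: bool = False) -> List[str]:
        results = []
        for name in self.skills:
            if name in DESTRUCTIVE and not approved:
                # Gate the destructive step instead of executing it.
                results.append(f"{name}: blocked (needs approval)")
                continue
            results.append(SKILLS[name](target))
        return results


# Agents are organized by workflow, not by job role.
AGENTS = [
    Agent("code review", ("summarize_diff",)),
    Agent("qa", ("run_tests",)),
    Agent("release", ("run_tests", "delete_branch")),
]


def dispatch(agents: List[Agent], target: str, approved: bool = False) -> Dict[str, List[str]]:
    """Orchestrator: hand the same target to each specialist and collect results."""
    return {agent.workflow: agent.run(target, approved) for agent in agents}


if __name__ == "__main__":
    for workflow, results in dispatch(AGENTS, "PR-1").items():
        print(workflow, "->", results)
```

Keeping skills in one registry means a convention changes in a single place, and keeping the destructive set separate from the agents means the guardrail applies no matter which agent requests the skill.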
Outcome
Rolled out to the lead engineering team and now part of how the platform ships. I measure it against adoption, cycle time, and defect-leakage baselines rather than vanity metrics, and the weekly feedback loop steadily improves the agents over time.
$ review --pr 1487
→ orchestrator dispatching specialists
✓ code review   standards + pipeline impact
✓ qa            branch checkout + tests
✓ release       promotion readiness
✓ reliability   SLOs + incident signals
✓ review ready: 0 blocking, 2 suggestions
Internal tooling — happy to walk through the architecture and the results.
Certifications
Full list of licenses & certifications on LinkedIn.
Contact
Let's build something reliable.
Happy to talk infrastructure, reliability, or AI-assisted engineering — reach out and let's chat.
Usually responds within a day.