Agentic Failure Triage System
LLM-powered root-cause analysis for CI/CD test failures at scale
Built at Ciena to address a core challenge in large-scale hardware validation: extracting signal from thousands of test failures per build cycle. An agentic workflow ingests structured failure logs from Jenkins CI/CD runs, uses LangGraph to orchestrate multi-step LLM reasoning chains, and produces clustered root-cause summaries with confidence scores and suggested remediation paths, compressing engineer triage time from hours to minutes.
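
To make the ingest, cluster, and summarize flow concrete, here is a minimal sketch in plain Python. It is illustrative only and not the production implementation: it does not use LangGraph, the LLM step is a caller-supplied stub, and every name in it (Failure, Cluster, normalize, triage) is hypothetical. The confidence value is a placeholder heuristic (each cluster's share of all failures), assumed here only to show where a score would attach.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Failure:
    """One failed test from a CI run (hypothetical record shape)."""
    test_name: str
    message: str


@dataclass
class Cluster:
    """Failures that share a normalized error signature."""
    signature: str
    failures: list[Failure] = field(default_factory=list)
    summary: str = ""
    confidence: float = 0.0


def normalize(message: str) -> str:
    """Mask volatile tokens so similar errors collapse to one signature."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)  # hex literals
    message = re.sub(r"\d+", "<n>", message)               # remaining numbers
    return message.strip().lower()


def cluster_failures(failures: list[Failure]) -> list[Cluster]:
    """Group failures by normalized message."""
    groups: dict[str, list[Failure]] = defaultdict(list)
    for failure in failures:
        groups[normalize(failure.message)].append(failure)
    return [Cluster(signature=sig, failures=items) for sig, items in groups.items()]


def triage(failures: list[Failure], llm: Callable[[str], str]) -> list[Cluster]:
    """Cluster failures, then ask the (stubbed) LLM for a summary per cluster.

    Assumption: confidence is simply the cluster's share of all failures.
    A real system would derive it from the model or from historical data.
    """
    clusters = cluster_failures(failures)
    total = len(failures)
    for cluster in clusters:
        cluster.summary = llm(cluster.signature)
        cluster.confidence = len(cluster.failures) / total
    # Largest clusters first, so the most widespread root cause leads the report.
    return sorted(clusters, key=lambda c: c.confidence, reverse=True)
```

Calling triage(failures, llm) with any function that maps a signature string to a summary string returns the clusters ordered by size. In an orchestrated version, each of these functions would plausibly become one step in the reasoning chain, with the stub replaced by a real model call.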