Free Carrier KPI Tool

Stop running carrier scorecards on a stale spreadsheet.

Drop in your carrier data. We benchmark it against the ClickPost network — segmented by your vertical — and generate a board-ready scorecard in 2 minutes. Built for VPs and Directors of Logistics who've outgrown the monthly Excel ritual.

  • 8
    Performance metrics
  • 5
    Vertical benchmarks
  • 600+
    Carriers in network
  • $0
    Cost to use
The Generator

Carrier performance scorecard generator

Pick your vertical, drop in your carrier numbers from the last 30/60/90 days, and we'll generate a benchmarked scorecard with grades, deltas, and a verdict — instantly.

01 · Pick your vertical
02 · Enter your carrier data
Carrier · OTD % · 1st-Attempt % · RTO % · Transit (days) · Damage % · Pickup % · Cost / shp · Complaint %
Most teams benchmark 3–6 carriers · leave fields blank if not tracked · max 10 rows
Live scorecard

Carrier Performance Scorecard

Benchmarked against: ClickPost Network
Benchmarks reflect aggregated, anonymized carrier performance from the ClickPost network (Q4 2025). Verticals normalized to comparable shipment profiles. Individual results vary with mix, geography, SLA tier, and seasonality.
How it works

From scattered data to scorecard in four steps

No login, no upload. Pick your vertical, paste in your carrier numbers, and get a board-ready scorecard you can share with your COO or carrier partners.

  1. Step 1

    Pick your vertical

    Choose D2C, Marketplace, Quick Commerce, B2B/Freight, or Cross-border. Benchmarks shift dramatically — a 91% OTD is excellent for cross-border and mediocre for Q-commerce, so we cut the data accordingly.

  2. Step 2

    Drop in your numbers

    Add your carriers and paste in last 30/60/90-day data across 8 KPIs: OTD, 1st-attempt delivery, RTO, transit, damage, pickup success, cost, and complaint rate. Leave fields blank if you don't track them — partial data is fine.

  3. Step 3

    Get your scorecard

    Each carrier gets a weighted overall score, letter grade (A/B/C/D), and per-metric variance vs. ClickPost network benchmark. Color-coded green/red so the laggards jump out immediately.

  4. Step 4

    Share & act

    Download as a board-ready PDF or share the link with your carrier partners ahead of your monthly business review. Or automate the whole thing with ClickPost and stop running this manually every month.

The KPIs

The eight metrics on every carrier scorecard

These are the eight KPIs we benchmark, why they matter, and how much they influence the overall carrier score. Together they cover the full delivery experience — from pickup to doorstep to refund.

  • 🎯

    On-Time Delivery (OTD)

    Weight: 22% · Higher is better

    % of shipments delivered within the promised SLA window. The single most-watched carrier KPI — directly tied to NPS, repeat purchase rate, and WISMO ticket volume. D2C network avg: 91.5%. Below 85% triggers carrier reviews; above 95% is best-in-class.

    Inverse of: Late Delivery Rate
  • 📞

    1st-Attempt Delivery Success (FADS)

    Weight: 18% · Higher is better

    % of shipments delivered on the first attempt — i.e., the inverse of NDR rate. Failed first attempts cascade into rescheduling overhead, customer complaints, and eventually RTOs. D2C network avg: 87%. Each percentage point of NDR improvement reduces RTO by 0.4–0.6pp.

    Inverse of: NDR rate
  • ↩️

    Return-to-Origin (RTO)

    Weight: 18% · Lower is better

    % of shipments returned to origin without successful delivery. Combines refused orders, repeated NDR failures, and address issues. The most expensive carrier failure mode: each RTO costs the brand the forward leg, the reverse leg, and the lost margin. D2C avg: 4.8%.

    Drives: Reverse logistics cost
  • ⏱️

    Average Transit Time

    Weight: 12% · Lower is better

    Mean number of days from pickup to delivery. Affects EDD accuracy, conversion at checkout, and customer perception. D2C avg: 3.2 days. Q-commerce: 0.4 days. Long-tail carriers often inflate transit by 1–2 days vs. their stated SLA — track this monthly.

    Drives: EDD reliability
  • 📦

    Damage / Loss Rate

    Weight: 10% · Lower is better

    Damage and loss claims per 100 shipments. A small absolute number with outsized impact — each damage event triggers a refund, a replacement, and often a 1-star review. D2C avg: 0.6%. Above 1% indicates carrier handling problems or packaging mismatch.

    Drives: Refund + replacement cost
  • 🚚

    Pickup Success Rate

    Weight: 8% · Higher is better

    % of scheduled pickups completed within window. The leading indicator nobody tracks until it breaks. Failed pickups push entire shipments into next-day SLA breach territory. D2C avg: 96%. Below 92% means you're starting late on every dispatch cycle.

    Leading indicator for: OTD
  • 💰

    Cost per Shipment

    Weight: 6% · Lower is better

    Blended landed cost per shipment (freight + last-mile + surcharges, excluding RTO). Lower weight on the scorecard because cost is table-stakes — most teams already have this in their carrier contract. The point is to flag carriers charging above-market for below-market service.

    Drives: Gross shipping margin
  • 💬

    Customer Complaint Rate

    Weight: 6% · Lower is better

    % of shipments generating a delivery-related complaint or WISMO ticket. The voice-of-customer KPI that catches issues operational metrics miss — rude delivery agents, mishandled COD, location accuracy problems. D2C avg: 1.4%. Often the earliest signal of brand-damage risk.

    Drives: CSAT + brand risk
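If you are pulling these numbers out of raw shipment records rather than a carrier report, the percentage KPIs above can be derived along these lines. This is a minimal sketch, not the tool's implementation: the `Shipment` fields are hypothetical names chosen for illustration, and it assumes every KPI uses total shipments as the denominator, as the definitions above suggest.

```python
from dataclasses import dataclass


@dataclass
class Shipment:
    # Hypothetical record layout, for illustration only.
    delivered: bool            # reached the customer
    on_time: bool              # delivered within the promised SLA window
    attempts: int              # number of delivery attempts made
    returned_to_origin: bool   # ended as an RTO


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal; 0.0 when there is no data."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def carrier_kpis(shipments: list[Shipment]) -> dict[str, float]:
    """Compute OTD, 1st-attempt delivery, and RTO as % of all shipments."""
    total = len(shipments)
    on_time = sum(s.delivered and s.on_time for s in shipments)
    first_try = sum(s.delivered and s.attempts == 1 for s in shipments)
    rto = sum(s.returned_to_origin for s in shipments)
    return {
        "otd_pct": pct(on_time, total),
        "first_attempt_pct": pct(first_try, total),
        "rto_pct": pct(rto, total),
    }
```

The same pattern extends to damage, pickup success, and complaint rate once you have a flag for each event.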
What drives carrier scores

Eight factors that determine carrier performance

Carrier performance isn't just about the carrier. Eight variables — most of them upstream of the carrier itself — explain why some accounts hit benchmark and others don't, even on the same lane.

Factor 01

Vertical & shipment profile

D2C apparel, B2B freight, and Q-commerce groceries can't be benchmarked against each other. Profile differences (weight, value, urgency, addressability) explain 30–40% of headline KPI variance. Always compare like-for-like — that's why we segment benchmarks by vertical.

Factor 02

Geographic mix

Tier-1 metros, tier-2 cities, and rural pin codes have wildly different OTD baselines. A carrier with 95% OTD in Bangalore may run 78% in tier-3 UP. Always look at carrier performance bucketed by region — a "good" carrier in one zone can be a disaster in another.

Factor 03

SLA tier selected

Same carrier, different service tier — wildly different performance. Express tier usually hits 96%+ OTD; surface/economy tier often runs in the low 80s. Make sure your scorecard separates carriers by tier, not just by carrier brand. Otherwise you're averaging out signal.

Factor 04

Address quality

40–60% of NDR cases trace back to address issues — incomplete pin codes, missing landmarks, wrong phone numbers. A carrier scoring poorly on NDR may have a great delivery network and a bad upstream data pipe. Audit your address validation before blaming the carrier.

Factor 05

EDD logic & customer comms

"Late delivery" is partly a comms problem. Aggressive EDD promises at checkout cause SLA breaches that wouldn't have existed with realistic dates. Your scorecard's OTD number reflects both carrier speed and your own EDD discipline. Tighten one to fix the other.

Factor 06

NDR handling workflow

Two carriers with similar 1st-attempt rates can have wildly different RTO outcomes — depending on whether NDR triggers an automatic re-attempt request, a customer call, or sits idle for 48 hours. Workflow design accounts for 1.5–3pp of RTO variance between brands.

Factor 07

Pickup scheduling discipline

Pickup success is largely controllable upstream. Late manifests, poorly scheduled pickup windows, and unmet ready-by times cause 60% of failed pickups. Before blaming the carrier on pickup, look at your warehouse outbound rhythm and dispatch SOPs.

Factor 08

Volume allocation strategy

If you ship 10K orders/month split 9K/1K between two carriers, the small-share carrier often runs poorly — they're not getting the consolidated route density to perform. Allocation isn't just about cost; below a threshold, performance degrades structurally. Monitor this.
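To see how much geographic mix alone can move a headline number, here is a small sketch that blends regional OTD by volume. The 95% and 78% figures come from the Factor 02 example above; the shipment volumes and the function name are assumptions made up for illustration.

```python
def blended_otd(regions: dict[str, tuple[int, float]]) -> float:
    """Volume-weighted OTD % across regions given {region: (shipments, otd_pct)}."""
    total = sum(volume for volume, _ in regions.values())
    if total == 0:
        raise ValueError("no shipments to blend")
    return sum(volume * otd for volume, otd in regions.values()) / total


# Same carrier, same regional performance, two different (hypothetical) mixes.
metro_heavy = {"metro": (9000, 95.0), "tier3": (1000, 78.0)}
tier3_heavy = {"metro": (4000, 95.0), "tier3": (6000, 78.0)}

print(round(blended_otd(metro_heavy), 1))  # 93.3
print(round(blended_otd(tier3_heavy), 1))  # 84.8
```

The carrier did nothing differently between the two cases; only the shipper's mix changed, which is why the regional cut matters before drawing conclusions.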

The math

How the carrier score is calculated

Scorecards are only as useful as the methodology behind them. Here's exactly how each carrier's overall score and grade are calculated — fully transparent so you can defend the output to your team and your carrier partners.

  • Per-Metric Points

    Variance to benchmark

    delta = (actual − bench) ÷ bench

     

    ≥ +5% better → 90–100 pts

    ±5% (in line) → 75 pts

    5–10% worse → 60 pts

    10–20% worse → 35 pts

    20–35% worse → 15 pts

    > 35% worse → 0 pts

    Direction-aware: higher is better for OTD/1st-attempt delivery/pickup

    Lower is better for RTO/transit/damage/cost/complaint

  • Letter Grade

    Grade boundaries

    A → 85–100 (top performer)

    B → 70–84 (in line / above)

    C → 55–69 (material gaps)

    D → 0–54 (review required)

    Grade B = "performing at the network avg"

    Grade D almost always = upstream issues

    Use grades for QBR talking points
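The bands and grade boundaries above can be sketched in a few lines of Python. Treat this as one possible reading rather than the generator's actual code. Three things are assumptions, because the page does not spell them out: points inside the 90–100 band rise linearly from 90 at +5% to 100 at +10%, a value exactly on the 5%, 10%, or 20% boundary takes the lower score, and a fractional score such as 84.5 stays in the lower grade.

```python
def improvement(actual: float, bench: float, higher_is_better: bool) -> float:
    """Relative variance vs. benchmark, signed so that positive always means better."""
    delta = (actual - bench) / bench
    return delta if higher_is_better else -delta


def metric_points(imp: float) -> float:
    """Map a signed relative improvement to points using the bands listed above."""
    if imp >= 0.05:
        # Assumption: linear fill of the 90-100 band, capped at 100 from +10% upward.
        return min(100.0, 90.0 + (imp - 0.05) / 0.05 * 10.0)
    if imp > -0.05:
        return 75.0   # in line with benchmark
    if imp > -0.10:
        return 60.0   # 5-10% worse
    if imp > -0.20:
        return 35.0   # 10-20% worse
    if imp >= -0.35:
        return 15.0   # 20-35% worse
    return 0.0        # more than 35% worse


def letter_grade(score: float) -> str:
    """Letter grade for an overall score on the 0-100 scale."""
    if score >= 85:
        return "A"
    if score >= 70:
        return "B"
    if score >= 55:
        return "C"
    return "D"
```

For example, a D2C carrier at 93.0% OTD against the 91.5% benchmark is about 1.6% better, which lands in the in-line band (75 points). An RTO of 6.0% against 4.8% is 25% worse, which scores 15 points.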

Scorecard weighting

How metric weights shift by vertical

Not every KPI matters equally to every business. Q-commerce penalizes a transit miss harder than D2C does. B2B weights damage rate higher because the unit value is higher. Here's how the scorecard adapts.

Metric | D2C | Marketplace | Q-commerce | B2B / Freight | Cross-border
On-Time Delivery | 22% (primary KPI) | 22% | 28% | 24% | 22%
1st-Attempt Delivery | 18% (NDR proxy) | 18% | 18% | 16% | 16%
Return-to-Origin | 18% (cost driver) | 18% | 12% | 12% | 14%
Avg Transit Time | 12% | 12% | 14% (SLA-critical) | 14% | 14%
Damage / Loss | 10% | 10% | 10% | 14% (high-value goods) | 12%
Pickup Success | 8% | 8% | 8% | 8% | 6%
Cost per Shipment | 6% | 6% | 4% | 6% | 10%
Customer Complaint | 6% | 6% | 6% | 6% | 6%

* Weights sum to 100% per vertical. Q-commerce weights OTD heaviest because the SLA window is sub-30-minute and customers churn fast on a single missed promise. B2B weights damage higher because consignment value is higher and refurbishment is rarely possible. Cross-border weights cost higher because customs-cleared landed cost is what shippers actually optimize against.
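Here is a rough sketch of how per-metric points could roll up into one overall score using the weights above. The weight values are copied from the table. The metric keys, the function name, and the handling of blank fields are assumptions: the page says partial data is fine but not how it is treated, so this version drops missing metrics and renormalizes the remaining weights.

```python
# Percent weights per vertical, taken from the weighting table.
WEIGHTS = {
    "D2C":           {"otd": 22, "first_attempt": 18, "rto": 18, "transit": 12,
                      "damage": 10, "pickup": 8, "cost": 6, "complaint": 6},
    "Marketplace":   {"otd": 22, "first_attempt": 18, "rto": 18, "transit": 12,
                      "damage": 10, "pickup": 8, "cost": 6, "complaint": 6},
    "Q-commerce":    {"otd": 28, "first_attempt": 18, "rto": 12, "transit": 14,
                      "damage": 10, "pickup": 8, "cost": 4, "complaint": 6},
    "B2B / Freight": {"otd": 24, "first_attempt": 16, "rto": 12, "transit": 14,
                      "damage": 14, "pickup": 8, "cost": 6, "complaint": 6},
    "Cross-border":  {"otd": 22, "first_attempt": 16, "rto": 14, "transit": 14,
                      "damage": 12, "pickup": 6, "cost": 10, "complaint": 6},
}


def overall_score(vertical: str, points: dict[str, float]) -> float:
    """Weighted average of per-metric points (0-100) for one carrier.

    Assumption: metrics missing from `points` are skipped and the remaining
    weights are renormalized so they still sum to 100%.
    """
    used = {m: w for m, w in WEIGHTS[vertical].items() if m in points}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no scored metrics for this carrier")
    return sum(points[m] * w for m, w in used.items()) / total_weight
```

With every metric in line (75 points) except a perfect 100 on OTD, the same carrier scores 80.5 under D2C weights and 82.0 under Q-commerce weights, which shows how the vertical changes the result.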

Network benchmarks

ClickPost network benchmarks by vertical

The reference data the scorecard runs against. These are the network averages — i.e., what a "B-grade, in-line-with-market" carrier looks like for your vertical. Top performers run 5–10% better; D-grade carriers run 15–25% worse.

Vertical | OTD | 1st-Attempt | RTO | Transit | Damage | Cost / shp
D2C / E-commerce | 91.5% (retail benchmark) | 87.0% | 4.8% | 3.2 d | 0.6% | ₹78
Marketplace | 89.0% | 84.0% | 6.5% | 3.8 d | 0.9% | ₹72
Quick Commerce | 96.5% (tightest SLA) | 94.0% | 1.2% | 0.4 d | 0.3% | ₹45
B2B / Freight | 87.0% (retail benchmark) | 92.0% | 1.5% | 4.5 d | 1.2% (higher value) | ₹240
Cross-border | 82.0% (retail benchmark) | 86.0% | 3.5% | 7.5 d (incl. customs) | 1.0% | ₹420

* Benchmarks reflect the ClickPost network as of Q4 2025. D2C/Marketplace/Q-commerce/B2B costs in INR; cross-border in USD. Top-quartile carriers consistently exceed these numbers; bottom-quartile carriers run 15–25% below. For carrier-specific or lane-specific benchmarks, ClickPost customers can pull AWB-level data live from the platform.
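If you want to reproduce the per-metric variance column yourself, the benchmark table can be held as a simple lookup. The values below are copied from the table above. Cost per shipment is left out of this sketch because its currency differs by vertical. The key names and the `variance_report` helper are hypothetical.

```python
# Network benchmarks by vertical (OTD %, 1st-attempt %, RTO %, transit days, damage %).
BENCHMARKS = {
    "D2C":            {"otd": 91.5, "first_attempt": 87.0, "rto": 4.8, "transit": 3.2, "damage": 0.6},
    "Marketplace":    {"otd": 89.0, "first_attempt": 84.0, "rto": 6.5, "transit": 3.8, "damage": 0.9},
    "Quick Commerce": {"otd": 96.5, "first_attempt": 94.0, "rto": 1.2, "transit": 0.4, "damage": 0.3},
    "B2B / Freight":  {"otd": 87.0, "first_attempt": 92.0, "rto": 1.5, "transit": 4.5, "damage": 1.2},
    "Cross-border":   {"otd": 82.0, "first_attempt": 86.0, "rto": 3.5, "transit": 7.5, "damage": 1.0},
}
HIGHER_IS_BETTER = {"otd", "first_attempt"}


def variance_report(vertical: str, actuals: dict[str, float]) -> dict[str, float]:
    """Signed % variance vs. benchmark per metric; positive means better than benchmark."""
    bench = BENCHMARKS[vertical]
    report = {}
    for metric, actual in actuals.items():
        delta = (actual - bench[metric]) / bench[metric] * 100
        report[metric] = round(delta if metric in HIGHER_IS_BETTER else -delta, 1)
    return report


print(variance_report("D2C", {"otd": 93.0, "rto": 6.0}))
# {'otd': 1.6, 'rto': -25.0}
```

Metrics you leave out of `actuals` are simply not reported, which mirrors the "leave fields blank if not tracked" behaviour of the form.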

Action playbook

Nine ways to use a carrier scorecard well

A scorecard isn't worth running unless it changes a decision. Here's what the best logistics teams do with theirs — and what separates a scorecard that drives accountability from one that gets filed away.

  • 1
    Monthly

    Run it on a fixed monthly cadence

    The same week, same day, every month. Carrier partners will adapt their internal review cycle to yours once they realize the scorecard discussion is non-negotiable. Quarterly cadence is too slow to catch performance drift; weekly is too noisy.

  • 2
    QBR

    Share with carriers before reviews

    Send the scorecard 2–3 days before your carrier business review. Gives the account manager time to bring data, not excuses. Cold-handing them a low score in the meeting wastes the meeting on defending; warm-handing it forces a real action plan.

  • 3
    Bottom 2

    Focus the discussion on the bottom 2 metrics

    Don't try to fix everything. Pick the two metrics where this carrier is most below benchmark, and drive a 60-day improvement plan on those. Carriers that try to fix six metrics at once usually fix none. Constraint creates outcome.

  • 4
    ↑ Volume

    Reallocate volume — not just words

    The scorecard only matters if it's tied to carrier mix decisions. Top-grade carriers should see volume increase; D-grade carriers should see volume decrease. If your scorecard never changes the allocation split, your carriers won't take it seriously.

  • 5
    Region

    Cut by region for the second look

    Headline carrier scores hide regional differences. After looking at the topline scorecard, always pull a second view sliced by zone — a carrier scoring 78 overall might be a 92 in metros and a 58 in tier-3. The fix is rerouting tier-3 volume, not firing the carrier.

  • 6
    SLA

    Score by service tier, not just carrier

    Treat express + surface as separate "carriers" for scoring purposes. Same brand, different operational reality. Aggregating them masks tier-level performance issues — and makes you blame the wrong tier when you reroute volume.

  • 7
    Upstream

    Audit upstream before blaming carriers

    If multiple carriers are missing benchmark on the same metric (typically NDR or RTO), the issue isn't the carriers — it's upstream. Address quality, EDD logic, NDR workflows. Fix the upstream once and watch every carrier's score lift simultaneously.

  • 8
    Trend

    Track the trend, not the snapshot

    A carrier going from 78 to 82 is more interesting than a carrier sitting at 85. Direction matters more than altitude. Best-practice scorecards include a 3-month trend line per metric so you can spot improvement (reward it) and decay (escalate it) early.

  • 9
    Auto

    Automate before scaling carrier count

    Manual scorecards work for 3–4 carriers. Beyond that, the cost of stitching the data each month exceeds the value. Once you're managing 6+ carriers, get to a tool that auto-pulls AWB-level data and refreshes the scorecard live — like ClickPost.
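Two of the habits above, focusing on the bottom two metrics and tracking the trend, are easy to mechanize. The sketch below assumes you already have a signed variance per metric (negative means below benchmark) and a list of monthly overall scores. The function names and the 1-point tolerance for "flat" are assumptions for illustration.

```python
def bottom_two(variance_pct: dict[str, float]) -> list[str]:
    """Up to two metrics furthest below benchmark, worst first."""
    below = [(value, metric) for metric, value in variance_pct.items() if value < 0]
    return [metric for _, metric in sorted(below)[:2]]


def trend_label(monthly_scores: list[float], tolerance: float = 1.0) -> str:
    """Classify direction from the first to the last score in the window."""
    change = monthly_scores[-1] - monthly_scores[0]
    if change > tolerance:
        return "improving"
    if change < -tolerance:
        return "declining"
    return "flat"


print(bottom_two({"otd": -3.0, "rto": -25.0, "transit": 4.0, "damage": -10.0}))
# ['rto', 'damage']
print(trend_label([78, 80, 82]))  # improving
print(trend_label([85, 85, 85]))  # flat
```

The first call gives you the agenda for the next carrier review; the second tells you whether to reward or escalate.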

Vertical playbook

What "good" carrier performance looks like

No single benchmark fits every business. Here's how the bar shifts across the five major verticals — and what to focus on first when your scorecard goes red.

  • D2C
    E-commerce, retail brands

    Watch OTD and RTO most closely — they map directly to NPS and reverse-logistics cost. Average spend is small but volumes are high, so percentage-point shifts compound fast.

    Bench OTD: 91.5% · RTO: 4.8%
  • Marketplace
    Multi-seller platforms

    Higher RTO is structural (more first-time buyers, less brand trust). Track 1st-attempt delivery as the leading indicator. Carrier mix typically broader than D2C, so allocation matters more.

    Bench OTD: 89.0% · RTO: 6.5%
  • Q-commerce
    Sub-hour delivery

    OTD is everything: the SLA window is sub-30-minute, so misses are extremely punishing, and OTD carries the heaviest weight on the Q-commerce scorecard (28%). Treat pickup-to-doorstep time as the companion hero metric and track it alongside OTD.

    Bench OTD: 96.5% · Transit: 0.4 d
  • B2B
    Bulk, palletized, freight

    Damage rate matters more here — high consignment value, refurbishment usually impossible. Receiver-rated quality and appointment compliance often added to scorecards beyond the standard 8.

    Bench OTD: 87.0% · Damage: 1.2%
  • Cross-border
    International, customs-cleared

    Customs delays distort OTD — separate carrier-controllable from customs-controllable variance. Brokerage cost and DDP transparency drive cost-per-shipment more than freight rate itself.

    Bench OTD: 82.0% · Transit: 7.5 d

A carrier performance scorecard is a structured monthly tracking document that logistics leaders use to evaluate shipping carriers against KPIs like on-time delivery, RTO rate, NDR rate, transit time, damage rate, and cost per shipment. It standardizes carrier reviews, drives accountability conversations with carrier partners, and identifies which carriers deserve more (or less) volume allocation. ClickPost's free generator auto-creates one with vertical-specific benchmarks pre-filled — instead of building from scratch in Excel.

The eight metrics that matter most for D2C and retail logistics teams are: On-Time Delivery (OTD), First-Attempt Delivery Success or NDR rate, Return-to-Origin (RTO) rate, Average Transit Time, Damage and Loss rate, Pickup Success rate, Cost per Shipment, and Customer Complaint rate. Enterprise B2B teams may add receiver-rated quality and appointment compliance. Q-commerce teams should weight OTD and pickup-to-delivery time most heavily. Cross-border teams need to separate carrier-controllable variance from customs-controllable variance.

Benchmarks reflect aggregated, anonymized carrier performance data across the ClickPost shipping platform — billions of shipments processed across 600+ carrier integrations. Numbers are segmented by vertical (D2C, Marketplace, Quick Commerce, B2B/Freight, Cross-border) so you're comparing like-for-like. A 91% OTD is excellent for cross-border and mediocre for Q-commerce, so we cut benchmarks accordingly. Updated quarterly. Individual carrier performance varies with mix, geography, SLA tier, and seasonality.

On-time delivery benchmarks vary dramatically by vertical. D2C ecommerce: 91.5% network average, 94%+ for top performers. Marketplace: 89%. Quick commerce: 96.5% (because the SLA window is tight, sub-30-minute). B2B/Freight: 87%. Cross-border: 82% (customs delays distort the number). The standard a logistics leader should hold carriers to is the network average for their vertical — anything 5+ percentage points below that warrants a carrier conversation, and 10+ percentage points below warrants a volume reallocation discussion.
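The escalation rule in the paragraph above works in percentage points rather than relative percent, so it can be written as a simple gap check. This is a minimal sketch: the function name and return labels are made up for illustration, and it assumes a gap of exactly 5 or 10 points already triggers the corresponding step.

```python
def otd_action(actual_otd: float, bench_otd: float) -> str:
    """Suggested next step based on the percentage-point gap to the vertical benchmark."""
    gap_pp = bench_otd - actual_otd
    if gap_pp >= 10:
        return "volume reallocation discussion"
    if gap_pp >= 5:
        return "carrier conversation"
    return "no action"


print(otd_action(85.0, 91.5))  # carrier conversation (6.5 pp below D2C benchmark)
print(otd_action(80.0, 91.5))  # volume reallocation discussion (11.5 pp below)
```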

Monthly is the standard cadence for operational reviews — frequent enough to catch performance drift, slow enough to smooth out random week-to-week variance. Quarterly carrier business reviews (QBRs) should use rolling 90-day data for a longer-term view. Real-time dashboards (which ClickPost provides) are useful for daily triage but shouldn't replace the structured monthly review where you set commitments, track action items, and reallocate volume between carriers based on the previous month's performance.

Sharing scorecards directly with your carrier partners is one of the most effective ways to drive performance improvement. It creates accountability, shows you're tracking professionally, and gives carrier account managers the data they need to escalate internally. Best practice: share the scorecard 2–3 days before a monthly business review, focus the discussion on the bottom 2 metrics where the carrier is most below benchmark, and set joint commitments with deadlines. Carriers respond to data, not anecdotes, and to consistent professional pressure, not occasional complaint emails.

A missed benchmark is not always the carrier's fault, and this is where most logistics teams get it wrong. If multiple carriers are missing benchmark on the same metric, the issue is almost certainly upstream of the carriers. Common culprits: address quality (broken NDR), aggressive EDD promises (broken OTD), poor NDR handling workflows (broken RTO), late warehouse manifests (broken pickup). Audit your upstream operations first. If only one carrier is missing benchmark while others on similar lanes are hitting it, then the carrier is genuinely the problem, and that's a different conversation.
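One way to apply this check across a whole scorecard is to count how many carriers miss benchmark on each metric. The sketch below is only an illustration: the input shape, the function name, and reading "multiple" as two or more carriers are all assumptions.

```python
from collections import Counter


def upstream_suspects(misses: dict[str, set[str]], min_carriers: int = 2) -> list[str]:
    """Metrics missed by at least `min_carriers` carriers, which points upstream.

    `misses` maps each carrier name to the set of metrics where it is below benchmark.
    """
    counts = Counter(metric for metrics in misses.values() for metric in metrics)
    return sorted(metric for metric, n in counts.items() if n >= min_carriers)


misses = {
    "Carrier A": {"first_attempt", "rto"},
    "Carrier B": {"first_attempt"},
    "Carrier C": {"damage"},
}
print(upstream_suspects(misses))  # ['first_attempt']
```

Here 1st-attempt delivery is flagged for an upstream audit (address quality is the usual suspect), while Carrier C's damage rate stays a carrier-specific conversation.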

ClickPost automates the full carrier performance lifecycle: real-time tracking across 600+ carriers, AWB-level performance attribution, automated NDR/RTO workflows, intelligent carrier allocation that routes each shipment to the best-fit carrier based on weight/zone/SLA, live carrier scorecards (no manual Excel updates), and proactive exception management. For brands shipping 50K+ orders/month, ClickPost typically identifies 12–18% in cost savings and 5–9 percentage points of OTD lift in the first 90 days. The free scorecard generator on this page is the manual version of what ClickPost runs automatically every minute of every day.

Stop running carrier scorecards manually. Automate the whole thing.

ClickPost auto-generates live carrier scorecards from real shipment data — across 600+ carriers, segmented by SLA tier, region, and weight bucket. No manual Excel, no monthly stitching. Plus intelligent carrier allocation, automated NDR/RTO workflows, and AWB-level performance attribution.

  • 600+
    Carrier integrations
  • 12-18%
    Cost savings (90 days)
  • 5–9pp
    OTD lift (90 days)
  • 99.9%
    Uptime SLA