BigQuery Top Queries by Duration
Long-running queries tie up resources, delay downstream pipelines, and frustrate users waiting for results. This query identifies the longest-running queries in your project with their execution time, cost, and a runtime-to-bytes ratio that helps distinguish legitimate large scans from poorly optimized queries.
Why This Matters
Execution time matters beyond just cost. A query that runs for 30 minutes blocks a data pipeline, delays dashboard refreshes, or causes an analyst to context-switch. Under Editions pricing, long-running queries also consume more slot-hours (minimum 1-minute billing). Reducing query duration improves both cost and productivity.
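The slot-hour arithmetic can be sketched in a few lines (a minimal sketch: the $0.04-per-slot-hour Standard edition US rate and the example numbers are assumptions, check current pricing for your region, and the one-minute minimum billing increment is deliberately not modeled):

```python
# Minimal sketch: estimate Editions compute cost from a job's total_slot_ms.
# The $0.04/slot-hour rate (Standard edition, US) is an assumption; verify
# current pricing for your edition and region.
def slot_hour_cost(total_slot_ms: float, price_per_slot_hour: float = 0.04) -> float:
    slot_hours = total_slot_ms / (1000 * 3600)  # ms -> slot-hours
    return slot_hours * price_per_slot_hour

# A query holding ~200 slots for 30 minutes accrues 100 slot-hours.
print(slot_hour_cost(200 * 30 * 60 * 1000))  # 4.0
```

Halving that query's duration at the same slot count halves the slot-hours, which is why duration shows up in both the cost and the productivity column.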
How It Works
The query calculates execution time as the TIMESTAMP_DIFF between end_time and start_time. It also computes a seconds-per-GiB ratio: high values indicate queries that are slow relative to the amount of data they process, suggesting optimization potential.
SQL Query
Replace `your-project` (and the `region-us` qualifier, if your data lives elsewhere) in the query below:
-- Slowest queries ranked by wall-clock execution time
DECLARE lookback_days INT64 DEFAULT 7;

WITH jobs AS (
  SELECT
    user_email,
    query,
    project_id,
    start_time,
    end_time,
    TIMESTAMP_DIFF(end_time, start_time, SECOND) AS duration_sec,
    COALESCE(total_bytes_billed, 0) AS bytes_billed,
    total_slot_ms,
    ROW_NUMBER() OVER (PARTITION BY job_id ORDER BY end_time DESC) AS rn
  FROM `your-project`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
  WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL lookback_days DAY)
    AND job_type = 'QUERY'
    AND state = 'DONE'
    AND total_slot_ms IS NOT NULL
),
deduplicated AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY query ORDER BY bytes_billed DESC) AS qrn
  FROM jobs
  WHERE rn = 1
)
SELECT
  user_email,
  query,
  project_id,
  start_time,
  duration_sec,
  ROUND(SAFE_DIVIDE(total_slot_ms, duration_sec * 1000), 0) AS avg_slots,
  ROUND(bytes_billed / POW(2, 30), 2) AS gib_billed,
  ROUND(SAFE_DIVIDE(duration_sec, bytes_billed / POW(2, 30)), 2) AS sec_per_gib
FROM deduplicated
WHERE qrn = 1
ORDER BY duration_sec DESC
LIMIT 100;
Query Explanation
Two levels of deduplication: first by job_id to remove duplicate job entries, then by query text to show only the most expensive execution of each unique query. The sec_per_gib metric divides execution time by GiB billed; high values mean the query is slow relative to how much data it reads.
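The two deduplication passes can be mimicked in plain Python (a toy sketch with made-up rows, keeping the same tie-breaking order as the ROW_NUMBER() clauses):

```python
# Toy rows: (job_id, query, end_time, bytes_billed, duration_sec)
jobs = [
    ("j1", "SELECT 1", 10, 100, 5),
    ("j1", "SELECT 1", 12, 100, 5),   # duplicate job entry; later end_time wins
    ("j2", "SELECT 1", 11, 500, 9),   # same query text, more bytes billed
    ("j3", "SELECT 2", 11, 50, 2),
]

# Pass 1: one row per job_id, keeping the latest end_time (rn = 1).
by_job = {}
for row in jobs:
    job_id, end_time = row[0], row[2]
    if job_id not in by_job or end_time > by_job[job_id][2]:
        by_job[job_id] = row

# Pass 2: one row per query text, keeping the most expensive run (qrn = 1).
by_query = {}
for row in by_job.values():
    query, bytes_billed = row[1], row[3]
    if query not in by_query or bytes_billed > by_query[query][3]:
        by_query[query] = row

print(sorted(r[0] for r in by_query.values()))  # ['j2', 'j3']
```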
Key Insights
Queries over 10 minutes are likely scanning unpartitioned tables or performing expensive shuffles.
A high sec_per_gib means the query is compute-bound (complex JOINs, sorts, window functions) rather than I/O-bound.
A low sec_per_gib means the query is scanning a lot of data quickly — it could benefit from partition pruning.
Long-running queries in pipelines can cascade delays to all downstream dependencies.
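One way to act on these insights is a simple triage function (the 600-second and 30 s/GiB thresholds are invented for illustration, not official guidance; tune them against your own workload):

```python
# Hedged triage heuristic based on the insights above; both thresholds are
# illustrative assumptions, not official BigQuery guidance.
def triage(duration_sec: float, sec_per_gib: float) -> str:
    if duration_sec > 600:
        return "long-running: check partitioning and shuffle-heavy stages"
    if sec_per_gib > 30:
        return "compute-bound: review JOINs, sorts, window functions"
    return "scan-heavy: consider partition pruning or clustering"

print(triage(1200, 5))   # a 20-minute query is flagged as long-running
print(triage(120, 50))   # slow relative to data read: compute-bound
print(triage(120, 2))    # fast per GiB: scan-heavy
```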
Best Practices
1. Set query timeout limits to prevent runaway queries from consuming resources indefinitely.
2. For ETL pipelines, break long-running monolithic queries into smaller, staged transformations.
3. Use the query execution plan (the Execution details tab in the console, or the job_stages field in INFORMATION_SCHEMA.JOBS) to identify the slowest stages and optimize them specifically.
4. Consider approximate aggregation functions (APPROX_QUANTILES, APPROX_TOP_COUNT) for exploratory queries.
Want CloudClerk to find these savings automatically?
Our platform connects to your BigQuery project, runs these analyses automatically, and delivers AI-powered optimization recommendations, all with your data fully anonymized.
Related Guides
BigQuery Top Queries by Cost
Find the most expensive BigQuery queries by on-demand cost. Rank queries by total bytes billed to identify the biggest cost drivers in your project.
BigQuery Top Queries by Complexity
Find the most compute-intensive BigQuery queries ranked by slot usage. Identify queries consuming disproportionate computational resources.
BigQuery Top Queries by Frequency
Identify the most frequently executed BigQuery queries. Find repeated queries that are candidates for caching, materialized views, or consolidation.
BigQuery Slot Usage by Hour
Monitor hourly BigQuery slot consumption to identify peak usage windows and optimize your reservation scheduling.