Technolila Webtools
3 views

Computed Truth

The SQL Query Cost Estimator proves that **Index Seek (O(log N))** operations are exponentially faster than **Full Table Scans (O(N))** as data grows. For a table with 1M rows, an Index Seek is ~20,000x cheaper than a Full Scan. However, at >10-15% selectivity, the Database Optimizer often forces a partial scan due to Random I/O penalties.

SQL Query Cost Estimator

Estimate Query Cost

Low % = highly selective (good for indexes)

The Technical Proof

Database cost models rely on Page Reads as the primary currency. The formula differs by access method:

1. Full Table Scan (Heap Scan) - \( O(N) \)

$$ Cost_{scan} = \frac{N_{rows} \times Row_{width}}{Page_{size}} $$
Every page is read sequentially. High throughput, but reads unnecessary data.

2. Index Seek (B-Tree Loop) - \( O(\log N) \)

$$ Cost_{seek} = Height_{tree} + \lceil Selectivity \times N_{rows} \rceil $$
Where \( Height_{tree} \approx \log_{branching\_factor}(Pages) \). Usually 3-4 reads deep. Efficient for fetching specific rows.

Note: This model assumes a Cold Buffer Cache (worst-case I/O). In a Warm Cache scenario, logical reads are CPU-bound, not I/O-bound.

Step-by-Step Logic

  1. Input Analysis: User provides Table Size (\( N \)) and Selectivity (\( S \)).
  2. Determine Access Path:
    • If "Scan" is selected, the engine must traverse all Leaf Pages.
    • If "Index Seek" is selected, the engine traverses the B-Tree Root \(\to\) Intermediate \(\to\) Leaf.
  3. Calculate Page Cost:
    • Assume average row width of 200 bytes.
    • Rows per page = \( Page_{size} / 200 \).
    • Total Pages = \( N / Rows_{page} \).
  4. Apply Optimizer Heuristics: If Selectivity \( > 15\% \), warn that Random I/O cost exceeds Sequential Scan cost, likely causing an Index Scan flip.
  5. Convert to Time: Map Page Reads to hardware latency (HDD vs SSD standards).