Cache Memory

Introduction to Cache Memory

Imagine you are a librarian who needs to fetch books for many readers. If every time a reader requests a book, you have to go to a huge warehouse far away, it will take a lot of time. But if you keep the most popular books on a small shelf near you, you can quickly hand them out without delay. This is exactly how cache memory works in a computer system.

The Central Processing Unit (CPU) is the brain of the computer, and a modern CPU can perform billions of operations every second. Main memory (RAM), however, is much slower than the CPU. This speed gap creates a bottleneck: the CPU often has to wait for data to arrive, which slows down the overall system.

Cache memory acts as a small, very fast storage located close to the CPU. It stores frequently accessed data and instructions so that the CPU can retrieve them quickly, improving system performance significantly.

In this chapter, we will explore what cache memory is, how it is organized, the techniques used to manage it, and how to measure its performance.

Cache Memory Basics

What is Cache Memory?

Cache memory is a small, high-speed memory located between the CPU and the main memory. It temporarily holds copies of data and instructions that the CPU is likely to reuse soon.

Why is Cache Needed?

The CPU operates at speeds much faster than the main memory. Without cache, the CPU would spend a lot of time waiting for data to be fetched from the slower main memory. Cache reduces this waiting time by providing faster access to frequently used data.

Cache vs Main Memory

Size: Cache is much smaller than main memory.
Speed: Cache is faster than main memory.
Cost: Cache memory is more expensive per byte than main memory.

Benefits of Cache Memory

Speeds up data access for the CPU.
Reduces average memory access time.
Improves overall system performance.

Cache Organization and Levels

Cache memory is organized in multiple levels to balance speed, size, and cost:

L1 Cache: Closest to the CPU, smallest size, fastest access.
L2 Cache: Larger than L1, slower but still faster than main memory.
L3 Cache: Shared among CPU cores, largest and slowest cache level.

Each cache level stores data in blocks (also called cache lines). The block size is the amount of data transferred between cache and main memory in one operation.

Choosing the right cache size and block size is a trade-off between cost, speed, and hit rate (how often data is found in cache).

**Comparison of Cache Levels**
Cache Level	Typical Size	Access Time (nanoseconds)	Location
L1 Cache	32 KB - 64 KB	1 - 3 ns	Inside CPU core
L2 Cache	256 KB - 512 KB	3 - 10 ns	Inside or near CPU core
L3 Cache	2 MB - 16 MB	10 - 20 ns	Shared across CPU cores

Cache Mapping Techniques

When the CPU requests data, the cache must decide where to store or find that data. This is done using cache mapping techniques. There are three main types:

1. Direct Mapping

Each block of main memory maps to exactly one cache line. It is like having a fixed shelf slot for each book in the library.

Advantages: Simple and fast mapping.

Disadvantages: Can cause frequent conflicts if multiple blocks map to the same cache line.

2. Associative Mapping

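A block can be placed in any cache line, so the cache must compare the requested address against the tags of all lines to find a block. This scheme is also called fully associative mapping.

Advantages: Flexible, fewer conflicts.

Disadvantages: More complex and costly, because every line must be checked on each access.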

3. Set-Associative Mapping

Combines direct and associative mapping. Cache is divided into sets, each containing multiple lines. A block maps to one set but can be placed in any line within that set.

Advantages: Balances speed and flexibility.

Disadvantages: More complex than direct mapping.

graph TD
    A[Memory Address] --> B{Mapping Technique}
    B --> C[Direct Mapping]
    B --> D[Associative Mapping]
    B --> E[Set-Associative Mapping]
    C --> F[One Cache Line per Block]
    D --> G[Any Cache Line]
    E --> H[One Set, Multiple Lines]
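To make the trade-off between the three techniques concrete, the sketch below counts misses for a short sequence of block numbers. It is only an illustration: the 4-line cache, the access pattern, the use of LRU replacement inside each set, and the function name `count_misses` are all assumptions made for this example.

```python
from collections import OrderedDict

def count_misses(block_numbers, num_lines, ways):
    """Count misses for a cache of `num_lines` lines grouped into sets of `ways` lines.

    ways == 1          -> direct mapping
    ways == num_lines  -> associative mapping
    otherwise          -> set-associative mapping
    LRU replacement inside each set is an assumption of this sketch.
    """
    num_sets = num_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_numbers:
        lines = sets[block % num_sets]      # the one set this block may live in
        if block in lines:
            lines.move_to_end(block)        # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)   # set is full: evict least recently used
            lines[block] = True
    return misses

# Blocks 0 and 4 both map to line 0 of a 4-line direct-mapped cache.
pattern = [0, 4, 0, 4, 0, 4]
direct = count_misses(pattern, num_lines=4, ways=1)
two_way = count_misses(pattern, num_lines=4, ways=2)
associative = count_misses(pattern, num_lines=4, ways=4)
print(direct, two_way, associative)  # prints "6 2 2"
```

With direct mapping the two blocks keep evicting each other, so every access misses. As soon as a set has room for both blocks, only the first access to each block misses.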

Cache Performance Metrics

To evaluate cache effectiveness, we use several key metrics:

Hit Rate: The fraction of memory accesses found in cache.
Miss Rate: The fraction of memory accesses not found in cache.
Average Memory Access Time (AMAT): Average time to access memory considering hits and misses.

**Cache Performance Formulas**
Metric	Formula	Description
Hit Rate	\(\text{Hit Rate} = \frac{\text{Number of Hits}}{\text{Total Memory Accesses}}\)	Proportion of accesses found in cache
Miss Rate	\(\text{Miss Rate} = 1 - \text{Hit Rate}\)	Proportion of accesses not found in cache
Average Memory Access Time (AMAT)	\(\text{AMAT} = \text{Hit Time} + (\text{Miss Rate} \times \text{Miss Penalty})\)	Average time to access memory including cache misses
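The three formulas translate directly into code. The following is a minimal sketch; the function names and the sample numbers (950 hits out of 1000 accesses, 1 ns hit time, 100 ns miss penalty) are hypothetical values chosen for illustration.

```python
def hit_rate(hits, total_accesses):
    """Fraction of memory accesses found in the cache."""
    return hits / total_accesses

def miss_rate(hits, total_accesses):
    """Fraction of memory accesses not found in the cache."""
    return 1 - hit_rate(hits, total_accesses)

def amat(hit_time_ns, miss_rate_value, miss_penalty_ns):
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_time_ns + miss_rate_value * miss_penalty_ns

# Hypothetical measurements, not taken from a real system.
hr = hit_rate(950, 1000)
mr = miss_rate(950, 1000)
average_ns = amat(hit_time_ns=1, miss_rate_value=mr, miss_penalty_ns=100)
print(f"hit rate {hr:.2f}, miss rate {mr:.2f}, AMAT {average_ns:.1f} ns")
```

Even a 5% miss rate here raises the average access time from 1 ns to about 6 ns, which shows how strongly the miss penalty weighs on AMAT.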

Cache Write and Replacement Policies

When the CPU writes data, the cache must decide how and when to update main memory. This is controlled by write policies:

Write-Through

Every write to cache is immediately written to main memory. This keeps memory always updated but increases memory traffic.

Write-Back

Writes are made only to the cache. Main memory is updated later, when the modified cache block is replaced. This reduces memory traffic but requires more complex control, such as a dirty bit that marks which blocks have been modified.

When the cache is full and a new block must be loaded, a replacement policy decides which block to remove:

Least Recently Used (LRU): Replace the block that was used least recently.
First In First Out (FIFO): Replace the block that has been in the cache the longest, regardless of how recently it was used.

These policies impact cache efficiency and complexity.
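The difference between the two replacement policies shows up when an old block is still in active use. The sketch below counts misses under each policy for a cache that holds three blocks; the access pattern and the function names `lru_misses` and `fifo_misses` are assumptions made for this example.

```python
from collections import OrderedDict, deque

def lru_misses(blocks, capacity):
    """Count misses when the least recently used block is replaced."""
    cache = OrderedDict()                   # ordered from least to most recently used
    misses = 0
    for block in blocks:
        if block in cache:
            cache.move_to_end(block)        # a hit refreshes the block's recency
        else:
            misses += 1
            if len(cache) == capacity:
                cache.popitem(last=False)   # evict the least recently used block
            cache[block] = True
    return misses

def fifo_misses(blocks, capacity):
    """Count misses when the oldest loaded block is replaced."""
    cache = set()
    load_order = deque()                    # order in which blocks were loaded
    misses = 0
    for block in blocks:
        if block not in cache:              # hits do not change the load order
            misses += 1
            if len(cache) == capacity:
                cache.discard(load_order.popleft())
            cache.add(block)
            load_order.append(block)
    return misses

pattern = [1, 2, 3, 1, 4, 1, 2]             # block 1 is reused often
lru = lru_misses(pattern, capacity=3)
fifo = fifo_misses(pattern, capacity=3)
print(lru, fifo)  # prints "5 6"
```

FIFO evicts block 1 when block 4 arrives because block 1 was loaded first, even though it was just used. LRU keeps it and avoids one miss. On other patterns the two policies can tie, so this result should not be read as a general ranking.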

Formula Bank

Hit Rate

\[ \text{Hit Rate} = \frac{\text{Number of Hits}}{\text{Total Memory Accesses}} \]

where: Number of Hits = count of cache hits; Total Memory Accesses = total memory requests

Miss Rate

\[ \text{Miss Rate} = 1 - \text{Hit Rate} \]

where: Hit Rate = fraction of cache hits

Average Memory Access Time (AMAT)

\[ \text{AMAT} = \text{Hit Time} + (\text{Miss Rate} \times \text{Miss Penalty}) \]

where: Hit Time = time to access cache; Miss Rate = fraction of misses; Miss Penalty = additional time to access main memory

Worked Examples

Example 1: Calculating Hit Rate and Miss Rate (Easy)

A cache memory system recorded 800 hits and 200 misses during 1000 memory accesses. Calculate the hit rate and miss rate.

Step 1: Identify total memory accesses = 1000

Step 2: Number of hits = 800

Step 3: Calculate hit rate using formula:

\(\text{Hit Rate} = \frac{800}{1000} = 0.8\) or 80%

Step 4: Calculate miss rate:

\(\text{Miss Rate} = 1 - 0.8 = 0.2\) or 20%

Answer: Hit Rate = 80%, Miss Rate = 20%

Example 2: Average Memory Access Time Calculation (Medium)

A cache has a hit time of 2 ns, miss penalty of 50 ns, and a hit rate of 90%. Calculate the average memory access time (AMAT).

Step 1: Given: Hit Time = 2 ns, Miss Penalty = 50 ns, Hit Rate = 0.9

Step 2: Calculate Miss Rate:

\(\text{Miss Rate} = 1 - 0.9 = 0.1\)

Step 3: Use AMAT formula:

\(\text{AMAT} = 2 + (0.1 \times 50) = 2 + 5 = 7 \text{ ns}\)

Answer: Average Memory Access Time = 7 ns

Example 3: Direct Mapping Cache Address Mapping (Medium)

A direct mapped cache has 16 lines and block size of 4 bytes. The memory address is 8 bits. For the memory address 10110110, determine the tag, index, and block offset.

Step 1: Calculate number of bits for block offset:

Block size = 4 bytes = \(2^2\) bytes, so block offset = 2 bits

Step 2: Calculate number of bits for index:

Number of cache lines = 16 = \(2^4\), so index = 4 bits

Step 3: Remaining bits are for tag:

Tag bits = Total bits - (Index bits + Block offset bits) = 8 - (4 + 2) = 2 bits

Step 4: Break down the address 10110110:

Tag (2 bits): 10
Index (4 bits): 1101
Block Offset (2 bits): 10

Answer: Tag = 10, Index = 1101 (decimal 13), Block Offset = 10
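The same breakdown can be done with bit shifts and masks. This is a minimal sketch that assumes the number of cache lines and the block size are powers of two; the helper name `split_address` is hypothetical.

```python
def split_address(address, num_lines, block_size):
    """Split an address into (tag, index, offset) for a direct-mapped cache.

    Assumes num_lines and block_size are powers of two.
    """
    offset_bits = block_size.bit_length() - 1    # 4 bytes  -> 2 bits
    index_bits = num_lines.bit_length() - 1      # 16 lines -> 4 bits
    offset = address & ((1 << offset_bits) - 1)  # lowest bits select the byte
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)  # whatever bits remain
    return tag, index, offset

tag, index, offset = split_address(0b10110110, num_lines=16, block_size=4)
print(f"tag={tag:02b} index={index:04b} ({index}) offset={offset:02b}")
# prints "tag=10 index=1101 (13) offset=10"
```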

Example 4: Set-Associative Cache Mapping Example (Hard)

Consider a 2-way set associative cache with 8 sets and block size of 4 bytes. The memory address is 8 bits. For the address 11010110, determine the set number and tag.

Step 1: Calculate block offset bits:

Block size = 4 bytes = \(2^2\), so block offset = 2 bits

Step 2: Calculate set index bits:

Number of sets = 8 = \(2^3\), so set index = 3 bits

Step 3: Calculate tag bits:

Tag bits = Total bits - (Set index + Block offset) = 8 - (3 + 2) = 3 bits

Step 4: Break down the address 11010110:

Tag (3 bits): 110
Set Index (3 bits): 101
Block Offset (2 bits): 10

Step 5: Convert set index to decimal:

\(101_2 = 5\)

Answer: Set Number = 5, Tag = 110

graph TD
    A[Memory Address 11010110] --> B[Tag: 110]
    A --> C["Set Index: 101 (Set 5)"]
    A --> D[Block Offset: 10]
    C --> E{Check Set 5}
    E --> F[Compare Tag with 110 in Set 5]
    F --> G{Hit or Miss}
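The lookup described in this example can be sketched in a few lines. The contents of set 5 below are invented for illustration, and the function name `lookup` is hypothetical; the cache is modelled simply as a list of sets, each holding the tags currently stored in its two lines.

```python
def lookup(cache_sets, address, set_bits, offset_bits):
    """Return (set_index, tag, hit) for an address in a set-associative cache."""
    set_index = (address >> offset_bits) & ((1 << set_bits) - 1)
    tag = address >> (offset_bits + set_bits)
    hit = tag in cache_sets[set_index]      # compare the tag with every line in the set
    return set_index, tag, hit

# 8 sets of 2 lines each; only set 5 is filled, with made-up tags.
cache_sets = [[] for _ in range(8)]
cache_sets[5] = [0b110, 0b001]

set_index, tag, hit = lookup(cache_sets, 0b11010110, set_bits=3, offset_bits=2)
print(set_index, f"{tag:03b}", hit)  # prints "5 110 True"

# Same set, different tag: the block is not in the cache.
other_set, other_tag, other_hit = lookup(cache_sets, 0b01010110, set_bits=3, offset_bits=2)
print(other_set, f"{other_tag:03b}", other_hit)  # prints "5 010 False"
```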

Example 5: Write Policy Impact on Cache Performance (Hard)

A system can use either a write-through or a write-back policy. Explain with examples how write-back reduces memory traffic compared to write-through.

Step 1: Write-Through Example:

Every time the CPU writes data to cache, it immediately writes to main memory.

If the CPU writes 100 times, there are 100 memory write operations.

Step 2: Write-Back Example:

CPU writes only to cache. Main memory is updated only when the cache block is replaced.

If the CPU writes 100 times to the same block, only 1 write to main memory occurs when the block is replaced.

Step 3: Impact on Performance:

Write-back reduces memory traffic, improving performance.
Write-through keeps main memory consistent with the cache at all times but increases memory bus usage.

Answer: Write-back policy reduces memory traffic by delaying writes to main memory, unlike write-through which writes immediately, causing more memory operations.
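The counts in this example can be reproduced with a small model. The sketch below assumes a deliberately simplified cache that holds a single block at a time and counts a final write-back for the last modified block; the function name `count_memory_writes` and the policy strings are hypothetical.

```python
def count_memory_writes(written_blocks, policy):
    """Count writes that reach main memory for a sequence of CPU writes.

    Simplifying assumption: the cache holds one block at a time.
    """
    if policy == "write-through":
        return len(written_blocks)      # every CPU write also goes to main memory
    if policy != "write-back":
        raise ValueError(f"unknown policy: {policy}")

    memory_writes = 0
    resident = None                     # block currently held in the cache
    dirty = False                       # True once the resident block is modified
    for block in written_blocks:
        if block != resident:
            if dirty:
                memory_writes += 1      # write the modified block back on replacement
            resident = block
        dirty = True                    # the CPU write modifies only the cache copy
    if dirty:
        memory_writes += 1              # the last block is written back when replaced
    return memory_writes

same_block = [7] * 100                  # 100 writes to the same block
through = count_memory_writes(same_block, "write-through")
back = count_memory_writes(same_block, "write-back")
print(through, back)  # prints "100 1"
```

If the writes alternate between blocks on every access, this single-block model writes back once per write, so the saving from write-back depends on how often the same block is written before it is replaced.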

Tips & Tricks

Tip: Remember that hit rate + miss rate = 1

When to use: Quickly calculate one metric if the other is known.

Tip: Use block offset bits to identify data within a cache block

When to use: While breaking down memory addresses for cache mapping problems.

Tip: For direct mapping, index bits directly determine cache line

When to use: Solving direct mapped cache problems quickly.

Tip: In set-associative caches, focus on set index first, then compare tags

When to use: Mapping addresses in set-associative caches efficiently.

Tip: Write-back reduces memory traffic compared to write-through

When to use: Evaluating cache write policies in performance questions.

Common Mistakes to Avoid

❌ Confusing hit rate with miss rate

✓ Remember that miss rate = 1 - hit rate

Why: Hit and miss rates are complementary probabilities; mixing them leads to wrong calculations.

❌ Misidentifying bits for tag, index, and block offset

✓ Carefully calculate bit lengths based on cache and block size

Why: Incorrect bit division results in wrong cache line mapping and errors in address decoding.

❌ Assuming all caches have the same access time

✓ Understand that L1 is fastest, L3 slowest among cache levels

Why: Different cache levels have different speeds, affecting performance calculations.

❌ Mixing up write-through and write-back policies

✓ Learn characteristics and consequences of each policy separately

Why: Confusion leads to incorrect analysis of cache write behavior and performance impact.

❌ Ignoring miss penalty in AMAT calculation

✓ Always include miss penalty multiplied by miss rate

Why: Miss penalty significantly affects average access time and overall system speed.
