Pipeline Design and Speedup Analysis
Published:
QUESTION-1:
A task $T$ that will be executed on integers consists of seven suboperations $X_i$ ($i = 1, 2, 3, 4, 5$) and $Y_i$ ($i = 1, 2$). These suboperations are implemented using combinational digital circuits with the following propagation delays:
$X_1 = 20\text{ns}, X_2 = 15\text{ns}, X_3 = 20\text{ns}, X_4 = 20\text{ns}, X_5 = 5\text{ns}, Y_1 = 25\text{ns}, Y_2 = 10\text{ns}$.
Fig. 1 on the right shows the block diagram of the circuit.
- The arrows between the suboperations indicate the dependencies between them. Therefore, the suboperations should be executed in the given order.
- Suboperation $Y_1$ can execute in parallel with $X_2$.
- Similarly, suboperation $Y_2$ can also execute in $X_4$.
a) (50 pts) Design a pipeline $P_A$ that executes task $T$ and meets the following constraints:
- The pipeline achieves the highest possible speedup if it is executed on an array with an infinite number of elements (maximize the speedup).
- Use a minimum number of registers with a propagation delay of 5 ns.
- Use the units given in the original circuit ($X_i(i=1,2,3,4,5)$ and $Y_i (i=1,2)$).
b) (10 pts) How long does it take to execute the task only on the first element of array using the pipeline $P_A$? ($T_1 = ?$)
c) (10 pts) Calculate the highest possible speedup pipeline $P_A$ can achieve when it executes a task $T$ on an array with an infinite number of elements.
This question includes visual content: A block diagram titled 'Fig. 1: Dependency diagram of the circuit' showing seven blocks labeled X1, X2, X3, X4, X5 and Y1, Y2. The flow starts at X1, which points down to X2. X1 also has an arrow branching out to Y1. X2 and Y1 both point into X3. X3 points down to X4 and branches out to Y2. X4 and Y2 both point into X5. The final output comes out of X5. Arrows indicate logical flow and dependency.
Animated Video Solution
The first half plays free, the full solution is in the app.
Step by Step Written Solution
Let's work through this pipeline design problem. We will start by examining the given block diagram and finding the total unpipelined execution time.
Non-Pipelined Analysis
To determine the base execution time, we need to find the critical path. This is the sequence of blocks from top to bottom that yields the maximum total delay.
If we follow the arrows, the longest path visits X1, branches to Y1, continues into X3, down to X4, and finally to X5.
Substituting the given propagation delays, we add twenty, twenty-five, twenty, twenty, and five nanoseconds.
This gives us a total unpipelined execution time of ninety nanoseconds.
Now for Part A. Our goal is to design a pipeline that achieves the highest possible speedup, meaning we must minimize our clock cycle time. We are constrained to use the units as they are.
Part (a) Pipeline Design
Because we cannot split the operational units, our minimum cycle time is bound by the longest single unit delay. Looking at our units, Y1 is the slowest at twenty-five nanoseconds.
The pipeline registers have a delay of 5 nanoseconds, which gives an absolute minimum cycle time of thirty nanoseconds.
To satisfy the constraint of using the minimum number of registers, we must pack our units into the fewest possible number of stages, ensuring no stage's combinational delay exceeds twenty-five nanoseconds.
Stage Groupings:
Stage one will simply contain X1.
Stage two contains X2 and Y1 running in parallel. The stage delay is limited by the slower Y1 unit, which fits perfectly at twenty-five nanoseconds.
Stage three acts on its own containing X3.
The rest of this solution is on Solvi
12 more steps are locked. Watch the full animated, narrated solution for free.
Snap a photo, solve any question like this.
Watch the Rest for FreeFree to download · First solutions are on us