An Intuitive Explaination of FFTs (Cooley Tukey)

Kyle Finn

Note: There are many different algorithms for FFTs and this is just a common one

To explain at a high level:

A naive DFT implementation computes each output sample independently and uses every input sample in each computation (classic N² algorithm). The Cooley Tukey FFT uses symmetries and patterns in the DFT definition to do the computation in "layers" (log N layers), each layer with constant-time requirement per sample creating an N log N algorithm.

To explain by example:

The standard definition of an 8-point DFT written in matrix form is

$$ \omega = e^{\frac{-2 \pi i}{8}} $$ $$ DFT_8 \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix} = \begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \\ X_6 \\ X_7 \end{bmatrix} = \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 1 & \omega ^ 2 & \omega ^ 3 & \omega ^ 4 & \omega ^ 5 & \omega ^ 6 & \omega ^ 7 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 & \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 3 & \omega ^ 6 & \omega ^ 1 & \omega ^ 4 & \omega ^ 7 & \omega ^ 2 & \omega ^ 5 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 5 & \omega ^ 2 & \omega ^ 7 & \omega ^ 4 & \omega ^ 1 & \omega ^ 6 & \omega ^ 3 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 & \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \\ \omega ^ 0 & \omega ^ 7 & \omega ^ 6 & \omega ^ 5 & \omega ^ 4 & \omega ^ 3 & \omega ^ 2 & \omega ^ 1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix} $$

Let's split this by even/odd columns to expose some patterns

$$ \begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \\ X_6 \\ X_7 \end{bmatrix} = \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \\ \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} + \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 1 & \omega ^ 3 & \omega ^ 5 & \omega ^ 7 \\ \omega ^ 2 & \omega ^ 6 & \omega ^ 2 & \omega ^ 6 \\ \omega ^ 3 & \omega ^ 1 & \omega ^ 7 & \omega ^ 5 \\ \omega ^ 4 & \omega ^ 4 & \omega ^ 4 & \omega ^ 4 \\ \omega ^ 5 & \omega ^ 7 & \omega ^ 1 & \omega ^ 3 \\ \omega ^ 6 & \omega ^ 2 & \omega ^ 6 & \omega ^ 2 \\ \omega ^ 7 & \omega ^ 5 & \omega ^ 3 & \omega ^ 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix} $$

We can now see the top and bottom halves of these matrices differ only by a scalar multiplier, so let's split again:

$$ \begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \end{bmatrix} = \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} + \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 1 & \omega ^ 3 & \omega ^ 5 & \omega ^ 7 \\ \omega ^ 2 & \omega ^ 6 & \omega ^ 2 & \omega ^ 6 \\ \omega ^ 3 & \omega ^ 1 & \omega ^ 7 & \omega ^ 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix} $$ $$ \begin{bmatrix} X_4 \\ X_5 \\ X_6 \\ X_7 \end{bmatrix} = \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} + \omega ^ 4 \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 1 & \omega ^ 3 & \omega ^ 5 & \omega ^ 7 \\ \omega ^ 2 & \omega ^ 6 & \omega ^ 2 & \omega ^ 6 \\ \omega ^ 3 & \omega ^ 1 & \omega ^ 7 & \omega ^ 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix} $$

Also factor out the difference between rows in our left and right matrices

$$ \begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \end{bmatrix} = \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} + \begin{bmatrix} \omega ^ 0 & 0 & 0 & 0 \\ 0 & \omega ^ 1 & 0 & 0 \\ 0 & 0 & \omega ^ 2 & 0 \\ 0 & 0 & 0 & \omega ^ 3 \end{bmatrix} \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix} $$ $$ \begin{bmatrix} X_4 \\ X_5 \\ X_6 \\ X_7 \end{bmatrix} = \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} + \omega ^ 4 \begin{bmatrix} \omega ^ 0 & 0 & 0 & 0 \\ 0 & \omega ^ 1 & 0 & 0 \\ 0 & 0 & \omega ^ 2 & 0 \\ 0 & 0 & 0 & \omega ^ 3 \end{bmatrix} \begin{bmatrix} \omega ^ 0 & \omega ^ 0 & \omega ^ 0 & \omega ^ 0 \\ \omega ^ 0 & \omega ^ 2 & \omega ^ 4 & \omega ^ 6 \\ \omega ^ 0 & \omega ^ 4 & \omega ^ 0 & \omega ^ 4 \\ \omega ^ 0 & \omega ^ 6 & \omega ^ 4 & \omega ^ 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix} $$

Finally, notice the equivalence to a 4-point DFT

$$ \begin{bmatrix} X_0 \\ X_1 \\ X_2 \\ X_3 \end{bmatrix} = DFT_4 \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} + \begin{bmatrix} \omega ^ 0 & 0 & 0 & 0 \\ 0 & \omega ^ 1 & 0 & 0 \\ 0 & 0 & \omega ^ 2 & 0 \\ 0 & 0 & 0 & \omega ^ 3 \end{bmatrix} DFT_4 \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix} $$ $$ \begin{bmatrix} X_4 \\ X_5 \\ X_6 \\ X_7 \end{bmatrix} = DFT_4 \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} + \begin{bmatrix} \omega ^ 4 & 0 & 0 & 0 \\ 0 & \omega ^ 5 & 0 & 0 \\ 0 & 0 & \omega ^ 6 & 0 \\ 0 & 0 & 0 & \omega ^ 7 \end{bmatrix} DFT_4 \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ x_7 \end{bmatrix} $$ The computational result here is that now a size 8 FFT requres 8 multiplies + 2 * (size 4 FFT)... expanding all the way: $$ 8 + 2\cdot(4 + 2\cdot(2 + 2\cdot(1))) = 32 = 8\ (log_2(8) + 1) \qquad \ O( N\ log \ N) $$ This method of splitting even/odd or top-half bottom-half only works if the FFT size is divisible by 2. If say, it is dviisible by 3, you can split every 3rd, and top-middle-bottom thirds, and so on for any divisor. There are other methods for dealing with prime sized FFTs. Some convert to a larger composite FFT to then apply this method.