Written by Ian on Wednesday 22/08/07
A year or so ago I looked up the biggest and meatiest FPGA devices and DSPs, and compared them head-to-head in terms of raw 16-bit multiplies, which I measure in MMACS (million multiply-accumulates per second).
First the DSPs. One each from the two market leaders (remember things might be faster now, a year later):
The Texas Instruments TMS320C64, clocking up to 1GHz, has 8 MAC units which can calculate in parallel, so it can do 8000 MMACS. It costs something like 300 euro, so that's about 0.038 euro per MMAC.
Analog Devices TS210S. This clocks at 600MHz (at least twice that now), also with 8 MAC units, so 4800 MMACS. At roughly the same price, that's 0.063 euro per MMAC.
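The DSP sums are easy to check. Here is a minimal sketch in Python, assuming one MAC per unit per clock cycle and taking the rough 300 euro price for both parts (the function and variable names are just mine, for illustration):

```python
def mmacs(clock_mhz, mac_units):
    """Peak MMACS, assuming every MAC unit completes one MAC per clock."""
    return clock_mhz * mac_units


def euro_per_mmac(price_euro, peak_mmacs):
    """Cost of one MMAC of peak throughput."""
    return price_euro / peak_mmacs


# (clock in MHz, MAC units, rough price in euro) as quoted above.
dsps = {
    "TI TMS320C64": (1000, 8, 300),
    "ADI TS210S": (600, 8, 300),
}

dsp_results = {}
for name, (clock_mhz, units, price) in dsps.items():
    peak = mmacs(clock_mhz, units)
    dsp_results[name] = (peak, euro_per_mmac(price, peak))
    print(f"{name}: {peak} MMACS, {dsp_results[name][1]:.4f} euro per MMAC")
```

These are peak figures only; they say nothing about what the chip sustains on a real algorithm.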
Now the FPGAs. Again a year old, but the same vintage as the DSPs.
Altera Stratix EP2S180, clocking at 420MHz. This has 96 dedicated DSP blocks, each with 4 MACs internally, and 176,000 logic elements (LEs). The LEs can be configured as 962 multipliers at 180MHz. So overall, combining the DSP blocks and the LEs gives us 161280 + 173160 = 334440 MMACS. Of course these aren't cheap at 4100 euro each, but that works out at a paltry 0.012 euro per MMAC.
The Xilinx XC4VSX55 is a little quicker at 500MHz, with 512 XtremeDSP blocks (each 1 MAC) and 55295 logic cells. It works out to 309280 MMACS. Not sure of the cost.
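The same arithmetic for the FPGAs, as a rough Python sketch. The Altera numbers all come from the figures above. For the Xilinx part only the DSP-block share can be worked out from what is quoted, so the remainder is simply back-calculated from the 309280 total; attributing it to multipliers built from logic cells is my assumption here, not a datasheet figure.

```python
# Altera: dedicated DSP blocks plus multipliers built from logic elements.
altera_dsp_mmacs = 96 * 4 * 420   # 96 blocks x 4 MACs x 420 MHz
altera_le_mmacs = 962 * 180       # 962 LE multipliers x 180 MHz
altera_total = altera_dsp_mmacs + altera_le_mmacs
altera_euro_per_mmac = 4100 / altera_total

# Xilinx: 512 XtremeDSP blocks, 1 MAC each, at 500 MHz.
xilinx_dsp_mmacs = 512 * 1 * 500
# Assumption: whatever is left of the quoted total comes from logic cells.
xilinx_remainder_mmacs = 309280 - xilinx_dsp_mmacs

print(f"Altera: {altera_dsp_mmacs} + {altera_le_mmacs} = {altera_total} MMACS")
print(f"Altera: {altera_euro_per_mmac:.3f} euro per MMAC")
print(f"Xilinx: {xilinx_dsp_mmacs} MMACS from DSP blocks, "
      f"{xilinx_remainder_mmacs} implied from elsewhere")
```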
OK, these figures don't include all those other little helper peripherals that the DSP has, or the control logic for the FPGA multipliers. But still, over 300000 MMACS is A LOT.
*If* you can write the program (and it will probably be a lot harder than doing it using a DSP), quite clearly FPGAs are monster processing engines compared even to the fastest DSPs.
Just remember though that doing all that processing is no use if you can't get the data in and out. And that's where the DSPs fall down: they can't sustain their maximum speed when accessing external memory. The FPGAs potentially can.
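To put a rough number on the data problem, here is a back-of-envelope sketch in Python. It assumes the worst case where every MAC needs one fresh 16-bit sample from outside the chip; real algorithms reuse data held on-chip, so treat `new_samples_per_mac` as a hypothetical knob, not a measured value.

```python
BYTES_PER_SAMPLE = 2  # 16-bit operands


def required_gbytes_per_s(peak_mmacs, new_samples_per_mac=1.0):
    """External bandwidth (GB/s) needed to keep the MAC units fed at peak."""
    macs_per_second = peak_mmacs * 1e6
    return macs_per_second * new_samples_per_mac * BYTES_PER_SAMPLE / 1e9


dsp_bandwidth = required_gbytes_per_s(8000)
print(f"8000 MMACS, one new sample per MAC: {dsp_bandwidth:.1f} GB/s")

# With heavy on-chip reuse (say one new sample per 100 MACs) the demand drops.
dsp_bandwidth_reuse = required_gbytes_per_s(8000, new_samples_per_mac=0.01)
print(f"8000 MMACS, one new sample per 100 MACs: {dsp_bandwidth_reuse:.2f} GB/s")
```

The point is only that peak MMACS and memory bandwidth have to be looked at together.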
If you pit them head-to-head on speed, there's just no comparison...
