The picoArray is a massively parallel, Multiple Instruction Multiple Data (MIMD) architecture composed of processing elements of various types linked together by the patented picoBus interconnect. There are two types of processing element: 16-bit Harvard-architecture processors, each with 3-way LIW and local memory, and hardware co-processors that accelerate specific functions. The 3-way LIW processors come in three slightly different variants:
| Array Element | Type | Description | Memory (bytes) | Number per PC102 |
| 16-bit Harvard Architecture processor; 3-way LIW | Standard | For datapath operations. Includes dedicated MAC unit and application-specific instructions for CDMA such as complex spread and despread | 768 | 240 |
| | Memory | For local control and buffering. Includes larger data memory for buffering and a Multiply Unit | 8704 | 64 |
| | Control | For global control and buffering. Includes large data and instruction memory and Multiply Unit | 65536 | 4 |
| Hardware co-processor | Functional Accelerator Unit (FAU) | Flexible hardware engine for correlation and path metric calculation | N/A | 14 |
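As a quick sanity check on the table, the short Python sketch below tallies the elements in one PC102. It assumes that the "Memory (bytes)" column gives the local memory of a single element of that type, which the table does not state explicitly; the names `ELEMENTS` and `summarize` are illustrative only.

```python
# Element counts and memory sizes copied from the table above.
# Assumption: "Memory (bytes)" is the local memory of ONE element of that type.
ELEMENTS = [
    # (type, number per PC102, memory bytes per element or None)
    ("Standard", 240, 768),
    ("Memory", 64, 8704),
    ("Control", 4, 65536),
    ("FAU", 14, None),  # hardware co-processor; memory is listed as N/A
]


def summarize(elements):
    """Return (LIW processors, all array elements, total local memory in bytes)."""
    processors = sum(n for _, n, mem in elements if mem is not None)
    total_elements = sum(n for _, n, _ in elements)
    local_memory = sum(n * mem for _, n, mem in elements if mem is not None)
    return processors, total_elements, local_memory


processors, total_elements, local_memory = summarize(ELEMENTS)
print(f"LIW processors:      {processors}")
print(f"All array elements:  {total_elements}")
print(f"Total local memory:  {local_memory} bytes")
```

Under that assumption, a single device holds 308 LIW processors and 322 array elements in total.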
The processing elements are linked together by the picoBus, which provides very
high bus bandwidth (3300 Gigabits per second in PC102) and high efficiency
through totally deterministic communications between processors. Communication
paths that group processors into chains or arrays are configured at compile time
and need no run-time arbitration, allowing picoBus to operate routinely at more
than 90% of its maximum capacity. picoBus “bridges” connect individual devices
so that the architecture can scale up to arrays of dozens of devices containing
many thousands of individual processing elements, which can tackle even the most
demanding signal processing challenges.
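To illustrate why communication fixed at compile time needs no run-time arbitration, here is a toy Python model. It is only a sketch: it assumes a bus whose capacity is divided into a repeating frame of time slots, which is a simplification chosen for illustration and not a description of how picoBus is actually implemented. The channel names and the functions `build_schedule`, `owner` and `utilization` are all hypothetical.

```python
def build_schedule(channels, frame_slots):
    """'Compile-time' step: give every channel a fixed set of slots in the frame.

    channels    -- list of (name, slots_needed) pairs
    frame_slots -- number of slots in the repeating frame
    Raises ValueError if the channels need more slots than the frame has.
    """
    schedule = [None] * frame_slots
    next_free = 0
    for name, need in channels:
        if next_free + need > frame_slots:
            raise ValueError(f"frame oversubscribed by channel {name!r}")
        for slot in range(next_free, next_free + need):
            schedule[slot] = name
        next_free += need
    return schedule


def owner(schedule, cycle):
    """'Run-time' step: a table lookup, so no arbitration is ever needed."""
    return schedule[cycle % len(schedule)]


def utilization(schedule):
    """Fraction of slots in the frame that carry a channel."""
    return sum(slot is not None for slot in schedule) / len(schedule)


# Hypothetical channels between hypothetical processing stages.
channels = [("fir->fft", 4), ("fft->demod", 3), ("ctrl->fir", 2)]
schedule = build_schedule(channels, frame_slots=10)
print(schedule)
print(owner(schedule, 14), utilization(schedule))
```

Because every conflict is resolved when the schedule is built, the owner of any bus cycle is known in advance, and the frame can be packed close to full without the timing becoming unpredictable.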
A conference paper describes the picoArray architecture and how to use it (the implementation it covers is the first-generation PC101, which differs from the PC102 in some details).
The processing power of the picoArray beats any of today’s leading DSPs by more than a factor of 10. And this power is not just a theoretical maximum – the efficiency of picoBus and picoTools means that >90% of this computing power can be used in real systems with the complex mix of control and datapath processing that is typical of today’s advanced wireless systems.
The performance of a single PC102 device at 160MHz is summarized below:
| Operation | Peak per second per device (PC102 @ 160MHz) |
| MIPs | 197.12 Billion Instructions |
| MACs | 38.4 Billion MACs |
| MPYs | 10.88 Billion 16-bit MPYs |
| CDMA Spread | 307.2 Billion complex chips |
| CDMA Despread | 153.6 Billion complex chips |
| Complex Correlation | 143.36 Billion corr. points |
| Path Metric | 8.96 Billion Path Metrics |
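The peak figures above appear to follow from the element counts and the 160MHz clock. The Python sketch below reproduces them as units × operations per cycle × clock. The choice of units for each row follows the element descriptions (MAC unit on Standard processors, Multiply Unit on Memory and Control processors, FAU for correlation and path metrics). The per-cycle factors are not given in the source; they are assumptions inferred by dividing each published figure by the unit count and clock. In particular, the instruction figure works out to 4 per processor per cycle even though the processors are described as 3-way LIW, and the source does not explain the difference.

```python
CLOCK_MHZ = 160  # PC102 clock from the table heading

STANDARD, MEMORY, CONTROL, FAU = 240, 64, 4, 14  # number per PC102

# (operation, contributing units, ASSUMED operations per unit per cycle)
RATES = [
    ("Instructions",        STANDARD + MEMORY + CONTROL, 4),
    ("MACs",                STANDARD,                    1),
    ("MPYs",                MEMORY + CONTROL,            1),
    ("CDMA Spread",         STANDARD,                    8),
    ("CDMA Despread",       STANDARD,                    4),
    ("Complex Correlation", FAU,                         64),
    ("Path Metric",         FAU,                         4),
]


def peak_millions_per_second(units, per_cycle, clock_mhz=CLOCK_MHZ):
    """Peak rate in millions of operations per second (integer arithmetic)."""
    return units * per_cycle * clock_mhz


peaks = {name: peak_millions_per_second(units, per_cycle)
         for name, units, per_cycle in RATES}

for name, millions in peaks.items():
    print(f"{name:20s} {millions / 1000:8.2f} billion per second")
```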
"Von Neumann is a poor use of scaling — all the energy is going on the communication between the processor and the memory. It’s much better to use 20 microprocessors running at 100MHz than one at 2GHz"
Professor Hugo De Man
Co-founder IMEC