1
Architectural Analysis of a DSP Device, the Instruction Set and the Addressing Modes
SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications

Miodrag Bolic

2
Outline
FIR filter on ADSP-21xx

DSP Requirements
Fast Multiply-Accumulates (Data-path)
Extended Precision Accumulator Register (Data-path)
Dual Operand Fetch (Memory)
Circular Buffering (Addressing)
Zero-Overhead Looping (Instruction set)

Analog Devices Architectures and Programming
SHARC
Blackfin
Performance Optimization

3
ADSP-21xx
Copied from [Kester03]

4
CALCULATING OUTPUTS OF 4-TAP FIR FILTER USING A CIRCULAR BUFFER
y(3) = h(0) x(3) + h(1) x(2) + h(2) x(1) + h(3) x(0)
y(4) = h(0) x(4) + h(1) x(3) + h(2) x(2) + h(3) x(1)
y(5) = h(0) x(5) + h(1) x(4) + h(2) x(3) + h(3) x(2)
Memory location   Read   Write   Read   Write   Read
0                 x(0)   x(4)    x(4)           x(4)
1                 x(1)           x(1)   x(5)    x(5)
2                 x(2)           x(2)           x(2)
3                 x(3)           x(3)           x(3)
Copied from [Kester03]
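
The read/write sequence above can be sketched in a few lines of Python. This is a minimal illustration, not code from [Kester03]: the function names `circular_write` and `fir_output`, the sample values, and the coefficient values are all assumptions chosen so the memory contents are easy to follow.

```python
def circular_write(buffer, write_index, sample):
    """Overwrite the oldest sample and return the advanced write index."""
    buffer[write_index] = sample
    return (write_index + 1) % len(buffer)   # wrap around at the end of the buffer


def fir_output(buffer, newest_index, coeffs):
    """Compute y(n) = sum over k of h(k) * x(n-k), stepping back from the newest sample."""
    n_taps = len(coeffs)
    acc = 0
    for k in range(n_taps):
        acc += coeffs[k] * buffer[(newest_index - k) % n_taps]
    return acc


# Hypothetical data: x(n) = n + 1, and coefficients that keep each term visible.
x = [1, 2, 3, 4, 5, 6]
h = [1, 10, 100, 1000]

buffer = x[:4]                      # locations 0..3 hold x(0)..x(3)
write_index = 0                     # the oldest sample, x(0), sits at location 0
y3 = fir_output(buffer, 3, h)       # newest sample x(3) is at location 3

write_index = circular_write(buffer, write_index, x[4])   # x(4) replaces x(0)
y4 = fir_output(buffer, 0, h)       # newest sample x(4) is at location 0

write_index = circular_write(buffer, write_index, x[5])   # x(5) replaces x(1)
y5 = fir_output(buffer, 1, h)       # newest sample x(5) is at location 1
```

Only one memory location changes per new sample; the rest of the history stays in place and the modulo operation takes the role of the DSP's circular addressing hardware.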

5
FIR filter steps
1. Obtain a sample with the ADC; generate an interrupt
2. Detect and manage the interrupt
3. Move the sample into the input signal's circular buffer
4. Update the pointer for the input signal's circular buffer
5. Zero the accumulator
6. Control the loop through each of the coefficients
7. Fetch the coefficient from the coefficient's circular buffer
8. Update the pointer for the coefficient's circular buffer
9. Fetch the sample from the input signal's circular buffer
10. Update the pointer for the input signal's circular buffer
11. Multiply the coefficient by the sample
12. Add the product to the accumulator
13. Move the output sample (accumulator) to a holding buffer
14. Move the output sample from the holding buffer to the DAC

Copied from [Kester03]

6
FIR filter steps (cont.)
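ADSP-21xx example code:

CNTR = N-1;                  { loop counter: N-1 passes }
DO convolution UNTIL CE;     { repeat through the label until the counter expires }
convolution: MR = MR + MX0 * MY0 (SS), MX0 = DM(I0,M1), MY0 = PM(I4,M5);
                             { signed MAC plus one data-memory and one program-memory fetch }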
Copied from [Kester03]
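
One way to read the loop above: each pass multiplies the operand pair fetched on the previous pass while fetching the next pair, which would explain why the counter is N-1 rather than N. The Python sketch below models that reading. It assumes a preload of the first operand pair before the loop and a final MAC after it, neither of which appears in the excerpt; the function name `pipelined_mac` is hypothetical.

```python
def pipelined_mac(samples, coeffs):
    """Model a MAC loop that multiplies the pair fetched on the previous pass."""
    n = len(coeffs)
    mr = 0                                # accumulator, assumed cleared before the loop
    mx0, my0 = samples[0], coeffs[0]      # assumed preload of the first operand pair
    next_fetch = 1
    for _ in range(n - 1):                # CNTR = N-1 passes
        mr += mx0 * my0                   # multiply-accumulate the current pair
        mx0, my0 = samples[next_fetch], coeffs[next_fetch]   # fetch the next pair
        next_fetch += 1
    mr += mx0 * my0                       # assumed final MAC for the last pair
    return mr


result = pipelined_mac([4, 3, 2, 1], [1, 10, 100, 1000])
```

On the DSP the MAC and both fetches happen in the same instruction; the model performs them one after another but keeps the same ordering of values.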

8
Copied from [Takala05]

9
Copied from [Takala05]

10
Motorola DSP5600X
Copied from [Takala05]

11
Copied from [Takala05]

12
Copied from [Takala05]

13
ADSP-21xx
MAC
www.analog.com/dsp

14
Copied from [Takala05]

15
SHARC Architecture ADSP-2106X
Copied from [Takala05]

17
Copied from [Takala05]

18
Copied from [Takala05]

19
Copied from [Takala05]

21
Copied from [Takala05]

22
Copied from [Takala05]

23
Hardware loops
Software loop:
      MOVE #16,B            ; initialize loop counter B
LOOP: MAC (R0)+,(R4)+,A     ; register-indirect addressing with post-increment
      DEC B                 ; decrement the loop counter
      JNE LOOP              ; branch back while the counter is not zero

Hardware loops: no time is spent on
  Decrementing counters
  Checking to see if the loop is finished
  Branching back to the top of the loop

Hardware loop:
      RPT #16               ; repeat the next instruction 16 times
      MAC (R0)+,(R4)+,A
[Lapsley97]
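
To put a number on the overhead, the sketch below counts instructions executed by the two versions for 16 iterations. It counts instructions rather than cycles, which is an assumption: on a real device a taken branch may cost more than one cycle, so the software loop could fare worse than shown. The function names are hypothetical.

```python
def software_loop_instructions(iterations):
    """One MOVE to set the counter, then MAC + DEC + JNE on every iteration."""
    return 1 + 3 * iterations


def hardware_loop_instructions(iterations):
    """One RPT to set up the loop, then only the MAC on every iteration."""
    return 1 + iterations


sw = software_loop_instructions(16)
hw = hardware_loop_instructions(16)
overhead = sw - hw    # instructions spent purely on loop control
```

With 16 iterations the software loop executes 49 instructions against 17 for the hardware loop, so roughly two thirds of the software version is loop control.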

24
Copied from [Kester03]

25

ADI General Purpose DSP Product Families
(chart axis: Performance)

ADSP-218x/9x (Power Efficient, $5 – $10)
  Up to 160 MMACS
  Wired Voice, Wireless Voice, VOIP/VON, Industrial Control

SHARC (Low-Cost Floating Point, $10 – $100)
  Up to 600 MMACS (32-bit)
  Audio, Infotainment, Industrial

Blackfin (Media Enabled, $5 – $30)
  Up to 3000 MMACS
  Image compression, Digital Still/Video Camera, MMOIP, Telematics, Biometrics

TigerSHARC (High-Performance, $35 – $200)
  Up to 4800 MMACS (16-bit) or 1200 MMACS (32-bit)
  2.5G/3G Infrastructure, Medical Imaging, Industrial Imaging, Multiprocessing
www.analog.com/dsp

27
SHARC Architecture
Copied from [Smith97]

28
SHARC Architecture – Features
The Super Harvard ARChitecture
100MHz Core / 300 MFLOPS Peak
Parallel Operation of: Multiplier, ALU, 2 Address Generators & Sequencer
No Arithmetic Pipeline; All Computations Are Single-Cycle
High Precision and Extended Dynamic Range
32/40-Bit IEEE Floating-Point Math
32-Bit Fixed-Point MACs with 64-Bit Product & 80-Bit Accumulation
Single-Cycle Transfers with Dual-Ported Memory Structures
Supported by Cache Memory and Enhanced Harvard Architecture
Glueless Multiprocessing Features
JTAG Test and Emulation Port
DMA Controller, Serial Ports, Link Ports, External Bus, SDRAM Controller, Timers
www.analog.com/dsp
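
The "64-Bit Product & 80-Bit Accumulation" figure can be turned into a headroom estimate. The sketch below is a worked calculation, assuming two's-complement 32-bit operands and a signed 80-bit accumulator; the function names are hypothetical. It shows both the common rule of thumb (2 to the power of the guard bits) and the exact worst-case count.

```python
def guard_bits(product_bits, accumulator_bits):
    """Extra accumulator bits beyond the width of a single product."""
    return accumulator_bits - product_bits


def worst_case_accumulations(operand_bits, accumulator_bits):
    """Number of largest-magnitude signed products that fit in the accumulator."""
    largest_product = (1 << (operand_bits - 1)) ** 2       # (-2^(b-1)) * (-2^(b-1))
    accumulator_max = (1 << (accumulator_bits - 1)) - 1    # largest positive value
    return accumulator_max // largest_product


g = guard_bits(64, 80)
rule_of_thumb = 1 << g                        # conservative estimate: 2^g products
exact = worst_case_accumulations(32, 80)      # exact count for worst-case inputs
```

Under these assumptions the 16 guard bits allow at least 65,536 worst-case products to be summed before overflow becomes possible, which is far more taps than a typical FIR filter uses.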

29
ADSP-2106x Core Architecture
www.analog.com/dsp

30
Example- Dot product
C code
Copied from [Smith97]

31
Example- Dot product – Assembly
Copied from [Smith97]

32
Example- Dot product – Assembly
Copied from [Smith97]

33
C or Assembly
How complicated is the program?
Are you pushing the maximum speed of the DSP?
How many programmers will be working together?
Which is more important, product cost or development cost?
What is your background?
What does the DSP's manufacturer suggest you use?

Copied from [Smith97]

35
BLACKfin Processor Core
Two 16-bit Multipliers
Two 40-bit ALUs, Four 8-bit Video ALUs
Barrel Shifter
Sixteen 16-bit / Eight 32-bit Math Registers

Two DAGs, byte addressing
Eight 32-bit pointer registers
Four Sets of 32-bit Index, Modify, Length, Base

16-bit Instructions, 32-bit Instructions
Multi-Issue, 64-bit Instructions

Interlocked Pipeline
Micro Signal Architecture, developed with Intel
www.analog.com/dsp

36
ADSP-BF535 BLACKfin Processor Architecture
Great Performance Value
  Highest Frequency (350 MHz)
  1.0 V to 1.6 V
  260 PBGA
High System Integration
  Address range 768 Mbytes
  SPORTs support 8 Channels of I2S Audio
  (532 Mbps) I/O Bandwidth, DMA Bandwidth & Memory Bandwidth
  Microcontroller features include WDT, PCI, USB 1.1, SDRAM controller

[Block diagram labels]
  BLACKfin Processor Core: to 350 MHz
  Memory: 308 Kbytes On-Chip SRAM; 264 Kbytes On-Chip SRAM; 48 Kbytes On-Chip Cache
  System Peripherals: SDRAM and FLASH/SRAM Interfaces, Real Time Clock, Watchdog, JTAG, DMA, PLL, Dynamic Power Management
  User Peripherals: SPI (2), UART (2), Timers (3, 32-bit), GPIO (16), SPORTs (2), PCI, USB 1.1
www.analog.com/dsp
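
The two 16-bit multipliers and the 350 MHz clock quoted above suggest a simple peak-rate calculation. The sketch below assumes each multiplier completes one 16-bit MAC per cycle and that the loop cost is one cycle per pair of taps; both are idealizations, and the function names are hypothetical.

```python
def peak_mmacs(clock_mhz, macs_per_cycle):
    """Peak multiply-accumulate rate in millions of MACs per second."""
    return clock_mhz * macs_per_cycle


def dual_mac_dot(x, h):
    """Dot product using two accumulators, modelling two MACs per cycle."""
    acc0 = acc1 = 0
    cycles = 0
    for i in range(0, len(x) - 1, 2):
        acc0 += x[i] * h[i]              # first multiplier
        acc1 += x[i + 1] * h[i + 1]      # second multiplier, same cycle
        cycles += 1
    if len(x) % 2:                       # odd length: one leftover product
        acc0 += x[-1] * h[-1]
        cycles += 1
    return acc0 + acc1, cycles


bf535_peak = peak_mmacs(350, 2)
total, cycles = dual_mac_dot([4, 3, 2, 1], [1, 10, 100, 1000])
```

In this idealized model a 4-tap dot product takes 2 cycles instead of 4, and the peak rate at 350 MHz works out to 700 MMACS.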

37
Seminars about Blackfin


