ARM processor organization
|
|
- Suzan Davidson
- 5 years ago
- Views:
Transcription
1 ARM processor organization P. Bakowski
2 ARM register bank The register bank,, which stores the processor state. r00 r01 r14 r15 P. Bakowski 2
3 ARM register bank It has two read ports and one write port which can each be used to access any register, plus an additional read port and an additional write port that give special access to r15, the program counter. r00 r01 r14 r15 P. Bakowski 3
4 ARM register bank It has two read ports and one write port which can each be used to access any register, plus an additional read port and an additional write port that give special access to r15, the program counter. r00 r01 r14 r15 P. Bakowski 4
5 ARM register bank It has two read ports and one write port which can each be used to access any register, plus an additional read port and an additional write port that give special access to r15, the program counter. r00 r01 r14 r15 P. Bakowski 5
6 ARM register bank It has two read ports and one write port which can each be used to access any register, plus an additional read port and an additional write port that give special access to r15, the program counter. r00 r01 r14 r15 P. Bakowski 6
7 ARM barrel shifter The barrel shifter,, which can shift or rotate one operand by any number of bits. number of bits P. Bakowski 7
8 ARM ALU The ALU, which performs the arithmetic and logic functions required by the instruction set. operands functions P. Bakowski 8
9 ARM 3-stage 3 pipeline The address register and incrementer,, which select and hold addresses and generate sequential addresses when required. address register incrementer P. Bakowski 9
10 ARM 3-stage 3 pipeline The data registers, which hold data passing to and from memory. data instructions data out register data in register to/from memory d[31:0] P. Bakowski 10
11 ARM 3-stage 3 pipeline The instruction decoder and associated control logic. control path control signals data path instructions data in register P. Bakowski 11
12 Three stage pipeline : ARM 1,2,3 FETCH; ; the instruction is fetched from memory and placed in the instruction pipeline. to instruction register address register data in register memory fetch clock cycle P. Bakowski 12
13 Three stage pipeline : ARM 1,2,3 DECODE; ; the instruction is decoded and the datapath control signals prepared for the next cycle. instruction register control path control signals fetch decode fetch P. Bakowski 13
14 Three stage pipeline : ARM 1,2,3 EXECUTE; ; the instruction controls the datapath; ; the register bank is read, an operand shifted the ALU result generated and written back into a destination register. control signals data path fetch decode fetch execute decode P. Bakowski 14
15 Three stage pipeline : ARM 1,2,3 instruction throughput : 1 instruction per clock cycle instruction latency : 3 clock cycles fetch decode execute fetch decode execute clock cycle fetch decode execute P. Bakowski 15
16 Three stage pipeline : ARM 1,2,3 instruction throughput : 1 instruction per clock cycle instruction latency : 3 clock cycles fetch decode fetch execute decode fetch execute decode execute P. Bakowski 16
17 ARM 1,2,3 architecture a[31:0] instruction register control path control signals Bbus data in register address register incrementer incrementer PC register bank M BS Abus ALUbus ALU M multiplier, BS barrel shifter data out register d[31:0] P. Bakowski 17
18 Multi cycle execution fetch decode STR fetch STR store instruction needs two execution cycles: address calculation cycle and data transfer cycle P. Bakowski 18
19 Multi cycle execution fetch decode STR execute decode fetch address fetch address calculation cycle and data transfer cycle P. Bakowski 19
20 Multi cycle execution 4 clock cycles fetch decode execute STR decode address transfer fetch decode execute fetch decode address calculation cycle and data transfer cycle P. Bakowski 20
21 Multi cycle execution 4 clock cycles fetch decode execute STR decode address transfer fetch decode execute fetch decode address calculation cycle and data transfer cycle Attention: only one memory transfer per clock cycle P. Bakowski 21
22 Processor performance The time T,, required to execute a given program is given by: T=(N inst *CPI)/f clk N inst - number of instructions in the program CPI - clock cycles per instruction fclk - clock frequency P. Bakowski 22
23 Processor performance The time T,, required to execute a given program is given by: T=(N inst *CPI)/f clk N inst - number of instructions in the program CPI - clock cycles per instruction fclk - clock frequency P. Bakowski 23
24 Processor performance The time T,, required to execute a given program is given by: T=(N inst *CPI)/f clk N inst - number of instructions in the program CPI - clock cycles per instruction (throughput) fclk - clock frequency P. Bakowski 24
25 Processor performance The time T,, required to execute a given program is given by: T=(N inst *CPI)/f clk N inst - number of instructions in the program CPI - clock cycles per instruction fclk - clock frequency P. Bakowski 25
26 Processor performance Since N insi is constant for a given program there are only two ways to increase performance: increase the clock rate, clk f. reduce the average number of clock cycles per instruction, CPI. P. Bakowski 26
27 Clock rate increase Increase the clock rate, clk f. This requires the logic in each pipeline stage to be simplified and, therefore, the number of pipeline stages to be increased. stage1 stage2 clock cycle 3 stages stage3 P. Bakowski 27
28 Clock rate increase Increase the clock rate, clk f. This requires the logic in each pipeline stage to be simplified and, therefore, the number of pipeline stages to be increased. stage1 stage2 clock cycle 3 stages stage3 stage1 stage2 stage3 clock cycle 5 stages stage4 stage5 P. Bakowski 28
29 Clock rate increase Note that the clock rate to be used depends heavily on the implementation technology. stage1 stage1 stage2 stage2 clock cycle 3 stages stage3 clock cycle 3 stages stage3 new implementation technology P. Bakowski 29
30 More hardware resources Reduce the average number of clock cycles per instruction, CPI. This requires the introduction of more parallelism that means more hardware resources to be used in a given clock cycle. D/I M DM IM data read or instruction fetch data read and instruction fetch P. Bakowski 30
31 5-stage pipeline organization Higher performance ARM cores employ a 5-stage 5 pipeline and have separate instruction and data memories. Breaking instruction execution down into five stages rather than three reduces the maximum work which must be completed in a clock cycle, and hence allows a higher clock frequency to be used. The separate instruction and data memories seen as separate caches connected to a unified instruction and data main memory allow a significant reduction in the core's CPI. P. Bakowski 31
32 5-stage pipeline organization Higher performance ARM cores employ a 5-stage 5 pipeline and have separate instruction and data memories. Breaking instruction execution down into five stages rather than three reduces the maximum work which must be completed in a clock cycle, and hence allows a higher clock frequency to be used. The separate instruction and data memories seen as separate caches connected to a unified instruction and data main memory allow a significant reduction in the core's CPI. P. Bakowski 32
33 5-stage pipeline organization Higher performance ARM cores employ a 5-stage 5 pipeline and have separate instruction and data memories. Breaking instruction execution down into five stages rather than three reduces the maximum work which must be completed in a clock cycle, and hence allows a higher clock frequency to be used. The separate instruction and data memories seen as separate caches connected to a unified instruction and data main memory allow a significant reduction in the core's CPI. P. Bakowski 33
34 incrementer incrementer Fetch stage fetch decode execute buffer write FETCH - the instruction is fetched from memory and placed in the instruction cache. next PC I cache to decoder P. Bakowski 34
35 Decode stage fetch decode execute buffer write DECODE - the instruction is decoded and register operands read from the register file. decode I - decode P. Bakowski 35
36 Decode stage fetch decode execute buffer write There are three operand read ports in the register file, so most instructions can obtain all their operands in one cycle. decode I - decode register file P. Bakowski 36
37 Execute stage fetch decode execute buffer write EXECUTE - an operand is shifted and the ALU result generated. register file M BS ALU ALUbus +4 P. Bakowski 37
38 Execute stage fetch decode execute buffer write EXECUTE - an operand is shifted and the ALU result generated. register file M BS ALU ALUbus +4 P. Bakowski 38
39 Execute stage fetch decode execute buffer write If the instruction is a load or store the memory address is computed in the ALU. register file M BS ALU ALUbus +4 P. Bakowski 39
40 Buffer stage fetch decode execute buffer write BUFFER data - data memory is accessed if required. Otherwise the ALU result is simply buffered for one clock cycle to give the same pipeline flow for all instructions. byte replication +4 D cache rotation/sign extension P. Bakowski 40
41 Write back stage fetch decode execute buffer write WRITE-back; the results generated by the instruction are written back to the register file, including any data loaded from memory. ALU ALUbus register file D cache rotation/sign extension P. Bakowski 41
42 Data forwarding In the 5-stage 5 pipeline instruction execution is spread across three pipeline stages,, the only way to resolve data dependencies without stalling the pipeline is to introduce forwarding paths. fetch decode execute buffer write P. Bakowski 42
43 Data forwarding to register file fetch decode execute buffer write fetch decode execute buffer write Data dependencies arise when an instruction needs to use the result of one of its predecessors before that result has returned to the register file. P. Bakowski 43
44 Data forwarding Forwarding paths (by-pass) allow the intermediate results to be passed between stages as soon as they are available, in the 5-stage 5 ARM pipeline each of the three source operands can be forwarded from any of three intermediate result registers to register file fetch decode execute buffer write by-pass paths fetch decode execute buffer write P. Bakowski 44
45 PC organization - compatibility The programming behavior of the PC implemented through r15 is based on the operational characteristics of the 3-stage 3 ARM pipeline. Basically the 5-stage 5 pipeline reads the instruction operands one stage earlier and that is incompatible with 3-stage design. P. Bakowski 45
46 PC organization - compatibility The programming behavior of the PC implemented through r15 is based on the operational characteristics of the 3-stage 3 ARM pipeline. Basically the 5-stage 5 pipeline reads the instruction operands one stage earlier and that is incompatible with 3-stage design. P. Bakowski 46
47 PC organisation - solution This problem is resolved by the incrementation of the PC value from the fetch stage in the decode stage, bypassing the pipeline register between the two stages. PC+4 for the next instruction is equal to PC+8 for the current instruction (4 bytes farther), so the correct r15 value is obtained without additional hardware. P. Bakowski 47
48 PC organisation - solution PC+4 for the next instruction is equal to PC+8 for the current instruction (4 bytes farther), so the correct r15 value is obtained without additional hardware. incrementer incrementer I cache next PC+4 to decoder register file next PC+8 r15 P. Bakowski 48
49 ARM programming model The Instruction Set Architecture (ISA) defines the operations that the programmer can use to change the state of the system incorporating the processor. This state usually comprises the values of the data items in the visible registers and the memory. Each instruction performs a defined transformation from the state before the instruction is executed to the state after it has completed. P. Bakowski 49
50 ARM programming model The Instruction Set Architecture (ISA) defines the operations that the programmer can use to change the state of the system incorporating the processor. This state usually comprises the values of the data items in the visible registers and the memory. Each instruction performs a defined transformation from the state before the instruction is executed to the state after it has completed. P. Bakowski 50
51 ARM programming model The Instruction Set Architecture (ISA) defines the operations that the programmer can use to change the state of the system incorporating the processor. This state usually comprises the values of the data items in the visible registers and the memory. Each instruction performs a defined transformation from the state before the instruction is executed to the state after it has completed. P. Bakowski 51
52 ARM memory subsystem ARM memory may be viewed as a linear array of bytes numbered from zero up to linear array of bytes 0 P. Bakowski 52
53 ARM memory subsystem Data items may be 8-bit 8 bytes,, 16-bit half-words or 32-bit words bytes words P. Bakowski 53
54 ARM memory subsystem Words are always aligned on 4-byte boundaries (the two least significant address bits are zero) ) and half- words are aligned on even byte boundaries byte number word address P. Bakowski 54
55 ARM memory subsystem Words are always aligned on 4-byte boundaries (the two least significant address bits are zero) ) and half- words are aligned on even byte boundaries little endian organization P. Bakowski 55
56 ARM load-store architecture The processing instruction (add, subtract, and so on) take the values from the registers and always place the results into a register. register file M BS ALU ALUbus +4 P. Bakowski 56
57 ARM load-store architecture The processing instruction (add, subtract, and so on) take the values from the registers and always place the results into a register. register file M BS ALU ALUbus +4 P. Bakowski 57
58 ARM load-store architecture The only instructions which apply to memory state are ones which copy memory values into register (load instructions) or copy register values into memory (store instructions). register file D -cache memory P. Bakowski 58
59 ARM load-store architecture The only instructions which apply to memory state are ones which copy memory values into register (load instructions) or copy register values into memory (store instructions). register file D -cache memory P. Bakowski 59
60 ARM instructions In general the ARM instructions fall into one of the following three categories: data processing instructions data transfer instructions control flow instructions P. Bakowski 60
61 ARM instructions In general the ARM instructions fall into one of the following three categories: data processing instructions data transfer instructions control flow instructions P. Bakowski 61
62 ARM instructions In general the ARM instructions fall into one of the following three categories: data processing instructions data transfer instructions control flow instructions P. Bakowski 62
63 ARM instructions In general the ARM instructions fall into one of the following three categories: data processing instructions data transfer instructions control flow instructions P. Bakowski 63
64 ARM data processing Data processing instructions: these use and change only register values; P. Bakowski 64
65 ARM data processing For example, an instruction can add two registers and place the result in a register. register file M BS ALU ALUbus +4 P. Bakowski 65
66 ARM data transfer Data transfer instructions copy memory values into registers (load( instructions) or copy register values into memory (store( instructions); An additional form, useful only in systems code, exchanges a memory value with a register value. P. Bakowski 66
67 ARM data transfer Data transfer instructions copy memory values into registers (load instructions) or copy register values into memory (store instructions); An additional form, useful only in systems code, exchanges a memory value with a register value. P. Bakowski 67
68 ARM data transfer Data transfer instructions copy memory values into registers (load instructions) or copy register values into memory (store instructions); An additional form, useful only in systems code, exchanges a memory value with a register value. e.g. test and set instruction P. Bakowski 68
69 ARM control flow Control flow instructions cause execution to switch to a different address,, either permanently (branch instructions) ) or saving a return address to resume the original sequence (branch and link instructions) or trapping into system code (supervisor calls). P. Bakowski 69
70 ARM control flow Control flow instructions cause execution to switch to a different address, either permanently (branch instructions) or saving a return address to resume the original sequence (branch and link instructions) or trapping into system code (supervisor calls). link address P. Bakowski 70
71 ARM control flow Control flow instructions cause execution to switch to a different address, either permanently (branch instructions) or saving a return address to resume the original sequence (branch and link instructions) or trapping into system code (supervisor calls). link address system code P. Bakowski 71
72 ARM supervisor mode The ARM processor supports a protected supervisor mode.. The protection mechanism ensures that user code cannot gain supervisor privileges without appropriate checks being carried out to ensure that the code is not attempting illegal operations. I/O driver illegal operation? user code P. Bakowski 72
73 ARM supervisor mode The upshot of this for the user-level programmer is that system-level functions can only be accessed through specified supervisor call. system/supervisor call I/O driver system code user code P. Bakowski 73
74 ARM I/O programming The ARM handles I/0 (input/output) peripherals (such as disk controllers, network interfaces, and so on) as memory-mapped mapped devices with interrupt support. memory-mapped mapped devices P. Bakowski 74
75 ARM I/O programming The internal registers in these devices appear as addressable locations within the ARM's memory map and may be read and written using the same (load( load- store) ) instructions as any other memory locations. store memory locations load P. Bakowski 75
76 ARM I/O interruptions Peripherals may attract the processor's attention by making an interrupt request using either the normal interrupt (IRQ)) or the fast interrupt (FIQ) input. to CPU - FIQ to CPU - IRQ P. Bakowski 76
77 ARM I/O interruptions Both interrupt inputs are level-sensitive and maskable. Normally most interrupt sources share the IRQ input, with just one or two time-critical sources connected to the higher-priority FIQ input. IRQ FIQ P. Bakowski 77
78 ARM I/O interruptions Some systems may include direct memory access (DMA) hardware external to the processor to handle high-bandwidth traffic. system bus DMA traffic P. Bakowski 78
79 ARM exceptions The ARM architecture supports a range of : interrupts traps supervisor calls all grouped under the general heading of exceptions. P. Bakowski 79
80 ARM exceptions The ARM architecture supports a range of : interrupts traps supervisor calls all grouped under the general heading of exceptions. P. Bakowski 80
81 ARM exceptions The ARM architecture supports a range of : interrupts traps supervisor calls all grouped under the general heading of exceptions. P. Bakowski 81
82 ARM exceptions The ARM architecture supports a range of : interrupts traps supervisor calls all grouped under the general heading of exceptions. P. Bakowski 82
83 ARM exceptions The ARM architecture supports a range of : interrupts traps supervisor calls all grouped under the general heading of exceptions. P. Bakowski 83
84 ARM exceptions The general way of exception handling is the same in all cases: the current state is saved by copying the PC into rl4_exc and the CPSR into SPSR_exc (where exc stands for the exception type); the processor operating mode is changed to the appropriate exception mode; the PC is forced to a value between and 1C 16, the particular value depending on the type of exception. P. Bakowski 84
85 ARM exceptions The general way of exception handling is the same in all cases: the current state is saved by copying the PC into rl4_exc and the CPSR into SPSR_exc (where exc stands for the exception type); the processor operating mode is changed to the appropriate exception mode; the PC is forced to a value between and 1C 16, the particular value depending on the type of exception. P. Bakowski 85
86 ARM exceptions The general way of exception handling is the same in all cases: the current state is saved by copying the PC into rl4_exc and the CPSR into SPSR_exc (where exc stands for the exception type); the processor operating mode is changed to the appropriate exception mode; the PC is forced to a value between and 1C 16, the particular value depending on the type of exception. P. Bakowski 86
87 ARM exceptions The general way of exception handling is the same in all cases: the current state is saved by copying the PC into rl4_exc and the CPSR into SPSR_exc (where exc stands for the exception type); the processor operating mode is changed to the appropriate exception mode; the PC is forced to a value between and 1C 16, the particular value depending on the type of exception. P. Bakowski 87
88 ARM instruction execution a[31:0] instruction register control path control signals Bbus data in register address register incrementer incrementer PC register bank M BS Abus ALUbus ALU M multiplier, BS barrel shifter data out register d[31:0] P. Bakowski 88
89 incrementer incrementer Data processing instruction address register address register data out register data out register data in register data in register register register operations d[31:0] Bbus PC register bank M BS Abus ALUbus ALU P. Bakowski 89 a[31:0]
90 incrementer incrementer Data processing instruction address register address register data out register data out register ir register ir register data in register data in register register immediate operations d[31:0] Bbus PC register bank M BS Abus ALUbus ALU P. Bakowski 90 a[31:0]
91 incrementer incrementer Store instruction address register address register data out register data out register data in register data in register d[31:0] ir register ir register compute address operation Bbus PC register bank M ALU P. Bakowski 91 BS Abus ALUbus new address a[31:0]
92 incrementer incrementer Store instruction address register address register data out register data out register ir register ir register data in register data in register store data auto-index d[31:0] Bbus PC register bank M BS Abus ALUbus ALU auto-index P. Bakowski 92 a[31:0]
93 incrementer incrementer Branch instruction address register address register data out register data out register ir register ir register data in register data in register compute branch address d[31:0] Bbus PC register bank M BS Abus ALUbus ALU branch address P. Bakowski 93 a[31:0]
94 incrementer incrementer Branch instruction address register address register data out register data out register ir register ir register data in register data in register store return address d[31:0] Bbus PC register bank R14 M BS Abus ALUbus ALU return address P. Bakowski 94 a[31:0]
95 Summary ARM register bank ARM barrel shifter and ALU ARM 3-stage 3 and 5-stage 5 pipelines ARM programming model ARM instructions P. Bakowski 95
96 Summary ARM register bank ARM barrel shifter and ALU ARM 3-stage 3 and 5-stage 5 pipelines ARM programming model ARM instructions P. Bakowski 96
97 Summary ARM register bank ARM barrel shifter and ALU ARM 3-stage and 5-stage pipelines ARM programming model ARM instructions P. Bakowski 97
98 Summary ARM register bank ARM barrel shifter and ALU ARM 3-stage 3 and 5-stage 5 pipelines ARM programming model ARM instructions P. Bakowski 98
99 Summary ARM register bank ARM barrel shifter and ALU ARM 3-stage 3 and 5-stage 5 pipelines ARM programming model ARM instructions P. Bakowski 99
Samsung S3C4510B. Hsung-Pin Chang Department of Computer Science National Chung Hsing University
Samsung S3C4510B Hsung-Pin Chang Department of Computer Science National Chung Hsing University S3C4510B A 16/32-bit RISC microcontroller is a cost-effective, highperformance microcontroller 16/32-bit
More informationARM ARCHITECTURE. Contents at a glance:
UNIT-III ARM ARCHITECTURE Contents at a glance: RISC Design Philosophy ARM Design Philosophy Registers Current Program Status Register(CPSR) Instruction Pipeline Interrupts and Vector Table Architecture
More informationCPE300: Digital System Architecture and Design
CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Pipelining 11142011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Review I/O Chapter 5 Overview Pipelining Pipelining
More informationCPU Structure and Function
Computer Architecture Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com http://www.yildiz.edu.tr/~naydin CPU Structure and Function 1 2 CPU Structure Registers
More informationCS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS
CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight
More informationEmbedded Systems Ch 15 ARM Organization and Implementation
Embedded Systems Ch 15 ARM Organization and Implementation Byung Kook Kim Dept of EECS Korea Advanced Institute of Science and Technology Summary ARM architecture Very little change From the first 3-micron
More informationWilliam Stallings Computer Organization and Architecture 8 th Edition. Chapter 12 Processor Structure and Function
William Stallings Computer Organization and Architecture 8 th Edition Chapter 12 Processor Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationLecture1: introduction. Outline: History overview Central processing unite Register set Special purpose address registers Datapath Control unit
Lecture1: introduction Outline: History overview Central processing unite Register set Special purpose address registers Datapath Control unit 1 1. History overview Computer systems have conventionally
More informationThe Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationJob Posting (Aug. 19) ECE 425. ARM7 Block Diagram. ARM Programming. Assembly Language Programming. ARM Architecture 9/7/2017. Microprocessor Systems
Job Posting (Aug. 19) ECE 425 Microprocessor Systems TECHNICAL SKILLS: Use software development tools for microcontrollers. Must have experience with verification test languages such as Vera, Specman,
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationUNIT- 5. Chapter 12 Processor Structure and Function
UNIT- 5 Chapter 12 Processor Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data CPU With Systems Bus CPU Internal Structure Registers
More informationARM Processors for Embedded Applications
ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or
More informationCPU Structure and Function. Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition
CPU Structure and Function Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition CPU must: CPU Function Fetch instructions Interpret/decode instructions Fetch data Process data
More informationFinal Lecture. A few minutes to wrap up and add some perspective
Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection
More informationComputer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics
Computer and Hardware Architecture I Benny Thörnberg Associate Professor in Electronics Hardware architecture Computer architecture The functionality of a modern computer is so complex that no human can
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationChapter 4. Enhancing ARM7 architecture by embedding RTOS
Chapter 4 Enhancing ARM7 architecture by embedding RTOS 4.1 ARM7 architecture 4.2 ARM7TDMI processor core 4.3 Embedding RTOS on ARM7TDMI architecture 4.4 Block diagram of the Design 4.5 Hardware Design
More informationHercules ARM Cortex -R4 System Architecture. Processor Overview
Hercules ARM Cortex -R4 System Architecture Processor Overview What is Hercules? TI s 32-bit ARM Cortex -R4/R5 MCU family for Industrial, Automotive, and Transportation Safety Hardware Safety Features
More informationWhat is Pipelining? RISC remainder (our assumptions)
What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism
More informationImproving Performance: Pipelining
Improving Performance: Pipelining Memory General registers Memory ID EXE MEM WB Instruction Fetch (includes PC increment) ID Instruction Decode + fetching values from general purpose registers EXE EXEcute
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationChapter 9 Pipelining. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan
Chapter 9 Pipelining Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Basic Concepts Data Hazards Instruction Hazards Advanced Reliable Systems (ARES) Lab.
More informationUniversität Dortmund. ARM Architecture
ARM Architecture The RISC Philosophy Original RISC design (e.g. MIPS) aims for high performance through o reduced number of instruction classes o large general-purpose register set o load-store architecture
More informationEN1640: Design of Computing Systems Topic 07: I/O
EN1640: Design of Computing Systems Topic 07: I/O Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring 2017 [ material
More informationThe Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
More informationCPE300: Digital System Architecture and Design
CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Arithmetic Unit 10032011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Recap Chapter 3 Number Systems Fixed Point
More informationInput Output (IO) Management
Input Output (IO) Management Prof. P.C.P. Bhatt P.C.P Bhatt OS/M5/V1/2004 1 Introduction Humans interact with machines by providing information through IO devices. Manyon-line services are availed through
More informationASSEMBLY LANGUAGE MACHINE ORGANIZATION
ASSEMBLY LANGUAGE MACHINE ORGANIZATION CHAPTER 3 1 Sub-topics The topic will cover: Microprocessor architecture CPU processing methods Pipelining Superscalar RISC Multiprocessing Instruction Cycle Instruction
More informationChapter 4. The Processor Designing the datapath
Chapter 4 The Processor Designing the datapath Introduction CPU performance determined by Instruction Count Clock Cycles per Instruction (CPI) and Cycle time Determined by Instruction Set Architecure (ISA)
More informationARM Architecture (1A) Young Won Lim 3/20/18
Copyright (c) 2014-2018 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published
More informationChapter 12. CPU Structure and Function. Yonsei University
Chapter 12 CPU Structure and Function Contents Processor organization Register organization Instruction cycle Instruction pipelining The Pentium processor The PowerPC processor 12-2 CPU Structures Processor
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationVE7104/INTRODUCTION TO EMBEDDED CONTROLLERS UNIT III ARM BASED MICROCONTROLLERS
VE7104/INTRODUCTION TO EMBEDDED CONTROLLERS UNIT III ARM BASED MICROCONTROLLERS Introduction to 32 bit Processors, ARM Architecture, ARM cortex M3, 32 bit ARM Instruction set, Thumb Instruction set, Exception
More informationThe Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
The Processor: Datapath and Control Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Introduction CPU performance factors Instruction count Determined
More informationWilliam Stallings Computer Organization and Architecture
William Stallings Computer Organization and Architecture Chapter 11 CPU Structure and Function Rev. 3.2.1 (2005-06) by Enrico Nardelli 11-1 CPU Functions CPU must: Fetch instructions Decode instructions
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationCOMPUTER ORGANIZATION AND DESI
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler
More informationECEC 355: Pipelining
ECEC 355: Pipelining November 8, 2007 What is Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution. A pipeline is similar in concept to an assembly
More informationWilliam Stallings Computer Organization and Architecture. Chapter 11 CPU Structure and Function
William Stallings Computer Organization and Architecture Chapter 11 CPU Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data Registers
More informationINSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing
UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 05
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware 4.1 Introduction We will examine two MIPS implementations
More informationChapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations
Chapter 4 The Processor Part I Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations
More informationCPU Structure and Function
CPU Structure and Function Chapter 12 Lesson 17 Slide 1/36 Processor Organization CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data Lesson 17 Slide 2/36 CPU With Systems
More informationTypical Processor Execution Cycle
Typical Processor Execution Cycle Instruction Fetch Obtain instruction from program storage Instruction Decode Determine required actions and instruction size Operand Fetch Locate and obtain operand data
More informationENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design
ENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University
More informationInstruction Level Parallelism. Appendix C and Chapter 3, HP5e
Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation
More informationWilliam Stallings Computer Organization and Architecture 10 th Edition Pearson Education, Inc., Hoboken, NJ. All rights reserved.
+ William Stallings Computer Organization and Architecture 10 th Edition 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved. 2 + Chapter 14 Processor Structure and Function + Processor Organization
More informationWhat is Pipelining? Time per instruction on unpipelined machine Number of pipe stages
What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism
More informationLet s put together a Manual Processor
Lecture 14 Let s put together a Manual Processor Hardware Lecture 14 Slide 1 The processor Inside every computer there is at least one processor which can take an instruction, some operands and produce
More informationPipelining. CSC Friday, November 6, 2015
Pipelining CSC 211.01 Friday, November 6, 2015 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not
More information1 MALP ( ) Unit-1. (1) Draw and explain the internal architecture of 8085.
(1) Draw and explain the internal architecture of 8085. The architecture of 8085 Microprocessor is shown in figure given below. The internal architecture of 8085 includes following section ALU-Arithmetic
More informationAdvanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017
Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation
More informationLecture 10 Exceptions and Interrupts. How are exceptions generated?
Lecture 10 Exceptions and Interrupts The ARM processor can work in one of many operating modes. So far we have only considered user mode, which is the "normal" mode of operation. The processor can also
More informationDigital System Design Using Verilog. - Processing Unit Design
Digital System Design Using Verilog - Processing Unit Design 1.1 CPU BASICS A typical CPU has three major components: (1) Register set, (2) Arithmetic logic unit (ALU), and (3) Control unit (CU) The register
More informationControl unit. Input/output devices provide a means for us to make use of a computer system. Computer System. Computer.
Lecture 6: I/O and Control I/O operations Control unit Microprogramming Zebo Peng, IDA, LiTH 1 Input/Output Devices Input/output devices provide a means for us to make use of a computer system. Computer
More informationProcessor Structure and Function
WEEK 4 + Chapter 14 Processor Structure and Function + Processor Organization Processor Requirements: Fetch instruction The processor reads an instruction from memory (register, cache, main memory) Interpret
More informationThe CPU Pipeline. MIPS R4000 Microprocessor User's Manual 43
The CPU Pipeline 3 This chapter describes the basic operation of the CPU pipeline, which includes descriptions of the delay instructions (instructions that follow a branch or load instruction in the pipeline),
More informationChapter 4. MARIE: An Introduction to a Simple Computer. Chapter 4 Objectives. 4.1 Introduction. 4.2 CPU Basics
Chapter 4 Objectives Learn the components common to every modern computer system. Chapter 4 MARIE: An Introduction to a Simple Computer Be able to explain how each component contributes to program execution.
More informationstructural RTL for mov ra, rb Answer:- (Page 164) Virtualians Social Network Prepared by: Irfan Khan
Solved Subjective Midterm Papers For Preparation of Midterm Exam Two approaches for control unit. Answer:- (Page 150) Additionally, there are two different approaches to the control unit design; it can
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationInstruction Pipelining Review
Instruction Pipelining Review Instruction pipelining is CPU implementation technique where multiple operations on a number of instructions are overlapped. An instruction execution pipeline involves a number
More informationAdvanced Computer Architecture
Advanced Computer Architecture Chapter 1 Introduction into the Sequential and Pipeline Instruction Execution Martin Milata What is a Processors Architecture Instruction Set Architecture (ISA) Describes
More informationProcessor (I) - datapath & control. Hwansoo Han
Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two
More informationMinimizing Data hazard Stalls by Forwarding Data Hazard Classification Data Hazards Present in Current MIPS Pipeline
Instruction Pipelining Review: MIPS In-Order Single-Issue Integer Pipeline Performance of Pipelines with Stalls Pipeline Hazards Structural hazards Data hazards Minimizing Data hazard Stalls by Forwarding
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationMemory General R0 Registers R1 R2. Input Register 1. Input Register 2. Program Counter. Instruction Register
CPU Organisation Central Processing Unit (CPU) Memory General R0 Registers R1 R2 ALU R3 Output Register Input Register 1 Input Register 2 Internal Bus Address Bus Data Bus Addr. $ 000 001 002 Program Counter
More informationModule 5 - CPU Design
Module 5 - CPU Design Lecture 1 - Introduction to CPU The operation or task that must perform by CPU is: Fetch Instruction: The CPU reads an instruction from memory. Interpret Instruction: The instruction
More informationWhat Are The Main Differences Between Program Counter Pc And Instruction Register Ir
What Are The Main Differences Between Program Counter Pc And Instruction Register Ir and register-based instructions - Anatomy on a CPU - Program Counter (PC): holds memory address of next instruction
More informationComputer Architecture and Organization (CS-507)
Computer Architecture and Organization (CS-507) Muhammad Zeeshan Haider Ali Lecturer ISP. Multan ali.zeeshan04@gmail.com https://zeeshanaliatisp.wordpress.com/ Lecture 4 Basic Computer Function, Instruction
More informationCISC Processor Design
CISC Processor Design Virendra Singh Indian Institute of Science Bangalore virendra@computer.org Lecture 3 SE-273: Processor Design Processor Architecture Processor Architecture CISC RISC Jan 21, 2008
More informationCO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19
CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be
More informationby I.-C. Lin, Dept. CS, NCTU. Textbook: Operating System Concepts 8ed CHAPTER 13: I/O SYSTEMS
by I.-C. Lin, Dept. CS, NCTU. Textbook: Operating System Concepts 8ed CHAPTER 13: I/O SYSTEMS Chapter 13: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests
More informationPIPELINING: HAZARDS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah
PIPELINING: HAZARDS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 1 submission deadline: Jan. 30 th This
More informationARM Processors ARM ISA. ARM 1 in 1985 By 2001, more than 1 billion ARM processors shipped Widely used in many successful 32-bit embedded systems
ARM Processors ARM Microprocessor 1 ARM 1 in 1985 By 2001, more than 1 billion ARM processors shipped Widely used in many successful 32-bit embedded systems stems 1 2 ARM Design Philosophy hl h Low power
More informationComputer Architecture. Lecture 6.1: Fundamentals of
CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and
More informationDEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING QUESTION BANK
DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING QUESTION BANK SUBJECT : CS6303 / COMPUTER ARCHITECTURE SEM / YEAR : VI / III year B.E. Unit I OVERVIEW AND INSTRUCTIONS Part A Q.No Questions BT Level
More informationWhere Does The Cpu Store The Address Of The
Where Does The Cpu Store The Address Of The Next Instruction To Be Fetched The three most important buses are the address, the data, and the control buses. The CPU always knows where to find the next instruction
More information18-349: Embedded Real-Time Systems Lecture 2: ARM Architecture
18-349: Embedded Real-Time Systems Lecture 2: ARM Architecture Anthony Rowe Electrical and Computer Engineering Carnegie Mellon University Basic Computer Architecture Embedded Real-Time Systems 2 Memory
More informationTopics in computer architecture
Topics in computer architecture Sun Microsystems SPARC P.J. Drongowski SandSoftwareSound.net Copyright 1990-2013 Paul J. Drongowski Sun Microsystems SPARC Scalable Processor Architecture Computer family
More informationComplex Pipelines and Branch Prediction
Complex Pipelines and Branch Prediction Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. L22-1 Processor Performance Time Program Instructions Program Cycles Instruction CPI Time Cycle
More informationLecture 15: Pipelining. Spring 2018 Jason Tang
Lecture 15: Pipelining Spring 2018 Jason Tang 1 Topics Overview of pipelining Pipeline performance Pipeline hazards 2 Sequential Laundry 6 PM 7 8 9 10 11 Midnight Time T a s k O r d e r A B C D 30 40 20
More informationAdvanced Parallel Architecture Lesson 3. Annalisa Massini /2015
Advanced Parallel Architecture Lesson 3 Annalisa Massini - 2014/2015 Von Neumann Architecture 2 Summary of the traditional computer architecture: Von Neumann architecture http://williamstallings.com/coa/coa7e.html
More informationRunning Applications
Running Applications Computer Hardware Central Processing Unit (CPU) CPU PC IR MAR MBR I/O AR I/O BR To exchange data with memory Brain of Computer, controls everything Few registers PC (Program Counter):
More informationBasic concepts UNIT III PIPELINING. Data hazards. Instruction hazards. Influence on instruction sets. Data path and control considerations
UNIT III PIPELINING Basic concepts Data hazards Instruction hazards Influence on instruction sets Data path and control considerations Performance considerations Exception handling Basic Concepts It is
More informationFINALTERM EXAMINATION Fall 2008 CS501- Advance Computer Architecture (Session - 1) Marks: 75
FINALTERM EXAMINATION Fall 2008 CS501- Advance Computer Architecture (Session - 1) Marks: 75 Question No: 1 ( Marks: 1 ) - Please choose one Which one of the following is the memory organization of SRC
More informationCPU ARCHITECTURE. QUESTION 1 Explain how the width of the data bus and system clock speed affect the performance of a computer system.
CPU ARCHITECTURE QUESTION 1 Explain how the width of the data bus and system clock speed affect the performance of a computer system. ANSWER 1 Data Bus Width the width of the data bus determines the number
More informationInterrupt/Timer/DMA 1
Interrupt/Timer/DMA 1 Exception An exception is any condition that needs to halt normal execution of the instructions Examples - Reset - HWI - SWI 2 Interrupt Hardware interrupt Software interrupt Trap
More informationADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-11: 80x86 Architecture
ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-11: 80x86 Architecture 1 The 80x86 architecture processors popular since its application in IBM PC (personal computer). 2 First Four generations
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationSAE5C Computer Organization and Architecture. Unit : I - V
SAE5C Computer Organization and Architecture Unit : I - V UNIT-I Evolution of Pentium and Power PC Evolution of Computer Components functions Interconnection Bus Basics of PCI Memory:Characteristics,Hierarchy
More information