# Module 2: Processor architecture

Introduction to computers II

José Manuel Mendías Cuadros
Dpto. Arquitectura de Computadores y Automática
Universidad Complutense de Madrid

## Outline

- The RISC-V architecture.
- Instructions and data.
- Memory model.
- Registers.
- Addressing modes.
- Instruction set.
- Extensions.
- RISC vs. CISC architectures.

These slides are based on:

- S.L. Harris and D. Harris. Digital Design and Computer Architecture. RISC-V Edition.
- D.A. Patterson and J.L. Hennessy. Computer Organization and Design. RISC-V Edition.

## The RISC-V architecture

- The RISC-V ISA (Instruction Set Architecture) is an architecture that is:
  - Open, non-proprietary and still evolving.
  - Originally designed at UC Berkeley in 2010.
  - Currently coordinated by the RISC-V International consortium.
- It is a RISC-type architecture (Reduced Instruction Set Computer), and thus:
  - It has a reduced set of simple instructions.
    - Only the load and store instructions can access memory.
    - The rest of the instructions work with data stored in registers.
  - It has a large number of general-purpose registers.
  - It has a reduced set of addressing modes.
  - Instructions have a fixed size, with a reduced number of formats.
- We will study the RV32I base set with the RVM extension.
  - 32-bit integer data and 32-bit instructions (RV32I).
  - With integer multiplication and division (RVM).

## Instructions and data

- All RISC-V instructions have 32 bits.
- RISC-V instructions operate with 32-bit data or addresses.
  - Data are integer numbers (signed) or natural numbers (unsigned), encoded in two's complement or pure binary, respectively.
  - Addresses are natural numbers encoded in pure binary.
- However, the processor can also work with narrower numbers:
  - Typically, they are extended to 32 bits before operating with them.
  - Depending on the case, they are sign-extended (sExt) or zero-extended (zExt).
- The most common data sizes are:
  - Word: 32 bits.
  - Half word: 16 bits.
  - Byte: 8 bits.

## Memory model

- It consists of a 4-GiB RAM main memory ($2^{32} \times 8\text{b} = 2^{30} \times 32\text{b}$):
  - 32-bit data and address buses.
  - Byte-addressable (each byte has a unique address).
- It contains 8, 16 and 32-bit data and 32-bit instructions.
  - All of them are aligned and follow a little-endian organization.

### Alignment

- In the RISC-V memory, information is aligned, i.e., there are location constraints depending on its size.
  - Byte: it can be located at any address.
  - Half word: it can only be located at multiple-of-2 (even) addresses.
  - Word: it can only be located at multiple-of-4 addresses.
    - This applies to 32-bit data and instructions.
  - In general, N-byte data must be located at multiple-of-N addresses.
- When data of different sizes are stored consecutively, empty gaps are created.

| Size   | Data       |
|--------|------------|
| byte   | 0x24       |
| word   | 0x3b257a02 |
| ½ word | 0x3e27     |
| word   | 0x01c6d823 |

RISC-V: aligned data

| Addr.    | Contents (+0 to +3)          |
|----------|------------------------------|
| 3c000000 | 24, then 3 unused bytes      |
| 3c000004 | 3b257a02                     |
| 3c000008 | 3e27, then 2 unused bytes    |
| 3c00000c | 01c6d823                     |

Not-aligned data

| Addr.    | Contents (+0 to +3)                                        |
|----------|------------------------------------------------------------|
| 3c000000 | 24, then the first 3 bytes of 3b257a02                     |
| 3c000004 | last byte of 3b257a02, then 3e27, then first byte of 01c6d823 |
| 3c000008 | last 3 bytes of 01c6d823, then 1 unused byte               |
| 3c00000c | unused                                                     |

### Organization

- In the RISC-V memory, the bytes of a word or half word follow a little-endian organization:
  - The least significant byte is located at the lowest address, i.e., the address of the data coincides with the address of its least significant byte.
  - Bits within each byte keep the usual organization.
- Other processors can follow a big-endian organization:
  - The most significant byte is located at the lowest address, i.e., the address of the data coincides with the address of its most significant byte.
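The two byte orders and the alignment rule can be checked with a small sketch. Python is used here only as a neutral notation (it is not part of the original slides), and the helper names `hex_bytes` and `is_aligned` are illustrative assumptions; the word value is the one used in the tables of this module.

```python
import struct

WORD = 0x3B257A02  # word value taken from the tables of this module

# Pack the same 32-bit word with both byte orders.
little = struct.pack("<I", WORD)  # little-endian, as in RISC-V
big = struct.pack(">I", WORD)     # big-endian, as in some other processors


def hex_bytes(data: bytes) -> list:
    """Bytes listed in increasing address order (+0, +1, +2, +3)."""
    return [f"{b:02x}" for b in data]


def is_aligned(address: int, size: int) -> bool:
    """N-byte data must be located at an address that is a multiple of N."""
    return address % size == 0


print("little-endian:", hex_bytes(little))  # ['02', '7a', '25', '3b']
print("big-endian:   ", hex_bytes(big))     # ['3b', '25', '7a', '02']
print(is_aligned(0x3C000004, 4))            # True: valid word address
print(is_aligned(0x3C000001, 2))            # False: odd address for a half word
```

In the little-endian case the byte at offset +0 is the least significant one (`02`), which is the layout described for RISC-V.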
| Size   | Data       |
|--------|------------|
| byte   | 0x24       |
| word   | 0x3b257a02 |
| ½ word | 0x3e27     |
| word   | 0x01c6d823 |

RISC-V: little-endian

| Addr.    | +0 | +1 | +2 | +3 |
|----------|----|----|----|----|
| 3c000000 | 24 |    |    |    |
| 3c000004 | 02 | 7a | 25 | 3b |
| 3c000008 | 27 | 3e |    |    |
| 3c00000c | 23 | d8 | c6 | 01 |

Big-endian

| Addr.    | +0 | +1 | +2 | +3 |
|----------|----|----|----|----|
| 3c000000 | 24 |    |    |    |
| 3c000004 | 3b | 25 | 7a | 02 |
| 3c000008 | 3e | 27 |    |    |
| 3c00000c | 01 | c6 | d8 | 23 |

## Registers

- All data in a program are stored in memory, but in order to be used by RISC-V they must first be loaded into registers.
- RISC-V has 32 general-purpose registers, each one with 32 bits.
  - They can be used interchangeably.
  - They are numbered from x0 to x31.
  - The x0 register always contains the constant 0, and any write to this register has no effect.
  - However, in order to simplify programming, each register has an alias that recalls its conventional use.
- Besides, RISC-V has a special register, the PC (Program Counter).
  - It contains the memory address of the instruction being executed.
  - After this instruction is executed, the PC is incremented by 4 (each instruction takes 4 bytes),
  - except if the executed instruction is a branch.

| Register | Alias  | Description                    |
|----------|--------|--------------------------------|
| x0       | zero   | zero                           |
| x1       | ra     | return address                 |
| x2       | sp     | stack pointer                  |
| x3       | gp     | global pointer                 |
| x4       | tp     | thread pointer                 |
| x5-x7    | t0-t2  | temporary registers            |
| x8       | s0/fp  | saved register / frame pointer |
| x9       | s1     | saved register                 |
| x10-x17  | a0-a7  | argument registers             |
| x18-x27  | s2-s11 | saved registers                |
| x28-x31  | t3-t6  | temporary registers            |

## Addressing modes

- The addressing modes are the mechanisms that indicate where the instruction operands are located.
  - They indicate the location of the data and how to get them.
- The instruction operands can be located in:
  - The instruction itself.
  - A processor register, indicating which one.
  - The computer memory, indicating its memory address.
- There are only 4 addressing modes in RISC-V:
  - Immediate addressing: the operand is a constant located in the instruction.
  - Register addressing: the operand is located in a processor register.
  - Base addressing: the operand is located in memory.
    - Its address is obtained by adding the content of a base register and an offset.
  - PC-relative addressing: the operand is an address (branch target).
    - It is obtained by adding the content of the PC and an offset.

### Immediate addressing

- The operand is a constant contained in the instruction.
  - The constant is written explicitly in the assembly instruction.
  - The machine instruction has a field where the constant is stored.
- Since instructions take 32b and the immediate operands are contained in them, constants always have a smaller width:
  - Unsigned 5-bit immediate: used without extension.
  - Signed 12/13-bit immediate: extended to 32 bits before use.
    - If the constant has 13 bits, the instruction only stores the 12 most significant bits.
  - 20-bit immediate: used without extension, but shifted.
  - 21-bit immediate: extended to 32 bits before use.
    - The instruction only stores the 20 most significant bits.

### Register addressing

- The operand is stored in a processor register.
  - The register name is written in the assembly instruction.
  - The machine instruction has a field that holds the register number.

### Base addressing

- The operand is located in memory. Its address is obtained by adding the content of a processor register (base register) and an offset (constant) indicated in the instruction.
  - The offset and the register are written in the assembly instruction.
  - The machine instruction has fields for both elements.
- A particular case of this addressing mode is when the operand is the calculated address itself.
  - It is used in branch instructions (jalr).
  - The branch address is calculated in the same way: adding the content of a register and an offset contained in the instruction.

### PC-relative addressing

- The operand is an address (branch target), obtained by adding the content of the PC and an offset.
  - Only the offset is written explicitly in the assembly instruction.
  - The machine instruction has a field that holds the offset.

### About relative addressing

- There is no absolute (direct) addressing in RISC-V because relative addressing (PC or base) is more convenient in most cases.
  - Absolute addressing: the instruction contains the explicit memory address where the data/instruction is located.
  - Absolute addressing requires 32 bits to indicate the address.
  - In relative addressing only the difference between two addresses is indicated, which usually requires fewer bits to be encoded.
- Offsets can be short immediates because:
  - For instructions, the usual case is to branch to nearby addresses,
    - which implies a short offset from the PC.
  - For data, these are usually gathered in an adjacent memory region.
    - If the start address of this region is stored in a base register, all data can be accessed with small offsets relative to that base register.
- Besides, PC-relative addressing allows relocatable code:
  - The calculation of branch addresses is always correct, regardless of the memory address at which the program is located.

## Instruction set

### Concepts and types of instructions

- The instruction set is composed of all the instructions that can be executed by a processor.
  - All programs executed by a computer are sequences of instructions that belong to the same set.
- Instructions are classified into different types:
  - Data transfer: they copy data between the registers and the memory.
  - Arithmetic: they perform arithmetic operations.
  - Logical: they perform bitwise logical operations.
  - Shift: they perform bit shift operations.
  - Branch: they break the execution order by modifying the PC.
  - Privileged: they allow access to functionality that controls the system.
- The RISC-V instruction set:
  - Is very small, in order to avoid redundancy.
  - Instructions and addressing modes are strongly coupled.

### Arithmetic (i)

- They perform arithmetic operations with 2 source operands and 1 destination operand, all of them with 32 bits.
  - The left operand is always in a register.
  - The right operand is either in a register or is a short immediate.
    - The immediate constant takes 12 bits in two's complement (C2), in the [-2048, +2047] range, and is sign-extended to 32b before being used.
  - The result is always stored in a register.
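The sign extension of a 12-bit immediate described above can be modeled as follows. This is a minimal sketch in Python, used only as a modeling notation; the helper names `sext` and `addi` are assumptions made for this example and registers are represented as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF  # keeps results within 32 bits, like a RISC-V register


def sext(value: int, bits: int) -> int:
    """Sign-extend a `bits`-wide two's complement field to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def addi(rs1: int, imm12: int) -> int:
    """Model of `addi rd, rs1, imm`: rd <- rs1 + sExt(imm), wrapped to 32 bits."""
    return (rs1 + sext(imm12, 12)) & MASK32


print(sext(0x7FF, 12))          # 2047, the largest 12-bit immediate
print(sext(0x800, 12))          # -2048, the smallest 12-bit immediate
print(addi(5, 0xFFF))           # 4, because 0xFFF encodes -1
print(hex(addi(0, 0x800)))      # 0xfffff800, i.e. -2048 as a 32-bit pattern
```

The last line shows why the extension matters: the 12-bit pattern `0x800` becomes the 32-bit pattern `0xfffff800`, not `0x00000800`.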
### Arithmetic (ii)

| Instruction                        | Operation                                                       | Description                         |
|------------------------------------|-----------------------------------------------------------------|-------------------------------------|
| add rd, rs1, rs2                   | rd ← rs1 + rs2                                                  | add                                 |
| sub rd, rs1, rs2                   | rd ← rs1 - rs2                                                  | subtract                            |
| slt rd, rs1, rs2                   | rd ← if (rs1 <<sub>S</sub> rs2) then (1) else (0)               | set if less than (signed)           |
| sltu rd, rs1, rs2                  | rd ← if (rs1 <<sub>U</sub> rs2) then (1) else (0)               | set if less than unsigned           |
| addi rd, rs1, imm<sub>12b</sub>    | rd ← rs1 + sExt(imm)                                            | add immediate                       |
| slti rd, rs1, imm<sub>12b</sub>    | rd ← if (rs1 <<sub>S</sub> sExt(imm)) then (1) else (0)         | set if less than immediate (signed) |
| sltiu rd, rs1, imm<sub>12b</sub>   | rd ← if (rs1 <<sub>U</sub> sExt(imm)) then (1) else (0)         | set if less than immediate unsigned |

- There are different comparison instructions for signed and unsigned data.
- There is no subtraction with an immediate operand, since it can be performed by adding the opposite value.

### Multiplication and division (i)

- There are two different types of multiplication instructions: one to calculate the upper part of the result and another one to calculate the lower part.
  - There are different instructions to obtain the upper part of the result depending on whether the source operands are signed or unsigned.
  - There is only one instruction to obtain the lower part of the result.
- All the operands in these instructions are located in registers.

### Multiplication and division (ii)

- The integer division of two 32-bit operands produces two results: the quotient and the remainder, both with 32 bits.
  - For that reason, there are two different instructions: one to obtain the quotient and another one to obtain the remainder.
  - Each one has variants to operate with signed and unsigned data.
- All the operands in these instructions are located in registers.

### Multiplication and division (iii)

These instructions are not in the RV32I set, but are part of the RVM extension.

### Logical (i)

- They perform bitwise logical operations with 2 source operands and 1 destination operand, all of them with 32 bits.
  - The left operand is always in a register.
  - The right operand is either in a register or is a short immediate.
    - The immediate constant takes 12 bits in two's complement (C2) and is sign-extended to 32b.
  - The result is always stored in a register.

### Logical (ii)

- The bitwise logical operations are used to manipulate individual bits within data.
  - One operand contains the data to manipulate.
  - Another operand contains a mask that indicates the bits to change.
  - Different operations are used depending on the required manipulation:
    - The or instruction sets (=1) those data bits whose corresponding mask bits are 1.
    - The and instruction resets (=0) those data bits whose corresponding mask bits are 0.
    - The xor instruction toggles (1 ↔ 0) those data bits whose corresponding mask bits are 1.

| Operand | and                                 | xor                                 |
|---------|-------------------------------------|-------------------------------------|
| data    | 00001111 01010000 00001010 00110100 | 00001111 01010000 00001010 00110100 |
| mask    | 00000000 00000000 00000000 11111111 | 00000000 00000000 00000000 11111111 |
| result  | 00000000 00000000 00000000 00110100 | 00001111 01010000 00001010 11001011 |

### Logical (iii)

| Instruction                       | Operation             | Description   |
|-----------------------------------|-----------------------|---------------|
| and rd, rs1, rs2                  | rd ← rs1 & rs2        | and           |
| or rd, rs1, rs2                   | rd ← rs1 \| rs2       | or            |
| xor rd, rs1, rs2                  | rd ← rs1 ^ rs2        | xor           |
| andi rd, rs1, imm<sub>12b</sub>   | rd ← rs1 & sExt(imm)  | and immediate |
| ori rd, rs1, imm<sub>12b</sub>    | rd ← rs1 \| sExt(imm) | or immediate  |
| xori rd, rs1, imm<sub>12b</sub>   | rd ← rs1 ^ sExt(imm)  | xor immediate |

### Shift (i)

- They shift the bits of a source operand by the number of positions indicated by another operand, and write the result in a destination operand.
  - The left operand (32b) is always in a register.
  - The right operand (5b) is either in a register or is a short immediate.
    - If it is a register, only its 5 least significant bits are used.
    - If it is an immediate, the constant takes 5 bits in pure binary and is not extended.
  - The result is always stored in a register.

### Shift (ii)

- Logical shift instructions insert 0s on one side of the data and discard the same number of bits on the other side.
  - This allows rescaling unsigned data:
    - Left shifting n bits is the same as multiplying by 2<sup>n</sup>.
    - Right shifting n bits is the same as dividing by 2<sup>n</sup> (integer division).

$$2612 \ll 7 = 2612 \times 2^7 = 334336$$

$$2612 \gg 7 = 2612 \div 2^7 = 20$$

  - Combined with a bitwise logical operation, this allows extracting fields from data.

### Shift (iii)

- Right arithmetic shift instructions propagate the sign bit on the left and discard the rightmost bits.
  - This allows rescaling signed data. For example, shifting 0xffff1a34 (-58828) right arithmetically by 7 bits with srai gives 0xfffffe34 (-460):

$$-58828 \ggg 7 = -58828 \div 2^7 = -460 \quad \text{(rounded toward } -\infty\text{)}$$

- There is no instruction for left arithmetic shift.
  - For valid results (the rescaled signed data can be represented with 32b), the logical left shift is equivalent.

### Shift (iv)

| Instruction                      | Operation                         | Description                      |
|----------------------------------|-----------------------------------|----------------------------------|
| sll rd, rs1, rs2                 | rd ← rs1 << rs2<sub>4:0</sub>     | shift left logical               |
| srl rd, rs1, rs2                 | rd ← rs1 >> rs2<sub>4:0</sub>     | shift right logical              |
| sra rd, rs1, rs2                 | rd ← rs1 >>> rs2<sub>4:0</sub>    | shift right arithmetic           |
| slli rd, rs1, imm<sub>5b</sub>   | rd ← rs1 << imm                   | shift left logical immediate     |
| srli rd, rs1, imm<sub>5b</sub>   | rd ← rs1 >> imm                   | shift right logical immediate    |
| srai rd, rs1, imm<sub>5b</sub>   | rd ← rs1 >>> imm                  | shift right arithmetic immediate |

### Data transfer: load

- They copy data from memory to a register.
  - They use base addressing to indicate the memory address of the data.
    - This address is the sum of a base address and an offset.
    - The base address is in a register.
    - The offset is a 12-bit C2 immediate, sign-extended to 32b.
  - The data read from memory are loaded into a register.

### Data transfer: store

- They copy data from a register to memory.
  - The data are in a register.
  - They use base addressing to indicate the memory address where the data are stored.
    - This address is the sum of a base address and an offset.
    - The base address is in a register.
    - The offset is a 12-bit C2 immediate, sign-extended to 32b.

### Data transfer (i)

| Instruction                       | Operation                                                            | Description        |
|-----------------------------------|----------------------------------------------------------------------|--------------------|
| lw rd, imm<sub>12b</sub>(rs1)     | rd ← Mem[rs1 + sExt(imm)]                                            | load word          |
| lh rd, imm<sub>12b</sub>(rs1)     | rd ← sExt(Mem[rs1 + sExt(imm)]<sub>15:0</sub>)                       | load half signed   |
| lhu rd, imm<sub>12b</sub>(rs1)    | rd ← zExt(Mem[rs1 + sExt(imm)]<sub>15:0</sub>)                       | load half unsigned |
| lb rd, imm<sub>12b</sub>(rs1)     | rd ← sExt(Mem[rs1 + sExt(imm)]<sub>7:0</sub>)                        | load byte signed   |
| lbu rd, imm<sub>12b</sub>(rs1)    | rd ← zExt(Mem[rs1 + sExt(imm)]<sub>7:0</sub>)                        | load byte unsigned |
| sw rs2, imm<sub>12b</sub>(rs1)    | Mem[rs1 + sExt(imm)] ← rs2                                           | store word         |
| sh rs2, imm<sub>12b</sub>(rs1)    | Mem[rs1 + sExt(imm)]<sub>15:0</sub> ← rs2<sub>15:0</sub>             | store half         |
| sb rs2, imm<sub>12b</sub>(rs1)    | Mem[rs1 + sExt(imm)]<sub>7:0</sub> ← rs2<sub>7:0</sub>               | store byte         |

- There are different instructions to copy 8b, 16b or 32b data.
- There are also different instructions for sign extension (signed) or zero extension (unsigned).

### Data transfer (ii)

Assuming that the word stored at the address contained in x1 is 0x8b257a02:

| Instruction   | Operation                                          | Result                                                   |
|---------------|----------------------------------------------------|----------------------------------------------------------|
| lw x2, 0(x1)  | x2 ← Mem[x1 + sExt(0)]                             | loads 8b257a02 into x2                                   |
| lhu x2, 0(x1) | x2 ← zExt(Mem[x1 + sExt(0)]<sub>15:0</sub>)        | loads 00007a02 into x2 (7a02 =<sub>2</sub> +31234)       |
| lhu x2, 2(x1) | x2 ← zExt(Mem[x1 + sExt(2)]<sub>15:0</sub>)        | loads 00008b25 into x2 (8b25 =<sub>2</sub> +35621)       |
| lh x2, 0(x1)  | x2 ← sExt(Mem[x1 + sExt(0)]<sub>15:0</sub>)        | loads 00007a02 into x2 (7a02 =<sub>C2</sub> +31234)      |
| lh x2, 2(x1)  | x2 ← sExt(Mem[x1 + sExt(2)]<sub>15:0</sub>)        | loads ffff8b25 into x2 (8b25 =<sub>C2</sub> -29915)      |
| lbu x2, 3(x1) | x2 ← zExt(Mem[x1 + sExt(3)]<sub>7:0</sub>)         | loads 0000008b into x2 (8b =<sub>2</sub> +139)           |
| lb x2, 3(x1)  | x2 ← sExt(Mem[x1 + sExt(3)]<sub>7:0</sub>)         | loads ffffff8b into x2 (8b =<sub>C2</sub> -117)          |
| lh x2, 3(x1)  | x2 ← sExt(Mem[x1 + sExt(3)]<sub>15:0</sub>)        | alignment error                                          |

### Conditional branch (i)

- They allow breaking the execution sequence by branching to a nearby address when a certain condition is met.
  - They compare two source operands located in registers.
  - They use PC-relative addressing to indicate the new PC address in case the condition is met.
    - This address is the sum of the PC content and a short offset.
    - The offset is a 13-bit C2 constant, sign-extended to 32b.

### Conditional branch (ii)

Conditional branch instructions are used to implement control structures (e.g. if, while, for...) in assembly.

```
beq  x5, x6, 8      # at address dir:   if (a == b) branch to dir+8
addi x5, x5, 1      # at address dir+4: a = a + 1
sub  x7, x7, x6     # at address dir+8: c = c - b
...
```

When beq is executed, the PC contains its address; adding 8 to the PC, the program branches to sub (only if the comparison is true).

Assignment of the C variables to the RISC-V registers: a → x5, b → x6, c → x7.

- The immediate offset (that is added to the PC to perform the branch):
  - It is signed, and therefore it allows forward or backward branches.
  - Since it has 13 bits and each instruction takes 4B, it allows branching up to 1024 instructions backwards and 1023 forwards from the branch instruction.
    - A 13-bit C2 constant is in the [-4096, +4095] range.
  - Its 2 least significant bits are 0, because all the instructions are aligned.
    - In fact, the least significant bit is not stored in the instruction.

### Conditional branch (iii)

| Instruction                         | Operation                                                                                 | Description                              |
|-------------------------------------|-------------------------------------------------------------------------------------------|------------------------------------------|
| beq rs1, rs2, imm<sub>13b</sub>     | if (rs1 = rs2) then (PC ← PC + sExt(imm<sub>12:1</sub> << 1))                             | branch if equal                          |
| bne rs1, rs2, imm<sub>13b</sub>     | if (rs1 ≠ rs2) then (PC ← PC + sExt(imm<sub>12:1</sub> << 1))                             | branch if not equal                      |
| blt rs1, rs2, imm<sub>13b</sub>     | if (rs1 <<sub>S</sub> rs2) then (PC ← PC + sExt(imm<sub>12:1</sub> << 1))                 | branch if less than (signed)             |
| bge rs1, rs2, imm<sub>13b</sub>     | if (rs1 ≥<sub>S</sub> rs2) then (PC ← PC + sExt(imm<sub>12:1</sub> << 1))                 | branch if greater than or equal (signed) |
| bltu rs1, rs2, imm<sub>13b</sub>    | if (rs1 <<sub>U</sub> rs2) then (PC ← PC + sExt(imm<sub>12:1</sub> << 1))                 | branch if less than unsigned             |
| bgeu rs1, rs2, imm<sub>13b</sub>    | if (rs1 ≥<sub>U</sub> rs2) then (PC ← PC + sExt(imm<sub>12:1</sub> << 1))                 | branch if greater than or equal unsigned |

- There are different instructions to compare signed and unsigned data.
- There are no branch instructions with an immediate operand.
- There are no "greater than" or "less than or equal" branches, because these can be implemented using the other instructions and swapping the operands.

### Branch to function: jal

- It allows breaking the execution sequence by branching to a faraway address, while saving the return address.
  - It uses PC-relative addressing to indicate the new PC address.
    - This address is the sum of the PC content and a long offset.
    - The offset is a 21-bit C2 constant, sign-extended to 32b.
  - The address of the next instruction (return address) is saved in a register.

### Branch to function: jalr

- It allows breaking the execution sequence by branching to a faraway address, while saving the return address.
  - It uses base addressing to indicate the new PC address.
    - This address is the sum of the content of a base register and a short offset.
    - The offset is a 12-bit C2 constant, sign-extended to 32b.
  - The address of the next instruction (return address) is saved in a register.

### Branch to function (i)

- Branch to function instructions are used to implement function calls in assembly.
  - Each of the functions in a program is located at a different memory address.
  - Calling a function means branching to the address of its first instruction.
  - Returning from a function means branching to the address of the instruction following the one that made the function call.

Figure (code not recovered): a function call in C language and its translation to RISC-V assembly, with the C variables assigned to the RISC-V registers as a → x5, b → x6, c → x7. When jal is executed, the PC contains its address; PC+4 is saved in x1 and 1000 is added to the PC in order to branch to the function.
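The call and return mechanism described above can be sketched as a small model. Python is used only as a modeling notation; the function names `jal` and `jalr`, the address `call_site` and the assumption that the function body is 10 instructions long are all hypothetical and chosen for illustration. The full RISC-V specification additionally clears bit 0 of the jalr target, which is omitted here to follow the simplified description used in this module.

```python
MASK32 = 0xFFFFFFFF


def sext(value: int, bits: int) -> int:
    """Sign-extend a `bits`-wide two's complement field."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def jal(pc: int, imm21: int):
    """PC-relative jump: returns (new PC, link value = address of next instruction)."""
    return (pc + sext(imm21, 21)) & MASK32, (pc + 4) & MASK32


def jalr(pc: int, rs1: int, imm12: int):
    """Base-addressed jump: returns (new PC, link value)."""
    return (rs1 + sext(imm12, 12)) & MASK32, (pc + 4) & MASK32


call_site = 0x00400100                    # hypothetical address of the jal
function_pc, x1 = jal(call_site, 1000)    # call: PC+4 saved in x1, 1000 added to PC
last_pc = function_pc + 4 * 9             # hypothetical last instruction of the function
return_pc, _ = jalr(last_pc, x1, 0)       # return: branch to the address saved in x1

print(hex(function_pc))  # 0x4004e8
print(hex(x1))           # 0x400104
print(hex(return_pc))    # 0x400104
```

The return lands on the instruction that follows the call, whatever the address of the function is, because the link value depends only on the PC of the jal.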
### Branch to function (ii)

| Instruction                      | Operation                                                      | Description                                                           |
|----------------------------------|----------------------------------------------------------------|-----------------------------------------------------------------------|
| jalr rd, rs1, imm<sub>12b</sub>  | PC ← rs1 + sExt(imm), rd ← PC+4                                | jump and link register: branch to function with base addressing       |
| jal rd, imm<sub>21b</sub>        | PC ← PC + sExt(imm<sub>20:1</sub> << 1), rd ← PC+4             | jump and link: branch to function with PC-relative addressing         |

- In the RISC-V instruction set there are no return or unconditional branch instructions, but these can be implemented using jal and jalr:
  - Assuming that the return address is stored in register xn, the return can be performed as: jalr x0, xn, 0
  - An unconditional (PC-relative) branch to the address of a certain instruction can be performed as: jal x0, imm<sub>21b</sub>

### lui instruction (i)

- It loads a constant into the upper part of a register.
  - The source operand is a 20-bit immediate constant.
    - It is extended to 32b by appending 12 zeros on the right.
  - The result is stored in a register.

| Instruction               | Operation       | Description          |
|---------------------------|-----------------|----------------------|
| lui rd, imm<sub>20b</sub> | rd ← imm << 12  | load upper immediate |

### lui instruction (ii)

- The lui instruction is used to operate with long (32b) constants.
  - Immediate operands in arithmetic-logic instructions are short (12b).
  - Since the addi instruction sign-extends the immediate value, if bit 11 of the long constant is 1, the lui constant has to be increased by 1.

### lui instruction (iii)

- Although it is not very common, the lui instruction can also be used to work with 32b absolute memory addresses.
  - To transfer data located at any absolute memory address.
  - To branch to functions located at any absolute memory address:
```
...
lui   x6, 0x76543        # x6 = 0x76543000
jalr  x1, x6, 0x210      # branches to the instruction located at address 0x76543210
...
```

  - Since the lw, sw and jalr instructions also sign-extend the immediate offset, when bit 11 of the absolute address is 1, the lui constant has to be increased by 1.

### auipc instruction (i)

- It adds a constant to the upper part of the PC.
  - The source operand is a 20-bit immediate constant.
    - It is extended to 32b by appending 12 zeros on the right.
  - The result is stored in a register.

| Instruction                 | Operation               | Description               |
|-----------------------------|-------------------------|---------------------------|
| auipc rd, imm<sub>20b</sub> | rd ← PC + (imm << 12)   | add upper immediate to PC |

### auipc instruction (ii)

- The jal instruction, even using a long offset, only allows branching to addresses in the ±1 MiB range, which sometimes is not enough.
  - Since the offset has 21b and each instruction takes 4B, it only allows branching up to 262,144 instructions backwards and 262,143 forwards.
- The auipc instruction, together with jalr, is used to perform PC-relative branches to functions located at any memory address.
  - The 32b offset covers a ±2 GiB range around the PC, which reaches the full 4-GiB address space.

```
auipc x6, 0x76543        # located at address addr: x6 = addr + 0x76543000
jalr  x1, x6, 0x210      # branches to the instruction located 0x76543210 bytes ahead of addr
```

  - Since the jalr instruction sign-extends the immediate value, when bit 11 of the 32b offset is 1, the auipc constant has to be increased by 1.

### auipc instruction (iii)

- The load and store instructions use base addressing with a short offset.
  - This covers a ±2 KiB range with respect to the base address.
- Combining these instructions with auipc, any memory address (PC-relative) can be accessed.
  - As in the previous cases, when bit 11 of the PC-relative 32b offset is 1, the auipc constant has to be increased by 1.
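The "increase by 1 when bit 11 is set" rule that appears for both lui and auipc can be checked with a short sketch. Python is used only as a modeling notation, and the helper names `split_hi_lo` and `rebuild` are assumptions introduced for this example; they are not assembler directives.

```python
MASK32 = 0xFFFFFFFF


def sext(value: int, bits: int) -> int:
    """Sign-extend a `bits`-wide two's complement field."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def split_hi_lo(value32: int):
    """Split a 32-bit constant or offset into the 20-bit upper immediate
    (for lui/auipc) and the 12-bit lower immediate (for addi/lw/sw/jalr)."""
    lo = value32 & 0xFFF
    hi = (value32 >> 12) & 0xFFFFF
    if lo & 0x800:
        # Bit 11 is set: the low part will be sign-extended to a negative
        # number, so the upper part is increased by 1 to compensate.
        hi = (hi + 1) & 0xFFFFF
    return hi, lo


def rebuild(hi: int, lo: int) -> int:
    """What the instruction pair computes: (hi << 12) + sExt(lo), in 32 bits."""
    return ((hi << 12) + sext(lo, 12)) & MASK32


print([hex(v) for v in split_hi_lo(0x76543210)])  # ['0x76543', '0x210']
print([hex(v) for v in split_hi_lo(0x12345ABC)])  # ['0x12346', '0xabc']
print(hex(rebuild(*split_hi_lo(0x12345ABC))))     # 0x12345abc
```

For `0x76543210` bit 11 is 0 and no correction is needed, which matches the lui/jalr and auipc/jalr examples above; for `0x12345ABC` the upper immediate becomes `0x12346`.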
### Most popular instructions

Popularity can be measured by counting how many times each instruction is executed in a set of standard programs (SPEC CPU2006).

| Instruction | Description                  | Frequency | Accumulated |
|-------------|------------------------------|-----------|-------------|
| lw          | load                         | 19.48%    | 19.48%      |
| addi        | add immediate                | 17.22%    | 36.70%      |
| sw          | store                        | 8.05%     | 44.75%      |
| add         | add                          | 7.57%     | 52.32%      |
| bne         | branch if not equal          | 4.14%     | 56.46%      |
| slli        | shift left logical immediate | 3.65%     | 60.11%      |
| beq         | branch if equal              | 3.27%     | 63.38%      |
| mul         | multiply                     | 2.02%     | 65.40%      |

## RISC-V: ISA and extensions

- RISC-V is an open and flexible architecture.
  - It defines instruction set architectures (ISA) and extensions.
  - Every RISC-V processor must support one of the ISAs and, optionally, some of the extensions.

ISA:

- RV32I: 32-bit instructions and data/addresses.
- RV32E: RV32I version with only 16 registers.
- RV64I: 32-bit instructions and 64-bit data/addresses.
- RV128I: 32-bit instructions and 128-bit data/addresses.

Extensions:

- RVM: includes integer multiplication, division and remainder.
- RVF: includes 32 floating point data registers, as well as floating point arithmetic, relational and conversion operations.
- RVD: 64-bit floating point data version (double precision).
- RVQ: 128-bit floating point data version (quadruple precision).
- RVC: extension with 16-bit compressed instructions.

### Example: RV64I ISA (i)

RV64I ISA:

- 64-bit integer data and 32-bit instructions.
  - It has 32 registers, each one with 64 bits.
- Includes 64b memory transfer instructions.
- Redefines the 32b transfer instructions.
  - Includes a new instruction for 32b unsigned data load.
| Instruction      | Operation                                    | Description       |
|------------------|----------------------------------------------|-------------------|
| ld rd, imm(rs1)  | rd ← Mem[rs1 + sExt(imm<sub>12b</sub>)]      | load double word  |
| sd rs2, imm(rs1) | Mem[rs1 + sExt(imm<sub>12b</sub>)] ← rs2     | store double word |

| Instruction      | Operation                                                           | Description        |
|------------------|---------------------------------------------------------------------|--------------------|
| lw rd, imm(rs1)  | rd ← sExt(Mem[rs1 + sExt(imm<sub>12b</sub>)]<sub>31:0</sub>)        | load signed word   |
| lwu rd, imm(rs1) | rd ← zExt(Mem[rs1 + sExt(imm<sub>12b</sub>)]<sub>31:0</sub>)        | load unsigned word |
| sw rs2, imm(rs1) | Mem[rs1 + sExt(imm<sub>12b</sub>)] ← rs2<sub>31:0</sub>             | store word         |

### Example: RV64I ISA (ii)

RV64I ISA:

- Arithmetic-logic instructions work with 64b data (immediate operands still have 12b, but are extended to 64b).
- Redefines the shift instructions to work with 64b data (they require 6b to indicate the number of bits to shift).

| Instruction       | Operation                          | Description                      |
|-------------------|------------------------------------|----------------------------------|
| sll rd, rs1, rs2  | rd ← rs1 << rs2<sub>5:0</sub>      | shift left logical               |
| srl rd, rs1, rs2  | rd ← rs1 >> rs2<sub>5:0</sub>      | shift right logical              |
| sra rd, rs1, rs2  | rd ← rs1 >>> rs2<sub>5:0</sub>     | shift right arithmetic           |
| slli rd, rs1, imm | rd ← rs1 << imm<sub>6b</sub>       | shift left logical immediate     |
| srli rd, rs1, imm | rd ← rs1 >> imm<sub>6b</sub>       | shift right logical immediate    |
| srai rd, rs1, imm | rd ← rs1 >>> imm<sub>6b</sub>      | shift right arithmetic immediate |

### Example: RV64I ISA (iii)

RV64I ISA:

- Includes arithmetic-logic and shift instructions to work with 32b data (w suffix).
| Instruction        | Operation                                                                     | Description        |
|--------------------|-------------------------------------------------------------------------------|--------------------|
| addw rd, rs1, rs2  | rd ← sExt(rs1<sub>31:0</sub> + rs2<sub>31:0</sub>)                            | add word           |
| subw rd, rs1, rs2  | rd ← sExt(rs1<sub>31:0</sub> - rs2<sub>31:0</sub>)                            | subtract word      |
| addiw rd, rs1, imm | rd ← sExt(rs1<sub>31:0</sub> + sExt(imm<sub>12b</sub>)<sub>31:0</sub>)        | add immediate word |

| Instruction        | Operation                                                  | Description                           |
|--------------------|------------------------------------------------------------|---------------------------------------|
| sllw rd, rs1, rs2  | rd ← sExt(rs1<sub>31:0</sub> << rs2<sub>4:0</sub>)         | shift left logical word               |
| srlw rd, rs1, rs2  | rd ← sExt(rs1<sub>31:0</sub> >> rs2<sub>4:0</sub>)         | shift right logical word              |
| sraw rd, rs1, rs2  | rd ← sExt(rs1<sub>31:0</sub> >>> rs2<sub>4:0</sub>)        | shift right arithmetic word           |
| slliw rd, rs1, imm | rd ← sExt(rs1<sub>31:0</sub> << imm<sub>5b</sub>)          | shift left logical immediate word     |
| srliw rd, rs1, imm | rd ← sExt(rs1<sub>31:0</sub> >> imm<sub>5b</sub>)          | shift right logical immediate word    |
| sraiw rd, rs1, imm | rd ← sExt(rs1<sub>31:0</sub> >>> imm<sub>5b</sub>)         | shift right arithmetic immediate word |

## RISC vs. CISC architectures

- RISC-V is a clear example of a RISC architecture.
  - Other RISC architectures are: PowerPC, DEC Alpha, MIPS, ARM, SPARC...
  - RISC is the predominant paradigm in mobile devices.
    - 75% of the processors have an ARM architecture.
- But there are also other architectures with a different paradigm, called CISC architectures (Complex Instruction Set Computer).
  - They have a large set of complex instructions.
    - Instructions can work with data stored in memory as well as with data in registers.
  - They have a reduced number of registers, some of them with a specific purpose.
  - They have a large number of addressing modes.
  - Instructions have a variable size and many different formats.
  - Examples of CISC architectures are: Motorola 68K, Intel x86, AMD x86-64...
  - CISC is the predominant architecture in personal computers.

## RISC vs. CISC architectures

### x86 architecture (i)

- Introduced by Intel in 1978 in the 8086 and 8088 microprocessors, it has evolved through several generations.
- Used by personal computers since the launch of the IBM PC in 1981.

| Feature | RISC-V (RV32I) | x86 |
|------------------------|-----------------------------|------------------------------------------|
| Number of registers | 32 general purpose | 8, with some use restrictions |
| Number of operands | 3 (2 source, 1 destination) | 2 (1 source, 1 source/destination) |
| Operand location | Registers or immediate | Registers, immediate or memory |
| Operand size | 32 bits | 8, 16 or 32 bits |
| Condition flags | No | Yes |
| Number of instructions | Reduced | Large |
| Type of instructions | Simple | Simple and complex |
| Instruction encoding | Fixed: 4 bytes/instruction | Variable: from 1 to 15 bytes/instruction |

## RISC vs. CISC architectures

### x86 architecture (ii)

| Instruction | Operation | Adds with |
|----------------------------|-----------------------------------------|-------------------------------------------------|
| `add AH, BL` | $AH \leftarrow AH + BL$ | 8b registers |
| `add AX, -1` | $AX \leftarrow AX + 0xffff$ | 16b immediate |
| `add EAX, EBX` | $EAX \leftarrow EAX + EBX$ | 32b registers |
| `add EAX, 42` | $EAX \leftarrow EAX + 0x0000002a$ | 32b immediate |
| `add EAX, [20]` | $EAX \leftarrow EAX + Mem[20]$ | absolute addressing |
| `add EAX, [ESP]` | $EAX \leftarrow EAX + Mem[ESP]$ | base addressing |
| `add EAX, [EDX+40]` | $EAX \leftarrow EAX + Mem[EDX+40]$ | base addressing with offset |
| `add EAX, [60+EDI*4]` | $EAX \leftarrow EAX + Mem[60+EDI*4]$ | scaled index register with offset |
| `add EAX, [EDX+80+EDI*4]` | $EAX \leftarrow EAX + Mem[EDX+80+EDI*4]$ | base register, offset and scaled index register |
| `add [20], EAX` | $Mem[20] \leftarrow Mem[20] + EAX$ | 32b register, result stored in memory |
| `add [20], 42` | $Mem[20] \leftarrow Mem[20] + 42$ | immediate, result stored in memory |

## **About Creative Commons**

### CC license (Creative Commons)

This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. If you remix, adapt, or build upon the material, you must license the modified material under identical terms:

- Attribution: Credit must be given to the creator.
- Non commercial: Only noncommercial uses of the work are permitted.
- Share alike: Adaptations must be shared under the same terms.

More information: https://creativecommons.org/licenses/by-nc-sa/4.0/