This section is not intended to be comprehensive; it is a starter for newcomers to the world of Intel Assembly. The information here will therefore never be complete and should be supplemented with other texts. It also focuses primarily on the capabilities of the original 8086 itself, and thus ignores most of the extensions to the instruction set that were added in subsequent generations of the '86 series of microprocessors.
Name | Low 8 bits | High 8 bits | 16 bit | 32 bit |
General Purpose Registers | | | | |
Accumulator | AL | AH | AX | EAX |
Base | BL | BH | BX | EBX |
Counter | CL | CH | CX | ECX |
Data | DL | DH | DX | EDX |
String Indices | | | | |
Source Index | | | SI | ESI |
Destination Index | | | DI | EDI |
Stack Pointers | | | | |
Stack Pointer | | | SP | ESP |
Base Pointer | | | BP | EBP |
Segments | | | | |
Code Segment | | | CS | |
Data Segment | | | DS | |
Extra Segment | | | ES | |
Stack Segment | | | SS | |
Although for the most part the 4 primary registers and their 8 half registers are referred to as general purpose registers, they do have some special usages, as specified below.
The 8086 and subsequent chips support a series of instructions referred to as String Instructions. An incomplete list would be movsb, lodsb, stosb, cmpsb, and so on. The last letter of the instruction name gives the operand size: b for a byte (8 bits) or w for a word (16 bits), as in movsw.
You might be wondering why these are referred to as "String" instructions. Generally, strings are what we call arbitrary-length arrays of characters, like "Hello, world!" That is exactly what these instructions are designed for: arbitrary-length chunks of data. To automatically loop over an entire block of data to copy it, set it to an initial value, or even compare it against another block of data, put the "rep" prefix in front of the instruction and set CX to the number of iterations. The processor will repeat that single instruction, decrementing CX each time, until CX equals 0. (For the compare instructions, the repe and repne forms of the prefix also stop early, on the first mismatch or the first match respectively.)
rep movsb
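To make the looping concrete, here is a rough Python model of what `rep movsb` does. It is only a sketch, not assembly: it assumes the direction flag is clear (so SI and DI count upward), it treats memory as one flat bytearray and ignores the DS and ES segments, and the function name `rep_movsb` is made up for illustration.

```python
def rep_movsb(memory, si, di, cx):
    """Model of `rep movsb`: copy CX bytes from [SI] to [DI], one per iteration."""
    while cx != 0:
        memory[di] = memory[si]      # movsb: copy a single byte
        si = (si + 1) & 0xFFFF       # SI and DI advance (direction flag assumed clear)
        di = (di + 1) & 0xFFFF
        cx -= 1                      # rep: decrement CX and stop once it reaches 0
    return si, di, cx


memory = bytearray(64)
memory[0:13] = b"Hello, world!"

# Copy the 13 byte string from offset 0 to offset 32.
si, di, cx = rep_movsb(memory, si=0, di=32, cx=13)
```

After the call, CX is 0 and both index registers point one byte past the data they just handled, which is why several string instructions can be chained back to back.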
In general, programmers split computer memory into two logical blocks: the Heap and the Stack. The Heap grows up from the bottom of memory and the Stack grows down. The Heap is used for long term memory allocation, and the Stack is used for short term data storage, often holding values for only a handful of instructions before the data is popped off the Stack back into one of our registers.
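As a rough illustration of that short term storage, the Python sketch below models push and pop on a 16 bit stack. It is a simplified model, not assembly: memory is a flat bytearray, the stack segment is ignored, and the class name `TinyStack` and the starting value of SP are invented for the example.

```python
class TinyStack:
    """Toy model of a 16 bit stack that grows down through memory."""

    def __init__(self, size=0x100):
        self.memory = bytearray(size)
        self.sp = size               # empty stack: SP sits just past the top of memory

    def push(self, value):
        self.sp -= 2                 # push moves SP down by one word
        self.memory[self.sp] = value & 0xFF             # low byte first (little endian)
        self.memory[self.sp + 1] = (value >> 8) & 0xFF

    def pop(self):
        value = self.memory[self.sp] | (self.memory[self.sp + 1] << 8)
        self.sp += 2                 # pop moves SP back up
        return value


stack = TinyStack()
stack.push(0x1234)                   # stash one register's value
stack.push(0xBEEF)                   # stash another
second = stack.pop()                 # values come back in reverse order
first = stack.pop()
```

Note that the last value pushed is the first one popped, so registers saved with a run of pushes have to be restored in the opposite order.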
16 bit Assembly:

;A normal function opener
function_name:
    push bp
    mov bp, sp
    ;Feel free to start modifying the stack!

;And to restore the stack state:
    mov sp, bp
    pop bp
    ret

32 bit Assembly:

;A normal function opener
function_name:
    push ebp
    mov ebp, esp
    ;Feel free to start modifying the stack!

;And to restore the stack state:
    mov esp, ebp
    pop ebp
    ret
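To see why this opener and restore pair leaves the stack exactly as it found it, the Python sketch below traces SP and BP through the 16 bit version. It is a model under assumptions, not assembly: the function name `call_with_frame`, the starting register values, and the `locals_bytes` parameter are all hypothetical, and the final `ret` (which pops the return address) is left out.

```python
def call_with_frame(sp, bp, locals_bytes):
    """Trace SP and BP through the opener, some stack use, and the restore."""
    stack = {}                       # word address -> value, stands in for stack memory

    # push bp
    sp -= 2
    stack[sp] = bp
    # mov bp, sp
    bp = sp
    frame_base = bp                  # BP stays fixed for the whole function body

    # The body may now move SP freely, for example to reserve local storage.
    sp -= locals_bytes

    # mov sp, bp   (throws away whatever the body left on the stack)
    sp = bp
    # pop bp       (gives the caller its own BP back)
    bp = stack[sp]
    sp += 2
    return sp, bp, frame_base


sp, bp, frame_base = call_with_frame(sp=0xFFFC, bp=0xFFFE, locals_bytes=8)
```

However far the body moves SP, copying BP back into SP discards it all in one step, so the caller sees the same SP and BP it had before the call.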
The original 8086 supported a 20 bit address space, which covers 2^20 = 1,048,576 bytes. In olden days we would simply call this a Megabyte, but a small tiff between engineers and the SI led to data sizes being renamed from the generally understood Kilobyte / Megabyte / Gigabyte to Kibibyte / Mebibyte / Gibibyte. Looking back at older software can lead to some miscommunication because of this relatively recent (read: just over 20 years old) redefinition.
Terminology notwithstanding, there was a problem: a 16 bit processor, where all available registers are 16 bits wide, had to address 20 bits of memory address space. Intel's solution was effective, if a little confusing. To expand its 16 bit registers to 20 bits it combined two 16 bit values: the Segment and the Offset. This is commonly written in the format SEGMENT:OFFSET. Unfortunately, here's where it gets a bit obscure. Instead of the segment register supplying only the bits above the 16 bit offset, Intel chose to add the two together, with the segment scaled up so that the maximum address space available was 1 Mebibyte. The equation to combine the two is: