Programmer's Corner - Floating Point Calculations in ASM (Assembly)


FLOATING POINT CALCULATIONS IN ASSEMBLY

I. REQUIREMENTS

II. WHAT THIS COVERS or INTRODUCTION

Anyone who digs through code ends up analysing commercial programs to find out how a particular job is done, e.g. how the program saves a file. In the early days of computing, code diggers stayed away from floating point arithmetic, since few processors supported it. Things are different now: floating point hardware is standard, and floating point math is often about as fast as integer arithmetic. Relatively few people know how float variables work in Assembly, and some developers use float and double variables wherever possible precisely so that analysing their code becomes harder. My objective is to help you understand how floating point arithmetic works in Assembly and to explain the instruction set for float calculations. This article is divided into 4 segments, namely:

  1. THE BASICS
  2. Passing Float Arguments to Functions
  3. Floating Point Instructions
  4. Returning Float Values via Stack through Functions

So you can jump to any section in case you are interested in just one of them, or read through the whole thing if you wanna be a floating point master.

III. THE BASICS

Every float value has to be pushed onto the co-processor stack, also called the Floating Point Unit (FPU) stack, before it can be worked on. Every floating point instruction mnemonic starts with an 'F'. Usually a float operation starts with an FLD instruction, which loads a float number onto the top of the FPU stack. The value can then be stored in a variable with the FST or FSTP instruction, both of which are explained later in this article. That's the first part. The other thing to remember is that the double and float data types use the same instruction set. So how do we tell whether the number being used is a float or a double? For that we have to sink our teeth into the machine code of the program. The machine code tells us whether a float or a double is being manipulated, and this will be shown a little later.

IV. PASSING FLOAT ARGUMENTS TO FUNCTIONS

Have a look at this example:

#include <stdio.h>


void func(float f,double d)
{
  printf("f=%f\n",f);
  printf("d=%f\n",d);
}

void main()
{
  float ff=22.22f;
  double dd=11.11;
  func(ff,dd);
}

HERE IS THE DISASSEMBLED LISTING OF THE EXAMPLE

func            proc near

 var_8           = qword ptr -8
 arg_0           = dword ptr  8
 arg_4           = byte ptr  0Ch
 arg_8           = dword ptr  10h

                 push    ebp ;Original Value of ebp is stored
                 mov     ebp, esp ; Stack Frame is Opened
                 fld     [ebp+arg_0]
; The First Argument is pushed on top of the FPU Stack.
                 add     esp, 0FFFFFFF8h
; 8 bytes is allocated on the stack for the local variable.
                 fstp    [esp+8+var_8]
; var_8 is located at (esp - 8). So esp + 8 - 8 = esp.
; Hence the Float Value is stored in var_8 and popped off the FPU Stack
                 push    offset aFF   ;  "f=%f\n"
                 call    _printf
                 add     esp, 0Ch ; 12 bytes popped off the stack
                 push    [ebp+arg_8]
; The high 32 bits of the double are pushed
                 push    dword ptr [ebp+arg_4]
; The low 32 bits of the double are pushed
                 push    offset aDF      ;  "d=%f\n"
                 call    _printf
                 add     esp, 0Ch ; 12 bytes popped from stack
                 pop     ebp   ; Value of ebp Restored
                 retn    ; Return to main()
func            endp


; int __cdecl main(int argc,const char **argv,const char *envp)
; Attributes: bp-based frame

_main           proc near               ; DATA XREF: .data:0040A0B8o

 var_dbl_1       = dword ptr -0Ch
 var_dbl_2       = dword ptr -8
 var_float       = dword ptr -4
 argc            = dword ptr  8      ; COMMAND-LINE
 argv            = dword ptr  0Ch    ; ARGUMENTS
 envp            = dword ptr  10h    ; ENVIRONMENT VARIABLES

                 push    ebp            ; Original Value of ebp saved
                 mov     ebp, esp       ; Stack Frame Opened
                 add     esp, 0FFFFFFF4h ; 12 bytes allocated on stack
                 mov     [ebp+var_float], 41B1C28Fh
; 22.22 is stored in var_float variable
                 mov     [ebp+var_dbl_1], 0EB851EB8h
; The low 32 bits of the double are assigned
                 mov     [ebp+var_dbl_2], 40263851h
; The high 32 bits of the double are assigned
                 push    [ebp+var_dbl_2] ; High 32 bits pushed
                 push    [ebp+var_dbl_1] ; Low 32 bits pushed
                 push    [ebp+var_float] ; Float Value pushed
                 call    func            ; func() called
                 add     esp, 0Ch        ; 12 bytes freed from stack
                 mov     esp, ebp        ; Stack Frame Closed
                 pop     ebp             ; Value of ebp restored
                 retn
_main           endp

I have included comments on almost every line; in Assembly a comment starts at a semicolon ';' and runs to the End of Line (EOL). The main explanation, however, is here. Looking at the listing above, we can see that a simple PUSH instruction is used to pass float and double types to functions. Depending on the compiler and calling convention, they can also be passed via general purpose registers or via the registers of the FPU stack. The 80x87 coprocessor has eight 80-bit registers called ST(0), ST(1), ST(2) ... ST(7). When we say that a value is on the top of the FPU stack, it means that it is located in the ST(0) register. The contents of ST(1) to ST(7) are located immediately below the top of the FPU stack. When a float value needs to be stored or manipulated, it is first pushed onto the top of the FPU stack using the FLD instruction. To store a float value in a variable the FST instruction is used. Here the FSTP instruction is used instead, so the value on the top of the FPU stack is popped after it is assigned to the variable.

V. FLOATING POINT INSTRUCTIONS

Given Below is a list of the most frequently encountered float instructions.

--------------------------------------------------------------------------------
Instruction                Purpose
--------------------------------------------------------------------------------
FLD [source]               Pushes a Float Number from the source onto the top of
                           the FPU Stack.

FST [destination]          Copies a Float Number from the top of the FPU Stack
                           into the destination.

FSTP [destination]         Pops a Float Number from the top of the FPU Stack
                           into the destination.

FLDZ                       Pushes +0.0 on top of FPU Stack

FLD1                       Pushes +1.0 on top of FPU Stack

FLDPI                      Pushes PI on the top of FPU Stack

FILD [source]              Pushes an integer from the source to the top of the
                           FPU Stack.

FIST [destination]         Converts the value on the top of the FPU Stack to an
                           integer and copies it to the destination.

FISTP [destination]        Same as FIST but also pops the value from the top of
                           the FPU Stack.

FCHS                       Complements the sign-bit of a float value located on
                           the top of the FPU Stack or ST(0) Register.

FNOP                       Performs no FPU Operation.[It's a 2 byte instruction
                           unlike that of NOP which is a 1 byte instruction.]

FABS                       Replaces the float value on the top of the stack with
                           its absolute value.

FADD [operand]             Adds the value of the operand with the value located
                           on the top of the FPU Stack and store the result on
                           the top of the FPU Stack.

FCOS/FSIN                  Replaces the value on the top of the FPU Stack with
                           its cosine/sine value.

FDIV [operand]             Divide the value on the top of the FPU Stack by the
                           operand and store the result on the top of FPU Stack.

FMUL [operand]             Multiply the value on the top of the FPU Stack with
                           the operand and store the result on top of FPU Stack.

FSUB [operand]             Subtract operand value from the value on top of FPU
                           Stack and store the result on top of FPU Stack.

FXCH ST(index)             Exchanges values between top of FPU Stack and the
                           ST(index) register.

FCOM [operand]             Compares the float value located on top of FPU Stack
                           with the operand located in memory or the FPU Stack.

FCOMP                      Same as FCOM but pops the float value from the top of
                           the FPU Stack.

FNSTSW AX                  Store FPU Status Word in AX. {Used for Conditions.}
--------------------------------------------------------------------------------

There are many more float instructions, but these are the prominent ones. If you want to learn about the others, refer to Volume 2 of Intel's Software Developer's Manual, i.e. the "Instruction Set Reference".

V.I IS FADD == FADD ?

While going through disassembled source code we may encounter instructions such as FADD or FSUB and wonder whether they are operating on a double or a float. In such a case we have to look at the machine code of the instruction. Consider this example so you'll understand what I mean.

#include <stdio.h>

void main()
{
  double d=11.11;
  float f=2.2;
  printf("d+f=%f and f+d=%f\n",(d+f),(f+d));
  printf("d-f=%f and f-d=%f\n",(d-f),(f-d));
  printf("d*f=%f and f*d=%f\n",(d*f),(f*d));
  printf("d/f=%f and f/d=%f\n",(d/f),(f/d));
}

Now this example generates the following code:

; int __cdecl main(int argc,const char **argv,const char *envp)
; Attributes: bp-based frame

_main           proc near               ; DATA XREF: .data:0040A0B8o

       var_14          = qword ptr -14h
       var_C           = dword ptr -0Ch
       var_8           = qword ptr -8
       argc            = dword ptr  8
       argv            = dword ptr  0Ch
       envp            = dword ptr  10h

 55                    push    ebp
 8B EC                 mov     ebp, esp
 83 C4+                add     esp, 0FFFFFFF4h ; 12 bytes Allocated on Stack
 C7 45+                mov     dword ptr [ebp+var_8], 0EB851EB8h
 C7 45+                mov     dword ptr [ebp+var_8+4], 40263851h
           ; Double Stored in var_8
 C7 45+                mov     [ebp+var_C], 400CCCCDh
           ; Float Stored in var_C
 D9 45+                fld     [ebp+var_C]
           ; Float Stored in ST(0) Register
 DC 45+                fadd    [ebp+var_8]
           ; Add the double at var_8 to float at ST(0) and store
           ; the result in ST(0) Coprocessor Register.
 83 C4+                add     esp, 0FFFFFFF8h ; 8 bytes allocated
 DD 1C+                fstp    [esp]
           ; Value in ST(0) Register is popped into the 8 bytes allocated on
           ; the CPU Stack. Since 8 bytes are being used the result is a DOUBLE.
 DD 45+                fld     [ebp+var_8]
           ; The Double value located at var_8 is pushed into ST(0) Register.
 D8 45+                fadd    [ebp+var_C]
           ; The float value is added to double value at ST(0) and the result
           ; is stored in ST(0).
 83 C4+                add     esp, 0FFFFFFF8h ; 8 more bytes are allocated
 DD 1C+                fstp    [esp]
           ; The Result of the Addition at ST(0) is stored in the 8 bytes
           ; allocated in the CPU stack. Again the result is a double.
 68 E8+                push    offset aDFFAndFDF
           ; The Format "d+f=%f and f+d=%f\n" is pushed on the CPU Stack
 E8 CB+                call    _printf
           ; The two addition results are displayed on the screen.
 83 C4+                add     esp, 14h
           ; 0x14 bytes are freed from the stack.
           ; two 8-byte doubles + one 4 byte offset = 0x14 bytes
 D9 45+                fld     [ebp+var_C]
           ; The Float value is loaded in ST(0)
 DC 65+                fsub    [ebp+var_8]
           ; Subtracts double value at var_8 from ST(0) and place the result in
           ; ST(0).
 83 C4+                add     esp, 0FFFFFFF8h ; 8 bytes for the result allocated
 DD 1C+                fstp    [esp]
           ; The Result of the subtraction is popped on top of the CPU Stack
           ; occupying 8 bytes allocated for it.
 DD 45+                fld     [ebp+var_8]
           ; Load the Double at var_8 into ST(0).
 D8 65+                fsub    [ebp+var_C]
           ; Subtract float value at var_C from ST(0) and place the result in
           ; ST(0)
 83 C4+                add     esp, 0FFFFFFF8h ; 8 bytes for the result allocated
 DD 1C+                fstp    [esp]
           ; Result stored on top of CPU Stack occupying 8 bytes previously
           ; allocated by the add instruction.
 68 FB+                push    offset aDFFAndFDF_0
           ; Format "d-f=%f and f-d=%f\n" is pushed
 E8 A6+                call    _printf
           ; Both Results are Displayed.
 83 C4+                add     esp, 14h
           ; 0x14 bytes are freed as explained above
 D9 45+                fld     [ebp+var_C]
           ; Float Value located at var_C loaded into ST(0) Register.
 DC 4D+                fmul    [ebp+var_8]
           ; Multiply ST(0) with double value in var_8 and store the result in
           ; ST(0).
 83 C4+                add     esp, 0FFFFFFF8h ; 8 bytes for the result allocated
 DD 1C+                fstp    [esp]
           ; The Result of the Multiplication is popped onto the top of the
           ; stack occupying 8 bytes previously allocated by the add instruction
 DD 45+                fld     [ebp+var_8]
           ; Load the Double value into ST(0).
 D8 4D+                fmul    [ebp+var_C]
           ; Multiply ST(0) by float value in var_C and store result in ST(0).
 83 C4+                add     esp, 0FFFFFFF8h ; Allocate 8 bytes
 DD 1C+                fstp    [esp]
           ; Pop the result from ST(0) into the top of the stack occupying 8
           ; bytes.
 68 0E+                push    offset aDFFAndFDF_1
           ; The format "d*f=%f and f*d=%f\n" is pushed
 E8 81+                call    _printf
           ; The Result is displayed on screen.
 83 C4+                add     esp, 14h
           ; 0x14 (20) bytes are popped off the stack.
 D9 45+                fld     [ebp+var_C]
           ; Load the Float Value into ST(0)
 DC 75+                fdiv    [ebp+var_8]
           ; Divide ST(0) by the double value and store result in ST(0)
 83 C4+                add     esp, 0FFFFFFF8h ; Make space for result
 DD 1C+                fstp    [esp] ; Result stored on stack
 DD 45+                fld     [ebp+var_8]
           ; Load Double into ST(0).
 D8 75+                fdiv    [ebp+var_C]
           ; Divide ST(0) by Float Value and store result in ST(0).
 83 C4+                add     esp, 0FFFFFFF8h ; Make space for result
 DD 1C+                fstp    [esp] ; Pop result on CPU Stack
 68 21+                push    offset aDFFAndFDF_2
           ; Format "d/f=%f and f/d=%f\n" is pushed
 E8 5C+                call    _printf         ; and is displayed on screen
 83 C4+                add     esp, 14h ; 0x14 bytes deallocated off the Stack.
 8B E5                 mov     esp, ebp ; Stack Frame Closed
 5D                    pop     ebp      ; Original Value of ebp Restored
 C3                    retn
_main           endp

To the left of the Assembly code are the first two bytes of each instruction's machine code. Instructions like FADD, FSUB, FDIV, FMUL and FLD have different machine code depending on the size of the memory operand they act on. Each of these instructions appears twice per printf call above, once with a float operand and once with a double operand, and you can see that the first byte differs from data type to data type. Hence I have also made a table that will help us tell which form is being used. You can modify this program to work on float values only, and on disassembling it you will find that the results are still stored as doubles. This is because the FPU registers hold every value in the same 80-bit format, and because printf is a variadic function, so C promotes float arguments to double before they are passed. So if you are using float variables just to save code size, it may be simpler to use the double data type, since in a case like this the result is converted to a double anyway and the compiler need not convert between the two types.

--------------------------------------------------------------------------------
        FIRST BYTE OF COMMON FLOAT INSTRUCTIONS DEPENDING ON DATA TYPE
--------------------------------------------------------------------------------
Instruction                                  DATA TYPE
                                  Float                       Double
--------------------------------------------------------------------------------
FLD                                0xD9                        0xDD
FSTP                               0xD9                        0xDD
FST                                0xD9                        0xDD
FADD                               0xD8                        0xDC
FADDP                              0xDE  (register operands only; same byte for both)
FSUB                               0xD8                        0xDC
FDIV                               0xD8                        0xDC
FMUL                               0xD8                        0xDC
FCOM                               0xD8                        0xDC
FCOMP                              0xD8                        0xDC
--------------------------------------------------------------------------------

Now that you have seen how basic math is performed on float values, let's move on to another program that will include a few more instructions. Here is the program:

#include <stdio.h>

void main()
{
 int i=16;
 float f=6.6f;
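 printf("i+f=%d\n",(i+f)); /* i+f is passed as a double, so "%f" is the matching
                              specifier; the behaviour with "%d" is undefined */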
 printf("-f=%f\n",-f);
 float ff=16.16f;
 if(f==ff)
   printf("f==ff\n");
 else
   printf("f!=ff\n");
}

Its Disassembled Listing is as Follows:

; int __cdecl main(int argc,const char **argv,const char *envp)
; Attributes: bp-based frame

_main           proc near

 var_C           = dword ptr -0Ch
 var_8           = dword ptr -8
 var_4           = dword ptr -4
 argc            = dword ptr  8
 argv            = dword ptr  0Ch
 envp            = dword ptr  10h

                 push    ebp
                 mov     ebp, esp
                 add     esp, 0FFFFFFF4h
       ; 12 bytes are allocated on the stack.
                 mov     eax, 10h
       ; EAX is set to 16
                 mov     [ebp+var_4], 40D33333h
       ; var_4 contains a Float Value
                 mov     [ebp+var_C], eax
       ; Now var_C contains an Integer
                 fild    [ebp+var_C]
       ; Integer 16 is converted to floating point and loaded in ST(0)
                 fadd    [ebp+var_4]
       ; ST(0) is added with float in var_4 and the result is stored in ST(0)
                 add     esp, 0FFFFFFF8h
       ; 8 bytes are allocated for the result of double type.
                 fstp    [esp]
       ; The Resulting Double is stored on top of CPU Stack occupying 8 bytes.
                 push    offset aIFD
       ; Format "i+f=%d\n" is pushed
                 call    _printf
       ; and the result is displayed
                 add     esp, 0Ch
       ; 12 bytes are freed from the stack.
                 fld     [ebp+var_4]
       ; The Float is loaded in ST(0) Register.
                 fchs
       ; Its sign bit is inverted and the result is stored in ST(0)
                 add     esp, 0FFFFFFF8h
       ; 8 bytes for the resulting double is allocated on the CPU Stack
                 fstp    [esp]
       ; Result is pushed on the CPU Stack and popped from FPU Stack
                 push    offset aFF
       ; Format "-f=%f\n" is pushed
                 call    _printf
       ; And displayed on screen
                 add     esp, 0Ch
       ; 12 bytes freed from stack
                 mov     [ebp+var_8], 418147AEh
       ; Another float of value 16.16 is stored in var_8
                 fld     [ebp+var_4]
       ; The previous float is loaded in ST(0)
                 fcomp   [ebp+var_8]
       ; Compares ST(0) with float in var_8 and pop register stack.
                 fnstsw  ax
       ; Store FPU Status Word in AX
                 sahf
       ; Loads the SF, ZF, AF, PF and CF Flags of the EFLAGS Register from the
       ; corresponding bits in the AH Register ie. (bits 7,6,4,2,0 respectively)
                 jnz     short not_equal
       ; Jump if ZERO_FLAG is ZERO to not_equal
                 push    offset aFFf
       ; Format "f==ff\n" is pushed
                 call    _printf
       ; And Displayed
                 pop     ecx
       ; An equivalent of deallocating 4 bytes on the stack
                 jmp     short end_condition
       ; Unconditional Jump to end_condition

 not_equal:
                 push    offset aFFf_0
       ; Format "f!=ff\n" is pushed
                 call    _printf
       ; and displayed
                 pop     ecx
       ; 4 bytes popped off from stack

 end_condition:
                 mov     esp, ebp
       ; Stack Frame Closed
                 pop     ebp
       ; Original Value of ebp restored
                 retn
_main           endp

As you saw, every float value manipulation here results in a double value, even when an integer is added to it. The FCOMP mechanism is slightly tricky. There are 4 condition code flags in the FPU (C0 to C3), and FCOMP sets three of them. Here is the table by which we can understand the changes in the condition code flags when FCOMP is used:

--------------------------------------------------------------------------------
                             THE 3 CONDITION CODE FLAGS MODIFIED BY FCOMP
--------------------------------------------------------------------------------
CONDITION                      C3                 C2                C0
--------------------------------------------------------------------------------
ST(0) > [source]               0                   0                0

ST(0) < [source]               0                   0                1

ST(0) = [source]               1                   0                0
--------------------------------------------------------------------------------

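The FPU Status Word contains the 4 condition code flags. The FNSTSW AX instruction copies the Status Word into AX, which places C0, C2 and C3 in bits 0, 2 and 6 of the AH Register. The SAHF instruction then copies AH into the CPU Flags, so that C0 lands in CF, C2 in PF and C3 in ZF. Then and only then can a conditional jump take place on a float condition. In the listing above, JNZ jumps to not_equal when ZF (i.e. C3) is 0, which means the two values were not equal.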

VI. RETURNING FLOAT VALUES VIA STACK THROUGH FUNCTIONS

Consider this program:

#include <stdio.h>

template <class T>
T ret(T a,T b)
{
  return (a+b);
}


void main()
{
  float f1=1.1f,f2=2.2f;
  double d1=3.3,d2=4.4;
  printf("f1 + f2 = %f\n",ret(f1,f2));
  printf("d1 + d2 = %f\n",ret(d1,d2));
}

To save space I have used function templates since the body of the function is the same for both data types. If you don't know function templates yet, refer to the Function Template Tutorial. Here is its disassembled listing:

; int __cdecl main(int argc,const char **argv,const char *envp)
; Attributes: bp-based frame

_main           proc near               ; DATA XREF: .data:0040A0B8o

 var_18          = dword ptr -18h
 var_14          = dword ptr -14h
 var_10          = dword ptr -10h
 var_C           = dword ptr -0Ch
 var_8           = dword ptr -8
 var_4           = dword ptr -4
 argc            = dword ptr  8
 argv            = dword ptr  0Ch
 envp            = dword ptr  10h

                 push    ebp
                 mov     ebp, esp
                 add     esp, 0FFFFFFE8h
       ; 24 bytes allocated on the CPU Stack
                 mov     [ebp+var_4], 3F8CCCCDh
       ; Float stored at var_4 (1.1)
                 mov     [ebp+var_8], 400CCCCDh
       ; Float stored at var_8  (2.2)
                 mov     [ebp+var_10], 66666666h
                 mov     [ebp+var_C], 400A6666h
       ; Double Stored
                 mov     [ebp+var_18], 9999999Ah
                 mov     [ebp+var_14], 40119999h
       ; Double Stored
                 push    [ebp+var_8]
                 push    [ebp+var_4]
       ; Floats are passed to ret_float function
                 call    ret_float
                 add     esp, 8
                 add     esp, 0FFFFFFF8h ; char
                 fstp    [esp]
       ; Result popped from ST(0) onto top of CPU Stack.
       ; Imitates a push instruction
                 push    offset aF1F2F   ; __va_args
                 call    _printf
                 add     esp, 0Ch
                 push    [ebp+var_14]
                 push    [ebp+var_18]
                 push    [ebp+var_C]
                 push    [ebp+var_10]
                 call    ret_double
                 add     esp, 10h
                 add     esp, 0FFFFFFF8h ; char
                 fstp    [esp]
       ; Result popped from ST(0) onto top of CPU Stack
                 push    offset aD1D2F   ; __va_args
                 call    _printf
                 add     esp, 0Ch
                 mov     esp, ebp
                 pop     ebp
                 retn
_main           endp


ret_float       proc near               ; CODE XREF: _main+36p

 arg_0           = dword ptr  8
 arg_4           = dword ptr  0Ch

                 push    ebp
                 mov     ebp, esp
                 fld     [ebp+arg_0]
                 fadd    [ebp+arg_4]
       ; Result left on top of FPU Stack Itself
                 pop     ebp
                 retn
ret_float       endp


ret_double      proc near               ; CODE XREF: _main+5Dp

 arg_0           = qword ptr  8
 arg_8           = qword ptr  10h

                 push    ebp
                 mov     ebp, esp
                 fld     [ebp+arg_0]
                 fadd    [ebp+arg_8]
      ; Result left in ST(0) or top of FPU Stack itself
                 pop     ebp
                 retn
ret_double      endp

So when a float has to be returned, the value is not placed in the EAX register; it is left in ST(0), i.e. on the top of the FPU stack. If the caller needs the returned value just once the FSTP instruction is used, otherwise the FST instruction is used. Notice that even though the source code had just one templated function, the compiled code has a separate version of the function for each data type.

This is the end of the article. I hope you now understand how float and double calculations are done. If you want to learn more, try using various functions included in the header file <math.h>, e.g. log10, sin, cos etc., and you can see the other instructions at work.

If you have any questions on float interpretation you can mail me at: born2c0de@hotmail.com

Author Information:

Sanchit Karve

http://www.freewebs.com/born2c0de/

born2c0de@hotmail.com

Comments:


Shaun - July 6, 2005 10:39 AM

Thank you so much for this. I desperately needed a tutorial on manipulating floats in assembly. I am very grateful that someone is willing to share knowledge in a field that is almost forgotten today. Thank you very much!