Random rounding is a numerical technique for estimating the accuracy of results computed by floating point programs. We presented our version of the technique in

    G. Knizia, W. Li, S. Simon, H.-J. Werner; J. Chem. Theory Comput. 7, 2387 (2011)

with applications to the Molpro quantum chemistry package, a numerical program with several million lines of mixed Fortran 77, Fortran 90, C, and C++ code. We were able to obtain accurate error bars on final numbers calculated by numerical programs consisting of tens of thousands of lines of code.
The meta-compiler script we described can be found below. The following text gives a short overview of the technique. More information and references to related work by other authors can be found in the article.
When a FP operation is executed, then in general
the exact result of the computation cannot be represented as a FP number,
since FP numbers have limited resolution. The result of the computation
thus needs to be rounded to the nearest representable FP number.
Under unfortunate conditions, rounding errors can accumulate and amplify
in non-transparent ways and lead to inaccurate final results. For example:
Let

    a = 1.4142136
    b = -1.4142136
    c = 1.2345678 * 10^-5

Then

    c1 = (a + b) + c = 1.2345678 * 10^-5
    c2 = a + (b + c) = 1.2397766 * 10^-5
While here it is quite obvious what happened, in large programs
such errors can occur buried deep in nested procedures, without
any indication that the final result is inaccurate: you just get a number, but no information on how many of its digits are significant.
Note that every effective subtraction of large FP numbers which yields a small result leads to a loss of relative accuracy.
Also, such problems often show up only in large calculations which
cannot easily be debugged by hand, or when uncommon calculation
parameters are used.
In general it is very hard to find out how accurate calculated FP numbers are.
By default the FPU will round results of floating point operations to the nearest representable floating point number. Random rounding analysis works by changing that behavior: Before each FP operation, a small code segment is executed which changes the rounding mode to UP or DOWN randomly. Due to this, an artificial numerical noise is induced, which is of the same order of magnitude as the inherent FP rounding error, but in a random direction. Since this is done for each FP operation, a proper numerical error propagation is emulated.
Subsequently, multiple runs of the entire program will give different
results. These results can be analyzed statistically to estimate
the effects of the accumulated rounding errors. For example, say that eight random-rounding runs of your program gave the final results
    E1 = 0.99999997
    E2 = 1.00000656
    E3 = 1.00000134
    E4 = 1.00000670
    E5 = 0.99999066
    E6 = 0.99999421
    E7 = 0.99999345
    E8 = 1.00000280

You can evaluate the mean result and root mean square deviation as

    mean(E) = 0.999999459745
    rms(E)  = 5.687e-06

and conclude that your final calculation result is in reality

    E = (0.999 999 5 +/- 0.000 011 4)   (double confidence interval)

(i.e., there are only -log10(2*5.687e-06) = 4.9 significant digits) and not, for example,

    E = 1.000 003 129 723 ...

as a single straight run of the program (without random rounding) might make you believe. More information on the background and the pitfalls of the technique can be found in the article.
The f95rr script implements random rounding analysis in an automated fashion.
Concretely, the script can be invoked in place of a Fortran or C++ compiler; it intercepts the generated assembly and injects, before each FP instruction, a small code fragment which switches the rounding mode:
    xchg rax, [LC_CWINDEX]
    lea ax, [rax+1]
    fldcw word ptr [LC_CWDATA+2*rax]
    xchg rax, [LC_CWINDEX]

This code first increments the current rounding mode table index (LC_CWINDEX) and then loads a new floating point control word into the FPU. The control word is taken from the 2^16-entry rounding mode table ([LC_CWDATA]) at the current index. The last xchg writes the new table index back and restores the original value of register rax. All instructions in this assembly fragment preserve the CPU flags register and do not affect other code (apart from the changed rounding mode, of course).
The script directory is then added to the executable search path; on bash-like shells with

    export PATH=$HOME/programs/f95rr/:$PATH

and on csh with

    setenv PATH $HOME/programs/f95rr/:$PATH

(instead of $HOME/programs/f95rr/ put in the actual location to which the files were unpacked)
    program bla
       implicit double precision (a-h,o-z)
       f = 0.0
       do i = 10000, 1, -1
          x = dble(i)
          f = f + x*dsin(x)
       end do
       print "(1x,A,2x,F20.15)", "final sum:", f
    end program

Running the compiler script should produce the following output:
    ~/dev/f95rr> gfortran example.F90; a.out
     * Setting default rounding; Random rounding PRNG:
     * JKISS [D. Jones, 'Good practice in (pseudo) random number generation..']
     final sum:  7186.139201355133991

This is the default output of the program (exact double precision result), apart from the first line, which is emitted by cw_control.c and can be disabled if desired. It produces the same number as before (although more slowly..) because the rounding mode table is initialized to 64 bit, round-to-nearest. If we now want to measure the numerical stability of this process, we need to enable random rounding for the computation:
    program bla
       implicit double precision (a-h,o-z)
       ! from here on rounding mode table is double precision, random rounding.
       call set_random_rounding_64bit()
       f = 0.0
       do i = 10000, 1, -1
          x = dble(i)
          f = f + x*dsin(x)
       end do
       ! from here on rounding mode is back to normal (double precision, round to nearest)
       call set_default_rounding()
       print "(1x,A,2x,F20.15)", "final sum:", f
    end program

If the program is now recompiled (just as before) and executed multiple times, it will give different results:
    ~/dev/f95rr> gfortran example.F90; a.out
     * Setting default rounding; Random rounding PRNG:
     * JKISS [D. Jones, 'Good practice in (pseudo) random number generation..']
     final sum:  7186.139201355161276
    ~/dev/f95rr> a.out
     final sum:  7186.139201355102159
    ~/dev/f95rr> a.out
     final sum:  7186.139201355098521
    ~/dev/f95rr> a.out
     final sum:  7186.139201355199475
    ~/dev/f95rr> a.out
     final sum:  7186.139201355158548
    ~/dev/f95rr> a.out
     final sum:  7186.139201355080331
    ~/dev/f95rr> a.out
     final sum:  7186.139201355074874

This scattering can then be analyzed statistically to obtain a measure of the numerical stability (we recommend using root mean square scattering numbers).
If you run your program compiled with f95rr and you instantly get a floating point exception,
then most likely the script was unable to find the program entry point while building the program,
and thus did not insert the call to initialize the rounding mode table to sensible values.
If this happens, you need to call the function "initialize_cw_table" manually somewhere at the start of your program (before floating point operations are executed).
See the chapter "Pitfalls of random rounding analysis" in the article.
If you get unresolved symbols "init_cw_table__" or similar, open the f95rr.py file and set "second_underscore" to True.
If the program fails to link at all: this may require adjustments to the logic in f95rr. Parsing the compiler command line 100% correctly is not easy due to gcc's creative syntax.
Sorry, the injected code is not thread-safe at the moment.
Please address feedback to Gerald Knizia via knizia(at)theochem.uni-stuttgart.de. I'll try to help to get the script to work if I can. It's still experimental code and may take some tweaking in order to get working (but that is most likely still much easier than other forms of numerical stability analysis!).
I would greatly appreciate it if you would occasionally slip in a citation of the article if you used the f95rr script in your scholarly work and found it useful.