Upcoming F# struct tuples: are they always faster?

Don Syme has been working on struct tuples for the F# language. Let's see whether they perform better than "old" (heap-allocated) tuples in a simple scenario: returning a tuple from a function. The code is very simple:

open System
open System.Diagnostics

[<EntryPoint>]
let main _ =
    let foo x = (x, Math.Sin x)
    let sw = Stopwatch.StartNew()
    for n in 1..100000000 do
        let (x, sinx) = foo (float n / 1000.)
        ()
    sw.Stop()
    printfn "Run: %O" sw.Elapsed
    sw.Restart()
    GC.Collect 2
    sw.Stop()
    printfn "GC.Collect: %O" sw.Elapsed
    Console.ReadKey() |> ignore
    0

// Debug
// Run: 00:00:03.6734753
// GC.Collect: 00:00:00.0008279
// Release
// Run: 00:00:00.3454602
// GC.Collect 2: 00:00:00.0000765

Decompiled code in Release configuration:

[EntryPoint]
public static int main(string[] _arg1)
{
    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 1; i < 100000001; i++)
    {
        double a = (double)i / 1000.0;
        double num = Math.Sin(a);
    }
    sw.Stop();
    PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit> format = new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("Run: %O");
    PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, format).Invoke(sw.Elapsed);
    sw.Restart();
    GC.Collect(2);
    sw.Stop();
    format = new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("GC.Collect 2: %O");
    PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, format).Invoke(sw.Elapsed);
    ConsoleKeyInfo consoleKeyInfo = Console.ReadKey();
    return 0;
}



All we need to change to switch to struct tuples is to add the "struct" keyword in front of the tuple constructor and the pattern match:

let foo x = struct (x, Math.Sin x)
...
for n in 1..100000000 do
    let struct (x, sinx) = foo (float n / 1000.)
...

// Debug
// Run: 00:00:03.5127102
// GC.Collect 2: 00:00:00.0001495
// Release
// Run: 00:00:00.3443932
// GC.Collect 2: 00:00:00.0000796

Decompiled code in Release configuration:

[EntryPoint]
public static int main(string[] _arg1)
{
    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 1; i < 100000001; i++)
    {
        double num = (double)i / 1000.0;
        StructTuple<double, double> structTuple = new StructTuple<double, double>(num, Math.Sin(num));
    }
    sw.Stop();
    PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit> format = new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("Run: %O");
    PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, format).Invoke(sw.Elapsed);
    sw.Restart();
    GC.Collect(2);
    sw.Stop();
    format = new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("GC.Collect 2: %O");
    PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, format).Invoke(sw.Elapsed);
    ConsoleKeyInfo consoleKeyInfo = Console.ReadKey();
    return 0;
}

I don't know about you, but I was surprised by these results. The performance is roughly the same in both configurations. The GC is not a bottleneck, as no objects were promoted to generation 1.

Conclusions:

  • In this benchmark, using struct tuples as a faster or "GC-friendly" alternative for returning multiple values from functions does not make sense.
  • Building in release mode erases heap-allocated tuples, but not struct tuples.
  • Building in release mode inlines the "foo" function, and the release build runs roughly 10x faster than the debug build.
  • You can fearlessly allocate tens of millions of short-lived objects per second, and performance will still be great.
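
Since release mode erases the heap-allocated tuple here, one way to compare the two tuple kinds in an optimized build is to stop the function from being inlined, so that the tuple really has to cross a call boundary. The following is only a sketch under that assumption: the names fooRef, fooStruct, measure, runRef and runStruct are made up for illustration, the iteration count is arbitrary, and the running sum exists only so the results are actually used. It also counts generation 0 collections to show how much GC work each variant causes.

```fsharp
open System
open System.Diagnostics
open System.Runtime.CompilerServices

// Hypothetical helpers (not from the post). NoInlining asks the compiler
// and the JIT not to inline the call, so the tuple must really be returned.
[<MethodImpl(MethodImplOptions.NoInlining)>]
let fooRef (x: float) = (x, Math.Sin x)

[<MethodImpl(MethodImplOptions.NoInlining)>]
let fooStruct (x: float) = struct (x, Math.Sin x)

// Arbitrary iteration count chosen for this sketch.
let iterations = 10000000

// Sums both tuple components so the results cannot be discarded.
let runRef () =
    let mutable acc = 0.0
    for n in 1 .. iterations do
        let (x, sinx) = fooRef (float n / 1000.)
        acc <- acc + x + sinx
    acc

let runStruct () =
    let mutable acc = 0.0
    for n in 1 .. iterations do
        let struct (x, sinx) = fooStruct (float n / 1000.)
        acc <- acc + x + sinx
    acc

// Runs `body`, then prints elapsed time and the number of gen 0 collections
// that happened while it ran.
let measure (name: string) (body: unit -> float) =
    let gen0Before = GC.CollectionCount(0)
    let sw = Stopwatch.StartNew()
    let result = body ()
    sw.Stop()
    let gen0 = GC.CollectionCount(0) - gen0Before
    printfn "%s: %O, gen0 collections: %d, checksum: %f" name sw.Elapsed gen0 result
    result

let r1 = measure "Reference tuples" runRef
let r2 = measure "Struct tuples" runStruct
```

Timings from a sketch like this depend heavily on the runtime and the machine, so treat them as a starting point for your own measurements, not as a verdict.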


Comments

Unknown said…
The first example, when compiled in release, does not even construct tuples, as the compiler optimizes them away.
[CompilationMapping(SourceConstructFlags.Module)]
public static class Program
{
    [EntryPoint]
    public static int main(string[] _arg1)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int index = 1; index < 100000001; ++index)
            Math.Sin((double) index / 1000.0);
        stopwatch.Stop();
        PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, (PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit>) new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("Run: %O")).Invoke(stopwatch.Elapsed);
        stopwatch.Restart();
        GC.Collect(2);
        stopwatch.Stop();
        PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, (PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit>) new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("GC.Collect: %O")).Invoke(stopwatch.Elapsed);
        Console.ReadKey();
        return 0;
    }
}
Vasily said…
@Will Smith: yeah, thanks!

I compiled the variant that uses struct tuples in Release configuration, and they _are not erased_:

[EntryPoint]
public static int main(string[] _arg1)
{
    Stopwatch sw = Stopwatch.StartNew();
    for (int i = 1; i < 100000001; i++)
    {
        double num = (double)i / 1000.0;
        StructTuple<double, double> structTuple = new StructTuple<double, double>(num, Math.Sin(num));
    }
    sw.Stop();
    PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit> format = new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("Run: %O");
    PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, format).Invoke(sw.Elapsed);
    sw.Restart();
    GC.Collect(2);
    sw.Stop();
    format = new PrintfFormat<FSharpFunc<TimeSpan, Unit>, TextWriter, Unit, Unit, TimeSpan>("GC.Collect 2: %O");
    PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<TimeSpan, Unit>>(Console.Out, format).Invoke(sw.Elapsed);
    ConsoleKeyInfo consoleKeyInfo = Console.ReadKey();
    return 0;
}
Unknown said…
Thanks for updating the post. However, the non-struct version doesn't allocate tuples to begin with, as the compiler optimizes them out. I'm sure that if you change it so that it does allocate tuples, the results will be very different.
Vasily said…
In debug mode the tuples are indeed allocated, so we can compare them to struct tuples.
jackfoxy said…
I would still like to see a comparison in release.
Vasily said…
@jackfoxy the snippets include execution times in both debug and release modes.
jackfoxy said…
But as @Will points out, the compiler optimizes them away. Perhaps following the construction with a call to fst would prevent the optimization.
B.C. said…
Try a more complicated experiment that makes tuples leave the nursery generation: http://flyingfrogblog.blogspot.com/2014/01/on-performance-of-boxed-tuples.html
