Hello guilers!

I was wondering if we should make an effort to replace Guile's C-based write
function with a more "schemy" solution, one that lets us use write/display
on large Scheme data structures without stalling the fibers system, while
still being very fast and feature-rich.

I made an attempt at this some months ago; the repository is at:
https://gitlab.com/tampe/guile-persist

To import this library's write/display, make sure that it is installed
and then use
(use-modules (ice-9 write))

Then use write and display from that module just as you usually do in Guile.

The solution I suggest we look at combines C routines with Scheme to get
the best of both worlds. In C we gain quite a lot of speed by taking
advantage of SIMD instructions, for example testing whether 16 bytes are
all in the range [0,127] in a single instruction. The C library tries to do
well for all kinds of encodings and types of Scheme strings, which is why
it is quite large.
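To make the 16-byte range test concrete, here is a minimal sketch of how such a check can look in C. It is not taken from the repository: the helper names block16_is_ascii and ascii_prefix_len are made up for illustration, and I assume SSE2 on x86 with a plain scalar fallback elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not from the library): returns 1 if the 16 bytes
   at p are all in the range [0,127], otherwise 0. */
static int block16_is_ascii(const uint8_t *p)
{
    assert(p != NULL);
#if defined(__SSE2__)
    /* Unaligned 16-byte load, then collect the top bit of every byte.
       A zero mask means no byte has its high bit set. */
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    return _mm_movemask_epi8(v) == 0;
#else
    /* Scalar fallback: OR all bytes together and test the high bit. */
    uint8_t acc = 0;
    for (int i = 0; i < 16; i++)
        acc |= p[i];
    return (acc & 0x80) == 0;
#endif
}

/* Hypothetical helper: length of the leading run of bytes in [0,127],
   scanned 16 bytes at a time and finished byte by byte. */
static size_t ascii_prefix_len(const uint8_t *buf, size_t len)
{
    size_t i = 0;
    while (len - i >= 16 && block16_is_ascii(buf + i))
        i += 16;
    while (i < len && buf[i] < 0x80)
        i++;
    return i;
}
```

A writer could use something like ascii_prefix_len to copy the pure-ASCII part of a string straight to the output and only fall back to slower per-character handling at the first byte outside the range.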
Some string operations are on the order of 100x faster with this library
than with Guile's own C implementation. That said, C is bad in two ways: it
is a less powerful language than Scheme, and it is harder to maintain.
However, quite a lot of the power of Scheme can be retained by moving more
of the logic into Scheme and restricting the C code to small, focused parts.

We would also like to use Scheme because it interoperates well with fibers.
As it is now, writing a huge Scheme data structure freezes the thread,
because everything happens in C. That does not mean it is impossible to use
C: by instead trampolining into C to serialize one chunk of data at a time,
we can improve the behavior in a fibers context quite dramatically. The
reason to still use C is speed, since Guile itself cannot take advantage of
SIMD instructions. That should change in time, but no work is currently
planned for it, and gcc today has really nice infrastructure for such
constructs in C.
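The trampolining idea can be sketched roughly as follows. This is only an illustration under my own assumptions, not the code in the repository: write_chunk and write_all are hypothetical names, the data is a flat array of integers rather than real Scheme objects, and the outer loop is written in C here as a stand-in for what would be a Scheme loop in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical chunked writer: formats items[start..n) as space-separated
   decimals into out until the buffer is full or the items run out.
   Returns the index of the first item NOT written and stores the number
   of bytes produced in *used, so the caller can resume later. */
static size_t write_chunk(const long *items, size_t n, size_t start,
                          char *out, size_t cap, size_t *used)
{
    assert(items != NULL && out != NULL && used != NULL);
    assert(cap >= 24);              /* room for at least one long */
    size_t pos = 0, i = start;
    while (i < n) {
        int w = snprintf(out + pos, cap - pos, "%ld ", items[i]);
        if (w < 0 || (size_t)w >= cap - pos)
            break;                  /* item does not fit, stop here */
        pos += (size_t)w;
        i++;
    }
    *used = pos;
    return i;
}

/* Stand-in for the Scheme-side loop: each iteration is one short trip
   into C. Between two iterations the scheduler could run other fibers. */
static size_t write_all(const long *items, size_t n, FILE *port,
                        size_t *calls)
{
    char buf[64];
    size_t i = 0, total = 0;
    *calls = 0;
    while (i < n) {
        size_t used;
        i = write_chunk(items, n, i, buf, sizeof buf, &used);
        fwrite(buf, 1, used, port);
        total += used;
        (*calls)++;
        /* a fibers-aware writer would yield here */
    }
    return total;
}
```

The point of the design is that the time spent inside C per call is bounded by the buffer size, not by the size of the data structure, so a huge structure no longer blocks the thread for the whole write.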

Here are some blog posts about the writer I propose that we look into:
http://itampe.com/display-strings.html
http://itampe.com/printing-doubles.html

This code handles doubles and floats better than Guile's C-based writer.
First, it fixes a bug: when we write out a bytevector of f32 values, Guile
currently ends up writing them as doubles, which creates a lot of noise in
the printout. Second, we can tell the writer to print floats in hex format,
which Guile cannot do (the repository also has a reader that can read hex
floats). In fact, we can write floats/doubles in any of the bases 2, 4, 8,
16 and 32.
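To illustrate both points with plain standard C (this is not the algorithm used in the repository, and the function names are hypothetical): the float 0.1f widened to a double prints as 0.10000000149011612 when 17 significant digits are used, while the shortest decimal that reads back as the same float is just 0.1. Hex output is shown with printf's "%a", which only covers base 16, not the other bases mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print x with the fewest significant digits that
   still read back as the same float. Simple trial loop, not a fast
   shortest-digits algorithm. Returns the length, or -1 on failure
   (NaN ends up here because NaN never compares equal). */
static int format_f32_shortest(float x, char *out, size_t cap)
{
    assert(out != NULL);
    for (int prec = 1; prec <= 9; prec++) {
        int n = snprintf(out, cap, "%.*g", prec, (double)x);
        float back;
        if (n < 0 || (size_t)n >= cap)
            return -1;
        if (sscanf(out, "%f", &back) == 1 && back == x)
            return n;
    }
    return -1;
}

/* Hypothetical helper: write x as a C99 hex float and parse it back.
   Hex floats are exact, so the result equals x for finite values. */
static double hex_roundtrip(double x, char *out, size_t cap)
{
    assert(out != NULL);
    snprintf(out, cap, "%a", x);
    return strtod(out, NULL);
}
```

Printing each element of an f32 bytevector through something like format_f32_shortest, instead of through the double path, is what removes the trailing noise digits.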

The drawback is most likely code that is tougher to maintain, but thanks to
gcc's support for SIMD instructions it is not too bad in my view.

WDYT?
