[sage-devel] Re: Using MPIR or GMP with multiple memory managers

William Stein Sun, 23 Aug 2009 13:59:06 -0700

On Sun, Aug 23, 2009 at 1:42 PM, Nils Bruin <nbr...@sfu.ca> wrote:
>
> The following problem came up while trying to use ecl as a library
> inside sage:
>
> Both sage and ecl use GMP for their multi-precision arithmetic, and
> both call mp_set_memory_functions to register their memory managers.
> This obviously doesn't work, since GMP only keeps track of one set of
> memory management routines. Ecl needs to use its memory manager, since
> it relies on Boehm for garbage collection and sage needs to use its
> own because python relies on reference counting.
>
> I have left the discussion specific to ECL, but the problem arises in
> general: GMP has global state, so using the same GMP from several
> places can lead to conflicts.
>
> Juanjo, the lead developer of ecl has suggested several work-arounds
> (Thank you, Juanjo. Your expertise is a great help). I hope some
> people with more knowledge of sage memory management, mpir and build
> processes can comment on the feasibility and maintainability of the
> different options:
>
>  1. Build ecl with its private gmp library.
>   (a) Can that be done on link level? i.e., take ecl.so and gmp.so
> and produce ecl-gmp.so from the two, with all references of ecl.so to
> gmp.so resolved and made private? (i.e., make an ecl.so with gmp
> "statically linked in").


Yes, it's possible.   We've accidentally done this sort of thing in
the past several times, with NTL, Singular, etc., on various
platforms, and it was often the reason behind many subtle segfaults.

>  (b) Can that be done compile-time  for ecl? i.e., when you compile
> ecl to produce ecl.so, make the compiler/linker resolve any references
> to gmp and prevent any of the gmp symbols from leaking out as public
> in ecl.so?

I suspect so.

>  (c) Is there an economical and maintainable way of pushing gmp-for-
> ecl in its own namespace?
>
>  2. Make the mp_set_memory_functions state thread-local and run ecl in
> its own thread.
>  (a) Does that involve changing the gmp/mpir library?

I don't know how to do this.

>
>  3. Make sage use Boehm as well :-). That may be an option when
> reference counting is discovered to be an insufficient memory
> management strategy for sage in general.

Sage is built on Python, and Python's memory management is reference
counting, so I don't see how that could work.

>  4. Change ECL so that its uses of GMP are braced by
> mp_set_memory_functions calls.
>  (a) That wouldn't be thread-safe, would it? Or are GMP instructions
> "atomic" with respect to threads? Could we make them that way?
>  (b) In that case, it might be helpful to have a complete list of GMP
> instructions that could possibly lead to calls to alloc,realloc,free.
> Only those would need treatment.
>  (c) Could this be done by properly mangling gmp.h?
>
> Thanks to Juanjo for helping this project progress. Other possibly
> relevant remarks:
>
>  5. This problem isn't unique for ecl/sage. Any setting where gmp is
> used by several libraries at once potentially causes this problem

Several libraries in Sage already use GMP at once right now. A
similar problem did come up when Martin Albrecht wrote libSingular.
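
To make the shared-state conflict concrete, here is a toy Python model.
It does not call GMP at all, and every name in it is hypothetical. It
only mimics the shape of the problem: one process-wide slot for the
memory functions, where the last registration wins, plus the "bracing"
idea from option 4 written as a context manager.

```python
from contextlib import contextmanager

# Toy stand-in for GMP's single, process-wide set of memory functions.
# Nothing here calls GMP; all names are hypothetical.
_current_allocator = "default malloc"


def set_memory_functions(name):
    """Model of mp_set_memory_functions: one global slot, last caller wins."""
    global _current_allocator
    _current_allocator = name


def allocate_bignum():
    """Model of any call that allocates: it uses whatever is registered."""
    return _current_allocator


@contextmanager
def braced(name):
    """Option 4 in miniature: install an allocator, run, then restore."""
    global _current_allocator
    saved = _current_allocator
    _current_allocator = name
    try:
        yield
    finally:
        _current_allocator = saved


# Both libraries register in turn; the second silently replaces the first.
set_memory_functions("sage allocator")
set_memory_functions("ecl/Boehm allocator")
sage_sees = allocate_bignum()

# With bracing, the ECL side swaps its allocator in only around its own calls.
set_memory_functions("sage allocator")
with braced("ecl/Boehm allocator"):
    ecl_sees = allocate_bignum()
after_brace = allocate_bignum()

print(sage_sees, ecl_sees, after_brace, sep=" | ")
```

In this model the slot is shared by every thread, so the brace alone
does nothing to stop another thread from allocating while the "wrong"
allocator is installed, which is the thread-safety worry raised in 4(a).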

>
>  6. This problem does not arise in general when you try to extend
> python with ecl, because normal python does not use gmp. So building a
> python/lisp hybrid is quite feasible.
>
>  7. The main drive for using ecl-as-a-library in sage is to get a
> faster interface with Maxima. If all options above turn out to involve
> a lot of work or are expensive to maintain, perhaps we need to find
> another way of speeding up the interface. As a side effect, we already
> have maxima-as-a-library in ecl now. Processes can communicate
> efficiently in various ways. Shared memory? Send data info through a
> pipe? With a little programming on both the sage and the ecl side, we
> could make a binary communication protocol. Definitely not as
> satisfying a solution as a library interface, but perhaps good enough.

To whet our appetite though, might you do some benchmarks that compare
the speed of adding 2+2 via the library and via pexpect?

sage: a = maxima(2)
sage: timeit("a+a")
125 loops, best of 3: 1.87 ms per loop

This would be the analogue of the following:

sage: a = pari(2)
sage: timeit("a+a")
625 loops, best of 3: 2.24 µs per loop
sage: a = gp(2)
sage: timeit("a+a")
625 loops, best of 3: 208 µs per loop
sage: 208/2.24
92.8571428571428

By the way, it's interesting that the maxima pexpect interface is
about 9 times slower than the PARI pexpect interface (1.87 ms versus
208 µs); that's because the Maxima one is way more complicated, due to
Maxima "going interactive" and doing all kinds of things that aren't
pseudo-tty friendly.
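
The ratios between the three timings above can be worked out in a few
lines of plain Python. The last ratio is only a hypothetical: it assumes
a Maxima library call would cost about as much as the PARI library call,
which nobody has measured here.

```python
# Timings copied from the sessions above, converted to seconds.
maxima_pexpect = 1.87e-3   # a + a with a = maxima(2)
gp_pexpect = 208e-6        # a + a with a = gp(2)
pari_library = 2.24e-6     # a + a with a = pari(2)

# Measured: pexpect versus C library for PARI.
gp_vs_library = gp_pexpect / pari_library

# Measured: Maxima pexpect versus gp pexpect.
maxima_vs_gp = maxima_pexpect / gp_pexpect

# Hypothetical: Maxima pexpect versus a library call that is assumed
# to be as fast as the PARI one.
maxima_vs_library = maxima_pexpect / pari_library

print(f"gp pexpect / PARI library:        {gp_vs_library:.1f}x")
print(f"maxima pexpect / gp pexpect:      {maxima_vs_gp:.1f}x")
print(f"maxima pexpect / PARI-like call:  {maxima_vs_library:.0f}x")
```

That comes to roughly 93x, 9x and 835x respectively.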

Anyway, the speedup going from pexpect to a C library should be between
100 and 1000 times. For libgap, I was seeing 2000x speedups. You
might be more enthusiastic about your current awesome work making a
Sage <--> Maxima library interface if you do some little benchmarks
and see that we might get a 1000-2000x speedup. You write: "With a
little programming on both the sage and the ecl side, we could make a
binary communication protocol. Definitely not as satisfying a
solution as a library interface, but perhaps good enough." It's true
that it is "good enough", because pexpect is already "good enough".
However, I think the difference in latency between any sort of IPC and
a library interface is going to be a factor of 100 or 1000, so your
current path is the only hope to really win, and I think you'll win
big.
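
One rough way to see the IPC-versus-library gap on your own machine is
the sketch below. It is not Sage, pexpect, Maxima or ECL: the child
process is a tiny hypothetical Python "adder" that reads two integers
per line from a pipe and writes back their sum, and the in-process side
is an ordinary function call. The numbers it prints depend entirely on
the machine and say nothing definite about a real Maxima interface.

```python
import subprocess
import sys
import timeit

# Child process: read "a b" per line, write the sum back. A stand-in for
# an interpreter driven over a pipe; it is not Maxima or gp.
CHILD = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    a, b = line.split()\n"
    "    print(int(a) + int(b), flush=True)\n"
)


def add_in_process(a, b):
    """The 'library' path: a plain function call in the same process."""
    return a + b


child = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
    bufsize=1,  # line-buffered text pipes on the parent side
)


def add_via_pipe(a, b):
    """The 'IPC' path: one request/response round trip through the pipe."""
    child.stdin.write(f"{a} {b}\n")
    child.stdin.flush()
    return int(child.stdout.readline())


result_pipe = add_via_pipe(2, 2)
result_local = add_in_process(2, 2)

n = 2000
t_pipe = timeit.timeit(lambda: add_via_pipe(2, 2), number=n) / n
t_local = timeit.timeit(lambda: add_in_process(2, 2), number=n) / n

# Closing stdin gives the child EOF so it exits cleanly.
child.stdin.close()
child.wait()
child.stdout.close()

print(f"pipe round trip: {t_pipe * 1e6:.2f} µs per call")
print(f"in-process call: {t_local * 1e6:.2f} µs per call")
print(f"ratio:           {t_pipe / t_local:.0f}x")
```

The pipe path pays for two system calls and two context switches per
request even though the child does almost no work, which is the kind of
fixed latency a binary protocol over IPC would still have to carry.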

William

