Re: [sage-devel] Fwd: [ODK participants] Blog post on fast multivariate arithmetic
It seems like you've done a good job with it anyway. Thanks for the description.

Bill.

On Thursday, 13 July 2017 20:46:21 UTC+2, bluescarni wrote:

> mp++ also uses a small value optimisation. The number of limbs that can be stored without dynamic memory allocation can be selected at compile time, and the benchmarks on the website are done using 1 limb (64 bits) of static storage.
>
> I can think of a few things that might positively influence mp++'s performance with respect to FLINT:
>
> - Inlining (as mp++ is header-only) avoids the overhead of function calls and may allow the compiler to optimise better.
> - mp++ does not do automatic downsizing: once you are using dynamic storage, it is up to you to demote to a static integer. I would imagine this saves a few branches with respect to FLINT.
> - I spent a lot of time tinkering with the add/sub/mul code, trying to find the code flow/layout that would squeeze out the best performance for small operands. Maybe I just got lucky with a way of arranging the code that is particularly friendly to GCC; I don't really know, as I am not a low-level/assembly type of guy. I just tried many different variations and picked the one that performed best.
>
> Cheers,
>
> Francesco.
Re: [sage-devel] Fwd: [ODK participants] Blog post on fast multivariate arithmetic
mp++ also uses a small value optimisation. The number of limbs that can be stored without dynamic memory allocation can be selected at compile time, and the benchmarks on the website are done using 1 limb (64 bits) of static storage.

I can think of a few things that might positively influence mp++'s performance with respect to FLINT:

- Inlining (as mp++ is header-only) avoids the overhead of function calls and may allow the compiler to optimise better.
- mp++ does not do automatic downsizing: once you are using dynamic storage, it is up to you to demote to a static integer. I would imagine this saves a few branches with respect to FLINT.
- I spent a lot of time tinkering with the add/sub/mul code, trying to find the code flow/layout that would squeeze out the best performance for small operands. Maybe I just got lucky with a way of arranging the code that is particularly friendly to GCC; I don't really know, as I am not a low-level/assembly type of guy. I just tried many different variations and picked the one that performed best.

Cheers,

Francesco.

On 13 July 2017 at 12:25, 'Bill Hart' via sage-devel <sage-devel@googlegroups.com> wrote:

> So why is it faster than Flint, say? Except for the overhead in the Flint fmpz type, which uses a single word initially and only upgrades to an mpz_t on overflow, it should currently be doing more allocations than Flint. And Flint should be faster for something like a dot product, especially if the integers are all small, since it never actually allocates mpz_t's in that case. What is the new innovation?
>
> Bill.
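[Editorial note] The small value optimisation described in this message can be sketched in a few lines of C++. This is an illustrative toy, not mp++'s actual data layout: one 64-bit limb of static storage, promotion to dynamic storage on overflow, and no automatic demotion afterwards. The overflow checks use the GCC/Clang `__builtin_add_overflow` builtin.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative sketch only (not mp++'s real representation): an integer with
// one 64-bit limb of static storage that promotes to dynamic storage on
// overflow and, as described above, never demotes automatically.
struct small_int {
    bool is_static = true;
    std::uint64_t limb = 0;            // used while is_static is true
    std::vector<std::uint64_t> limbs;  // little-endian limbs after promotion
};

// Add an unsigned word into x; promote on carry out of the static limb.
void add_word(small_int &x, std::uint64_t w) {
    if (x.is_static) {
        std::uint64_t sum;
        if (!__builtin_add_overflow(x.limb, w, &sum)) {
            x.limb = sum;              // fast path: no allocation at all
            return;
        }
        x.limbs = {x.limb};            // slow path: switch to dynamic storage
        x.is_static = false;
    }
    // multi-limb add with carry propagation
    std::uint64_t carry = w;
    for (auto &l : x.limbs) {
        std::uint64_t sum;
        bool c = __builtin_add_overflow(l, carry, &sum);
        l = sum;
        if (!c) return;
        carry = 1;
    }
    x.limbs.push_back(1);              // carry out of the top limb
}
```

Note how the fast path touches only the inline limb and a flag, which is the kind of branch-light layout the message attributes mp++'s small-operand speed to.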
Re: [sage-devel] Fwd: [ODK participants] Blog post on fast multivariate arithmetic
So why is it faster than Flint, say? Except for the overhead in the Flint fmpz type, which uses a single word initially and only upgrades to an mpz_t on overflow, it should currently be doing more allocations than Flint. And Flint should be faster for something like a dot product, especially if the integers are all small, since it never actually allocates mpz_t's in that case. What is the new innovation?

Bill.

On Wednesday, 12 July 2017 16:00:16 UTC+2, bluescarni wrote:

> In the benchmarks I use the C++ interfaces of FLINT and Boost.Multiprecision only for ease of initialization/destruction. The bulk of the operations is performed directly using the C API of FLINT and GMP. mp++ itself has some moderate template metaprogramming in place, but it is currently lacking expression-template support (unlike fmpzxx), for instance; the focus at the moment is on fast low-level primitives (add/sub/mul/addmul, etc.).
>
> Cheers,
>
> Francesco.
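[Editorial note] Bill's point about small-operand dot products can be illustrated with a rough sketch (not Flint's actual fmpz code): as long as every partial result fits in one machine word, the whole dot product completes with zero heap allocations, and only an overflow would force promotion to multiprecision storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-word-first dot product. While every product and partial
// sum fits in a signed 64-bit word, no allocation happens at all. On overflow
// we would need to promote to a multi-word integer; here we just report
// whether the fast path sufficed.
bool dot_small(const std::vector<std::int64_t> &a,
               const std::vector<std::int64_t> &b,
               std::int64_t &out) {
    std::int64_t acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::int64_t prod;
        if (__builtin_mul_overflow(a[i], b[i], &prod)) return false; // would promote
        if (__builtin_add_overflow(acc, prod, &acc)) return false;   // would promote
    }
    out = acc;
    return true; // entire dot product stayed in one word: zero allocations
}
```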
Re: [sage-devel] Fwd: [ODK participants] Blog post on fast multivariate arithmetic
In the benchmarks I use the C++ interfaces of FLINT and Boost.Multiprecision only for ease of initialization/destruction. The bulk of the operations is performed directly using the C API of FLINT and GMP. mp++ itself has some moderate template metaprogramming in place, but it is currently lacking expression-template support (unlike fmpzxx), for instance; the focus at the moment is on fast low-level primitives (add/sub/mul/addmul, etc.).

Cheers,

Francesco.

On 12 July 2017 at 15:13, 'Bill Hart' via sage-devel <sage-devel@googlegroups.com> wrote:

> Beware: Bernard Parisse has just helped me track down why the Flint timings for the sparse "division only" benchmark looked so ridiculously low. It turns out that, due to an accident of interfacing between Nemo and Flint, it was using reflected lexicographical ordering instead of true lexicographical ordering. If I had labelled them "exact division" instead of "quotient only", and not included the x^(n - 3) term in the benchmark itself, the timings could be considered correct (though Giac would also have been able to do the computation much faster in that case). But unfortunately, this discovery means I had to change the timings for Flint for that benchmark. It is now correct on the blog.
>
> The timings for mp++ are really good. I'm surprised you beat the Flint timings there, since we use pretty sophisticated templating in our C++ interface. But clearly there are tricks we missed!
>
> Bill.
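[Editorial note] For readers unfamiliar with the expression templates mentioned in this message, here is a deliberately tiny sketch of the technique (a toy, far simpler than fmpzxx's real design): `a + b + c` builds a lightweight expression object, and the whole sum is evaluated once on assignment instead of materialising a temporary integer per `+`.

```cpp
#include <cassert>

// Toy expression-template sketch. Sum<L, R> holds references to its operands
// and defers the arithmetic until eval() is called.
struct Int;

template <class L, class R>
struct Sum {
    const L &lhs;
    const R &rhs;
    long eval() const { return lhs.eval() + rhs.eval(); }
};

struct Int {
    long value = 0;
    long eval() const { return value; }
    // Assigning an expression evaluates the whole tree exactly once.
    template <class L, class R>
    Int &operator=(const Sum<L, R> &e) {
        value = e.eval();
        return *this;
    }
};

// Building blocks: Int + Int and Sum + Int both produce expression nodes,
// not evaluated integers.
inline Sum<Int, Int> operator+(const Int &a, const Int &b) { return {a, b}; }
template <class L, class R>
Sum<Sum<L, R>, Int> operator+(const Sum<L, R> &s, const Int &b) { return {s, b}; }
```

With plain operator overloading, `a + b + c` would construct an intermediate integer for `a + b`; with heap-backed multiprecision types, that intermediate means an extra allocation, which is exactly what expression templates avoid.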
Re: [sage-devel] Fwd: [ODK participants] Blog post on fast multivariate arithmetic
Beware: Bernard Parisse has just helped me track down why the Flint timings for the sparse "division only" benchmark looked so ridiculously low. It turns out that, due to an accident of interfacing between Nemo and Flint, it was using reflected lexicographical ordering instead of true lexicographical ordering. If I had labelled them "exact division" instead of "quotient only", and not included the x^(n - 3) term in the benchmark itself, the timings could be considered correct (though Giac would also have been able to do the computation much faster in that case). But unfortunately, this discovery means I had to change the timings for Flint for that benchmark. It is now correct on the blog.

The timings for mp++ are really good. I'm surprised you beat the Flint timings there, since we use pretty sophisticated templating in our C++ interface. But clearly there are tricks we missed!

Bill.

On Wednesday, 12 July 2017 12:16:33 UTC+2, bluescarni wrote:

> Interesting timings; they give me some motivation to revisit the dense multiplication algorithm in piranha :)
>
> As an aside (and apologies if this is a slight thread hijack), I have been spending some time in the last few weeks decoupling the multiprecision arithmetic bits from piranha into its own project, called mp++:
>
> https://github.com/bluescarni/mppp
>
> So far I have extracted the integer and rational classes, and I am currently working on the real class (arbitrary-precision FP).
>
> Cheers,
>
> Francesco.
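[Editorial note] The ordering mix-up described in this message is easy to reproduce in miniature. The sketch below is illustrative (not Nemo/Flint code) and assumes "reflected" lex means lexicographic comparison of the exponent vector read from the last variable backwards:

```cpp
#include <array>
#include <cassert>

// Exponent vector (e_x, e_y, e_z) of a monomial in three variables.
using expv = std::array<int, 3>;

// True lexicographic order: compare x's exponent first, then y's, then z's.
bool lex_less(const expv &a, const expv &b) {
    for (int i = 0; i < 3; ++i)
        if (a[i] != b[i]) return a[i] < b[i];
    return false;
}

// Assumed "reflected" lexicographic order: compare z's exponent first.
bool reflected_lex_less(const expv &a, const expv &b) {
    for (int i = 2; i >= 0; --i)
        if (a[i] != b[i]) return a[i] < b[i];
    return false;
}
```

The two orders can genuinely disagree: with a = x*y^2 (exponents {1, 2, 0}) and b = y^3 (exponents {0, 3, 0}), true lex ranks b below a, while reflected lex ranks a below b. A division routine that assumes terms are sorted in one order while they are actually sorted in the other can thus run on a different (easier) problem, which is how the benchmark timings came out misleadingly low.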
Re: [sage-devel] Fwd: [ODK participants] Blog post on fast multivariate arithmetic
Interesting timings; they give me some motivation to revisit the dense multiplication algorithm in piranha :)

As an aside (and apologies if this is a slight thread hijack), I have been spending some time in the last few weeks decoupling the multiprecision arithmetic bits from piranha into its own project, called mp++:

https://github.com/bluescarni/mppp

So far I have extracted the integer and rational classes, and I am currently working on the real class (arbitrary-precision FP).

Cheers,

Francesco.

On 9 July 2017 at 17:39, William Stein wrote:

> New blog post from Bill Hart. It includes a section claiming Sage's multivariate polynomial arithmetic speed is much worse than expected...
>
> ---------- Forwarded message ----------
> From: Bill Hart
> Date: Sun, Jul 9, 2017 at 8:34 AM
> Subject: [ODK participants] Blog post on fast multivariate arithmetic
> To: Opendreamkit participants
>
> Dear all,
>
> I've written a blog post on the fast multivariate arithmetic we've been doing for ODK [1]. This is a deliverable which is due at the end of the four years, so we've got a long way to go, but it is progressing nicely in the interim.
>
> The next stage is to parallelise this, and we've hired Daniel Schultz to work on our ODK deliverable, which will do precisely this. He's an experienced developer from the US who was already writing his own CAS (in assembly language, believe it or not!).
>
> Bill.
>
> [1] https://wbhart.blogspot.de/2017/07/update-on-fast-multivariate-polynomial.html
>
> --
> -- William Stein
[sage-devel] Fwd: [ODK participants] Blog post on fast multivariate arithmetic
New blog post from Bill Hart. It includes a section claiming Sage's multivariate polynomial arithmetic speed is much worse than expected...

---------- Forwarded message ----------
From: Bill Hart
Date: Sun, Jul 9, 2017 at 8:34 AM
Subject: [ODK participants] Blog post on fast multivariate arithmetic
To: Opendreamkit participants

Dear all,

I've written a blog post on the fast multivariate arithmetic we've been doing for ODK [1]. This is a deliverable which is due at the end of the four years, so we've got a long way to go, but it is progressing nicely in the interim.

The next stage is to parallelise this, and we've hired Daniel Schultz to work on our ODK deliverable, which will do precisely this. He's an experienced developer from the US who was already writing his own CAS (in assembly language, believe it or not!).

Bill.

[1] https://wbhart.blogspot.de/2017/07/update-on-fast-multivariate-polynomial.html

--
-- William Stein

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+unsubscr...@googlegroups.com.
To post to this group, send email to sage-devel@googlegroups.com.
Visit this group at https://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.