Hi, just a quick update: assisted lookup is significantly faster than Gamma#logGamma when the STEP parameter is about 8 or less. So, the implementation variants seem to be the following: 1. leave one static lookup-based version of the method; requires fixing some arbitrary constants and introducing increased memory footprint even for those users who do not require fast factorial; 2. create a helper class on demand; requires more work on the part of the user (not to mention knowing where to look for it), but provides a more flexible solution.
Sincerely, Aleksei Dievskii. On Thu, Nov 19, 2015 at 12:29 PM, Alexey Dievsky <[email protected]> wrote: > Hi Gilles, > > thanks for your response. I've run a small JMH benchmark of both old > (direct summation) and new (default 16KB tabulation) approaches, and here > are the results: > > Benchmark Mode Samples > Score Error Units > o.a.c.m.s.FactorialLogBench.newFactorialLog100 thrpt 20 > 75066.697 ± 7122.980 ops/ms > o.a.c.m.s.FactorialLogBench.newFactorialLog1000 thrpt 20 > 76905.243 ± 5965.782 ops/ms > o.a.c.m.s.FactorialLogBench.newFactorialLog10K thrpt 20 > 5770.486 ± 285.069 ops/ms > o.a.c.m.s.FactorialLogBench.newFactorialLog100K thrpt 20 > 6156.102 ± 278.842 ops/ms > o.a.c.m.s.FactorialLogBench.oldFactorialLog100 thrpt 20 > 585.576 ± 26.802 ops/ms > o.a.c.m.s.FactorialLogBench.oldFactorialLog1000 thrpt 20 > 29.883 ± 1.663 ops/ms > o.a.c.m.s.FactorialLogBench.oldFactorialLog10K thrpt 20 > 2.744 ± 0.095 ops/ms > o.a.c.m.s.FactorialLogBench.oldFactorialLog100K thrpt 20 > 0.264 ± 0.007 ops/ms > > Here *100 calculates the factorial of a random number in range [0..99], > *1000 -- of a number in range [900..999], *10K -- of a number in range > [9900..9999], and *100K -- of a number in range [99900..99999]. The default > tabulated version covers the first two cases with a direct lookup, the > third one with a use-the-closest-stored-value approach (about two logs, two > multiplications and two additions on average), and the fourth one delegates > to logGamma. > As you can see, the tabulated (new) approach is about 130x faster on small > numbers and the difference grows drastically from there on, up to about > 23000x on fairly big numbers. > > Another interesting result is that logGamma doesn't seem to be slower than > the assisted lookup, which possibly indicates that we can drop the assisted > lookup altogether, reducing the memory footprint. > > Sincerely, > Aleksei Dievskii. > > On Thu, Nov 19, 2015 at 11:18 AM, Gilles <[email protected]> > wrote: > >> Hello. >> >> On Thu, 19 Nov 2015 10:34:46 +0100, Alexey Dievsky wrote: >> >>> Hi, >>> >>> I recently submitted an issue to JIRA ( >>> https://issues.apache.org/jira/browse/MATH-1293) and received an advice >>> to >>> discuss it here. >>> >>> In short, the current implementation of CombinatoricsUtils#factorialLog >>> employs direct summation and thus has linear complexity (O(n) additions >>> and >>> O(n) logs). I suggest replacing it with a tabulated lookup. This solution >>> will need to allocate some additional memory, but the speed-up for >>> larger n >>> would be tremendous. For the values that are too large to be looked up, >>> equivalent Gamma#logGamma is still much faster than the existing >>> algorithm. >>> >>> I attached a proposed patch to the issue. However, there are a lot of >>> details that need to be worked out, first and foremost whether such >>> optimization is at all worth it. >>> >> >> According to your statement above (significantly improved performance), >> there >> is actually no doubt that the feature should be offered to CM users.[1][2] >> The main question is whether a single speed vs memory (vs possibly >> increased >> initialization time) trade-off is always acceptable. >> >> Hence at first sight, I'd prefer to provide a factory to create an >> object (that will compute the factorial) with a user-defined trade-off. >> >> Regards, >> Gilles >> >> [1] It would also be nice if you could provide a benchmark. >> [2] A real-life use-case would also be nice to enhance the CM userguide. >> >> Thus I ask for your opinion. Please let me >>> know what think. >>> >>> Sincerely, >>> Aleksei Dievskii. >>> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [email protected] >> For additional commands, e-mail: [email protected] >> >> >
