[
https://issues.apache.org/jira/browse/MATH-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211419#comment-14211419
]
Adam Stelmaszczyk edited comment on MATH-1169 at 11/13/14 11:04 PM:
--------------------------------------------------------------------
Main class should be run, it needs Caliper JAR:
https://code.google.com/p/caliper/downloads/detail?name=caliper-1.0-beta-1-all.jar
Every call to evaluate() I'm creating a new double[] array filled with
different random numbers.
FloydRivest contatins implementation of algorithm basing on pseudocode from its
Wikipedia page.
PercentileFloydRivest.java is Percentile.java with modified evaluate(), so it
calls FloydRivest.select() instead of typical select(). pivotsHeap was removed
for simplicity (maybe it can improve efficiency even more).
was (Author: adams):
Main class should be run, it needs Caliper JAR:
https://code.google.com/p/caliper/downloads/detail?name=caliper-1.0-beta-1-all.jar
Every call to evaluate() I'm creating a new double[] array filled with
different random numbers.
FloydRivest contatins implementation basing on
http://en.wikipedia.org/wiki/Floyd%E2%80%93Rivest_algorithm pseudocode.
PercentileFloydRivest.java is Percentile.java with modified evaluate(), so it
calls FloydRivest.select() instead of typical select(). pivotsHeap was removed
for simplicity (maybe it can improve efficiency even more).
> Faster Percentile
> -----------------
>
> Key: MATH-1169
> URL: https://issues.apache.org/jira/browse/MATH-1169
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Adam Stelmaszczyk
> Priority: Minor
> Attachments: FloydRivest.java, Main.java, PercentileFloydRivest.java
>
>
> Percentile class is now using Hoare selection algorithm (aka
> [Quickselect|http://en.wikipedia.org/wiki/Quickselect]) with median of 3 for
> choosing a pivot and caching pivots. Quickselect expected runtime is about
> 3.4N + O(N).
> The constant can be improved to 1.5n by different pivot strategy involving
> sampling, yielding the [Floyd–Rivest
> algorithm|http://en.wikipedia.org/wiki/Floyd%E2%80%93Rivest_algorithm], which
> has average complexity of 1.5N + O(N) for median, with other k being faster.
> I implemented Floyd–Rivest algorithm without caching pivots following
> Wikipedia and benchmarked it with
> [Caliper|https://code.google.com/p/caliper/].
> Array size - runtime in seconds for Floyd-Rivest - runtime for current
> Percentile - % faster
> 100000 - 6.907 - 7.083 - 2.5%
> [link|https://microbenchmarks.appspot.com/runs/a0d947ee-57fc-4636-b687-b4bc5170f1d7#r:scenario.benchmarkSpec.methodName]
> 1000000 - 70.3 - 75.4 - 6.8%
> [link|https://microbenchmarks.appspot.com/runs/77291c2e-6dbb-4666-bf37-c174e4b53f2e#r:scenario.benchmarkSpec.methodName]
> 10000000 - 708 - 868 - 18.4%
> [link|https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName]
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)