[
https://issues.apache.org/jira/browse/MATH-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adam Stelmaszczyk updated MATH-1169:
------------------------------------
Description:
Percentile class is now using Hoare selection algorithm (aka Quickselect) with
median of 3 for choosing a pivot and caching pivots. Quickselect expected
runtime is about 3.4N + o(N).
The constant can be improved to 1.5n by a bit complicated pivot strategy,
yielding the Floyd–Rivest algorithm, which has average complexity of 1.5N +
O(N) for median, with other k being faster.
I implemented Floyd–Rivest algorithm without caching pivots following wikipedia
and benchmarked it with Caliper (https://code.google.com/p/caliper/).
Array size - runtime in seconds for Floyd-Rivest - current Percentile - % faster
100000 - 6.907 - 7.083 - 2.5%
1000000 - 70.3 - 75.4 - 6.8%
10000000 - 708 - 868 - 18.4%
(https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName)
was:
Percentile class is now using Hoare selection algorithm (aka Quickselect) with
median of 3 for choosing a pivot and caching pivots. Quickselect expected
runtime is about 3.4n + o(n).
The constant can be improved to 1.5n by a bit complicated pivot strategy,
yielding the Floyd–Rivest algorithm, which has average complexity of 1.5n +
O(n^0.5) for median, with other k being faster.
I implemented Floyd–Rivest algorithm without caching pivots following wikipedia
and benchmarked it with Caliper (https://code.google.com/p/caliper/).
Array size - runtime in seconds for Floyd-Rivest - current Percentile - % faster
100000 - 6.907 - 7.083 - 2.5%
1000000 - 70.3 - 75.4 - 6.8%
10000000 - 708 - 868 - 18.4%
(https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName)
> Faster Percentile
> -----------------
>
> Key: MATH-1169
> URL: https://issues.apache.org/jira/browse/MATH-1169
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Adam Stelmaszczyk
> Priority: Minor
> Attachments: FloydRivest.java, Main.java, PercentileFloydRivest.java
>
>
> Percentile class is now using Hoare selection algorithm (aka Quickselect)
> with median of 3 for choosing a pivot and caching pivots. Quickselect
> expected runtime is about 3.4N + o(N).
> The constant can be improved to 1.5n by a bit complicated pivot strategy,
> yielding the Floyd–Rivest algorithm, which has average complexity of 1.5N +
> O(N) for median, with other k being faster.
> I implemented Floyd–Rivest algorithm without caching pivots following
> wikipedia and benchmarked it with Caliper
> (https://code.google.com/p/caliper/).
> Array size - runtime in seconds for Floyd-Rivest - current Percentile - %
> faster
> 100000 - 6.907 - 7.083 - 2.5%
> 1000000 - 70.3 - 75.4 - 6.8%
> 10000000 - 708 - 868 - 18.4%
> (https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)