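The exact median of a dataset can't be calculated in a distributed fashion:
unlike a sum or an average, it can't be derived by combining partial results
computed independently on each part of the data, so no aggregate function
exists for it.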
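
As a rough illustration of the difference (a sketch only, in plain XQuery with
made-up numbers; local:median is a hypothetical helper, not a MarkLogic
built-in, and the two sequences merely stand in for two independent parts of
the data):

```xquery
xquery version "1.0";

(: Exact median of an in-memory sequence. Hypothetical helper for illustration. :)
declare function local:median($values as xs:double*) as xs:double?
{
  let $sorted := for $v in $values order by $v return $v
  let $n := fn:count($sorted)
  return
    if ($n eq 0) then ()
    else if ($n mod 2 eq 1) then $sorted[($n + 1) idiv 2]
    else ($sorted[$n idiv 2] + $sorted[$n idiv 2 + 1]) div 2
};

(: Two made-up partitions of elapsed times, in seconds. :)
let $a := (1, 2, 9)
let $b := (3, 4, 5, 6, 7)
return (
  (: Average: per-partition (sum, count) pairs merge into the exact answer. :)
  (fn:sum($a) + fn:sum($b)) div (fn:count($a) + fn:count($b)),
  fn:avg(($a, $b)),
  (: Median: combining the per-partition medians misses the true median. :)
  local:median((local:median($a), local:median($b))),
  local:median(($a, $b))
)
```

The first two results agree (4.625 both ways), while the last two do not: the
median of the two partition medians is 3.5, but the median of the combined
data is 4.5. Getting the exact value needs the values themselves, not just a
fixed-size summary from each part.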
John
On 25/08/2017 14:15, Eliot Kimber wrote:
> I’m upgrading my profiling system to use cts:math functions for doing math on
> large numbers of durations—this speeds things up tremendously of course.
>
> However, there doesn’t appear to be a median-aggregate() function in ML 8 or
> ML 9, only cts:median(), which operates on a sequence of doubles.
>
> For example, for a range index of type xs:dayTimeDuration I can do:
>
> let $average :=
>   cts:avg-aggregate(
>     cts:element-reference(xs:QName("prof:overall-elapsed")),
>     ("item-frequency"),
>     cts:collection-query(epf:get-trial-collection($trial-number)))
>
> But to get the equivalent median the only solution I’m seeing is to convert
> all the durations to doubles and then take the median, which is very slow.
>
> At least in my data set, the median is a better measure of overall
> performance than average because I have a small number of very slow outliers,
> so I really need both median and average.
>
> This seems like an obvious oversight in the cts:math package; am I missing a
> solution?
>
> Thanks,
>
> Eliot
>
> --
> Eliot Kimber
> http://contrext.com
>
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
--
John Snelson, Principal Engineer http://twitter.com/jpcs
MarkLogic Corporation http://www.marklogic.com