[
https://issues.apache.org/jira/browse/LUCENE-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Doron Cohen updated LUCENE-3501:
--------------------------------
Attachment: LUCENE-3501.patch
Before applying this patch should do:
{noformat}
svn mv modules/facet/src/java/org/apache/lucene/facet/util/RandomSample.java
modules/facet/src/java/org/apache/lucene/facet/search/sampling/RepeatableSampler.java
{noformat}
I looked at this, and also discussed with Gilad, and it seems that:
* The test is broken as it claims to do N trials in case of failure but it does
not, because its try/catch does not catch AssertionError, and so only one trial
is attempted. (Few trials make sense because with sampling, there is always a
possibility that the selected sample set of docs would not contain the
"correct" best facets even with a high over sampling ratio (over sampling means
that for the selected set of docs more best facets are collected).
* Even after fixing the test to actually try more than once, it still fails,
because there is no randomness in RandomSample... surprising but true.
In this patch:
* Sampler made an abstract class.
* RandomSample renamed to RepeatableSampler which extends RandomSampler.
* RandomSampler was added - it too extends Sampler - this is a simple *random*
implementation, which is now the default (used by default in
SamplingWrapperAccumulator).
* The test randomly selects between the two sampler implementations. If you
want to see the behavior that created the bug, remove that latter randomness by
setting to false the variable *useRandomSampler* of
*BaseSampleTestTopK.testCountUsingSamping()*.
I think this is ready to commit.
Wasn't sure though, where should the Changes entry go?
> random sampler is not random (and so facet SamplingWrapperTest occasionally
> fails)
> ----------------------------------------------------------------------------------
>
> Key: LUCENE-3501
> URL: https://issues.apache.org/jira/browse/LUCENE-3501
> Project: Lucene - Java
> Issue Type: Bug
> Components: modules/facet
> Reporter: Doron Cohen
> Assignee: Doron Cohen
> Priority: Minor
> Attachments: LUCENE-3501.patch
>
>
> RandomSample is not random at all:
> It does not even import java.util.Random, and its behavior is deterministic.
> in addition, the test testCountUsingSamping() never retries as it was
> supposed to (for taking care of the hoped-for randomness).
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]