[ https://issues.apache.org/jira/browse/OAK-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483334#comment-14483334 ]
Florin Iordache commented on OAK-2725: -------------------------------------- cc: [~tmueller] > Wrong indexed query estimates exceed more than double the actual index entries > ------------------------------------------------------------------------------ > > Key: OAK-2725 > URL: https://issues.apache.org/jira/browse/OAK-2725 > Project: Jackrabbit Oak > Issue Type: Bug > Components: query > Affects Versions: 1.1.8 > Reporter: Florin Iordache > Priority: Critical > Fix For: 1.2 > > Attachments: OAK-2725-test.patch > > > The {{ApproximateCounter.adjustCountSync}} public method that is used by the > indexing engine will sometimes produce very unrealistic cost estimates. > The problem is that it can produce an estimated cost that exceeds the > estimated cost of the full traversal query, thus causing the index to be > bypassed altogether, resulting in a full traversal rather than the use of the > existing index. > Problem resides in the way the property counts are updated: > * The count property update goes through if two randoms are not zero: > random(100) and random({1, 2, 4, 8, 16, ...}). > * Same static pseudo random generator for all invocations. > Even if #1 might seem improbable, it is statistically possible to reach a > very high count with only a handful of invocations. > In practice I've found that running 100 tests with 1000 invocations if the > adjustCountSync method will yield costs exceeding value 2000 in 4-10% of the > tests. Attaching a patch for {{ApproximateCounterTest}} with this test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)