[jira] [Created] (OAK-1892) OrderedIndexConcurrentClusterIT takes too long

2014-06-16 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-1892:
-

 Summary: OrderedIndexConcurrentClusterIT takes too long
 Key: OAK-1892
 URL: https://issues.apache.org/jira/browse/OAK-1892
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: jcr
 Environment: trunk and 1.0 branch
Reporter: Marcel Reutegger
Assignee: Davide Giannella


The OrderedIndexConcurrentClusterIT takes too long and times out on Travis CI. See 
e.g. https://travis-ci.org/apache/jackrabbit-oak/builds/27445383



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OAK-1892) OrderedIndexConcurrentClusterIT takes too long

2014-06-16 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032156#comment-14032156
 ] 

Marcel Reutegger commented on OAK-1892:
---

Disabled test for now in http://svn.apache.org/r1602809



[jira] [Updated] (OAK-1788) ConcurrentConflictTest fails occasionally

2014-06-16 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-1788:
--

Fix Version/s: 1.0.1

Merged into 1.0 branch in http://svn.apache.org/r1602810

 ConcurrentConflictTest fails occasionally
 -

 Key: OAK-1788
 URL: https://issues.apache.org/jira/browse/OAK-1788
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Affects Versions: 1.0
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
Priority: Minor
  Labels: concurrency
 Fix For: 1.0.1, 1.1


 Occurs every now and then on buildbot. E.g.:
 http://ci.apache.org/builders/oak-trunk-win7/builds/16





[jira] [Commented] (OAK-1892) OrderedIndexConcurrentClusterIT takes too long

2014-06-16 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032162#comment-14032162
 ] 

Marcel Reutegger commented on OAK-1892:
---

Also disabled in 1.0 branch: http://svn.apache.org/r1602811



[jira] [Created] (OAK-1893) MBean to dump Lucene Index content and related stats

2014-06-16 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-1893:


 Summary: MBean to dump Lucene Index content and related stats
 Key: OAK-1893
 URL: https://issues.apache.org/jira/browse/OAK-1893
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: oak-lucene
Affects Versions: 1.0
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Priority: Minor
 Fix For: 1.1


Currently the Lucene index is stored within the NodeStore as content. To enable 
debugging and a better understanding of the Lucene index content, it would be helpful 
to provide a JMX bean that can dump the index content to the filesystem.





[jira] [Commented] (OAK-1893) MBean to dump Lucene Index content and related stats

2014-06-16 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032380#comment-14032380
 ] 

Chetan Mehrotra commented on OAK-1893:
--

Implemented with http://svn.apache.org/r1602853

[~alex.parvulescu], [~mmarth] Can we include this in the 1.0 branch? It would be 
helpful for debugging and understanding Lucene issues in customer deployments.



[jira] [Updated] (OAK-1877) Hourly async reindexing on an idle instance

2014-06-16 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated OAK-1877:
-

Attachment: updates-without-indexed-changes.patch

There's a regression from the latest changes: content changes that are not 
indexed generate a new checkpoint but fail to link it properly from the 
'async' reference, resulting in a large number of warning messages because of 
the missing initial checkpoint.

I attached a proposed patch with some updated tests that also check the 'async' 
reference to the checkpoints.

[~jukkaz] can you take a look?

 Hourly async reindexing on an idle instance
 ---

 Key: OAK-1877
 URL: https://issues.apache.org/jira/browse/OAK-1877
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core
Affects Versions: 1.0
Reporter: Jukka Zitting
Assignee: Jukka Zitting
Priority: Critical
 Fix For: 1.0.1, 1.1

 Attachments: updates-without-indexed-changes.patch


 OAK-1292 introduced the following interesting but not very nice behavior:
 On an idle system with no changes for an extended amount of time, the 
 OAK-1292 change blocks the async indexer from updating the reference to the 
 last indexed checkpoint. After one hour (the default checkpoint lifetime), 
 the referenced checkpoint will expire, and the indexer will fall back to full 
 reindexing.
 The result of this behavior is that once every hour, the size of an idle 
 instance will grow with dozens or hundreds of megabytes of new index data 
 generated by reindexing. Older index data becomes garbage, but the compaction 
 code from OAK-1804 is needed to make it collectable. A better solution would 
 be to prevent the reindexing from happening in the first place.





[jira] [Reopened] (OAK-1877) Hourly async reindexing on an idle instance

2014-06-16 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu reopened OAK-1877:
--




[jira] [Resolved] (OAK-1877) Hourly async reindexing on an idle instance

2014-06-16 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu resolved OAK-1877.
--

Resolution: Fixed

I changed the patch a bit after talking to Jukka: we've decided to keep the latest 
checkpoint and simply update the reference to it, instead of keeping the old one 
(good for GC reasons too).

Marking as fixed: on trunk with rev http://svn.apache.org/r1602872, on 1.0 with 
rev 1602874.
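
The effect of always moving the reference to the latest checkpoint can be illustrated with a small simulation. This is only a sketch in Python, not Oak's Java code: the CheckpointStore class, the run_indexer function and the timing values (a 60-minute lifetime, one cycle every 30 minutes) are assumptions made for the example, and the actual change in the revisions above may differ.

```python
class CheckpointStore:
    """Hypothetical checkpoint store with a fixed lifetime (in minutes)."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.expiry = {}  # checkpoint id -> expiry time
        self.next_id = 0

    def create(self, now):
        self.next_id += 1
        cp = f"cp{self.next_id}"
        self.expiry[cp] = now + self.lifetime
        return cp

    def is_valid(self, cp, now):
        return cp in self.expiry and now < self.expiry[cp]

    def release(self, cp):
        self.expiry.pop(cp, None)


def run_indexer(store, ref, now, always_update_ref):
    """One async indexing cycle on an idle system; returns (new_ref, reindexed)."""
    # A missing or expired reference forces a full reindex.
    reindexed = ref is None or not store.is_valid(ref, now)
    latest = store.create(now)
    if reindexed or always_update_ref:
        if ref is not None:
            store.release(ref)  # the old checkpoint is no longer needed
        return latest, reindexed
    store.release(latest)  # nothing was indexed, so the old reference is kept
    return ref, reindexed


def count_reindexes(always_update_ref, cycles=10, interval=30, lifetime=60):
    store = CheckpointStore(lifetime)
    ref, total = None, 0
    for i in range(cycles):
        ref, reindexed = run_indexer(store, ref, i * interval, always_update_ref)
        total += reindexed
    return total
```

In this model the first cycle always counts as a reindex because no reference exists yet. Keeping the old reference leads to a new full reindex each time the checkpoint lifetime elapses, while always updating the reference leaves only the initial one.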



[jira] [Commented] (OAK-1893) MBean to dump Lucene Index content and related stats

2014-06-16 Thread Jukka Zitting (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032475#comment-14032475
 ] 

Jukka Zitting commented on OAK-1893:


-0 I'm not too excited about a remotely accessible feature that can be used to 
write to an arbitrary location in the local file system.

In general I think a low-level feature like this would be better implemented as 
a debugging tool in oak-run, for example as a new console command. Implementing 
it like that would also remove the need to backport the feature to a 
maintenance branch.



[jira] [Commented] (OAK-1892) OrderedIndexConcurrentClusterIT takes too long

2014-06-16 Thread Jukka Zitting (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032493#comment-14032493
 ] 

Jukka Zitting commented on OAK-1892:


BTW, I'm also seeing very slow progress when indexing larger amounts of 
content. Updating the ordered index appears to be much slower than updating the 
Lucene index, including text extraction, which seems troublesome.



[jira] [Created] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Justin Edelson (JIRA)
Justin Edelson created OAK-1894:
---

 Summary: PropertyIndex only considers the cost of a single indexed 
property
 Key: OAK-1894
 URL: https://issues.apache.org/jira/browse/OAK-1894
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: query
Reporter: Justin Edelson


The existing PropertyIndex loops through the PropertyRestriction objects in the 
Filter and essentially only calculates the cost of the first indexed property. 
This isn't necessarily the first property in the query, because 
Filter.propertyRestrictions is a HashMap, whose iteration order is unspecified.

More confusingly, the plan for a query with multiple indexed properties outputs 
*all* indexed properties, even though only the first one is used.

For queries with multiple indexed properties, the cheapest property index 
should be used in all three relevant places: when calculating the cost, when 
executing the query, and when producing the plan.
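
The selection described above can be sketched as follows. This is a minimal illustration in Python rather than Oak's Java code; cheapest_indexed_property, estimate_cost and the cost figures are hypothetical names and values, not part of the PropertyIndex API.

```python
from typing import Callable, Dict, Optional, Tuple


def cheapest_indexed_property(
    restrictions: Dict[str, str],
    estimate_cost: Callable[[str, str], Optional[float]],
) -> Optional[Tuple[str, float]]:
    """Return (property, cost) for the cheapest indexed restriction, or None."""
    best = None
    for name, value in restrictions.items():
        cost = estimate_cost(name, value)
        if cost is None:  # no index covers this property
            continue
        if best is None or cost < best[1]:
            best = (name, cost)
    return best


# Hypothetical per-property cost estimates; "title" has no index.
costs = {"status": 5000.0, "id": 2.0}
restrictions = {"status": "active", "id": "42", "title": "x"}
choice = cheapest_indexed_property(restrictions, lambda name, value: costs.get(name))
```

The same result would then have to be used for the cost, for executing the query and for the plan description, so that all three agree on one property.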





[jira] [Updated] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Justin Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Edelson updated OAK-1894:


Attachment: OAK-1894.patch

Patch which calculates the cheapest property.



[jira] [Comment Edited] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Justin Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032501#comment-14032501
 ] 

Justin Edelson edited comment on OAK-1894 at 6/16/14 3:03 PM:
--

Patch which calculates the cheapest property and uses it. It also outputs 
significantly more diagnostic information so that you can see the cost per 
property (even though only one of those properties, the cheapest, will be used 
for the actual query).

The only thing I dislike about this approach is that the cost calculation can 
happen multiple times per query. However, I don't see a way around this without 
refactoring of the API to allow indexes to return an object from getCost() 
which can then be passed into query() later.


was (Author: justinedelson):
Patch which calculates the cheapest property.



[jira] [Commented] (OAK-1893) MBean to dump Lucene Index content and related stats

2014-06-16 Thread Michael Marth (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032575#comment-14032575
 ] 

Michael Marth commented on OAK-1893:


Agree with the general usefulness of this feature for deployment debugging and 
Jukka's concerns (i.e. different implementation)



[jira] [Commented] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032858#comment-14032858
 ] 

Thomas Mueller commented on OAK-1894:
-

> without refactoring of the API to allow indexes to return an object from 
> getCost() which can then be passed into query() later.

We do have a solution for that: the interface QueryIndex.AdvancedQueryIndex, 
method getPlans. But property indexes don't use that API yet (only ordered 
indexes use it right now). So what you want might be possible, I will check.

But first I will have a look at your patch.
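
The general idea of returning a plan object from the cost phase and handing it back for execution can be sketched like this. It is written in Python for brevity and does not reproduce the real signatures of QueryIndex.AdvancedQueryIndex or getPlans; Plan, PropertyIndexSketch and the cost figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Plan:
    """Carries the decision made during cost estimation."""
    property_name: str
    estimated_cost: float


class PropertyIndexSketch:
    def __init__(self, cost_by_property: Dict[str, float]):
        self.cost_by_property = cost_by_property
        self.cost_calls = 0  # counts how often a cost is computed

    def get_plans(self, restrictions: Dict[str, str]) -> List[Plan]:
        plans = []
        for name in restrictions:
            if name in self.cost_by_property:
                self.cost_calls += 1
                plans.append(Plan(name, self.cost_by_property[name]))
        return plans

    def query(self, plan: Plan, restrictions: Dict[str, str]) -> str:
        # The plan already names the property, so no cost is recomputed here.
        return f"lookup {plan.property_name} = {restrictions[plan.property_name]!r}"


index = PropertyIndexSketch({"status": 5000.0, "id": 2.0})
restrictions = {"status": "active", "id": "42"}
best = min(index.get_plans(restrictions), key=lambda p: p.estimated_cost)
result = index.query(best, restrictions)
```

Here each property's cost is computed once, and the execution step reuses the chosen plan instead of repeating the estimate.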



[jira] [Updated] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-1894:


Fix Version/s: 1.0.2
   1.1



[jira] [Commented] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032911#comment-14032911
 ] 

Thomas Mueller commented on OAK-1894:
-

A workaround for queries that have two property conditions is to use double 
negation as in:

{noformat}
(a) where property1 = 'x' and property2 = 'y'
(b) where property1 = 'x' and not not property2 = 'y'
{noformat}

Those two queries are equivalent (return the same nodes), but query (a) could 
use an index on property2 (if there is one), while query (b) cannot, because 
the query engine doesn't currently optimize the "not not" condition.



[jira] [Commented] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Justin Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032934#comment-14032934
 ] 

Justin Edelson commented on OAK-1894:
-

[~tmueller] I looked briefly at AdvancedQueryIndex, but the JavaDoc made it 
seem like that was only to be used when a query could handle sorting.

Would it make more sense for me to convert PropertyIndex to implement 
AdvancedQueryIndex?



[jira] [Created] (OAK-1895) ClassCastException can occur if the TraversalIndex is cheaper than an OrderedIndex (or a different AdvancedQueryIndex impl)

2014-06-16 Thread Justin Edelson (JIRA)
Justin Edelson created OAK-1895:
---

 Summary: ClassCastException can occur if the TraversalIndex is 
cheaper than an OrderedIndex (or a different AdvancedQueryIndex impl)
 Key: OAK-1895
 URL: https://issues.apache.org/jira/browse/OAK-1895
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: query
Reporter: Justin Edelson
 Fix For: 1.0.2


Because the TraversalIndex is added last, the `bestPlan` variable will be 
non-null if an OrderedIndex was usable for the query. If the TraversalIndex 
ends up being cheaper, then the `bestIndex` variable is set to the 
TraversalIndex, but the `bestPlan` remains set to a non-null value.

Later, in SelectorImpl, the fact that the plan is non-null causes the index to 
be cast to AdvancedQueryIndex which fails with a ClassCastException.
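
The stale-plan situation can be reproduced with a small model. This is a sketch in Python, not the Oak query engine code; Candidate, select_buggy and select_fixed are hypothetical names, and resetting the plan together with the index is only one possible way to keep the two in sync (the change in the actual fix is not shown here).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    cost: float
    plan: Optional[str] = None  # only "advanced" indexes supply a plan


def select_buggy(candidates):
    best_index, best_plan, best_cost = None, None, float("inf")
    for c in candidates:
        if c.cost < best_cost:
            best_cost, best_index = c.cost, c
            if c.plan is not None:
                best_plan = c.plan  # never cleared when a plan-less index wins later
    return best_index, best_plan


def select_fixed(candidates):
    best_index, best_plan, best_cost = None, None, float("inf")
    for c in candidates:
        if c.cost < best_cost:
            best_cost, best_index = c.cost, c
            best_plan = c.plan  # reset together with the index
    return best_index, best_plan


# The traversal candidate comes last and is cheaper than the ordered index.
candidates = [
    Candidate("ordered", 10.0, plan="ordered-plan"),
    Candidate("traversal", 3.0),
]
```

With select_buggy the winner is the traversal candidate but the plan still belongs to the ordered index, which is the mismatch that leads to the cast failure described above. With select_fixed the plan is None whenever the winner has none.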





[jira] [Resolved] (OAK-1895) ClassCastException can occur if the TraversalIndex is cheaper than an OrderedIndex (or a different AdvancedQueryIndex impl)

2014-06-16 Thread Justin Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Edelson resolved OAK-1895.
-

   Resolution: Fixed
Fix Version/s: (was: 1.0.2)
   1.1
 Assignee: Justin Edelson

fixed in r1603010



[jira] [Updated] (OAK-1895) ClassCastException can occur if the TraversalIndex is cheaper than an OrderedIndex (or a different AdvancedQueryIndex impl)

2014-06-16 Thread Justin Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Edelson updated OAK-1895:


Fix Version/s: 1.0.2

and on 1.0 branch in r1603013



[jira] [Updated] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Justin Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Edelson updated OAK-1894:


Attachment: OAK-1894-advanced.diff

Here's a different patch which uses the AdvancedQueryIndex. Related to my email 
to http://markmail.org/message/2amhqmstxcabzyqv, I couldn't see a way of 
actually passing the cheapest property via the IndexPlan, so I added a new 
method to IndexPlan for this.

Agree that more tests are a good idea; will work on that next.

Interestingly, this approach exposed OAK-1895, which I went ahead and fixed on 
trunk and the 1.0 branch.

Also, I figured out why the (rep:excerpt) bit went away. Prior to this change 
(either version), all properties were output in the plan description, with 
non-indexed properties in parentheses. I've now changed that so that only the 
cheapest indexed property is output, but I should probably adjust that to 
output all properties, just indicating which property index was used.



[jira] [Commented] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread Justin Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033309#comment-14033309
 ] 

Justin Edelson commented on OAK-1894:
-

Actually, thinking about this a bit more, there's really no point in outputting 
the property names for indexes which aren't used. Only the used/cheapest 
property should be output.



[jira] [Commented] (OAK-1894) PropertyIndex only considers the cost of a single indexed property

2014-06-16 Thread David Gonzalez (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1403#comment-1403
 ] 

David Gonzalez commented on OAK-1894:
-

[~justinedelson] I think listing all index candidate property names would be 
useful to help understand/make immediately clear 1) if you're missing an index 
for a property and 2) if certain operations (like, not) are preventing a 
property from being resolved to an index.

