[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-03-08 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901161#comment-15901161
 ] 

Vikas Saurabh commented on OAK-5499:


Ok, I think I now understand what's going on (and that's different from my idea 
of how-stuff-works when I logged this issue). About the patch, I think the 
change is useful for full reindexing mode. And afaiu, Chetan's concern isn't 
easy to fit in because the {{before}} state for {{IndexUpdate}} is different from 
the one required by reindexing-editors (EMPTY).

I wonder if it's ok/possible to hook into the EMPTY-vs-CURRENT diff being done for 
reindexing editors to collect the editors too? I mean it does process the normal 
diff - but if it "finds" that it needs to reindex a few in the current one then it 
by-passes the usual calls being sent in until it leaves the sub-tree. In the 
meantime (while it's by-passing calls), it hands over another editor for the 
EMPTY-vs-CURRENT diff to "find" other ones.
(I understand the para above is very hand-wavy - but that's the best I 
could do with my current understanding and words :-/.. also, I wonder how 3 
level indices would behave then).

[~chetanm], would it make sense to get this state in and open another one to 
handle the non-full-reindex case? (assuming doing this completely would take a bit 
more time)

> IndexUpdate can do multiple traversals of a content tree during initial index 
> when there are sub-root indices
> 
>
> Key: OAK-5499
> URL: https://issues.apache.org/jira/browse/OAK-5499
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Vikas Saurabh
> Assignee: Alex Parvulescu
> Priority: Minor
> Fix For: 1.8
>
> Attachments: OAK-5499.patch, OAK-5499-v2-demo.patch, 
> OAK-5499-v2-fix.patch
>
>
> In case we've index defs such as:
> {noformat}
> /oak:index/foo1Index
> /content
>   /oak:index/foo2Index
> {noformat}
> then initial indexing process \[0] would traverse tree under {{/content}} 
> twice - once while indexing for top-level indices and next when it starts to 
> index newly discovered {{foo2Index}} while traversing {{/content/oak:index}}.
> What we can do is that while first diff processes {{/content}} and discovers 
> a node named {{oak:index}}, it can actively go in that tree and peek into 
> index defs from under it and register as required. The diff can then proceed 
> under {{/content}} while the new indices would also get diffs (avoiding 
> another traversal)
> \[0] first time indexing or in case {{/:async}} gets deleted or checkpoint 
> for async index couldn't be retrieved
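
A rough sketch of the cost difference described in the quoted issue text, using a toy in-memory tree. Everything here (the {{Node}} class, the method names, the sample children {{a}} and {{b}}) is hypothetical and is not Oak's {{NodeState}} or {{IndexUpdate}} API; the re-traversal cost model is deliberately simplified to "one extra pass over any non-root subtree that owns an {{oak:index}} child".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: hypothetical classes, not Oak's NodeState/IndexUpdate API. */
class SubRootIndexSketch {

    /** A minimal content node with named children. */
    static final class Node {
        final Map<String, Node> children = new LinkedHashMap<>();

        Node child(String name) {
            return children.computeIfAbsent(name, n -> new Node());
        }
    }

    /** Creates every node along an absolute path such as /content/oak:index/foo2Index. */
    static void addPath(Node root, String path) {
        Node current = root;
        for (String name : path.substring(1).split("/")) {
            current = current.child(name);
        }
    }

    /** Sample layout from the issue text, plus two assumed content nodes a and b. */
    static Node sampleTree() {
        Node root = new Node();
        addPath(root, "/oak:index/foo1Index");
        addPath(root, "/content/oak:index/foo2Index");
        addPath(root, "/content/a");
        addPath(root, "/content/b");
        return root;
    }

    /** Number of nodes in a subtree, i.e. the cost of traversing it once. */
    static int size(Node node) {
        int count = 1;
        for (Node c : node.children.values()) {
            count += size(c);
        }
        return count;
    }

    /** Simplified current behaviour: a sub-root oak:index triggers a second pass over its parent's subtree. */
    static int visitsWithRetraversal(Node node, boolean isRoot) {
        int visits = 1;
        if (!isRoot && node.children.containsKey("oak:index")) {
            visits += size(node); // extra traversal for the newly discovered indices
        }
        for (Node c : node.children.values()) {
            visits += visitsWithRetraversal(c, false);
        }
        return visits;
    }

    /** Proposed behaviour: peek into oak:index on entering a node, register, keep diffing. */
    static int visitsSinglePass(Node node, List<String> registered, String path) {
        Node defs = node.children.get("oak:index");
        if (defs != null) {
            for (String name : defs.children.keySet()) {
                registered.add(path + "/oak:index/" + name);
            }
        }
        int visits = 1;
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            visits += visitsSinglePass(e.getValue(), registered, path + "/" + e.getKey());
        }
        return visits;
    }

    public static void main(String[] args) {
        List<String> registered = new ArrayList<>();
        System.out.println("with re-traversal: " + visitsWithRetraversal(sampleTree(), true));
        System.out.println("single pass:       " + visitsSinglePass(sampleTree(), registered, ""));
        System.out.println("registered:        " + registered);
    }
}
```

In this toy model, the extra cost grows with the size of the subtree under {{/content}}, which is why peeking into {{oak:index}} before descending looks attractive for the initial-index case.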



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-03-08 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901089#comment-15901089
 ] 

Vikas Saurabh commented on OAK-5499:


Assigning to Alex as he's actively working on this. I'd review in a bit, but 
Chetan's concern seems relevant.



[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-03-08 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901082#comment-15901082
 ] 

Chetan Mehrotra commented on OAK-5499:

The approach is neat! However, it only addresses the case of the initial index. Now 
consider a case where we have index definitions at 

{noformat}
/oak:index
  /fooIndex
  /barIndex
/content
  /project
  /oak:index
    /foo2Index
{noformat}

And now we had to reindex fooIndex and foo2Index. With the current approach this 
would result in 2 traversals. So we need a way to handle such cases also.

> IndexUpdate can do multiple traversals of a content tree during initial index 
> when there are sub-root indices
> 
>
> Key: OAK-5499
> URL: https://issues.apache.org/jira/browse/OAK-5499
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Vikas Saurabh
> Assignee: Vikas Saurabh
> Priority: Minor
> Fix For: 1.8
>
> Attachments: OAK-5499.patch, OAK-5499-v2-demo.patch, 
> OAK-5499-v2-fix.patch
>
>
> In case we've index defs such as:
> {noformat}
> /oak:index/foo1Index
> /content
>   /oak:index/foo2Index
> {noformat}
> then initial indexing process \[0] would traverse tree under {{/content}} 
> twice - once while indexing for top-level indices and next when it starts to 
> index newly discovered {{foo2Index}} while traversing {{/content/oak:index}}.
> What we can do is that while first diff processes {{/content}} and discovers 
> a node named {{oak:index}}, it can actively go in that tree and peek into 
> index defs from under it and register as required. The diff can then proceed 
> under {{/content}} while the new indices would also get diffs (avoiding 
> another traversal)
> \[0] first time indexing or in case {{/:async}} gets deleted or checkpoint 
> for async index couldn't be retrieved





[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-03-07 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899615#comment-15899615
 ] 

Alex Parvulescu commented on OAK-5499:

[~chetanm], [~catholicon], [~tmueller] gentle ping, the patch needs some more 
eyes for review!



[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-01-25 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837817#comment-15837817
 ] 

Alex Parvulescu commented on OAK-5499:

bq. Collecting oak:index nodes as part of first reindex traversal would help in 
avoiding further traversal.
It wouldn't though, the traversal would happen anyway because that's how the 
diff currently works. I added a comment on OAK-5511, but I don't see it as 
helping much here.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-01-24 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837160#comment-15837160
 ] 

Chetan Mehrotra commented on OAK-5499:

Also opened OAK-5511 to reduce cost of such index definition "discovery"

> IndexUpdate can do multiple traversals of a content tree during initial index 
> when there are sub-root indices
> 
>
> Key: OAK-5499
> URL: https://issues.apache.org/jira/browse/OAK-5499
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Vikas Saurabh
> Assignee: Vikas Saurabh
> Priority: Minor
> Fix For: 1.8
>
> Attachments: OAK-5499.patch
>
>
> In case we've index defs such as:
> {noformat}
> /oak:index/foo1Index
> /content
>   /oak:index/foo2Index
> {noformat}
> then initial indexing process \[0] would traverse tree under {{/content}} 
> twice - once while indexing for top-level indices and next when it starts to 
> index newly discovered {{foo2Index}} while traversing {{/content/oak:index}}.
> What we can do is that while first diff processes {{/content}} and discovers 
> a node named {{oak:index}}, it can actively go in that tree and peek into 
> index defs from under it and register as required. The diff can then proceed 
> under {{/content}} while the new indices would also get diffs (avoiding 
> another traversal)
> \[0] first time indexing or in case {{/:async}} gets deleted or checkpoint 
> for async index couldn't be retrieved





[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-01-24 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837149#comment-15837149
 ] 

Chetan Mehrotra commented on OAK-5499:

Collecting oak:index nodes as part of the first reindex traversal would help in 
avoiding further traversal. 

Another option would be to use the nodetype index and look for 
oak:QueryIndexDefinition entries. This can then be used to determine the location 
of oak:index nodes in the whole repository.
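
A minimal sketch of the nodetype-index idea, assuming the lookup result is available as a plain map from node type name to the paths of nodes of that type. The map, the class name and the method name are hypothetical stand-ins for illustration and do not reflect how Oak's nodetype index is actually queried.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy model only: the map below stands in for a nodetype index lookup. */
class IndexDefinitionLookupSketch {

    static final String DEFINITION_TYPE = "oak:QueryIndexDefinition";

    /**
     * Derives the oak:index locations from the paths of all index definitions,
     * without traversing any content.
     */
    static SortedSet<String> oakIndexLocations(Map<String, List<String>> nodeTypeIndex) {
        SortedSet<String> locations = new TreeSet<>();
        List<String> definitions =
                nodeTypeIndex.getOrDefault(DEFINITION_TYPE, Collections.<String>emptyList());
        for (String definitionPath : definitions) {
            int lastSlash = definitionPath.lastIndexOf('/');
            if (lastSlash > 0) {
                // The parent of a definition is the oak:index node that holds it.
                locations.add(definitionPath.substring(0, lastSlash));
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        Map<String, List<String>> nodeTypeIndex = new HashMap<>();
        nodeTypeIndex.put(DEFINITION_TYPE, Arrays.asList(
                "/oak:index/fooIndex",
                "/oak:index/barIndex",
                "/content/oak:index/foo2Index"));
        System.out.println(oakIndexLocations(nodeTypeIndex));
    }
}
```

With the locations known up front, all affected indexers could in principle be registered before the diff starts, though this only helps if the nodetype index itself is already up to date.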



[jira] [Commented] (OAK-5499) IndexUpdate can do multiple traversals of a content tree during initial index when there are sub-root indices

2017-01-22 Thread Varun Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833697#comment-15833697
 ] 

Varun Mehrotra commented on OAK-5499:

Thanks [~catholicon] for logging this issue. Can we increase the priority of 
this?

> IndexUpdate can do multiple traversals of a content tree during initial index 
> when there are sub-root indices
> 
>
> Key: OAK-5499
> URL: https://issues.apache.org/jira/browse/OAK-5499
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Vikas Saurabh
> Assignee: Vikas Saurabh
> Priority: Minor
> Fix For: 1.8
>
>
> In case we've index defs such as:
> {noformat}
> /oak:index/foo1Index
> /content
>   /oak:index/foo2Index
> {noformat}
> then initial indexing process \[0] would traverse tree under {{/content}} 
> twice - once while indexing for top-level indices and next when it starts to 
> index newly discovered {{foo2Index}} while traversing {{/content/oak:index}}.
> What we can do is that while first diff processes {{/content}} and discovers 
> a node named {{oak:index}}, it can actively go in that tree and peek into 
> index defs from under it and register as required. The diff can then proceed 
> under {{/content}} while the new indices would also get diffs (avoiding 
> another traversal)
> \[0] first time indexing or in case {{/:async}} gets deleted or checkpoint 
> for async index couldn't be retrieved


