[ 
https://issues.apache.org/jira/browse/OAK-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024584#comment-16024584
 ] 

Chetan Mehrotra edited comment on OAK-5970 at 5/25/17 11:32 AM:
----------------------------------------------------------------

bq.  I think unifying in/excludes after merging all paths from all indices 
might not be correct. e.g.

Interesting observation. Yes, the current implementation would not estimate 
this properly. Thinking more about it, determining the union over multiple 
sets of include/exclude paths appears to be tricky. Will think it over.

Sample 
{noformat}
Notation: [includes | excludes]


Tree
/     - 100 (subtree count)
/var - 70
/var/a - 60 
/var/a/b - 50
/var/a/b/c - 40

Include/Exclude sets for different indexes

Set1
-----
[/ | /var] + 
[/var/a | /var/a/b ] +
[/var/a/b/c | ]

Count
100 - 70 + 60 - 50 + 40 = 80

Set2
-----
[/ | /var] + 
[/var/a | /var/a/b ] +
[/var/a/b/c | ] + 
[/content | ]

Count = 80 (unchanged, since /content is already covered by / minus /var)

Set3
-----
[/ | /var] + 
[/var/a | /var/a/b ] +
[/var/a/b/c | ] + 
[/ | ]

Count = 100 (the last set, with include /, supersedes everything; traversal 
for that index alone would cover the whole repo)
{noformat}
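
One possible way to compute such a union, as a rough Python sketch and not the 
Oak implementation: split the tree into disjoint regions (each known path minus 
its deeper known paths) and count a region once if at least one index covers 
it. All function and variable names here are made up for illustration. The 
sketch assumes every include/exclude path has a known subtree count, so 
/content gets a hypothetical count of 20; the sample above does not give one, 
and the result does not depend on it as long as / covers /content.

```python
from typing import Dict, List, Tuple

Index = Tuple[List[str], List[str]]  # (includes, excludes)


def is_ancestor_or_self(ancestor: str, path: str) -> bool:
    """True if `path` equals `ancestor` or lies below it."""
    if ancestor == "/":
        return True
    return path == ancestor or path.startswith(ancestor + "/")


def exclusive_counts(subtree: Dict[str, int]) -> Dict[str, int]:
    """Nodes under each path that are not under any deeper known path."""
    own = dict(subtree)
    for path, count in subtree.items():
        ancestors = [p for p in subtree
                     if p != path and is_ancestor_or_self(p, path)]
        if ancestors:
            nearest = max(ancestors, key=len)  # closest known ancestor
            own[nearest] -= count
    return own


def covers(index: Index, path: str) -> bool:
    """An index covers a path if it is under an include and under no exclude."""
    includes, excludes = index
    included = any(is_ancestor_or_self(i, path) for i in includes)
    excluded = any(is_ancestor_or_self(e, path) for e in excludes)
    return included and not excluded


def estimate(subtree: Dict[str, int], indexes: List[Index]) -> int:
    """Count each region once if any index covers it."""
    own = exclusive_counts(subtree)
    return sum(n for path, n in own.items()
               if any(covers(ix, path) for ix in indexes))


TREE = {
    "/": 100,
    "/var": 70,
    "/var/a": 60,
    "/var/a/b": 50,
    "/var/a/b/c": 40,
    "/content": 20,  # hypothetical count, not part of the sample above
}

SET1 = [(["/"], ["/var"]), (["/var/a"], ["/var/a/b"]), (["/var/a/b/c"], [])]
SET2 = SET1 + [(["/content"], [])]
SET3 = SET1 + [(["/"], [])]

for name, indexes in (("Set1", SET1), ("Set2", SET2), ("Set3", SET3)):
    print(name, estimate(TREE, indexes))
```

For the three sets above this gives 80, 80 and 100. Because overlapping 
regions are counted once, adding [/content | ] or [/ | ] does not double count 
the way summing per-index totals would.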




> (Re-)Indexing: estimate progress / ETA
> --------------------------------------
>
>                 Key: OAK-5970
>                 URL: https://issues.apache.org/jira/browse/OAK-5970
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: indexing
>            Reporter: Thomas Mueller
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8
>
>
> Reindexing can take a long time, so it would be good if we can estimate where 
> we are at (for example in percent of the relevant number of nodes). It might 
> also be possible to estimate when indexing will be done, and the current path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
