[jira] [Commented] (OAK-4907) Collect changes (paths, nts, props..) of a commit in a validator

2016-10-11 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15565521#comment-15565521
 ] 

Stefan Egli commented on OAK-4907:
--

Note that due to OAK-4924 for the {{contentChanged}} calls coming from 
AsyncIndexUpdater no ChangeSet can be attached to the CommitInfo, thus no 
prefiltering can be done for those (and there are quite a few of those calls 
typically) 

> Collect changes (paths, nts, props..) of a commit in a validator
> 
>
> Key: OAK-4907
> URL: https://issues.apache.org/jira/browse/OAK-4907
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6
>
> Attachments: OAK-4907.patch, OAK-4907.v2.patch
>
>
> It would be useful to collect a set of changes of a commit (eg in a 
> validator) that could later be used in an Observer for eg prefiltering.
> Such a change collector should collect paths, nodetypes, properties, 
> node-names (and perhaps more at a later stage) of all changes and store the 
> result in the CommitInfo's CommitContext.
> Note that this is a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4907) Collect changes (paths, nts, props..) of a commit in a validator

2016-10-11 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567808#comment-15567808
 ] 

Chetan Mehrotra commented on OAK-4907:
--

Patch looks good and test coverage is good!. 

Some minor comments

# Instead of special casing for EMPTY it would be better to check if the 
{{CommitInfo}} has {{CommitContext}} set. If yes then validator can opt in 
otherwise opt out. And then {{CollectorSupport}} can just refer to 
{{CommitContext}}
{code}
if (info == null || info == CommitInfo.EMPTY) {
// then we cannot do change-collecting, as we can't store
// it in the info
return null;
}
{code}
# I see following 3 boolean vars always being set to true. May be replace 3 
with single var like {{changed}}
{code}
// record a change at the current path (which is the 
parent of this property)
addParentPathOnLeave = true;
// and with the current node name
addParentNodeNameOnLeave = true;
// additionally, add parent's nodeType on leave too
addParentNodeTypeOnLeave = true;
{code}
# Log the limits/config at info level in activate method
# Keep the child name as instance variable to avoid computing it again
{code}
if (addParentNodeNameOnLeave) {
support.getParentNodeNames().add(getName(path));
}
{code}
# Use {{Iterables#addAll}}. This avoid creating unnecessary list instance and 
is bit optimized for collection based iterable (which is the case here). Doing 
all this would ensure that overhead of validator is minimal!
{code}
support.getParentNodeTypes().addAll(Lists.newArrayList(parentNodeOrNull.getNames("jcr:mixinTypes")));
{code}
# {{testNull}} test is failing for me

h4. ChangeSet

May be we should split this class in two. 

*ChangeSetBuilder*
{code}
public class ChangeSetBuilder {
private final int maxItems;
private final int maxPathDepth;
private final Set parentPaths = Sets.newHashSet();

private boolean parentPathOverflow;

public ChangeSetBuilder(int maxItems, int maxPathDepth) {
this.maxItems = maxItems;
this.maxPathDepth = maxPathDepth;
}

public ChangeSetBuilder addParentPath(String path){
if (parentPaths.size() > maxItems){
parentPathOverflow = true;
parentPaths.clear();
}

if (!parentPathOverflow) {
parentPaths.add(path);
}
return this;
}

public ChangeSet build(){
return new ChangeSet(...)
}
}
{code}

This class
* Just collects the changes and take care of limits
* Is mutable
* Used by Validator and not to be used by clients

*ChangeSet*

{code}
public class ChangeSet {

@CheckForNull
public Set getParentPaths() {

}
}
{code}

* Immutable class 
* Has accessors marked with {{CheckForNull}}. With null being an indication 
that this information was not collected and observer should do the hard work

{noformat}
 * To limit memory usage, the ChangeSet has a limit on the number of items, 
each,
 * that it collects. If one of those items reach the limit this is called
 * an 'overflow' and the corresponding item type is marked as having 
'overflown'.
 * Downstream Observers should thus check if a particular item has overflown
 * or not.
 {noformat}

 This can be updated to indicate "how" overflow case can be determined i.e. by 
checking if the set returned is null

Also some coverage around overflow case would be good to have

> Collect changes (paths, nts, props..) of a commit in a validator
> 
>
> Key: OAK-4907
> URL: https://issues.apache.org/jira/browse/OAK-4907
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6
>
> Attachments: OAK-4907.patch, OAK-4907.v2.patch
>
>
> It would be useful to collect a set of changes of a commit (eg in a 
> validator) that could later be used in an Observer for eg prefiltering.
> Such a change collector should collect paths, nodetypes, properties, 
> node-names (and perhaps more at a later stage) of all changes and store the 
> result in the CommitInfo's CommitContext.
> Note that this is a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4907) Collect changes (paths, nts, props..) of a commit in a validator

2016-10-12 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568297#comment-15568297
 ] 

Stefan Egli commented on OAK-4907:
--

Thx [~chetanm], a few comments on the points:
bq. 1. Instead of special casing for EMPTY ...
I see, I guess while the two variants aren't really equivalent they are both 
not ideal. In theory it should also be OK for the ChangeCollector to add a 
_new_ CommitContext (which it indeed doesn't do now) when it's not there - and 
in this case the check for EMPTY would avoid that. But we don't have that at 
the moment, so I guess we can go with the assumption that 'someone else' sets 
the CommitContext (as is happening in commit() atm). I guess what I'm saying is 
that I find both solutions suboptimal, but I'll go for your variant.
bq. 2. I see following 3 boolean vars always being set to true
good point - now after the dust (of initial development) has settled it's 
indeed obvious that the variables can be combined into one.
bq. 3. Log the limits/config
will do
bq. 4. Keep the child name as instance variable
good idea
bq. 5. Use Iterables#addAll.
good idea
bq. 6. testNull test is failing for me
indeed does for me too - that's due to the late-incoming addition of skipping 
EMPTY
bq. May be we should split this class in two. 
certainly
bq. Also some coverage around overflow case would be good to have
sure


> Collect changes (paths, nts, props..) of a commit in a validator
> 
>
> Key: OAK-4907
> URL: https://issues.apache.org/jira/browse/OAK-4907
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6
>
> Attachments: OAK-4907.patch, OAK-4907.v2.patch
>
>
> It would be useful to collect a set of changes of a commit (eg in a 
> validator) that could later be used in an Observer for eg prefiltering.
> Such a change collector should collect paths, nodetypes, properties, 
> node-names (and perhaps more at a later stage) of all changes and store the 
> result in the CommitInfo's CommitContext.
> Note that this is a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4907) Collect changes (paths, nts, props..) of a commit in a validator

2016-10-12 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568323#comment-15568323
 ] 

Chetan Mehrotra commented on OAK-4907:
--

bq. In theory it should also be OK for the ChangeCollector to add a new 
CommitContext (which it indeed doesn't do now) when it's not there - and in 
this case the check for EMPTY would avoid that.

Yes that would have been ideal. See OAK-4640 for background on why this route 
was not taken

> Collect changes (paths, nts, props..) of a commit in a validator
> 
>
> Key: OAK-4907
> URL: https://issues.apache.org/jira/browse/OAK-4907
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.5.11
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.6
>
> Attachments: OAK-4907.patch, OAK-4907.v2.patch
>
>
> It would be useful to collect a set of changes of a commit (eg in a 
> validator) that could later be used in an Observer for eg prefiltering.
> Such a change collector should collect paths, nodetypes, properties, 
> node-names (and perhaps more at a later stage) of all changes and store the 
> result in the CommitInfo's CommitContext.
> Note that this is a result of 
> [discussions|https://issues.apache.org/jira/browse/OAK-4796?focusedCommentId=15550962&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15550962]
>  around design in OAK-4796



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)