[ https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430076#comment-15430076 ]
Chetan Mehrotra edited comment on OAK-1312 at 8/24/16 11:51 AM:
----------------------------------------------------------------

Planned feature work is now done and the [patch|^OAK-1312-review-v1.diff] is ready for review.

h3. Implementation Details

Some details are provided [above|#UsageandConfiguration].

h4. Commit Side Changes

{{CommitDiff}} obtains a {{BundlingHandler}} from {{DocumentNodeStore}}, which takes care of bundling relative nodes. For any newly added node it looks up a {{DocumentBundlor}} from the {{BundledTypesRegistry}} based on the node's primary type or mixin. The {{DocumentBundlor}} generates a {{Matcher}} which determines whether a given node state needs to be bundled. As {{CommitDiff}} traverses downwards, new matchers are generated from the parent matcher.

* The pattern itself is saved as part of the node state.
* hasChildren support - Children status is managed separately for bundled and non-bundled child nodes. This is later used to optimize calls around child access in {{DocumentNodeState}}.
** For each bundled node the {{:doc-has-child-bundled}} property is set to true to indicate that the parent node has a bundled child.
** For each non-bundled node the {{:doc-has-child-non-bundled}} property is set to true to indicate that the parent node has a non-bundled child.

h4. Bundling Config

Bundling as a whole is controlled via the {{bundlingEnabled}} flag on {{DocumentNodeStoreService}}. If the flag is disabled then bundling is disabled for *new nodes only*. The bundling config is stored in the repository; {{BundlingConfigHandler}} observes changes to it and refreshes the {{BundledTypesRegistry}} whenever it changes.

h4. Reading Side Change

On the reading side {{DocumentNodeState}} constructs a {{BundlingContext}}. If a bundling pattern is found, the bundling context filters the properties down to those of the current node. For any child lookup it determines whether the child is bundled and, if so, constructs a {{DocumentNodeState}} instance from the properties of the bundling root.
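
To illustrate the child lookup just described, here is a minimal sketch; it is not the code from the patch. It assumes that the properties of a bundled child are kept on the bundling root under keys prefixed with the child's relative path (for example {{jcr:content/jcr:mimeType}}). That key layout, the class {{BundledReadSketch}} and the method {{propertiesFor}} are hypothetical and exist only for illustration; plain maps stand in for the real document and node state types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: plain maps stand in for the bundling root document.
final class BundledReadSketch {

    /**
     * Returns the properties that belong to the node at relPath, read from
     * the properties stored on the bundling root. Pass "" for the root itself.
     * Assumption: bundled properties are keyed as "<relative path>/<name>".
     */
    static Map<String, String> propertiesFor(Map<String, String> rootDoc, String relPath) {
        String prefix = relPath.isEmpty() ? "" : relPath + "/";
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : rootDoc.entrySet()) {
            String key = e.getKey();
            if (!key.startsWith(prefix)) {
                continue; // belongs to a different node
            }
            String name = key.substring(prefix.length());
            // A remaining '/' means the property belongs to a deeper descendant.
            if (name.indexOf('/') < 0) {
                result.put(name, e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fileDoc = new LinkedHashMap<>();
        fileDoc.put("jcr:primaryType", "nt:file");
        fileDoc.put("jcr:content/jcr:primaryType", "nt:resource");
        fileDoc.put("jcr:content/jcr:mimeType", "text/plain");

        // Properties of the bundling root itself
        System.out.println(propertiesFor(fileDoc, ""));
        // Properties of the bundled child "jcr:content", with no further backend call
        System.out.println(propertiesFor(fileDoc, "jcr:content"));
    }
}
```

The point of the sketch is that a bundled child is materialized purely from data already loaded with its bundling root, which is where the saving in backend round trips comes from.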
For listing child nodes it provides a merge iterator over bundled and non-bundled nodes. If it can be determined that all child nodes are bundled, it avoids the call to {{DocumentNodeStore}}.

h3. Open Questions

# *Config Path* - Currently the bundling config is stored as a node in the repository itself under {{/jcr:system/documentstore/bundlor}}. Should that be the final name? Do any steps need to be taken to make it secure?
# *Wildcard Support* - The design supports wildcards in bundling patterns. Should we allow that, or restrict it for the initial release?
# *Bootstrapping default config* - By default we should ship with a bundling pattern for {{nt:file}}. The logic for that is implemented in {{BundlingConfigInitializer}}. How should it be registered with Oak? For tests it is invoked from within {{OakMongoNSRepositoryStub}}. For a production setup, one approach would be to expose it as an OSGi service and then have a new {{WhiteboardRepositoryInitializer}} implementation registered with the Oak class.

[~mreutegg] [~catholicon] Please review the feature patch and provide feedback so that it can be merged to trunk! The patch is big, but quite a bit of it is tests. The key parts are the changes in {{DocumentNodeState}}, {{CommitDiff}}, {{Commit}} and {{BundlingHandler}}.

-*Update* - Hold on for review as some conflicts are seen with this feature enabled and package installation. Would ping back once analyzed that-

All issues are figured out (the last one is pending an approach review), so the review can be done now.

> Bundle nodes into a document
> ----------------------------
>
> Key: OAK-1312
> URL: https://issues.apache.org/jira/browse/OAK-1312
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, documentmk
> Reporter: Marcel Reutegger
> Assignee: Chetan Mehrotra
> Labels: performance
> Fix For: 1.6
>
> Attachments: OAK-1312-meta-prop-handling.patch, OAK-1312-review-v1.diff, OAK-1312-review-v2.diff
>
>
> For very fine grained content with many nodes and only few properties per
> node it would be more efficient to bundle multiple nodes into a single
> MongoDB document. Mostly reading would benefit because there are fewer
> roundtrips to the backend. At the same time storage footprint would be lower
> because metadata overhead is per document.
> Feature branch - https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)