[ https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387241#comment-15387241 ]
Chetan Mehrotra edited comment on OAK-1312 at 10/21/16 6:23 AM:
----------------------------------------------------------------

{anchor:config}
h3. Usage and Configuration

Bundling definitions are stored in the NodeStore as NodeState under '/jcr:system/rep:documentStore/bundlor'. The definition structure consists of

{noformat}
+ <node type name>
  - pattern - multi
{noformat}

That is, create a node whose name is the nodeType to which the bundling rules apply. That node needs a multi-valued string property {{pattern}} which lists the path patterns to be bundled. A path pattern can be

* exact - e.g. 'jcr:content' or 'jcr:content/metadata' - Indicates that the child node jcr:content needs to be bundled
* pattern - e.g. 'jcr:content/\*' - Indicates that the child node jcr:content and *all* its child nodes need to be bundled

{noformat}
jcr:system
  documentstore
    bundlor
      app:Asset{pattern = [jcr:content/metadata, jcr:content/renditions, jcr:content/renditions/**, jcr:content]}
      nt:file{pattern = [jcr:content]}
{noformat}

The above config defines patterns for nt:file and app:Asset.

h3. Storage Format

As part of bundling, any node which gets bundled is stored as relative properties in the root bundling node.
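The configuration and storage rules above can be illustrated with a small sketch. This is not Oak's implementation: the function names {{matchesPattern}} and {{bundle}}, the in-memory node shape ({{props}} / {{children}}) and the wildcard semantics ('*' matches exactly one path segment, a trailing '**' matches one or more remaining segments) are assumptions made for illustration only. The per-revision maps seen in real documents are left out to keep the example short.

```javascript
// Hypothetical sketch, not Oak code.
// Assumed semantics: a pattern segment matches a path segment by name,
// "*" matches any single segment, a trailing "**" matches one or more segments.
function matchesPattern(pattern, relPath) {
  const pat = pattern.split("/");
  const path = relPath.split("/");
  for (let i = 0; i < pat.length; i++) {
    if (pat[i] === "**") {
      return path.length > i; // needs at least one segment at this position
    }
    if (i >= path.length) return false;
    if (pat[i] !== "*" && pat[i] !== path[i]) return false;
  }
  return pat.length === path.length;
}

// Assumed node shape: { props: {name: value}, children: {name: node} }.
// Returns a flat object in which bundled descendants appear as relative properties.
function bundle(root, patterns) {
  const doc = { ":pattern": patterns.slice(), ...root.props };
  const visit = (node, relPath) => {
    for (const [name, child] of Object.entries(node.children || {})) {
      const childPath = relPath ? relPath + "/" + name : name;
      // A child is bundled only if its path matches a pattern and its parent was bundled.
      if (!patterns.some((p) => matchesPattern(p, childPath))) continue;
      doc[childPath + "/:self"] = true; // marker: this node lives in the root document
      for (const [prop, value] of Object.entries(child.props || {})) {
        doc[childPath + "/" + prop] = value;
      }
      visit(child, childPath);
    }
  };
  visit(root, "");
  return doc;
}

// Usage: an nt:file with a single jcr:content child.
const file = {
  props: { "jcr:primaryType": "nt:file" },
  children: { "jcr:content": { props: { "jcr:data": "bar" }, children: {} } },
};
console.log(bundle(file, ["jcr:content"]));
```

In this sketch a child that matches no pattern is skipped together with its whole subtree, which mirrors the idea that unbundled children stay in their own documents.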
Given an nt:file

{noformat}
+ book.jpg (nt:file)
  + jcr:content
    - jcr:data
{noformat}

and the pattern

{noformat}
+ jcr:system/documentstore/bundlor
  + nt:file
    - pattern - [jcr:content]
{noformat}

it is stored as

{code:javascript}
{
  "_id": "2:/test/book.jpg",
  "_modified": 1469080015,
  "_commitRoot": {"r1560bfe1650-0-1": "0"},
  "_deleted": {"r1560bfe1650-0-1": "false"},
  ":pattern": {"r1560bfe1650-0-1": "[\"str:jcr:content\"]"},
  "jcr:primaryType": {"r1560bfe1650-0-1": "\"nt:file\""},
  "jcr:content/:self": {"r1560bfe1650-0-1": "true"},
  "jcr:content/jcr:data": {"r1560bfe1650-0-1": "\"bar\""}
}
{code}

In the above format

* {{:pattern}} - Special property which stores the pattern used at the time of bundling
* {{jcr:content/:self}} - Marker property which records that the jcr:content node is bundled
* {{jcr:content/jcr:data}} - Property at book.jpg/jcr:content/@jcr:data stored as a relative property

The bundling format is captured at the time of addition, i.e. creation of nodes, and is then reused for later changes. So if the pattern definition gets changed later, it does not affect already bundled nodes.

At the time of deletion all such relative properties are set to null.

Another example: given an app:Asset

{noformat}
/content//banner.png
  - jcr:primaryType = "app:Asset"
  + jcr:content
    - jcr:primaryType = "app:AssetContent"
    + metadata
      - status = "published"
      + xmp
        + 1
          - softwareAgent = "Adobe Photoshop"
          - author = "David"
    + renditions (nt:folder)
      + original (nt:file)
        + jcr:content
          - jcr:data = ...
    + comments (nt:folder)
{noformat}

and the pattern

{noformat}
+ jcr:system/documentstore/bundlor
  + app:Asset
    - pattern - [jcr:content/metadata, jcr:content/renditions, jcr:content/renditions/**, jcr:content]
{noformat}

it is stored as

{code:javascript}
{
  "_children": true,
  "_modified": 1469081925,
  "_id": "2:/test/book.jpg",
  "_commitRoot": {"r1560c1b3db8-0-1": "0"},
  "_deleted": {"r1560c1b3db8-0-1": "false"},
  ":pattern": {
    "r1560c1b3db8-0-1": "[\"str:jcr:content/metadata\",\"str:jcr:content/renditions\",\"str:jcr:content/renditions/**\",\"str:jcr:content\"]"
  },
  "jcr:primaryType": {"r1560c1b3db8-0-1": "\"str:app:Asset\""},

  //Relative node jcr:content
  "jcr:content/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:oak:Unstructured\""},

  //Relative node jcr:content/metadata
  "jcr:content/metadata/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/metadata/status": {"r1560c1b3db8-0-1": "\"published\""},
  "jcr:content/metadata/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:oak:Unstructured\""},

  //Relative node jcr:content/renditions
  "jcr:content/renditions/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:folder\""},

  //Relative node jcr:content/renditions/original
  "jcr:content/renditions/original/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/original/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:file\""},

  //Relative node jcr:content/renditions/original/jcr:content
  "jcr:content/renditions/original/jcr:content/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/original/jcr:content/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:resource\""},
  "jcr:content/renditions/original/jcr:content/jcr:data": {"r1560c1b3db8-0-1": "\"<data>\""}
}
{code}


> Bundle nodes into a document
> ----------------------------
>
>                 Key: OAK-1312
>                 URL: https://issues.apache.org/jira/browse/OAK-1312
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>            Assignee: Chetan Mehrotra
>              Labels: docs-impacting, performance
>             Fix For: 1.6, 1.5.13
>
>         Attachments: OAK-1312-meta-prop-handling.patch, OAK-1312-review-v1.diff, OAK-1312-review-v2.diff, benchmark-result-db2.txt, benchmark-result-postgres.txt, benchmark-results.txt, run-benchmark.sh
>
>
> For very fine grained content with many nodes and only few properties per node it would be more efficient to bundle multiple nodes into a single MongoDB document. Mostly reading would benefit because there are fewer roundtrips to the backend. At the same time storage footprint would be lower because metadata overhead is per document.
> Feature branch - https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)