[ 
https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387241#comment-15387241
 ] 

Chetan Mehrotra edited comment on OAK-1312 at 10/21/16 6:23 AM:
----------------------------------------------------------------

{anchor:config}
h3. Usage and Configuration

Bundling definitions are stored in the NodeStore as NodeState under 
'/jcr:system/rep:documentStore/bundlor'. The definition structure consists of

{noformat}
+ <node type name>
  - pattern - multi
{noformat}

Create a node whose name is the nodeType to which the bundling rules should 
apply. That node needs a multi-valued string property {{pattern}} which 
defines the path patterns that need to be bundled. A path pattern can be

* exact - e.g. 'jcr:content' or 'jcr:content/metadata' - Indicates that the 
child node at that relative path (e.g. jcr:content) needs to be bundled
* pattern - e.g. 'jcr:content/\*' - Indicates that child node jcr:content and 
*all* its child nodes need to be bundled

{noformat}
jcr:system
  rep:documentStore
    bundlor
      app:Asset{pattern = [jcr:content/metadata, jcr:content/renditions, jcr:content/renditions/**, jcr:content]}
      nt:file{pattern = [jcr:content]}
{noformat} 

The above config defines patterns for nt:file and app:Asset.
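
As a rough illustration of how such a pattern list could be evaluated, here is a minimal sketch. It is not the Oak implementation: the function names {{matchesPattern}} and {{isBundled}} are hypothetical, and the wildcard semantics are assumptions ({{*}} matches exactly one path segment, a trailing {{**}} matches one or more remaining segments, and neither matches the parent node itself, which is why the config above lists both 'jcr:content/renditions' and 'jcr:content/renditions/**').

```javascript
// Sketch only: hypothetical helpers, not part of Oak.
// Assumed semantics: '*' matches one segment, trailing '**' matches
// one or more remaining segments, anything else must match exactly.
function matchesPattern(pattern, relPath) {
  const pat = pattern.split('/');
  const path = relPath.split('/');
  for (let i = 0; i < pat.length; i++) {
    if (pat[i] === '**') {
      // Only supported as the last element in this sketch
      return i === pat.length - 1 && path.length > i;
    }
    if (i >= path.length) return false;
    if (pat[i] !== '*' && pat[i] !== path[i]) return false;
  }
  // Without '**' the pattern and the path must have the same depth
  return path.length === pat.length;
}

// A relative path is bundled if any configured pattern matches it
function isBundled(patterns, relPath) {
  return patterns.some((p) => matchesPattern(p, relPath));
}

const assetPatterns = [
  'jcr:content/metadata',
  'jcr:content/renditions',
  'jcr:content/renditions/**',
  'jcr:content',
];
console.log(isBundled(assetPatterns, 'jcr:content/renditions/original'));
console.log(isBundled(assetPatterns, 'jcr:content/comments'));
```

Under these assumptions 'jcr:content/renditions/original' is matched by the {{**}} pattern, while 'jcr:content/comments' is matched by none of the patterns and would stay a separate document.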

h3. Storage Format

As part of bundling, any node which gets bundled is stored as a set of 
relative properties in the root bundling node.

For example, given an nt:file
{noformat}
+ book.jpg (nt:file)
  + jcr:content
     - jcr:data
{noformat}

and the pattern
{noformat}
+ jcr:system/rep:documentStore/bundlor
  + nt:file
    - pattern - [jcr:content]
{noformat}

it is stored as
{code:javascript}
{
  "_id": "2:/test/book.jpg",
  "_modified": 1469080015,
  "_commitRoot": {"r1560bfe1650-0-1": "0"},
  "_deleted": {"r1560bfe1650-0-1": "false"},

  ":pattern": {"r1560bfe1650-0-1": "[\"str:jcr:content\"]"},

  "jcr:primaryType": {"r1560bfe1650-0-1": "\"nt:file\""},

  "jcr:content/:self": {"r1560bfe1650-0-1": "true"},
  "jcr:content/jcr:data": {"r1560bfe1650-0-1": "\"bar\""}
}
{code}

In the above format:
* {{:pattern}} - Special property which stores the pattern used at the time of 
bundling
* {{jcr:content/:self}} - Marker property which records that the jcr:content 
node is bundled
* {{jcr:content/jcr:data}} - Property at book.jpg/jcr:content/@jcr:data, stored 
as a relative property

The bundling format is captured at the time of addition, i.e. when the node 
is created, and is then reused for later changes. So if the pattern definition 
is changed later, it does not affect already bundled nodes. At the time of 
deletion, all such relative properties are set to null.
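
The storage layout described above can be sketched as follows. This is a simplified illustration, not the DocumentNodeStore code: the in-memory node shape ({{props}} / {{children}}), the function name {{bundle}} and the revision string are hypothetical, only exact patterns are handled, and the type prefixes seen in real documents (such as {{str:}}) are omitted.

```javascript
// Sketch only: flattens bundled child nodes into relative properties
// of the bundling root. Hypothetical node shape:
//   { props: { name: value }, children: { name: node } }
function bundle(node, patterns, rev) {
  const doc = {};
  // Record the patterns in effect when the node is created
  doc[':pattern'] = { [rev]: JSON.stringify(patterns) };
  // Properties of the bundling root itself
  for (const [name, value] of Object.entries(node.props || {})) {
    doc[name] = { [rev]: JSON.stringify(value) };
  }
  const visit = (parent, parentPath) => {
    for (const [name, child] of Object.entries(parent.children || {})) {
      const relPath = parentPath ? parentPath + '/' + name : name;
      // Simplification: exact pattern match only, no wildcards
      if (!patterns.includes(relPath)) continue;
      // Marker property recording that this child is bundled
      doc[relPath + '/:self'] = { [rev]: 'true' };
      for (const [prop, value] of Object.entries(child.props || {})) {
        doc[relPath + '/' + prop] = { [rev]: JSON.stringify(value) };
      }
      visit(child, relPath);
    }
  };
  visit(node, '');
  return doc;
}

const file = {
  props: { 'jcr:primaryType': 'nt:file' },
  children: { 'jcr:content': { props: { 'jcr:data': 'bar' }, children: {} } },
};
console.log(JSON.stringify(bundle(file, ['jcr:content'], 'r1-0-1'), null, 2));
```

Because the patterns are written into the {{:pattern}} property at creation time, later reads and updates of this document can rely on the stored copy rather than on the current definition.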

Another example: given an app:Asset
{noformat}
/content//banner.png
  - jcr:primaryType = "app:Asset"
  + jcr:content
    - jcr:primaryType = "app:AssetContent"
    + metadata
      - status = "published"
      + xmp
        + 1
          - softwareAgent = "Adobe Photoshop"
          - author = "David"
    + renditions (nt:folder)
      + original (nt:file)
        + jcr:content
          - jcr:data = ...
    + comments (nt:folder)
{noformat}

and the pattern
{noformat}
+ jcr:system/rep:documentStore/bundlor
  + app:Asset
    - pattern - [jcr:content/metadata, jcr:content/renditions, jcr:content/renditions/**, jcr:content]
{noformat}

it is stored as
{code:javascript}
{
  "_children": true,
  "_modified": 1469081925,
  "_id": "2:/test/book.jpg",
  "_commitRoot": {"r1560c1b3db8-0-1": "0"},
  "_deleted": {"r1560c1b3db8-0-1": "false"},

  ":pattern": {
    "r1560c1b3db8-0-1": "[\"str:jcr:content/metadata\",\"str:jcr:content/renditions\",\"str:jcr:content/renditions/**\",\"str:jcr:content\"]"
  },

  "jcr:primaryType": {"r1560c1b3db8-0-1": "\"str:app:Asset\""},

  //Relative node jcr:content
  "jcr:content/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:oak:Unstructured\""},

  //Relative node jcr:content/metadata
  "jcr:content/metadata/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/metadata/status": {"r1560c1b3db8-0-1": "\"published\""},
  "jcr:content/metadata/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:oak:Unstructured\""},

  //Relative node jcr:content/renditions
  "jcr:content/renditions/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:folder\""},

  //Relative node jcr:content/renditions/original
  "jcr:content/renditions/original/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/original/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:file\""},

  //Relative node jcr:content/renditions/original/jcr:content
  "jcr:content/renditions/original/jcr:content/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/original/jcr:content/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:resource\""},
  "jcr:content/renditions/original/jcr:content/jcr:data": {"r1560c1b3db8-0-1": "\"<data>\""}
}
{code}
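
To show how a reader of such a document could recover the bundled nodes, here is a small sketch that relies only on the {{:self}} marker and the relative property names shown above. The helper names {{bundledNodePaths}} and {{bundledProps}} are hypothetical and do not correspond to Oak APIs, and revision handling is left out.

```javascript
// Sketch only: hypothetical helpers for inspecting a bundled document.

// Every bundled node has a '<relative path>/:self' marker property
function bundledNodePaths(doc) {
  const marker = '/:self';
  return Object.keys(doc)
    .filter((key) => key.endsWith(marker))
    .map((key) => key.slice(0, -marker.length))
    .sort();
}

// Collect the properties that belong directly to one bundled node
function bundledProps(doc, relPath) {
  const prefix = relPath + '/';
  const props = {};
  for (const [key, revisions] of Object.entries(doc)) {
    if (!key.startsWith(prefix)) continue;
    const name = key.slice(prefix.length);
    // Skip the marker and properties of deeper bundled nodes
    if (name === ':self' || name.includes('/')) continue;
    props[name] = revisions;
  }
  return props;
}

const doc = {
  '_id': '2:/test/book.jpg',
  'jcr:content/:self': { 'r1-0-1': 'true' },
  'jcr:content/metadata/:self': { 'r1-0-1': 'true' },
  'jcr:content/metadata/status': { 'r1-0-1': '"published"' },
};
console.log(bundledNodePaths(doc));
console.log(bundledProps(doc, 'jcr:content/metadata'));
```

Nodes without a {{:self}} marker in the document, such as jcr:content/metadata/xmp and jcr:content/comments in the example above, are not bundled.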




> Bundle nodes into a document
> ----------------------------
>
>                 Key: OAK-1312
>                 URL: https://issues.apache.org/jira/browse/OAK-1312
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>            Assignee: Chetan Mehrotra
>              Labels: docs-impacting, performance
>             Fix For: 1.6, 1.5.13
>
>         Attachments: OAK-1312-meta-prop-handling.patch, 
> OAK-1312-review-v1.diff, OAK-1312-review-v2.diff, benchmark-result-db2.txt, 
> benchmark-result-postgres.txt, benchmark-results.txt, run-benchmark.sh
>
>
> For very fine grained content with many nodes and only few properties per 
> node it would be more efficient to bundle multiple nodes into a single 
> MongoDB document. Mostly reading would benefit because there are less 
> roundtrips to the backend. At the same time storage footprint would be lower 
> because metadata overhead is per document.
> Feature branch - 
> https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312


