[jira] [Updated] (OAK-4802) Basic cache consistency test on exception
[ https://issues.apache.org/jira/browse/OAK-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated OAK-4802:
--------------------------------
    Labels: candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4 resilience  (was: )

> Basic cache consistency test on exception
> -----------------------------------------
>
>                 Key: OAK-4802
>                 URL: https://issues.apache.org/jira/browse/OAK-4802
>             Project: Jackrabbit Oak
>          Issue Type: Test
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>            Priority: Minor
>              Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4, resilience
>             Fix For: 1.6, 1.5.11
>
> OAK-4774 and OAK-4793 aim to check the cache behaviour of a DocumentStore
> implementation when the underlying backend throws an exception even though
> the operation succeeded, e.g. when the response cannot be sent back because
> of a network issue.
> This issue will provide the DocumentStore independent part of those tests.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
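[Editor's note] A minimal sketch of the failure mode this test targets, using a simplified in-memory stand-in for a store with a document cache. All class and field names here ({{FlakyStore}}, {{failOnResponse}}) are hypothetical; the real Oak {{DocumentStore}} interface and test fixtures are more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a store whose backend can apply a write and then
// fail to deliver the response, as described in the issue.
class FlakyStore {
    final Map<String, String> backend = new HashMap<>();
    final Map<String, String> cache = new HashMap<>();
    boolean failOnResponse = false;

    String read(String id) {
        // serve from cache, falling back to the backend on a miss
        return cache.computeIfAbsent(id, backend::get);
    }

    void update(String id, String value) {
        backend.put(id, value);           // the operation succeeds on the backend
        if (failOnResponse) {
            cache.remove(id);             // must invalidate before propagating
            throw new RuntimeException("response lost");
        }
        cache.put(id, value);
    }
}

class CacheConsistencySketch {
    public static void main(String[] args) {
        FlakyStore store = new FlakyStore();
        store.update("doc1", "v1");

        store.failOnResponse = true;
        try {
            store.update("doc1", "v2");
        } catch (RuntimeException expected) {
            // the write reached the backend even though the caller saw an exception
        }
        // the cache must not keep serving the stale "v1"
        if (!"v2".equals(store.read("doc1"))) {
            throw new AssertionError("stale cache entry survived the exception");
        }
        System.out.println("cache consistent after exception");
    }
}
```

The store-independent part of such a test is exactly the sequence in {{main}}: apply an update through a backend rigged to throw after the write, then assert that a subsequent read does not return the stale cached value.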
[jira] [Updated] (OAK-4793) Check usage of DocumentStoreException in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated OAK-4793:
--------------------------------
    Labels: candidate_oak_1_0 candidate_oak_1_2 candidate_oak_1_4 resilience  (was: )

> Check usage of DocumentStoreException in RDBDocumentStore
> ---------------------------------------------------------
>
>                 Key: OAK-4793
>                 URL: https://issues.apache.org/jira/browse/OAK-4793
>             Project: Jackrabbit Oak
>          Issue Type: Technical task
>          Components: core, rdbmk
>            Reporter: Marcel Reutegger
>            Assignee: Julian Reschke
>            Priority: Minor
>              Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4, resilience
>             Fix For: 1.6, 1.5.11
>
>         Attachments: OAK-4793-1.diff, OAK-4793-2.diff, OAK-4793.diff, OAK-4793.diff
>
> With OAK-4771 the usage of DocumentStoreException was clarified in the
> DocumentStore interface. The purpose of this task is to check usage of the
> DocumentStoreException in RDBDocumentStore and make sure JDBC driver specific
> exceptions are handled consistently and wrapped in a DocumentStoreException.
> At the same time, cache consistency needs to be checked as well in case of a
> driver exception, e.g. invalidate if necessary.
[jira] [Updated] (OAK-4043) Oak run checkpoints needs to account for multiple index lanes
[ https://issues.apache.org/jira/browse/OAK-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Mehrotra updated OAK-4043:
---------------------------------
    Priority: Blocker  (was: Critical)

> Oak run checkpoints needs to account for multiple index lanes
> -------------------------------------------------------------
>
>                 Key: OAK-4043
>                 URL: https://issues.apache.org/jira/browse/OAK-4043
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, run
>            Reporter: Alex Parvulescu
>            Assignee: Davide Giannella
>            Priority: Blocker
>              Labels: candidate_oak_1_4
>             Fix For: 1.6, 1.5.11
>
> Oak run {{checkpoints rm-unreferenced}} [0] is currently hardcoded to a
> single checkpoint reference (the default one). It is now possible to add
> multiple lanes, which we already did in AEM, but the checkpoint tool is
> blissfully unaware of this and it might trigger a full reindex following
> offline compaction.
> This needs fixing before the big 1.4 release, so I'm marking it as a blocker.
> fyi [~edivad], [~chetanm]
> [0] https://github.com/apache/jackrabbit-oak/tree/trunk/oak-run#checkpoints
[jira] [Comment Edited] (OAK-4043) Oak run checkpoints needs to account for multiple index lanes
[ https://issues.apache.org/jira/browse/OAK-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495519#comment-15495519 ]

Chetan Mehrotra edited comment on OAK-4043 at 9/16/16 6:23 AM:
---------------------------------------------------------------
I would prefer to avoid any change in existing names. We could instead follow
the convention that such names have an {{async}} prefix (possibly adding a
check in Oak to enforce that) and then derive the list of checkpoints from
that. Alternatively, any string property in {{:async}} could be considered a
possible checkpoint when determining a match!

Given the impact of this issue, marking it as a blocker for the next dot
release.

was (Author: chetanm):
I would prefer to avoid any change in existing names. We could instead follow
the convention that such names have an {{async}} prefix (possibly adding a
check in Oak to enforce that) and then derive the list of checkpoints from
that. Alternatively, any string property in {{:async}} could be considered a
possible checkpoint when determining a match!
[jira] [Updated] (OAK-4043) Oak run checkpoints needs to account for multiple index lanes
[ https://issues.apache.org/jira/browse/OAK-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Mehrotra updated OAK-4043:
---------------------------------
    Fix Version/s: 1.5.11
[jira] [Commented] (OAK-4043) Oak run checkpoints needs to account for multiple index lanes
[ https://issues.apache.org/jira/browse/OAK-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495519#comment-15495519 ]

Chetan Mehrotra commented on OAK-4043:
--------------------------------------
I would prefer to avoid any change in existing names. We could instead follow
the convention that such names have an {{async}} prefix (possibly adding a
check in Oak to enforce that) and then derive the list of checkpoints from
that. Alternatively, any string property in {{:async}} could be considered a
possible checkpoint when determining a match!
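[Editor's note] A sketch of the discovery approach suggested in the comment above, with the {{:async}} node simulated as a plain map. The property names and the filtering rule are illustrative assumptions; a real implementation would also have to exclude non-checkpoint bookkeeping properties that happen to match.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CheckpointLanes {
    // Collect candidate checkpoint references: every string-valued property
    // whose name follows the suggested "async" prefix convention.
    static List<String> candidateCheckpoints(Map<String, Object> asyncNode) {
        List<String> refs = new ArrayList<>();
        for (Map.Entry<String, Object> e : asyncNode.entrySet()) {
            if (e.getKey().startsWith("async") && e.getValue() instanceof String) {
                refs.add((String) e.getValue());
            }
        }
        return refs;
    }

    public static void main(String[] args) {
        // hypothetical contents of the :async node
        Map<String, Object> async = new LinkedHashMap<>();
        async.put("async", "checkpoint-1");          // default lane
        async.put("async-fulltext", "checkpoint-2"); // additional lane
        async.put("lastIndexedTo", 42L);             // non-string, ignored
        System.out.println(candidateCheckpoints(async));
    }
}
```

A checkpoint cleanup tool built this way would treat every collected value as referenced, instead of hardcoding the single default lane.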
[jira] [Updated] (OAK-4043) Oak run checkpoints needs to account for multiple index lanes
[ https://issues.apache.org/jira/browse/OAK-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Mehrotra updated OAK-4043:
---------------------------------
    Labels: candidate_oak_1_4  (was: )
[jira] [Commented] (OAK-2498) Root record references provide too little context for parsing a segment
[ https://issues.apache.org/jira/browse/OAK-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493751#comment-15493751 ]

Michael Dürig commented on OAK-2498:
------------------------------------
For OAK-4740 we also need to be able to identify references to external
binaries. More generally, I would suggest we specialise the VALUE type into
the different types of values.

> Root record references provide too little context for parsing a segment
> -----------------------------------------------------------------------
>
>                 Key: OAK-2498
>                 URL: https://issues.apache.org/jira/browse/OAK-2498
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Andrei Dulceanu
>              Labels: tools
>             Fix For: Segment Tar 0.0.14
>
> According to the [documentation|http://jackrabbit.apache.org/oak/docs/nodestore/segmentmk.html]
> the root record references in a segment header provide enough context for
> parsing all records within this segment without any external information.
> It turns out this is not true: if a root record reference points e.g. to a
> list record, the items in that list are record ids of unknown type. So even
> though those records might live in the same segment, we can't parse them as
> we don't know their type.
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493743#comment-15493743 ]

Michael Dürig commented on OAK-4740:
------------------------------------
Ok, got it. As you mentioned before, we had best wait for OAK-2498 here, as
it will simplify regenerating the binary references quite a bit.

> TarReader recovery skips generating the index and binary graphs
> ---------------------------------------------------------------
>
>                 Key: OAK-4740
>                 URL: https://issues.apache.org/jira/browse/OAK-4740
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>            Reporter: Alex Parvulescu
>            Assignee: Francesco Mari
>             Fix For: Segment Tar 0.0.16
>
> As noticed from the tar recovery bits [0], the resulting tar file would lack
> the binary reference graph and the index graph. This has implications on
> DSGC (not properly reporting binary references would result in binaries
> being GC'ed) and GC operations.
> /cc [~frm], [~mduerig]
> [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/TarReader.java#L216
[jira] [Resolved] (OAK-4793) Check usage of DocumentStoreException in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke resolved OAK-4793.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.11

trunk: [r1760946|http://svn.apache.org/r1760946]
[jira] [Commented] (OAK-4811) MongoToMongoFbsTest fails
[ https://issues.apache.org/jira/browse/OAK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493526#comment-15493526 ]

Marcel Reutegger commented on OAK-4811:
---------------------------------------
I see, then I apologize for the rant. It was just rather annoying to analyze
what's going wrong here...

> MongoToMongoFbsTest fails
> -------------------------
>
>                 Key: OAK-4811
>                 URL: https://issues.apache.org/jira/browse/OAK-4811
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: upgrade
>    Affects Versions: 1.4.7
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>             Fix For: 1.4.8
>
> The test fails in the current 1.4 branch and also with the 1.4.7 release
> when a local MongoDB is running.
> {noformat}
> validateMigration(org.apache.jackrabbit.oak.upgrade.cli.MongoToMongoFbsTest)
> Time elapsed: 5.628 sec <<< ERROR!
> java.lang.IllegalStateException: This builder does not exist: default
> {noformat}
> The test runs successfully with 1.4.6.
[jira] [Updated] (OAK-4814) Add orderby support for nodename index
[ https://issues.apache.org/jira/browse/OAK-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-4814:
--------------------------------
    Fix Version/s: 1.6

> Add orderby support for nodename index
> --------------------------------------
>
>                 Key: OAK-4814
>                 URL: https://issues.apache.org/jira/browse/OAK-4814
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 1.5.10
>            Reporter: Ankush Malhotra
>             Fix For: 1.6
>
> In OAK-1752 you implemented the index support for :nodeName. The JCR query
> explain tool shows that it is used for conditions like equals, but it is
> not used for ORDER BY name().
> Is name() supported in the ORDER BY clause? If yes, we would need to add
> support for that in oak-lucene.
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493450#comment-15493450 ]

Francesco Mari commented on OAK-4740:
-------------------------------------
[~mduerig], the missing part would be to hook into the recovery process, as
invoked from {{TarReader.generateTarFile()}} and implemented by
{{TarReader.DEFAULT_TAR_RECOVERY}}, and regenerate the graph and the binary
references index by parsing recovered segments and extracting the necessary
information.
[jira] [Commented] (OAK-4814) Add orderby support for nodename index
[ https://issues.apache.org/jira/browse/OAK-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493429#comment-15493429 ]

Ankush Malhotra commented on OAK-4814:
--------------------------------------
[~tmueller] [~chetanm], can you please review this one? Thanks.
[jira] [Created] (OAK-4814) Add orderby support for nodename index
Ankush Malhotra created OAK-4814:
------------------------------------

             Summary: Add orderby support for nodename index
                 Key: OAK-4814
                 URL: https://issues.apache.org/jira/browse/OAK-4814
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: query
    Affects Versions: 1.5.10
            Reporter: Ankush Malhotra

In OAK-1752 you implemented the index support for :nodeName. The JCR query
explain tool shows that it is used for conditions like equals, but it is not
used for ORDER BY name().
Is name() supported in the ORDER BY clause? If yes, we would need to add
support for that in oak-lucene.
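[Editor's note] A sketch of what an ORDER BY name() clause asks of an index. The JCR-SQL2 query in the comment is a standard example; the plain-Java sorting below is a stand-in for the ordered lookup oak-lucene's :nodeName support would need to provide, and the {{name}} helper is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class OrderByNodeNameSketch {
    // hypothetical helper: the local node name is the last path segment
    static String name(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // e.g. SELECT * FROM [nt:base] ORDER BY name()
        List<String> paths = new ArrayList<>(
                List.of("/content/c", "/content/a", "/content/b"));
        // an index supporting the clause returns results already in this order,
        // avoiding a sort over the full result set in the query engine
        paths.sort(Comparator.comparing(OrderByNodeNameSketch::name));
        System.out.println(paths);
    }
}
```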
[jira] [Resolved] (OAK-4803) Simplify the client side of the cold standby
[ https://issues.apache.org/jira/browse/OAK-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco Mari resolved OAK-4803.
---------------------------------
    Resolution: Fixed

Fixed at r1760934.

> Simplify the client side of the cold standby
> --------------------------------------------
>
>                 Key: OAK-4803
>                 URL: https://issues.apache.org/jira/browse/OAK-4803
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Francesco Mari
>            Assignee: Francesco Mari
>             Fix For: Segment Tar 0.0.12
>
>         Attachments: OAK-4803-01.patch
>
> The implementation of the cold standby client is overly and unnecessarily
> complicated. It would be way clearer to separate the client code into two
> major components: a simple client responsible for sending messages to and
> receiving responses from the standby server, and the synchronization
> algorithm used to read data from the server and to save the read data in
> the local {{FileStore}}.
> Moreover, the simple client could be further modularised by encapsulating
> request encoding, response decoding and message handling into their own
> Netty handlers.
[jira] [Commented] (OAK-4811) MongoToMongoFbsTest fails
[ https://issues.apache.org/jira/browse/OAK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493345#comment-15493345 ]

Dominique Jäggi commented on OAK-4811:
--------------------------------------
[~mreutegg], I didn't consciously ignore this test. It came with the merged
commit.
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493322#comment-15493322 ]

Michael Dürig commented on OAK-4740:
------------------------------------
Yes, that would also help my understanding of what this issue was about
initially... is recovery for the binary index completely broken, or did we
just break it for binaries with ids > 4k? My fix applies to the latter only.
[jira] [Resolved] (OAK-4811) MongoToMongoFbsTest fails
[ https://issues.apache.org/jira/browse/OAK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger resolved OAK-4811.
-----------------------------------
    Resolution: Fixed

With the changes for OAK-4174, the MongoToMongoFbsTest now also succeeds.
[jira] [Updated] (OAK-4174) SegmentToJdbcTest failing with improvements of OAK-4119
[ https://issues.apache.org/jira/browse/OAK-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger updated OAK-4174:
----------------------------------
    Fix Version/s: 1.4.8

Merged into the 1.4 branch: http://svn.apache.org/r1760930

> SegmentToJdbcTest failing with improvements of OAK-4119
> -------------------------------------------------------
>
>                 Key: OAK-4174
>                 URL: https://issues.apache.org/jira/browse/OAK-4174
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: upgrade
>            Reporter: angela
>            Assignee: Tomek Rękawek
>            Priority: Critical
>             Fix For: 1.6, 1.4.8
>
> Despite the fact that OAK-4128 has been fixed, I get a test failure for
> SegmentToJdbcTest.validateMigration with my pending improvements from
> OAK-4119.
> In order not to break the build I will temporarily mark the test with
> @Ignore, as discussed on the mailing list.
[jira] [Commented] (OAK-4811) MongoToMongoFbsTest fails
[ https://issues.apache.org/jira/browse/OAK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493291#comment-15493291 ]

Marcel Reutegger commented on OAK-4811:
---------------------------------------
The root cause is a missing fix when the changes for OAK-4679 were
backported: OAK-4174 is also required.

[~dominique.jaeggi], please do not simply ignore tests when you commit
changes. At least a comment in the issue would be nice and would make it
easier for others to clean up afterwards.
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493289#comment-15493289 ]

Alex Parvulescu commented on OAK-4740:
--------------------------------------
bq. From here I'm not entirely sure what's left to do here.

Add a test for the recovery bits, maybe? :)
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493251#comment-15493251 ]

Michael Dürig commented on OAK-4740:
------------------------------------
Ok, committed that fix at http://svn.apache.org/viewvc?rev=1760927&view=rev.

From here I'm not entirely sure what's left to do here. Does tar regeneration
already work with this fix, or is there work left to do in that area?
[jira] [Comment Edited] (OAK-4811) MongoToMongoFbsTest fails
[ https://issues.apache.org/jira/browse/OAK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492909#comment-15492909 ]

Marcel Reutegger edited comment on OAK-4811 at 9/15/16 12:49 PM:
-----------------------------------------------------------------
The test starts to fail on the 1.4 branch with the changes merged from
OAK-4679 in revision http://svn.apache.org/r1756641
However, I don't think those changes are the root cause of the failing test.

was (Author: mreutegg):
The test starts to fail on the 1.4 branch with the changes merged from
OAK-4679 in revision svn.apache.org/r1756641
However, I don't think those changes are the root cause of the failing test.
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493210#comment-15493210 ]

Francesco Mari commented on OAK-4740:
-------------------------------------
It looks good to me. It seems like a reasonable tradeoff between having a
comprehensive binary reference index and being able to recover the index in
case of a dirty shutdown.
[jira] [Comment Edited] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493191#comment-15493191 ]

Michael Dürig edited comment on OAK-4740 at 9/15/16 12:40 PM:
--------------------------------------------------------------
Given the realisation that the above monotonicity assumption does not hold,
and the possible extra complexity wrt. DSGC, I started thinking about other
ways to fix this. One idea would be to keep the discrimination of binary ids
(smaller / bigger than 4k) and the way they are stored, but to change their
representation in the binary index introduced with OAK-4201: for binary ids
bigger than 4k, what if we just put the record id pointing to the string
record containing the blob id into the index (instead of the blob id itself)?
This would give us back recoverability. OTOH it would make the index a bit
more expensive to use, as big binaries would still need an additional
resolution step. However, I think this is a good trade-off to make, as we
should discourage binary ids bigger than 4k anyway.

See https://github.com/mduerig/jackrabbit-oak/commit/c7ce960a422fa3ae9f5cbe97d1cf05c63988b036
for a POC of this. [~frm], let me know what you think about this.

was (Author: mduerig):
Given the realisation that the above monotonicity assumption does not hold,
and the possible extra complexity wrt. DSGC, I started thinking about other
ways to fix this. One idea would be to keep the discrimination of binary ids
(smaller / bigger than 4k) and the way they are stored, but to change their
representation in the binary index introduced with OAK-4201: for binary ids
bigger than 4k, what if we just put the record id pointing to the string
record containing the blob id into the index (instead of the blob id itself)?
This would give us back recoverability. OTOH it would make the index a bit
more expensive to use, as big binaries would still need an additional
resolution step. However, I think this is a good trade-off to make, as we
should discourage binary ids bigger than 4k anyway.
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493191#comment-15493191 ] Michael Dürig commented on OAK-4740: Given the realisation that the above monotonicity assumption does not hold and the possible extra complexity wrt. DSGC, I started thinking about other ways to fix this. One idea would be to keep the discrimination of binary ids (smaller / bigger than 4k) and the way they are stored, but to change their representation in the binary index introduced with OAK-4201: for binary ids bigger than 4k, what if we just put the record id pointing to the string record containing the blob id into the index (instead of the blob id itself)? This would give us back recoverability. OTOH it would make the index a bit more expensive to use, as big binaries would still need an additional resolution step. However, I think this is a good trade off to make, as we should discourage binary ids bigger than 4k anyway. > TarReader recovery skips generating the index and binary graphs > --- > > Key: OAK-4740 > URL: https://issues.apache.org/jira/browse/OAK-4740 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segment-tar >Reporter: Alex Parvulescu >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.16 > > > As noticed from the tar recovery bits [0] the resulting tar file would lack > the binary reference graph and index graph. This has implications on the DSGC > (not properly reporting binary references would result in binaries being > GC'ed) and GC operations. > / cc [~frm], [~mduerig] > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/TarReader.java#L216 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
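The indirection proposed in the comment above — storing small blob ids inline in the binary index, but replacing big ones with a record id that points to a string record holding the blob id — can be sketched as follows. This is a minimal, hypothetical simplification for illustration: the class and field names are not Oak's actual types, and the 4k threshold is taken from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: short blob ids are stored inline in the index and
// are directly recoverable; long ones are stored as a "record id" that
// points to a string record containing the actual blob id.
class BinaryIndexSketch {
    static final int THRESHOLD = 4096; // ids longer than this are stored by reference

    final Map<Integer, String> stringRecords = new HashMap<>(); // record id -> blob id
    final Map<String, Object> index = new HashMap<>();          // entry key -> inline id or record id
    int nextRecordId = 0;

    void put(String key, String blobId) {
        if (blobId.length() <= THRESHOLD) {
            index.put(key, blobId);              // inline: no extra lookup needed
        } else {
            int recordId = nextRecordId++;
            stringRecords.put(recordId, blobId); // big id lives in a string record
            index.put(key, recordId);            // index keeps only the reference
        }
    }

    String resolve(String key) {
        Object entry = index.get(key);
        if (entry instanceof String) {
            return (String) entry;               // inline id, resolved directly
        }
        return stringRecords.get(entry);         // additional resolution step for big ids
    }
}
```

The extra `resolve` step for big ids is exactly the cost the comment accepts as a reasonable trade-off.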
[jira] [Commented] (OAK-4740) TarReader recovery skips generating the index and binary graphs
[ https://issues.apache.org/jira/browse/OAK-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493168#comment-15493168 ] Michael Dürig commented on OAK-4740: That threshold means that our assumption that the ids of binaries grow monotonically with the size of the binaries doesn't hold. > TarReader recovery skips generating the index and binary graphs > --- > > Key: OAK-4740 > URL: https://issues.apache.org/jira/browse/OAK-4740 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segment-tar >Reporter: Alex Parvulescu >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.16 > > > As noticed from the tar recovery bits [0] the resulting tar file would lack > the binary reference graph and index graph. This has implications on the DSGC > (not properly reporting binary references would result in binaries being > GC'ed) and GC operations. > / cc [~frm], [~mduerig] > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/TarReader.java#L216 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-4813) Simplify the server side of cold standby
Andrei Dulceanu created OAK-4813: Summary: Simplify the server side of cold standby Key: OAK-4813 URL: https://issues.apache.org/jira/browse/OAK-4813 Project: Jackrabbit Oak Issue Type: Improvement Components: segment-tar Reporter: Andrei Dulceanu Assignee: Andrei Dulceanu Priority: Minor Fix For: Segment Tar 0.0.12 With the changes introduced in OAK-4803, it would be nice to restore the previous symmetry between the client and the server and thus remove the {{FileStore}} reference from the latter. Per [~frm]'s suggestion from one of the comments in OAK-4803: bq. In the end, these are the only three lines where the FileStore is used in the server, which already suggests that this separation of concerns exists - at least at the level of the handlers. {code:java} p.addLast(new GetHeadRequestHandler(new DefaultStandbyHeadReader(store))); p.addLast(new GetSegmentRequestHandler(new DefaultStandbySegmentReader(store))); p.addLast(new GetBlobRequestHandler(new DefaultStandbyBlobReader(store))); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
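The "very small interface" idea behind this improvement — letting each handler depend on a narrow reader interface instead of on the {{FileStore}} itself — could look roughly like the sketch below. The interface names echo the snippet above, but the bodies are illustrative assumptions, not Oak's actual implementation.

```java
// Hypothetical sketch: narrow reader interfaces decouple the standby
// server handlers from the FileStore. Each handler only sees the one
// capability it needs.
interface StandbyHeadReader { String readHeadRecordId(); }
interface StandbySegmentReader { byte[] readSegment(String segmentId); }
interface StandbyBlobReader { byte[] readBlob(String blobId); }

// A handler then takes a reader, never the FileStore itself.
class GetHeadRequestHandlerSketch {
    private final StandbyHeadReader reader;

    GetHeadRequestHandlerSketch(StandbyHeadReader reader) {
        this.reader = reader;
    }

    String handle() {
        return reader.readHeadRecordId(); // the FileStore stays behind the interface
    }
}
```

In production code the default readers would wrap the {{FileStore}}, while tests can pass a stub (or a lambda, since the interfaces are functional).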
[jira] [Commented] (OAK-4803) Simplify the client side of the cold standby
[ https://issues.apache.org/jira/browse/OAK-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493120#comment-15493120 ] Andrei Dulceanu commented on OAK-4803: -- [~frm], I guess you're right about the separation of concerns pointed out for my 1st suggestion. bq. Is this maybe the scope of another improvement issue? Maybe it makes sense to move this to another issue, since this issue already handles quite a number of aspects. bq. I don't quite agree here. You need a client to perform a sync. Reading the whole explanation, I think it's a valid point of view, so I will adhere to it :) > Simplify the client side of the cold standby > > > Key: OAK-4803 > URL: https://issues.apache.org/jira/browse/OAK-4803 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.12 > > Attachments: OAK-4803-01.patch > > > The implementation of the cold standby client is overly and unnecessarily > complicated. It would be way clearer to separate the client code in two major > components: a simple client responsible for sending messages to and receive > responses from the standby server, and the synchronization algorithm used to > read data from the server and to save read data in the local {{FileStore}}. > Moreover, the simple client could be further modularised by > encapsulating request encoding, response decoding and message handling into > their own Netty handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-4783) Update Oak 1.0 to Jackrabbit 2.8.3
[ https://issues.apache.org/jira/browse/OAK-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke resolved OAK-4783. - Resolution: Fixed 1.0: [r1760921|http://svn.apache.org/r1760921] > Update Oak 1.0 to Jackrabbit 2.8.3 > --- > > Key: OAK-4783 > URL: https://issues.apache.org/jira/browse/OAK-4783 > Project: Jackrabbit Oak > Issue Type: Task >Affects Versions: 1.0.33 >Reporter: Julian Reschke >Assignee: Julian Reschke > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-4783) Update Oak 1.0 to Jackrabbit 2.8.3
[ https://issues.apache.org/jira/browse/OAK-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke updated OAK-4783: Fix Version/s: 1.0.34 > Update Oak 1.0 to Jackrabbit 2.8.3 > --- > > Key: OAK-4783 > URL: https://issues.apache.org/jira/browse/OAK-4783 > Project: Jackrabbit Oak > Issue Type: Task >Affects Versions: 1.0.33 >Reporter: Julian Reschke >Assignee: Julian Reschke > Fix For: 1.0.34 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4803) Simplify the client side of the cold standby
[ https://issues.apache.org/jira/browse/OAK-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493103#comment-15493103 ] Francesco Mari commented on OAK-4803: - bq. When looking at StandbyClient and StandbyServer I find it a little bit odd that the former doesn't have a FileStore, while the latter has. IMHO having a FileStore reference in the client would make sense and would also help reading the code, as it is much clearer from the beginning who owns what. This comment makes a lot of sense, because it highlights an asymmetry between the client and the server. To be honest, I prefer the way the client is written, without an explicit reference to the {{FileStore}}. This highlights the fact that the client is concerned with sending requests and parsing responses from the server, instead of taking care of how the data is actually used. I could have achieved the same separation of concerns in the server as well by introducing a very small interface. In the end, these are the only three lines where the {{FileStore}} is used in the server, which already suggests that this separation of concerns exists - at least at the level of the handlers. {noformat} p.addLast(new GetHeadRequestHandler(new DefaultStandbyHeadReader(store))); p.addLast(new GetSegmentRequestHandler(new DefaultStandbySegmentReader(store))); p.addLast(new GetBlobRequestHandler(new DefaultStandbyBlobReader(store))); {noformat} Is this maybe the scope of another improvement issue? bq. It looks more natural to have a client which wants to perform a sync as opposed to a sync which will create a client. I don't quite agree here. You need a client to perform a sync. The client could be used for purposes other than the sync, so it makes sense to have the sync process depend on a client and not the other way around. Anyway, the lines you pointed out are leftovers from the refactoring and I recognise that they are confusing. I will clean them up. bq. 
One minor change in {{StandbySync.run()}}, to allow the state to actually enter {{STATUS_STARTING}}. Good catch. It has to be cleaned up as part of this patch. bq. Rename {{copySegmentFromPrimary}} to {{copySegmentHierarchyFromPrimary}} or any other explanatory method name, since this method does a BFS starting with the initial segment to fetch from server. Nice suggestion. That method should have a more appropriate name, and I like your proposal. > Simplify the client side of the cold standby > > > Key: OAK-4803 > URL: https://issues.apache.org/jira/browse/OAK-4803 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.12 > > Attachments: OAK-4803-01.patch > > > The implementation of the cold standby client is overly and unnecessarily > complicated. It would be way clearer to separate the client code in two major > components: a simple client responsible for sending messages to and receive > responses from the standby server, and the synchronization algorithm used to > read data from the server and to save read data in the local {{FileStore}}. > Moreover, the simple client could be further modularised by > encapsulating request encoding, response decoding and message handling into > their own Netty handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-4803) Simplify the client side of the cold standby
[ https://issues.apache.org/jira/browse/OAK-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492953#comment-15492953 ] Andrei Dulceanu edited comment on OAK-4803 at 9/15/16 10:26 AM: [~frm], here are some of my observations regarding the patch: # When looking at {{StandbyClient}} and {{StandbyServer}} I find it a little bit odd that the former doesn't have a {{FileStore}}, while the latter has. IMHO having a {{FileStore}} reference in the client would make sense and would also help reading the code, as it is much clearer from the beginning who owns what. # Along the same lines, IMHO it would make sense to reverse the relationship between {{StandbyClient}} and {{StandbySync}} since it looks more natural to have a client which wants to perform a sync as opposed to have a sync which will create a client, assign it a file store and then execute the sync. For example, replacing the line {code:java} StandbyClient cl = newStandbyClient(secondary); {code} with this line {code:java} StandbySync cl = newStandbyClient(secondary); {code} seems a little bit confusing to me. # One minor change in {{StandbySync.run()}}, to allow the state to actually enter {{STATUS_STARTING}}: {code:java} state = STATUS_STARTING; synchronized (sync) { if (active) { return; } state = STATUS_RUNNING; active = true; } {code} # another minor change in {{StandbySyncExecution}}: rename {{copySegmentFromPrimary}} to {{copySegmentHierarchyFromPrimary}} or any other explanatory method name, since this method does a BFS starting with the initial segment to fetch from server. /cc [~marett] was (Author: dulceanu): @frm, here are some of my observations regarding the patch: # When looking at {{StandbyClient}} and {{StandbyServer}} I find it a little bit odd that the former doesn't have a {{FileStore}}, while the latter has. 
IMHO having a {{FileStore}} reference in the client would make sense and would also help reading the code, as it is much clearer from the beginning who owns what. # Along the same lines, IMHO it would make sense to reverse the relationship between {{StandbyClient}} and {{StandbySync}} since it looks more natural to have a client which wants to perform a sync as opposed to have a sync which will create a client, assign it a file store and then execute the sync. For example, replacing the line {code:java} StandbyClient cl = newStandbyClient(secondary); {code} with this line {code:java} StandbySync cl = newStandbyClient(secondary); {code} seems a little bit confusing to me. # One minor change in {{StandbySync.run()}}, to allow the state to actually enter {{STATUS_STARTING}}: {code:java} state = STATUS_STARTING; synchronized (sync) { if (active) { return; } state = STATUS_RUNNING; active = true; } {code} # another minor change in {{StandbySyncExecution}}: rename {{copySegmentFromPrimary}} to {{copySegmentHierarchyFromPrimary}} or any other explanatory method name, since this method does a BFS starting with the initial segment to fetch from server. /cc [~marett] > Simplify the client side of the cold standby > > > Key: OAK-4803 > URL: https://issues.apache.org/jira/browse/OAK-4803 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.12 > > Attachments: OAK-4803-01.patch > > > The implementation of the cold standby client is overly and unnecessarily > complicated. It would be way clearer to separate the client code in two major > components: a simple client responsible for sending messages to and receive > responses from the standby server, and the synchronization algorithm used to > read data from the server and to save read data in the local {{FileStore}}. 
> Moreover, the simple client could be further modularised by > encapsulating request encoding, response decoding and message handling into > their own Netty handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4803) Simplify the client side of the cold standby
[ https://issues.apache.org/jira/browse/OAK-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492953#comment-15492953 ] Andrei Dulceanu commented on OAK-4803: -- @frm, here are some of my observations regarding the patch: # When looking at {{StandbyClient}} and {{StandbyServer}} I find it a little bit odd that the former doesn't have a {{FileStore}}, while the latter does. IMHO having a {{FileStore}} reference in the client would make sense and would also help reading the code, as it is much clearer from the beginning who owns what. # Along the same lines, IMHO it would make sense to reverse the relationship between {{StandbyClient}} and {{StandbySync}}, since it looks more natural to have a client which wants to perform a sync as opposed to a sync which will create a client, assign it a file store and then execute the sync. For example, replacing the line {code:java} StandbyClient cl = newStandbyClient(secondary); {code} with this line {code:java} StandbySync cl = newStandbyClient(secondary); {code} seems a little bit confusing to me. # One minor change in {{StandbySync.run()}}, to allow the state to actually enter {{STATUS_STARTING}}: {code:java} state = STATUS_STARTING; synchronized (sync) { if (active) { return; } state = STATUS_RUNNING; active = true; } {code} # Another minor change in {{StandbySyncExecution}}: rename {{copySegmentFromPrimary}} to {{copySegmentHierarchyFromPrimary}} or any other explanatory method name, since this method does a BFS starting with the initial segment to fetch from server. 
/cc [~marett] > Simplify the client side of the cold standby > > > Key: OAK-4803 > URL: https://issues.apache.org/jira/browse/OAK-4803 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.12 > > Attachments: OAK-4803-01.patch > > > The implementation of the cold standby client is overly and unnecessarily > complicated. It would be way clearer to separate the client code in two major > components: a simple client responsible for sending messages to and receive > responses from the standby server, and the synchronization algorithm used to > read data from the server and to save read data in the local {{FileStore}}. > Moreover, the simple client could be further modularised by > encapsulating request encoding, response decoding and message handling into > their own Netty handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-4287) Disable / remove SegmentBufferWriter#checkGCGen
[ https://issues.apache.org/jira/browse/OAK-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig resolved OAK-4287. Resolution: Fixed Fixed at http://svn.apache.org/viewvc?rev=1760914&view=rev Turns out that OAK-4631 removed the check already. I added it back but disabled it by default. It can be enabled with {{-Denable-generation-check=true}}. cc [~volteanu] > Disable / remove SegmentBufferWriter#checkGCGen > --- > > Key: OAK-4287 > URL: https://issues.apache.org/jira/browse/OAK-4287 > Project: Jackrabbit Oak > Issue Type: Task > Components: segment-tar >Reporter: Michael Dürig >Assignee: Michael Dürig > Labels: assertion, compaction, gc > Fix For: Segment Tar 0.0.12 > > > {{SegmentBufferWriter#checkGCGen}} is an after-the-fact check for back > references (see OAK-3348), logging a warning if it detects any. As this check > loads the segment it checks the reference for, it is somewhat expensive. We > should either come up with a cheaper way for this check or remove it (at > least disable it by default). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
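Gating an expensive assertion behind a system property, as described in the resolution above, can be sketched as follows. The flag name is taken from the comment; the class and method are illustrative stand-ins, not Oak's actual code.

```java
// Minimal sketch of a check that is disabled by default and enabled
// via -Denable-generation-check=true (flag name from the comment above).
class GenerationCheckSketch {
    // Boolean.getBoolean reads the system property; false when unset.
    static final boolean ENABLED = Boolean.getBoolean("enable-generation-check");

    // Runs the expensive check only when explicitly enabled; returns
    // whether the check actually ran.
    static boolean maybeCheck(Runnable expensiveCheck) {
        if (ENABLED) {
            expensiveCheck.run();
            return true;
        }
        return false;
    }
}
```

The benefit is that the costly segment load happens only in setups that opt in, while production deployments pay nothing for the disabled check.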
[jira] [Updated] (OAK-4287) Disable / remove SegmentBufferWriter#checkGCGen
[ https://issues.apache.org/jira/browse/OAK-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-4287: --- Fix Version/s: (was: Segment Tar 0.0.24) Segment Tar 0.0.12 > Disable / remove SegmentBufferWriter#checkGCGen > --- > > Key: OAK-4287 > URL: https://issues.apache.org/jira/browse/OAK-4287 > Project: Jackrabbit Oak > Issue Type: Task > Components: segment-tar >Reporter: Michael Dürig >Assignee: Michael Dürig > Labels: assertion, compaction, gc > Fix For: Segment Tar 0.0.12 > > > {{SegmentBufferWriter#checkGCGen}} is an after-the-fact check for back > references (see OAK-3348), logging a warning if it detects any. As this check > loads the segment it checks the reference for, it is somewhat expensive. We > should either come up with a cheaper way for this check or remove it (at > least disable it by default). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4811) MongoToMongoFbsTest fails
[ https://issues.apache.org/jira/browse/OAK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492909#comment-15492909 ] Marcel Reutegger commented on OAK-4811: --- The test starts to fail on the 1.4 branch with changes merged from OAK-4679 in revision svn.apache.org/r1756641 However, I don't think those changes are the root cause for the failing test. > MongoToMongoFbsTest fails > - > > Key: OAK-4811 > URL: https://issues.apache.org/jira/browse/OAK-4811 > Project: Jackrabbit Oak > Issue Type: Bug > Components: upgrade >Affects Versions: 1.4.7 >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger > Fix For: 1.4.8 > > > The test fails in the current 1.4 branch and also with the 1.4.7 release when > a local MongoDB is running. > {noformat} > validateMigration(org.apache.jackrabbit.oak.upgrade.cli.MongoToMongoFbsTest) > Time elapsed: 5.628 sec <<< ERROR! > java.lang.IllegalStateException: This builder does not exist: default > {noformat} > The test runs successfully with 1.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-4412) Lucene hybrid index
[ https://issues.apache.org/jira/browse/OAK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra updated OAK-4412: - Labels: docs-impacting (was: ) > Lucene hybrid index > --- > > Key: OAK-4412 > URL: https://issues.apache.org/jira/browse/OAK-4412 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: lucene >Reporter: Tomek Rękawek >Assignee: Chetan Mehrotra > Labels: docs-impacting > Fix For: 1.6, 1.5.11 > > Attachments: OAK-4412-v1.diff, OAK-4412.patch, hybrid-benchmark.sh, > hybrid-result-v1.txt > > > When running Oak in a cluster, each write operation is expensive. After > performing some stress-tests with a geo-distributed Mongo cluster, we've > found out that updating property indexes is a large part of the overall > traffic. > The asynchronous index would be an answer here (as the index update won't be > made in the client request thread), but AEM requires the updates to be > visible immediately in order to work properly. > The idea here is to enhance the existing asynchronous Lucene index with a > synchronous, locally-stored counterpart that will persist only the data since > the last Lucene background reindexing job. > The new index can be stored in memory or (if necessary) in MMAPed local > files. Once the "main" Lucene index is being updated, the local index will be > purged. > Queries will use a union of results from the {{lucene}} and > {{lucene-memory}} indexes. > The {{lucene-memory}} index, as a locally stored entity, will be updated using > an observer, so it'll get both local and remote changes. > The original idea has been suggested by [~chetanm] in the discussion for the > OAK-4233. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
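The hybrid lookup described in the issue — a synchronous local index holding only the data since the last background reindex, queried as a union with the async index and purged once the async index catches up — can be sketched like this. A deliberately simplified model with sets of paths standing in for Lucene indexes; none of these names are Oak's actual classes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the hybrid index: queries union the async "main"
// index with the synchronous in-memory index; the memory part is purged
// when the async job catches up.
class HybridIndexSketch {
    final Set<String> mainIndex = new HashSet<>();   // updated by the async background job
    final Set<String> memoryIndex = new HashSet<>(); // updated synchronously per commit

    // Synchronous path: new entries become visible immediately.
    void addSync(String path) {
        memoryIndex.add(path);
    }

    // Async background job: fold pending entries into the main index,
    // then purge the local counterpart.
    void asyncReindex() {
        mainIndex.addAll(memoryIndex);
        memoryIndex.clear();
    }

    // Query: union of results from "lucene" and "lucene-memory".
    Set<String> query() {
        Set<String> results = new HashSet<>(mainIndex);
        results.addAll(memoryIndex);
        return results;
    }
}
```

The key property is that `query()` returns the same results before and after `asyncReindex()`, which is what lets the async job lag behind without breaking immediate visibility.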
[jira] [Updated] (OAK-4811) MongoToMongoFbsTest fails
[ https://issues.apache.org/jira/browse/OAK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated OAK-4811: -- Fix Version/s: 1.4.8 > MongoToMongoFbsTest fails > - > > Key: OAK-4811 > URL: https://issues.apache.org/jira/browse/OAK-4811 > Project: Jackrabbit Oak > Issue Type: Bug > Components: upgrade >Affects Versions: 1.4.7 >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger > Fix For: 1.4.8 > > > The test fails in the current 1.4 branch and also with the 1.4.7 release when > a local MongoDB is running. > {noformat} > validateMigration(org.apache.jackrabbit.oak.upgrade.cli.MongoToMongoFbsTest) > Time elapsed: 5.628 sec <<< ERROR! > java.lang.IllegalStateException: This builder does not exist: default > {noformat} > The test runs successfully with 1.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4812) Reduce calls to SegmentStore#newSegmentId from the Segment class
[ https://issues.apache.org/jira/browse/OAK-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492820#comment-15492820 ] Francesco Mari commented on OAK-4812: - I guess the record ID cache per segment is kind of useless. It would be better as a segment ID cache, since segment IDs seem to be the bulk of the problem. Creating record IDs is cheap, but it's not the same for segment IDs. > Reduce calls to SegmentStore#newSegmentId from the Segment class > > > Key: OAK-4812 > URL: https://issues.apache.org/jira/browse/OAK-4812 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Alex Parvulescu >Priority: Minor > > OAK-4631 introduced a change in records handling in a segment that will > amplify the number of calls to {{SegmentStore#newSegmentId}} by the number of > external references [0]. It usually is the case that there are a lot of > record references that point to the same segment id, and the existing > {{recordIdCache}} would not help much in this case. > The scenario I'm seeing for offline compaction (might be a bit biased) is a > full traversal of segments that increases pressure on the {{SegmentIdTable}} > by calling {{newSegmentId}} with a lot of already existing segments. > I'm creating this issue as an 'Improvement' as I think it is interesting to > look into reducing this pressure. This might be by squeezing more out of the > {{SegmentIdTable}} bits (I'd like to followup on this with a benchmark) or > revisiting the code paths from the {{Segment}} class. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Segment.java#L405 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
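The segment-ID cache suggested in the comment — keying the per-segment cache on the referenced segment id, so that many record references into the same segment cost only one `SegmentStore#newSegmentId` lookup — can be sketched as below. This is a hypothetical simplification: the key is a placeholder string standing in for the msb/lsb pair, and `Object` stands in for the `SegmentId` instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-segment cache keyed on the referenced segment id:
// repeated references to the same segment trigger only one simulated
// SegmentStore#newSegmentId call.
class SegmentIdCacheSketch {
    int lookups = 0; // counts simulated SegmentStore#newSegmentId calls
    private final Map<String, Object> cache = new HashMap<>();

    Object getSegmentId(String msbLsb) {
        return cache.computeIfAbsent(msbLsb, k -> {
            lookups++;           // stands in for the expensive SegmentIdTable lookup
            return new Object(); // stands in for the resolved SegmentId instance
        });
    }
}
```

Since most record references within a segment point to a handful of distinct segments, deduplicating at the segment-id level (rather than at the record-id level) captures the bulk of the savings.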
[jira] [Comment Edited] (OAK-4803) Simplify the client side of the cold standby
[ https://issues.apache.org/jira/browse/OAK-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15490303#comment-15490303 ] Francesco Mari edited comment on OAK-4803 at 9/15/16 9:11 AM: -- The patch implements everything that is described in the issue and adds unit tests for the newly introduced components. Some unnecessary components like {{StandbyStore}}, {{SegmentLoaderHandler}}, {{StandbyClientHandler}}, and other encoders and decoders have been removed as part of this patch. was (Author: frm): The patch implements everything that is describing in the issue and adds unit tests for the newly introduced components. Some unnecessary components like {{StandbyStore}}, {{SegmentLoaderHandler}}, {{StandbyClientHandler}}, and other encoders and decoders have been removed as part of this patch. > Simplify the client side of the cold standby > > > Key: OAK-4803 > URL: https://issues.apache.org/jira/browse/OAK-4803 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.12 > > Attachments: OAK-4803-01.patch > > > The implementation of the cold standby client is overly and unnecessarily > complicated. It would be way clearer to separate the client code in two major > components: a simple client responsible for sending messages to and receive > responses from the standby server, and the synchronization algorithm used to > read data from the server and to save read data in the local {{FileStore}}. > Moreover, the simple client could be further modularised by > encapsulating request encoding, response decoding and message handling into > their own Netty handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4631) Simplify the format of segments and serialized records
[ https://issues.apache.org/jira/browse/OAK-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492797#comment-15492797 ] Alex Parvulescu commented on OAK-4631: -- Another side effect I noticed is OAK-4812, we can followup there, so we don't overload this issue too much. > Simplify the format of segments and serialized records > -- > > Key: OAK-4631 > URL: https://issues.apache.org/jira/browse/OAK-4631 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.10 > > Attachments: OAK-4631-01.patch, OAK-4631-02.patch, OAK-4631-03.patch, > OAK-4631-04.patch > > > As discussed in [this thread|http://markmail.org/thread/3oxp6ydboyefr4bg], it > might be beneficial to simplify both the format of the segments and the way > record IDs are serialised. A new strategy needs to be investigated to reach > the right compromise between performance, disk space utilization and > simplicity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-4812) Reduce calls to SegmentStore#newSegmentId from the Segment class
Alex Parvulescu created OAK-4812: Summary: Reduce calls to SegmentStore#newSegmentId from the Segment class Key: OAK-4812 URL: https://issues.apache.org/jira/browse/OAK-4812 Project: Jackrabbit Oak Issue Type: Improvement Components: segment-tar Reporter: Alex Parvulescu Priority: Minor OAK-4631 introduced a change in records handling in a segment that will amplify the number of calls to {{SegmentStore#newSegmentId}} by the number of external references [0]. It usually is the case that there are a lot of record references that point to the same segment id, and the existing {{recordIdCache}} would not help much in this case. The scenario I'm seeing for offline compaction (might be a bit biased) is a full traversal of segments that increases pressure on the {{SegmentIdTable}} by calling {{newSegmentId}} with a lot of already existing segments. I'm creating this issue as an 'Improvement' as I think it is interesting to look into reducing this pressure. This might be by squeezing more out of the {{SegmentIdTable}} bits (I'd like to followup on this with a benchmark) or revisiting the code paths from the {{Segment}} class. [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Segment.java#L405 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-4811) MongoToMongoFbsTest fails
Marcel Reutegger created OAK-4811: - Summary: MongoToMongoFbsTest fails Key: OAK-4811 URL: https://issues.apache.org/jira/browse/OAK-4811 Project: Jackrabbit Oak Issue Type: Bug Components: upgrade Affects Versions: 1.4.7 Reporter: Marcel Reutegger Assignee: Marcel Reutegger The test fails in the current 1.4 branch and also with the 1.4.7 release when a local MongoDB is running. {noformat} validateMigration(org.apache.jackrabbit.oak.upgrade.cli.MongoToMongoFbsTest) Time elapsed: 5.628 sec <<< ERROR! java.lang.IllegalStateException: This builder does not exist: default {noformat} The test runs successfully with 1.4.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4810) FileDataStore: support SHA-2
[ https://issues.apache.org/jira/browse/OAK-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492777#comment-15492777 ] Chetan Mehrotra commented on OAK-4810: -- bq. I think default for writing (if not configured explicitly) could still be SHA-1. The change can be made anytime. It should not affect any other part much. So the default value can simply be switched to SHA-256. Once a binary is added with any digest method, we do not need the method details when doing a read, as that works purely on the basis of the id. Still, it would be good to encode the algo in the id which is passed back to the NodeStore > FileDataStore: support SHA-2 > > > Key: OAK-4810 > URL: https://issues.apache.org/jira/browse/OAK-4810 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: blob >Reporter: Thomas Mueller > > The FileDataStore currently uses SHA-1, but that algorithm is deprecated. We > should support other algorithms as well (mainly SHA-256). > Migration should be painless (no long downtime). I think default for writing > (if not configured explicitly) could still be SHA-1. But when reading, > SHA-256 should also be supported (depending on the identifier). That way, the > new Oak version for all repositories (in a cluster + shared datastore) can be > installed "slowly". > After all repositories are running with the new Oak version, the > configuration for SHA-256 can be enabled. That way, SHA-256 is used for new > binaries. Both SHA-1 and SHA-256 are supported for reading. > One potential downside is deduplication would suffer a bit if a new Blob with > same content is added again as digest based match would fail. That can be > mitigated by computing 2 types of digest if need arises. The downsides are > some additional file operations and CPU, and slower migration to SHA-256. > Some other open questions: > * While we are at it, it might make sense to additionally support SHA-3 and > other algorithms (make it configurable). 
But the length of the identifier > alone might then not be enough information to know what algorithm is used, so > maybe add a prefix. > * The number of subdirectory levels: should we keep it as is, or should we > reduce it (for example one level less). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
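The read-side behaviour discussed above can be sketched in plain Java. This is an illustrative sketch only, not the actual FileDataStore code; the helper names are hypothetical. It shows how hex identifier length alone can distinguish SHA-1 (40 hex characters) from SHA-256 (64 hex characters) on read, which is why the writing default can be switched independently:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch (hypothetical helpers, not the actual FileDataStore code):
// content identifiers are hex digests, so the identifier length alone can
// distinguish SHA-1 (40 hex chars) from SHA-256 (64 hex chars) when reading.
public class DigestIds {

    // Hypothetical helper: compute the hex identifier for a binary.
    static String identifier(byte[] content, String algorithm) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance(algorithm).digest(content);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Hypothetical helper: infer the algorithm from the identifier length,
    // as suggested in the issue ("depending on the identifier").
    static String algorithmFor(String id) {
        switch (id.length()) {
            case 40: return "SHA-1";
            case 64: return "SHA-256";
            default: throw new IllegalArgumentException("Unknown identifier format: " + id);
        }
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        String sha1 = identifier("hello".getBytes(), "SHA-1");
        String sha256 = identifier("hello".getBytes(), "SHA-256");
        System.out.println(algorithmFor(sha1));   // SHA-1
        System.out.println(algorithmFor(sha256)); // SHA-256
    }
}
```

For SHA-3 or other algorithms with colliding digest lengths, the length heuristic breaks down, which is exactly why the issue suggests an explicit prefix in the identifier.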
[jira] [Commented] (OAK-4805) Misconfigured lucene index definition can render the whole system unusable
[ https://issues.apache.org/jira/browse/OAK-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492722#comment-15492722 ] Vikas Saurabh commented on OAK-4805: Ack. I'll try to see how quickly I can do the complete one; otherwise, I'd commit this patch and open another issue as you suggested. > Misconfigured lucene index definition can render the whole system unusable > -- > > Key: OAK-4805 > URL: https://issues.apache.org/jira/browse/OAK-4805 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh > Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4 > Fix For: 1.6 > > Attachments: OAK-4805.patch > > > A mis-configured index definition can throw an exception while collecting > plans. This causes any query (even unrelated ones) to fail, as the cost > calculation logic consults the badly constructed index definition. Overall, a > mis-configured index definition can practically grind the whole system to a > halt, as the whole query framework stops working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
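The failure mode described above (one broken index definition aborting cost calculation for every query) suggests isolating each index during plan collection. A conceptual sketch follows; the types are hypothetical and deliberately simplified, not Oak's actual QueryIndex API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Conceptual sketch (hypothetical types, not Oak's actual QueryIndex API):
// if one index definition throws while its plans are collected, skip that
// index instead of letting the exception abort cost calculation entirely.
public class SafePlanCollector {

    static List<String> collectPlans(List<Supplier<List<String>>> indexes) {
        List<String> plans = new ArrayList<>();
        for (Supplier<List<String>> index : indexes) {
            try {
                plans.addAll(index.get());
            } catch (RuntimeException e) {
                // A single broken index definition no longer takes down the
                // whole query framework; log the failure and move on.
                System.err.println("Skipping broken index: " + e.getMessage());
            }
        }
        return plans;
    }

    public static void main(String[] args) {
        List<Supplier<List<String>>> indexes = List.of(
                () -> List.of("lucene-plan"),
                () -> { throw new IllegalStateException("mis-configured index"); },
                () -> List.of("property-plan"));
        System.out.println(collectPlans(indexes)); // [lucene-plan, property-plan]
    }
}
```

The trade-off of such a guard is that a genuinely broken index silently drops out of query planning, so it should be paired with loud logging or a health check.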
[jira] [Commented] (OAK-4631) Simplify the format of segments and serialized records
[ https://issues.apache.org/jira/browse/OAK-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492715#comment-15492715 ] Francesco Mari commented on OAK-4631: - It's also interesting to take the average of the values above, because it helps put this information in perspective. - source, Oak 1.0 {noformat} 135 KB per data segment 52 bytes per map 0.28 bytes per list 7 bytes per template 5 bytes per node {noformat} - upgraded instance, pre OAK-4631 {noformat} 33 KB per data segment 46 bytes per map 12 bytes per list 7 bytes per template 4 bytes per node {noformat} - upgraded instance, post OAK-4631 {noformat} 251 KB per data segment 182 bytes per map 58 bytes per list 22 bytes per template 35 bytes per node {noformat} Records got bigger, that's undeniable. But as a consequence of this change records are more easily parseable, segments are better utilised, and 54% fewer segments are needed to store the same data. Fewer segments mean smaller book-keeping data structures used throughout the Segment Store, especially when it comes to compaction. This change traded space for simplicity, and I think there is some value in that. > Simplify the format of segments and serialized records > -- > > Key: OAK-4631 > URL: https://issues.apache.org/jira/browse/OAK-4631 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.10 > > Attachments: OAK-4631-01.patch, OAK-4631-02.patch, OAK-4631-03.patch, > OAK-4631-04.patch > > > As discussed in [this thread|http://markmail.org/thread/3oxp6ydboyefr4bg], it > might be beneficial to simplify both the format of the segments and the way > record IDs are serialised. A new strategy needs to be investigated to reach > the right compromise between performance, disk space utilization and > simplicity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
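For reference, averages like the ones above can be derived from the {{oak-run debug}} totals quoted elsewhere in this thread (total bytes per category divided by the record count). A small standalone sketch, assuming binary GB; the exact figures may differ slightly from the comment depending on rounding and on whether GB is interpreted as binary or decimal:

```java
// Sketch of deriving per-record averages from "oak-run debug" totals
// (figures taken from the Oak 1.0 source dump quoted in this thread).
// Assumes 1 GB = 2^30 bytes; results are approximate.
public class SegmentAverages {

    static double average(long totalBytes, long count) {
        return (double) totalBytes / count;
    }

    public static void main(String[] args) {
        long GB = 1024L * 1024 * 1024;
        // "7 GB in 54137 data segments" -> roughly 135 KB per data segment
        System.out.printf("%.1f KB per data segment%n", average(7 * GB, 54137) / 1024);
        // "1 GB in maps (20650196 leaf and branch records)" -> roughly 52 bytes per map record
        System.out.printf("%.1f bytes per map record%n", average(1 * GB, 20650196));
    }
}
```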
[jira] [Commented] (OAK-4804) Synonym analyzer with multiple words in synonym definition can give more results than expected
[ https://issues.apache.org/jira/browse/OAK-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492693#comment-15492693 ] Vikas Saurabh commented on OAK-4804: I couldn't find anything on the web. Single/double quotes didn't work :(. [~teofili], would you know? > Synonym analyzer with multiple words in synonym definition can give more > results than expected > -- > > Key: OAK-4804 > URL: https://issues.apache.org/jira/browse/OAK-4804 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh >Priority: Minor > > Setting up synonyms such as {{"FTW, For the win"}} would also return > documents which contain all of {{"For", "the", "win"}}. > Test case: > {noformat} > @Test > public void fulltextSearchWithPhraseSynonymAnalyzer() throws Exception { > Tree idx = createFulltextIndex(root.getTree("/"), "test"); > TestUtil.useV2(idx); > Tree anl = > idx.addChild(LuceneIndexConstants.ANALYZERS).addChild(LuceneIndexConstants.ANL_DEFAULT); > > anl.addChild(LuceneIndexConstants.ANL_TOKENIZER).setProperty(LuceneIndexConstants.ANL_NAME, > "Standard"); > Tree synFilter = > anl.addChild(LuceneIndexConstants.ANL_FILTERS).addChild("Synonym"); > synFilter.setProperty("synonyms", "syn.txt"); > > synFilter.addChild("syn.txt").addChild(JCR_CONTENT).setProperty(JCR_DATA, > "FTW, For the win"); > Tree test = root.getTree("/").addChild("test"); > test.addChild("1").setProperty("foo", "FTW"); > test.addChild("2").setProperty("foo", "For the win"); > test.addChild("3").setProperty("foo", "For gods sake, this is not the > way to win it"); > root.commit(); > assertQuery("select * from [nt:base] where CONTAINS(*, 'FTW') AND > ISDESCENDANTNODE('/test')", > asList("/test/1", "/test/2"));//current (failing result is > ["/test/1", "/test/2", "/test/3"]) > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
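The behaviour in the test case can be illustrated without Lucene: expanding {{FTW}} into the individual tokens {{for}}, {{the}}, {{win}} makes any document containing all three words match anywhere, whereas a phrase match requires them to be contiguous and in order. A self-contained plain-Java sketch of that distinction (not the actual analyzer chain):

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration (not the actual Lucene analyzer chain) of why the
// multi-word synonym also matches /test/3: token-bag vs. phrase matching.
public class SynonymMatch {

    // Expansion-style match: the document contains every synonym token, anywhere.
    static boolean matchesAllTokens(List<String> doc, List<String> tokens) {
        return doc.containsAll(tokens);
    }

    // Phrase-style match: the tokens must appear contiguously and in order.
    static boolean matchesPhrase(List<String> doc, List<String> tokens) {
        for (int i = 0; i <= doc.size() - tokens.size(); i++) {
            if (doc.subList(i, i + tokens.size()).equals(tokens)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> synonym = Arrays.asList("for", "the", "win");
        // Tokenized content of /test/3 from the test case.
        List<String> doc3 = Arrays.asList("for", "gods", "sake", "this", "is",
                "not", "the", "way", "to", "win", "it");
        System.out.println(matchesAllTokens(doc3, synonym)); // true  (the reported bug)
        System.out.println(matchesPhrase(doc3, synonym));    // false (the expected result)
    }
}
```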
[jira] [Updated] (OAK-4810) FileDataStore: support SHA-2
[ https://issues.apache.org/jira/browse/OAK-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-4810: Component/s: blob > FileDataStore: support SHA-2 > > > Key: OAK-4810 > URL: https://issues.apache.org/jira/browse/OAK-4810 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: blob >Reporter: Thomas Mueller > > The FileDataStore currently uses SHA-1, but that algorithm is deprecated. We > should support other algorithms as well (mainly SHA-256). > Migration should be painless (no long downtime). I think the default for writing > (if not configured explicitly) could still be SHA-1. But when reading, > SHA-256 should also be supported (depending on the identifier). That way, the > new Oak version for all repositories (in a cluster + shared datastore) can be > installed "slowly". > After all repositories are running with the new Oak version, the > configuration for SHA-256 can be enabled. That way, SHA-256 is used for new > binaries, and both SHA-1 and SHA-256 are supported for reading. > One potential downside is that deduplication would suffer a bit if a new Blob with > the same content is added again, as the digest-based match would fail. That can be > mitigated by computing two types of digests if the need arises. The downsides are > some additional file operations and CPU, and slower migration to SHA-256. > Some other open questions: > * While we are at it, it might make sense to additionally support SHA-3 and > other algorithms (make it configurable). But the length of the identifier > alone might then not be enough information to know what algorithm is used, so > maybe add a prefix. > * The number of subdirectory levels: should we keep it as is, or should we > reduce it (for example, one level fewer). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-4412) Lucene hybrid index
[ https://issues.apache.org/jira/browse/OAK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra resolved OAK-4412. -- Resolution: Fixed Fix Version/s: 1.5.11 Most of the required work is done now. Some pending work is left, for which tasks have been opened (see linked issues). Resolving the issue as completed; specific issues can be created going forward. > Lucene hybrid index > --- > > Key: OAK-4412 > URL: https://issues.apache.org/jira/browse/OAK-4412 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: lucene >Reporter: Tomek Rękawek >Assignee: Chetan Mehrotra > Fix For: 1.6, 1.5.11 > > Attachments: OAK-4412-v1.diff, OAK-4412.patch, hybrid-benchmark.sh, > hybrid-result-v1.txt > > > When running Oak in a cluster, each write operation is expensive. After > performing some stress-tests with a geo-distributed Mongo cluster, we've > found out that updating property indexes is a large part of the overall > traffic. > The asynchronous index would be an answer here (as the index update won't be > made in the client request thread), but AEM requires the updates to be > visible immediately in order to work properly. > The idea here is to enhance the existing asynchronous Lucene index with a > synchronous, locally-stored counterpart that will persist only the data since > the last Lucene background reindexing job. > The new index can be stored in memory or (if necessary) in MMAPed local > files. Once the "main" Lucene index is updated, the local index will be > purged. > Queries will use a union of results from the {{lucene}} and > {{lucene-memory}} indexes. > The {{lucene-memory}} index, as a locally stored entity, will be updated using > an observer, so it'll get both local and remote changes. > The original idea was suggested by [~chetanm] in the discussion for > OAK-4233. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
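The hybrid scheme described above (query the union of the async Lucene index and a small synchronous local index, purging the local index when the background job catches up) can be sketched with plain collections. This is a conceptual sketch under those assumptions, with hypothetical names, not the actual OAK-4412 implementation:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

// Conceptual sketch (hypothetical names, not the actual OAK-4412 code):
// query results are the union of the async "main" index and a synchronous
// local index holding changes made since the last background indexing cycle.
public class HybridIndex {

    private final Set<String> mainIndex = new TreeSet<>();   // async, persisted
    private final Set<String> localIndex = new TreeSet<>();  // sync, local/in-memory

    void addSynchronously(String path) {
        localIndex.add(path);          // visible immediately, as AEM requires
    }

    void backgroundReindex() {
        mainIndex.addAll(localIndex);  // the async job catches up...
        localIndex.clear();            // ...then the local index is purged
    }

    Set<String> query() {
        Set<String> result = new LinkedHashSet<>(mainIndex);
        result.addAll(localIndex);     // union of both indexes
        return result;
    }

    public static void main(String[] args) {
        HybridIndex idx = new HybridIndex();
        idx.addSynchronously("/content/a");
        System.out.println(idx.query()); // [/content/a] before the async job ran
        idx.backgroundReindex();
        idx.addSynchronously("/content/b");
        System.out.println(idx.query()); // [/content/a, /content/b]
    }
}
```

The interesting design point is that the local index stays small and purgeable, so a cheap in-memory structure suffices; in the real implementation the updates also arrive via an observer rather than direct calls.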
[jira] [Created] (OAK-4810) FileDataStore: support SHA-2
Thomas Mueller created OAK-4810: --- Summary: FileDataStore: support SHA-2 Key: OAK-4810 URL: https://issues.apache.org/jira/browse/OAK-4810 Project: Jackrabbit Oak Issue Type: New Feature Reporter: Thomas Mueller The FileDataStore currently uses SHA-1, but that algorithm is deprecated. We should support other algorithms as well (mainly SHA-256). Migration should be painless (no long downtime). I think the default for writing (if not configured explicitly) could still be SHA-1. But when reading, SHA-256 should also be supported (depending on the identifier). That way, the new Oak version for all repositories (in a cluster + shared datastore) can be installed "slowly". After all repositories are running with the new Oak version, the configuration for SHA-256 can be enabled. That way, SHA-256 is used for new binaries, and both SHA-1 and SHA-256 are supported for reading. One potential downside is that deduplication would suffer a bit if a new Blob with the same content is added again, as the digest-based match would fail. That can be mitigated by computing two types of digests if the need arises. The downsides are some additional file operations and CPU, and slower migration to SHA-256. Some other open questions: * While we are at it, it might make sense to additionally support SHA-3 and other algorithms (make it configurable). But the length of the identifier alone might then not be enough information to know what algorithm is used, so maybe add a prefix. * The number of subdirectory levels: should we keep it as is, or should we reduce it (for example, one level fewer). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4804) Synonym analyzer with multiple words in synonym definition can give more results than expected
[ https://issues.apache.org/jira/browse/OAK-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492646#comment-15492646 ] Marcel Reutegger commented on OAK-4804: --- Is there a way to define a phrase instead of individual words for the synonym? E.g: {{FTW, 'For the win'}}. > Synonym analyzer with multiple words in synonym definition can give more > results than expected > -- > > Key: OAK-4804 > URL: https://issues.apache.org/jira/browse/OAK-4804 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh >Priority: Minor > > Setting up synonyms such as {{"FTW, For the win"}} would also return > documents which contain all of {{"For", "the", "win"}}. > Test case: > {noformat} > @Test > public void fulltextSearchWithPhraseSynonymAnalyzer() throws Exception { > Tree idx = createFulltextIndex(root.getTree("/"), "test"); > TestUtil.useV2(idx); > Tree anl = > idx.addChild(LuceneIndexConstants.ANALYZERS).addChild(LuceneIndexConstants.ANL_DEFAULT); > > anl.addChild(LuceneIndexConstants.ANL_TOKENIZER).setProperty(LuceneIndexConstants.ANL_NAME, > "Standard"); > Tree synFilter = > anl.addChild(LuceneIndexConstants.ANL_FILTERS).addChild("Synonym"); > synFilter.setProperty("synonyms", "syn.txt"); > > synFilter.addChild("syn.txt").addChild(JCR_CONTENT).setProperty(JCR_DATA, > "FTW, For the win"); > Tree test = root.getTree("/").addChild("test"); > test.addChild("1").setProperty("foo", "FTW"); > test.addChild("2").setProperty("foo", "For the win"); > test.addChild("3").setProperty("foo", "For gods sake, this is not the > way to win it"); > root.commit(); > assertQuery("select * from [nt:base] where CONTAINS(*, 'FTW') AND > ISDESCENDANTNODE('/test')", > asList("/test/1", "/test/2"));//current (failing result is > ["/test/1", "/test/2", "/test/3"]) > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-4631) Simplify the format of segments and serialized records
[ https://issues.apache.org/jira/browse/OAK-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492599#comment-15492599 ] Alex Parvulescu commented on OAK-4631: -- I think an important aspect of the impact of this patch was not fully tested, namely _disk space utilization_. I'm running some upgrade tests using the latest trunk now and I have some interesting results to share (I'm using {{oak-run debug}} to collect data): - source 8.7GB, oak 1.0 {noformat} Total size: 7 GB in 54137 data segments 768 KB in 3 bulk segments 1 GB in maps (20650196 leaf and branch records) 113 MB in lists (3714097 list and bucket records) 3 GB in values (value and block records of 73489693 properties, 3432/378779/0/1214488 small/medium/long/external blobs, 51059734/3318006/159 small/medium/long strings) 120 MB in templates (16786491 template records) 1 GB in nodes (221232040 node records) {noformat} - upgraded instance pre OAK-4631 (based on rev 1757389) 11GB {noformat} Total size: 10 GB in 321341 data segments 768 KB in 3 bulk segments 2 GB in maps (46451304 leaf and branch records) 619 MB in lists (55468842 list and bucket records) 3 GB in values (value and block records of 70764647 properties, 3429/378684/0/1214419 small/medium/long/external blobs, 46258634/1862224/159 small/medium/long strings) 113 MB in templates (16772763 template records) 1 GB in nodes (251592041 node records) {noformat} - upgraded instance post OAK-4631 37GB {noformat} Total size: 36 GB in 150205 data segments 768 KB in 3 bulk segments 6 GB in maps (35228936 leaf and branch records) 3 GB in lists (55508867 list and bucket records) 4 GB in values (value and block records of 75853352 properties, 3742/380719/0/1216770 small/medium/long/external blobs, 76087785/4765208/159 small/medium/long strings) 712 MB in templates (33716018 template records) 13 GB in nodes (390207210 node records) {noformat} The size delta is pretty big: the upgraded repo jumps from {{11GB}} to {{37GB}}.
> Simplify the format of segments and serialized records > -- > > Key: OAK-4631 > URL: https://issues.apache.org/jira/browse/OAK-4631 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: Segment Tar 0.0.10 > > Attachments: OAK-4631-01.patch, OAK-4631-02.patch, OAK-4631-03.patch, > OAK-4631-04.patch > > > As discussed in [this thread|http://markmail.org/thread/3oxp6ydboyefr4bg], it > might be beneficial to simplify both the format of the segments and the way > record IDs are serialised. A new strategy needs to be investigated to reach > the right compromise between performance, disk space utilization and > simplicity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)