Re: Using oak run to compact older versions

2024-06-17 Thread Julian Sedding
Hi Roy

Looking at 
https://jackrabbit.apache.org/oak/docs/nodestore/segment/changes.html,
there don't seem to be any changes in the storage format of the TAR
segment store. Therefore, I would expect no issues. I may be missing
something, however.

Regards
Julian

On Mon, Jun 17, 2024 at 12:47 PM Roy Teeuwen  wrote:
>
> Hey,
>
> I am using an oak-core 1.22 based application (AEM 6.5) but I’d like to use 
> the features of the current oak-run jar to do parallel offline compaction. Is 
> this feasible? I tried it on a local instance and everything seems to start 
> after compaction and seems to work, but I’d like to see if there is a way to 
> be more certain that there weren’t any hidden bugs introduced.
>
> Seeing as the segment store hasn’t needed any upgrade in the last iterations 
> of AEM, I expect we should be fine?
>
> Groeten,
> Roy


Re: Tarball Compaction Performance Issue

2024-05-27 Thread Julian Sedding
Hi Andreas

Without looking into any more detail, I noticed that the flag name in
your invocation appears to be misspelled: "—trheads=16" should probably
be "—threads=16", unless it is misspelled in the implementation as well.

Please confirm that this is not the issue before digging any deeper. Thanks!

Regards
Julian

On Fri, May 24, 2024 at 8:21 PM Andreas Schaefer
 wrote:
>
> Hi
>
> I already posted a question on Jackrabbit Users but did not get a response so 
> far. That said we changed the approach and ran into new issues with the 
> TarBall Compaction.
>
> Our Segment Store on an AEM 6.5.6 (oak.core v. 1.22.4) is about 700GB and was 
> not compacted for many years.
>
> We tested that we can run the compaction with 1.62.0 without any side-effects 
> and so we started it this way with JDK 11:
>
> java \
>   -Dtar.memoryMapped=true \
>   -Doak.compaction.eagerFlush=true \
>   -Dlogback.configurationFile=logback-compaction.xml \
>   -jar oak-run-1.62.0.jar \
>   —compactor=parallel \
>   —trheads=16 \
>   
>
> This started pretty well until about 15% compacted and then came to a crawl 
> where only one process is actually running.
>
> Thread Dump:
>
> "pool-2-thread-2" #23 prio=5 os_prio=0 cpu=1837679.95ms elapsed=61999.36s 
> tid=0x7ecf0d913800 nid=0x1835e6 waiting on condition  [0x7ecec0afd000]
>java.lang.Thread.State: WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11.0.22/Native Method)
> - parking to wait for  <0x000451000178> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at 
> java.util.concurrent.locks.LockSupport.park(java.base@11.0.22/LockSupport.java:194)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.22/AbstractQueuedSynchronizer.java:2081)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.22/LinkedBlockingQueue.java:433)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.22/ThreadPoolExecutor.java:1054)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.22/ThreadPoolExecutor.java:1114)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.22/ThreadPoolExecutor.java:628)
> at java.lang.Thread.run(java.base@11.0.22/Thread.java:834)
>
> "pool-2-thread-3" #24 prio=5 os_prio=0 cpu=34785318.09ms elapsed=61999.29s 
> tid=0x7ecf0d914800 nid=0x1835e8 runnable  [0x7ece7bffd000]
>java.lang.Thread.State: RUNNABLE
> at java.lang.ThreadLocal.get(java.base@11.0.22/ThreadLocal.java:163)
> at 
> java.lang.StringCoding.decodeUTF8(java.base@11.0.22/StringCoding.java:723)
> at 
> java.lang.StringCoding.decode(java.base@11.0.22/StringCoding.java:257)
> at java.lang.String.<init>(java.base@11.0.22/String.java:507)
> at java.lang.String.<init>(java.base@11.0.22/String.java:561)
> at 
> org.apache.jackrabbit.oak.segment.data.SegmentDataV12.getSignature(SegmentDataV12.java:88)
> at org.apache.jackrabbit.oak.segment.Segment.<init>(Segment.java:201)
> at 
> org.apache.jackrabbit.oak.segment.file.AbstractFileStore.readSegmentUncached(AbstractFileStore.java:300)
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.lambda$readSegment$10(FileStore.java:512)
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$95/0x0008001ff840.call(Unknown
>  Source)
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.lambda$getSegment$0(SegmentCache.java:163)
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache$$Lambda$96/0x0008001ffc40.call(Unknown
>  Source)
> at 
> org.apache.jackrabbit.guava.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4938)
> at 
> org.apache.jackrabbit.guava.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3576)
> at 
> org.apache.jackrabbit.guava.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2318)
> at 
> org.apache.jackrabbit.guava.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2191)
> - locked <0x0006f028d408> (a 
> org.apache.jackrabbit.guava.common.cache.LocalCache$StrongAccessEntry)
> at 
> org.apache.jackrabbit.guava.common.cache.LocalCache$Segment.get(LocalCache.java:2081)
> at 
> org.apache.jackrabbit.guava.common.cache.LocalCache.get(LocalCache.java:4019)
> at 
> org.apache.jackrabbit.guava.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4933)
> at 
> org.apache.jackrabbit.oak.segment.SegmentCache$NonEmptyCache.getSegment(SegmentCache.java:160)
> at 
> org.apache.jackrabbit.oak.segment.file.FileStore.readSegment(FileStore.java:512)
> at 
> org.apache.jackrabbit.oak.segment.SegmentId.getSegment(SegmentId.java:153)
> - locked <0x0006f028d218> (a 
> org.apache.jackrabbit.oak.segment.SegmentId)
> at 
> org.apache.jackrabbit.oak.segment.CachingSeg

Please review fix for GC on SegmentStore setup with SplitPersistence

2022-10-21 Thread Julian Sedding
Hello

I would appreciate a review of PR #665 [0], which fixes OAK-9897 [1].

When running GC on a SegmentStore setup with SplitPersistence, it
happens regularly that the tar archives of the "read-only" part of the
persistence are identified for removal during the "cleanup" phase.
However, these can never be deleted (they are read-only), which causes
the FileReaper thread to retry the deletion over and over again. I
noticed the issue while writing tests, but I am sure it also happens
in production systems. The impact, as far as I can see, is limited to
some warning logs and excess resource usage.

To address the issue I introduced an API change. Namely I added the
method "SegmentArchiveManager#isReadOnly(String archiveName)" with a
default implementation returning "false". This allows for exclusion of
read-only archives from the cleanup process (both the "mark" and the
"sweep" phases).
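The exclusion idea can be sketched with a small standalone model. This is a simplified, hypothetical sketch (the class and archive names here are made up and not Oak's real implementation; only the `isReadOnly(String)` signature with a default of `false` comes from the proposal): the cleanup phase filters out read-only archives before deletion, so nothing unremovable is ever handed to the reaper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class CleanupSketch {

    // Simplified stand-in for SegmentArchiveManager; only the parts needed
    // to illustrate the proposed isReadOnly() default method.
    interface ArchiveManager {
        List<String> listArchives();

        // Default of false keeps existing implementations source-compatible.
        default boolean isReadOnly(String archiveName) {
            return false;
        }
    }

    // A manager backed by a split persistence: some archives are read-only.
    static class SplitArchiveManager implements ArchiveManager {
        private final List<String> archives;
        private final Set<String> readOnly;

        SplitArchiveManager(List<String> archives, Set<String> readOnly) {
            this.archives = new ArrayList<>(archives);
            this.readOnly = readOnly;
        }

        @Override
        public List<String> listArchives() {
            return new ArrayList<>(archives);
        }

        @Override
        public boolean isReadOnly(String archiveName) {
            return readOnly.contains(archiveName);
        }

        void delete(String archiveName) {
            archives.remove(archiveName);
        }
    }

    // Cleanup only considers archives that can actually be deleted, so the
    // reaper never queues read-only archives for futile retries.
    static List<String> sweepCandidates(ArchiveManager manager) {
        return manager.listArchives().stream()
                .filter(name -> !manager.isReadOnly(name))
                .collect(Collectors.toList());
    }

    static List<String> demo() {
        SplitArchiveManager manager = new SplitArchiveManager(
                List.of("data00000a.tar", "data00001a.tar", "data00002a.tar"),
                Set.of("data00000a.tar")); // the read-only part of the split
        sweepCandidates(manager).forEach(manager::delete);
        return manager.listArchives();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [data00000a.tar]
    }
}
```

The default method means third-party `SegmentArchiveManager` implementations keep compiling unchanged and simply opt in when they actually expose read-only archives.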

Thank you for your comments.

Regards
Julian

[0] https://github.com/apache/jackrabbit-oak/pull/665
[1] https://issues.apache.org/jira/browse/OAK-9897


Intent to backport OAK-9785

2022-10-05 Thread Julian Sedding
Hello

I intend to backport "OAK-9785 - Tar SegmentStore can be corrupted
during compaction" to the 1.22 branch. The fix hardens TAR compaction
by aborting it cleanly not only when an IOException is caught, but
also when any other Throwable is caught.

Let me know if you have any concerns.

Regards
Julian

[0] https://issues.apache.org/jira/browse/OAK-9785


OAK-9896 - Running unit-tests in IntelliJ does not work

2022-08-22 Thread Julian Sedding
Hi

I am having issues when running unit-tests within IntelliJ IDEA. I can
work around the issue, but it's a bit cumbersome. Therefore I would
like to apply the change proposed in PR #664 [0], which addresses
OAK-9896 [1].

Do others experience the same problem running unit-tests in IntelliJ?
Does anyone object to the proposed change?

Regards
Julian

[0] https://github.com/apache/jackrabbit-oak/pull/664
[1] https://issues.apache.org/jira/browse/OAK-9896


Plan to merge changes for OAK-9888 - Support more flexible SplitPersistence setups via OSGi

2022-08-22 Thread Julian Sedding
Hi

I am planning to merge PR #663 [0] on Wednesday. The changes address
OAK-9888 [1] and are intended to allow more flexibility when creating
a SplitPersistence. They affect the segment-tar and the segment-azure
modules.

Please let me know if you see any issues with this change.

Regards
Julian

[0] https://github.com/apache/jackrabbit-oak/pull/663
[1] https://issues.apache.org/jira/browse/OAK-9888


Re: Outdated 1.22 branch

2021-07-14 Thread Julian Sedding
Hi Andrei

I assume you would need to adjust the SCM Info in the pom of the
"1.22" branch in Git (compare with the pom in "trunk" branch). The SVN
branches are dead after migration to Git AFAIK.

I don't know if other adjustments are required. Konrad would likely know.

Regards
Julian

On Wed, Jul 14, 2021 at 2:44 PM Andrei Dulceanu  wrote:
>
> Hi all,
>
> I have a quick question for you: are all Oak svn branches still updated
> after migration to Git?
>
> I’m trying to cut 1.22.8 release and obviously it fails if I try it under
> git checkout, so I moved in the svn checkout, but then this seems outdated:
>
> jackrabbit-oak-1-22 dulceanu$ svn log -l 1
> 
> r1888717 | mreutegg | 2021-04-13 12:58:52 +0300 (Tue, 13 Apr 2021) | 3 lines
>
> OAK-9393: Release Oak 1.22.7
> Fix version of oak-doc and oak-doc-railroad-macro after release
>
>
> I quickly checked what’s the status for trunk and this one is up-to-date:
>
> jackrabbit-oak dulceanu$ svn log -l 1
> 
> r1890974 | miroslav | 2021-06-22 18:33:17 +0300 (Tue, 22 Jun 2021) | 1 line
>
> OAK-9469 in case of the timeout in AzureRepositoryLock, retry renewing the lease
> 
>
> How can I progress with cutting 1.22.8 release? Any ideas?
>
> Regards,
> Andrei


Re: [Proposal] Feature toggles

2020-07-09 Thread Julian Sedding
Hi Marcel

On Tue, Jul 7, 2020 at 12:02 PM Marcel Reutegger
 wrote:
>
> Hi,
>
> Thanks for the feedback Julian.
>
> On 07.07.20, 10:45, "Julian Sedding"  wrote:
> > I'm not sure about the aspect of the implementation, that FeatureToggle
> > is Closeable and probably often short-lived. Given that the
> > FeatureToggleAdapter is registered with the whiteboard, and thus likely
> > with the OSGi service registry, this _may_ put unnecessary load on the
> > service registry.
>
> If used as a short-lived object, that is indeed a problem. My intention
> with the FeatureToggle is actually that it is long-lived, though it can
> obviously also be used differently. The try-with-resource block in the
> tests is just convenient.

It seems I misinterpreted the use of try-with-resource to indicate
short-lived toggles. I don't think it's possible to enforce long-lived
toggles, but it can certainly be encouraged in documentation. If it
turns out that we get problems with short-lived toggles, they can
still be solved later. I think your API would allow such changes in
the future.

>
> > And lastly, even if a FeatureToggleAdapter is already registered for a
> > feature, a new service would be registered if the same code was run in a
> > second thread.
>
> This is by design. It is valid to have multiple feature toggles registered
> with the same name. It's not the primary use case, but they can be used
> that way.

Ack. I assume they would get the same enabled/disabled state.

>
> > From an OSGi perspective, I would lean towards a long-lived singleton
> > service that can be toggled. The FeatureToggle could then be adjusted to
> > retrieve the matching service if available, or otherwise register its
> > own.
>
> I'm not sure I understand. Can you elaborate what you have in mind?

I meant that the implementation of Feature.newFeatureToggle() (maybe
rename to newFeature after the class name changes?) could be adjusted
from "always registering a FeatureToggle" to "returning an existing
FeatureToggle service with the same name and register a new one only
if none is available". Not sure this would work after you stated above
"it is valid to have multiple feature toggles registered with the same
name", even though I don't understand the benefit of registering
multiple toggles.
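That "return an existing toggle with the same name, register a new one only if none is available" idea can be sketched in isolation. This is a minimal standalone sketch (the class name and `newFeature` helper are hypothetical, not Oak's actual API, and a plain map stands in for the whiteboard/service registry):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class FeatureRegistry {

    // One long-lived toggle per name; concurrent callers share the same
    // instance instead of registering duplicate services.
    private static final Map<String, AtomicBoolean> TOGGLES = new ConcurrentHashMap<>();

    static AtomicBoolean newFeature(String name) {
        // computeIfAbsent is atomic: two threads racing on the same name
        // still end up with a single shared toggle instance.
        return TOGGLES.computeIfAbsent(name, n -> new AtomicBoolean(false));
    }

    public static void main(String[] args) {
        AtomicBoolean first = newFeature("FT_EXAMPLE");
        AtomicBoolean second = newFeature("FT_EXAMPLE"); // same instance returned
        first.set(true);
        System.out.println(first == second); // true
        System.out.println(second.get());    // true: shared enabled/disabled state
    }
}
```

In this sketch the two call sites necessarily share one enabled/disabled state, which is exactly the semantics I would expect (and assume above) for multiple registrations under the same name.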

>
> > Regarding the API, I would probably rename FeatureToggle to Feature and
> > FeatureToggleAdapter to FeatureToggle. But that's of course a matter of
> > taste.
>
> Thanks for the suggestion. I like it.

:)

>
> > Also, I would add an "isEnabled" method to FeatureToggleAdapter, in
> > order to allow the code setting the toggle to introspect the current
> > state.
>
> I considered this as well, but did not see a use case for it. What would
> you do with this method?

I don't have a use case, but could imagine that introspection of the
state could be useful for reporting (e.g. a web-console report of all
active toggles and their state). I understand the desire to keep an
API minimal, but on the other hand I find it frustrating when an API
doesn't offer seemingly obvious features (obvious in my mind anyways).

>
> Regards
>  Marcel
>

Regards
Julian


Re: [Proposal] Feature toggles

2020-07-07 Thread Julian Sedding
Hi Marcel,

I think the API is elegant. Short of running "feature" code in a
closure, a "try with resource" block encourages developers to clearly
delimit the block of code that is subject to the feature toggle,
hopefully resulting in readable code.

I'm not sure about the aspect of the implementation, that
FeatureToggle is Closeable and probably often short-lived. Given that
the FeatureToggleAdapter is registered with the whiteboard, and thus
likely with the OSGi service registry, this _may_ put unnecessary load
on the service registry. Furthermore, enabling/disabling the toggle
would need to be done in a way that respects this dynamism. And
lastly, even if a FeatureToggleAdapter is already registered for a
feature, a new service would be registered if the same code was run in
a second thread.

From an OSGi perspective, I would lean towards a long-lived singleton
service that can be toggled. The FeatureToggle could then be adjusted
to retrieve the matching service if available, or otherwise register
its own.

Regarding the API, I would probably rename FeatureToggle to Feature
and FeatureToggleAdapter to FeatureToggle. But that's of course a
matter of taste. Also, I would add an "isEnabled" method to
FeatureToggleAdapter, in order to allow the code setting the toggle to
introspect the current state.

Regards
Julian


On Mon, Jul 6, 2020 at 7:10 PM Marcel Reutegger
 wrote:
>
> Hi,
>
> There is a proposal ready in OAK-9132 [0] that introduces the concept of
> feature toggles [1]. A FeatureToggle is basically a boolean value that
> controls whether some new feature is available. The implementation uses
> the Oak Whiteboard to register a feature toggle. It is then up to
> another bundle to control the state of the feature toggles at
> initialization and/or runtime.
>
> A very simple implementation that wires feature toggles to system
> properties is presented in OAK-9132. More sophisticated implementations
> that talk to a central feature toggle service are also easy to implement
> with an OSGi component that keeps track of registered feature toggles.
>
> Feedback welcome.
>
> Regards
>  Marcel
>
> [0] https://issues.apache.org/jira/browse/OAK-9132
> [1] https://martinfowler.com/articles/feature-toggles.html
>


Re: Query ordered by node name

2020-05-19 Thread Julian Sedding
Or alternatively try [function = "fn:name()"], i.e. with the brackets "()".

Regards
Julian

On Tue, May 19, 2020 at 10:57 AM Julian Sedding  wrote:
>
> Hi Jorge
>
> You could try the Oak Index Definition Generator.
>
> http://oakutils.appspot.com/generate/index
>
> FWIW, in the "name" property node it sets [name = ":name"] instead of
> [function = "fn:name"]. I don't know if that makes a difference and
> which is better, if any.
>
> Regards
> Julian
>
> On Mon, May 18, 2020 at 11:55 PM jorgeeflorez .
>  wrote:
> >
> > Hello,
> > with the following query  I am able to get file nodes ordered by name:
> >
> > SELECT * FROM [nt:file] AS s WHERE ISCHILDNODE(s, [/repo1/pruebaJF1]) ORDER
> > BY NAME([s]) DESC
> >
> > unfortunately, because I do not have an index, on a big repository I have
> > warnings like:
> >
> > WARN org.apache.jackrabbit.oak.plugins.index.Cursors$TraversingCursor  -
> > Traversed 81000 nodes with filter Filter(query=SELECT * FROM [nt:file] AS s
> > WHERE ISCHILDNODE(s, [/repo1/pruebaJF1]) ORDER BY NAME([s]) DESC,
> > path=/repo1/pruebaJF1/*); consider creating an index or changing the query
> >
> > and the query takes a lot of time.
> >
> > I do not know how to define a proper index for name(). if I use the
> > following:
> >   - compatVersion = 2
> >   - async = "async"
> >   - jcr:primaryType = oak:QueryIndexDefinition
> >   - evaluatePathRestrictions = true
> >   - type = "lucene"
> >   + indexRules
> >+ nt:file
> > + properties
> >  + primaryType
> >   - name = "jcr:primaryType"
> >   - propertyIndex = true
> >  + name
> >   - function = "fn:name"
> >   - ordered = true
> >   - type = "String"
> >
> > the index is used (index cost is 501 compared to 80946 for traverse), but
> > it takes more time than traversing with warnings like:
> >
> > WARN
> > org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor
> >  - Index-Traversed 8 nodes with filter Filter(query=SELECT * FROM
> > [nt:file] AS s WHERE ISCHILDNODE(s, [/repo1/pruebaJF1]) ORDER BY NAME([s])
> > DESC, path=/repo1/pruebaJF1/*)
> >
> > Thanks in advance.
> >
> > Regards.
> >
> > Jorge




Re: Versionable node deletion

2020-02-25 Thread Julian Sedding
Hi Jorge

If you're looking at reclaiming disk space from "orphaned" binaries,
you likely need Blob Garbage Collection:
https://jackrabbit.apache.org/oak/docs/plugins/blobstore.html#Blob_Garbage_Collection

Regards
Julian

On Mon, Feb 24, 2020 at 3:58 PM jorgeeflorez .
 wrote:
>
> Hi Marco,
> I agree, it is related to OAK-8048.
>
> > But since it
> > isn't, there is still one node that references the binary, so (the binary)
> > is not removed when running the garbage collector.
> >
> > I am not sure about this. I just printed the rootVersion node and it has
> nothing related to the node that was deleted, this is an example:
>
> "node": "jcr:rootVersion",
> "path":
> "/jcr:system/jcr:versionStorage/03/06/92/03069247-5a8e-4957-89d6-3ccaf32edad3/jcr:rootVersion",
> "mixins": [],
> "children": [{
>  "node": "jcr:frozenNode",
>  "path":
> "/jcr:system/jcr:versionStorage/03/06/92/03069247-5a8e-4957-89d6-3ccaf32edad3/jcr:rootVersion/jcr:frozenNode",
>  "mixins": [],
>  "children": [],
>  "properties": [
>  "jcr:frozenPrimaryType = nt:file",
>  "jcr:frozenUuid = 03069247-5a8e-4957-89d6-3ccaf32edad3",
>  "jcr:primaryType = nt:frozenNode",
>  "jcr:uuid = 3a63f325-2e8b-415e-8aa1-6112d4a9049a",
>  "jcr:frozenMixinTypes =
> mix:lastModified,mix:referenceable,rep:AccessControllable,mix:versionable"
> ]
> }],
> "properties": [
>  "jcr:predecessors = ",
>  "jcr:created = 2020-02-21T17:42:44.771-05:00",
>  "jcr:primaryType = nt:version",
>  "jcr:uuid = a3eae304-16f2-438d-a482-e6dbf5b3d198",
>  "jcr:successors = "
> ]
>
> Thinking about what I want, maybe it is not that easy to mark a binary as
> "orphan" (i.e. no node is referencing it) at runtime. But it would be great
> if some method could be called that finds all orphan binaries and deletes
> them, to save space. I do not know if something like that exists.
>
> Jorge
>
>
> El lun., 24 feb. 2020 a las 9:17, Marco Piovesana ()
> escribió:
>
> > Hi Jorge,
> > I'm not an expert, but I think it might be related to OAK-804
> > . The root version should
> > be automatically removed when removing the last version. But since it
> > isn't, there is still one node that references the binary, so (the binary)
> > is not removed when running the garbage collector.
> >
> > Marco.
> >
> > On Mon, Feb 24, 2020 at 9:42 PM jorgeeflorez . <
> > jorgeeduardoflo...@gmail.com>
> > wrote:
> >
> > > Hi,
> > > I managed to delete all versions for nodes that no longer exist (except
> > the
> > > jcr:rootVersion nodes, they are "protected"). I was expecting that the
> > > total size of my binary storage would decrease (I am using
> > > OakFileDataStore), since some files are no longer referenced in any
> > nodes.
> > > But that did not happen...
> > >
> > > Any help is appreciated.
> > >
> > > Jorge
> > >
> > > El vie., 21 feb. 2020 a las 15:12, jorgeeflorez . (<
> > > jorgeeduardoflo...@gmail.com>) escribió:
> > >
> > > > Hi,
> > > > when I delete a node that has version history, using node.remove() and
> > > > then session.save(), should all version info related to that node be
> > > > deleted automatically? what about the files in that version history?
> > > >
> > > > After deleting, I print all nodes of the repository and I keep seeing
> > > > those version nodes. Actually, I was working with a repository uses a
> > > > DataStoreBlobStore and after deleting some file nodes I was expecting
> > > that
> > > > the total size of the folder that contains the files would decrease and
> > > it
> > > > did not happen, which led me to make this question :)
> > > >
> > > > Thanks.
> > > >
> > > > Jorge
> > > >
> > >
> >


Re: New Jackrabbit Committer: Konrad Windszus

2019-07-26 Thread Julian Sedding
Welcome Konrad!

Regards
Julian

On Thu, Jul 25, 2019 at 4:02 PM Woonsan Ko  wrote:
>
> Welcome, Konrad!
>
> Cheers,
>
> Woonsan
>
> On Wed, Jul 24, 2019 at 10:11 AM Konrad Windszus  wrote:
> >
> > Hi everyone,
> > thanks a lot for having invited me.
> > Some words about myself: I have experience with AEM/CQ since 2005. I am now 
> > working for Netcentric. I joined the Apache family in 2014 by becoming an 
> > Apache Sling committer. Meanwhile I am part of the Apache Sling PMC.
> >
> > I am looking forward to contribute even more in the future to 
> > Jackrabbit/Oak.
> > Particularly I am interested in improving Filevault and the related Maven 
> > Plugin.
> >
> > Konrad
> >
> >
> > > On 24. Jul 2019, at 15:37, Marcel Reutegger  wrote:
> > >
> > > Hi,
> > >
> > > Please welcome Konrad Windszus as a new committer and PMC member of
> > > the Apache Jackrabbit project. The Jackrabbit PMC recently decided to
> > > offer Konrad committership based on his contributions. I'm happy to
> > > announce that he accepted the offer and that all the related
> > > administrative work has now been taken care of.
> > >
> > > Welcome to the team, Konrad!
> > >
> > > Regards
> > > Marcel
> > >
> >


Re: Setting existing property from single value to multi-value

2019-07-24 Thread Julian Sedding
Thanks Julian for looking into it!

Deleting the property first is indeed my workaround for the issue, and it works.

It's not a big deal (clearly, as it didn't pop up for 6 years or so),
but the behaviour was unexpected and seems unnecessarily restrictive
to me. It caused a minor production issue for a client in a very
generic code-path that hits Oak via Sling's ModifiableValueMap. Given
all layers involved I was surprised that I ended up in Oak ;)

If my question helps make Oak a little bit better, that's great. If we
can clarify the question and document it in the list's archive that's
also great.

Regards
Julian

On Wed, Jul 24, 2019 at 6:52 AM Julian Reschke  wrote:
>
> On 24.07.2019 05:55, Julian Reschke wrote:
> > On 23.07.2019 23:57, Julian Sedding wrote:
> >> Hi all
> >>
> >> Let's assume we have a Node N of primary type "nt:unstructured" with
> >> property P that has a String value "foo".
> >>
> >> Now when we try to change the value of P to a String[] value of
> >> ["foo", "bar"] a ValueException is thrown.
> >>
> >> This behaviour was introduced in OAK-273. Unfortunately the ticket
> >> does not give any explanation why this behaviour should be desired.
> >> ...
> >
> > I was curious and looked, and, surprise, I raised this issue back then.
> >
> > I would assume that this came up while running the TCK. That is, if we
> > undo this change, we are likely to see TCK tests failing.
> >
> > (not sure, but worth trying)
> >
> > Now that doesn't necessarily mean that the TCK is correct - I'll need
> > more time to re-read things.
> >
> > Best regards, Julian
>
> FWIW, did you try to delete the property first?
>
> Best regards, Julian


Setting existing property from single value to multi-value

2019-07-23 Thread Julian Sedding
Hi all

Let's assume we have a Node N of primary type "nt:unstructured" with
property P that has a String value "foo".

Now when we try to change the value of P to a String[] value of
["foo", "bar"] a ValueException is thrown.

This behaviour was introduced in OAK-273. Unfortunately the ticket
does not give any explanation why this behaviour should be desired.

Reading the java-docs for Node#setProperty(String name, Value value)
and Node#setProperty(String name, Value[] values) I got the impression
that no ValueFormatException should be thrown in that case.

The following are paragraphs 4-6 from the java-doc, the ones I
consider relevant to this issue:

(4) "The property type of the property will be that specified by the
node type of this node. If the property type of one or more of the
supplied Value objects is different from that required, then a
best-effort conversion is attempted, according to an
implemention-dependent definition of "best effort". If the conversion
fails, a ValueFormatException is thrown."

(5) "If the property is not multi-valued then a ValueFormatException
is also thrown. If another error occurs, a RepositoryException is
thrown."

(6) "If the node type of this node does not indicate a specific
property type, then the property type of the supplied Value objects is
used and if the property already exists it assumes both the new values
and the new property type."

The way I read this, paragraph (5) applies to properties where the
property type is specified by the node type. The reason is that (a) it
follows directly after paragraph (4), which is about node type defined
properties and (b) the word "also" in the phrase "... a
ValueFormatException is _also_ thrown" seems to refer back to (4).

Therefore paragraph (6) would be the one relevant to properties that
have no node type defined property type. And that is very clear that
the property should be changed to the new values and property type.

Does anyone have a good explanation why my reading is incorrect? Or
should I create a JIRA ticket to fix this?
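For concreteness, the restriction and the delete-then-set workaround mentioned in the reply can be modelled in a standalone sketch. This is plain Java, not the JCR API; the class and method names are made up, and only the behaviour (rejecting an in-place single-to-multi change per OAK-273, while remove-then-set succeeds) mirrors what is described above:

```java
import java.util.HashMap;
import java.util.Map;

public class PropertySketch {

    static class ValueFormatException extends RuntimeException {
        ValueFormatException(String message) {
            super(message);
        }
    }

    private final Map<String, Object> properties = new HashMap<>();

    // Mirrors the restrictive behaviour introduced in OAK-273: an existing
    // property may not switch between single- and multi-valued in place.
    void setProperty(String name, Object value) {
        Object old = properties.get(name);
        if (old != null && old.getClass().isArray() != value.getClass().isArray()) {
            throw new ValueFormatException("cannot change single <-> multi-valued: " + name);
        }
        properties.put(name, value);
    }

    void removeProperty(String name) {
        properties.remove(name);
    }

    static int demo() {
        PropertySketch node = new PropertySketch();
        node.setProperty("P", "foo");
        try {
            node.setProperty("P", new String[] {"foo", "bar"});
            return -1; // unexpected: the direct change should be rejected
        } catch (ValueFormatException expected) {
            // rejected in place, as described above
        }
        // Delete-then-set workaround: remove the property, then set the array.
        node.removeProperty("P");
        node.setProperty("P", new String[] {"foo", "bar"});
        return ((String[]) node.properties.get("P")).length;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2
    }
}
```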

Regards
Julian


Re: svn commit: r1834852 - /jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/

2018-07-02 Thread Julian Sedding
Hi Francesco

Have you considered using the MetaTypeReader from Felix' MetyType
implementation[0]? I've used it before and found it easy enough to
use.

It's only a test dependency, and you don't need to worry about your
implementation being in sync with the spec/Felix' implementation.

Regards
Julian

[0] 
https://github.com/apache/felix/blob/trunk/metatype/src/main/java/org/apache/felix/metatype/MetaDataReader.java

On Mon, Jul 2, 2018 at 4:57 PM,   wrote:
> Author: frm
> Date: Mon Jul  2 14:57:25 2018
> New Revision: 1834852
>
> URL: http://svn.apache.org/viewvc?rev=1834852&view=rev
> Log:
> OAK-6770 - Test the metatype information descriptors
>
> Added:
> 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/MetatypeInformation.java
>(with props)
> Modified:
> 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/SegmentNodeStoreFactoryTest.java
> 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/SegmentNodeStoreMonitorServiceTest.java
> 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/SegmentNodeStoreServiceTest.java
> 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/StandbyStoreServiceTest.java
>
> Added: 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/MetatypeInformation.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/MetatypeInformation.java?rev=1834852&view=auto
> ==
> --- 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/MetatypeInformation.java
>  (added)
> +++ 
> jackrabbit/oak/trunk/oak-segment-tar/src/test/java/org/apache/jackrabbit/oak/segment/osgi/MetatypeInformation.java
>  Mon Jul  2 14:57:25 2018
> @@ -0,0 +1,267 @@
> +/*
> + * Licensed to the Apache Software Foundation (ASF) under one
> + * or more contributor license agreements.  See the NOTICE file
> + * distributed with this work for additional information
> + * regarding copyright ownership.  The ASF licenses this file
> + * to you under the Apache License, Version 2.0 (the
> + * "License"); you may not use this file except in compliance
> + * with the License.  You may obtain a copy of the License at
> + *
> + *   http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing,
> + * software distributed under the License is distributed on an
> + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> + * KIND, either express or implied.  See the License for the
> + * specific language governing permissions and limitations
> + * under the License.
> + */
> +
> +package org.apache.jackrabbit.oak.segment.osgi;
> +
> +import java.io.InputStream;
> +import java.util.HashSet;
> +import java.util.Set;
> +
> +import javax.xml.parsers.DocumentBuilder;
> +import javax.xml.parsers.DocumentBuilderFactory;
> +
> +import org.w3c.dom.Document;
> +import org.w3c.dom.Element;
> +import org.w3c.dom.NodeList;
> +
> +class MetatypeInformation {
> +
> +static MetatypeInformation open(InputStream stream) throws Exception {
> +DocumentBuilderFactory factory = 
> DocumentBuilderFactory.newInstance();
> +DocumentBuilder builder = factory.newDocumentBuilder();
> +Document document = builder.parse(stream);
> +return new MetatypeInformation(document.getDocumentElement());
> +}
> +
> +private static boolean hasAttribute(Element element, String name, String 
> value) {
> +return element.hasAttribute(name) && 
> element.getAttribute(name).equals(value);
> +}
> +
> +private final Element root;
> +
> +private MetatypeInformation(Element root) {
> +this.root = root;
> +}
> +
> +ObjectClassDefinition getObjectClassDefinition(String id) {
> +return new ObjectClassDefinition(id);
> +}
> +
> +class ObjectClassDefinition {
> +
> +private final String id;
> +
> +private ObjectClassDefinition(String id) {
> +this.id = id;
> +}
> +
> +HasAttributeDefinition hasAttributeDefinition(String id) {
> +return new HasAttributeDefinition(this.id, id);
> +}
> +
> +}
> +
> +class HasAttributeDefinition {
> +
> +private final String ocd;
> +
> +private final String id;
> +
> +private String type;
> +
> +private String defaultValue;
> +
> +private String cardinality;
> +
> +private String[] options;
> +
> +private HasAttributeDefinition(String ocd, String id) {
> +this.ocd = ocd;
> +this.id = id;
> +}
> +
> +HasAttributeDefinition withStringType() {
> +this.type = "String";
> +

Re: Looking for small task starting in OAK .. DS conversion?

2017-10-31 Thread Julian Sedding
Hi Christian

It's up to you. I have finished the implementation of the tool now. If
you like, you can build it and see if it helps.

Regards
Julian


On Tue, Oct 31, 2017 at 9:56 AM, Christian Schneider
 wrote:
> Hi Julian,
>
> I finished the conversion for the oak-auth-external module and created a
> PR. The tests all run fine.
> I will look into the comparison tool but I am not sure if it is needed. Of
> course it is possible that I introduce a bug with
> my PR but the comparison tool will also not guarantee that the conversion
> is bug free.
>
> Christian
>
> 2017-10-30 13:40 GMT+01:00 Julian Sedding :
>
>> Hi Christian
>>
>> I have worked on OAK-6741 before and there were some concerns
>> regarding my changes.
>>
>> To address these concerns, I started work on a tool that allows
>> diffing the OSGi DS and MetaType metadata of two bundles. It uses
>> Felix' SCR and MetaType implementations to parse the metadata and
>> should thus be able to compare on a semantic level rather than on a
>> purely syntactic level (i.e. diff all XML files, which comes with its
>> own challenges)[0].
>>
>> Note, that the tool is yet unfinished, as I don't currently have time
>> to complete it. Basically, what's left to do is implementing some
>> comparisons and possibly more rendering (see TODOs in
>> MetadataDiff[1]). Feel free to fork, or I'm also happy to grant you write
>> access on my repository.
>>
>> I hope you find this helpful!
>>
>> Regards
>> Julian
>>
>> [0] https://github.com/jsedding/osgi-ds-metatype-diff
>> [1] https://github.com/jsedding/osgi-ds-metatype-diff/blob/
>> master/src/main/java/net/distilledcode/tools/osgi/MetadataDiff.java
>>
>>
>> On Mon, Oct 30, 2017 at 10:28 AM, Alex Deparvu 
>> wrote:
>> > Hi Christian,
>> >
>> > Thanks for your interest in helping out in this area!
>> > You can look at OAK-6741 [0] to see what the status of this effort is,
>> > there's a few tasks created already waiting for some attention :)
>> >
>> > best,
>> > alex
>> >
>> > [0] https://issues.apache.org/jira/browse/OAK-6741
>> >
>> >
>> >
>> > On Mon, Oct 30, 2017 at 9:57 AM, Christian Schneider <
>> > ch...@die-schneider.net> wrote:
>> >
>> >> Hi all,
>> >>
>> >> as I am just starting to work on OAK I am looking for a small task.
>> >> I found that there are still some components that use the old felix scr
>> >> annotations.
>> >> Does it make sense that I look into converting these to the DS ones so
>> we
>> >> can remove support for felix scr in the build?
>> >>
>> >> I have listed the classes below.
>> >> The main issue I see with the migration is that OAK uses the meta type
>> >> support of felix scr which is quite different to what DS 1.3 provides.
>> So I
>> >> would need to migrate from the property based meta type descriptions to
>> the
>> >> type safe ones of the DS 1.3 metatype support.
>> >>
>> >> Anyway I would provide one module per PR so the reviewer does not have
>> to
>> >> review one big commit at once.
>> >>
>> >> Best
>> >> Christian
>> >>
>> >> --
>> >> --
>> >> Christian Schneider
>> >> http://www.liquid-reality.de
>> <https://owa.talend.com/owa/redir.aspx?C=3aa4083e0c744ae1ba52bd062c5a7e46&URL=http%3a%2f%2fwww.liquid-reality.de>
>> >>
>> >> Computer Scientist
>> >> http://www.adobe.com
>> >>
>> >>
>> >> ---
>> >>
>> >> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> >> authentication/external/impl/DefaultSyncConfigImpl.java:import
>> >> org.apache.felix.scr.annotations.Component;
>> >>
>> >> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> >> authentication/external/impl/DefaultSyncHandler.java:import
>> >> org.apache.felix.scr.annotations.Component;
>> >>
>> >> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> >> authentication/external/impl/ExternalIDPManagerImpl.java:import
>> >> org.apache.felix.scr.annotations.Component;
>> >>
>> >> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> >> authentication/ext

Re: Looking for small task starting in OAK .. DS conversion?

2017-10-30 Thread Julian Sedding
Hi Christian

I have worked on OAK-6741 before and there were some concerns
regarding my changes.

To address these concerns, I started work on a tool that allows
diffing the OSGi DS and MetaType metadata of two bundles. It uses
Felix' SCR and MetaType implementations to parse the metadata and
should thus be able to compare on a semantic level rather than on a
purely syntactic level (i.e. diff all XML files, which comes with its
own challenges)[0].

Note, that the tool is yet unfinished, as I don't currently have time
to complete it. Basically, what's left to do is implementing some
comparisons and possibly more rendering (see TODOs in
MetadataDiff[1]). Feel free to fork, or I'm also happy to grant you write
access on my repository.

I hope you find this helpful!

Regards
Julian

[0] https://github.com/jsedding/osgi-ds-metatype-diff
[1] 
https://github.com/jsedding/osgi-ds-metatype-diff/blob/master/src/main/java/net/distilledcode/tools/osgi/MetadataDiff.java


On Mon, Oct 30, 2017 at 10:28 AM, Alex Deparvu  wrote:
> Hi Christian,
>
> Thanks for your interest in helping out in this area!
> You can look at OAK-6741 [0] to see what the status of this effort is,
> there's a few tasks created already waiting for some attention :)
>
> best,
> alex
>
> [0] https://issues.apache.org/jira/browse/OAK-6741
>
>
>
> On Mon, Oct 30, 2017 at 9:57 AM, Christian Schneider <
> ch...@die-schneider.net> wrote:
>
>> Hi all,
>>
>> as I am just starting to work on OAK I am looking for a small task.
>> I found that there are still some components that use the old felix scr
>> annotations.
>> Does it make sense that I look into converting these to the DS ones so we
>> can remove support for felix scr in the build?
>>
>> I have listed the classes below.
>> The main issue I see with the migration is that OAK uses the meta type
>> support of felix scr which is quite different to what DS 1.3 provides. So I
>> would need to migrate from the property based meta type descriptions to the
>> type safe ones of the DS 1.3 metatype support.
>>
>> Anyway I would provide one module per PR so the reviewer does not have to
>> review one big commit at once.
>>
>> Best
>> Christian
>>
>> --
>> --
>> Christian Schneider
>> http://www.liquid-reality.de
>>
>> Computer Scientist
>> http://www.adobe.com
>>
>>
>> ---
>>
>> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> authentication/external/impl/DefaultSyncConfigImpl.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> authentication/external/impl/DefaultSyncHandler.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> authentication/external/impl/ExternalIDPManagerImpl.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> authentication/external/impl/ExternalLoginModuleFactory.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> authentication/external/impl/principal/ExternalPrincipalConfiguration
>> .java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/
>> authentication/external/impl/SyncManagerImpl.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-auth-ldap/src/main/java/org/apache/jackrabbit/oak/
>> security/authentication/ldap/impl/LdapIdentityProvider.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-auth-ldap/src/main/java/org/apache/jackrabbit/oak/
>> security/authentication/ldap/impl/LdapProviderConfig.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-authorization-cug/src/main/java/org/apache/
>> jackrabbit/oak/spi/security/authorization/cug/impl/
>> CugConfiguration.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-authorization-cug/src/main/java/org/apache/
>> jackrabbit/oak/spi/security/authorization/cug/impl/
>> CugExcludeImpl.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-blob/src/main/java/org/apache/jackrabbit/oak/spi/blob/osgi/
>> FileBlobStoreService.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-blob/src/main/java/org/apache/jackrabbit/oak/spi/blob/osgi/
>> SplitBlobStoreService.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/
>> AbstractS3DataStoreService.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/
>> S3DataStoreService.java:import
>> org.apache.felix.scr.annotations.Component;
>>
>> oak-blob-cloud/src/main/java/org/apache/ja

Re: clustering and cold standby

2017-10-16 Thread Julian Sedding
Hi Marco

Cold Standby is a TarMK feature that allows for a quick failover. You
may think of it as a near real-time backup. It is _not_ a cluster. As
you noted, the sync is one-way only.

Therefore, I don't think it is possible to direct reads or writes to
the cold standby instance. AFAIK this should be prevented by the
implementation. But others are more knowledgable about these details.

Regards
Julian


On Mon, Oct 16, 2017 at 11:28 AM, Marco Piovesana  wrote:
> Hi all,
> I'm trying to set-up a cluster environment with Oak, so that anytime I can
> take down for maintenance one machine without stopping the service. One
> option is of course to use Mongo or RDBM storages. Talking with the guys at
> adaptTo() this year I've been told that maybe there's another option:
> replicate the repository in each instance of the cluster and use the "*cold
> standby*" for the synchronization.
> The sync process, however, is one-way only. My question is: do you guys
> think is possible to use it in a cluster where read and write requests are
> coming from any of the instances of the cluster?
>
> Marco.


Re: OAK-6575 - Provide a secure external URL to a DataStore binary.

2017-08-24 Thread Julian Sedding
Hi

On Thu, Aug 24, 2017 at 9:27 AM, Ian Boston  wrote:
> On 24 August 2017 at 08:18, Michael Dürig  wrote:
>
>>
>>
>> URI uri = ((OakValueFactory) valueFactory).getSignedURI(binProp);
>>
>>
> +1
>
> One point
> Users in Sling dont know abou Oak, they know about JCR.

I think this issue should be solved in two steps:

1. Figure out how to surface a signed URL from the DataStore to the
level of the JCR (or Oak) API.
2. Provide OSGi glue inside Sling, possibly exposing the signed URL it
via adaptTo().

>
> URI uri = ((OakValueFactory)
> valueFactory).getSignedURI(jcrNode.getProperty("jcr:data"));
>
> No new APIs, let OakValueFactory work it out and return null if it cant do
> it. It should also handle a null parameter.
> (I assume OakValueFactory already exists)
>
> If you want to make it extensible
>
>  T convertTo(Object source, Class target);
>
> used as
>
> URI uri = ((OakValueFactory)
> valueFactory). convertTo(jcrNode.getProperty("jcr:data"), URI.class);

There is an upcoming OSGi Spec for a Converter service (RFC 215 Object
Conversion, also usable outside of OSGI)[0]. It has an implementation
in Felix, but afaik no releases so far.

A generic Converter would certainly help with decoupling. Basically
the S3-DataStore could register an appropriate conversion, hiding all
implementation details.

Regards
Julian

[0] 
https://github.com/osgi/design/blob/05cd5cf03d4b6f8a512886eae472a6b6fde594b0/rfcs/rfc0215/rfc-0215-object-conversion.pdf

>
> The user doesnt know or need to know the URI is signed, it needs a URI that
> can be resolved.
> Oak wants it to be signed.
>
> Best Regards
> Ian
>
>
>
>> Michael
>>
>>
>>
>>
>>> A rough sketch of any alternative proposal would be helpful to decide
>>> how to move forward
>>>
>>> Chetan Mehrotra
>>>
>>>


Dependency to DropWizard Metrics Library (was: Percentile implementation)

2017-07-17 Thread Julian Sedding
Hi all

OAK-6430[0] introduced a mandatory dependency to
io.dropwizard.metrics:metrics-core to oak-segment-tar.

Before this change, the runtime dependency to this metrics library was
optional (and it still is in oak-core).

Originally, the dependency was introduced in OAK-3654[1] and a facade
was implemented with the following justification: "To avoid having
dependency on Metrics API all over in Oak we can come up with minimal
interfaces which can be used in Oak and then provide an implementation
backed by Metric."

There was no discussion on the Oak list at the time. However, a
similar discussion happened on the Sling list[2]. Basically, bad past
experiences with breaking changes in the dropwizard metrics API led to
the implementation of a facade in order to limit the potential impact
of future breaking changes. Of course a facade decouples the code from
the dependency and thus allows plugging in a different implementation
should the need arise.

Therefore, I ask the dev team:
(1) Do we want a mandatory runtime dependency
io.dropwizard.metrics:metrics-core?
(2) Should we revisit OAK-6430 and implement the mechanism via the
facade? Probably extending the HistogramStats interface with a method
"#getPercentile(double)".

IMHO we should avoid the mandatory dependency.

Regards
Julian

[0] https://issues.apache.org/jira/browse/OAK-6430
[1] https://issues.apache.org/jira/browse/OAK-3654
[2] http://markmail.org/thread/47fd5psel2wv2y42
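A minimal sketch of proposal (2): the facade's HistogramStats interface gains a getPercentile(double) method, and a trivial in-memory implementation computes a nearest-rank percentile. The names below mirror the proposal but are otherwise illustrative; a Metrics-backed implementation would simply delegate to Histogram#getSnapshot().getValue(quantile), keeping the Dropwizard API out of Oak's own code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Facade sketch: Oak code programs against this interface, never against
// io.dropwizard.metrics directly.
interface HistogramStats {
    void update(long value);
    long getPercentile(double percentile); // e.g. 50.0 for the median
}

class SimpleHistogramStats implements HistogramStats {

    private final List<Long> samples = new ArrayList<>();

    public void update(long value) {
        samples.add(value);
    }

    public long getPercentile(double percentile) {
        // Nearest-rank percentile over all recorded samples; a Metrics-backed
        // variant would delegate to Histogram#getSnapshot().getValue(quantile).
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(percentile / 100.0 * sorted.size());
        return sorted.get(Math.max(rank - 1, 0));
    }
}

public class HistogramStatsDemo {
    public static void main(String[] args) {
        HistogramStats stats = new SimpleHistogramStats();
        for (long v = 1; v <= 100; v++) {
            stats.update(v);
        }
        System.out.println(stats.getPercentile(50.0)); // prints "50"
    }
}
```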



On Thu, Jul 6, 2017 at 2:54 PM, Andrei Dulceanu
 wrote:
>> The only problem that I see is the fact that it doesn't provide a way to
>> easily access a desired percentile (only mean and 75th, 95th, 98th, 99th
>> and 999th). Currently we are using 50th percentile, i.e. mean, but in the
>> future that might change.
>>
>
> Please read median instead of mean above. Implementing the change, I
> discovered Histogram#getSnapshot().getValue(double quantile) which is
> exactly what I was looking for.
>
>
>> I will try to make the adjustments and will revisit the percentile
>> implementation once we'll change our use pattern there.
>>
>
> This change is tracked in OAK-6430 [0] and fixed at r1801043.
>
> [0] https://issues.apache.org/jira/browse/OAK-6430
>
> 2017-07-06 14:55 GMT+03:00 Andrei Dulceanu :
>
>> Hi Chetan,
>>
>>
>>> Instead of commons-math can we use Metric Histogram  (which I also
>>> suggested earlier in the thread).
>>
>>
>> I took another look at the Metric Histogram and I think at the moment it
>> can be used instead of SynchronizedDescriptiveStatistics from
>> commons-math3. The only problem that I see is the fact that it doesn't
>> provide a way to easily access a desired percentile (only mean and 75th,
>> 95th, 98th, 99th and 999th). Currently we are using 50th percentile, i.e.
>> mean, but in the future that might change.
>>
>>
>>> This would avoid downstream Oak
>>> users to include another dependency as Oak is already using Metrics in
>>> other places.
>>>
>>
>> I will try to make the adjustments and will revisit the percentile
>> implementation once we'll change our use pattern there.
>>
>> Regards,
>> Andrei
>>
>> 2017-07-06 14:38 GMT+03:00 Chetan Mehrotra :
>>
>>> Instead of commons-math can we use Metric Histogram  (which I also
>>> suggested earlier in the thread). This would avoid downstream Oak
>>> users to include another dependency as Oak is already using Metrics in
>>> other places.
>>>
>>> Can we reconsider this decision?
>>> Chetan Mehrotra
>>>
>>>
>>> On Tue, Jul 4, 2017 at 4:45 PM, Julian Sedding 
>>> wrote:
>>> > Maybe it is not necessary to embed *all* of commons-math3. The bnd
>>> > tool (used by maven-bundle-plugin) can intelligently embed classes
>>> > from specified java packages, but only if they are referenced.
>>> > Depending on how well commons-math3 is modularized, that could allow
>>> > for much less embedded classes. Neil Bartlett wrote a good blog post
>>> > about this feature[0].
>>> >
>>> > Regards
>>> > Julian
>>> >
>>> > [0] http://njbartlett.name/2014/05/26/static-linking.html
>>> >
>>> >
>>> > On Tue, Jul 4, 2017 at 12:20 PM, Andrei Dulceanu
>>> >  wrote:
>>> >> I'll add the dependency.
>>> >>
>>> >> Thanks,
>>> >> Andrei
>>> >>
>>> >> 2017-07-04 13:10 GMT+03:00 Michael Dürig :
>>> >>
>>> >>>
>>> >>>

Re: Percentile implementation

2017-07-04 Thread Julian Sedding
Maybe it is not necessary to embed *all* of commons-math3. The bnd
tool (used by maven-bundle-plugin) can intelligently embed classes
from specified java packages, but only if they are referenced.
Depending on how well commons-math3 is modularized, that could allow
for much less embedded classes. Neil Bartlett wrote a good blog post
about this feature[0].

Regards
Julian

[0] http://njbartlett.name/2014/05/26/static-linking.html


On Tue, Jul 4, 2017 at 12:20 PM, Andrei Dulceanu
 wrote:
> I'll add the dependency.
>
> Thanks,
> Andrei
>
> 2017-07-04 13:10 GMT+03:00 Michael Dürig :
>
>>
>>
>> On 04.07.17 11:15, Francesco Mari wrote:
>>
>>> 2017-07-04 10:52 GMT+02:00 Andrei Dulceanu :
>>>
 Now my question is this: do we have a simple percentile implementation in
 Oak (I didn't find one)?

>>>
>>> I'm not aware of a percentile implementation in Oak.
>>>
>>> If not, would you recommend writing my own or adapting/extracting an
 existing one in a utility class?

>>>
>>> In the past we copied and pasted source code from other projects in
>>> Oak. As long as the license allows it and proper attribution is given,
>>> it shouldn't be a problem. That said, I'm not a big fan of either
>>> rewriting an implementation from scratch or copying and pasting source
>>> code from other projects. Is exposing a percentile really necessary?
>>> If yes, how big of a problem is embedding of commons-math3?
>>>
>>>
>> We should avoid copy paste as we might miss important fixes in later
>> releases. I only did this once for some code where we needed a fix that
>> wasn't yet released. It was a hassle.
>> I would just add a dependency to commons-math3. Its a library exposing the
>> functionality we require, so let's use it.
>>
>> Michael
>>


Re: [DiSCUSS] - highly vs rarely used data

2017-07-04 Thread Julian Sedding
From my experience working with customers, I can pretty much guarantee
that sooner or later:

(a) the implementation of an automatism is not *quite* what they need/want
(b) they want to be able to manually select (or more likely override)
whether a file can be archived

Thus I suggest to come up with a pluggable "strategy" interface and
provide a sensible default implementation. The default will be fine
for most customers/users, but advanced use-cases can be implemented by
substituting the implementation. Implementations could then also
respect manually set flags (=properties) if desired.
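The pluggable-strategy suggestion could look roughly like this; every name here (the interface, the property key, the threshold) is invented for the sketch and is not an Oak API. The default heuristic archives idle content, while a manually set flag overrides it, covering points (a) and (b) above.

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Illustrative strategy interface: advanced use-cases substitute their own
// implementation; the default covers the common case.
interface ArchivingStrategy {
    boolean canArchive(Map<String, Object> properties, long lastAccessedMillis, long nowMillis);
}

class DefaultArchivingStrategy implements ArchivingStrategy {

    private static final long MAX_IDLE_MILLIS = TimeUnit.DAYS.toMillis(365);

    public boolean canArchive(Map<String, Object> properties, long lastAccessedMillis, long nowMillis) {
        // A manually set flag (property name made up here) beats the heuristic.
        Object override = properties.get("archiveState");
        if ("toArchive".equals(override)) {
            return true;
        }
        if ("keepOnline".equals(override)) {
            return false;
        }
        // Default heuristic: archive anything idle for more than a year.
        return nowMillis - lastAccessedMillis > MAX_IDLE_MILLIS;
    }
}

public class ArchivingStrategyDemo {
    public static void main(String[] args) {
        ArchivingStrategy strategy = new DefaultArchivingStrategy();
        long now = System.currentTimeMillis();
        long twoYearsAgo = now - TimeUnit.DAYS.toMillis(730);
        System.out.println(strategy.canArchive(Map.of(), twoYearsAgo, now)); // prints "true"
        System.out.println(strategy.canArchive(
                Map.of("archiveState", "keepOnline"), twoYearsAgo, now));    // prints "false"
    }
}
```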

A much more important and difficult question to answer IMHO is how to
deal with the slow retrieval of archived content. And if needed, how
to expose the slow availability (i.e. unavailable now but available
later) to the end user (or application layer). To me this sounds
tricky if we want to stick to the JCR API.

Regards
Julian



On Mon, Jul 3, 2017 at 4:33 PM, Tommaso Teofili
 wrote:
> I am sure there are both use cases for automatic vs manual/controlled
> collection of unused data, however if I were a user I would personally not
> want to care about this. While I'd be happy to know that my repo is faster
> / smaller / cleaner / whatever it'd sound overly complex to deal with JCR
> and Oak constraints and behaviours from the application layer.
> IMHO if we want to have such a feature in Oak to save resources, it should
> be the persistence responsibility to say "hey, this content is not being
> accessed for ages, let's try to claim some resources from it" (which could
> mean moving to cold storage, compress it or anything else).
>
> My 2 cents,
> Tommaso
>
>
>
> Il giorno lun 3 lug 2017 alle ore 15:46 Thomas Mueller
>  ha scritto:
>
>> Hi,
>>
>> > a property on the node, e.g. "archiveState=toArchive"
>>
>> I wonder if we _can_ easily write to the version store? Also, some
>> nodetypes don't allow such properties? It might need to be a hidden
>> property, but then you can't use the JCR API. Or maintain this data in a
>> "shadow" structure (not with the nodes), which would complicate move
>> operations.
>>
>> If I was a customer, I wouldn't want to *manually* mark / unmark binaries
>> to be moved to / from long time storage. I would probably just want to rely
>> on automatic management. But I'm not a customer, so my opinion is not that
>> relevant (
>>
>> > Using a property directly specified for this purpose gives us more
>> direct control over how it is being used I think.
>>
>> Sure, but it also comes with some complexities.
>>
>> Regards,
>> Thomas
>>
>>
>>
>>


Re: copy on write node store

2017-05-30 Thread Julian Sedding
Slightly off topic: the thought that the copy on read/write indexing
features may need to be explicitly managed in such a setup just
occurred to me.

I.e. when an instance is switched to the copy on write node store, the
local index directory will deviate from the "mainline" node store.
Upon switching the instance back to the "mainline" (i.e. disabling
copy on write node store), the local index directory may need to be
deleted? Or maybe it is already resilient enough to automatically
recover.

Regards
Julian


On Tue, May 30, 2017 at 10:05 AM, Michael Dürig  wrote:
>
>
> On 30.05.17 09:34, Tomek Rekawek wrote:
>>
>> Hello Michael,
>>
>> thanks for the reply!
>>
>>> On 30 May 2017, at 09:18, Michael Dürig  wrote:
>>> AFAIU from your mail and from looking at the patch this is about a node
>>> store implementation that can be rolled back to a previous state.
>>>
>>> If this is the case, a simpler way to achieve this might be to use the
>>> TarMK and and add functionality for rolling it back.
>>
>>
>> Indeed, it would be much simpler. However, the main purpose of the new
>> feature is testing the blue-green Sling deployments. That’s why we need the
>> DocumentMK to support it as well.
>
>
> Ok I see. I think the fact that these classes are not for production use
> should be stated in the Javadoc along with what clarifications of what can
> be expected from the store wrt. interleaving of calls to various mutators
> (e.g. enableCopyOnWrite() / disableCopyOnWrite() / merge(), etc.). I foresee
> a couple of very sneaky race conditions here.
>
> Michael


Re: upgrade repository structure with backward-incompatible changes

2017-05-19 Thread Julian Sedding
Hi Marco

In this case I think you should use the JCR API to implement your
content changes.

I am not aware of a pure JCR toolkit that helps with this, so you may
just need to write something yourself.
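To illustrate the kind of backward-incompatible change Marco described (remove a property and add a new mandatory property derived from it), here is the bare transformation logic, with a plain map standing in for a JCR node. All names are hypothetical; real code would do the same per node via the JCR API (Node.getProperty/Node.setProperty under a Session, saving in batches).

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of one upgrade step: drop the old property, derive the new
// mandatory one from its value. A Map plays the role of a JCR node here.
public class PropertyUpgrade {

    static void upgrade(Map<String, String> node) {
        String legacy = node.remove("legacyState");          // old property goes away
        String derived = "archived".equals(legacy) ? "COLD" : "HOT";
        node.put("storageTier", derived);                     // new mandatory property
    }

    public static void main(String[] args) {
        Map<String, String> node = new HashMap<>();
        node.put("legacyState", "archived");
        upgrade(node);
        System.out.println(node.get("storageTier")); // prints "COLD"
    }
}
```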

Regards
Julian



On Fri, May 19, 2017 at 5:00 PM, Marco Piovesana  wrote:
> Hi Julian,
> I meant I'm using Oak not Sing. Yes I'm using JCR API.
>
> Marco.
>
> On Fri, May 19, 2017 at 2:22 PM, Julian Sedding  wrote:
>
>> Hi Marco
>>
>> On Fri, May 19, 2017 at 2:10 PM, Marco Piovesana 
>> wrote:
>> > Hi Julian, Michael and Robert
>> > first of all thanks for the suggestions.
>> > I'm using Oak directly inside my application,
>>
>> Do you mean you are not using the JCR API?
>>
>> > so I guess the Sling Pipes
>> > are not something I can use, or not? Is the concept of Pipe already
>> defined
>> > in some way inside oak?
>>
>> No, Oak has no such concept. Sling Pipes is an OSGi bundle that is
>> unrelated to Oak but uses the JCR and Jackrabbit APIs (both are
>> implemented by Oak).
>>
>> Regards
>> Julian
>>
>> >
>> > Marco.
>> >
>> > On Fri, May 19, 2017 at 10:39 AM, Julian Sedding 
>> wrote:
>> >
>> >> Hi Marco
>> >>
>> >> It sounds like you are dealing with a JCR-based application and thus
>> >> you should be using the JCR API (directly or indirectly, e.g. via
>> >> Sling) to change your content.
>> >>
>> >> CommitHook is an Oak internal API that does not enforce any JCR
>> >> semantics. So if you were to go down that route, you would need to be
>> >> very careful not to change the content structure in a way that
>> >> essentially corrupts JCR semantics.
>> >>
>> >> Regards
>> >> Julian
>> >>
>> >>
>> >> On Tue, May 16, 2017 at 6:33 PM, Marco Piovesana 
>> >> wrote:
>> >> > Hi Tomek,
>> >> > yes I'm trying to upgrade within the same repository type but I can
>> >> decide
>> >> > whether to migrate the repository or not based on what makes the
>> upgrade
>> >> > easier.
>> >> > The CommitHooks can only be used inside an upgrade to a new
>> repository?
>> >> > What is the suggested way to apply backward-incompatible changes if i
>> >> don't
>> >> > want to migrate the data from one repository to another but I want to
>> >> apply
>> >> > the modifications to the original one?
>> >> >
>> >> > Marco.
>> >> >
>> >> > On Tue, May 16, 2017 at 4:04 PM, Tomek Rekawek
>> > >> >
>> >> > wrote:
>> >> >
>> >> >> Hi Marco,
>> >> >>
>> >> >> the main purpose of the oak-upgrade is to migrate a Jackrabbit 2 /
>> CRX2
>> >> >> repository into Oak or to migrate one Oak node store (eg. segment) to
>> >> >> another (like Mongo). On the other hand, it’s not a good choice to
>> use
>> >> it
>> >> >> for the application upgrades within the same repository type. You
>> didn’t
>> >> >> mention if your upgrade involves the repository migration (in this
>> case
>> >> >> choosing oak-upgrade would be justified) or not.
>> >> >>
>> >> >> If you still want to use oak-upgrade, it allows to use custom
>> >> CommitHooks
>> >> >> [1] during the migration. They should be included in the class path
>> with
>> >> >> the ServiceLoader mechanism [2].
>> >> >>
>> >> >> Regards,
>> >> >> Tomek
>> >> >>
>> >> >> [1] http://jackrabbit.apache.org/oak/docs/architecture/
>> >> >> nodestate.html#The_commit_hook_mechanism
>> >> >> [2] https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html
>> >> >>
>> >> >> --
>> >> >> Tomek Rękawek | Adobe Research | www.adobe.com
>> >> >> reka...@adobe.com
>> >> >>
>> >> >> > On 14 May 2017, at 12:20, Marco Piovesana 
>> >> wrote:
>> >> >> >
>> >> >> > Hi all,
>> >> >> > I'm trying to deal with backward-incompatible changes on my
>> repository
>> >> >> > structure. I was looking at the oak-up

Re: upgrade repository structure with backward-incompatible changes

2017-05-19 Thread Julian Sedding
Hi Marco

On Fri, May 19, 2017 at 2:10 PM, Marco Piovesana  wrote:
> Hi Julian, Michael and Robert
> first of all thanks for the suggestions.
> I'm using Oak directly inside my application,

Do you mean you are not using the JCR API?

> so I guess the Sling Pipes
> are not something I can use, or not? Is the concept of Pipe already defined
> in some way inside oak?

No, Oak has no such concept. Sling Pipes is an OSGi bundle that is
unrelated to Oak but uses the JCR and Jackrabbit APIs (both are
implemented by Oak).

Regards
Julian

>
> Marco.
>
> On Fri, May 19, 2017 at 10:39 AM, Julian Sedding  wrote:
>
>> Hi Marco
>>
>> It sounds like you are dealing with a JCR-based application and thus
>> you should be using the JCR API (directly or indirectly, e.g. via
>> Sling) to change your content.
>>
>> CommitHook is an Oak internal API that does not enforce any JCR
>> semantics. So if you were to go down that route, you would need to be
>> very careful not to change the content structure in a way that
>> essentially corrupts JCR semantics.
>>
>> Regards
>> Julian
>>
>>
>> On Tue, May 16, 2017 at 6:33 PM, Marco Piovesana 
>> wrote:
>> > Hi Tomek,
>> > yes I'm trying to upgrade within the same repository type but I can
>> decide
>> > whether to migrate the repository or not based on what makes the upgrade
>> > easier.
>> > The CommitHooks can only be used inside an upgrade to a new repository?
>> > What is the suggested way to apply backward-incompatible changes if i
>> don't
>> > want to migrate the data from one repository to another but I want to
>> apply
>> > the modifications to the original one?
>> >
>> > Marco.
>> >
>> > On Tue, May 16, 2017 at 4:04 PM, Tomek Rekawek > >
>> > wrote:
>> >
>> >> Hi Marco,
>> >>
>> >> the main purpose of the oak-upgrade is to migrate a Jackrabbit 2 / CRX2
>> >> repository into Oak or to migrate one Oak node store (eg. segment) to
>> >> another (like Mongo). On the other hand, it’s not a good choice to use
>> it
>> >> for the application upgrades within the same repository type. You didn’t
>> >> mention if your upgrade involves the repository migration (in this case
>> >> choosing oak-upgrade would be justified) or not.
>> >>
>> >> If you still want to use oak-upgrade, it allows to use custom
>> CommitHooks
>> >> [1] during the migration. They should be included in the class path with
>> >> the ServiceLoader mechanism [2].
>> >>
>> >> Regards,
>> >> Tomek
>> >>
>> >> [1] http://jackrabbit.apache.org/oak/docs/architecture/
>> >> nodestate.html#The_commit_hook_mechanism
>> >> [2] https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html
>> >>
>> >> --
>> >> Tomek Rękawek | Adobe Research | www.adobe.com
>> >> reka...@adobe.com
>> >>
>> >> > On 14 May 2017, at 12:20, Marco Piovesana 
>> wrote:
>> >> >
>> >> > Hi all,
>> >> > I'm trying to deal with backward-incompatible changes on my repository
>> >> > structure. I was looking at the oak-upgrade module but, as far as I
>> could
>> >> > understand, I can't really make modifications that require some logic
>> >> (e.g.
>> >> > remove a property and add a new mandatory property with a value based
>> on
>> >> > the removed one).
>> >> > I saw that one of the options might be the "namespace migration":
>> >> > - remap the current namespace to a different prefix;
>> >> > - create a new namespace with original prefix;
>> >> > - port all nodes from old namespace to new namespace applying the
>> >> required
>> >> > modifications.
>> >> >
>> >> > I couldn't find much documentation on the topic, so my question is: is
>> >> this
>> >> > the right way to do it? There are other suggested approaches to the
>> >> > problem? There's already a tool that can be used to define how to map
>> a
>> >> > source CND definition into a destination CND definition and then apply
>> >> the
>> >> > modifications to a repository?
>> >> >
>> >> > Marco.
>> >>
>>


Re: upgrade repository structure with backward-incompatible changes

2017-05-19 Thread Julian Sedding
Hi Marco

It sounds like you are dealing with a JCR-based application and thus
you should be using the JCR API (directly or indirectly, e.g. via
Sling) to change your content.

CommitHook is an Oak internal API that does not enforce any JCR
semantics. So if you were to go down that route, you would need to be
very careful not to change the content structure in a way that
essentially corrupts JCR semantics.

Regards
Julian


On Tue, May 16, 2017 at 6:33 PM, Marco Piovesana  wrote:
> Hi Tomek,
> yes I'm trying to upgrade within the same repository type but I can decide
> whether to migrate the repository or not based on what makes the upgrade
> easier.
> The CommitHooks can only be used inside an upgrade to a new repository?
> What is the suggested way to apply backward-incompatible changes if i don't
> want to migrate the data from one repository to another but I want to apply
> the modifications to the original one?
>
> Marco.
>
> On Tue, May 16, 2017 at 4:04 PM, Tomek Rekawek 
> wrote:
>
>> Hi Marco,
>>
>> the main purpose of the oak-upgrade is to migrate a Jackrabbit 2 / CRX2
>> repository into Oak or to migrate one Oak node store (eg. segment) to
>> another (like Mongo). On the other hand, it’s not a good choice to use it
>> for the application upgrades within the same repository type. You didn’t
>> mention if your upgrade involves the repository migration (in this case
>> choosing oak-upgrade would be justified) or not.
>>
>> If you still want to use oak-upgrade, it allows to use custom CommitHooks
>> [1] during the migration. They should be included in the class path with
>> the ServiceLoader mechanism [2].
>>
>> Regards,
>> Tomek
>>
>> [1] http://jackrabbit.apache.org/oak/docs/architecture/
>> nodestate.html#The_commit_hook_mechanism
>> [2] https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html
>>
>> --
>> Tomek Rękawek | Adobe Research | www.adobe.com
>> reka...@adobe.com
>>
>> > On 14 May 2017, at 12:20, Marco Piovesana  wrote:
>> >
>> > Hi all,
>> > I'm trying to deal with backward-incompatible changes on my repository
>> > structure. I was looking at the oak-upgrade module but, as far as I could
>> > understand, I can't really make modifications that require some logic
>> (e.g.
>> > remove a property and add a new mandatory property with a value based on
>> > the removed one).
>> > I saw that one of the options might be the "namespace migration":
>> > - remap the current namespace to a different prefix;
>> > - create a new namespace with original prefix;
>> > - port all nodes from old namespace to new namespace applying the
>> required
>> > modifications.
>> >
>> > I couldn't find much documentation on the topic, so my question is: is
>> this
>> > the right way to do it? There are other suggested approaches to the
>> > problem? There's already a tool that can be used to define how to map a
>> > source CND definition into a destination CND definition and then apply
>> the
>> > modifications to a repository?
>> >
>> > Marco.
>>
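The three-step prefix remapping sketched in the quoted mail could look roughly like this against the JCR API (pseudocode only; the prefixes, URI, and query are illustrative assumptions, and, per the caveat in the reply above, any such rewrite must take care to preserve JCR semantics):

```text
# pseudocode -- "ex" is the prefix being migrated; all names are hypothetical
registry = session.workspace.namespaceRegistry
oldUri   = registry.getURI("ex")
registry.registerNamespace("ex-old", oldUri)                 # 1. remap current namespace to a different prefix
registry.registerNamespace("ex", "http://example.com/ns/2.0") # 2. new namespace under the original prefix
for node in session.query("//element(*, ex-old:*)"):          # 3. port nodes, applying the required logic
    copy node using "ex:" node types and properties
    derive the new mandatory property from the removed one
    remove the "ex-old:" counterpart
session.save()
```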


Re: new name for the multiplexing node store

2017-05-11 Thread Julian Sedding
+1 to CompositeNodeStore

Regards
Julian

On Thu, May 11, 2017 at 10:36 AM, Bertrand Delacretaz
 wrote:
> On Thu, May 11, 2017 at 9:33 AM, Robert Munteanu  wrote:
>> ...MultiplexingNodeStore is a pretty standard implementation
>> of the Composite design pattern...
>
> So CompositeNodeStore maybe? I like it.
>
> -Bertrand


Re: new name for the multiplexing node store

2017-05-05 Thread Julian Sedding
Hi Tomek

In all related discussions the term "mount" appears a lot. So why not
Mounting NodeStore? The module could be "oak-store-mount".

Regards
Julian


On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek  wrote:
> Hello oak-dev,
>
> the multiplexing node store has been recently extracted from the oak-core 
> into a separate module and I’ve used it as an opportunity to rename the 
> thing. The name I suggested is Federated Node Store. Robert doesn’t agree 
> it’s the right name, mostly because the “partial” node stores, creating the 
> combined (multiplexing / federated) one, are not usable on their own and 
> store only a part of the overall repository content.
>
> Our arguments in their full lengths can be found in the OAK-6136 (last 3-4 
> comments), so there’s no need to repeat them here. We wanted to ask you for 
> opinion about the name. We kind of agree that the “multiplexing” is not the 
> best choice - can you suggest something else or maybe you think that 
> “federated” is good enough?
>
> Thanks for the feedback.
>
> Regards,
> Tomek
>
> --
> Tomek Rękawek | Adobe Research | www.adobe.com
> reka...@adobe.com
>


Re: oak-run: Enforcing size

2017-04-28 Thread Julian Sedding
I also think that the build should not produce different artifacts
depending on a profile.

If the jar file gets too big when embedding the JDBC driver, we may
want to consider producing two build artifacts: the jar file without
RDB support and another one (e.g. with classifier "rdb") that embeds
the drivers.

Regards
Julian



On Fri, Apr 28, 2017 at 1:15 PM, Davide Giannella  wrote:
> On 26/04/2017 09:32, Julian Reschke wrote:
>> On 2017-04-26 10:28, Davide Giannella wrote:
>>
>>> a release we're not triggering any specific profile.
>>
>> Well, in that case we're not triggering the profile, right?
>
> Exactly. Therefore the released oak-run never embedded any jdbc so far.
> Anyone correct me if I'm wrong.
>
>>
>>> Regardless, the fastest solution is to increase the size according to
>>> what you see. However, is this a new dependency you're adding as part
>>> of new features?
>>
>> No, it always has been the case.
>>
>> However, if you select all RDB profiles you'll include essentially all
>> JDBC drivers, in which case maintaining the limit becomes pretty
>> pointless...
>
> I'd say you could change the size for the RDB profiles only (adding the
> enforcer size under the profiles) or simply increase the general size.
>
> It seems strange to me that we're not embedding the jdbc dependencies
> for the released jar. Maybe we want to change that and simplify.
>
> How big is the generated jar for the RDB profiles?
>
> D.


Re: [ops] Unify NodeStore/DataStore configurations by using nstab

2017-04-28 Thread Julian Sedding
Hi Arek

I agree that we could benefit from a way to bootstrap a repository
from a single configuration file.

Regarding the format you suggest, I am sceptical that it is suitable
to cover all (required) complexities of setting up a repository.
Consider that besides the persistence, there are various security
components, initial content providers etc that (may) need to be
considered.

I suggest you create a POC in a separate Maven module. That's probably
the best way to find out whether your suggested configuration language
suits the requirements of setting up an Oak repository.

Regarding the implementation, I assume you should be able to get quite
far with just using the classes Oak and Jcr. They should also give an
impression of the configuration options you may want to cover.
Furthermore, you would need a way to map some class names to
short-hand names (e.g. Segment, File etc from your examples). I'd
start with a hard-coded Map or a Properties file. Once the POC is done
and we want to integrate it, we can consider replacing this registry
mechanism.

Regards
Julian


On Fri, Apr 28, 2017 at 12:56 PM, Arek Kita  wrote:
> Hi,
>
> I've noticed recently that with many different NodeStore
> implementation (Segment, Document, Multiplexing) but also DataStore
> implementation (File, S3, Azure) and some composite ones like
> (Hierarchical, Federated - that was already mentioned in [0]) it
> becomes more and more difficult to set up everything correctly and be
> able to know the current persistence state of repository (especially
> with pretty aged repos).
>
> Moreover, the configuration pattern that is based on individual PID of
> one service becomes problematic (i.e. recent change for
> SegmentNodeStoreService).
>
> From the operations and user perspective everything should be treated
> as a whole IMHO no matter which service handles which fragment of
> persistence layout. Oak should know itself how to "autowire" different
> parts, obviously with some hints and pointers from users as they want
> to run Oak in their own preferred layout.
>
> My proposal would be to integrate everything together to a pretty old
> concept called "fstab". For our purposes I would call it "nstab".
>
> This could look like [1] for the most simple case (with internal
> blobs), [2] for typical SegmentMK + FDS, [3] for SegmentMK + S3DS, [4]
> for MultiplexingNodeStore with some areas of repo set as read only. I
> think we could also model Hierarchical and Federated DataStores as
> well in the future.
>
> Examples are for illustration purposes but I guess such setup will
> help changing layout without a need to inspect many OSGi
> configurations in a current setup and making sure some conflicting
> ones aren't active.
>
> The schema is also similar to an UNIX-way of configuring filesystem so
> it will help Oak users to understand the layout (at least better than
> it is now). I see also advantage for automated tooling like
> oak-upgrade for complex cases in the future - user just provides
> source nstab and target nstab in order to migrate repository.
>
> The config should be also simpler avoiding things like customBlobStore
> (it will be inferred from context).
>
> WDYT? I have some thoughts how could this be implemented but first I
> would like to know your opinions on that.
>
> Thanks in advance for feedback!
> Arek
>
>
> [0] http://oak.markmail.org/thread/22dvuo6b7ab5ib7m
> [1] 
> https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.1
> [2] 
> https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.2
> [3] 
> https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.3
> [4] 
> https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.4
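As a rough illustration of the fstab-like idea, an "nstab" line could be parsed into a store type, a target, and options. The concrete format lives in the gists linked above and is not reproduced here; the layout "&lt;store type&gt; &lt;target&gt; &lt;comma-separated options&gt;" and all names in this sketch are hypothetical assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an fstab-style "nstab" line parser.
// The real proposed format (see the linked gists) is NOT reproduced here.
public class NstabParser {

    public static final class Entry {
        public final String storeType;
        public final String target;
        public final Map<String, String> options;

        Entry(String storeType, String target, Map<String, String> options) {
            this.storeType = storeType;
            this.target = target;
            this.options = options;
        }
    }

    public static Entry parse(String line) {
        // Columns are separated by whitespace, as in fstab.
        String[] parts = line.trim().split("\\s+");
        Map<String, String> opts = new LinkedHashMap<>();
        if (parts.length > 2) {
            for (String opt : parts[2].split(",")) {
                String[] kv = opt.split("=", 2);
                // Bare flags (no "=value") default to "true".
                opts.put(kv[0], kv.length > 1 ? kv[1] : "true");
            }
        }
        return new Entry(parts[0], parts[1], opts);
    }

    public static void main(String[] args) {
        Entry e = parse("Segment  /repo/segmentstore  mmap=true,readonly");
        System.out.println(e.storeType + " " + e.target + " " + e.options);
    }
}
```

A tabular format like this would make the persistence layout readable at a glance, which is the point of the proposal; the parsing itself is trivial.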


Re: [m12n] Location of InitialContent

2017-04-20 Thread Julian Sedding
Hi Angela

From the features you describe it sounds like it should go into
org.apache.jackrabbit.oak.jcr (or at least most of it). It looks like
it is being used in lots of tests in oak-core, however, so this may
just be wishful thinking...

Regards
Julian


On Thu, Apr 20, 2017 at 10:07 AM, Angela Schreiber  wrote:
> hi
>
> the original intention of the 'InitialContent' was just to registers
> built-in JCR node types, which explains it's location in
> org.apache.jackrabbit.oak.plugins.nodetype.write
>
> in the mean time it has a evolved to container for all kind of initial
> content required for a JCR repository: mandatory structure, version
> storage, uuid-index and most recently document ns specific configuration
> (see also OAK-5656 ).
>
> to me the location in the org.apache.jackrabbit.oak.plugins.nodetype.write
> package no longer makes sense and i would suggest to move it to the
> org.apache.jackrabbit.oak package along with the Oak, OakInitializer,
> OakVersion and other utilities used to create an JCR/Oak repository.
>
> wdyt?
>
> kind regards
> angela
>
>


Re: [DISCUSS] Which I/O statistics should the FileStore expose?

2017-02-13 Thread Julian Sedding
Hi Francesco

I believe you should implement an IOMonitor using the metrics in the
org.apache.jackrabbit.oak.stats package. These can be backed by
swappable StatisticsProvider implementations. I believe by default
it's a NOOP implementation. However, I believe that if the
MetricStatisticsProvider implementation is used, it automatically
exposes the metrics via JMX. So all you need to do is feed the correct
data into a suitable metric. I believe Chetan contributed these, so he
will know more about the details.

Regards
Julian


On Mon, Feb 13, 2017 at 6:21 PM, Francesco Mari
 wrote:
> Hi all,
>
> The recently introduced IOMonitor allows the FileStore to trigger I/O
> events. Callback methods from IOMonitor can be implemented to receive
> information about segment reads and writes.
>
> A trivial implementation of IOMonitor is able to track the following raw data.
>
> - The number of segment read and write operations.
> - The duration in nanoseconds of every read and write.
> - The number of bytes read or written by each operation.
>
> We are about to expose this kind of information from an MBean - for
> the sake of discussion, let's call it IOMonitorMBean. I'm currently in
> favour of starting small and exposing the following statistics:
>
> - The duration of the latest write (long).
> - The duration of the latest read (long).
> - The number of write operations (long).
> - The number of read operations (long).
>
> I would like your opinion about what's the most useful way to present
> this data through an MBean. Should just raw data be exposed? Is it
> appropriate for IOMonitorMBean to perform some kind of aggregation,
> like sum and average? Should richer data be returned from the MBean,
> like tabular data?
>
> Please keep in mind that this data is supposed to be consumed by a
> monitoring solution, and not by a human reader.
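A minimal sketch of the raw counters such an MBean could keep, following the "start small" suggestion above. The class and method names here are assumptions for illustration, not Oak's actual IOMonitor/StatisticsProvider API.

```java
// Hypothetical aggregator for segment I/O callbacks. Raw values only;
// sums and averages are left to the external monitoring solution.
public class IoStats {
    private long readCount, writeCount;
    private long lastReadNanos, lastWriteNanos;
    private long bytesRead, bytesWritten;

    // Would be invoked from IOMonitor-style read/write callbacks.
    public synchronized void onSegmentRead(long nanos, long bytes) {
        readCount++;
        lastReadNanos = nanos;
        bytesRead += bytes;
    }

    public synchronized void onSegmentWrite(long nanos, long bytes) {
        writeCount++;
        lastWriteNanos = nanos;
        bytesWritten += bytes;
    }

    public synchronized long getReadCount() { return readCount; }
    public synchronized long getWriteCount() { return writeCount; }
    public synchronized long getLastReadNanos() { return lastReadNanos; }
    public synchronized long getLastWriteNanos() { return lastWriteNanos; }
    public synchronized long getBytesRead() { return bytesRead; }
    public synchronized long getBytesWritten() { return bytesWritten; }

    public static void main(String[] args) {
        IoStats s = new IoStats();
        s.onSegmentRead(1200, 4096);
        s.onSegmentWrite(800, 1024);
        System.out.println("reads=" + s.getReadCount() + " writes=" + s.getWriteCount());
    }
}
```

Exposing only the latest duration and the operation counts keeps the MBean cheap; a monitoring system can derive rates and averages by sampling these values over time.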


Re: [VOTE] Release Apache Jackrabbit Oak 1.4.11

2016-12-13 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.4.11

Regards
Julian

On Mon, Dec 12, 2016 at 3:12 PM, Julian Reschke  wrote:
> On 2016-12-12 13:05, Davide Giannella wrote:
>>
>> ...
>
>
> [X] +1 Release this package as Apache Jackrabbit Oak 1.4.11
>
> Best regards, Julian
>


Re: [VOTE] Release Apache Jackrabbit Oak 1.2.22

2016-12-12 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.2.22

Regards
Julian

On Mon, Dec 12, 2016 at 10:15 AM, Julian Reschke  wrote:
> On 2016-12-12 05:41, Amit Jain wrote:
>>
>> ...
>
>
> [X] +1 Release this package as Apache Jackrabbit Oak 1.2.22
>
> Best regards, Julian
>


Clarifiing Blob#getReference and BlobStore#getReference

2016-12-09 Thread Julian Sedding
Hi all

I was wondering if Blob#getReference could be used in
AbstractBlob#equal to optimize blob comparison (OAK-5253).
Specifically whether blobA.getReference() != blobB.getReference()
(pseudocode) allows us to determine that the blobs are not equal.

However, the API docs[0,1] only state that they return a "secure
reference" to the Blob. They do not explain what "secure" is supposed to
mean in this context.

Thanks for your insights!

Regards
Julian

[0] 
http://static.javadoc.io/org.apache.jackrabbit/oak-core/1.5.14/org/apache/jackrabbit/oak/api/Blob.html#getReference()
[1] 
http://static.javadoc.io/org.apache.jackrabbit/oak-blob/1.5.14/org/apache/jackrabbit/oak/spi/blob/BlobStore.html#getReference(java.lang.String)
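For illustration: *if* references were content-derived (e.g. a content hash -- an assumption, which is exactly what the javadoc leaves open), then unequal references would prove unequal blobs, while equal references would still only suggest equality up to hash collisions. A sketch under that assumption:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Assumption: a blob's reference is a hash of its content. This is NOT what
// Blob#getReference is specified to guarantee -- that ambiguity is the question above.
public class BlobRefDemo {

    static byte[] reference(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Equal content always yields equal references, so unequal references
    // prove unequal content. The converse does not hold: equal references
    // would still require a full byte-wise comparison to be conclusive.
    static boolean definitelyDifferent(byte[] refA, byte[] refB) {
        return !Arrays.equals(refA, refB);
    }

    public static void main(String[] args) {
        byte[] a = reference("hello".getBytes());
        byte[] b = reference("world".getBytes());
        System.out.println("different=" + definitelyDifferent(a, b));
    }
}
```

This is why the AbstractBlob#equals optimization in OAK-5253 hinges on the reference contract: without a guarantee that references are deterministic functions of content, reference comparison proves nothing either way.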


Re: Is Lucene CopyOnRead/CopyOnWrite beneficial with SegmentNodeStore?

2016-11-21 Thread Julian Sedding
Thanks Chetan, that's very helpful.

Regards
Julian

On Mon, Nov 21, 2016 at 3:47 PM, Chetan Mehrotra
 wrote:
> In general its better that you use a BlobStore even with
> SegmenNodeStore. In that case CoR and CoW allows using Lucene's memory
> mapped FSDirectory support providing better performance.
>
> Some old numbers can be seen at [1]
>
> Chetan Mehrotra
> [1] 
> https://issues.apache.org/jira/browse/OAK-1702?focusedCommentId=13965551&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13965551
>
>
> On Mon, Nov 21, 2016 at 8:03 PM, Julian Sedding  wrote:
>> Hi all
>>
>> Do we have experience or measurements to suggest that using Lucene's
>> CopyOnRead and CopyOnWrite features is beneficial when the
SegmentNodeStore is used?
>>
>> The documentation indirectly suggests that CopyOnRead is only
>> beneficial with remote NodeStores (i.e. DocumentNodeStore). So the
>> question is whether the SegmentNodeStore implementation allows
>> sufficiently fast access to Lucene's index files, or whether the
>> associated overhead still makes CopyOnRead beneficial.
>>
>> Thanks for any insights!
>>
>> Regards
>> Julian


Is Lucene CopyOnRead/CopyOnWrite beneficial with SegmentNodeStore?

2016-11-21 Thread Julian Sedding
Hi all

Do we have experience or measurements to suggest that using Lucene's
CopyOnRead and CopyOnWrite features is beneficial when the
SegmentNodeStore is used?

The documentation indirectly suggests that CopyOnRead is only
beneficial with remote NodeStores (i.e. DocumentNodeStore). So the
question is whether the SegmentNodeStore implementation allows
sufficiently fast access to Lucene's index files, or whether the
associated overhead still makes CopyOnRead beneficial.

Thanks for any insights!

Regards
Julian


Re: [VOTE] Release Apache Jackrabbit Oak 1.4.10

2016-11-11 Thread Julian Sedding
+1 Release this package as Apache Jackrabbit Oak 1.4.10

Regards
Julian

On Fri, Nov 11, 2016 at 6:38 AM, Amit Jain  wrote:
> On Thu, Nov 10, 2016 at 7:23 PM, Davide Giannella  wrote:
>
>> Please vote on releasing this package as Apache Jackrabbit Oak 1.4.10.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Jackrabbit PMC votes are cast.
>>
>
> +1 Release this package as Apache Jackrabbit Oak 1.4.10
>
> Thanks
> Amit


Re: [VOTE] Release Apache Jackrabbit Oak 1.5.13

2016-11-11 Thread Julian Sedding
+1 Release this package as Apache Jackrabbit Oak 1.5.13

Regards
Julian

On Wed, Nov 9, 2016 at 10:29 AM, Alex Parvulescu
 wrote:
> [X] +1 Release this package as Apache Jackrabbit Oak 1.5.13
>
> On Tue, Nov 8, 2016 at 4:33 PM, Davide Giannella  wrote:
>
>>
>> A candidate for the Jackrabbit Oak 1.5.13 release is available at:
>>
>> https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.5.13/
>>
>> The release candidate is a zip archive of the sources in:
>>
>>
>> https://svn.apache.org/repos/asf/jackrabbit/oak/tags/
>> jackrabbit-oak-1.5.13/
>>
>> The SHA1 checksum of the archive is
>> c023a1924941e1609abf82b4e63a8617276e6091.
>>
>> A staged Maven repository is available for review at:
>>
>> https://repository.apache.org/
>>
>> The command for running automated checks against this release candidate is:
>>
>> $ sh check-release.sh oak 1.5.13
>> c023a1924941e1609abf82b4e63a8617276e6091
>>
>> Please vote on releasing this package as Apache Jackrabbit Oak 1.5.13.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Jackrabbit PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Jackrabbit Oak 1.5.13
>> [ ] -1 Do not release this package because...
>>
>> Davide
>>
>>
>>


Re: svn commit: r1767830 - in /jackrabbit/oak/trunk/oak-upgrade/src: main/java/org/apache/jackrabbit/oak/upgrade/security/AuthorizableFolderEditor.java test/java/org/apache/jackrabbit/oak/upgrade/Auth

2016-11-07 Thread Julian Sedding
Sorry, my bad. Thanks Chetan!

On Thu, Nov 3, 2016 at 8:58 AM,   wrote:
> Author: chetanm
> Date: Thu Nov  3 07:58:35 2016
> New Revision: 1767830
>
> URL: http://svn.apache.org/viewvc?rev=1767830&view=rev
> Log:
> OAK-5043: Very old JR2 repositories may have invalid nodetypes for groupsPath 
> and usersPath
>
> Add missing license header
>
> Modified:
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/security/AuthorizableFolderEditor.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/AuthorizableFolderEditorTest.java
>
> Modified: 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/security/AuthorizableFolderEditor.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/security/AuthorizableFolderEditor.java?rev=1767830&r1=1767829&r2=1767830&view=diff
> ==
> --- 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/security/AuthorizableFolderEditor.java
>  (original)
> +++ 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/security/AuthorizableFolderEditor.java
>  Thu Nov  3 07:58:35 2016
> @@ -1,3 +1,19 @@
> +/*
> + * Licensed to the Apache Software Foundation (ASF) under one or more
> + * contributor license agreements.  See the NOTICE file distributed with
> + * this work for additional information regarding copyright ownership.
> + * The ASF licenses this file to You under the Apache License, Version 2.0
> + * (the "License"); you may not use this file except in compliance with
> + * the License.  You may obtain a copy of the License at
> + *
> + *  http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
>  package org.apache.jackrabbit.oak.upgrade.security;
>
>  import org.apache.jackrabbit.oak.api.CommitFailedException;
>
> Modified: 
> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/AuthorizableFolderEditorTest.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/AuthorizableFolderEditorTest.java?rev=1767830&r1=1767829&r2=1767830&view=diff
> ==
> --- 
> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/AuthorizableFolderEditorTest.java
>  (original)
> +++ 
> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/AuthorizableFolderEditorTest.java
>  Thu Nov  3 07:58:35 2016
> @@ -1,3 +1,19 @@
> +/*
> + * Licensed to the Apache Software Foundation (ASF) under one or more
> + * contributor license agreements.  See the NOTICE file distributed with
> + * this work for additional information regarding copyright ownership.
> + * The ASF licenses this file to You under the Apache License, Version 2.0
> + * (the "License"); you may not use this file except in compliance with
> + * the License.  You may obtain a copy of the License at
> + *
> + *  http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
>  package org.apache.jackrabbit.oak.upgrade;
>
>  import org.apache.jackrabbit.JcrConstants;
>
>


Re: Deprecate oak-segment and oak-tarmk-standby

2016-11-01 Thread Julian Sedding
+1 for the deprecations

Regards
Julian

On Tue, Nov 1, 2016 at 12:19 PM, Robert Munteanu  wrote:
> Thanks Francesco and Michael, all clear now for me.
>
> Robert
>
> On Tue, 2016-11-01 at 11:47 +0100, Michael Dürig wrote:
>>
>> On 1.11.16 11:44 , Francesco Mari wrote:
>> > > 2. What is the migration strategy for deployments using oak-
>> > > segment?
>> >
>> > Just use oak-segment-tar instead of oak-segment and oak-tarmk-
>> > standby.
>> > It is almost a trivial replacement, except for some minor details
>> > that
>> > will be covered in the documentation.
>> >
>>
>> And migration is covered via oak-upgrade. That is, there is a one
>> time
>> migration effort involved on repositories created by oak-segment.
>>
>> Michael
>


Re: segment-tar depending on oak-core

2016-10-31 Thread Julian Sedding
Hi all

My preference is also with a higher degree of modularity. Compared to
a monolithic application it is a trade-off that leads to both, higher
complexity and higher flexibility. Provided we are willing to change
and learn, I am sure we can easily manage the complexity. Numerous
benefits of the extra flexibility have been mentioned in this thread
before, so I won't repeat them.

As I understand it the Oak package structure was designed to
facilitate modularity very early on. As Jukka wrote back in 2012:

"[...] Ultimately such extra plugin components may well end up as
separate Maven components, but until the related service interfaces
and plugin boundaries are well defined it's better to keep all such
code together and simply use Java package boundaries to separate them.
That's the rationale behind the .oak.plugins package [...]"[0].

IMHO, now that the API boundaries are well defined (I hope), it would
be great to finally move the structure of the code-base and release
artifacts towards a more modular approach.

Regards
Julian

[0] http://markmail.org/thread/cs34a637dr26xscj


On Fri, Oct 28, 2016 at 8:29 AM, Francesco Mari
 wrote:
> Hi
>
> 2016-10-27 19:08 GMT+02:00 Alexander Klimetschek :
>> Maybe looking at this step by step would help.
>
> The oak-segment-tar bundle was supposed to be the first step.
>
>>For example, start with the nodestore implementations and extract everything 
>>into separate modules that is necessary for this - i.e. an oak-store-api 
>>along with the impls. But keep other apis in oak-core in that first step, to 
>>limit the effort. (And try not renaming the API packages, as well as keeping 
>>them backwards compatible, i.e. no major version bump, if possible).
>
> This didn't happen because of lack of consensus. See my previous
> answer to Michael Marth.
>
>>See how that works out and if positive, continue with more.
>
> The reaction to the modularization effort was not positive, so
> oak-segment-tar backed up.
>
>>
>> Cheers,
>> Alex
>>
>> Am 27. Okt. 2016, 03:48 -0700 schrieb Francesco Mari 
>> :
>> Something did happen: the first NodeStore implementation living in its
>> own module was oak-segment-tar. We just decided to go back to the old
>> model exactly because we didn't reach consensus about modularizing its
>> upstream and downstream dependencies.
>>
>> 2016-10-27 12:22 GMT+02:00 Michael Marth :
>> fwiw: last year a concrete proposal was made that seemed to have consensus
>>
>> “Move NodeStore implementations into their own modules"
>> http://markmail.org/message/6ylxk4twdi2lzfdz
>>
>> Agree that nothing happened - but I believe that this move might again find 
>> consensus
>>
>>
>>
>> On 27/10/16 10:49, "Francesco Mari"  wrote:
>>
>> We keep having this conversation regularly but nothing ever changes.
>> As much as I would like to push the modularization effort forward, I
>> recognize that the majority of the team is either not in favour or
>> openly against it. I don't want to disrupt the way most of us are used
>> to work. Michael Dürig already provided an extensive list of what we
>> will be missing if we keep writing software the way we do, so I'm not
>> going to repeat it. The most sensible thing to do is, in my humble
>> opinion, accept the decision of the majority.
>>
>> 2016-10-27 11:05 GMT+02:00 Davide Giannella :
>> On 27/10/2016 08:53, Michael Dürig wrote:
>>
>> +1.
>>
>> It would also help re. backporting, continuous integration, releasing,
>> testing, longevity, code reuse, maintainability, reducing technical
>> debt, deploying, stability, etc, etc...
>>
>> While I can agree on the above, and the fact that now we have
>> https://issues.apache.org/jira/browse/OAK-5007 in place, just for the
>> sake or argument I would say that if we want to have any part of Oak
>> with an independent release cycle we need to
>>
>> Have proper API packages that abstract things. Specially from oak-core
>>
>> As soon as we introduce a separate release cycle for a single module we
>> have to look at a wider picture. What other modules are affected?
>>
>> Taking the example of segment-tar we saw that we need
>>
>> - oak-core-api (name can be changed)
>> - independent releases of the oak tools: oak-run, oak-upgrade, ...
>> - independent release cycle for parent/pom.xml
>> - anything I'm missing?
>>
>> So if we want to go down that route than we have to do it properly and
>> for good. Not half-way.
>>
>> Davide
>>
>>


Re: svn commit: r1765583 - in /jackrabbit/oak/trunk: oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/ oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/property/strategy/ oak-cor

2016-10-21 Thread Julian Sedding
Thanks Chetan!

On Fri, Oct 21, 2016 at 2:59 PM, Chetan Mehrotra
 wrote:
> On Thu, Oct 20, 2016 at 6:08 PM, Julian Sedding  wrote:
>> I think we could get away with increasing this to 4.1.0 if we can
>> annotate QueryEngineSettingsMBean with @ProviderType.
>
> Makes sense. Opened OAK-4977 for that
>
> Chetan Mehrotra


Re: segment-tar depending on oak-core

2016-10-21 Thread Julian Sedding
> All of this is my understanding and I may be wrong, so please correct me
> if I'm wrong. I'm right, could adding an oak-core-api with independent
> lifecycle solve the situation?

While this may be possible, an arguably simpler solution would be to
give oak-run and oak-upgrade a separate lifecycle. They are consumers
of both segment-tar and oak-core (+ other bundles with same release
cycle). Hence they require interoperable releases of both *before*
they themselves can be released.

The other alternative, as Thomas mentioned, is to release everything
at once, including segment-tar.

Regards
Julian


On Fri, Oct 21, 2016 at 12:46 PM, Davide Giannella  wrote:
> Hello team,
>
> while integrating Oak with segment-tar in other products, I'm facing
> quite a struggle with a sort-of circular dependencies. We have
> segment-tar that depends on oak-core and then we have tools like oak-run
> or oak-upgrade which depends on both oak-core and segment-tar.
>
> this may not be an issue but in case of changes in the API, like for
> 1.5.12 we have the following situation. 1.5.12 has been released with
> segment-tar 0.0.14 but this mix doesn't actually work on OSGi
> environment as of API changes. On the other hand, in order to release
> 0.0.16 we need oak-core 1.5.12 with the changes.
>
> Now oak-run and other tools may fail, or at least be in an unknown
> situation.
>
> All of this is my understanding and I may be wrong, so please correct me
> if I'm wrong. I'm right, could adding an oak-core-api with independent
> lifecycle solve the situation?
>
> Davide
>
>


Re: [REVIEW] Configuration required for node bundling config for DocumentNodeStore - OAK-1312

2016-10-21 Thread Julian Sedding
+1 for initializing the default config unconditionally

Regards
Julian

On Fri, Oct 21, 2016 at 12:14 PM, Chetan Mehrotra
 wrote:
> Opened OAK-4975 for query around default config handling.
> Chetan Mehrotra
>
>
> On Fri, Oct 21, 2016 at 2:14 PM, Davide Giannella  wrote:
>> On 21/10/2016 08:23, Michael Marth wrote:
>>> Hi Chetan,
>>>
>>> Re “Should we ship with a default config”:
>>>
>>> I vote for a small default config:
>>> - default because: if the feature is always-on in trunk we will get better 
>>> insights in day-to-day work (as opposed to switching it on only 
>>> occasionally)
>>> - small because: the optimal bundling is probably very specific to the 
>>> application and its read-write patterns. Your suggestion to include nt:file 
>>> (and maybe rep:AccessControllable) looks reasonable to me, though.
>>>
>> +1 but I would not do it that DocumentNS has to actively register it. I
>> would have a plain RepositoryInitialiser always on beside the
>> InitialContent. So that it's clear it's somewhat different. In the end
>> as far as I understood it doesn't matter if we're running segment, tar
>> or Document. The config will affect only Document.
>>
>> Davide
>>
>>


Re: svn commit: r1765583 - in /jackrabbit/oak/trunk: oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/ oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/property/strategy/ oak-cor

2016-10-20 Thread Julian Sedding
> -@Version("4.0.0")
> +@Version("5.0.0")
>  @Export(optional = "provide:=true")
>  package org.apache.jackrabbit.oak.api.jmx;

I think we could get away with increasing this to 4.1.0 if we can
annotate QueryEngineSettingsMBean with @ProviderType. I.e. we don't
expect API consumers to implement QueryEngineSettingsMBean and
therefore the API change is irrelevant for them.

WDYT?

Regards
Julian



On Wed, Oct 19, 2016 at 2:20 PM,   wrote:
> Author: thomasm
> Date: Wed Oct 19 12:20:56 2016
> New Revision: 1765583
>
> URL: http://svn.apache.org/viewvc?rev=1765583&view=rev
> Log:
> OAK-4888 Warn or fail queries above a configurable cost value
>
> Added:
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/QueryOptions.java
> Modified:
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/QueryEngineSettingsMBean.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/package-info.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/property/strategy/ContentMirrorStoreStrategy.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/Query.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/QueryEngineSettings.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/QueryEngineSettingsMBeanImpl.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/QueryImpl.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/SQL2Parser.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/UnionQueryImpl.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/xpath/Statement.java
> 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/xpath/XPathToSQL2Converter.java
> 
> jackrabbit/oak/trunk/oak-core/src/test/java/org/apache/jackrabbit/oak/query/SQL2ParserTest.java
> 
> jackrabbit/oak/trunk/oak-core/src/test/java/org/apache/jackrabbit/oak/query/XPathTest.java
> 
> jackrabbit/oak/trunk/oak-jcr/src/test/java/org/apache/jackrabbit/oak/jcr/query/QueryTest.java
>
> Modified: 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/QueryEngineSettingsMBean.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/QueryEngineSettingsMBean.java?rev=1765583&r1=1765582&r2=1765583&view=diff
> ==
> --- 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/QueryEngineSettingsMBean.java
>  (original)
> +++ 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/QueryEngineSettingsMBean.java
>  Wed Oct 19 12:20:56 2016
> @@ -51,4 +51,19 @@ public interface QueryEngineSettingsMBea
>   */
>  void setLimitReads(long limitReads);
>
> +/**
> + * Whether queries that don't use an index will fail (throw an 
> exception).
> + * The default is false.
> + *
> + * @return true if they fail
> + */
> +boolean getFailTraversal();
> +
> +/**
> + * Set whether queries that don't use an index will fail (throw an 
> exception).
> + *
> + * @param failTraversal the new value for this setting
> + */
> +void setFailTraversal(boolean failTraversal);
> +
>  }
>
> Modified: 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/package-info.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/package-info.java?rev=1765583&r1=1765582&r2=1765583&view=diff
> ==
> --- 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/package-info.java
>  (original)
> +++ 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/api/jmx/package-info.java
>  Wed Oct 19 12:20:56 2016
> @@ -15,7 +15,7 @@
>   * limitations under the License.
>   */
>
> -@Version("4.0.0")
> +@Version("5.0.0")
>  @Export(optional = "provide:=true")
>  package org.apache.jackrabbit.oak.api.jmx;
>
>
> Modified: 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/property/strategy/ContentMirrorStoreStrategy.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/property/strategy/ContentMirrorStoreStrategy.java?rev=1765583&r1=1765582&r2=1765583&view=diff
> ==
> --- 
> jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/property/strategy/ContentMirrorStoreStrategy.java
>  (original)
> +++ 
> ja

Re: Possibility of making nt:resource unreferenceable

2016-10-12 Thread Julian Sedding
On Wed, Oct 12, 2016 at 11:24 AM, Bertrand Delacretaz
 wrote:
> On Wed, Oct 12, 2016 at 11:18 AM, Julian Sedding  wrote:
>> ...As a remedy for implementations that rely on the current referencable
>> nature, we could provide tooling that automatically adds the
>> "mix:referencable" mixin to existing nt:resource nodes...
>
> Good idea, I suppose this can be done with a commit hook in a non-intrusive 
> way?

For JR2 content being upgraded to Oak (or during an Oak to Oak
"sidegrade"), i.e. in the oak-upgrade module, it would be easy to add
this functionality via a commit hook.

For an existing Oak repository the same functionality could be
implemented on the JCR API and a full repo traversal, I suppose. If we
can get past the node-type validation. Alternatively we could come up
with an extension SPI/API that allows plugging in an implementation
for specific non-trivial node-type updates. This would even allow for
two alternative implementations: one that adds mix:referencable and
another that removes the jcr:uuid property - so JCR users could choose
which strategy they prefer.

Regards
Julian


>
> -Bertrand


Re: Possibility of making nt:resource unreferenceable

2016-10-12 Thread Julian Sedding
I'm with Julian R. on this (as I understand him). We should change the
node-type nt:resource to match the JCR 2.0 spec and deal with the
consequences.

Currently I am under the impression that we have no knowledge of what
*might* break, with varying opinions on the matter. Maybe we should try
to find out what *does* break.

As a remedy for implementations that rely on the current referencable
nature, we could provide tooling that automatically adds the
"mix:referencable" mixin to existing nt:resource nodes and recommend
adapting the code to add the mixin as well.

Regards
Julian


On Wed, Oct 12, 2016 at 11:04 AM, Carsten Ziegeler  wrote:
> The latest proposal was not about making nt:resource unreferenceable,
> but silently changing the resource type for a nt:resource child node of
> a nt:file node to Oak:Resource.
>
> I just found three other places in Sling where nt:file nodes are created
> by hand. So with any other mechanism we have to change a lot of places
> in Sling alone. Not to mention all downstream users.
>
> Carsten
>
> Thomas Mueller wrote
>> Hi,
>>
>> I agree with Julian, I think making nt:resource unreferenceable would
>> (hardcoding some "magic" in Oak) would lead to hard-to-find bugs and
>> problems.
>>
>>> So whatever solution we pick, there is a risk that existing code fails.
>>
>> Yes. But I think if we create a new nodetype, at least it would be easier
>> for users to understand the problem.
>>
>> Also, the "upgrade path" with a new nodetype is smoother. This can be done
>> incrementally, even thought it might mean more total work. But making
>> nt:resource unreferenceable would be a hard break, and I think risk of
>> bigger problems is higher.
>>
>> Regards,
>> Thomas
>>
>>
>>
>> On 07/10/16 12:05, "Julian Reschke"  wrote:
>>
>>> On 2016-10-07 10:56, Carsten Ziegeler wrote:
 Julian Reschke wrote
> On 2016-10-07 08:04, Carsten Ziegeler wrote:
>> ...
>> The easiest solution that comes to my mind is:
>>
>> Whenever a nt:resource child node of a nt:file node is created, it is
>> silently changed to oak:resource.
>>
>> Carsten
>> ...
>
> Observation: that might break code that actually wants a referenceable
> node: it would create the node, check for the presence of
> mix:referenceable, and then decide not to add it because it's already
> there.
>

 Well, there might be code that assumes that a file uploaded through
 webdav is using a resource child node that is referenceable.
 Or a file posted through the Sling POST servlet has this. Now, you could
 argue if that code did not create the file, it should check node types,
 but how likely is that if the code has history?

 So whatever solution we pick, there is a risk that existing code fails.
 ...
>>>
>>> That is true..
>>>
>>> However, my preference would be to only break code which is
>>> non-conforming right now. Code should not rely on nt:resource being
>>> referenceable (see
>>> section 3.7.11.5 nt:resource of the JCR 2.0 specification).
>>>
>>> So my preference would be to make that change and see what breaks (and
>>> get that fixed).
>>>
 ...
>>>
>>>
>>> Best regards, Julian
>>
>>
>
>
>
>
> --
> Carsten Ziegeler
> Adobe Research Switzerland
> cziege...@apache.org
>


Re: Datastore GC only possible after Tar Compaction

2016-10-05 Thread Julian Sedding
Thanks Amit for your insights.

Is it documented that DS GC is ineffective if no prior tar compaction
is performed? IMHO we should make this as clear as possible, because
the behaviour deviates from JR2 and thus has the potential to throw
off lots of users. We could even mention it as a possible reason in the
log message if DS GC was ineffective.

Would it be possible to improve the heuristic without traversing the
node tree? I.e. do the segment tar files contain sufficient
information in their indexes to safely determine that some binary
references are dead? I'm looking for no false positives but possibly
many false negatives.

Regards
Julian


On Mon, Oct 3, 2016 at 10:37 AM, Amit Jain  wrote:
> Hi,
>
> On Mon, Oct 3, 2016 at 1:29 PM, Julian Sedding  wrote:
>
>> I just became aware that on a system configured with SegmentNodeStore
>> and FileDatastore a Datastore garbage collection can only free up
>> space *after* a Tar Compaction was run.
>>
>>
> Yes that is a pre-requisite.
>
>
>> I would like to discuss whether it is desirable to require a Tar
>> Compaction prior to a DS GC. If someone knows about the rationale
>> behind this behaviour, I would also appreciate these insights!
>>
>> The alternative behaviour, which I would have expected, is to collect
>> only binaries that are referenced from the root NodeState or any of
>> the checkpoint's root NodeStates (i.e. "live" NodeStates).
>>
>> From an implementation perspective, I assume that the current
>> behaviour can be implemented with better performance than a solution
>> that checks only "live" NodeStates. However, IMHO that should not be
>> the only relevant factor in the discussion.
>>
>
> I believe the performance impact of loading all nodes to check whether the
> node has a binary property
> is quite high. What you are referring to was how it is implemented in
> Jackrabbit and
> the reference collection phase took days on larger repositories. But with
> the NodeStore specific implementation for
> blob reference collection this phase takes only a few hours. For example
> there is also an enhancement already implemented in oak-segment-tar
> to have the index of binary reference OAK-4201.
>
> Thanks
> Amit


Datastore GC only possible after Tar Compaction

2016-10-03 Thread Julian Sedding
Hi all

I just became aware that on a system configured with SegmentNodeStore
and FileDatastore a Datastore garbage collection can only free up
space *after* a Tar Compaction was run.

This behaviour is not immediately intuitive to me.

I would like to discuss whether it is desirable to require a Tar
Compaction prior to a DS GC. If someone knows about the rationale
behind this behaviour, I would also appreciate these insights!

The alternative behaviour, which I would have expected, is to collect
only binaries that are referenced from the root NodeState or any of
the checkpoint's root NodeStates (i.e. "live" NodeStates).
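
To illustrate the alternative behaviour, collecting only binaries
reachable from live NodeStates is essentially a mark-and-sweep. A toy
sketch over an in-memory tree (all class and method names are invented
for the illustration and do not correspond to Oak code):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BlobGcSketch {

    static final class Node {
        final List<Node> children = new ArrayList<>();
        final Set<String> blobRefs = new HashSet<>();
    }

    // Mark phase: collect every blob reference reachable from the given
    // roots (the current head plus the root of each checkpoint).
    static Set<String> markLive(List<Node> roots) {
        Set<String> live = new HashSet<>();
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            live.addAll(n.blobRefs);
            stack.addAll(n.children);
        }
        return live;
    }

    // Sweep phase: anything in the datastore that was not marked live
    // is garbage and may be deleted.
    static Set<String> sweep(Set<String> allBlobs, Set<String> live) {
        Set<String> garbage = new HashSet<>(allBlobs);
        garbage.removeAll(live);
        return garbage;
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.blobRefs.add("blob-a");
        Node child = new Node();
        child.blobRefs.add("blob-b");
        root.children.add(child);

        Set<String> all =
                new HashSet<>(Arrays.asList("blob-a", "blob-b", "blob-c"));
        Set<String> live = markLive(Collections.singletonList(root));
        System.out.println(sweep(all, live)); // prints [blob-c]
    }
}
```

The performance question in the thread is exactly the cost of the mark
phase when the tree is the whole repository.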

From an implementation perspective, I assume that the current
behaviour can be implemented with better performance than a solution
that checks only "live" NodeStates. However, IMHO that should not be
the only relevant factor in the discussion.

I'm looking forward to your feedback!

Regards
Julian


Re: [VOTE] Require JDK7 for Oak 1.4

2016-09-19 Thread Julian Sedding
+1

Regards
Julian

On Mon, Sep 19, 2016 at 11:22 AM, Michael Dürig  wrote:
>
>
> On 16.9.16 5:16 , Julian Reschke wrote:
>>
>> [X] +1 Yes, require JDK7 for Oak 1.4
>
>
> Michael


Re: Requirement to support multiple NodeStore instance in same setup (OAK-4490)

2016-06-21 Thread Julian Sedding
Hi Chetan

I agree that we should not rely on the service.ranking for this. A
type property makes sense IMO.

On the other hand, do we really need to expose both NodeStores in the
service registry? The secondary (cache) NodeStore could also be
treated as an implementation detail of the DocumentNodeStore and
switched on/off via configuration. Of course the devil is in the
detail then - how to configure different BlobStores, cache sizes etc
of the secondary NodeStore?

Not exposing the secondary NodeStore in the service registry would be
backwards compatible. Introducing the "type" property potentially
breaks existing consumers, i.e. is not backwards compatible.

Regards
Julian


On Tue, Jun 21, 2016 at 9:03 AM, Chetan Mehrotra
 wrote:
> Hi Team,
>
> As part of OAK-4180 feature around using another NodeStore as a local
> cache for a remote Document store I would need to register another
> NodeStore instance (for now a SegmentNodeStore - OAK-4490) with the
> OSGi service registry.
>
> This instance would then be used by SecondaryStoreCacheService to save
> NodeState under certain paths locally and use it later for reads.
>
> With this change we would have a situation where there would be
> multiple NodeStore instance in same service registry. This can confuse
> some component which have a dependency on NodeStore as a reference and
> we need to ensure they bind to correct NodeStore instance.
>
> Proposal A - Use a 'type' service property to distinguish
> ==
>
> Register the NodeStore with a 'type' property. For now the value can
> be 'primary' or 'secondary'. When any component registers the
> NodeStore it also provides the type property.
>
> On user side the reference needs to provide which type of NodeStore it
> needs to bound
>
> This would ensure that user of NodeStore get bound to correct type.
>
> if we use service.ranking then it can cause a race condition where the
> secondary instance may get bound until the primary comes up
>
> Looking for feedback on what approach to take
>
> Chetan Mehrotra


Re: Duplicate logic in oak-run commands

2016-05-05 Thread Julian Sedding
Hi Francesco

+1 for centralizing logic for creating a NodeStore instance.

I like the idea of encoding the description of a NodeStore instance in
a URI. This is both concise and extensible. We also need to consider
how to express the use of different DataStores, i.e. the URI
should ideally describe a complete setup.
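
As a purely hypothetical illustration of such URIs — the schemes and
query parameters shown here are invented for the sketch and are not the
syntax proposed in OAK-4349:

```java
import java.net.URI;

public class StoreUriSketch {

    // Map a hypothetical store URI to a human-readable description.
    static String describe(URI uri) {
        String scheme = uri.getScheme();
        if ("segment".equals(scheme)) {
            return "SegmentNodeStore at " + uri.getPath();
        }
        if ("mongodb".equals(scheme)) {
            // query parameters could select e.g. the DataStore flavour,
            // covering the "complete setup" aspect mentioned above
            String q = uri.getQuery();
            return "DocumentNodeStore on " + uri.getHost()
                    + (q == null ? "" : " (" + q + ")");
        }
        return "unknown store: " + uri;
    }

    public static void main(String[] args) {
        System.out.println(
                describe(URI.create("segment:/repo/segmentstore")));
        System.out.println(
                describe(URI.create("mongodb://localhost/oak?blobStore=file")));
    }
}
```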

Regards
Julian





On Thu, May 5, 2016 at 10:42 AM, Francesco Mari
 wrote:
> Hi all,
>
> While looking into OAK-4246 I figured out that many commands in oak-run
> implement the same logic over and over to create instance of NodeStore from
> command line arguments and options.
>
> Thus, I created OAK-4349 to propose another approach to the problem. A
> connection to a specific NodeStore could be specified by using an URI and
> the logic to create NodeStore instances could be implemented in a single
> place and reused from every command.
>
> I proposed some examples in OAK-4349. I'm looking forward to hearing what
> you think about this suggestion.


Re: [VOTE] Please vote for the final name of oak-segment-next

2016-04-26 Thread Julian Sedding
Hi

+1 for oak-segment-file or oak-segment-tar.

+0 for oak-segment-store. We *may* implement another segment-based
persistence later, in which case having the persistence strategy in
the name sounds like a good idea to me.

Similarly, a later refactoring of the document store could lead to
oak-document-mongo and oak-document-rdb (plus possibly
oak-document-spi for shared stuff).

Regards
Julian


On Tue, Apr 26, 2016 at 2:00 PM, Thomas Mueller  wrote:
> Hi,
>
> I would keep the "oak-segment-*" name, so that it's clear what it is based
> on. So:
>
> -1 oak-local-store
> -1 oak-embedded-store
>
> +1 oak-segment-*
>
> Within the oak-segment-* options, I don't have a preference.
>
> Regards,
> Thomas
>
>
> On 25/04/16 16:46, "Michael Dürig"  wrote:
>
>>
>>Hi,
>>
>>There is a couple of names that came up in the discussion [1]:
>>
>>oak-local-store
>>oak-segment-file
>>oak-embedded-store
>>oak-segment-store
>>oak-segment-tar
>>oak-segment-next
>>
>>Please vote which of the above six options you would like to see as the
>>final name for oak-segment-next [2]:
>>
>>Put +1 next to those names that you favour, put -1 to veto names and
>>remove the remaining names. Please justify any veto as otherwise it is
>>non binding.
>>
>>The name with the most +1 votes and without any -1 vote will be chosen.
>>
>>The vote is open for the next 72 hours.
>>
>>Michael
>>
>>
>>[1] http://markmail.org/thread/ktk7szjxtucpqd2o
>>[2] https://issues.apache.org/jira/browse/OAK-4245
>


Re: Increase language level to Java 7

2016-04-07 Thread Julian Sedding
We could enforce java6 signatures for the branches using the
animal-sniffer-maven-plugin. This should help detect bogus backports
quickly.
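
For illustration, a minimal configuration sketch along those lines. The
plugin and signature coordinates are written from memory and would need
verifying; the "java16" signature artifact describes the Java 6 API.

```xml
<!-- Sketch only: coordinates and version numbers are illustrative. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>animal-sniffer-maven-plugin</artifactId>
  <version>1.15</version>
  <configuration>
    <signature>
      <groupId>org.codehaus.mojo.signature</groupId>
      <artifactId>java16</artifactId>
      <version>1.1</version>
    </signature>
  </configuration>
  <executions>
    <execution>
      <id>check-java6-signatures</id>
      <phase>verify</phase>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```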

Regards
Julian

On Thu, Apr 7, 2016 at 10:57 AM, Francesco Mari
 wrote:
> Language features would be available for new, backport-free developments.
> Existing code doesn't have to use those features if they would be an issue
> during backports.
>
> 2016-04-07 10:25 GMT+02:00 Davide Giannella :
>
>> On 06/04/2016 15:25, Francesco Mari wrote:
>> > I was talking about trunk, of course. Developers working in areas where
>> > backports are the norm have to carefully consider if and when using Java
>> 7
>> > language features would be appropriate. New portions of the codebase
>> could
>> > use of the new features freely.
>> >
>>
>> We were discussing this on chat. Generally I'd say +1 for trunk but we
>> risk to introduce problems for backports.
>>
>> Davide
>>
>>
>>


Re: Increase language level to Java 7

2016-04-06 Thread Julian Sedding
+1

Regards
Julian

On Wed, Apr 6, 2016 at 2:27 PM, Francesco Mari  wrote:
> Hi all,
>
> some months ago we decided to drop support for Java 1.6 [1]. What about
> increasing the language level of the compiler so as to be able to use the
> new features in Java 7?
>
> [1]: http://jackrabbit.markmail.org/thread/t3gzwi25tcz6masg


Re: [VOTE] Release Apache Jackrabbit Oak 1.5.0

2016-03-30 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.5.0

Regards
Julian

On Tue, Mar 29, 2016 at 3:07 PM, Alex Parvulescu
 wrote:
> [X] +1 Release this package as Apache Jackrabbit Oak 1.5.0
>
> best,
> alex
>
> On Tue, Mar 29, 2016 at 10:57 AM, Amit Jain  wrote:
>
>> A candidate for the Jackrabbit Oak 1.5.0 release is available at:
>>
>> https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.5.0/
>>
>> The release candidate is a zip archive of the sources in:
>>
>>
>> https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.5.0/
>>
>> The SHA1 checksum of the archive is
>> 1c4b3a95c8788a80129c1b7efb7dc38f4d19bd08.
>>
>> A staged Maven repository is available for review at:
>>
>> https://repository.apache.org/
>>
>> The command for running automated checks against this release candidate is:
>>
>> $ sh check-release.sh oak 1.5.0
>> 1c4b3a95c8788a80129c1b7efb7dc38f4d19bd08
>>
>> Please vote on releasing this package as Apache Jackrabbit Oak 1.5.0.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Jackrabbit PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Jackrabbit Oak 1.5.0
>> [ ] -1 Do not release this package because...
>>
>> My vote is +1.
>>
>> Thanks
>> Amit
>>


Re: [VOTE] Release Apache Jackrabbit Oak 1.4.1

2016-03-30 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.4.1

Regards
Julian

On Wed, Mar 30, 2016 at 1:52 PM, Julian Reschke  wrote:
> On 2016-03-24 15:32, Davide Giannella wrote:
>>
>> ...
>
>
> [X] +1 Release this package as Apache Jackrabbit Oak 1.4.1
>
> Best regards, Julian
>


Re: [VOTE] Release Apache Jackrabbit Oak 1.4.0 (take 3)

2016-03-07 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.4.0

Regards
Julian

On Mon, Mar 7, 2016 at 11:51 AM, Davide Giannella  wrote:
> A candidate for the Jackrabbit Oak 1.4.0 release is available at:
>
> https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.4.0/
>
> The release candidate is a zip archive of the sources in:
>
>
> https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.4.0/
>
> The SHA1 checksum of the archive is
> 483493eacea4c64a6a568982058996d745ad4e18.
>
> A staged Maven repository is available for review at:
>
> https://repository.apache.org/
>
> The command for running automated checks against this release candidate is:
>
> $ sh check-release.sh oak 1.4.0 483493eacea4c64a6a568982058996d745ad4e18
>
> Please vote on releasing this package as Apache Jackrabbit Oak 1.4.0.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.
>
> [ ] +1 Release this package as Apache Jackrabbit Oak 1.4.0
> [ ] -1 Do not release this package because...
>
> Davide
>


Re: svn commit: r1733315 - /jackrabbit/oak/branches/1.4/RELEASE-NOTES.txt

2016-03-03 Thread Julian Sedding
Nitpick:

> +Changes in Oak 1.2.0
IMHO that should be "Changes in Oak 1.4.0"

This probably doesn't warrant a re-release, but if there *is* a
re-release due to OAK-4085[0] it would be nice to correct it.

Regards
Julian

[0] https://issues.apache.org/jira/browse/OAK-4085

On Wed, Mar 2, 2016 at 4:45 PM,   wrote:
> Author: davide
> Date: Wed Mar  2 15:45:23 2016
> New Revision: 1733315
>
> URL: http://svn.apache.org/viewvc?rev=1733315&view=rev
> Log:
> OAK-4073 - Release Oak 1.4.0
>
> release notes
>
>
> Modified:
> jackrabbit/oak/branches/1.4/RELEASE-NOTES.txt
>
> Modified: jackrabbit/oak/branches/1.4/RELEASE-NOTES.txt
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/branches/1.4/RELEASE-NOTES.txt?rev=1733315&r1=1733314&r2=1733315&view=diff
> ==
> --- jackrabbit/oak/branches/1.4/RELEASE-NOTES.txt (original)
> +++ jackrabbit/oak/branches/1.4/RELEASE-NOTES.txt Wed Mar  2 15:45:23 2016
> @@ -1,4 +1,4 @@
> -Release Notes -- Apache Jackrabbit Oak -- Version 1.3.16
> +Release Notes -- Apache Jackrabbit Oak -- Version 1.4.0
>
>  Introduction
>  
> @@ -7,32 +7,809 @@ Jackrabbit Oak is a scalable, high-perfo
>  repository designed for use as the foundation of modern world-class
>  web sites and other demanding content applications.
>
> -Apache Jackrabbit Oak 1.3.16 is an unstable release cut directly from
> -Jackrabbit Oak trunk, with a focus on new features and other
> -improvements. For production use we recommend the latest stable 1.2.x
> -release.
> +Jackrabbit Oak 1.4 is an incremental feature release based on and
> +compatible with earlier stable Jackrabbit Oak 1.x releases. Jackrabbit
> +Oak 1.4.x releases are considered stable and targeted for production
> +use.
>
>  The Oak effort is a part of the Apache Jackrabbit project.
>  Apache Jackrabbit is a project of the Apache Software Foundation.
>
> -Changes in Oak 1.3.16
> +Changes in Oak 1.2.0
>  -
>
>  Sub-task
>
> +[OAK-318] - Excerpt support
> +[OAK-1708] - extend DocumentNodeStoreService to support
> +RDBPersistence
> +[OAK-1828] - Improved SegmentWriter
> +[OAK-1860] - unit tests for concurrent DocumentStore access
> +[OAK-1940] - memory cache for RDB persistence
> +[OAK-2008] - authorization setup for closed user groups
> +[OAK-2171] - oak-run should support repository upgrades with all
> +available options
> +[OAK-2410] - [sonar]Some statements not being closed in
> +RDBDocumentStore
> +[OAK-2502] - Provide initial implementation of the Remote
> +Operations specification
> +[OAK-2509] - Support for faceted search in query engine
> +[OAK-2510] - Support for faceted search in Solr index
> +[OAK-2511] - Support for faceted search in Lucene index
> +[OAK-2512] - ACL filtering for faceted search
> +[OAK-2630] - Cleanup Oak jobs on buildbot
> +[OAK-2634] - QueryEngine should expose name query as property
> +restriction
> +[OAK-2700] - Cleanup usages of mk-api
> +[OAK-2701] - Move oak-mk-api to attic
> +[OAK-2702] - Move oak-mk to attic
> +[OAK-2747] - Admin cannot create versions on a locked page by
> +itself
> +[OAK-2756] - Move mk-package of oak-commons to attic
> +[OAK-2760] - HttpServer in Oak creates multiple instance of
> +ContentRepository
> +[OAK-2770] - Configurable mode for backgroundOperationLock
> +[OAK-2781] - log node type changes and the time needed to traverse
> +the repository
> +[OAK-2813] - Create a benchmark for measuring the lag of async
> +index
> +[OAK-2826] - Refactor ListeneableFutureTask to commons
> +[OAK-2828] - Jcr builder class does not allow overriding most of
> +its dependencies
> +[OAK-2850] - Flag states from revision of an external change
> +[OAK-2856] - improve RDB diagnostics
> +[OAK-2901] - RDBBlobStoreTest should be able to run against
> +multiple DB types
> +[OAK-2915] - add (experimental) support for Apache Derby
> +[OAK-2916] - RDBDocumentStore: use of "GREATEST" in SQL apparently
> +doesn't have test coverage in unit tests
> +[OAK-2918] - RDBConnectionHandler: handle failure on setReadOnly()
> +gracefully
> +[OAK-2923] - RDB/DB2: change minimal supported version from 10.5
> +to 10.1, also log decimal version numbers as well
> +[OAK-2930] - RDBBlob/DocumentStore throws NPE when used after
> +being closed
> +[OAK-2931] - RDBDocumentStore: mitigate effects of large query
> +result sets
> +[OAK-2940] - RDBDocumentStore: "set" operation on _modified
> +appears to be implemented as "max"
> +[OAK-2943] - Support measure for union queries
> +[OAK-2944] - Support merge iterator for union order by queries
> +[OAK-2949] - RDBDocumentStore: no custom SQL needed for GREATEST
> +[OAK-2950] - RDBDocumentStore: conditional fetch logic is reversed
> +[OAK-2952] - RDBConnectionHandler: log 

Re: testing blob equality

2016-03-01 Thread Julian Sedding
Thanks for creating the ticket, I'll take a stab at it. As a bonus we could
include a utility to generate an initial cache file.

Regards
Julian

On Monday, February 29, 2016, Tomek Rekawek  wrote:

> Hi Julian,
>
> > On 29 Feb 2016, at 15:40, Julian Sedding  > wrote:
> >
> > Should we automatically wrap the DS in the LengthCachingDatastore in
> > oak-upgrade? Or provide an option for the cache-file path, which turns
> > it on if set?
>
> Good idea. I think we should enable it by default (to limit the number of
> parameters required to perform a “standard” migration). We may also add the
> parameter to change cache-file location. I created OAK-4074 to track this.
>
> Best regards,
> Tomek
>
> --
> Tomek Rękawek | Adobe Research | www.adobe.com
> reka...@adobe.com 
>
>
>


Re: testing blob equality

2016-02-29 Thread Julian Sedding
Yes, the LengthCachingDataStore is exactly the way to go. You need to
wrap the original datastore in the length caching datastore (using the
repository.xml). The LengthCachingDataStore not only caches the
length, but (for the FileDataStore at least) it also prevents a call
to File.exists(). These add up on the FS and I expect even more so on
S3.

Should we automatically wrap the DS in the LengthCachingDatastore in
oak-upgrade? Or provide an option for the cache-file path, which turns
it on if set?

Regards
Julian


On Mon, Feb 29, 2016 at 3:17 PM, Tomek Rekawek  wrote:
> Thanks Chetan, I haven’t noticed the length() invocation in the createBlob(). 
> It seems that the LengthCachingDataStore is something I was looking for.
>
> Best regards,
> Tomek
>
> --
> Tomek Rękawek | Adobe Research | www.adobe.com
> reka...@adobe.com
>
>> On 29 Feb 2016, at 14:35, Chetan Mehrotra  wrote:
>>
>> On Mon, Feb 29, 2016 at 6:42 PM, Tomek Rekawek  wrote:
>>> I wonder if we can switch the order of length and identity comparison in 
>>> AbstractBlob#equal() method. Is there any case in which the 
>>> getContentIdentity() method will be slower than length()?
>>
>> That can be switched but I am afraid that it would not work as
>> expected. In JackrabbitNodeState#createBlob determining the
>> contentIdentity involves determining the length. You can give
>> org.apache.jackrabbit.oak.upgrade.blob.LengthCachingDataStore a try
>> (See OAK-2882 for details)
>>
>> Chetan Mehrotra
>
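
A simplified model of the comparison order discussed in this thread.
SimpleBlob and its fields are invented stand-ins, not the Oak Blob API;
the point is that checking the content identity first can avoid the
expensive length() call entirely.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlobEqualityDemo {

    static final class SimpleBlob {
        final byte[] data;
        final String contentIdentity; // e.g. a content hash; may be null

        SimpleBlob(byte[] data, String contentIdentity) {
            this.data = data;
            this.contentIdentity = contentIdentity;
        }

        long length() { return data.length; }
        InputStream newStream() { return new ByteArrayInputStream(data); }
    }

    // Identity is checked before length: when both blobs expose the same
    // identity we short-circuit without touching length(), which matters
    // when length() is a call to the filesystem or S3.
    static boolean equal(SimpleBlob a, SimpleBlob b) throws IOException {
        if (a.contentIdentity != null
                && a.contentIdentity.equals(b.contentIdentity)) {
            return true;
        }
        if (a.length() != b.length()) {
            return false;
        }
        InputStream x = a.newStream();
        InputStream y = b.newStream();
        int i, j;
        do {
            i = x.read();
            j = y.read();
            if (i != j) {
                return false;
            }
        } while (i != -1);
        return true;
    }

    public static void main(String[] args) throws IOException {
        SimpleBlob a = new SimpleBlob("hello".getBytes(), "hash-1");
        SimpleBlob b = new SimpleBlob("hello".getBytes(), "hash-1");
        SimpleBlob c = new SimpleBlob("world".getBytes(), null);
        System.out.println(equal(a, b)); // true, via identity short-circuit
        System.out.println(equal(a, c)); // false, same length but different bytes
    }
}
```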


Re: oak-upgrade test failures (was Re: Oak 1.3.16 release plan)

2016-02-15 Thread Julian Sedding
The test failures in the issue seem to suggest that this may be related to
simple versionables. IIRC we recently added support for some broken JR2
constructs. Could they have been fixed in the last JR release? If that's
the case it may no longer be possible to populate the source repository for
the tests.

Just pure guesses, but I thought it might help.

Regards
Julian


On Monday, February 15, 2016, Davide Giannella  wrote:

> On 12/02/2016 18:36, Manfred Baedke wrote:
> > Hi,
> >
> > This is due to change 1721196 (associated with JCR-2633), which
> > changes the persistent data model. Probably the test has just to be
> > tweaked accordingly, I'll look into it during WE.
> Thank you very much Manfred.
>
> I've filed https://issues.apache.org/jira/browse/OAK-4018 to keep track
> and block 1.3.16.
>
> From here, once it's fixed in JR we have potentially 2 options:
>
> 1) unlock 1.3.16 by downgrading to JR 2.11.3
> 2) release JR 2.12.1, upgrade it in Oak, release 1.3.16. Which will bring
> the Oak release around 4-5 days late.
>
> I'm for two as it will give us more coverage around the inclusion of the
> new stable JR release.
>
> Thoughts?
>
> Davide
>
>
>


Re: Anchor tags on doc pages get positioned wrongly under top menu

2016-02-14 Thread Julian Sedding
Hi Vikas

I agree that having the anchor text hidden is a usability hazard. I
tried your suggested approach in Firefox (via FireBug) and didn't have
any success. However, a slight variation of the scheme, still relying
on the ":target" pseudo selector, did the trick for me.

h2 > a:target {
position: relative;
top: -40px;
}

I scoped the rule to the "h2" element, which is defined to have a
height of 40px. I think it's then ok to repeat this value.

Regards
Julian


On Fri, Feb 12, 2016 at 6:24 PM, Vikas Saurabh  wrote:
> Hi,
>
> I'm sure we all have noticed that our anchor tags scroll the page a
> little too much such that the actual position gets hidden under the
> same menu.
>
> With google and this link [0], it seems, we can just plug-in
>
> ```
> :target:before {
> content:"";
> display:block;
> height:40px; /* fixed header height*/
> margin:-40px 0 0; /* negative fixed header height */
> }
> ```
> in oak-doc/src/site/resources/site.css to fix the issue.
>
> But, since I suck at html/css, I wasn't sure if this is fine. '40px'
> is manual hit-and-trial. Is there something better?
>
> Thanks,
> Vikas
>
> [0]: 
> https://www.itsupportguides.com/tech-tips-tricks/how-to-offset-anchor-tag-link-using-css/


Re: [VOTE] Release Apache Jackrabbit Oak 1.3.15

2016-02-04 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.3.15

Regards
Julian

On Wed, Feb 3, 2016 at 10:44 PM, Alex Parvulescu
 wrote:
> [X] +1 Release this package as Apache Jackrabbit Oak 1.3.15
>
> On Wed, Feb 3, 2016 at 5:00 PM, Davide Giannella  wrote:
>
>> A candidate for the Jackrabbit Oak 1.3.15 release is available at:
>>
>> https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.3.15/
>>
>> The release candidate is a zip archive of the sources in:
>>
>>
>> https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.3.15/
>>
>> The SHA1 checksum of the archive is
>> aba9e0aea9400edb47eb498f63bede1969a31132.
>>
>> A staged Maven repository is available for review at:
>>
>> https://repository.apache.org/
>>
>> The command for running automated checks against this release candidate is:
>>
>> $ sh check-release.sh oak 1.3.15
>> aba9e0aea9400edb47eb498f63bede1969a31132
>>
>> Please vote on releasing this package as Apache Jackrabbit Oak 1.3.15.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Jackrabbit PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Jackrabbit Oak 1.3.15
>> [ ] -1 Do not release this package because...
>>
>> Davide
>>


Re: svn commit: r1727297 - /jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java

2016-01-30 Thread Julian Sedding
Right yes. All good then!

On Sat, Jan 30, 2016 at 2:18 PM, Alex Parvulescu
 wrote:
> you missed a couple of lines up:
>
>> -System.out.println("Debug " + args[0]);
>> +   System.out.println("Debug " + file);
>
>
> On Fri, Jan 29, 2016 at 5:41 PM, Julian Sedding  wrote:
>
>> > + System.out.println("Debug " + file);
>>
>> Is this on purpose or an oversight?
>>
>> Regards
>> Julian
>>
>> On Thu, Jan 28, 2016 at 10:56 AM,   wrote:
>> > Author: alexparvulescu
>> > Date: Thu Jan 28 09:56:31 2016
>> > New Revision: 1727297
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1727297&view=rev
>> > Log:
>> > OAK-3928 oak-run debug should use a read-only store
>> >
>> > Modified:
>> >
>>  
>> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>> >
>> > Modified:
>> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>> > URL:
>> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java?rev=1727297&r1=1727296&r2=1727297&view=diff
>> >
>> ==
>> > ---
>> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>> (original)
>> > +++
>> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>> Thu Jan 28 09:56:31 2016
>> > @@ -839,12 +839,9 @@ public final class Main {
>> >  System.exit(1);
>> >  } else {
>> >  // TODO: enable debug information for other node store
>> implementations
>> > -System.out.println("Debug " + args[0]);
>> >  File file = new File(args[0]);
>> > -FileStore store = newFileStore(file)
>> > -.withMaxFileSize(256)
>> > -.withMemoryMapping(false)
>> > -.create();
>> > +System.out.println("Debug " + file);
>> > +ReadOnlyStore store = new ReadOnlyStore(file);
>> >  try {
>> >  if (args.length == 1) {
>> >  debugFileStore(store);
>> >
>> >
>>


Re: svn commit: r1727297 - /jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java

2016-01-29 Thread Julian Sedding
> + System.out.println("Debug " + file);

Is this on purpose or an oversight?

Regards
Julian

On Thu, Jan 28, 2016 at 10:56 AM,   wrote:
> Author: alexparvulescu
> Date: Thu Jan 28 09:56:31 2016
> New Revision: 1727297
>
> URL: http://svn.apache.org/viewvc?rev=1727297&view=rev
> Log:
> OAK-3928 oak-run debug should use a read-only store
>
> Modified:
> 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>
> Modified: 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java?rev=1727297&r1=1727296&r2=1727297&view=diff
> ==
> --- 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>  (original)
> +++ 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>  Thu Jan 28 09:56:31 2016
> @@ -839,12 +839,9 @@ public final class Main {
>  System.exit(1);
>  } else {
>  // TODO: enable debug information for other node store 
> implementations
> -System.out.println("Debug " + args[0]);
>  File file = new File(args[0]);
> -FileStore store = newFileStore(file)
> -.withMaxFileSize(256)
> -.withMemoryMapping(false)
> -.create();
> +System.out.println("Debug " + file);
> +ReadOnlyStore store = new ReadOnlyStore(file);
>  try {
>  if (args.length == 1) {
>  debugFileStore(store);
>
>


Re: [VOTE] Shut down oakcomm...@jackrabbit.apache.org mailing list

2015-12-08 Thread Julian Sedding
[X] +1, shut down oakcomm...@jackrabbit.apache.org

Regards
Julian



On Tuesday, December 8, 2015, Michael Dürig  wrote:

>
> Hi,
>
> NOTE: this vote is about oakcomm...@jackrabbit.apache.org. NOT about
> oak-comm...@jackrabbit.apache.org. Mind the dash!
>
> It is unknown how oakcomm...@jackrabbit.apache.org came into existence
> and it was most likely by human error. The list archives are empty [1] and
> I guess most if not all of you didn't even know of its existence. I only
> learned of it recently through the reporter tool [2].
>
> I'm thus proposing to shut that list down but we need to agree consensus
> through this list [3]. Therefore, please vote:
>
> [ ] +1, shut down oakcomm...@jackrabbit.apache.org
> [ ] -1, do not shut down oakcomm...@jackrabbit.apache.org
>
> The vote is open for 72h.
>
> Michael
>
> [1]
> http://mail-archives.apache.org/mod_mbox/jackrabbit-oakcommits/index.html_
> [2] https://reporter.apache.org/#mailinglists_jackrabbit
> [3]
> https://issues.apache.org/jira/browse/INFRA-10916?focusedCommentId=15047031&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15047031
>


Re: Segment Store modularization

2015-12-08 Thread Julian Sedding
I agree with Francesco. SNFE should be an implementation detail of the
Segment bundle. If any code outside of this module depends on SNFE in
order to handle it differently, I would consider that a leaked
abstraction. The special handling should instead be moved into the
Segment bundle (which may not be trivial and could require API
changes/additions).

IMHO, that's how modularization can help drive good APIs. Together
with baselining and import/export packages, violations of module
boundaries become visible.

Regards
Julian

On Tue, Dec 8, 2015 at 10:32 AM, Michael Dürig  wrote:
>>> IMO SNFE should be exported so upstream projects can depend on it.
>>> Otherwise there is no value in throwing a specific exception in the first
>>> place.
>>>
>>>
>> My goal is to move the Segment Store into its own bundle without having
>> circular dependencies between this new bundle and oak-core. I could have
>> tried to create two bundles - one with the exported API of the Segment
>> Store and one with its implementation - but I prefer not to go this way at
>> the moment. Defining a proper Segment Store API seems to require a
>> refactoring way deeper than the one I'm doing, and I'm not sure if we want
>> to go head first into this task, given the current changes currently in
>> progress on the Segment Store.
>
>
> Right, makes sense. Can we come up with a different way of (somewhat)
> reliable conveying a SNFE up the stack so interested parties could hook into
> it?
>
> Michael
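
A minimal sketch of the boundary-translation pattern discussed above: the
internal exception stays private to the segment bundle, and the module
boundary rethrows it as an exported API-level type. All class and method
names below are illustrative assumptions, not actual Oak API.

```java
public class BoundaryDemo {
    // implementation-private exception (would not be exported by the bundle)
    static class SegmentNotFoundException extends RuntimeException {
        SegmentNotFoundException(String m) { super(m); }
    }

    // hypothetical API-level exception exported by the module
    public static class RepositoryInconsistentException extends RuntimeException {
        public RepositoryInconsistentException(Throwable cause) { super(cause); }
    }

    // internal read; throws the implementation-specific exception
    static String readRecord(boolean missing) {
        if (missing) throw new SegmentNotFoundException("segment not found");
        return "record";
    }

    // the module boundary translates the internal exception into the API one,
    // so clients only ever depend on exported types
    public static String read(boolean missing) {
        try {
            return readRecord(missing);
        } catch (SegmentNotFoundException e) {
            throw new RepositoryInconsistentException(e);
        }
    }
}
```

With this shape, the special handling interested parties need can hook into
the exported exception without the implementation package ever leaking.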


Re: fixVersions in jira

2015-12-08 Thread Julian Sedding
+1 - Setting the next release as fixVersion for blockers makes sense.
For all other issues setting the fixVersion once it is fixed seems
more sensible.

Regards
Julian

On Tue, Dec 8, 2015 at 8:27 AM, Marcel Reutegger  wrote:
> On 07/12/15 14:55, "Davide Giannella" wrote:
>>The process I'm proposing is:
>>
>>- fixVersion = 1.4
>>- fix it
>>- fixVersion = 1.3.x
>
> +1
>
> Regards
>  Marcel
>


Re: LuceneIndexEditor message in error.log

2015-11-23 Thread Julian Sedding
Hi Chris

Two questions:

1. What is the property type of
/content/michigan-lsa/classics/en/graduate-students/current-students/jcr:content/cq:lastModified?
2. Do you have an idea what the index definition looks like that is
used for indexing this property?

Regards
Julian


On Mon, Nov 23, 2015 at 6:38 PM, Chris  wrote:
> Hello. We are running Oak v1.2.4 in AEM6.1. After upgrading to AEM6.1 I see
> the error message below repeated 1000+ times a day in the error log. The
> logs refer to many different content nodes, not just this example. We have
> also have already requested Daycare support. The response indicated that
> this log message might be related to OAK-3020 [1]. So far the system
> continues to operate, but it seems like a major warning sign if indexing
> continues to fail at this rate. Can anyone provide suggestions?
>
> 19.11.2015 10:08:37.561 *WARN* [pool-9-thread-4]
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor Failed to
> index the node
> [/content/michigan-lsa/classics/en/graduate-students/current-students]
> java.lang.IllegalArgumentException: DocValuesField
> ":dvjcr:content/cq:lastModified" appears more than once in this document
> (only one value is allowed per field)
> at
> org.apache.lucene.index.NumericDocValuesWriter.addValue(NumericDocValuesWriter.java:54)
> at
> org.apache.lucene.index.DocValuesProcessor.addNumericField(DocValuesProcessor.java:153)
> at
> org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:66)
> at
> org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36)
> at
> org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236)
> at
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
> at
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:455)
> at
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1534)
> at
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507)
> at
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.addOrUpdate(LuceneIndexEditor.java:302)
> at
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:198)
> at
> org.apache.jackrabbit.oak.spi.commit.CompositeEditor.leave(CompositeEditor.java:74)
> at
> org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63)
> at
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:153)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:418)
> at
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:418)
> at
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:418)
> at
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:418)
> at
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:418)
> at
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord$2.childNodeChanged(MapRecord.java:403)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord$3.childNodeChanged(MapRecord.java:444)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:487)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:436)
> at
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:394)
> at
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
> at
> org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:376)
> at
> org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpda

Re: Lucene auto-tune of cost

2015-11-04 Thread Julian Sedding
Hi Ian

Thanks for the informative response. I can see how mapping Lucene
implementation details and assumptions to a clustered storage can be
challenging. So on TarMK having synchronous Lucene indexes should be
fine, while on DocumentMK it could lead to a degradation of I/O and
potentially a lot of commit conflicts/retries.

Separating text-extraction from indexing sounds interesting!

Regards
Julian




On Wed, Nov 4, 2015 at 12:07 PM, Ian Boston  wrote:
> Hi,
> Slightly off topic response:
>
> With the current indexing scheme: (IIUC).
> One factor is that with shared index files, indexing can only be performed
> on a cluster leader, and for updates the lucene segments must be written to
> the repository to be read by other instances in the cluster. That means a
> hard lucene commit. If the indexing is sync, then that will mean a large
> number of hard lucene commits, which generally leads to either latency or
> lots of IO or lots of segments. Hence Async is more efficient.
>
> If all lucene indexing is performed locally and the segments are not
> shared, sync indexing works without issue as updates can be written to a
> write ahead log, then added to the index with a soft commit, and the wal
> adjusted on periodic hard commits. local indexing is viable using the
> current scheme in a standalone environment.
>
> text extraction should ideally happen as a 1 time operation on immutable
> content bodies, the result being stored as metadata of the content body.
> imho it should be a separate operation from index update which should only
> deal with indexing properties, including a already tokenized stream.
> Tokenizing can be extremely resource expensive, especially with bad
> content, like vector remastered pdfs, hence why it should not block index
> updates.
>
> Best Regards
> Ian
>
>
>
>
>
>
> On 4 November 2015 at 10:37, Julian Sedding  wrote:
>
>> Slightly off topic: why is/should Lucene Indexes always be async? I
>> understand that requirement for a full-text index, which may need to
>> do (slow) text-extraction. However, updates on a Lucene-based property
>> index are usually very fast. So it is not obvious to me why they
>> should not be synchronous.
>>
>> Thanks for any enlightening replies!
>>
>> Regards
>> Julian
>>
>> On Wed, Nov 4, 2015 at 9:49 AM, Ian Boston  wrote:
>> > On 4 November 2015 at 00:45, Davide Giannella  wrote:
>> >
>> >> Hello Team,
>> >>
>> >> Lucene index is always asynchronous and the async index could lag behind
>> >> by definition.
>> >>
>> >> Sometimes we could have the same query better served by a property
>> >> index, or traversing for example. In case the async index is lagging
>> >> behind it could be that the traversing index is better suited to return
>> >> the information as it will be more updated.
>> >>
>> >> As we know we run an async update every 5 seconds, we could come up with
>> >> some algorithm to be used on the cost computing, that auto correct with
>> >> some math the cost, increasing it the more the time passed since the
>> >> last full execution of async index.
>> >>
>> >> WDYT?
>> >>
>> >
>> >
>> > Going down the property index route, for a DocumentMK instance will bloat
>> > the DocumentStore further. That already consumes 60% of a production
>> > repository and like many in DB inverted indexes is not an efficient
>> storage
>> > structure. It's probably ok for TarMK.
>> >
>> > Traversals are a problem for production. They will create random outages
>> > under any sort of concurrent load.
>> >
>> > ---
>> > If the way the indexing was performed is changed, it could make the index
>> > NRT or real time depending on your point of view. eg. Local indexes, each
>> > Oak index in the cluster becoming a shard with replication to cover
>> > instance unavailability. No more indexing cycles, soft commits with each
>> > instance using a FS Directory and a update queue replacing the async
>> > indexing queue. Query by map reduce. It might have to copy on write to
>> seed
>> > new instances where the number of instances falls below 3.
>> >
>> >
>> >
>> > Best Regards
>> > Ian
>> >
>> >
>> >
>> >>
>> >> Davide
>> >>
>>
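
The write-ahead-log scheme Ian describes (cheap durable appends, frequent
soft commits for visibility, periodic hard commits that truncate the log)
can be sketched in plain Java. This is an in-memory illustration of the
pattern only, assuming nothing about Lucene or Oak internals.

```java
import java.util.ArrayList;
import java.util.List;

public class WalIndexSketch {
    private final List<String> wal = new ArrayList<>();        // durable log of pending updates
    private final List<String> searchable = new ArrayList<>(); // visible to queries
    private final List<String> persisted = new ArrayList<>();  // flushed to stable storage
    private int softCommitted = 0;                             // wal entries already searchable

    // cheap append; the update is durable as soon as it is in the log
    public void add(String doc) { wal.add(doc); }

    // frequent and cheap: expose new wal entries to searches
    public void softCommit() {
        searchable.addAll(wal.subList(softCommitted, wal.size()));
        softCommitted = wal.size();
    }

    // periodic and expensive: flush to stable storage, then truncate the wal
    public void hardCommit() {
        softCommit();
        persisted.addAll(wal);
        wal.clear();
        softCommitted = 0;
    }

    public int searchableCount() { return searchable.size(); }
    public int pendingCount() { return wal.size(); }
}
```

The point of the pattern is that sync visibility only requires the cheap
soft commit; the costly hard commit (writing index segments) stays periodic.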


Re: Lucene auto-tune of cost

2015-11-04 Thread Julian Sedding
Slightly off topic: why is/should Lucene Indexes always be async? I
understand that requirement for a full-text index, which may need to
do (slow) text-extraction. However, updates on a Lucene-based property
index are usually very fast. So it is not obvious to me why they
should not be synchronous.

Thanks for any enlightening replies!

Regards
Julian

On Wed, Nov 4, 2015 at 9:49 AM, Ian Boston  wrote:
> On 4 November 2015 at 00:45, Davide Giannella  wrote:
>
>> Hello Team,
>>
>> Lucene index is always asynchronous and the async index could lag behind
>> by definition.
>>
>> Sometimes we could have the same query better served by a property
>> index, or traversing for example. In case the async index is lagging
>> behind it could be that the traversing index is better suited to return
>> the information as it will be more updated.
>>
>> As we know we run an async update every 5 seconds, we could come up with
>> some algorithm to be used on the cost computing, that auto correct with
>> some math the cost, increasing it the more the time passed since the
>> last full execution of async index.
>>
>> WDYT?
>>
>
>
> Going down the property index route, for a DocumentMK instance will bloat
> the DocumentStore further. That already consumes 60% of a production
> repository and like many in DB inverted indexes is not an efficient storage
> structure. It's probably ok for TarMK.
>
> Traversals are a problem for production. They will create random outages
> under any sort of concurrent load.
>
> ---
> If the way the indexing was performed is changed, it could make the index
> NRT or real time depending on your point of view. eg. Local indexes, each
> Oak index in the cluster becoming a shard with replication to cover
> instance unavailability. No more indexing cycles, soft commits with each
> instance using a FS Directory and a update queue replacing the async
> indexing queue. Query by map reduce. It might have to copy on write to seed
> new instances where the number of instances falls below 3.
>
>
>
> Best Regards
> Ian
>
>
>
>>
>> Davide
>>
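
Davide's proposal above — inflate an index's cost estimate the longer the
async indexer lags behind — could look like the following. The linear
growth factor and parameter names are assumptions for illustration, not
part of Oak's cost API.

```java
public class LagAdjustedCost {
    /**
     * @param baseCost    the index's own cost estimate
     * @param lagMillis   time since the last completed async index cycle
     * @param cycleMillis the configured async cycle length (5s in Oak)
     * @return a cost that grows linearly with the number of missed cycles
     */
    public static double adjust(double baseCost, long lagMillis, long cycleMillis) {
        // no penalty while within one cycle; afterwards grow per missed cycle
        double missedCycles = Math.max(0, lagMillis - cycleMillis) / (double) cycleMillis;
        return baseCost * (1.0 + missedCycles);
    }

    public static void main(String[] args) {
        System.out.println(adjust(100.0, 5000, 5000));  // no lag -> 100.0
        System.out.println(adjust(100.0, 15000, 5000)); // 2 missed cycles -> 300.0
    }
}
```

As the lagging index's cost rises, a property index or traversal with a
stable cost would eventually win the query plan, which is the auto-correct
behaviour the thread asks about.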


Re: [VOTE] Release Apache Jackrabbit Oak 1.0.23

2015-10-29 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.0.23

Regards
Julian

On Wed, Oct 28, 2015 at 10:59 AM, Davide Giannella  wrote:
> A candidate for the Jackrabbit Oak 1.0.23 release is available at:
>
> https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.23/
>
> The release candidate is a zip archive of the sources in:
>
>
> https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.23/
>
> The SHA1 checksum of the archive is
> c6cdd835e2cf4cb92cbca97b2be99276079b7623.
>
> A staged Maven repository is available for review at:
>
> https://repository.apache.org/
>
> The command for running automated checks against this release candidate is:
>
> $ sh check-release.sh oak 1.0.23
> c6cdd835e2cf4cb92cbca97b2be99276079b7623
>
> Please vote on releasing this package as Apache Jackrabbit Oak 1.0.23.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.
>
> [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.23
> [ ] -1 Do not release this package because...
>
> Davide


Re: [VOTE] Release Apache Jackrabbit Oak 1.3.9

2015-10-27 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.3.9

Regards
Julian

On Tue, Oct 27, 2015 at 1:19 PM, Davide Giannella  wrote:
> A candidate for the Jackrabbit Oak 1.3.9 release is available at:
>
> https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.3.9/
>
> The release candidate is a zip archive of the sources in:
>
>
> https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.3.9/
>
> The SHA1 checksum of the archive is
> 86deb7381e0c33ff8c92ab80eb89b1696044b1fe.
>
> A staged Maven repository is available for review at:
>
> https://repository.apache.org/
>
> The command for running automated checks against this release candidate is:
>
> $ sh check-release.sh oak 1.3.9 86deb7381e0c33ff8c92ab80eb89b1696044b1fe
>
> Please vote on releasing this package as Apache Jackrabbit Oak 1.3.9.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.
>
> [ ] +1 Release this package as Apache Jackrabbit Oak 1.3.9
> [ ] -1 Do not release this package because...
>


Re: Why does oak-core import every package with an optional resolution?

2015-10-27 Thread Julian Sedding
+1 for cleaning this up.

Quite a few dependencies are marked optional in maven and thus will
end up with resolution:=optional in the manifest anyway. Maybe that's
already good enough.

Maybe in the future the modules that require these optional
dependencies can be factored into their own bundles, thus obviating
the need for optionality of the dependencies altogether.

Regards
Julian

On Tue, Oct 27, 2015 at 2:01 PM, Francesco Mari
 wrote:
> In the meantime, I logged OAK-3558.
>
> 2015-10-26 19:34 GMT+01:00 Julian Reschke :
>> On 2015-10-26 16:37, Francesco Mari wrote:
>>>
>>> 2015-10-26 13:15 GMT+01:00 Julian Reschke :

 On 2015-10-26 12:13, Chetan Mehrotra wrote:
>
>
> Looking at history of oak-core/pom.xml  this change was done in [1]
> for OAK-1708 most like to support loading of various DB drivers from
> within Oak Core and probably a temp change which was not looked back
> again. That might not be required now as the DataSource gets injected
> and oak-core need not be aware of drivers etc. So we can get rid of
> that
>
> @Julian - Thoughts?
> ...



 No thoughts other than "I do not understand Maven dependencies
 sufficiently
 well to comment".

 How about changing it and see whether anything breaks?

>>>
>>> It would be a good starting point if you would remember why the change
>>> was made in the first place, so we could have something more specific
>>> to look after.
>>
>>
>> I believe the change was made to avoid having dependencies such as to the
>> connection pool or specific JDBC drivers.
>>
>> But, again, if I knew/could remember I'd tell you. If you believe this is
>> incorrect just go ahead and fix it.
>>
>> Best regards, Julian


Re: jackrabbit-oak build #6574: Broken

2015-10-05 Thread Julian Sedding
This failure does not seem to be related to my commit:

> Tests run: 42, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 13.589
sec <<< FAILURE!
>
nodeWithSameString[1](org.apache.jackrabbit.oak.plugins.segment.RecordUsageAnalyserTest)
Time elapsed: 4.031 sec <<< ERROR!
> java.lang.OutOfMemoryError: Java heap space
>at
org.apache.jackrabbit.oak.cache.CacheLIRS$Segment.clear(CacheLIRS.java:826)
>at
org.apache.jackrabbit.oak.cache.CacheLIRS$Segment.(CacheLIRS.java:781)
>at
org.apache.jackrabbit.oak.cache.CacheLIRS.invalidateAll(CacheLIRS.java:193)
>at org.apache.jackrabbit.oak.cache.CacheLIRS.(CacheLIRS.java:180)
>at
org.apache.jackrabbit.oak.cache.CacheLIRS$Builder.build(CacheLIRS.java:1564)
>at
org.apache.jackrabbit.oak.cache.CacheLIRS$Builder.build(CacheLIRS.java:1560)
>at
org.apache.jackrabbit.oak.plugins.segment.StringCache.(StringCache.java:52)
>at
org.apache.jackrabbit.oak.plugins.segment.SegmentTracker.(SegmentTracker.java:123)
>at
org.apache.jackrabbit.oak.plugins.segment.SegmentTracker.(SegmentTracker.java:146)
>at
org.apache.jackrabbit.oak.plugins.segment.RecordUsageAnalyserTest.setup(RecordUsageAnalyserTest.java:70)
>...

Regards
Julian


On Mon, Oct 5, 2015 at 1:18 PM, Travis CI  wrote:

> Build Update for apache/jackrabbit-oak
> -
>
> Build: #6574
> Status: Broken
>
> Duration: 449 seconds
> Commit: 939e373f8cdca2c8e86a72ec31aab69be5f921ec (trunk)
> Author: Julian Sedding
> Message: OAK-3473 - CliUtils#handleSigInt uses classes from sun.misc.*
>
> - use Runtime#addShutdownHook() instead
>
>
> git-svn-id: https://svn.apache.org/repos/asf/jackrabbit/oak/trunk@1706784
> 13f79535-47bb-0310-9956-ffa450edef68
>
> View the changeset:
> https://github.com/apache/jackrabbit-oak/compare/99ef2ec9b3a9...939e373f8cdc
>
> View the full build log and details:
> https://travis-ci.org/apache/jackrabbit-oak/builds/83668306
>
> --
> sent by Jukka's Travis notification gateway
>


Re: svn commit: r1706674 - in /jackrabbit/oak/trunk/oak-upgrade: ./ src/main/java/org/apache/jackrabbit/oak/upgrade/ src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/ src/test/java/org

2015-10-05 Thread Julian Sedding
Java 6 failed to infer some generic argument types. Fixed in
https://svn.apache.org/r1706797

Regards
Julian
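
For illustration, the kind of inference gap older javac versions exhibit:
pre-Java 8 compilers often could not infer a generic method's type argument
from the surrounding call context, and an explicit type witness resolves it.
This is a generic example, not the actual code from the commit.

```java
import java.util.Collections;
import java.util.List;

public class InferenceDemo {
    static int count(List<String> names) { return names.size(); }

    public static void main(String[] args) {
        // On Java 6, a bare count(Collections.emptyList()) typically fails to
        // compile (the argument is inferred as List<Object>); spelling out the
        // type argument makes it compile on all versions:
        int n = count(Collections.<String>emptyList());
        System.out.println(n);
    }
}
```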

On Mon, Oct 5, 2015 at 12:09 PM, Julian Sedding  wrote:
> Thanks for alerting me. I'll take a look.
>
> Regards
> Julian
>
> On Mon, Oct 5, 2015 at 12:06 PM, Michael Dürig  wrote:
>>
>> Hi,
>>
>> I just noted that this won't compile on Java 1.6 [1]. Julian, could you have
>> a look?
>>
>> OTOH, do we still need/want to support Java 1.6? Let's discuss this in
>> another thread.
>>
>> Michael
>>
>>
>> [1] [ERROR] /jenkins/workspace/Apache Jackrabbit Oak
>> matrix/jdk/jdk-1.6u45/label/Ubuntu/nsfixtures/SEGMENT_MK/profile/unittesting/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/AssertingPeriodicReporter.java:[75,15]
>> hasReports(org.hamcrest.Matcher> extends java.lang.String>>,org.hamcrest.Matcher> java.lang.Long,? extends java.lang.String>>) in
>> org.apache.jackrabbit.oak.upgrade.nodestate.report.AssertingPeriodicReporter
>> cannot be applied to (org.hamcrest.Matcher> java.lang.Object,? extends
>> java.lang.Object>>,org.hamcrest.Matcher> java.lang.Long,? extends java.lang.String>>)
>>
>>
>>
>>
>>
>> On 4.10.15 3:18 , jsedd...@apache.org wrote:
>>>
>>> Author: jsedding
>>> Date: Sun Oct  4 13:18:29 2015
>>> New Revision: 1706674
>>>
>>> URL:http://svn.apache.org/viewvc?rev=1706674&view=rev
>>> Log:
>>> OAK-3460 - Progress logging for RepositorySidegrade
>>>
>>> - implemented ReportingNodeState wrapper in order to be able to log each
>>> NodeState access
>>>
>>> Added:
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/LoggingReporter.java
>>> (with props)
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/PeriodicReporter.java
>>> (with props)
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/Reporter.java
>>> (with props)
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/ReportingNodeState.java
>>> (with props)
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/AssertingPeriodicReporter.java
>>> (with props)
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/PeriodicReporterTest.java
>>> (with props)
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/ReportingNodeStateTest.java
>>> (with props)
>>> Modified:
>>>  jackrabbit/oak/trunk/oak-upgrade/pom.xml
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/JackrabbitNodeState.java
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/RepositorySidegrade.java
>>>
>>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/RepositoryUpgrade.java
>>
>>


Re: svn commit: r1706674 - in /jackrabbit/oak/trunk/oak-upgrade: ./ src/main/java/org/apache/jackrabbit/oak/upgrade/ src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/ src/test/java/org

2015-10-05 Thread Julian Sedding
Thanks for alerting me. I'll take a look.

Regards
Julian

On Mon, Oct 5, 2015 at 12:06 PM, Michael Dürig  wrote:
>
> Hi,
>
> I just noted that this won't compile on Java 1.6 [1]. Julian, could you have
> a look?
>
> OTOH, do we still need/want to support Java 1.6? Let's discuss this in
> another thread.
>
> Michael
>
>
> [1] [ERROR] /jenkins/workspace/Apache Jackrabbit Oak
> matrix/jdk/jdk-1.6u45/label/Ubuntu/nsfixtures/SEGMENT_MK/profile/unittesting/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/AssertingPeriodicReporter.java:[75,15]
> hasReports(org.hamcrest.Matcher extends java.lang.String>>,org.hamcrest.Matcher java.lang.Long,? extends java.lang.String>>) in
> org.apache.jackrabbit.oak.upgrade.nodestate.report.AssertingPeriodicReporter
> cannot be applied to (org.hamcrest.Matcher java.lang.Object,? extends
> java.lang.Object>>,org.hamcrest.Matcher java.lang.Long,? extends java.lang.String>>)
>
>
>
>
>
> On 4.10.15 3:18 , jsedd...@apache.org wrote:
>>
>> Author: jsedding
>> Date: Sun Oct  4 13:18:29 2015
>> New Revision: 1706674
>>
>> URL:http://svn.apache.org/viewvc?rev=1706674&view=rev
>> Log:
>> OAK-3460 - Progress logging for RepositorySidegrade
>>
>> - implemented ReportingNodeState wrapper in order to be able to log each
>> NodeState access
>>
>> Added:
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/LoggingReporter.java
>> (with props)
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/PeriodicReporter.java
>> (with props)
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/Reporter.java
>> (with props)
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/ReportingNodeState.java
>> (with props)
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/AssertingPeriodicReporter.java
>> (with props)
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/PeriodicReporterTest.java
>> (with props)
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/nodestate/report/ReportingNodeStateTest.java
>> (with props)
>> Modified:
>>  jackrabbit/oak/trunk/oak-upgrade/pom.xml
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/JackrabbitNodeState.java
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/RepositorySidegrade.java
>>
>> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/RepositoryUpgrade.java
>
>


Re: Yourkit profiler open source license

2015-10-05 Thread Julian Sedding
Excellent, thank you!

Regards
Julian

On Mon, Oct 5, 2015 at 11:31 AM, Michael Dürig  wrote:
>
> Yes, just email them and you'll get a key. See Open Source Project Licenses
> [1]. The Oak web site already has the reference they ask for [2].
>
> Michael
>
> [1] https://www.yourkit.com/purchase/
> [2] http://jackrabbit.apache.org/oak/docs/attribution.html
>
>
> On 5.10.15 10:15 , Julian Sedding wrote:
>>
>> Hi Michael
>>
>> Did we ever get access to a YourKit Open Source license?
>>
>> Regards
>> Julian
>>
>>
>> On Mon, Mar 23, 2015 at 3:23 PM, Bertrand Delacretaz
>>  wrote:
>>>
>>> On Mon, Mar 23, 2015 at 3:13 PM, Michael Dürig 
>>> wrote:
>>>>
>>>> ...Will do something similar for Oak then, even though
>>>> strictly speaking it might not be required
>>>
>>>
>>> Note that such attribution links should use rel=nofollow to be
>>> consistent with the lowest levels of
>>> https://www.apache.org/foundation/sponsorship.html
>>>
>>> -Bertrand


Re: [VOTE] Release Apache Jackrabbit Oak 1.0.22

2015-10-05 Thread Julian Sedding
[X] +1 Release this package as Apache Jackrabbit Oak 1.0.22

Regards
Julian

On Mon, Oct 5, 2015 at 10:23 AM, Davide Giannella  wrote:
> A candidate for the Jackrabbit Oak 1.0.22 release is available at:
>
> https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.22/
>
> The release candidate is a zip archive of the sources in:
>
>
> https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.22/
>
> The SHA1 checksum of the archive is
> 04d4dd2ce13bfe72e49f1754de1fffb3d0db0fd6.
>
> A staged Maven repository is available for review at:
>
> https://repository.apache.org/
>
> The command for running automated checks against this release candidate is:
>
> $ sh check-release.sh oak 1.0.22
> 04d4dd2ce13bfe72e49f1754de1fffb3d0db0fd6
>
> Please vote on releasing this package as Apache Jackrabbit Oak 1.0.22.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.
>
> [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.22
> [ ] -1 Do not release this package because...


Re: Yourkit profiler open source license

2015-10-05 Thread Julian Sedding
Hi Michael

Did we ever get access to a YourKit Open Source license?

Regards
Julian


On Mon, Mar 23, 2015 at 3:23 PM, Bertrand Delacretaz
 wrote:
> On Mon, Mar 23, 2015 at 3:13 PM, Michael Dürig  wrote:
>> ...Will do something similar for Oak then, even though
>> strictly speaking it might not be required
>
> Note that such attribution links should use rel=nofollow to be
> consistent with the lowest levels of
> https://www.apache.org/foundation/sponsorship.html
>
> -Bertrand


Re: Is there query support for CONTAINS(path/to/property)?

2015-09-20 Thread Julian Sedding
Hi Chetan

Thanks for this hint, that's very helpful! I'll continue down this
line and see how far that takes me. Basic tests seem promising.

Regards
Julian

On Fri, Sep 18, 2015 at 4:30 PM, Chetan Mehrotra
 wrote:
> Hi Julian,
>
> LucenePropertyIndex does support contains for properties, For an
> example see LucenePropertyIndexTest#fulltextBooleanComplexOrQueries.
> You can give your usecase a try
> Chetan Mehrotra
>
>
> On Fri, Sep 18, 2015 at 5:33 PM, Julian Sedding  wrote:
>> Hi there
>>
>> Digging through the indexing and query code, especially the lucene
>> part, I got the impression that full-text queries on properties are
>> not currently supported. Is this observation correct?
>>
>> Furthermore, it seems that under the hood Lucene's full-text index is
>> used for (some) "LIKE" queries. This leads me to the question, whether
>> the following queries are semantically equivalent:
>>
>> //*[jcr:contains(a/@b, 'fish')]
>> vs
>> //*[jcr:like(a/@b, '%fish%')]
>>
>> If these are equivalent it would seem fairly straightforward to add
>> "contains" support for properties (as was present in JR2). Opinions?
>>
>> Thanks and regards
>> Julian


Is there query support for CONTAINS(path/to/property)?

2015-09-18 Thread Julian Sedding
Hi there

Digging through the indexing and query code, especially the lucene
part, I got the impression that full-text queries on properties are
not currently supported. Is this observation correct?

Furthermore, it seems that under the hood Lucene's full-text index is
used for (some) "LIKE" queries. This leads me to the question, whether
the following queries are semantically equivalent:

//*[jcr:contains(a/@b, 'fish')]
vs
//*[jcr:like(a/@b, '%fish%')]

If these are equivalent it would seem fairly straightforward to add
"contains" support for properties (as was present in JR2). Opinions?

Thanks and regards
Julian


Re: [VOTE] Release Apache Jackrabbit Oak 1.2.6

2015-09-17 Thread Julian Sedding
On Thu, Sep 17, 2015 at 11:42 AM, Marcel Reutegger  wrote:

> Please vote on releasing this package as Apache Jackrabbit Oak 1.2.6.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.

[X] +1 Release this package as Apache Jackrabbit Oak 1.2.6

Regards
Julian


Re: [VOTE] Release Apache Jackrabbit Oak 1.0.21

2015-09-17 Thread Julian Sedding
On Thu, Sep 17, 2015 at 11:09 AM, Marcel Reutegger  wrote:

> Please vote on releasing this package as Apache Jackrabbit Oak 1.0.21.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.

[X] +1 Release this package as Apache Jackrabbit Oak 1.0.21

Regards
Julian


Re: System.exit()???? , was: svn commit: r1696202 - in /jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document: ClusterNodeInfo.java DocumentMK.java DocumentNodeStore.j

2015-09-11 Thread Julian Sedding
My preference is (b), even though I think stopping the NodeStore
service should be sufficient (it may not currently be sufficient, I
don't know).

Particularly, I believe that "trying harder" is detrimental to the
overall stability of a cluster/topology. We are dealing with a
possibly faulty instance, so who can decide that it is ok again after
trying harder? The faulty instance itself?

"Read-only" doesn't sound too useful either, because that may fool
clients into thinking they are dealing with a "healthy" instance for
longer than necessary and thus can lead to bigger issues downstream.

I believe that "fail early and fail often" is the path to a stable cluster.

Regards
Julian

On Thu, Sep 10, 2015 at 6:43 PM, Stefan Egli  wrote:
> On 09/09/15 18:11, "Stefan Egli"  wrote:
>
>>On 09/09/15 18:01, "Stefan Egli"  wrote:
>>
>>>I think if the observers would all be 'OSGi-ified' then this could be
>>>achieved. But currently eg the BackgroundObserver is just a pojo and not
>>>an osgi component (thus doesn't support any activate/deactivate method
>>>hooks).
>>
>>.. I take that back - going via OsgiWhiteboard should work as desired - so
>>perhaps implementing deactivate/activate methods in the
>>(Background)Observer(s) would do the trick .. I'll give it a try ..
>
> ootb this wont work as the BackgroundObserver, as one example, is not an
> OSGi component, so wont get any deactivate/activate calls atm. so to
> achieve this, it would have to be properly OSGi-ified - something which
> sounds like a bigger task and not only limited to this one class - which
> means making DocumentNodeStore 'restart capable' sounds like a bigger task
> too and the question is indeed if it is worth while ('will it work?') or
> if there are alternatives..
>
> which brings me back to the original question as to what should be done in
> case of a lease failure - to recap the options left (if System.exit is not
> one of them) are:
>
> a) 'go read-only': prevent writes by throwing exceptions from this moment
> until eternity
>
> b) 'stop oak': stop the oak-core bundle (prevent writes by throwing
> exceptions for those still reaching out for the nodeStore)
>
> c) 'try harder': try to reactivate the lease - continue allowing writes -
> and make sure the next backgroundWrite has correctly updated the
> 'unsavedLastRevisions' (cos others could have done a recover of this node,
> so unsavedLastRevisions contains superfluous stuff that must no longer be
> written). this would open the door for edge cases ('change of longer time
> window with multiple leaders') but perhaps is not entirely impossible...
>
> additionally/independently:
>
> * in all cases the discovery-lite descriptor should expose this lease
> failure/partitioning situation - so that anyone can react who would like
> to, esp should anyone no longer assume that the local instance is leader
> or part of the cluster - and to support that optional Sling Health Check
> which still does a System.exit :)
>
> * also, we should probably increase the lease thread's priority to reduce
> the likelihood of the lease timing out (same would be true for
> discovery.impl's heartbeat thread)
>
>
> * plus increasing the lease time from 1min to perhaps 5min as the default
> would also reduce the number of cases that hit problems dramatically
>
> wdyt?
>
> Cheers,
> Stefan
>
>


Re: oak-run - quo vadis? (was: oak-run upgrade improvements)

2015-09-03 Thread Julian Sedding
Hi Tomek

Ideally we can have the NodeStore and DataStore/BlobStore creation in
a shared CLI module. That would allow best re-use IMHO.

I know that Oak has a flat module structure. Still I would suggest
putting these split modules + something like "oak-tools-commons" into
a folder called "tools". What do others think? Does anyone have strong
feelings for staying with a flat structure?

Regards
Julian



On Thu, Sep 3, 2015 at 10:02 AM, Tomek Rekawek  wrote:
> Hi Julian,
>
> Thanks for making this more general.
>
>
>
>
>
> On 03/09/15 09:31, "Julian Sedding"  wrote:
>
>>(…) split oak-run into three modules:
>>
>>- oak-dev-tools
>>- oak-upgrade
>>- oak-ops-tool
>
> I think it makes perfect sense.
>
> Regarding the oak-upgrade, we already have such a module. I wonder if we 
> should add the CLI frontend there or create a new module.
>
> Best regards,
> Tomek


oak-run - quo vadis? (was: oak-run upgrade improvements)

2015-09-03 Thread Julian Sedding
Hi Tomek

I believe that benchmarks in oak-run also use Jackrabbit 2 (via fixtures).
However, OAK-3342 proposes moving the benchmarks into their own module.

Looking at this from a less technical perspective, I propose to split
oak-run into three modules:

- oak-dev-tools (benchmarks, scalability, etc), size does not matter so
much, as devs probably build it themselves
- oak-upgrade (all things copying repositories between persistence
formats), contains JR2
- oak-ops-tools (tools used for operating an Oak repo: inspecting,
checking, console, etc. may also contain NodeStore copy for convenience if
code duplication can be avoided), does NOT contain JR2

Naming is of course up for debate. Let's just discuss if a split along
these lines makes sense for now.

WDYT?

Regards
Julian



On Wednesday, September 2, 2015, Tomek Rekawek  wrote:

> Hi,
>
> One more thing. The “upgrade" command requires a lot of Maven dependencies
> (Amazon API client for the S3 support, Jackrabbit 2 for the repository
> upgrades, etc.) Some of these dependencies conflicts with the Oak modules
> (eg. Jackrabbit 2 uses older lucene-core than the oak-lucene and the new
> version is not backward-compatible). Because of that, the dependency
> management in the oak-run module is complicated - even before my patch
> there is a separate profile for building the project using Jackrabbit 2
> dependencies and there are also two assembly files building the normal jar
> and the “jackrabbit 2” jar.
>
> I think we should extract the upgrade command to a separate Maven module
> (eg. oak-migrator or oak-upgrade-tool). This way we can precisely define
> what are the required dependencies and we don’t need to care if they are
> compatible with other oak-run commands.
>
> Any objections? :)
>
> Best regards,
> Tomek
>
>
> On 02/09/15 14:58, "Tomek Rekawek"  wrote:
>
> >Hello,
> >
> >I created a pull request [1] for the OAK-2171 [2]. It exposes all
> features added recently to the oak-upgrade module (version history copy,
> filtering paths) as well as all migration paths (eg. mongo -> rdb) in the
> oak-run upgrade command. There are also tests. Looking forward to feedback
> :)
> >
> >Best regards,
> >Tomek
> >
> >[1] https://github.com/apache/jackrabbit-oak/pull/38
> >[2] https://issues.apache.org/jira/browse/OAK-2171
> >
> >--
> >Tomek Rękawek | Adobe Research | www.adobe.com
> >reka...@adobe.com
> >
> >
> >
>


Re: [VOTE] Epics in Jira

2015-09-01 Thread Julian Sedding
+1

Julian

On Tue, Sep 1, 2015 at 10:17 AM, Michael Dürig  wrote:
>
> +1 given we move to
> https://issues.apache.org/jira/secure/attachment/12752488/Screen%20Shot%202015-08-26%20at%203.40.18%20pm.png
>
> Michael
>
>
> On 1.9.15 9:40 , Davide Giannella wrote:
>>
>> Hello team,
>>
>> some of us noticed we lack the epics in our jira so I raised an issue
>> asking whether that would be possible to have them.
>>
>> https://issues.apache.org/jira/browse/INFRA-10185
>>
>> had a I reply (which TBH didn't really understand completely). Feel free
>> to follow-up on the issue itself if you require more details.
>>
>> Can we start a vote session for changing our jira schema to allows epics?
>>
>> My vote is +1.
>>
>> Cheers
>> Davide
>>
>>
>


Re: New committer: Francesco Mari

2015-08-30 Thread Julian Sedding
Congratulations Francesco!

Regards
Julian

On Monday, August 31, 2015, Chetan Mehrotra 
wrote:

> Welcome Francesco!
> Chetan Mehrotra
>
>
> On Fri, Aug 28, 2015 at 5:18 PM, Michael Dürig  > wrote:
> > Hi,
> >
> > Please welcome Francesco as a new committer and PMC member of the Apache
> > Jackrabbit project. The Jackrabbit PMC recently decided to offer
> Francesco
> > committership based on his contributions. I'm happy to announce that he
> > accepted the offer and that all the related administrative work has now
> been
> > taken care of.
> >
> > Welcome to the team, Francesco!
> >
> > Michael
>


Re: New committer: Julian Sedding

2015-08-30 Thread Julian Sedding
Dear PMC

Thank you for the invitation and your trust. I will use the right to commit
responsibly and to the overall good of the Jackrabbit project.

I'm looking forward to working with you!

Regards
Julian

On Monday, August 31, 2015, Chetan Mehrotra 
wrote:

> Welcome Julian!
> Chetan Mehrotra
>
>
> On Fri, Aug 28, 2015 at 5:18 PM, Michael Dürig  > wrote:
> > Hi,
> >
> > Please welcome Julian as a new committer and PMC member of the Apache
> > Jackrabbit project. The Jackrabbit PMC recently decided to offer Julian
> > committership based on his contributions. I'm happy to announce that he
> > accepted the offer and that all the related administrative work has now
> been
> > taken care of.
> >
> > Welcome to the team, Julian!
> >
> > Michael
>


Re: SegmentStore: MultiStore Compaction

2015-08-28 Thread Julian Sedding
Hi Thomas

The idea is most welcome. Just had an OOME with 4G heap on a local dev
instance yesterday due to compaction (running Oak 1.2.4).

Which API do you want to use for copying? The NodeStore API? Note that
for a complete copy you need to also consider copying checkpoints.

The idea is very similar to what's happening in oak-upgrade. From my
experiments in that area, I think this approach could result in a
reasonably fast compaction.

For a simplistic approach (and I may just be summarizing your idea,
not quite sure):
- the NodeStore could be wrapped with a CompactingNodeStore
- CompactingNodeStore delegates to store 1 and store 2; a bloom filter
recording copied paths optimizes correct delegation; false positives
check both store 2 then store 1
- once copy is complete, store 1 can be shut down
- repeat when necessary

Challenges may be in the caching layers, where references to store 1
may be hard to get rid of?
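To make the delegation idea concrete, here is a hypothetical, self-contained sketch. It is plain Java with Map-backed stand-ins rather than the real Oak NodeStore API; the BloomFilter class and all names are invented for illustration. Reads consult the compacted store only when the Bloom filter says the path may have been copied; a false positive merely costs one extra miss before falling back to the old store, so reads stay correct either way.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class CompactingStoreSketch {
    // Minimal hand-rolled Bloom filter over copied paths (illustration only).
    static final class BloomFilter {
        private final BitSet bits = new BitSet(1 << 16);
        void add(String key) { for (int h : hashes(key)) bits.set(h); }
        boolean mightContain(String key) {
            for (int h : hashes(key)) if (!bits.get(h)) return false;
            return true;
        }
        private int[] hashes(String key) {
            int h1 = key.hashCode();
            int h2 = Integer.rotateLeft(h1, 16) ^ 0x9e3779b9;
            return new int[] {
                Math.floorMod(h1, 1 << 16),
                Math.floorMod(h1 + h2, 1 << 16),
                Math.floorMod(h1 + 2 * h2, 1 << 16)
            };
        }
    }

    private final Map<String, String> store1 = new HashMap<>(); // old store
    private final Map<String, String> store2 = new HashMap<>(); // compacted copy
    private final BloomFilter copied = new BloomFilter();

    void putOld(String path, String node) { store1.put(path, node); }

    // Move one path into the compacted store and record it in the filter.
    void copy(String path) {
        store2.put(path, store1.get(path));
        copied.add(path);
    }

    // Delegating read: try store 2 only if the path was probably copied,
    // otherwise (or on a false-positive miss) fall back to store 1.
    String read(String path) {
        if (copied.mightContain(path)) {
            String node = store2.get(path); // false positive -> null
            if (node != null) return node;
        }
        return store1.get(path);
    }

    public static void main(String[] args) {
        CompactingStoreSketch s = new CompactingStoreSketch();
        s.putOld("/content/a", "A");
        s.putOld("/content/b", "B");
        s.copy("/content/a");
        System.out.println(s.read("/content/a")); // served from store 2
        System.out.println(s.read("/content/b")); // falls back to store 1
    }
}
```

Once every path has been copied, store 1 can be dropped, which is the "shut down and repeat" part of the proposal.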

Regards
Julian


On Fri, Aug 28, 2015 at 10:05 AM, Thomas Mueller  wrote:
> Hi,
>
> I thought about SegmentStore compaction and made a few slides:
>
> http://www.slideshare.net/ThomasMueller12/multi-store-compaction
>
> Feedback is welcome! The idea is at quite an early stage, so if you don't 
> understand or agree with some items, I'm to blame.
>
> Regards,
> Thomas


Re: svn commit: r1694498 - in /jackrabbit/oak/trunk: oak-run/src/main/java/org/apache/jackrabbit/oak/run/ oak-run/src/test/java/org/apache/jackrabbit/oak/run/ oak-upgrade/src/main/java/org/apache/jack

2015-08-06 Thread Julian Sedding
Thanks Manfred for getting this in!

For completeness regarding credits (Manfred couldn't know): I created
the initial patch and developed some doubts over time. Tomek took care
of adapting the code to address those doubts and create a superior
feature (IMHO). I assisted at that stage with code reviews and
suggestions only. So thanks Tomek!

Regards
Julian


On Thu, Aug 6, 2015 at 3:57 PM,   wrote:
> Author: baedke
> Date: Thu Aug  6 13:57:58 2015
> New Revision: 1694498
>
> URL: http://svn.apache.org/r1694498
> Log:
> OAK-2776: Upgrade should allow to skip copying versions
>
> Initial implementation. Full credit goes to Julian Sedding 
> (jsedd...@gmail.com) for the patch and to Tomek Rekawek (treka...@gmail.com) 
> for aligning it with the current trunk and integrating into oak-run.
>
> Added:
> 
> jackrabbit/oak/trunk/oak-run/src/test/java/org/apache/jackrabbit/oak/run/ParseVersionCopyArgumentTest.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/DescendantsIterator.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/version/
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/version/VersionCopier.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/version/VersionCopyConfiguration.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/version/VersionHistoryUtil.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/version/VersionableEditor.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/CopyVersionHistoryTest.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/util/VersionCopyTestUtils.java
> Modified:
> 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/JackrabbitNodeState.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/main/java/org/apache/jackrabbit/oak/upgrade/RepositoryUpgrade.java
> 
> jackrabbit/oak/trunk/oak-upgrade/src/test/java/org/apache/jackrabbit/oak/upgrade/AbstractRepositoryUpgradeTest.java
>
> Modified: 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
> URL: 
> http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java?rev=1694498&r1=1694497&r2=1694498&view=diff
> ==
> --- 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>  (original)
> +++ 
> jackrabbit/oak/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/run/Main.java
>  Thu Aug  6 13:57:58 2015
> @@ -28,8 +28,12 @@ import java.io.File;
>  import java.io.IOException;
>  import java.io.InputStream;
>  import java.sql.Timestamp;
> +import java.text.DateFormat;
> +import java.text.ParseException;
> +import java.text.SimpleDateFormat;
>  import java.util.ArrayList;
>  import java.util.Arrays;
> +import java.util.Calendar;
>  import java.util.Collections;
>  import java.util.HashMap;
>  import java.util.HashSet;
> @@ -60,6 +64,8 @@ import joptsimple.ArgumentAcceptingOptio
>  import joptsimple.OptionParser;
>  import joptsimple.OptionSet;
>  import joptsimple.OptionSpec;
> +
> +import org.apache.commons.lang.time.DateUtils;
>  import org.apache.jackrabbit.core.RepositoryContext;
>  import org.apache.jackrabbit.core.config.RepositoryConfig;
>  import org.apache.jackrabbit.oak.Oak;
> @@ -939,6 +945,8 @@ public final class Main {
>  private static void upgrade(String[] args) throws Exception {
>  OptionParser parser = new OptionParser();
>  parser.accepts("datastore", "keep data store");
> +ArgumentAcceptingOptionSpec copyVersions = 
> parser.accepts("copy-versions", "copy referenced versions. valid arguments: 
> true|false|-mm-dd").withRequiredArg().defaultsTo("true");
> +ArgumentAcceptingOptionSpec copyOrphanedVersions = 
> parser.accepts("copy-orphaned-versions", "copy all versions. valid arguments: 
> true|false|-mm-dd").withRequiredArg().defaultsTo("true");
>  OptionSpec nonOption = parser.nonOptions();
>  OptionSet options = parser.parse(args);
>
> @@ -967,6 +975,7 @@ public final class Main {
>  new RepositoryUpgrade(source, target);
>  upgrade.setCopyBinariesByReference(
>

Re: Checkpoints and copying NodeStore instances (aka RepositorySidegrade)

2015-08-06 Thread Julian Sedding
Hi Davide

Of course we need to measure the speed and improvements. I am not sure
how much time I will have to implement benchmarks for this, but will
try.

So far I have not tried the cached Tika full-text extraction, yet. I'm
curious how much gain it can provide though. This may make re-indexing
so cheap that we need not worry about it any more.

I hadn't really paid attention to OAK-2749, but it sounds interesting.
Similarly but differently, I was pondering the idea to allow
multithreaded tar2mongo copies. Since DocumentMK supports clustering,
it should be possible to copy different sub-trees in different
threads?!

It would indeed be interesting to have a chat some time! But first
I'll be on holidays for two weeks :)

Regards
Julian



On Wed, Aug 5, 2015 at 8:57 PM, Davide Giannella  wrote:
> On 05/08/2015 17:45, Julian Sedding wrote:
>> ...
>>
>> My aim is to reduce the critical path for migrating one NodeStore
>> (incl JR2) to another. Indexing (especially async indexing) is a
>> big part of the time, so if I can move that out of the critical path,
>> it can save a lot of downtime.
>
> Interesting. I know async index can be lengthy but it would be very
> interesting if we could measure what we have now and the improvements
> we're making.
>
> The slowest part of the async index is normally the full-text extraction
> as they run in a single thread. With
> https://issues.apache.org/jira/browse/OAK-2749 we provided a mechanism
> (not used yet AFAIK) to run different indexers on different threads.
> Maybe it's something you would like to experiment with as well to speed
> up the indexing.
>
> If you want ping me on chat tomorrow morning (CEST) so we can quickly
> see what we can do here. But I think we should start measuring it first :)
>
> Cheers
> Davide
>
>
>


Re: Checkpoints and copying NodeStore instances (aka RepositorySidegrade)

2015-08-06 Thread Julian Sedding
Hi Alex

See inline.

On Wed, Aug 5, 2015 at 7:57 PM, Alex Parvulescu
 wrote:
> Hi,
>
> see inline
>
> On Wed, Aug 5, 2015 at 5:45 PM, Julian Sedding  wrote:
>
>> Hi Alex
>>
>> Thanks for your comments.
>>
>> On Wed, Aug 5, 2015 at 3:48 PM, Alex Parvulescu
>>  wrote:
>> > Hi,
>> >
>> > Just a few clarifications on the error you see
>> >
>> >> My interpretation is that the AsyncIndexUpdate is trying to retrieve
>> > the previous checkpoint as stored in /:async/async. Of course this
>> > checkpoint is not present in the copied NodeStore and thus cannot be
>> > retrieved.
>> >
>> > The error comes from DocumentMk trying to parse the reference checkpoint
>> > value. Basically what fails here is 'Revision.fromString' receiving a
>> > malformed checkpoint value because it comes from the SegmentMk. The quick
>> > fix is to manually remove the properties on the "/:async" hidden node.
>> This
>> > will indeed trigger a full reindex, but will help you getting over this
>> > issue.
>>
>> Agreed. In this case parsing the revision is the first thing that
>> fails. When copying DNS to SNS a similar situation would arise,
>> because no snapshot with the provided ID exists.
>>
>>
> [alex] Not really, as the SegmentMk will not fail (no
> IllegalArgumentException), but simply log a warning the checkpoint doesn't
> exist and perform a full reindex. So in this regard it is a bit more
> lenient :)

Ok, I didn't know that SegmentMK is more lenient here. Should we make
DocumentMK degrade gracefully as well? Currently the AsyncIndex does
not recover by itself. It would be more robust if it did.

>
>
>
>> >
>> >> IMHO it would be desirable to (optionally) copy the checkpoints as
>> > well. In the case of AsyncIndexUpdate, having the checkpoint can save
>> > a full re-index.
>> >
>> > This is very tricky, as the 2 representations of checkpoints between
>> > SegmentMk and DocumentMk are quite different. I would strongly suggest
>> > going for the reindex, after all you'd only migrate once, so you can
>> > prepare for this lengthy process.
>>
>> I'm experimenting with the following approach:
>> * retrieve the first checkpoint and copy the NodeState tree at that
>> revision (available via CheckpointMBean impls)
>> * after copying the tree, merge and create a checkpoint (expiration
>> time can be calculated)
>> * rinse and repeat until the head revision is reached
>>
>> My aim is to reduce the critical path for migrating one NodeStore
>> (incl JR2) to another. Indexing (especially async indexing) is a
>> big part of the time, so if I can move that out of the critical path,
>> it can save a lot of downtime.
>>
>
> [alex] interesting approach. I would only reduce this to the 'current'
> indexed checkpoint (the async reference). So you'd migrate that over first
> as the head state, create a checkpoint based on it (let' call it 'c0').
> then diff&apply the SegmentMk head state on top of this. update the async
> property to point to c0 and you might be good.

Absolutely, only copying the checkpoints that are actually needed makes sense.

Thinking out loud: it may be faster to run the async-index in the copy
process, based on the diff from the source NodeStore between the
checkpoint and the head. That should be feasible, right?

>
>
>
>>
>> My current approach for a migration from JR2 to MongoMK is to:
>> * copy JR2 to TarMK (TarMK is a lot faster for creating indexes etc.
>> than MongoMK)
>> * repeat JR2 to TarMK copy every week or every 24h using incremental
>> copy. this saves on CommitHook execution time - in theory this can
>> reduce the time for one run to a single full repository traversal.
>> * finally on the day when the systems should be switched over, run a
>> last JR2 to TarMK and then a TarMK to MongoMK copy. this is the
>> critical path.
>>
>
> [alex] Always going through the SegmentMk seems a bit convoluted. Why not
> do the migration once, then apply the diffs on top of MongoMk directly
> (AFAIK we have support for incremental updates now)? Are the 24h diffs so
> big that it makes it unusable/unacceptable to go to MongoMk directly? (I'd
> like to see this backed by some numbers).

We definitely need numbers. I aim to do some experiments after my
holidays and provide some numbers at the beginning of September.

Regarding incremental upgrades: they definitely yield a huge benefit on the
critical path with SegmentMK, don't kn

Re: Checkpoints and copying NodeStore instances (aka RepositorySidegrade)

2015-08-05 Thread Julian Sedding
Hi Alex

Thanks for your comments.

On Wed, Aug 5, 2015 at 3:48 PM, Alex Parvulescu
 wrote:
> Hi,
>
> Just a few clarifications on the error you see
>
>> My interpretation is that the AsyncIndexUpdate is trying to retrieve
> the previous checkpoint as stored in /:async/async. Of course this
> checkpoint is not present in the copied NodeStore and thus cannot be
> retrieved.
>
> The error comes from DocumentMk trying to parse the reference checkpoint
> value. Basically what fails here is 'Revision.fromString' receiving a
> malformed checkpoint value because it comes from the SegmentMk. The quick
> fix is to manually remove the properties on the "/:async" hidden node. This
> will indeed trigger a full reindex, but will help you getting over this
> issue.

Agreed. In this case parsing the revision is the first thing that
fails. When copying DNS to SNS a similar situation would arise,
because no snapshot with the provided ID exists.

>
>> IMHO it would be desirable to (optionally) copy the checkpoints as
> well. In the case of AsyncIndexUpdate, having the checkpoint can save
> a full re-index.
>
> This is very tricky, as the 2 representations of checkpoints between
> SegmentMk and DocumentMk are quite different. I would strongly suggest
> going for the reindex, after all you'd only migrate once, so you can
> prepare for this lengthy process.

I'm experimenting with the following approach:
* retrieve the first checkpoint and copy the NodeState tree at that
revision (available via CheckpointMBean impls)
* after copying the tree, merge and create a checkpoint (expiration
time can be calculated)
* rinse and repeat until the head revision is reached
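As a rough illustration of that loop — not the Oak NodeStore API; the Store class and its Map-based snapshots are invented stand-ins for NodeState trees — each source checkpoint is replayed oldest-first into the target, a matching checkpoint is created there, and the target is finally brought up to the source head:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CheckpointCopySketch {
    // Stand-in for a NodeStore: a head "tree" plus a list of checkpoints.
    static final class Store {
        final List<Map<String, String>> checkpoints = new ArrayList<>();
        Map<String, String> head = new LinkedHashMap<>();
        void checkpoint() { checkpoints.add(new LinkedHashMap<>(head)); }
    }

    static void copy(Store source, Store target) {
        // 1. Replay checkpoints oldest-first, creating one in the target each time.
        for (Map<String, String> snapshot : source.checkpoints) {
            target.head = new LinkedHashMap<>(snapshot); // apply the state at that checkpoint
            target.checkpoint();
        }
        // 2. Finally bring the target up to the source head revision.
        target.head = new LinkedHashMap<>(source.head);
    }

    public static void main(String[] args) {
        Store source = new Store();
        source.head.put("/a", "1");
        source.checkpoint();          // e.g. an async-index checkpoint
        source.head.put("/b", "2");   // changes made since that checkpoint

        Store target = new Store();
        copy(source, target);
        System.out.println(target.checkpoints.size()); // 1
        System.out.println(target.head);               // {/a=1, /b=2}
    }
}
```

In the real thing the "apply" step would be a diff against the previously copied state rather than a wholesale replace, and the new checkpoint's expiration time would be calculated from the source checkpoint's metadata.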

My aim is to reduce the critical path for migrating one NodeStore
(incl JR2) to another. Indexing (especially async indexing) is a
big part of the time, so if I can move that out of the critical path,
it can save a lot of downtime.

My current approach for a migration from JR2 to MongoMK is to:
* copy JR2 to TarMK (TarMK is a lot faster for creating indexes etc.
than MongoMK)
* repeat JR2 to TarMK copy every week or every 24h using incremental
copy. this saves on CommitHook execution time - in theory this can
reduce the time for one run to a single full repository traversal.
* finally on the day when the systems should be switched over, run a
last JR2 to TarMK and then a TarMK to MongoMK copy. this is the
critical path.

Due to the above, copying at least the checkpoint of the async index
will likely speed up the critical path. Of course measuring execution
times will provide the definitive answer to this question.

Regards
Julian

>
> best,
> alex
>
>
> On Wed, Aug 5, 2015 at 3:35 PM, Julian Sedding  wrote:
>
>> Hi all
>>
>> I am working on a scenario, where I need to copy a SegmentNodeStore
>> (TarMK) to a DocumentNodeStore (MongoDB).
>>
>> It is pretty straightforward to simply copy the NodeStore via the
>> API. No problems here.
>>
>> In a recent experiment I successfully copied the NodeStore and got an
>> exception in the logs (stacktrace below the email).
>>
>> My interpretation is that the AsyncIndexUpdate is trying to retrieve
>> the previous checkpoint as stored in /:async/async. Of course this
>> checkpoint is not present in the copied NodeStore and thus cannot be
>> retrieved.
>>
>> IMHO it would be desirable to (optionally) copy the checkpoints as
>> well. In the case of AsyncIndexUpdate, having the checkpoint can save
>> a full re-index.
>>
>> The question that remains is how the internal state of
>> AsyncIndexUpdate should be modified:
>> * implementing the logic in oak-upgrade would be pragmatic, but
>> distributes knowledge about AsyncIndexUpdate implementation details to
>> different modules
>> * having a CommitHook/Editor in oak-core that can be used in
>> oak-upgrade might be cleaner, but would only get used in oak-upgrade
>>
>> Other ideas and opinions regarding this feature are more than welcome!
>>
>> Regards
>> Julian
>>
>>
>> 05.08.2015 00:03:19.133 *ERROR* [pool-6-thread-2]
>> org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception
>> during job execution of
>> org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@471e4b4b :
>> 91f7e218-6cf5-4a44-a324-f094c29898e6
>> java.lang.IllegalArgumentException: 91f7e218-6cf5-4a44-a324-f094c29898e6
>> at
>> org.apache.jackrabbit.oak.plugins.document.Revision.fromString(Revision.java:236)
>> at
>> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.retrieve(DocumentNodeStore.java:1570)
>> at
>> org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpda

Checkpoints and copying NodeStore instances (aka RepositorySidegrade)

2015-08-05 Thread Julian Sedding
Hi all

I am working on a scenario, where I need to copy a SegmentNodeStore
(TarMK) to a DocumentNodeStore (MongoDB).

It is pretty straightforward to simply copy the NodeStore via the
API. No problems here.

In a recent experiment I successfully copied the NodeStore and got an
exception in the logs (stacktrace below the email).

My interpretation is that the AsyncIndexUpdate is trying to retrieve
the previous checkpoint as stored in /:async/async. Of course this
checkpoint is not present in the copied NodeStore and thus cannot be
retrieved.

IMHO it would be desirable to (optionally) copy the checkpoints as
well. In the case of AsyncIndexUpdate, having the checkpoint can save
a full re-index.

The question that remains is how the internal state of
AsyncIndexUpdate should be modified:
* implementing the logic in oak-upgrade would be pragmatic, but
distributes knowledge about AsyncIndexUpdate implementation details to
different modules
* having a CommitHook/Editor in oak-core that can be used in
oak-upgrade might be cleaner, but would only get used in oak-upgrade

Other ideas and opinions regarding this feature are more than welcome!

Regards
Julian


05.08.2015 00:03:19.133 *ERROR* [pool-6-thread-2]
org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception
during job execution of
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@471e4b4b :
91f7e218-6cf5-4a44-a324-f094c29898e6
java.lang.IllegalArgumentException: 91f7e218-6cf5-4a44-a324-f094c29898e6
at 
org.apache.jackrabbit.oak.plugins.document.Revision.fromString(Revision.java:236)
at 
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.retrieve(DocumentNodeStore.java:1570)
at 
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:279)
at 
org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


Re: oak-run 50MB

2015-07-20 Thread Julian Sedding
+1 It sounds sensible to split this up. It seems that it has evolved
into a collection of functionality whose main commonality is that it
is run on the command line. I would like to see the logic used
to bootstrap various Oak setups based on command line parameters
extracted and re-used.

Regards
Julian

On Mon, Jul 20, 2015 at 2:39 PM, Davide Giannella  wrote:
> Hello folks,
>
> oak-run is 50MB+. Shall we look into how to potentially split what this
> module provides in something more precise and maybe delete some parts no
> longer used/supported?
>
> Off the top of my head oak-run provides:
>
> - utility tools for segment, mongo and datastore
> - micro benchmarking
> - scalability benchmarking
> - rough http server
>
> So the question is: shall we look into the details for splitting it into
> smaller modules? I don't think oak-run is an appropriate name either.
>
> Davide
>
>

