Re: How to implement a queue in Oak?
Thanks Jukka,

in my case I have several producers and a single consumer, so your second suggestion should do the trick.

Can you comment on the performance of such a solution? I assume Node#getChildNodeNames() is pretty cheap; what about the addNode, getNode and removeNode methods?

Carsten

2014-08-01 15:03 GMT+02:00 Jukka Zitting :

> Hi,
>
> On Wed, Jul 30, 2014 at 7:27 AM, Carsten Ziegeler wrote:
> > Does this make sense? Is there anything else to be considered?
>
> You didn't mention whether you expect the queue to be concurrently
> accessed.
>
> If there is just a single producer and a single consumer, then you
> could just name the entries using a running sequence number, and
> there's no need for repository-level ordering. For example:
>
>     void produce(Node queue) {
>         long number = queue.getProperty("writeCount").getLong();
>         queue.addNode("entry" + number);
>         queue.setProperty("writeCount", number + 1);
>         queue.getSession().save();
>     }
>
>     // consumes all entries currently in the queue, call again later to continue
>     void consume(Node queue) {
>         long original = queue.getProperty("readCount").getLong();
>         long number = original;
>         while (queue.hasNode("entry" + number)) {
>             queue.getNode("entry" + number++).remove();
>         }
>         if (number != original) {
>             queue.setProperty("readCount", number);
>             queue.getSession().save();
>         }
>     }
>
> If there is a single consumer but multiple concurrent producers, you
> could name the entries using timestamps and do explicit ordering in
> the consumer:
>
>     void produce(Node queue) {
>         queue.addNode("entry-" + System.currentTimeMillis() + "-" + UUID.randomUUID());
>         queue.getSession().save();
>     }
>
>     // consumes all entries currently in the queue, call again later to continue
>     void consume(Node queue) {
>         List<String> names = Lists.newArrayList(queue.getChildNodeNames());
>         Collections.sort(names);
>         for (String name : names) {
>             queue.getNode(name).remove();
>         }
>         queue.getSession().save();
>     }
>
> Note that the above does not guarantee strict ordering between entries
> produced at roughly the same time (typically within a few dozen
> milliseconds). For that you'll need more explicit synchronization
> between the producers. Similarly, supporting concurrent consumers will
> likely require some extra synchronization between the consumers as the
> repository itself does not provide strict guarantees in this area
> (unless you want to rely on the features of specific backends like the
> TarMK).
>
> --
> Jukka Zitting

--
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org
Re: How to implement a queue in Oak?
Hi,

On Wed, Jul 30, 2014 at 7:27 AM, Carsten Ziegeler wrote:
> Does this make sense? Is there anything else to be considered?

You didn't mention whether you expect the queue to be concurrently accessed.

If there is just a single producer and a single consumer, then you could just name the entries using a running sequence number, and there's no need for repository-level ordering. For example:

    void produce(Node queue) {
        long number = queue.getProperty("writeCount").getLong();
        queue.addNode("entry" + number);
        queue.setProperty("writeCount", number + 1);
        queue.getSession().save();
    }

    // consumes all entries currently in the queue, call again later to continue
    void consume(Node queue) {
        long original = queue.getProperty("readCount").getLong();
        long number = original;
        while (queue.hasNode("entry" + number)) {
            queue.getNode("entry" + number++).remove();
        }
        if (number != original) {
            queue.setProperty("readCount", number);
            queue.getSession().save();
        }
    }

If there is a single consumer but multiple concurrent producers, you could name the entries using timestamps and do explicit ordering in the consumer:

    void produce(Node queue) {
        queue.addNode("entry-" + System.currentTimeMillis() + "-" + UUID.randomUUID());
        queue.getSession().save();
    }

    // consumes all entries currently in the queue, call again later to continue
    void consume(Node queue) {
        List<String> names = Lists.newArrayList(queue.getChildNodeNames());
        Collections.sort(names);
        for (String name : names) {
            queue.getNode(name).remove();
        }
        queue.getSession().save();
    }

Note that the above does not guarantee strict ordering between entries produced at roughly the same time (typically within a few dozen milliseconds). For that you'll need more explicit synchronization between the producers.

Similarly, supporting concurrent consumers will likely require some extra synchronization between the consumers, as the repository itself does not provide strict guarantees in this area (unless you want to rely on the features of specific backends like the TarMK).

--
Jukka Zitting
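
The multi-producer variant can be tried out without a repository. The following is only a sketch and not JCR or Oak code: it assumes a concurrent set of names as a stand-in for the unordered child nodes of the queue node, and the class name TimestampQueueSketch, the entryName helper and the zero-padding of the timestamp are illustrative additions that do not appear in the thread. The padding keeps plain string sorting aligned with numeric order even if two timestamps differ in their number of digits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the queue node; NOT the JCR or Oak API.
class TimestampQueueSketch {
    // Unordered set of entry names, playing the role of the child nodes.
    private final Set<String> children = ConcurrentHashMap.newKeySet();

    // Hypothetical helper: the zero-padded timestamp makes string order match
    // numeric order; the UUID avoids name clashes between producers.
    static String entryName(long millis, UUID id) {
        return String.format("entry-%019d-%s", millis, id);
    }

    // Safe to call from several producer threads.
    String produce(long millis) {
        String name = entryName(millis, UUID.randomUUID());
        children.add(name);
        return name;
    }

    // Single consumer: snapshot the names, sort them, remove in that order.
    List<String> consume() {
        List<String> names = new ArrayList<>(children);
        Collections.sort(names);
        for (String name : names) {
            children.remove(name);
        }
        return names;
    }

    public static void main(String[] args) {
        TimestampQueueSketch queue = new TimestampQueueSketch();
        queue.produce(System.currentTimeMillis());
        queue.produce(System.currentTimeMillis());
        System.out.println(queue.consume());
    }
}
```

As in the original snippet, two entries produced within the same millisecond are ordered only by their random UUID, so the sketch gives no strict ordering for near-simultaneous producers either.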
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.4
+1

all checks ok

alex

On Fri, Aug 1, 2014 at 1:10 PM, Chetan Mehrotra wrote:

> +1
>
> All checks ok
> Chetan Mehrotra
>
> On Fri, Aug 1, 2014 at 3:53 PM, Tommaso Teofili wrote:
> > +1
> >
> > Regards,
> > Tommaso
> >
> > 2014-08-01 11:45 GMT+02:00 Thomas Mueller :
> >
> >> A candidate for the Jackrabbit Oak 1.0.4 release is available at:
> >>
> >>     https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.4/
> >>
> >> The release candidate is a zip archive of the sources in:
> >>
> >>     https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.4/
> >>
> >> The SHA1 checksum of the archive is
> >> 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e.
> >>
> >> A staged Maven repository is available for review at:
> >>
> >>     https://repository.apache.org/
> >>
> >> The command for running automated checks against this release candidate is:
> >>
> >>     $ sh check-release.sh oak 1.0.4 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e
> >>
> >> Please vote on releasing this package as Apache Jackrabbit Oak 1.0.4.
> >> The vote is open for the next 72 hours and passes if a majority of at
> >> least three +1 Jackrabbit PMC votes are cast.
> >>
> >>     [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.4
> >>     [ ] -1 Do not release this package because...
> >>
> >> My vote is +1
> >>
> >> Regards
> >> Thomas
Re: How to implement a queue in Oak?
There are the utilities in the org.apache.jackrabbit.commons.flat package, which were built for mapping flat structures to a JCR hierarchy. See the BTreeManager class for a good starting point.

Michael

On 1.8.14 7:56 , Carsten Ziegeler wrote:

I'm wondering if anyone has a good idea how to model a queue with efficient operations in JCR - or is JCR not suited for this use case?

Regards
Carsten

2014-07-30 15:57 GMT+02:00 Carsten Ziegeler :

Using a different storage than JCR would be easy in my case, however I *want* to use JCR

Carsten

2014-07-30 14:55 GMT+02:00 Lukas Smith :

Hi,

I can totally see that it might be useful to be able to go through the Oak/JCR API to have a queue but maybe this is stretching Oak a bit far if you end up with 1k+ queues.

However I think it would be great to look more into federation for this. I think ModeShape supports this quite well already, i.e. being able to hook in another JCR tree, a file system, a git repository, CMIS .. I am sure that it would also be possible to implement on top of some MQ standard.

see also https://docs.jboss.org/author/display/MODE/Federation?_sscc=t

regards,
Lukas

On 30 Jul 2014, at 14:41, Angela Schreiber wrote:

hi carsten

if you are expecting your nodes to be in a given order (e.g. the order of creation) you need to have a parent that has orderable children... in which case we don't make any promises about huge child collections... it will not work well.

if you don't have the requirement of ordered children, you can have _many_ but need to make sure that your parent node doesn't have orderable children (e.g. oak:Unstructured)... but then you cannot expect that new children are appended at the "end of the list"... there is no list and there is no guaranteed order.

i guess you have a little misunderstanding when it comes to the concept of orderable child nodes -> JSR 283 will be your friend.

regards
angela

On 30/07/14 13:27, "Carsten Ziegeler" wrote:

Hi,

afaik with Oak the too many child nodes problem of JR2 is solved, therefore I'm wondering what the best way to store a queue in the repository is?

In my use cases, there are usually not many items within a single queue, let's say a few hundred. In some cases the queue might grow to some thousands but not more than maybe 20k.

The idea is that new entries (nodes) are added to the end of the queue, and processing would read the first node from the queue, update the properties and once done, remove it.

My initial design was to simply store all entries as sub nodes of some queue root entry without any hierarchy. addNode should add them at the end and simply iterating over the child nodes of the root gives the first entry. No need for sortable nodes.

Does this make sense? Is there anything else to be considered?

Regards
Carsten

--
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org

--
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org
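
The difference Angela describes between orderable and non-orderable children can be pictured with plain Java maps. This is an analogy only, under the assumption that a LinkedHashMap stands in for a parent with orderable children and a HashMap for one without; it says nothing about how Oak actually stores child nodes, and the class name ChildOrderSketch and its helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Analogy only; NOT how Oak or JCR store child nodes.
class ChildOrderSketch {
    // Adds the names as "children" and returns them in iteration order.
    static List<String> addAndList(Map<String, Object> children, String... names) {
        for (String name : names) {
            children.put(name, new Object());
        }
        return new ArrayList<>(children.keySet());
    }

    public static void main(String[] args) {
        // "Orderable" parent: iteration follows insertion order.
        System.out.println(addAndList(new LinkedHashMap<>(), "c", "a", "b"));
        // "Non-orderable" parent: same children, but no promise about order.
        System.out.println(addAndList(new HashMap<>(), "c", "a", "b"));
    }
}
```

With the second kind of parent, a consumer that needs "the first entry" has to derive the order itself, for example from the entry names.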
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.4
+1

All checks ok

Chetan Mehrotra

On Fri, Aug 1, 2014 at 3:53 PM, Tommaso Teofili wrote:

> +1
>
> Regards,
> Tommaso
>
> 2014-08-01 11:45 GMT+02:00 Thomas Mueller :
>
>> A candidate for the Jackrabbit Oak 1.0.4 release is available at:
>>
>>     https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.4/
>>
>> The release candidate is a zip archive of the sources in:
>>
>>     https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.4/
>>
>> The SHA1 checksum of the archive is
>> 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e.
>>
>> A staged Maven repository is available for review at:
>>
>>     https://repository.apache.org/
>>
>> The command for running automated checks against this release candidate is:
>>
>>     $ sh check-release.sh oak 1.0.4 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e
>>
>> Please vote on releasing this package as Apache Jackrabbit Oak 1.0.4.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Jackrabbit PMC votes are cast.
>>
>>     [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.4
>>     [ ] -1 Do not release this package because...
>>
>> My vote is +1
>>
>> Regards
>> Thomas
Re: Order of property restrictions in Query Filter
Hi,

Sorry, I don't understand. In my view, the queries

    select [jcr:path] from [nt:base] where id = '1' and x = '2'

and

    select [jcr:path] from [nt:base] where x = '2' and id = '1'

are equivalent (it doesn't matter in which order the conditions were written). If a certain *index* is faster when it uses id or x first, then that's up to the index to decide. But the application developer (who wrote the query) wouldn't know that. It would depend on the data, and the data might change.

Regards,
Thomas

On 31/07/14 17:15, "Chetan Mehrotra" wrote:

> Suppose we have a query like
>
>     select [jcr:path]
>       from [nt:base]
>       where id = '1' and x = '2'
>
> Currently the property restrictions are maintained as a HashMap in
> FilterImpl so above ordering information would be lost.
>
> Such ordering information might be useful when querying against Lucene
> index. The Boolean query created would maintain the order and might be
> faster if the result from first clause is small.
>
> Would it make sense to retain the order of property restrictions?
>
> Chetan Mehrotra
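
Thomas's argument can be illustrated with a toy filter. The sketch below is not Oak's FilterImpl or query engine: it assumes rows are plain maps and restrictions are simple equality checks, and the class RestrictionOrderSketch with its row, select and bySelectivity helpers is hypothetical. It shows that the written order of the conditions does not change the result, and that an index could pick its own evaluation order from the data.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a query filter; NOT Oak's FilterImpl or query engine.
class RestrictionOrderSketch {

    // Hypothetical helper that builds a "row" with the two properties used here.
    static Map<String, String> row(String id, String x) {
        Map<String, String> r = new HashMap<>();
        r.put("id", id);
        r.put("x", x);
        return r;
    }

    // Returns the rows matching ALL restrictions (property -> required value).
    static List<Map<String, String>> select(List<Map<String, String>> rows,
                                            Map<String, String> restrictions) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> r : rows) {
            boolean ok = true;
            for (Map.Entry<String, String> e : restrictions.entrySet()) {
                if (!e.getValue().equals(r.get(e.getKey()))) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                result.add(r);
            }
        }
        return result;
    }

    // What an index could do on its own: put the restriction that matches the
    // fewest rows first, whatever order the query was written in.
    static List<String> bySelectivity(List<Map<String, String>> rows,
                                      Map<String, String> restrictions) {
        Map<String, Long> matchCounts = new HashMap<>();
        for (Map.Entry<String, String> e : restrictions.entrySet()) {
            long count = 0;
            for (Map<String, String> r : rows) {
                if (e.getValue().equals(r.get(e.getKey()))) {
                    count++;
                }
            }
            matchCounts.put(e.getKey(), count);
        }
        List<String> properties = new ArrayList<>(restrictions.keySet());
        properties.sort(Comparator.comparingLong((String p) -> matchCounts.get(p)));
        return properties;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(row("1", "2"));
        rows.add(row("3", "2"));
        rows.add(row("4", "2"));
        rows.add(row("1", "9"));

        Map<String, String> idFirst = new LinkedHashMap<>();
        idFirst.put("id", "1");
        idFirst.put("x", "2");
        Map<String, String> xFirst = new LinkedHashMap<>();
        xFirst.put("x", "2");
        xFirst.put("id", "1");

        System.out.println(select(rows, idFirst).equals(select(rows, xFirst)));
        System.out.println(bySelectivity(rows, xFirst));
    }
}
```

Because the preferred order is computed from the rows, it changes when the data changes, which is the reason given above for not tying it to the order in the query text.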
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.4
+1

Regards,
Tommaso

2014-08-01 11:45 GMT+02:00 Thomas Mueller :

> A candidate for the Jackrabbit Oak 1.0.4 release is available at:
>
>     https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.4/
>
> The release candidate is a zip archive of the sources in:
>
>     https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.4/
>
> The SHA1 checksum of the archive is
> 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e.
>
> A staged Maven repository is available for review at:
>
>     https://repository.apache.org/
>
> The command for running automated checks against this release candidate is:
>
>     $ sh check-release.sh oak 1.0.4 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e
>
> Please vote on releasing this package as Apache Jackrabbit Oak 1.0.4.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.
>
>     [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.4
>     [ ] -1 Do not release this package because...
>
> My vote is +1
>
> Regards
> Thomas
[VOTE] Release Apache Jackrabbit Oak 1.0.4
A candidate for the Jackrabbit Oak 1.0.4 release is available at:

    https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.4/

The release candidate is a zip archive of the sources in:

    https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.4/

The SHA1 checksum of the archive is 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e.

A staged Maven repository is available for review at:

    https://repository.apache.org/

The command for running automated checks against this release candidate is:

    $ sh check-release.sh oak 1.0.4 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e

Please vote on releasing this package as Apache Jackrabbit Oak 1.0.4. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast.

    [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.4
    [ ] -1 Do not release this package because...

My vote is +1

Regards
Thomas
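
For readers who want to see what comparing an archive against a published SHA1 checksum involves, here is a minimal sketch. It is not the check-release.sh script, which is not shown in this thread; the class name Sha1CheckSketch and its methods are hypothetical, and the file path and expected checksum are taken from the command line.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative checksum comparison; NOT the check-release.sh script.
class Sha1CheckSketch {
    // Returns the SHA-1 digest of the data as lowercase hex.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Compares the computed digest with the published one, ignoring case.
    static boolean matches(byte[] data, String expectedHex) {
        return sha1Hex(data).equalsIgnoreCase(expectedHex.trim());
    }

    // Usage: java Sha1CheckSketch <archive-file> <expected-sha1>
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: Sha1CheckSketch <file> <expected-sha1>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matches(data, args[1]) ? "checksum ok" : "checksum MISMATCH");
    }
}
```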
Re: How to implement a queue in Oak?
I don't have a proper proposal yet, but one option might be something similar to what Davide did for the ordered index (using a skip list): define a specific node structure, together with a specific implementation, that avoids sortable node types but is inherently sorted, e.g. a binary search tree.

Any opinions?

Regards,
Tommaso

2014-08-01 7:56 GMT+02:00 Carsten Ziegeler :

> I'm wondering if anyone has a good idea how to model a queue with efficient
> operations in JCR - or is JCR not suited for this use case?
>
> Regards
> Carsten
>
> 2014-07-30 15:57 GMT+02:00 Carsten Ziegeler :
>
> > Using a different storage than JCR would be easy in my case, however I
> > *want* to use JCR
> >
> > Carsten
> >
> > 2014-07-30 14:55 GMT+02:00 Lukas Smith :
> >
> >> Hi,
> >>
> >> I can totally see that it might be useful to be able to go through the
> >> Oak/JCR API to have a queue but maybe this is stretching Oak a bit far if
> >> you end up with 1k+ queues.
> >>
> >> However I think it would be great to look more into federation for this.
> >> I think ModeShape supports this quite well already, i.e. being able to hook
> >> in another JCR tree, a file system, a git repository, CMIS .. I am sure
> >> that it would also be possible to implement on top of some MQ standard.
> >>
> >> see also https://docs.jboss.org/author/display/MODE/Federation?_sscc=t
> >>
> >> regards,
> >> Lukas
> >>
> >> > On 30 Jul 2014, at 14:41, Angela Schreiber wrote:
> >> >
> >> > hi carsten
> >> >
> >> > if you are expecting your nodes to be in a given order (e.g. the
> >> > order of creation) you need to have a parent that has orderable
> >> > children... in which case we don't make any promises about huge
> >> > child collections... it will not work well.
> >> >
> >> > if you don't have the requirement of ordered children, you can
> >> > have _many_ but need to make sure that your parent node doesn't
> >> > have orderable children (e.g. oak:Unstructured)... but then you
> >> > cannot expect that new children are appended at the "end of the
> >> > list"... there is no list and there is no guaranteed order.
> >> >
> >> > i guess you have a little misunderstanding when it comes to
> >> > the concept of orderable child nodes -> JSR 283 will be your friend.
> >> >
> >> > regards
> >> > angela
> >> >
> >> >> On 30/07/14 13:27, "Carsten Ziegeler" wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> afaik with Oak the too many child nodes problem of JR2 is solved,
> >> >> therefore I'm wondering what the best way to store a queue in the
> >> >> repository is?
> >> >>
> >> >> In my use cases, there are usually not many items within a single
> >> >> queue, let's say a few hundred. In some cases the queue might grow to
> >> >> some thousands but not more than maybe 20k.
> >> >>
> >> >> The idea is that new entries (nodes) are added to the end of the queue,
> >> >> and processing would read the first node from the queue, update the
> >> >> properties and once done, remove it.
> >> >>
> >> >> My initial design was to simply store all entries as sub nodes of some
> >> >> queue root entry without any hierarchy. addNode should add them at the
> >> >> end and simply iterating over the child nodes of the root gives the
> >> >> first entry. No need for sortable nodes.
> >> >>
> >> >> Does this make sense? Is there anything else to be considered?
> >> >>
> >> >> Regards
> >> >> Carsten
> >> >> --
> >> >> Carsten Ziegeler
> >> >> Adobe Research Switzerland
> >> >> cziege...@apache.org
> >
> > --
> > Carsten Ziegeler
> > Adobe Research Switzerland
> > cziege...@apache.org
>
> --
> Carsten Ziegeler
> Adobe Research Switzerland
> cziege...@apache.org
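
To make the "inherently sorted structure" idea concrete, here is an in-memory sketch built on the JDK's skip-list map. It is an analogy under stated assumptions, not Davide's ordered index and not a proposal for how the nodes would be laid out in the repository: the class SortedQueueSketch and its methods are hypothetical, and the shared AtomicLong used for keys is exactly the kind of central counter that concurrent writers in a repository do not get for free.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory analogy of a queue kept in an inherently sorted structure;
// NOT Oak's ordered index and NOT a repository node layout.
class SortedQueueSketch {
    // The skip list keeps entries sorted by key at all times.
    private final ConcurrentSkipListMap<Long, String> entries = new ConcurrentSkipListMap<>();
    // Assumed central sequence; a real repository has no such shared counter.
    private final AtomicLong sequence = new AtomicLong();

    // Appends a payload and returns the key it was stored under.
    long add(String payload) {
        long key = sequence.getAndIncrement();
        entries.put(key, payload);
        return key;
    }

    // Removes and returns the oldest payload, or null if the queue is empty.
    String poll() {
        Map.Entry<Long, String> first = entries.pollFirstEntry();
        return first == null ? null : first.getValue();
    }

    public static void main(String[] args) {
        SortedQueueSketch queue = new SortedQueueSketch();
        queue.add("first job");
        queue.add("second job");
        System.out.println(queue.poll());
        System.out.println(queue.poll());
    }
}
```

The consumer never sorts anything here; taking the head of the structure is enough, which is the property a sorted node structure would aim to provide.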