Re: How to implement a queue in Oak?

2014-08-01 Thread Carsten Ziegeler
Thanks Jukka,

in my case I have several producers and a single consumer, so your second
suggestion should do the trick.
Can you comment on the performance of such a solution? I assume listing
the child node names is pretty cheap, but what about adding, getting and
removing nodes (addNode, getNode, remove)?

Carsten


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org


Re: How to implement a queue in Oak?

2014-08-01 Thread Jukka Zitting
Hi,

On Wed, Jul 30, 2014 at 7:27 AM, Carsten Ziegeler  wrote:
> Does this make sense? Is there anything else to be considered?

You didn't mention whether you expect the queue to be concurrently accessed.

If there is just a single producer and a single consumer, then you
could just name the entries using a running sequence number, and
there's no need for repository-level ordering. For example:

import javax.jcr.Node;
import javax.jcr.RepositoryException;

void produce(Node queue) throws RepositoryException {
    // the next free sequence number is kept on the queue node itself
    long number = queue.getProperty("writeCount").getLong();
    queue.addNode("entry" + number);
    queue.setProperty("writeCount", number + 1);
    queue.getSession().save();
}

// consumes all entries currently in the queue, call again later to continue
void consume(Node queue) throws RepositoryException {
    long original = queue.getProperty("readCount").getLong();
    long number = original;
    while (queue.hasNode("entry" + number)) {
        queue.getNode("entry" + number++).remove();
    }
    if (number != original) {
        queue.setProperty("readCount", number);
        queue.getSession().save();
    }
}
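
To make the counter bookkeeping concrete, here is a minimal in-memory sketch
of the same idea. It is not JCR code: a plain HashMap stands in for the
children of the queue node, and the class name SequenceQueueSketch and its
String payloads are assumptions made up for this illustration. It only shows
how writeCount and readCount chase each other, not repository behaviour such
as save() or persistence.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for the queue node (hypothetical, not JCR). */
class SequenceQueueSketch {
    // child entries keyed by name: "entry0", "entry1", ...
    private final Map<String, String> children = new HashMap<>();
    private long writeCount = 0; // next sequence number to write
    private long readCount = 0;  // next sequence number to read

    /** Producer side: store the payload under the next sequence number. */
    void produce(String payload) {
        children.put("entry" + writeCount, payload);
        writeCount++;
    }

    /** Consumer side: remove and return all entries present, in sequence order. */
    List<String> consume() {
        List<String> consumed = new ArrayList<>();
        while (children.containsKey("entry" + readCount)) {
            consumed.add(children.remove("entry" + readCount));
            readCount++;
        }
        return consumed;
    }
}
```

Calling produce("a"), produce("b") and then consume() returns the two
payloads in the order they were written; a later consume() picks up at
readCount, so entries added in between are not lost.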

If there is a single consumer but multiple concurrent producers, you
could name the entries using timestamps and do explicit ordering in
the consumer:

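import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import javax.jcr.Node;
import javax.jcr.NodeIterator;
import javax.jcr.RepositoryException;

void produce(Node queue) throws RepositoryException {
    queue.addNode("entry-" + System.currentTimeMillis() + "-" +
            UUID.randomUUID());
    queue.getSession().save();
}

// consumes all entries currently in the queue, call again later to continue
void consume(Node queue) throws RepositoryException {
    // collect the child node names, then sort them by timestamp prefix
    List<String> names = new ArrayList<String>();
    NodeIterator entries = queue.getNodes();
    while (entries.hasNext()) {
        names.add(entries.nextNode().getName());
    }
    Collections.sort(names);
    for (String name : names) {
        queue.getNode(name).remove();
    }
    queue.getSession().save();
}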

Note that the above does not guarantee strict ordering between entries
produced at roughly the same time (typically within a few tens of
milliseconds). For that you'll need more explicit synchronization
between the producers. Similarly, supporting concurrent consumers will
likely require some extra synchronization between the consumers as the
repository itself does not provide strict guarantees in this area
(unless you want to rely on the features of specific backends like the
TarMK).
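
A small self-contained sketch of why sorting the names gives roughly
chronological order, and where it stops doing so. The class EntryNames and
its helper methods are hypothetical, introduced only for this example; the
name format is the one used in produce() above. The sort is a plain string
sort, so it matches timestamp order only because System.currentTimeMillis()
values currently all have the same number of digits. Two entries created in
the same millisecond are ordered by their random UUID, i.e. arbitrarily.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

/** Hypothetical helpers around the "entry-<millis>-<uuid>" naming scheme. */
class EntryNames {
    /** Builds a name in the same form as the producer above. */
    static String entryName(long millis, UUID id) {
        return "entry-" + millis + "-" + id;
    }

    /** Extracts the millisecond timestamp back out of such a name. */
    static long timestampOf(String name) {
        int start = "entry-".length();
        int end = name.indexOf('-', start);
        return Long.parseLong(name.substring(start, end));
    }

    /** Returns a sorted copy, i.e. the order in which the consumer would process the names. */
    static List<String> inConsumeOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        return sorted;
    }
}
```

If strict ordering within one millisecond matters, the timestamp alone is not
enough and the producers need to coordinate, as noted above.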

-- 
Jukka Zitting


Re: [VOTE] Release Apache Jackrabbit Oak 1.0.4

2014-08-01 Thread Alex Parvulescu
+1 all checks ok

alex


Re: How to implement a queue in Oak?

2014-08-01 Thread Michael Dürig


There are the utilities in the org.apache.jackrabbit.commons.flat 
package, which were built for mapping flat structures to a JCR 
hierarchy. See the BTreeManager class for a good starting point.


Michael



On 1.8.14 7:56 , Carsten Ziegeler wrote:
> I'm wondering if anyone has a good idea how to model a queue with efficient
> operations in JCR - or is JCR not suited for this use case?
>
> Regards
> Carsten
>
> 2014-07-30 15:57 GMT+02:00 Carsten Ziegeler :
>> Using a different storage than JCR would be easy in my case, however I
>> *want* to use JCR
>>
>> Carsten
>>
>> 2014-07-30 14:55 GMT+02:00 Lukas Smith :
>>> Hi,
>>>
>>> I can totally see that it might be useful to be able to go through the
>>> Oak/JCR API to have a queue but maybe this is stretching Oak a bit far if
>>> you end up with 1k+ queues.
>>>
>>> However I think it would be great to look more into federation for this.
>>> I think ModeShape supports this quite well already, i.e. being able to hook
>>> in another JCR tree, a file system, a git repository, CMIS .. I am sure
>>> that it would also be possible to implement on top of some MQ standard.
>>>
>>> see also https://docs.jboss.org/author/display/MODE/Federation?_sscc=t
>>>
>>> regards,
>>> Lukas
>>>
>>> On 30 Jul 2014, at 14:41, Angela Schreiber  wrote:
>>>> hi carsten
>>>>
>>>> if you are expecting your nodes to be in a given order (e.g. the
>>>> order of creation) you need to have a parent that has orderable
>>>> children... in which case we don't make any promises about huge
>>>> child collections... it will not work well.
>>>>
>>>> if you don't have the requirement of ordered children, you can
>>>> have _many_ but need to make sure that your parent node doesn't
>>>> have orderable children (e.g. oak:Unstructured)... but then you
>>>> cannot expect that new children are appended at the "end of the
>>>> list"... there is no list and there is no guaranteed order.
>>>>
>>>> i guess you have a little misunderstanding when it comes to
>>>> the concept of orderable child nodes -> JSR 283 will be your friend.
>>>>
>>>> regards
>>>> angela
>>>>
>>>> On 30/07/14 13:27, "Carsten Ziegeler"  wrote:
>>>>> Hi,
>>>>>
>>>>> afaik with Oak the too many child nodes problem of JR2 is solved,
>>>>> therefore I'm wondering what the best way to store a queue in the
>>>>> repository is?
>>>>>
>>>>> In my use cases, there are usually not many items within a single
>>>>> queue, let's say a few hundred. In some cases the queue might grow to
>>>>> some thousands but not more than maybe 20k.
>>>>>
>>>>> The idea is that new entries (nodes) are added to the end of the queue,
>>>>> and processing would read the first node from the queue, update the
>>>>> properties and once done, remove it.
>>>>>
>>>>> My initial design was to simply store all entries as sub nodes of some
>>>>> queue root entry without any hierarchy. addNode should add them at the
>>>>> end and simple iteration over the child nodes of the root gives the
>>>>> first entry. No need for sortable nodes.
>>>>>
>>>>> Does this make sense? Is there anything else to be considered?
>>>>>
>>>>> Regards
>>>>> Carsten
>>>>> --
>>>>> Carsten Ziegeler
>>>>> Adobe Research Switzerland
>>>>> cziege...@apache.org


Re: [VOTE] Release Apache Jackrabbit Oak 1.0.4

2014-08-01 Thread Chetan Mehrotra
+1

All checks ok
Chetan Mehrotra


Re: Order of property restrictions in Query Filter

2014-08-01 Thread Thomas Mueller
Hi,

Sorry, I don't understand. In my view, the queries

select [jcr:path] from [nt:base] where id = '1' and x = '2'

and

select [jcr:path] from [nt:base] where x = '2' and id = '1'

are equivalent (it doesn't matter in which order the conditions were
written). If a certain *index* is faster when it uses id or x first, then
that's up to the index to decide. But the application developer (who wrote
the query) wouldn't know that. It would depend on the data, and the data
might change.
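
To illustrate the point that evaluation order can be chosen independently of
how the query was written, here is a hedged sketch. It is not Oak's query
engine or the Lucene index code: the class RestrictionOrdering, the method
evaluationOrder and the idea of a per-property match-count estimate are
assumptions invented for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: picks an evaluation order from estimates, not from query text. */
class RestrictionOrdering {
    /**
     * Returns the restricted property names ordered by estimated match count,
     * smallest first. Properties without an estimate go last; ties are broken
     * by name. The order in which the query listed them plays no role.
     */
    static List<String> evaluationOrder(Map<String, String> restrictions,
                                        Map<String, Long> estimatedMatches) {
        List<String> names = new ArrayList<>(restrictions.keySet());
        names.sort(Comparator
                .comparingLong((String n) -> estimatedMatches.getOrDefault(n, Long.MAX_VALUE))
                .thenComparing(Comparator.<String>naturalOrder()));
        return names;
    }
}
```

With made-up estimates of 1 match for id and 5000 for x, both ways of writing
the query above yield the order id, x; if the data changes and the estimates
flip, so does the order, without the query being touched.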


Regards,
Thomas






On 31/07/14 17:15, "Chetan Mehrotra"  wrote:

>Suppose we have a query like
>
>select [jcr:path]
>  from [nt:base]
>  where id = '1' and x = '2'
>
>Currently the property restrictions are maintained as a HashMap in
>FilterImpl, so the ordering information above would be lost.
>
>Such ordering information might be useful when querying against the Lucene
>index. The Boolean query created would maintain the order and might be
>faster if the result from the first clause is small.
>
>Would it make sense to retain the order of property restrictions?
>
>Chetan Mehrotra



Re: [VOTE] Release Apache Jackrabbit Oak 1.0.4

2014-08-01 Thread Tommaso Teofili
+1

Regards,
Tommaso


[VOTE] Release Apache Jackrabbit Oak 1.0.4

2014-08-01 Thread Thomas Mueller
A candidate for the Jackrabbit Oak 1.0.4 release is available at:

https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.4/

The release candidate is a zip archive of the sources in:


https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.4/

The SHA1 checksum of the archive is
84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e.

A staged Maven repository is available for review at:

https://repository.apache.org/

The command for running automated checks against this release candidate is:

$ sh check-release.sh oak 1.0.4 84b2b31d9bff0159a2a7ab22f4af3c5d08b0e81e

Please vote on releasing this package as Apache Jackrabbit Oak 1.0.4.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 Jackrabbit PMC votes are cast.

[ ] +1 Release this package as Apache Jackrabbit Oak 1.0.4
[ ] -1 Do not release this package because...

My vote is +1

Regards
Thomas



Re: How to implement a queue in Oak?

2014-08-01 Thread Tommaso Teofili
I don't have a proper proposal yet, but one option might be something similar
to what Davide did for the ordered index (which uses a skip list): define a
specific structure of the nodes, together with a matching implementation,
that is inherently sorted (e.g. a binary search tree) and so avoids having to
use sortable node types.
Any opinions?

Regards,
Tommaso

