Re: Running background operation on a single node in a Oak cluster

2013-12-02 Thread Chetan Mehrotra
On Wed, Nov 20, 2013 at 11:41 PM, Jukka Zitting wrote:
> Yes, sounds like a good candidate. The only additional bit we'd need
> is a timestamp that allows indexers on other cluster nodes to
> automatically resume processing if an active indexing task dies for
> whatever reason without a chance to clear the flag.

I implemented this logic in OAK-1246.

Chetan Mehrotra


Re: Running background operation on a single node in a Oak cluster

2013-11-20 Thread Jukka Zitting
Hi,

On Wed, Nov 20, 2013 at 1:00 PM, Alex Parvulescu wrote:
> about the flag, we already set the status info ("async-status" =
> "running"), ideally we could just use that. [0]

Yes, sounds like a good candidate. The only additional bit we'd need
is a timestamp that allows indexers on other cluster nodes to
automatically resume processing if an active indexing task dies for
whatever reason without a chance to clear the flag.
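A minimal sketch of the flag-plus-timestamp idea, under stated assumptions: the class `IndexLease`, its methods, and the lease length are hypothetical names chosen for illustration, and the `AtomicReference` merely stands in for a property stored in the shared repository. This is not the code from OAK-1246.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical sketch of a "running" flag that carries an expiry timestamp.
 * The AtomicReference stands in for shared repository state; not Oak's API.
 */
final class IndexLease {

    /** Immutable flag value: which node holds the lease and until when. */
    static final class Holder {
        final String nodeId;
        final long expiresAt;

        Holder(String nodeId, long expiresAt) {
            this.nodeId = nodeId;
            this.expiresAt = expiresAt;
        }
    }

    private final AtomicReference<Holder> flag = new AtomicReference<>();
    private final long leaseMillis;

    IndexLease(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /**
     * Claims the flag if it is free, expired, or already held by this node.
     * An expired flag means the previous holder died without clearing it.
     */
    boolean tryAcquire(String nodeId, long now) {
        Holder current = flag.get();
        boolean heldByOther = current != null
                && current.expiresAt > now
                && !current.nodeId.equals(nodeId);
        if (heldByOther) {
            return false;
        }
        // compareAndSet fails if another node changed the flag in between
        return flag.compareAndSet(current, new Holder(nodeId, now + leaseMillis));
    }

    /** Clears the flag on normal completion, but only if this node holds it. */
    void release(String nodeId) {
        Holder current = flag.get();
        if (current != null && current.nodeId.equals(nodeId)) {
            flag.compareAndSet(current, null);
        }
    }
}
```

A running indexer would call `tryAcquire` again before the lease runs out to extend it; if it dies, the timestamp passes and any other node can take over without manual cleanup.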

BR,

Jukka Zitting


Re: Running background operation on a single node in a Oak cluster

2013-11-20 Thread Alex Parvulescu
hi,

about the flag, we already set the status info ("async-status" =
"running"), ideally we could just use that. [0]

best,
alex


[0]
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java?view=markup#l156


Re: Running background operation on a single node in a Oak cluster

2013-11-20 Thread Jukka Zitting
Hi,

On Wed, Nov 20, 2013 at 12:09 PM, Marcel Reutegger wrote:
>> The async index update is designed so that it should work correctly
>> even when run concurrently on multiple cluster nodes. It uses
>> optimistic locking to prevent conflicting updates.
>
> I'm confident this works, but I'm a bit concerned about duplicate work
> being done. Doesn't the probability of concurrent updates increase
> with every cluster node we add?

The idea behind the design was that we'd make the update interval
dependent on the number of cluster nodes. I.e. instead of an interval
of t seconds that's independent of the cluster size, we'd configure
the index update interval to (roughly) n*t, where n is the size of the
cluster. That way an index update would on average get triggered
once every t seconds across the cluster.
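The arithmetic can be sketched as follows. The class and method names are hypothetical, and the numbers in the usage note are only an example: with a target of t = 5 seconds and n = 3 nodes, each node runs every 15 seconds, so the cluster as a whole still sees one update roughly every 5 seconds.

```java
/**
 * Hypothetical helper illustrating the n*t interval scaling.
 * Not part of Oak; names are made up for this sketch.
 */
final class IndexInterval {

    /** Per-node interval n*t for a cluster-wide target of t seconds. */
    static long perNodeIntervalSeconds(long targetSeconds, int clusterSize) {
        if (targetSeconds <= 0 || clusterSize <= 0) {
            throw new IllegalArgumentException("target and cluster size must be positive");
        }
        return Math.multiplyExact(targetSeconds, (long) clusterSize);
    }

    /** Average time between updates across the cluster when every node uses the same interval. */
    static double clusterWideMeanSeconds(long perNodeSeconds, int clusterSize) {
        return (double) perNodeSeconds / clusterSize;
    }
}
```

For example, `IndexInterval.perNodeIntervalSeconds(5, 3)` gives 15, and `IndexInterval.clusterWideMeanSeconds(15, 3)` gives 5.0. The average only holds if the nodes' schedules are not synchronized with each other.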

> A while ago I saw them more frequently. See
> https://issues.apache.org/jira/browse/OAK-1166 for more details.
> After https://issues.apache.org/jira/browse/OAK-1198 they are now
> less frequent.

OK, thanks for the pointers.

> But even if there are no warnings it may still mean
> unnecessary work is done and then discarded. Though, I understand
> most of the discarded lucene changes were already persisted to a
> branch and will have to be garbage collected. Is this correct?

Correct. I'd expect this to be a problem only during large imports
when big index updates are needed (and when the likelihood of
concurrent work is much increased). The flag I proposed in the earlier
message should help with such cases.

BR,

Jukka Zitting


RE: Running background operation on a single node in a Oak cluster

2013-11-20 Thread Marcel Reutegger
Hi,

> The async index update is designed so that it should work correctly
> even when run concurrently on multiple cluster nodes. It uses
> optimistic locking to prevent conflicting updates.

I'm confident this works, but I'm a bit concerned about duplicate work
being done. Doesn't the probability of concurrent updates increase
with every cluster node we add?

> Thus I'd rather not limit the async index update to just one node, and
> instead leave it running throughout the cluster. The update contains
> code that's designed to automatically detect conflicting updates and
> suppress any warnings about them. Do you still see conflicts being
> logged?

A while ago I saw them more frequently. See
https://issues.apache.org/jira/browse/OAK-1166 for more details.
After https://issues.apache.org/jira/browse/OAK-1198 they are now
less frequent. But even if there are no warnings it may still mean
unnecessary work is done and then discarded. Though, I understand
most of the discarded lucene changes were already persisted to a
branch and will have to be garbage collected. Is this correct?

Regards
 Marcel



Re: Running background operation on a single node in a Oak cluster

2013-11-20 Thread Jukka Zitting
Hi,

On Wed, Nov 20, 2013 at 5:33 AM, Chetan Mehrotra wrote:
> Would it make sense to make use of this feature to run the Async Index
> Update only on one node? It would also help in avoiding the conflict
> while concurrently updating the index related data in cluster.

The async index update is designed so that it should work correctly
even when run concurrently on multiple cluster nodes. It uses
optimistic locking to prevent conflicting updates.
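As a generic illustration of optimistic locking (not the actual AsyncIndexUpdate code), here is a sketch in which each updater remembers the checkpoint it started from and commits only if that checkpoint is still current. The class `OptimisticCheckpoint` and the revision strings are assumptions made for this example, and the `AtomicReference` stands in for shared repository state.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical sketch of optimistic locking on an index checkpoint.
 * Not Oak's implementation; names and values are illustrative only.
 */
final class OptimisticCheckpoint {

    private final AtomicReference<String> checkpoint = new AtomicReference<>("r0");

    /** Returns the checkpoint an updater should base its work on. */
    String read() {
        return checkpoint.get();
    }

    /**
     * Commits only if nobody else has moved the checkpoint since read().
     * compareAndSet compares references, so pass the exact value read() returned.
     * A false result means the work was duplicated and should be discarded.
     */
    boolean commit(String expectedBase, String newCheckpoint) {
        return checkpoint.compareAndSet(expectedBase, newCheckpoint);
    }
}
```

If two nodes both read `r0` and index the same changes, the first `commit` succeeds and the second returns false. The result stays correct, but the losing node's work is wasted, which is the duplicate-work cost raised in this thread.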

Thus I'd rather not limit the async index update to just one node, and
instead leave it running throughout the cluster. The update contains
code that's designed to automatically detect conflicting updates and
suppress any warnings about them. Do you still see conflicts being
logged?

The only case I know where having concurrent updaters is troublesome
is when a large import is being indexed, as that would cause lots of
duplicate and unnecessary work on all cluster nodes. Perhaps we should
add an extra flag in the index that the updater can use to signal to
other cluster nodes that the latest set of changes is already being
processed.

BR,

Jukka Zitting