Re: Asynchronous indexing consistency

2013-06-12 Jukka Zitting
Hi, After some thought and discussions about this, it seems to me that the tree copy approach is probably too complex to implement. Instead I'd like to propose a "checkpoint" mechanism that allows a client to request a state of the repository to be excluded from garbage collection for a specified

Re: Asynchronous indexing consistency

2013-06-04 Thomas Mueller
Hi, >A complication here is the way the current index configurations nodes >are structured, as we'd have to copy the root to something like >/oak:index/someIndex/:before and without careful pre- or >post-processing that would end up creating a recursive sequence of >past revisions at >/oak:index/s

Re: Asynchronous indexing consistency

2013-05-30 Jukka Zitting
Hi, On Wed, May 29, 2013 at 5:26 PM, Thomas Mueller wrote: > What I meant by "reading old revisions" is that we need a way to read old > revisions. You have suggested to use the copy operation to do that, which > is fine; another solution is to not garbage collect a certain revision; > this would

Re: Asynchronous indexing consistency

2013-05-29 Ian Boston
Hi, I've been watching this thread, and the situation sounds quite similar to issues I was seeing in pre-production load testing a bit over a year ago. I started with Solr and switched to ElasticSearch. ElasticSearch uses a write-ahead log on every instance in its cluster to address this issue. I

Re: Asynchronous indexing consistency

2013-05-29 Thomas Mueller
Hi, >For example during large batch imports or content >migrations it might be useful to be able to speed things up by >disabling things like full text indexing. OK, it's good to have a concrete use case. For a migration, the easiest solution might be to re-create the index at the end. For a lar

Re: Asynchronous indexing consistency

2013-05-29 Jukka Zitting
Hi, On Wed, May 29, 2013 at 3:01 PM, Thomas Mueller wrote: >>There could be various reasons for why an indexer might not be >>available for an extended amount of time > > Possibly you are right, but let me try to challenge this assumption: > > Wouldn't it be a problem if the index isn't updated f

Re: Asynchronous indexing consistency

2013-05-29 Thomas Mueller
Hi, >There could be various reasons for why an indexer might not be >available for an extended amount of time Possibly you are right, but let me try to challenge this assumption: Wouldn't it be a problem if the index isn't updated for a long time? Don't we need a protection against an outdated i

Re: Asynchronous indexing consistency

2013-05-29 Jukka Zitting
Hi, On Wed, May 29, 2013 at 12:28 PM, Thomas Mueller wrote: > What would happen if you stop a cluster node for a long time (for example > 1 day)? Would async indexing be done on another cluster node? If yes, I > guess we need a way to ensure that's the case. If not, then the problem > might be th

Re: Asynchronous indexing consistency

2013-05-29 Thomas Mueller
Hi, What would happen if you stop a cluster node for a long time (for example 1 day)? Would async indexing be done on another cluster node? If yes, I guess we need a way to ensure that's the case. If not, then the problem might be that old revisions are no longer available. I guess we need a simi

Asynchronous indexing consistency

2013-05-29 Alex Parvulescu
Hi guys, I'm trying to find a solution for keeping the async index up to date in the case where the system restarts. How can the indexing process pick up where it left off? A quick chat with Jukka turned up some ideas, but be warned, this is all based on API changes: - expose the revision info a