[ 
https://issues.apache.org/jira/browse/LUCENE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894645#comment-13894645
 ] 

Mark Miller commented on LUCENE-5438:
-------------------------------------

Very interesting - can't wait to see how the performance works out.

Trying to move Solr over to the replication module is something I've briefly 
thought about here and there - and then backed away from, like touching an 
electric fence :) It took so much work and effort to get the current 
replication code stable with SolrCloud that I don't look forward to such a 
challenge in the near future.

We would definitely like to have the ability to index only once. Of course, if 
you are sending documents to replicas asynchronously while indexing on the 
leader (we don't yet), I wonder how much benefit you get?

Hopefully work like this gets some others interested in giving a replication 
overhaul a shot. 

> add near-real-time replication
> ------------------------------
>
>                 Key: LUCENE-5438
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5438
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/replicator
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, 4.7
>
>         Attachments: LUCENE-5438.patch
>
>
> Lucene's replication module makes it easy to incrementally sync index
> changes from a master index to any number of replicas, and it
> handles/abstracts all the underlying complexity of holding a
> time-expiring snapshot, finding which files need copying, syncing more
> than one index (e.g., taxo + index), etc.
>
> But today you must first commit on the master, and then again the
> replica's copied files are fsync'd, because the code operates on
> commit points.  But this isn't "technically" necessary, and it mixes
> up durability and fast turnaround time.
>
> Long ago we added near-real-time readers to Lucene, for the same
> reason: you shouldn't have to commit just to see the new index
> changes.
>
> I think we should do the same for replication: allow the new segments
> to be copied out to replica(s), and new NRT readers to be opened, to
> fully decouple committing from visibility.  This way apps can then
> separately choose when to replicate (for freshness), and when to
> commit (for durability).
>
> I think for some apps this could be a compelling alternative to the
> "re-index all documents on each shard" approach that Solr Cloud /
> ElasticSearch implement today, and it may also mean that the
> transaction log can remain external to / above the cluster.
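
To make the "decouple committing from visibility" idea in the description concrete, here is a toy sketch in plain Python. It is not Lucene code and not what the attached patch does: `Node`, `Primary`, `flush`, `replicate` and `crash_and_restart` are hypothetical names invented for illustration, and a "segment" is just a named tuple of strings held in memory. It only shows that copying segments (freshness) and committing (durability) can be two separate choices.

```python
class Node:
    """Toy index node. A 'segment' is a named, immutable tuple of documents."""

    def __init__(self):
        self.segments = {}    # segment name -> tuple of docs; all of these are searchable
        self.durable = set()  # segment names covered by the most recent commit

    def search(self, term):
        # Search every visible segment, committed or not (the NRT-reader idea).
        return sorted(doc for docs in self.segments.values() for doc in docs if term in doc)

    def commit(self):
        # Durability step: stands in for fsync plus writing a commit point.
        self.durable = set(self.segments)

    def crash_and_restart(self):
        # After a crash only segments covered by a commit are still there.
        self.segments = {name: docs for name, docs in self.segments.items()
                         if name in self.durable}


class Primary(Node):
    def __init__(self):
        super().__init__()
        self._next_id = 0

    def flush(self, docs):
        # Documents are indexed once, here, into a new uncommitted segment.
        name = f"_{self._next_id}"
        self._next_id += 1
        self.segments[name] = tuple(docs)
        return name


def replicate(primary, replica):
    # Freshness step: copy segments the replica lacks. No commit on either side.
    missing = sorted(name for name in primary.segments if name not in replica.segments)
    for name in missing:
        replica.segments[name] = primary.segments[name]
    return missing


primary, replica = Primary(), Node()
primary.flush(["near-real-time replication", "commit for durability"])
print(replicate(primary, replica))     # names of the segments that were copied
print(replica.search("replication"))   # searchable on the replica before any commit
replica.commit()                       # durability is chosen separately, later
```

In this sketch a replica that crashes before its own `commit()` loses the copied segment and has to fetch it again, which is the trade-off the description points at: replication buys freshness, and durability still costs a commit.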



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
