[
https://issues.apache.org/jira/browse/SOLR-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249965#comment-17249965
]
David Smiley commented on SOLR-13102:
-------------------------------------
I forgot about this proposal. Still, [[email protected]] please take a look
at my proposal SOLR-15051. The issue here centers around the use of a
SolrCloud shard "term" to keep multiple readers and one writer with leader
hand-off happy using the same space for a shard. My primary concern with this
plan is how this may conceptually leak concerns between the low-level Directory
and high-level SolrCloud. Perhaps it can be made to work cleanly; I can't tell
just from the issue description. Also, it's unclear if the replica
types would know/care about the use of this Directory; hopefully not.
Might you re-title this to somehow include "via shard leader term prefix" or
some-such differentiator? Solr *already* has a shared storage implementation
using HdfsDirectory.
> Shared storage Directory implementation
> ---------------------------------------
>
> Key: SOLR-13102
> URL: https://issues.apache.org/jira/browse/SOLR-13102
> Project: Solr
> Issue Type: New Feature
> Reporter: Yonik Seeley
> Priority: Major
>
> We need a general strategy (and probably a general base class) that can work
> with shared storage and not corrupt indexes from multiple writers.
> One strategy that is used on local disk is to use locks. This doesn't extend
> well to remote / shared filesystems when the locking is not tied into the
> object store itself since a process can lose the lock (a long GC or whatever)
> and then immediately try to write a file and there is no way to stop it.
> An alternate strategy ditches the use of locks and simply avoids overwriting
> files by some algorithmic mechanism.
> One of my colleagues outlined one way to do this:
> https://www.youtube.com/watch?v=UeTFpNeJ1Fo
> That strategy uses random looking filenames and then writes a "core.metadata"
> file that maps between the random names and the original names. The problem
> is then reduced to overwriting "core.metadata" when you lose the lock. One
> way to fix this is to version "core.metadata". Since the new leader election
> code was implemented, each shard has a monotonically increasing "leader term",
> and we can use that as part of the filename. When a reader goes to open an
> index, it can use the latest file from the directory listing, or even use the
> term obtained from ZK if we can't trust the directory listing to be up to
> date. Additionally, we don't need random filenames to avoid collisions... a
> simple unique prefix or suffix would work fine (such as the leader term again)
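
To make the naming idea in the description above concrete, here is a rough
sketch in Python. It is only an illustration under stated assumptions: the
"core.metadata.<term>" file name layout, the "<term>_" prefix for index files,
and all three function names are hypothetical and do not come from SOLR-13102
or from Solr's code.

```python
import re
from typing import Iterable, Optional

# Assumed naming scheme (hypothetical): "core.metadata.<leader term>".
METADATA_PATTERN = re.compile(r"^core\.metadata\.(\d+)$")


def metadata_name(leader_term: int) -> str:
    """Name of the metadata file written by the leader holding this term."""
    return f"core.metadata.{leader_term}"


def segment_name(original_name: str, leader_term: int) -> str:
    """Prefix an index file name with the leader term.

    Two leaders with different terms then never write the same file name,
    so a stale leader cannot overwrite the new leader's files.
    """
    return f"{leader_term}_{original_name}"


def latest_metadata(listing: Iterable[str],
                    zk_term: Optional[int] = None) -> Optional[str]:
    """Pick the metadata file a reader should open.

    If a term obtained from ZK is supplied, it is trusted over the directory
    listing; otherwise the highest term found in the listing wins.
    """
    if zk_term is not None:
        return metadata_name(zk_term)
    terms = [int(m.group(1)) for name in listing
             if (m := METADATA_PATTERN.match(name))]
    return metadata_name(max(terms)) if terms else None
```

Terms are compared as integers rather than as strings, so term 10 is treated
as newer than term 9. A reader that cannot trust the directory listing to be
up to date would pass the term it read from ZK as zk_term instead.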
--
This message was sent by Atlassian Jira
(v8.3.4#803005)