[jira] [Commented] (SOLR-13101) Shared storage support in SolrCloud

Ilan Ginzburg (Jira) Thu, 10 Dec 2020 13:07:06 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247510#comment-17247510
 ]


Ilan Ginzburg commented on SOLR-13101:
--------------------------------------

I have no issue with "will not fix" this Jira.

>From my perspective, the fundamental problem of this approach is not the 
>introduction of a new replica type but the need to commit every batch to be 
>able to push segments and having to wait for the push to complete and succeed 
>before calling the indexing itself as successful (there are a few possible 
>optimizations such as pushing files before commit happens so they're ready on 
>blob by then, but the fundamental issues do not go away). That's a major 
>performance degradation.

So yes, please close it. Thanks.

Looking forward to see a different approach that does not have the problems 
listed above! (or less of them :))

> Shared storage support in SolrCloud
> -----------------------------------
>
>                 Key: SOLR-13101
>                 URL: https://issues.apache.org/jira/browse/SOLR-13101
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>            Reporter: Yonik Seeley
>            Priority: Major
>          Time Spent: 15h 50m
>  Remaining Estimate: 0h
>
> Solr should have first-class support for shared storage (blob/object stores 
> like S3, google cloud storage, etc. and shared filesystems like HDFS, NFS, 
> etc).
> The key component will likely be a new replica type for shared storage.  It 
> would have many of the benefits of the current "pull" replicas (not indexing 
> on all replicas, all shards identical with no shards getting out-of-sync, 
> etc), but would have additional benefits:
>  - Any shard could become leader (the blob store always has the index)
>  - Better elasticity scaling down
>    - durability not linked to number of replcias.. a single replica could be 
> common for write workloads
>    - could drop to 0 replicas for a shard when not needed (blob store always 
> has index)
>  - Allow for higher performance write workloads by skipping the transaction 
> log
>    - don't pay for what you don't need
>    - a commit will be necessary to flush to stable storage (blob store)
>  - A lot of the complexity and failure modes go away
> An additional component a Directory implementation that will work well with 
> blob stores.  We probably want one that treats local disk as a cache since 
> the latency to remote storage is so large.  I think there are still some 
> "locking" issues to be solved here (ensuring that more than one writer to the 
> same index won't corrupt it).  This should probably be pulled out into a 
> different JIRA issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (SOLR-13101) Shared storage support in SolrCloud

Reply via email to