Re: anyone use hadoop+solr?

MitchK Mon, 06 Sep 2010 05:38:14 -0700

Thanks for your detailed feedback, Andrzej!

From what I understood, SOLR-1301 becomes obsolete once Solr becomes
cloud-ready, right?

> Looking into the future: eventually, when SolrCloud arrives we will be 
> able to index straight to a SolrCloud cluster, assigning documents to 
> shards through a hashing schema (e.g. 'md5(docId) % numShards')
> 
Hm, let's say md5(docId) produced a value of 7 (it won't, but let's
assume it).
If I have a constant number of shards, the doc will be published to the same
shard again and again.

i.e.: 7 % numShards(5) = 2 -> doc 7 will be indexed at shard 2.

A few days later the rest of the cluster is available, and now it looks like

7 % numShards(10) = 7 -> doc 7 will be indexed at shard 7... and what
about the older version at shard 2? I am no expert when it comes to
cloud computing and related topics.
If you can point me to one or another reference where I can read about it,
it would help me a lot, since I only want to understand how it works at the
moment.
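
To make the resharding question concrete, here is a minimal Python sketch. It
is not Solr code: the helper name shard_for and the doc-N ids are made up for
illustration. It applies the quoted 'md5(docId) % numShards' scheme and counts
how many documents would be routed to a different shard when the shard count
grows from 5 to 10.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Hypothetical helper: route a document via md5(docId) % numShards."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Made-up document ids, purely for illustration.
doc_ids = [f"doc-{i}" for i in range(1000)]

# Shard assignment with 5 shards, then again after growing to 10 shards.
before = {d: shard_for(d, 5) for d in doc_ids}
after = {d: shard_for(d, 10) for d in doc_ids}

# Documents whose target shard changed; their old copies stay on the old shard
# unless something explicitly deletes or moves them.
moved = [d for d in doc_ids if before[d] != after[d]]
print(f"{len(moved)} of {len(doc_ids)} documents map to a different shard")
```

With plain modulo hashing, going from 5 to 10 shards leaves each document
either on its old shard or on old shard + 5, so roughly half of the ids end up
elsewhere. Other shard counts can reshuffle far more than that.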

The problem with Solr is its lack of documentation in some classes and the
lack of encapsulation of some very complex things into separate methods or
extra classes. Of course, this is because it costs some extra time to do so,
but it makes understanding and modifying things very complicated if you do
not understand what's going on from a theoretical point of view.

Since the cloud-feature will be complex, a lack of documentation and no
understanding of the theory behind the code will make contributing back
very, very complicated.

Thank you :-)
- Mitch
