Hi,

100,000 is not a big number in the IR world. Lucene uses a number of tricks
so that people can build very large IR systems on top of it. I have read
blog posts from people with much larger document bases. The only concern
you need to think about is giving the JRE a bigger heap size (for example,
with the -Xmx option) to avoid an OutOfMemoryError.
*As far as I know, Hadoop is more focused on distributed crawling...
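
To try the kind of query quoted below against an index, something like the
following could be used. It is only a minimal sketch: the endpoint URL and
the helper names build_query and build_select_url are made up for
illustration, the field names are taken from the quoted message (written with
a colon after every field name), and it assumes the numeric fields are
indexed with types that support range queries.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port and path to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query(name_phrase, cell_range, mod360_range, min_year):
    """Assemble the query string from the quoted message (illustrative helper)."""
    clauses = [
        f'name:"{name_phrase}"',                                # quoted name clause, passed through as-is
        f"cell:[{cell_range[0]} TO {cell_range[1]}]",           # inclusive range on cell
        f"mod360Cell:[{mod360_range[0]} TO {mod360_range[1]}]", # inclusive range on mod360Cell
        f"year:[{min_year} TO *]",                              # open-ended upper bound
    ]
    return " AND ".join(clauses)


def build_select_url(base_url, query, rows=10):
    """URL-encode the query so spaces, brackets and quotes survive the HTTP request."""
    return base_url + "?" + urlencode({"q": query, "rows": rows})


if __name__ == "__main__":
    q = build_query("Passer domesticus*", (36543, 43324), (45, 65), 1950)
    print(build_select_url(SOLR_SELECT_URL, q))
```

Timing the response for a URL like this as the index grows is a simple way
to see how query complexity affects speed on your own data.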

Hope it helps,
Vinci


TimRobertson100 wrote:
> 
> Hi all,
> I have just got a SOLR index working for the first time on a few 100,000
> records from a custom database dump, and the results are very impressive,
> both in the speed it indexes (even on my macbook) and the response times.
> 
> If I want to index "what, where (grid based to 0.1 degree cells), when,
> who" type information (let's say a schema of 10 strings, 2 dates, 4 ints),
> what are the limitations going to be?
> 
> Is there any documentation on whether indexes can be partitioned easily, so
> scaling is somewhat linear?
> 
> My reasoning to look for this is our current searchable "index" is on a
> mysql database with 2 main fact tables of 150,000,000 records and 15,000,000
> records which are normally joined for most queries.  We are looking to
> increase to 10x that size so I am looking at Billions of records...
> 
> How likely will this scale on SOLR?
> What's the biggest number of items people have indexed?
> How complicated do the queries have to get before things get slow? This is
> the kind of thing I am looking for:
> (name:"Passer domesticus*" AND cell:[36543 TO 43324] AND mod360Cell:[45 TO
> 65] AND year:[1950 TO *])
> - if you care, this is a search for "The bird of type Sparrows in a geo
> bounding box and collected/observed after 1950"...
> 
> I'm going to be trying anyway, but any pointers appreciated (Hadoop perhaps?)
> 
> Thanks,
> 
> Tim
> PS - This is an open source open access project to create an index of
> biodiversity data (http://data.gbif.org) so your help is going towards a
> worthwhile cause!
> 
> 

-- 
View this message in context: 
http://www.nabble.com/What-are-the-limits--Billions-of-records-anyone--tp16262032p16268808.html
Sent from the Solr - User mailing list archive at Nabble.com.
