With that many documents, I think GSA licensing might run into millions of USD. Don't go there.

300M docs might be called medium these days. Of course, if the documents themselves are huge, then it's more resource intensive. 10 TB sounds like a lot when it comes to search, but it's hard to tell what that represents (e.g. are those docs with lots of photos in them? Presentations very light on text? Plain text documents with 300 words per page? etc.)

Anyhow, yes, Solr is a fine choice for this.
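At that scale you'd want to split the index across several boxes and query them with Solr's distributed search (the shards parameter) rather than keep taxing a single indexing server. Here's a minimal SolrJ sketch of a sharded query, assuming 3.x-era SolrJ; the shard host names and the query string are made up:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class ShardedSearchSketch {
        public static void main(String[] args) throws Exception {
            // Any one shard can coordinate a distributed query.
            SolrServer solr = new CommonsHttpSolrServer("http://shard1:8983/solr");

            SolrQuery q = new SolrQuery("oracle"); // example query string
            // Fan the request out to every shard; Solr merges the results.
            q.setParam("shards",
                "shard1:8983/solr,shard2:8983/solr,shard3:8983/solr");
            q.setRows(10);

            QueryResponse rsp = solr.query(q);
            System.out.println("Total hits: " + rsp.getResults().getNumFound());
        }
    }

Each shard then only has to hold a slice of the 10 TB, and you can add shards as the data doubles.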

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: atreyu <wjhendrick...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Thu, May 12, 2011 12:59:28 PM
> Subject: Support for huge data set?
> 
> Hi,
> 
> I have about 300 million docs (or 10 TB of data) which is doubling every 3
> years, give or take. The data mostly consists of Oracle records, webpage
> files (HTML/XML, etc.) and office doc files. There are between two and four
> dozen concurrent users, typically. The indexing server has over 27 GB of
> RAM, but it still gets extremely taxed, and this will only get worse.
> 
> Would Solr be able to efficiently deal with a load of this size? I am
> trying to avoid the heavy cost of GSA, etc...
> 
