Re: scaling / sharding questions

2008-06-15 Thread Marcus Herou
Yep got that. Thanks. /M On Sun, Jun 15, 2008 at 8:42 PM, Otis Gospodnetic < [EMAIL PROTECTED]> wrote: > With Lance's MD5 schema you'd do this: > > 1 shard: 0-f* > 2 shards: 0-8*, 9-f* > 3 shards: 0-5*, 6-a*, b-f* > 4 shards: 0-3*, 4-7*, 8-b*, c-f* > ... > 16 shards: 0*, 1*, 2*... d*, e*, f

Language Analyser

2008-06-15 Thread sherin
Hi All, I need to develop a language analyzer to implement multilingual search. It would be very useful to have a sample language analyzer and some sample data to index with that analyzer. Thanks in advance. -- View this message in context: http://www.nabble.com/Language-Analyser-tp17

Faceting on date fields

2008-06-15 Thread Rhys Palmer
Hi all, I'm having some problems getting faceting to work correctly on date fields. For each document I index in Solr I store a created date, e.g. 1993-01-01T00:00:00.000Z. What I'm trying to do is, for any search query, facet on the year part of the created date using: facet.date = created_date_dt f.
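
For reference, a minimal sketch of the kind of request described above: faceting created_date_dt by year with Solr's facet.date parameters. The host, the select path, and the ten-year window are assumptions made for illustration; they are not from the original message, and the poster's actual (truncated) parameters may differ.

```python
from urllib.parse import urlencode


def year_facet_params(query, field="created_date_dt"):
    """Build request parameters that facet `field` into one-year buckets.

    The start/end window is an assumption for this sketch, expressed in
    Solr date math relative to the start of the current year.
    """
    return {
        "q": query,
        "facet": "true",
        "facet.date": field,
        "facet.date.start": "NOW/YEAR-10YEARS",  # assumed window start
        "facet.date.end": "NOW/YEAR+1YEAR",      # assumed window end
        "facet.date.gap": "+1YEAR",              # one bucket per year
    }


def select_url(base_url, params):
    """Append URL-encoded parameters to a select handler URL."""
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and path; adjust to the actual Solr instance.
    print(select_url("http://localhost:8983/solr/select",
                     year_facet_params("*:*")))
```

The "+" in the gap value has to be URL-encoded (as %2B); otherwise it is decoded as a space on the server side, which is a common cause of date faceting errors.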

Re: doubt with an index of 300gb

2008-06-15 Thread Norberto Meijome
On Sun, 15 Jun 2008 14:38:15 +0200 "Roberto Nieto" <[EMAIL PROTECTED]> wrote: > Hi Otis, > > Thanks a lot for your interest. > > The main thing I can't understand very well is that if I have 8 machines that > will be searchers, for example, why will they have a higher hardware cost if I > have one b

Re: doubt with an index of 300gb

2008-06-15 Thread Roberto Nieto
Hi Otis, I think that my questions were not very well formulated. We have dedicated machines for parsing, 2 machines (active/passive) for indexing, the index located on a SAN filesystem, and dedicated machines for searching. All of my questions came up because if I have an index of 300gb I don't know h

Re: scaling / sharding questions

2008-06-15 Thread Otis Gospodnetic
With Lance's MD5 schema you'd do this: 1 shard: 0-f* 2 shards: 0-8*, 9-f* 3 shards: 0-5*, 6-a*, b-f* 4 shards: 0-3*, 4-7*, 8-b*, c-f* ... 16 shards: 0*, 1*, 2*... d*, e*, f* Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Marcus Herou <[
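
A minimal sketch of the prefix scheme listed above, assuming documents are routed by the first hex digit of the MD5 of their unique key. The function names and the choice of key are hypothetical, not from the thread. The sketch splits the 16 possible first digits as evenly as it can, which reproduces the 1, 3, 4 and 16 shard rows; for 2 shards it yields 0-7 and 8-f rather than the 0-8 and 9-f shown in the message.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Return the shard index (0-based) for a document id.

    Uses only the first hex digit of the MD5 digest, so this sketch
    supports at most 16 shards.
    """
    if not 1 <= num_shards <= 16:
        raise ValueError("num_shards must be between 1 and 16")
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    first_digit = int(digest[0], 16)  # 0..15
    return first_digit * num_shards // 16


def prefix_ranges(num_shards):
    """List the (lowest, highest) first hex digit handled by each shard."""
    if not 1 <= num_shards <= 16:
        raise ValueError("num_shards must be between 1 and 16")
    ranges = {}
    for digit in range(16):
        shard = digit * num_shards // 16
        low, _ = ranges.get(shard, (digit, digit))
        ranges[shard] = (low, digit)
    return [(format(low, "x"), format(high, "x"))
            for low, high in (ranges[s] for s in range(num_shards))]


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 16):
        print(n, prefix_ranges(n))
```

Note that changing the shard count moves the prefix boundaries, so some documents end up belonging to a different shard and have to be reindexed there.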

Re: doubt with an index of 300gb

2008-06-15 Thread Otis Gospodnetic
Roberto, All I was trying to say is that it *might* be cheaper to buy: 10 smaller servers with 4 GB RAM each, for a total of 40 GB RAM than 1 big server with 40 GB RAM and the CPU matching the CPU power of 10 smaller servers Of course, there are other things to consider, too - power usage, hosting

Re: Memory problems when highlight with not very big index

2008-06-15 Thread Roberto Nieto
Hi Yonik, I think you are right, it must be that. If I activate highlighting on a field that I'm not specifying in "fl", will it use the same amount of RAM as if I returned it? Internally, will it be as if I had added it to "fl"? 2008/6/13 Yonik Seeley <[EMAIL PROTECTED]>: > On Fri, Jun 13, 2008 at 3

Re: doubt with an index of 300gb

2008-06-15 Thread Roberto Nieto
Hi Otis, Thanks a lot for your interest. The main thing I can't understand very well is that if I have 8 machines that will be searchers, for example, why will they have a higher hardware cost if I have one big index. If I have 10 smaller indexes I will need to search over all of them so...that won't