I seem to recall Doug C. commenting on this:
http://lucene.markmail.org/search/?q=FilterIndexReader#query:FilterIndexReader%20from%3A%22Doug%20Cutting%22+page:1+mid:y673avueo43ufwhm+state:results
Not sure if that is exactly what you are looking for, but sounds
similar.
-Grant
On Apr 29, 2008, at 1:10 PM, Otis Gospodnetic wrote:
Hi Nico,
I don't think there is a tool to split an existing Lucene index,
though I imagine one could write such a tool using http://lucene.apache.org/java/2_3_1/fileformats.html
as a guide.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Nico Heid <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, April 29, 2008 4:10:09 AM
Subject: Index splitting
Hi,
Let me first roughly describe the scenario :-)
We're trying to index data stored online for several thousand users.
The schema.xml has a custom identifier for the user, so a filter query
(fq) can be applied and all further filtering is done only within that
user's data (more importantly, the user never gets to see results from
data that doesn't belong to him).
Unfortunately, the index might become quite big (we're indexing more
than 50 TB of data: all kinds of files, full text (indexed only, not
stored) where possible, otherwise file info (size, date) and metadata
if available).
So the question is:
We're thinking of starting out with multiple Solr instances (either in
their own containers or MultiCore, I guess that's not the important
point), on 1 to n machines. Let's just pretend: we do modulo 5 on the
user number and assign it to one of the two machines. The index gets
distributed to query slaves (1 to m, depending on the need).
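
The modulo assignment described above could be sketched roughly as
follows. This is only an illustration: the class and method names are
hypothetical, the number of instances is left as a parameter, and it
assumes the user number is a plain integer. It only picks an instance
index; mapping that index to a machine or Solr core is not shown.

```java
// Hypothetical sketch: route a user to one of N Solr instances by modulo.
// ShardAssignment and shardFor are illustrative names, not Solr APIs.
final class ShardAssignment {
    private ShardAssignment() {}

    /** Returns an instance index in [0, shardCount) for the given user number. */
    static int shardFor(long userNumber, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative user numbers.
        return (int) Math.floorMod(userNumber, (long) shardCount);
    }
}
```

With 5 instances, `ShardAssignment.shardFor(7, 5)` would select instance
2, and every document of that user would be indexed and queried there.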
So now the question:
Is there a way to split an index that has grown too big into smaller
ones? Or do I have to create more instances at the beginning, so that I
will not run out of power and space (which will add quite a bit of data
redundancy)?
Let's say I miscalculated and used only 2 indices, but now I see I need
at least 4.
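
If users were assigned by modulo as described earlier, going from 2 to
4 indices is at least a clean case on paper: a user in old index k ends
up in new index k or k + 2, so each old index would only need to be
split in two and no documents would move between the old indices. The
sketch below just checks that arithmetic. The names are hypothetical,
and it says nothing about how the Lucene index itself would physically
be split.

```java
// Hypothetical sketch: which new indices receive the users of an old index
// when the number of modulo-assigned indices is doubled.
final class ReshardPlan {
    private ReshardPlan() {}

    /** Modulo assignment of a user number to one of shardCount indices. */
    static int shardFor(long userNumber, int shardCount) {
        return (int) Math.floorMod(userNumber, (long) shardCount);
    }

    /**
     * After doubling from oldCount to 2 * oldCount indices, every user of
     * oldShard lands in either oldShard or oldShard + oldCount.
     */
    static int[] targetsAfterDoubling(int oldShard, int oldCount) {
        return new int[] { oldShard, oldShard + oldCount };
    }
}
```

For example, `ReshardPlan.targetsAfterDoubling(1, 2)` yields indices 1
and 3 as the only possible destinations for users of old index 1.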
Any idea will be very welcome,
Thanks,
Nico
--------------------------
Grant Ingersoll
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ