Thanks, Yonik, for sharing your thoughts.

This doesn't sound like true federated search,

I'm afraid I don't understand what you mean by "federated search"; you seem to have a precise idea in mind.

since you have a number
of fields that are the same in each index that you search across, and
you treat them all the same.  This is functionally equivalent to
having a single schema and a single index.  You can still have
multiple applications that query the single collection differently.

Until you give me a pointer or a web example, what you describe sounds to me like implementing a complete database with a single table (possible, but not easy to understand or maintain). In my experience, a collection is a schema with thousands or millions of XML documents and perhaps 10, 20 or more fields, and the search configuration is generated from a kind of data schema (there is no real standard for stating, for example, that a title or a subject needs one field for exact match and another for word search). If an index became too big (fortunately, I have never hit that limit with Lucene), I guess there are solutions. My problem is maintaining different collections, each with its own intellectual logic: some field names are shared, like Dublin Core or at least "fulltext", but others are specific to each collection.
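
To make the point about search configuration concrete, here is a minimal sketch of what I mean by generating index fields from a data schema. It is in Python only for brevity, and everything in it is an assumption for illustration: the "_exact" and "_words" suffixes, the function name, and the sample collection are hypothetical, not Solr or Lucene conventions.

```python
# Hypothetical sketch: derive the index fields a collection needs from a
# small data schema. Suffixes and names are illustrative assumptions only.

def index_fields(schema):
    """Map each logical field to the list of index fields it requires."""
    return {
        name: [f"{name}_{kind}" for kind in kinds]
        for name, kinds in schema.items()
    }

# One collection: shared fields (title, subject, fulltext) plus a field
# that only makes sense for this collection.
charters = {
    "title": ["exact", "words"],    # one field for exact match, one for word search
    "subject": ["exact", "words"],
    "fulltext": ["words"],
    "regest": ["words"],            # collection-specific field (hypothetical)
}

fields = index_fields(charters)
```

Each collection would carry its own small schema like this one, which is why keeping one index per collection seems easier to me to understand.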

Depending on update patterns and index sizes, you can probably get
better efficiency with multiple indexes, but not really more
functionality (in your case), right?

Maybe "keeping it understandable" could count as functionality? Perhaps this matters less now, but there was a time when a Lucene index could become corrupted, so keeping indexes separate was important.

I guess these specific problems will not be Solr priorities, but until I am corrected, I still feel that multiple indexes are useful.


--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique
