On 2010-03-05 11:22, mark harwood wrote:
I'll commit the current mostly-working state today, you can take a look

OK. However I think this XMLQueryParser addition will only resurface a 
long-standing issue with Luke and Lucene in general.
This query parser works best on multiple fields (e.g. free-text<UserQuery>  tags 
and<TermsFilter>  on structured fields). Each field typically requires different 
analyzers and there is currently no way of recording this information as metadata alongside 
an index.
Without this metadata each user's Luke session starts with a game of 
"guess-which-analyzer-to-use?"

I guess ;) that generally speaking there is no good answer to this - the same token stream could have been produced by varying analysis chains, even across indexing sessions that append to the same index.


I use my own proprietary system for storing such index metadata and this is 
through an XML file that contains a BeanEncoder-serialized 
PerFieldAnalyserWrapper among other things.
It would be nice to see some standardisation in how this information can be 
made available in *any* Lucene index but I guess this overlaps with things like 
Solr's config.

Yes.

Theoretically one could store such information in IndexCommit.getUserData(). The lack of standardized metadata is an issue, of course - we could start experimenting with this in Luke, to see whether we can squeeze a subset of Solr schema there.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to