Re: SpanQueries in Luke

Andrzej Bialecki Fri, 05 Mar 2010 03:11:47 -0800

On 2010-03-05 11:22, mark harwood wrote:

I'll commit the current mostly-working state today, you can take a look


OK. However I think this XMLQueryParser addition will only resurface a 
long-standing issue with Luke and Lucene in general.
This query parser works best on multiple fields (e.g. free-text<UserQuery>  tags 
and<TermsFilter>  on structured fields). Each field typically requires different 
analyzers and there is currently no way of recording this information as metadata alongside 
an index.
Without this metadata each user's Luke session starts with a game of 
"guess-which-analyzer-to-use?"

I guess ;) that generally speaking there is no good answer to this - thesame token stream could have been produced by varying analysis chains,even across indexing sessions that append to the same index.


I use my own proprietary system for storing such index metadata and this is 
through an XML file that contains a BeanEncoder-serialized 
PerFieldAnalyserWrapper among other things.
It would be nice to see some standardisation in how this information can be 
made available in *any* Lucene index but I guess this overlaps with things like 
Solr's config.


Yes.

Theoretically one could store such information inIndexCommit.getUserData(). The lack of standardized metadata is anissue, of course - we could start experimenting with this in Luke, tosee whether we can squeeze a subset of Solr schema there.


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: SpanQueries in Luke

Reply via email to