>>The lack of standardized metadata is an issue, of course - we could start experimenting with this in Luke, to see whether we can squeeze a subset of Solr schema there.
Actually, an "AnalyzerFactory" interface in Luke might provide the abstraction which would allow Solr, my proprietary metadata system, and any other config resource to hook into Luke.
That would be pretty cool.

----- Original Message ----
From: Andrzej Bialecki <a...@getopt.org>
To: java-user@lucene.apache.org
Sent: Fri, 5 March, 2010 11:11:12
Subject: Re: SpanQueries in Luke

On 2010-03-05 11:22, mark harwood wrote:
>>> I'll commit the current mostly-working state today, you can take a look
>
> OK. However I think this XMLQueryParser addition will only resurface a
> long-standing issue with Luke and Lucene in general.
> This query parser works best on multiple fields (e.g. free-text <UserQuery>
> tags and <TermsFilter> on structured fields). Each field typically requires
> different analyzers and there is currently no way of recording this
> information as metadata alongside an index.
> Without this metadata each user's Luke session starts with a game of
> "guess-which-analyzer-to-use?"

I guess ;) that generally speaking there is no good answer to this - the same token stream could have been produced by varying analysis chains, even across indexing sessions that append to the same index.

>
> I use my own proprietary system for storing such index metadata and this is
> through an XML file that contains a BeanEncoder-serialized
> PerFieldAnalyserWrapper among other things.
> It would be nice to see some standardisation in how this information can be
> made available in *any* Lucene index but I guess this overlaps with things
> like Solr's config.

Yes. Theoretically one could store such information in IndexCommit.getUserData(). The lack of standardized metadata is an issue, of course - we could start experimenting with this in Luke, to see whether we can squeeze a subset of Solr schema there.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
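
For illustration only, the "AnalyzerFactory" idea from the reply at the top of this message could be sketched roughly as follows. Only the interface name comes from the thread; the method name getAnalyzer, the MapAnalyzerFactory class and the generic type parameter are all hypothetical, and neither Luke nor Solr is known to define such an interface. Plain strings stand in for real Lucene Analyzer instances so that the sketch compiles without Lucene on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical abstraction: resolves a field name to the analyzer
 * configured for it. A is a stand-in for whatever analyzer type the
 * host application uses (an assumption, not a Luke or Solr API).
 */
interface AnalyzerFactory<A> {
    /** Returns the analyzer for the field, or a default if none is registered. */
    A getAnalyzer(String fieldName);
}

/** Simplest possible implementation, backed by an in-memory map. */
final class MapAnalyzerFactory<A> implements AnalyzerFactory<A> {
    private final Map<String, A> perField = new LinkedHashMap<>();
    private final A defaultAnalyzer;

    MapAnalyzerFactory(A defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    /** Registers an analyzer for one field; returns this for chaining. */
    MapAnalyzerFactory<A> register(String fieldName, A analyzer) {
        perField.put(fieldName, analyzer);
        return this;
    }

    @Override
    public A getAnalyzer(String fieldName) {
        return perField.getOrDefault(fieldName, defaultAnalyzer);
    }
}

class AnalyzerFactoryDemo {
    public static void main(String[] args) {
        // Strings are placeholders for analyzer instances; the field
        // names "id" and "body" are made up for this example.
        AnalyzerFactory<String> factory =
                new MapAnalyzerFactory<>("StandardAnalyzer")
                        .register("id", "KeywordAnalyzer")
                        .register("body", "WhitespaceAnalyzer");

        System.out.println(factory.getAnalyzer("id"));
        System.out.println(factory.getAnalyzer("title")); // falls back to the default
    }
}
```

A Solr-backed or XML-backed implementation would implement the same interface and read its per-field mapping from the respective config source, which is the hook-in point the reply describes.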