RE: Using multiple analysers within a query
Actually, just realised a PhraseQuery is incorrect... I only want a single TermQuery but it just needs to be quoted, d'oh.

-----Original Message-----
Then I found that because that analyser always returns a single token (TermQuery) it would send through spaces into the final query string, causing problems. So also in getFieldQuery I check if it needs breaking up and converting into a PhraseQuery.

CONFIDENTIALITY NOTICE AND DISCLAIMER
Information in this transmission is intended only for the person(s) to whom it is addressed and may contain privileged and/or confidential information. If you are not the intended recipient, any disclosure, copying or dissemination of the information is unauthorised and you should delete/destroy all copies and notify the sender. No liability is accepted for any unauthorised use of the information contained in this transmission. This disclaimer has been automatically added.

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: Using multiple analysers within a query
Hi again,

Thanks to everyone who replied. The PerFieldAnalyzerWrapper was a good suggestion, and one I had overlooked, but for our particular requirements it wouldn't quite work, so I went with overriding getFieldQuery(). You were right, Paul. In 1.4.2 a whole heap of QueryParser changes were made, mostly removing the analyzer parameter from methods.

In the end I built my changes on top of the NewMultiFieldQueryParser which was shared here recently and works wonders -- thanks Bill Janssen and sergiu gordea. I added support for slops and boosts to build together with the multi-fields array, and then overrode getFieldQuery to check the queryText for a start char ("=" for example) and, if found, remove it and switch to a non-tokenising analyser.

Then I found that because that analyser always returns a single token (TermQuery) it would send through spaces into the final query string, causing problems. So also in getFieldQuery I check if it needs breaking up and converting into a PhraseQuery.

Seems to work, just needs thorough testing. If anyone would like a copy I could post it up here.

Regards,
--Leto
(excuse the disclaimer...)

> We have the need for analysed and 'not analysed/not tokenised' clauses
> within one query. Imagine an unparsed query like:
>
> +title:"Hello World" +path:Resources\Live\1
>
> In the above example we would want the first clause to use
> StandardAnalyzer and the second to use an analyser which returns the
> term as a single token. So a parsed result might look like:
>
> +(title:hello title:world) +path:Resources\Live\1
>
> Would anyone have any suggestions on how this could be done? I was
> thinking maybe the QueryParser would have to be changed/extended to
> accept a separator other than colon ":", something like "=" for
> example to indicate this clause is not to be tokenised. Or perhaps
> this can all be done using a single analyser?
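The marker check described in this message can be sketched roughly as follows. This is a plain-Java illustration of the decision only, not the poster's code and not Lucene's API: the class name, the method name and the crude tokeniser standing in for a tokenising analyser are all assumptions made for the example. In a real parser the same test would sit inside an overridden getFieldQuery() and pick between two Analyzer instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative only: mimics the choice made inside an overridden getFieldQuery(). */
final class ExactMarkerSketch {
    // Hypothetical marker; the post only says a start char such as "=" was used.
    static final char EXACT_MARKER = '=';

    /** Returns the terms a clause would produce for the given query text. */
    static List<String> termsFor(String queryText) {
        List<String> terms = new ArrayList<>();
        if (!queryText.isEmpty() && queryText.charAt(0) == EXACT_MARKER) {
            // Marker found: strip it and keep the rest as one untouched token.
            terms.add(queryText.substring(1));
            return terms;
        }
        // No marker: stand-in for a tokenising analyser
        // (lower-case, then split on anything that is not a letter or digit).
        for (String part : queryText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }
}
```

With this sketch, termsFor("Hello World") yields the two terms hello and world, while termsFor("=Resources\Live\1") yields the single term Resources\Live\1 with case and backslashes preserved.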
Re: Using multiple analysers within a query
On Nov 22, 2004, at 9:17 AM, Morus Walter wrote:
> Erik Hatcher writes:
> > > If your query isn't entered by users, you shouldn't use query parser in
> > > most cases anyway.
> >
> > I'd go even further and say in all cases.
>
> If you use lucene as a search server you have to provide the query
> somehow. E.g. we have a PHP application that sends queries to a lucene
> search servlet. In this case it's justifiable to serialize the query into
> query parser syntax on the client side and have query parser read the
> query again on the server side.

Ah, good point! I hadn't considered this scenario.

Erik
Re: Using multiple analysers within a query
Erik Hatcher writes:
> > If your query isn't entered by users, you shouldn't use query parser in
> > most cases anyway.
>
> I'd go even further and say in all cases.

If you use lucene as a search server you have to provide the query somehow. E.g. we have a PHP application that sends queries to a lucene search servlet. In this case it's justifiable to serialize the query into query parser syntax on the client side and have query parser read the query again on the server side.

I don't recall any problems with the approach, since we clean up the user input before constructing the query.

Morus
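Serialising a query into query parser syntax on the client means user-supplied values have to be cleaned up first, so that characters such as ":" or "\" are not read as syntax on the server. A minimal sketch of that step is below. The class and method names are hypothetical, the list of special characters is an assumption based on the documented query syntax (the exact set depends on the Lucene version), and values containing whitespace are not handled here; they would additionally need quoting.

```java
/** Illustrative client-side helper; all names here are hypothetical. */
final class QuerySyntaxSketch {
    // Characters assumed to carry special meaning in query parser syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    /** Prefixes every special character with a backslash. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Builds a required clause such as +field:value from an already trusted field name. */
    static String requiredClause(String field, String value) {
        return "+" + field + ":" + escape(value);
    }
}
```

The server then only ever sees clauses the client built deliberately, which is what makes round-tripping through the query parser tolerable in this setup.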
Re: Using multiple analysers within a query
On Nov 22, 2004, at 2:56 AM, Morus Walter wrote:
> Kauler, Leto S writes:
> > Would anyone have any suggestions on how this could be done? I was
> > thinking maybe the QueryParser would have to be changed/extended to
> > accept a separator other than colon ":", something like "=" for example
> > to indicate this clause is not to be tokenised.
>
> I suggested that in a recent discussion and Erik Hatcher objected that it
> isn't a good idea to require that users know which field to query in
> which way. I guess he is right.

QueryParser is a one-size fits (?) all sort of beast. It has plenty of negatives, no question.

> If your query isn't entered by users, you shouldn't use query parser in
> most cases anyway.

I'd go even further and say in all cases.

> > Or perhaps this can all be done using a single analyser?
>
> Look at PerFieldAnalyzerWrapper. You will probably have to write a
> keyword analyzer (unless you can use whitespace analyzer in your case).

We should probably add a KeywordAnalyzer to Lucene's core at some point.

Erik
Re: Using multiple analysers within a query
Kauler, Leto S writes:
> Would anyone have any suggestions on how this could be done? I was
> thinking maybe the QueryParser would have to be changed/extended to
> accept a separator other than colon ":", something like "=" for example
> to indicate this clause is not to be tokenised.

I suggested that in a recent discussion and Erik Hatcher objected that it isn't a good idea to require that users know which field to query in which way. I guess he is right.

If your query isn't entered by users, you shouldn't use query parser in most cases anyway.

> Or perhaps this can all
> be done using a single analyser?

Look at PerFieldAnalyzerWrapper. You will probably have to write a keyword analyzer (unless you can use whitespace analyzer in your case).

HTH
Morus
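The per-field idea suggested here amounts to a lookup from field name to analyser, with a default for every field that has no entry. The sketch below shows that shape in plain Java. It is a stand-in, not Lucene's PerFieldAnalyzerWrapper: the class, its methods, and the two toy tokenisers (one keyword-style, one that lower-cases and splits on whitespace) are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Illustrative stand-in for the per-field idea; not Lucene's PerFieldAnalyzerWrapper. */
final class PerFieldSketch {
    private final Function<String, List<String>> defaultTokeniser;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldSketch(Function<String, List<String>> defaultTokeniser) {
        this.defaultTokeniser = defaultTokeniser;
    }

    /** Registers a tokeniser that overrides the default for one field. */
    void register(String field, Function<String, List<String>> tokeniser) {
        perField.put(field, tokeniser);
    }

    /** Tokenises text with the field's own tokeniser, or the default if none is registered. */
    List<String> tokenise(String field, String text) {
        return perField.getOrDefault(field, defaultTokeniser).apply(text);
    }

    /** Keyword-style tokeniser: the whole input is returned as a single token. */
    static List<String> keyword(String text) {
        return Arrays.asList(text);
    }

    /** Rough stand-in for a tokenising analyser: lower-case, then split on whitespace. */
    static List<String> simple(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }
}
```

Registering keyword for the path field and leaving simple as the default gives the behaviour asked for in the original question: title text is split into terms, path text stays whole. The cost of this design, as noted in the thread, is that the choice is fixed per field rather than per clause.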
Re: Using multiple analysers within a query
On Monday 22 November 2004 05:02, Kauler, Leto S wrote:
> Hi Lucene list,
>
> We have the need for analysed and 'not analysed/not tokenised' clauses
> within one query. Imagine an unparsed query like:
>
> +title:"Hello World" +path:Resources\Live\1
>
> In the above example we would want the first clause to use
> StandardAnalyzer and the second to use an analyser which returns the
> term as a single token. So a parsed result might look like:
>
> +(title:hello title:world) +path:Resources\Live\1
>
> Would anyone have any suggestions on how this could be done? I was
> thinking maybe the QueryParser would have to be changed/extended to
> accept a separator other than colon ":", something like "=" for example
> to indicate this clause is not to be tokenised. Or perhaps this can all
> be done using a single analyser?

Overriding QueryParser.getFieldQuery() might work for you. It is given the field and the query text, so an analyzer can be chosen depending on the field.

In case you don't use the latest cvs head, it may be worthwhile to have a look. Some of the getFieldQuery methods have been deprecated, but I don't know when.

Regards,
Paul.
Using multiple analysers within a query
Hi Lucene list,

We have the need for analysed and 'not analysed/not tokenised' clauses within one query. Imagine an unparsed query like:

+title:"Hello World" +path:Resources\Live\1

In the above example we would want the first clause to use StandardAnalyzer and the second to use an analyser which returns the term as a single token. So a parsed result might look like:

+(title:hello title:world) +path:Resources\Live\1

Would anyone have any suggestions on how this could be done? I was thinking maybe the QueryParser would have to be changed/extended to accept a separator other than colon ":", something like "=" for example to indicate this clause is not to be tokenised. Or perhaps this can all be done using a single analyser?

Regards (and excuse the disclaimer),
--Leto