----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, May 21, 2004 11:36 AM
Subject: Query parser and minus signs
>
>
>
>
> Hi All,
>
> I'm using Lucene on a site that has split content with a branch containing
> pages in English and a separate branch in Chinese. Some of the Chinese
> pages include some (untranslatable) English words, so when a search is
> carried out in either language you can get pages from the wrong branch. To
> combat this we introduced a language field into the index which contains
> the standard language codes: en-UK and zh-HK.
>
> When you parse a query e.g. language:"en\-UK" you could reasonably expect
> the search to recover all pages with the language field set to "en-UK"
> (the minus symbol should be escaped by the backslash according to the
> FAQ).
> Unfortunately the parser seems to return "en UK" as the parsed query and
> hence returns no documents.
>
> Has anyone else had this problem, or could anyone suggest a workaround? I
> have yet to find a solution in the mailing list archives or elsewhere.
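
Index the standard language code as an untokenized field:
    new Field(fieldName, code, false, true, false)
The three booleans are store, index and tokenize. With tokenize set to
false, the value bypasses the Analyzer at indexing time and "en-UK" is
indexed as a single term. Then, when you create your queries, do not put
the language through the query parser. Build a
    new TermQuery(new Term(fieldName, desiredLanguageCode))
and add it to the user query object, for example as a required clause of a
BooleanQuery that also holds the parsed user query. This bypasses the
Analyzer at query time as well, so the term you search for is exactly the
term that was indexed.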
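
To make the reasoning concrete, here is a small plain-Java sketch of why the
untokenized field matches and the analyzed one does not. It is a toy model and
not Lucene code: the class KeywordFieldSketch, its methods, and the analyze()
rule (lowercase, split on anything that is not a letter or digit) are all
assumptions made up for illustration. A real Analyzer may behave differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index for a single field. Hypothetical; NOT the Lucene API. */
final class KeywordFieldSketch {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Assumed stand-in for an analyzer: lowercase, split on non-alphanumerics. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Index a field value, either analyzed into tokens or as one verbatim term. */
    void add(int docId, String value, boolean tokenize) {
        List<String> terms = tokenize ? analyze(value) : Collections.singletonList(value);
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Exact term lookup with no analysis, the analogue of a TermQuery. */
    Set<Integer> lookup(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        KeywordFieldSketch tokenized = new KeywordFieldSketch();
        tokenized.add(1, "en-UK", true);   // analyzed: the hyphen splits the value

        KeywordFieldSketch keyword = new KeywordFieldSketch();
        keyword.add(1, "en-UK", false);    // untokenized: stored as a single term
        keyword.add(2, "zh-HK", false);

        System.out.println("analyzed tokens:   " + analyze("en-UK"));
        System.out.println("tokenized lookup:  " + tokenized.lookup("en-UK"));
        System.out.println("keyword lookup:    " + keyword.lookup("en-UK"));
    }
}
```

In this model the analyzed index never contains the term "en-UK", only its
pieces, so an exact lookup finds nothing. The untokenized index holds the code
verbatim, and the exact lookup finds it. That is the same effect the
Field/TermQuery combination above relies on.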
>
> Many thanks in advance,
>
> Alex Bourne
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]