Yeah, they are more complex than the "exactish" match -- basically, there are more fields involved -- combined sometimes with AND and sometimes with OR, and sometimes negated field values, sometimes groupings, etc. These other field values are all single words (no spaces), and a search might involve a wildcard on them. Hope that helps.
Thanks. Chris Hostetter wrote: > > > : Thanks for your input. I'm sure I could do as you suggest (and maybe > that > : will end up being my best option), but I had hoped to use a string for > : creating the query object, particularly as some of my queries are a bit > : complex. > > you have to clarify what you mean by "use a string for creating the query > object" ... there's nothing in what i suggested that implies you can't do > that, that's exactly what i'm suggesting you do... > > String input = ...; > Analyzer a = new YourCustomAnalyzer(); > // because you know your analyzer allways produces exactly one token... > Token t = a.tokenStream("yourField", new StringReader(input)).next(); > Query yourQuery = new TermQuery("yourField", t.termText()); > > ...if your queries are more complex then just the "exactish" matching you > described before, then that's a seperate issue -- what you described > didn't sound like it required any special input processing -- you said you > had a "string" and you wanted to find exact matches on that string (with > some normalization) ... but that you didn't want your input split on > whitespace, or hyphens, or any of the "special" characters QueryParser > uses. > > If you want other things then that certainly makes things more > complicated, but the basic idea is still the same ... so what exactly do > you mean when you say it's more complicated? > > > : > I haven't really been following this thread, but it's gotten so long > : > i got interested. > : > > : > from whta i can tell skimming the discussion so far, it seems like the > : > biggest confusion is about the definition of a "phrase" and what > analyzers > : > do with "quote" characters and what the QueryParser does with "quote" > : > charcters -- when ultimately you don't seem to really care about > "phrases" > : > in a textual searching sense; nor do you seem to care about any of the > : > "features" of the QueryParser. > : > > : > it seems that what you care about is: > : > > : > 1) making documents, and adding a list of "text chunks" to those > : > documents (what you've been calling phrases) > : > 2) you then want to be able to search for "almost-exact" matches on > those > : > "text chunks" ... these matches should be "exactish" because you > don't > : > want partial matches based on white spaces, or splitting on > hyphens, > : > but they shouldn't be truely exact because you want some simple > : > normalization... > : > > : > : actually would like to "normalize" a phrase (spaces) or a hyphenated > : > word or > : > : an underscored word to the same value -- e.g. MS-WORD or ms_WORd or > "MS > : > : Word" --> ms_word. > : > > : > ...in which case, you should: > : > a) write yourself an analyzer which does no "tokenizing" (ie: each > input > : > Field value generates a single token) but does the normalization > you > : > want. > : > b) use this Analyzer when you add the fields to your documents, even > : > though you don't want *real* tokenization, add make the field type > : > TOKENIZED so your analyzer gets used. > : > c) when you get some text input to serach on, pass it to the same > : > Analyzer, take the Token you get back and manualy construct a > : > TermQuery out of it for the neccessary field. > : > > : > ...that's it. that's all she wrote -- don't even look in > QueryParser's > : > general direction, at all. > : > > : > > : > > : > -Hoss > : > > : > > : > --------------------------------------------------------------------- > : > To unsubscribe, e-mail: [EMAIL PROTECTED] > : > For additional commands, e-mail: [EMAIL PROTECTED] > : > > : > > : > > : > : -- > : View this message in context: > http://www.nabble.com/Phrase-search-using-quotes----special-Tokenizer-tf2200760.html#a6128827 > : Sent from the Lucene - Java Users forum at Nabble.com. > : > : > : --------------------------------------------------------------------- > : To unsubscribe, e-mail: [EMAIL PROTECTED] > : For additional commands, e-mail: [EMAIL PROTECTED] > : > > > > -Hoss > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/Phrase-search-using-quotes----special-Tokenizer-tf2200760.html#a6134864 Sent from the Lucene - Java Users forum at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]