-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 16 December 2003 12:31
To: Lucene Users List
Subject: Re: Disabling modifiers?
On Tuesday, December 16, 2003, at 07:28 AM, Erik Hatcher wrote:
Thanks Karl.
-----Original Message-----
From: Karl Penney [mailto:[EMAIL PROTECTED]
Sent: 16 December 2003 13:58
To: Lucene Users List
Subject: Re: Disabling modifiers?
One of the token patterns defined by the StandardTokenizer.jj is this
To: "'Lucene Users List'" <[EMAIL PROTECTED]>
Sent: Tuesday, December 16, 2003 7:46 AM
Subject: RE: Disabling modifiers?
> I think it is a problem with the indexing. I've found another example...
>
> WS-CA-PP00-PROCESS-YYMM
>
> I've looked at the index, and it has been to
On Tuesday, December 16, 2003, at 07:28 AM, Erik Hatcher wrote:
And yes, if you are using StandardTokenizer, you are probably not
tokenizing COBOL quite like you expect. Is there a COBOL parser you
could tap into that could give you the tokens you want?
Ummm, never mind that last question...
On Tuesday, December 16, 2003, at 05:46 AM, Iain Young wrote:
Treating them as two separate words when quoted is indicative of your
analyzer not being sufficient for your domain. What Analyzer are you
using? Do you have knowledge of what it is tokenizing text into?
I have created a custom analyz
I think it is a problem with the indexing. I've found another example...
WS-CA-PP00-PROCESS-YYMM
I've looked at the index, and it has been tokenized into 3 words...
WS
CA-PP00-PROCESS
YYMM
Looks as though I might have to use a custom tokenizer as well as an
analyzer then, but any ideas as to wh
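
As a rough illustration of the custom tokenizing idea above, here is a minimal sketch in plain Java that keeps a hyphenated name such as WS-CA-PP00-PROCESS-YYMM as a single token. The class name CobolWordSplitter and its split method are hypothetical and are not part of Lucene. The rule that a word is a run of letters and digits joined by single hyphens is an assumption about the names being indexed, not a full COBOL lexer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Lucene class): splits text into words while
// keeping hyphen-joined names together.
final class CobolWordSplitter {

    // Assumption: a word is one or more letter/digit runs joined by single hyphens.
    private static final Pattern WORD =
            Pattern.compile("[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*");

    // Returns the words found in the text, in order of appearance.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }
}
```

To use something like this for indexing, the same matching rule would have to be wrapped in a Tokenizer that the custom analyzer returns. That wiring is left out here because it depends on the Lucene version in use.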
From: Gregor Heinrich [mailto:[EMAIL PROTECTED]
Sent: 15 December 2003 18:32
To: 'Lucene Users List'
Subject: RE: Disabling modifiers?
If you don't want to fiddle with the JavaCC source of QueryParser.jj, you
could work with a regular expression that works in front of the actual query
parser.
> Treating them as two separate words when quoted is indicative of your
> analyzer not being sufficient for your domain. What Analyzer are you
> using? Do you have knowledge of what it is tokenizing text into?
I have created a custom analyzer (CobolAnalyzer) which contains some custom
stop wor
If you don't want to fiddle with the JavaCC source of QueryParser.jj, you
could work with a regular expression that works in front of the actual query
parser. I just did something similar because I input Lucene's query strings
into a latent semantic analysis algorithm and remove words with + and ?
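
A minimal sketch of that kind of regular-expression pre-processing step, in plain Java. The class name QueryModifierEscaper and its method are hypothetical and are not part of Lucene. The sketch puts a backslash in front of every + and - so they are no longer read as modifiers, which assumes the QueryParser version in use honors backslash escapes. If it does not, replacing the characters with a space is the fallback, at the cost of splitting the name.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not a Lucene class): runs in front of the query parser.
final class QueryModifierEscaper {

    // Matches the two modifier characters discussed in this thread.
    private static final Pattern MODIFIER = Pattern.compile("[+-]");

    // Prefixes every '+' and '-' in the raw query with a backslash.
    // Assumption: the input is not already escaped; an existing "\-"
    // would get a second backslash inserted before the hyphen.
    static String escapeModifiers(String rawQuery) {
        return MODIFIER.matcher(rawQuery).replaceAll("\\\\$0");
    }
}
```

The escaped string would then be handed to the query parser in place of the user's raw input. Escaping still only helps if the analyzer used at index time kept the hyphenated name as one token.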
On Monday, December 15, 2003, at 12:12 PM, Iain Young wrote:
A quick question. Is there any way to disable the - and + modifiers in
the QueryParser?
Not currently.
I've had a bit of success by putting quotes around the offending names
(as suggested on this list), but the results are still less