Re: Query: A ? B

2004-03-05 Thread Erik Hatcher
Actually a slop of 1 does guarantee order... it is either an exact 
match or 1 term off.  It takes a slop of 2 or greater for reverse order 
matches.

But a slop of 1 does not require exactly 1 term off, which is what Jochen wants.  *shrug*

	Erik
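
To make the slop arithmetic in this reply concrete, here is a minimal sketch in plain Java. It is a simplified model of a two-term phrase "A B", not Lucene's actual scorer, and the names SlopModel and requiredSlop are made up for illustration.

```java
// Simplified model (an assumption, not Lucene code) of how far a two-term
// phrase "A B" is from matching, given the token positions of A and B.
final class SlopModel {
    // B is expected one position after A. The result is how many positions
    // B sits away from that expected spot, i.e. the smallest slop that
    // would still let the phrase match under this model.
    static int requiredSlop(int posA, int posB) {
        return Math.abs(posB - (posA + 1));
    }

    public static void main(String[] args) {
        System.out.println(requiredSlop(0, 1)); // "A B": exact match
        System.out.println(requiredSlop(0, 2)); // "A x B": one term off
        System.out.println(requiredSlop(1, 0)); // "B A": reversed order
    }
}
```

Under this model the three cases come out as 0, 1 and 2, which agrees with the point above: a slop of 1 accepts both "A B" and "A x B" but not "B A".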

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Query: A ? B

2004-03-04 Thread Jochen Frey
Hi Everyone.

I am trying to figure out how to create a query that matches

A ? B

Where ? is exactly one token. Can anyone tell me how to do that?


Obviously it's easy to match 'A * B' where '*' is 0 or 1 tokens (just use a
PhraseQuery and set slop to 1). But what if I require exactly one word/token
between 'A' and 'B'?


BTW, I know a very clumsy way of doing this, but I really don't like it: for
each indexed token, insert an extra token (for example 'X') at the same
token position. Then the query would be 'A X B', and everybody (except the
indexing performance and the size on disk) would be happy.

There's got to be an easier way. Right?

Thanks in advance!
Jochen





Re: Query: A ? B

2004-03-04 Thread Otis Gospodnetic
Use WildcardQuery: A?B

Otis




RE: Query: A ? B

2004-03-04 Thread Jochen Frey
Otis:

Maybe I don't understand this right, but I *think* I am looking for
something different:

I am trying to write a query like this: "my * house", which should match "my
own house", "my red house", "my small house", but should not match "my
house" ... you get the idea.

If I am not mistaken, a wildcard query only works if the wildcard is within
a word (or token), and it would allow me to do things like "g*" matching
"green", "great", etc. I don't know how to make that work for multi-word
scenarios.

Here is what I tried with WildcardQuery in the unit test (TestBasics):

Query query = new WildcardQuery(new Term("field", "six hundred * five"));

Thanks!
Jochen




Re: Query: A ? B

2004-03-04 Thread Erik Hatcher
Right, Otis was confused by what you were asking.

Google supports what you are asking for, I believe, although I don't 
recall if an '*' indicates one or more or just one.

As far as I know, there is no easy way to do the exact distance like 
you desire.  You could always clone the PhraseQuery stuff into a custom 
Query that uses an == instead of the current inequality test for the 
slop.  You'll also need to tweak this to disallow reversing of terms, 
since slop handles terms out of order too.  Maybe the new span feature 
can do this?

	Erik
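
For illustration, the "exact distance, in order" check suggested above can be sketched in plain Java over a list of tokens. This is only the matching rule such a custom Query would have to enforce, not a Lucene Query implementation; ExactGapMatcher and match are hypothetical names, and the token lists stand in for an analyzed field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the matching rule only: 'first' must be followed by 'second'
// with exactly 'gap' tokens in between, and never in reversed order.
final class ExactGapMatcher {
    // Returns every position of 'first' that has 'second' exactly
    // gap + 1 positions later (equality, not "at most").
    static List<Integer> match(List<String> tokens, String first, String second, int gap) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + gap + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(first) && tokens.get(i + gap + 1).equals(second)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // One token between "my" and "house": a match at position 0.
        System.out.println(match(Arrays.asList("my", "red", "house"), "my", "house", 1));
        // No token in between: no match.
        System.out.println(match(Arrays.asList("my", "house"), "my", "house", 1));
        // Reversed order: no match.
        System.out.println(match(Arrays.asList("house", "red", "my"), "my", "house", 1));
    }
}
```

With a gap of 1, "my red house" matches while "my house" and "house red my" do not, which is the behaviour Jochen describes wanting.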



RE: Query: A ? B

2004-03-04 Thread Jochen Frey
I think I know my way around the Span feature reasonably well ... and I
don't think it can be used for what I want to do.

But I would love to be proven wrong on this one.

:)




RE: Query: A ? B

2004-03-04 Thread Otis Gospodnetic
Ah, sorry, I had misread your email, thinking you were asking for a way to
match a single character.
The only thing that comes to my tired mind now is a phrase query with a
slop of 1, but that doesn't guarantee order, I believe.

Otis
