Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-24 Thread Mattias Persson
2009/12/24, Sebastian Stober :
> Hi Mattias,
>
> I just ran the test and all looks fine now! :-)
> Thanks for the great support and happy holidays to everybody!
And to you!

Best,
Mattias
>
> Cheers,
> Sebastian
>


-- 
Mattias Persson, [matt...@neotechnology.com]
Neo Technology, www.neotechnology.com
___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-24 Thread Sebastian Stober
Hi Mattias,

I just ran the test and all looks fine now! :-)
Thanks for the great support and happy holidays to everybody!

Cheers,
Sebastian



Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-23 Thread Mattias Persson
Just checking... did the latest index-util fix your problems?


Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-21 Thread Mattias Persson
I think I fixed it. It'll be available from our maven repo soon!


Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-21 Thread Mattias Persson
Ok great, I'll look into this error and see if I can locate that bug.




-- 
Mattias Persson, [matt...@neotechnology.com]
Neo Technology, www.neotechnology.com


Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-21 Thread Sebastian Stober
Hello Mattias,

Thank you for your quick reply. The new behavior you describe looks like
what I would expect. (I think fulltext queries should generally be
treated as case-insensitive.)

The original junit-test now completes without error. However, there
still seems to be something odd.

If I modify the setup code (before I run any test queries) like this,
the LuceneFulltextIndexService is messed up:

// using LuceneFulltextIndexService

andy.setProperty( "name", "Andy Wachowski" );
andy.setProperty( "title", "Director" );
//  larry.setProperty( "name", "Larry Wachowski" ); //old
larry.setProperty( "name", "Andy Wachowski" ); //new(deliberately wrong)
larry.setProperty( "title", "Director" );
index.index( andy, "name", andy.getProperty( "name" ) );
index.index( andy, "title", andy.getProperty( "title" ) );
index.index( larry, "name", larry.getProperty( "name" ) );
index.index( larry, "title", larry.getProperty( "title" ) );

// new: fixing the name of larry
index.removeIndex( larry, "name", larry.getProperty( "name" ) );
larry.setProperty( "name", "Larry Wachowski" );
index.index( larry, "name", larry.getProperty( "name" ) );

// start the test...
index.getNodes( "name", "wachowski" );
// now returns only larry instead of both nodes.

Any ideas? It looks like the index entry for andy is removed as well.
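
For reference, the behaviour this test expects from removeIndex() can be sketched with a toy in-memory index. This is only an illustration under stated assumptions: the class RemoveIndexSketch, its use of plain strings in place of nodes, and its whitespace/lower-case word splitting are all made up for this sketch and are not index-util code. It also says nothing about where the actual bug lies. The point is simply that removing (larry, "name", "Andy Wachowski") should drop only larry from the words "andy" and "wachowski", leaving andy's entries untouched.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of the expected remove/re-index contract (hypothetical, not index-util). */
final class RemoveIndexSketch {
    // key -> (lower-cased word -> names of the nodes indexed under that word)
    private final Map<String, Map<String, Set<String>>> entries = new TreeMap<>();

    /** Assumed analysis step: split on whitespace and lower-case. */
    private static String[] words(String value) {
        return value.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    void index(String node, String key, String value) {
        Map<String, Set<String>> byWord = entries.computeIfAbsent(key, k -> new TreeMap<>());
        for (String word : words(value)) {
            byWord.computeIfAbsent(word, w -> new TreeSet<>()).add(node);
        }
    }

    /** Removes only this node from the words of the given value. */
    void removeIndex(String node, String key, String value) {
        Map<String, Set<String>> byWord = entries.get(key);
        if (byWord == null) {
            return;
        }
        for (String word : words(value)) {
            Set<String> nodes = byWord.get(word);
            if (nodes != null) {
                nodes.remove(node);
                if (nodes.isEmpty()) {
                    byWord.remove(word);
                }
            }
        }
    }

    Set<String> getNodes(String key, String word) {
        Map<String, Set<String>> byWord = entries.getOrDefault(key, Map.of());
        return new TreeSet<>(byWord.getOrDefault(word.toLowerCase(Locale.ROOT), Set.of()));
    }

    public static void main(String[] args) {
        RemoveIndexSketch index = new RemoveIndexSketch();
        index.index("andy", "name", "Andy Wachowski");
        index.index("larry", "name", "Andy Wachowski"); // deliberately wrong
        // fix larry's name: remove the old entry, then index the new value
        index.removeIndex("larry", "name", "Andy Wachowski");
        index.index("larry", "name", "Larry Wachowski");
        System.out.println(index.getNodes("name", "wachowski")); // [andy, larry]
        System.out.println(index.getNodes("name", "andy"));      // [andy]
    }
}
```

In this model the removal is scoped to one node per word, so two nodes that once shared the same value cannot affect each other when one of them is re-indexed.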

Cheers,
Sebastian



Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-18 Thread Mattias Persson
I've made some changes to make LuceneFulltextIndexService and
LuceneFulltextQueryIndexService behave more naturally. So this is the
new (and better) deal (copied from the javadoc, from your example!):

LuceneFulltextIndexService:
/**
 * Since this is a "fulltext" index it changes the contract of this method
 * slightly. It treats the {@code value} more like a query in that you can
 * query for individual words in your indexed values.
 *
 * So if you've indexed node (1) with value "Andy Wachowski" and node (2)
 * with "Larry Wachowski" you can expect this behaviour if you query for:
 *
 * o "andy"            --> (1)
 * o "Andy"            --> (1)
 * o "wachowski"       --> (1), (2)
 * o "andy larry"      -->
 * o "larry Wachowski" --> (2)
 * o "wachowski Andy"  --> (1)
 */

LuceneFulltextQueryIndexService:
/**
 * Here the {@code value} is treated as a lucene query,
 * http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
 *
 * So if you've indexed node (1) with value "Andy Wachowski" and node (2)
 * with "Larry Wachowski" you can expect this behaviour if you query for:
 *
 * o "andy"            --> (1)
 * o "Andy"            --> (1)
 * o "wachowski"       --> (1), (2)
 * o "andy AND larry"  -->
 * o "andy OR larry"   --> (1), (2)
 * o "larry Wachowski" --> (1), (2) // lucene's default operator is OR
 *
 * The default AND/OR behaviour can be changed by overriding
 * {@link #getDefaultQueryOperator(String, Object)}.
 */


Does this make more sense?
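
The difference between the two tables above for a multi-word value such as "larry Wachowski" can be illustrated with a small in-memory sketch. It is only a toy model under stated assumptions: the class AndOrQuerySketch and its methods matchAll and matchAny are hypothetical names, words are split on whitespace and lower-cased, and no Lucene query syntax (explicit AND/OR keywords, wildcards) is parsed. matchAll requires every word, as in the LuceneFulltextIndexService table; matchAny accepts any word, as with OR as the default operator.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy word-level index contrasting "all words" and "any word" matching (hypothetical). */
final class AndOrQuerySketch {
    // lower-cased word -> ids of the nodes whose value contains that word
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    /** Assumed analysis step: split on whitespace and lower-case. */
    private static String[] words(String text) {
        return text.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    void index(int nodeId, String value) {
        for (String word : words(value)) {
            postings.computeIfAbsent(word, w -> new TreeSet<>()).add(nodeId);
        }
    }

    /** Every query word must occur in the node's value. */
    Set<Integer> matchAll(String query) {
        Set<Integer> result = null;
        for (String word : words(query)) {
            Set<Integer> hits = postings.getOrDefault(word, Set.of());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    /** At least one query word must occur in the node's value. */
    Set<Integer> matchAny(String query) {
        Set<Integer> result = new TreeSet<>();
        for (String word : words(query)) {
            result.addAll(postings.getOrDefault(word, Set.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        AndOrQuerySketch index = new AndOrQuerySketch();
        index.index(1, "Andy Wachowski");
        index.index(2, "Larry Wachowski");
        System.out.println(index.matchAll("andy larry"));      // []
        System.out.println(index.matchAll("larry Wachowski")); // [2]
        System.out.println(index.matchAny("larry Wachowski")); // [1, 2]
    }
}
```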


Re: [Neo] Strange behavior of LuceneFulltextIndexService

2009-12-17 Thread Mattias Persson
That is indeed behaviour which needs to be straightened out; I see
that it doesn't behave as expected all the time. I'll look into this
as soon as possible.

Btw., is it a good idea to have LuceneFulltextIndexService and
LuceneFulltextQueryIndexService be case-insensitive? Should that be
configurable, or would case-sensitivity be nicer (so that you'd have
to run .toLowerCase(), or something similar, on your strings and
queries to get such behaviour)?
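
To make the case-sensitive alternative concrete, here is a minimal sketch of what callers would have to do themselves. Everything in it is an assumption for illustration: CaseNormalizationSketch is a hypothetical stand-in for a case-sensitive exact-match index (a plain HashMap), and normalize is a made-up helper. The same normalization has to be applied both when indexing and when querying, otherwise lookups miss.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for a case-sensitive exact-match index. */
final class CaseNormalizationSketch {
    private final Map<String, String> termToNode = new HashMap<>();

    /** The normalization a caller would apply to values and queries alike. */
    static String normalize(String text) {
        // Locale.ROOT avoids locale-specific lower-casing rules.
        return text.toLowerCase(Locale.ROOT);
    }

    void index(String term, String node) {
        termToNode.put(term, node);
    }

    /** Exact, case-sensitive lookup; returns null when nothing matches. */
    String getSingleNode(String term) {
        return termToNode.get(term);
    }

    public static void main(String[] args) {
        CaseNormalizationSketch raw = new CaseNormalizationSketch();
        raw.index("Andy", "andy-node");
        System.out.println(raw.getSingleNode("andy")); // null: case differs

        CaseNormalizationSketch normalized = new CaseNormalizationSketch();
        normalized.index(normalize("Andy"), "andy-node");
        System.out.println(normalized.getSingleNode(normalize("ANDY"))); // andy-node
    }
}
```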




-- 
Mattias Persson, [matt...@neotechnology.com]
Neo Technology, www.neotechnology.com


[Neo] Strange behavior of LuceneFulltextIndexService

2009-12-17 Thread Sebastian Stober
Hello,

I ran into some strange behavior of the LuceneFulltextIndexService in
the application I am building. So I put together a junit test based on
the example from
http://wiki.neo4j.org/content/Indexing_with_IndexService#Wachowski_brothers_example

Here's what I found out using 0.9-SNAPSHOT of index-util (version 0.8
wasn't any better):

// ... setup as in example
// LFTIS means LuceneFulltextIndexService and
// LFTQIS means LuceneFulltextQueryIndexService
// IterableUtils.asVector(resIt) is just some helper method

// This will return the andy node.
// (Note: for LuceneIndexService)
res = index.getSingleNode( "name", "andy wachowski" );
assertEquals(andy, res);
// LFTIS: null   LFTQIS: [andy, larry]

res = index.getSingleNode( "name", "Andy Wachowski" );
assertEquals(andy, res);
// LFTIS: null   LFTQIS: ok

res = index.getSingleNode( "name", "andy" );
assertEquals(andy, res);
// LFTIS: ok   LFTQIS: ok

res = index.getSingleNode( "name", "Andy" );
assertEquals(andy, res);
// LFTIS: ok   LFTQIS: null


// This will return an Iterable containing only the andy node
// (Note: for LuceneIndexService)
resIt = index.getNodes( "name", "Andy Wachowski");
resList = IterableUtils.asVector(resIt);
assertEquals(1, resList.size());
assertTrue(resList.contains(andy));
// LFTIS: []   LFTQIS: []

resIt = index.getNodes( "name", "andy wachowski");
resList = IterableUtils.asVector(resIt);
assertEquals(1, resList.size());
assertTrue(resList.contains(andy));
// LFTIS: []   LFTQIS: [andy, larry]

resIt = index.getNodes( "name", "Andy");
resList = IterableUtils.asVector(resIt);
assertEquals(1, resList.size());
assertTrue(resList.contains(andy));
// LFTIS: ok   LFTQIS: []

resIt = index.getNodes( "name", "andy" );
resList = IterableUtils.asVector(resIt);
System.out.println(resList);
assertEquals(1, resList.size());
assertTrue(resList.contains(andy));
// LFTIS: ok   LFTQIS: ok


// This will return an Iterable containing both andy and larry
// (Note: for LuceneIndexService)
resIt = index.getNodes( "title", "Director" );
resList = IterableUtils.asVector(resIt);
assertEquals(2, resList.size());
assertTrue(resList.contains(larry));
assertTrue(resList.contains(andy));
// LFTIS: ok   LFTQIS: []

resIt = index.getNodes( "title", "director" );
resList = IterableUtils.asVector(resIt);
assertEquals(2, resList.size());
assertTrue(resList.contains(larry));
assertTrue(resList.contains(andy));
// LFTIS: ok   LFTQIS: ok


// Will return andy and larry since the fulltext index will find matches
// on word-level.
resIt = index.getNodes( "name", "wachowski" );
resList = IterableUtils.asVector(resIt);
assertEquals(2, resList.size());
assertTrue(resList.contains(larry));
assertTrue(resList.contains(andy));
// LFTIS: ok   LFTQIS: ok


// Will return the andy node.
// Note: Should not work for LFTIS
resIt = index.getNodes( "name", "wachow* andy" );
resList = IterableUtils.asVector(resIt);
System.out.println(resList);
assertEquals(1, resList.size());
assertTrue(resList.contains(andy));
// LFTIS: []   LFTQIS: [andy, larry]


My conclusions from this:
1. LFTIS works only for single-word queries.
   How can I then query for "Andy Wachowski"?

2. LFTQIS is case-sensitive: getNodes() retrieves nothing for "Andy
   Wachowski", while "andy wachowski" finds both andy AND larry.
   On the other hand, getSingleNode() works only for "Andy Wachowski".

If this behavior is deliberate, I'd really like to understand the
semantics behind it. It would also be a good idea to add some more
documentation on the wiki.
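
One possible reading of conclusion 1, not confirmed anywhere in this thread, is that a word-level index stores each word of a value as its own term, so a multi-word query looked up as a single un-split term cannot match. The sketch below only illustrates that idea: TokenLookupSketch and its methods are hypothetical, and the whitespace/lower-case splitting is an assumption rather than the analyzer index-util actually uses.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Toy illustration of single-term versus word-by-word lookup (hypothetical). */
final class TokenLookupSketch {
    /** Terms a word-level index might store for one value (assumed analysis). */
    static Set<String> storedTerms(String value) {
        return new TreeSet<>(Arrays.asList(value.trim().toLowerCase(Locale.ROOT).split("\\s+")));
    }

    /** Looks the lower-cased query up as ONE un-split term. */
    static boolean matchesAsSingleTerm(String value, String query) {
        return storedTerms(value).contains(query.toLowerCase(Locale.ROOT));
    }

    /** Splits the query like the value and requires every word to be present. */
    static boolean matchesWordByWord(String value, String query) {
        return storedTerms(value).containsAll(storedTerms(query));
    }

    public static void main(String[] args) {
        String value = "Andy Wachowski";
        System.out.println(storedTerms(value));                           // [andy, wachowski]
        System.out.println(matchesAsSingleTerm(value, "andy"));           // true
        System.out.println(matchesAsSingleTerm(value, "andy wachowski")); // false
        System.out.println(matchesWordByWord(value, "andy wachowski"));   // true
    }
}
```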

Best regards,
Sebastian

