Re: [Neo4j] numericRange vs. exact/fulltext index

Balazs E. Pataki Wed, 29 Jun 2011 01:10:53 -0700

Great, thank you!

Another thing I just found in IndexType: in a fulltext index each field 
is stored as fulltext (tokenized into terms), and there is also an "exact" 
field (the same key suffixed with "_e"). I don't think this feature is 
officially documented, so my question is whether we can count on these 
"_e" fields existing in the Lucene index in future versions of neo4j as 
well. I would be glad to use them.
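
To make the layout concrete, here is a toy model of the two fields a fulltext index writes per key, as described above. It is only a sketch in plain Java, not neo4j or Lucene code: the class FulltextFieldModel and the method fieldsFor are made up for illustration, and the lowercase whitespace tokenization is an assumption about the analyzer. Only the "_e" suffix comes from what I saw in IndexType.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical model class; it does not touch neo4j or Lucene.
final class FulltextFieldModel {

    // Name of the implicit exact field: the key suffixed with "_e".
    static String exactKey(String key) {
        return key + "_e";
    }

    // Returns the field name -> stored terms mapping for one key/value pair.
    static Map<String, List<String>> fieldsFor(String key, String value) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        // Exact field: the whole value kept as a single, untokenized term.
        fields.put(exactKey(key), Collections.singletonList(value));
        // Fulltext field: assumed here to be lowercased and split on whitespace.
        String[] terms = value.trim().toLowerCase(Locale.ROOT).split("\\s+");
        fields.put(key, Arrays.asList(terms));
        return fields;
    }

    public static void main(String[] args) {
        // Prints {title_e=[Graph Database], title=[graph, database]}
        System.out.println(fieldsFor("title", "Graph Database"));
    }
}
```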


In my application I have text content which I index in a "fulltext" 
neo4j/lucene index, but I also have to index it in "exact" form. Not 
knowing about this "_e" feature, I did the "exact" indexing myself by 
tweaking the content (replacing spaces with "_" and doing the same with 
search terms). If I could depend on the "_e" fields instead, I could get rid 
of my own exact index tweaking and my index would be half as big.
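
The tweak I mean is roughly the following. This is a minimal sketch, not my actual application code: the class ExactTermWorkaround and the method toExactTerm are illustrative names, and a HashMap stands in for the index so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of neo4j: emulates exact matching by
// turning a multi-word value into a single whitespace-free term.
final class ExactTermWorkaround {

    // Apply the same transformation at index time and at query time.
    static String toExactTerm(String text) {
        return text.replace(' ', '_');
    }

    public static void main(String[] args) {
        // Stand-in for the index: exact term -> node id.
        Map<String, String> index = new HashMap<>();
        index.put(toExactTerm("graph database"), "node-1");

        // The full phrase matches; a single word of it does not.
        System.out.println(index.get(toExactTerm("graph database"))); // node-1
        System.out.println(index.get(toExactTerm("graph")));          // null
    }
}
```

Every value has to be indexed twice this way, once as is and once transformed, which is why the implicit "_e" field would make my own copy unnecessary.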

Do you think I can safely depend on these implicit exact indexed fields 
in fulltext indexes?

Regards,
---
balazs

On 6/29/11 7:44 AM, Mattias Persson wrote:
> Wow, thank you for finding that. Well done!
>
> I'll fix it and if it doesn't break anything else then I'll commit it.
>
> Best,
> Mattias
>
> On Tuesday, 28 June 2011, Balazs E. Pataki<pat...@dsd.sztaki.hu> wrote:
>> Hi Mattias,
>>
>> Thanks for the tip!
>>
>> I started to look around and I think I found something. When a "fulltext"
>> type index is created, its type will be CustomType (a subclass of IndexType;
>> IndexType itself is used for "exact" indexes) in neo4j. CustomType overrides
>> IndexType's addToDocument() method, which is the function that
>> actually creates a Lucene field.
>>
>> IndexType's implementation looks like this:
>>
>> public void addToDocument( Document document, String key, Object value )
>> {
>>     document.add( instantiateField( key, value, Index.NOT_ANALYZED ) );
>> }
>>
>> CustomType's implementation, on the other hand:
>>
>> @Override
>> public void addToDocument( Document document, String key, Object value )
>> {
>>     document.add( new Field( exactKey( key ), value.toString(),
>>         Store.YES, Index.NOT_ANALYZED ) );
>>     document.add( instantiateField( key, value.toString(),
>>         Index.ANALYZED ) );
>> }
>>
>> What I can see here is that CustomType's version explicitly converts
>> value to a String, and therefore instantiateField won't detect it as a
>> number and will not create a NumericField for it.
>>
>> Could this be the root of the problem?
>>
>> I just replaced 'value.toString()' with 'value', and now my test runs OK
>> (and fulltext search for terms still works alongside numeric range queries).
>>
>> Regards,
>> ---
>> balazs
>>
>> On 6/28/11 4:41 PM, Mattias Persson wrote:
>>> Hi Balazs,
>>>
>>> I think the issue could be in Lucene, with the mix of the
>>> white-space-tokenizing analyzer and numeric values. I don't know. What I see
>>> in neo4j is that it treats the values the exact same way and the queries to the
>>> index are exactly the same, but it just doesn't return any values. I think
>>> there needs to be some more googling around this to get more answers.
>>>
>>>
>>> 2011/6/28 Balazs E. Pataki<pat...@dsd.sztaki.hu>
>>>
>>>> Hi,
>>>>
>>>> I'm playing around with indexing and numeric range queries according to
>>>> this documentation:
>>>>
>>>>     http://docs.neo4j.org/chunked/snapshot/indexing-lucene-extras.html
>>>>
>>>> According to my tests, numeric range queries
>>>> (QueryContext.numericRange()) only have an effect when an "exact" type
>>>> index is used.
>>>>
>>>>
>>>> I tried this:
>>>>
>>>> Transaction tx = graphDb.beginTx();
>>>> try {
>>>>
>>>>     Index<Node> exactIndex = graphDb.index().forNodes("exactIndex",
>>>>         MapUtil.stringMap( IndexManager.PROVIDER, "lucene", "type", "exact" ));
>>>>     Index<Node> fulltextIndex = graphDb.index().forNodes("fulltextIndex",
>>>>         MapUtil.stringMap( IndexManager.PROVIDER, "lucene", "type", "fulltext" ));
>>>>
>>>>     Node n1 = graphDb.createNode();
>>>>     n1.setProperty("foo", 5);
>>>>     exactIndex.add(n1, "foo", ValueContext.numeric(5));
>>>>     fulltextIndex.add(n1, "foo", ValueContext.numeric(5));
>>>>
>>>>     Node n2 = graphDb.createNode();
>>>>     n2.setProperty("foo", 25);
>>>>     exactIndex.add(n2, "foo", ValueContext.numeric(25));
>>>>     fulltextIndex.add(n2, "foo", ValueContext.numeric(25));
>>>>
>>>>     // Force commit
>>>>     tx.success();
>>>>     tx.finish();
>>>>     tx = graphDb.beginTx();
>>>>
>>>>     //Search exact
>>>>     QueryContext qctx = QueryContext.numericRange("foo", 3, 25);
>>>>     IndexHits<Node> hits = exactIndex.query(qctx);
>>>>     Iterator<Node> it = hits.iterator();
>>>>     while (it.hasNext()) {
>>>>         Node n = it.next();
>>>>         System.out.println("Found foo in exact: " + n + ": " + n.getProperty("foo"));
>>>>     }
>>>>     assertEquals(2, hits.size());
>>>>
>>>>     //Search fulltext
>>>>     qctx = QueryContext.numericRange("foo", 3, 25);
>>>>     hits = fulltextIndex.query(qctx);
>>>>     it = hits.iterator();
>>>>     while (it.hasNext()) {
>>>>         Node n = it.next();
>>>>         System.out.println("Found foo in fulltext: " + n + ": " + n.getProperty("foo"));
>>>>     }
>>>>     assertEquals(2, hits.size());
>>>>
>>>>     tx.success();
>>>> } finally {
>>>>     tx.finish();
>>>> }
>>>>
>>>> For the "exact" configured index the range query returns two nodes,
>>>> while for the "fulltext" configured index I get no results.
>>>>
>>>> Is there a way to use numeric range queries with fulltext indexes?
>>>>
>>>> Thanks for any hints,
>>>> ---
>>>> balazs
>>>> _______________________________________________
>>>> Neo4j mailing list
>>>> User@lists.neo4j.org
>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>
>>>
>>>
>>>