Re: indexing numbers in texts for range queries

2014-12-02 Thread Ahmet Arslan
Hi Mikhail,

Range queries are allowed inside phrases with ComplexPhraseQParser, but I think 
string (lexicographic) order is used, not numeric order.
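
If string order is indeed what is used, as suspected above, a range such as
[8 TO 20] would not behave numerically. The following is a minimal sketch in
plain Python, not Solr code; the term list and the helper names are
hypothetical and only illustrate the difference between the two orderings.

```python
# Hypothetical numeric tokens as they might appear in a plain-text field.
terms = ["8", "10", "20", "100", "9"]

def in_string_range(term, low, high):
    # Lexicographic comparison, character by character.
    return low <= term <= high

def in_numeric_range(term, low, high):
    # Numeric comparison after parsing the token.
    return low <= int(term) <= high

# "8" sorts after "20" as a string, so the string range is empty.
string_hits = [t for t in terms if in_string_range(t, "8", "20")]
numeric_hits = [t for t in terms if in_numeric_range(t, 8, 20)]

# Zero-padding to a fixed width makes string order agree with numeric order.
WIDTH = 5
padded_hits = [
    t for t in terms
    if in_string_range(t.zfill(WIDTH), "8".zfill(WIDTH), "20".zfill(WIDTH))
]
```

Zero-padding is one common workaround when only string ranges are available;
whether it fits here depends on how the numbers are tokenized at index time.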

Also, LUCENE-5205 / SOLR-5410 is meant to supersede the complex phrase parser. It 
might have that functionality too.

Ahmet
 






Re: indexing numbers in texts for range queries

2014-12-02 Thread Michael Sokolov


On 12/02/2014 03:41 PM, Mikhail Khludnev wrote:
> Thanks for the suggestions. Do I remember correctly that you ignored the last
> Lucene Revolution?
I wouldn't say I ignored it, but it's true I wasn't there in DC: I'm 
excited to catch up on the presentations as the videos become available, 
though.


-Mike


Re: indexing numbers in texts for range queries

2014-12-02 Thread Mikhail Khludnev
Hello Michael,

On Tue, Dec 2, 2014 at 11:15 PM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:

> Mikhail - I can imagine a filter that strips out everything but numbers
> and then indexes those with a (separate) numeric (trie) field.  But I don't
> believe you can do phrase or other proximity queries across multiple
> fields.

Technically it's not a big deal. I used FieldMaskingSpanQuery before.
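
For illustration only, here is a minimal sketch of the idea being discussed:
numbers go to a separate field but keep their token positions from the
original text, so proximity across the two fields stays meaningful. This is
plain Python, not Lucene code; split_fields and number_between are
hypothetical helpers, and the tokenizer (ASCII letters and digits only) is an
assumption. In Lucene, the cross-field span matching itself would be done
with something like FieldMaskingSpanQuery, as mentioned above.

```python
import re

# Assumed tokenizer: runs of ASCII digits or ASCII letters.
TOKEN = re.compile(r"[0-9]+|[A-Za-z]+")

def split_fields(text):
    """Return (text_tokens, numeric_tokens), each a list of (position, value).

    Positions are shared between both lists, as if both fields were
    produced from the same token stream.
    """
    text_tokens, numeric_tokens = [], []
    for position, match in enumerate(TOKEN.finditer(text)):
        token = match.group().lower()
        if token.isdigit():
            numeric_tokens.append((position, int(token)))
        else:
            text_tokens.append((position, token))
    return text_tokens, numeric_tokens

def number_between(text, first, last, low, high):
    """True if a number in [low, high] sits between the words first and last."""
    words, numbers = split_fields(text)
    starts = [p for p, w in words if w == first]
    ends = [p for p, w in words if w == last]
    return any(
        s < p < e and low <= n <= high
        for p, n in numbers for s in starts for e in ends
    )

words, numbers = split_fields("foo and 10 bars")
```

Because the number keeps position 2 from the original text, a query like
foo [8 TO 20] bars can be checked against both fields at once.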

> As long as an or-query is good enough, I think this problem is not too
> hard?  But if you need proximity it becomes more complicated.  Once in the
> distant past we coded a numeric range query using a complicated set of
> wildcard queries that could handle large numbers efficiently - this search
> index (Verity) had no range capability, so we had to mock it up using
> text.  The way this worked was something along these lines:
>
> 1) transform all the numbers into their binary encoding (8 = 0b1000,
> eg)
> 2) write queries by encoding the range as a set of bitmasks represented by
> wildcard queries:
> [8 TO 20] becomes (0b00001??? 0b000100?? 0b00010100)
>
> I know you said you cannot use [0-9]* terms, but you will not see terrible
> term explosion with this.  What's your concern there?
>
It's not terrible, but it is significant. I'd like to try the trie
magic, which reduces query-time processing.

Thanks for the suggestions.
Do I remember correctly that you ignored the last Lucene Revolution?

>
> -Mike
>
>
>
> On 12/02/2014 02:59 PM, Mikhail Khludnev wrote:
>
>> Hello Searchers,
>>
>> Does anyone remember any examples of indexing numbers inside plain text?
>> E.g., if I have the text "foo and 10 bars", I want to find it with a query
>> like foo [8 TO 20] bars.
>> Question no. 1 is whether to put the trie terms into a separate field, or
>> whether they can reside in the same text field. Note that enumerating
>> [0-9]* terms in MultiTermQuery is not an option for me; I definitely need
>> the trie field magic!
>> Perhaps you can point me to a blog post or a chapter, whatever makes me
>> happy.
>>
>> Thanks a lot!
>>
>>
>


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>



Re: indexing numbers in texts for range queries

2014-12-02 Thread Michael Sokolov
Mikhail - I can imagine a filter that strips out everything but numbers 
and then indexes those with a (separate) numeric (trie) field.  But I 
don't believe you can do phrase or other proximity queries across 
multiple fields.  As long as an or-query is good enough, I think this 
problem is not too hard?  But if you need proximity it becomes more 
complicated.  Once in the distant past we coded a numeric range query 
using a complicated set of wildcard queries that could handle large 
numbers efficiently - this search index (Verity) had no range 
capability, so we had to mock it up using text.  The way this worked was 
something along these lines:


1) transform all the numbers into their binary encoding (8 = 0b1000, eg)
2) write queries by encoding the range as a set of bitmasks represented 
by wildcard queries:

[8 TO 20] becomes (0b00001??? 0b000100?? 0b00010100)
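
The two steps above can be sketched roughly as follows. This is a minimal
illustration in plain Python, not the original Verity code. It assumes that
every number is indexed as a fixed-width binary string (8 bits here) and that
"?" is a single-character wildcard. The names range_to_wildcards and matches
are hypothetical.

```python
def range_to_wildcards(low, high, width=8):
    """Cover [low, high] with fixed-width binary patterns; '?' matches one bit."""
    if not 0 <= low <= high < (1 << width):
        raise ValueError("need 0 <= low <= high < 2**width")
    patterns = []
    while low <= high:
        # Find the largest aligned block of 2**k values that starts at
        # low and still fits inside the range.
        k = 0
        while (k < width
               and low % (1 << (k + 1)) == 0
               and low + (1 << (k + 1)) - 1 <= high):
            k += 1
        # The fixed leading bits, zero-padded; empty if the block is the
        # whole value space.
        prefix = format(low >> k, "b").zfill(width - k) if k < width else ""
        patterns.append("0b" + prefix + "?" * k)
        low += 1 << k
    return patterns

def matches(pattern, value, width=8):
    """Check a value against one pattern, the way a wildcard query would."""
    bits = format(value, "b").zfill(width)
    return all(p in ("?", b) for p, b in zip(pattern[2:], bits))

patterns = range_to_wildcards(8, 20)
```

For [8 TO 20] with an 8-bit width this yields three patterns, covering 8-15,
16-19, and 20. The number of patterns grows with the bit width rather than
with the size of the range, which is why the term explosion stays small.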

I know you said you cannot use [0-9]* terms, but you will not see 
terrible term explosion with this.  What's your concern there?


-Mike






indexing numbers in texts for range queries

2014-12-02 Thread Mikhail Khludnev
Hello Searchers,

Does anyone remember any examples of indexing numbers inside plain text?
E.g., if I have the text "foo and 10 bars", I want to find it with a query like
foo [8 TO 20] bars.
Question no. 1 is whether to put the trie terms into a separate field, or
whether they can reside in the same text field. Note that enumerating [0-9]*
terms in MultiTermQuery is not an option for me; I definitely need the trie
field magic!
Perhaps you can point me to a blog post or a chapter, whatever makes me happy.

Thanks a lot!

-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>



Re: indexing numbers

2011-05-25 Thread Rob Casson
the default schema.xml provided in the Solr distribution is
well-documented, and a good place to get started (including numeric
fieldTypes):

 http://wiki.apache.org/solr/SchemaXml

Lucid Imagination also provides a nice reference guide:

 http://www.lucidimagination.com/Downloads/LucidWorks-for-Solr/Reference-Guide

hope that helps,
rob

On Wed, May 25, 2011 at 6:20 PM, antoniosi wrote:
> Hi,
>
> How does Solr index a numeric value? Does it index it as a string, or does
> it keep it as a numeric value?
>
> Thanks.


indexing numbers

2011-05-25 Thread antoniosi
Hi,

How does Solr index a numeric value? Does it index it as a string, or does it
keep it as a numeric value?

Thanks.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-numbers-tp2986424p2986424.html
Sent from the Solr - User mailing list archive at Nabble.com.