Re: NumberTools

2008-01-16 Thread mark harwood
o just run up a test rig for the scale of index you need and benchmarking. Cheers Mark - Original Message From: Cam Bazz <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, 16 January, 2008 3:37:44 PM Subject: NumberTools Hello, When storing fields to serve

NumberTools

2008-01-16 Thread Cam Bazz
Hello, When storing fields to serve as id's - is it better to use NumberTools.longToString(id) or just store the id as a field? I have noticed when using NumberTools to store number as a string, this makes range queries easier, however - you end up storing a long string. Considering millio

Re: NumberTools - Range Searches

2007-09-18 Thread Mohammad Norouzi
AIL PROTECTED]> wrote: > > Hi, > > > > I'm trying to do Range searches on a numeric field and have come across > the lexicographical behavior, such that > > > > mylength: [0-2] > > > > brings back values such as 1172 > > > > > > I

NumberTools - Range Searches

2007-09-18 Thread marc.dumontier
Hi, I'm trying to do Range searches on a numeric field and have come across the lexicographical behavior, such that mylength: [0-2] brings back values such as 1172 I've seen prior posts about the NumberTools class, and have indexed a field using the longToString() meth
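
The behavior described in this message can be reproduced without Lucene. The following is a minimal sketch, assuming that term comparison on plain ASCII digit strings behaves like Java's String.compareTo; the class and method names are made up for illustration and are not part of any Lucene API.

```java
class LexicographicRangeDemo {
    // True when value lies in [lower, upper] under plain String ordering.
    // This is a stand-in for how an untyped term range is evaluated.
    static boolean inLexicalRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // "1172" starts with '1', which sorts between '0' and '2'.
        System.out.println(inLexicalRange("1172", "0", "2"));
        // With equal-width, zero-padded terms the same value falls outside.
        System.out.println(inLexicalRange("0000001172", "0000000000", "0000000002"));
    }
}
```

The first check shows why an unpadded range matches 1172; the second shows that fixed-width terms do not have this problem.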

Re: two questions about NumberTools

2007-07-29 Thread Mohammad Norouzi
work. field:[20 TO 25] works fine but when > > the > > > both limits of the range have different number of digits the query > > won't > > > work. so the solution is NumberTools. > > > > > 1 - the NumberTools utility has two methods longToString() and > >

Re: two questions about NumberTools

2007-07-16 Thread Mohammad Norouzi
> both limits of the range have different number of digits the query won't > work. so the solution is NumberTools. > 1 - the NumberTools utility has two methods longToString() and > stringToLong(). how about the double or float numbers?? I think it should > be ex

Re: two questions about NumberTools

2007-07-16 Thread Dima May
:[20 TO 25] works fine but when the both limits of the range have different number of digits the query won't work. so the solution is NumberTools. 1 - the NumberTools utility has two methods longToString() and stringToLong(). how about the double or float numbers?? I think it should be ext

two questions about NumberTools

2007-07-16 Thread Mohammad Norouzi
work. so the solution is NumberTools. 1 - the NumberTools utility has two methods longToString() and stringToLong(). how about the double or float numbers?? I think it should be extended to support more data types 2 - suppose we are indexing a bunch of information and a field is in long that we need to ap

Re: Numbertools and efficient sorting

2006-06-11 Thread Chris Hostetter
earch about the memory usage and expanding to : more boxes and repost to the list. Don't overlook my first point about using FieldCache with your own IntParser -- that should be just as fast as adding another version of the field that doesn't use NumberTools with the added bonus of

Re: Numbertools and efficient sorting

2006-06-10 Thread Benjamin Stein
On 6/9/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: : I have an integer field that I've indexed after converting to a string : using NumberTools.longToString(). : Now I want to sort my results using this field. Everything works when : treating the field as a string, but is very slow and memor

Re: Numbertools and efficient sorting

2006-06-09 Thread Chris Hostetter
: I have an integer field that I've indexed after converting to a string : using NumberTools.longToString(). : Now I want to sort my results using this field. Everything works when : treating the field as a string, but is very slow and memory intensive. : : I want to use INT sorting instead, but

Numbertools and efficient sorting

2006-06-09 Thread Benjamin Stein
these strings can not be parsed back into integers by Java's built in parsing functions, which is presumably what is called when Lucene does its sorting. They require NumberTools.stringToLong(). My plan is to reindex and zero pad the number myself instead of using NumberTools. Is there a way to
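
The zero-padding plan mentioned in this message might look like the sketch below. It is an illustration only, under the assumption that all values are non-negative ints; the names ZeroPadDemo, pad and unpad are hypothetical and this is not how NumberTools encodes values.

```java
import java.util.Locale;

class ZeroPadDemo {
    // Integer.MAX_VALUE (2147483647) has 10 digits, so 10 is wide enough.
    static final int WIDTH = 10;

    // Encode a non-negative int as a fixed-width decimal term.
    static String pad(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("non-negative values only");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }

    // Unlike a custom encoding, the padded term parses with the JDK parser.
    static int unpad(String term) {
        return Integer.parseInt(term);
    }

    public static void main(String[] args) {
        String a = pad(42);
        String b = pad(1172);
        System.out.println(a + " " + b);
        System.out.println(a.compareTo(b) < 0); // string order matches numeric order
        System.out.println(unpad(a));
    }
}
```

Because the padded term is still a plain decimal string, Integer.parseInt reads it back directly, which is the property the poster is after.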

Re: NumberTools

2005-03-24 Thread Yonik Seeley
> An int will be stored as a 2 char string which will be sorted "char by char" > so > they will be almost as fast as sorting as integers. John, two problems: 1) Memory consumption - string sorting uses String[] instead of int[] 2) Lucene uses UTF-8 to store strings, and you can't round-trip arbit

Re: NumberTools

2005-03-22 Thread Chuck Williams
Doug Cutting writes (3/22/2005 10:05 AM): Chuck Williams wrote: If there is going to be any generalization to built-in sorting representations, I'd like to suggest two things be included: 1. Fix issue 34028 (delete the one word "final") Done. Thank you! 2. Include a provision for query-time

Re: NumberTools

2005-03-22 Thread John Patterson
Doug Cutting writes: > I'd like to see benchmarks that demonstrate the improvement before we > consider including such a patch. You're making a lot of assumptions > about where time is spent performing numeric searching and sorting. > Sort and RangeFilter are already pretty effici

Re: NumberTools

2005-03-22 Thread Doug Cutting
Chuck Williams wrote: If there is going to be any generalization to built-in sorting representations, I'd like to suggest two things be included: 1. Fix issue 34028 (delete the one word "final") Done. 2. Include a provision for query-time parameters Can you provide a proposal? Doug --

Re: NumberTools

2005-03-22 Thread Chuck Williams
John Patterson writes (3/22/2005 12:56 AM): It would be great if this could be incorporated into Lucene as it will make numeric searches much more efficient. I will soon need to store simple geographical data in my index to do a "find the nearest x" type of search. I just added "find the neares

Re: NumberTools

2005-03-22 Thread Doug Cutting
John Patterson wrote: It would be great if this could be incorporated into Lucene as it will make numeric searches much more efficient. I'd like to see benchmarks that demonstrate the improvement before we consider including such a patch. You're making a lot of assumptions about where time is sp

Re: NumberTools

2005-03-22 Thread John Patterson
Chris Hostetter writes: > I haven't worked through the math to prove to myself that your algorithm > is a viable way of expressing any Integer as a 4 byte String; such that > any two Integers sort lexicographically correct as strings ... but let's > assume that i have, and that it works

Re: NumberTools

2005-03-22 Thread Chris Hostetter
: > I can see in FieldDocSortedHitQueue where the case statement deals with : > the various types of SortField, but at that point it's comparing FieldDoc : > objects whose fields[i] is expected to already be an "Integer" object. : > where is that "Integer" object parsed from the String value of th

Re: NumberTools

2005-03-21 Thread John Patterson
Chris Hostetter writes: > > So why couldn't a user specified NumberFormat object be used to > convert that string into an Integer? Allowing people to format > their numbers in a way that sorts lexicographically for Range Filters, > but still get the good Numeric Sot

Re: NumberTools

2005-03-21 Thread Chuck Williams
: One annoyance I have run across is the impedance mismatch between : range queries and sorting. : : If your terms are indexed as standard numbers, then integer sorting : is fast, but range queries don't work (for negative values). If you : format the terms such that range queries work for any i

Re: NumberTools

2005-03-21 Thread Chris Hostetter
: One annoyance I have run across is the impedance mismatch between : range queries and sorting. : : If your terms are indexed as standard numbers, then integer sorting : is fast, but range queries don't work (for negative values). If you : format the terms such that range queries work for any in
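
The mismatch for negative values can be shown with plain strings, along with one possible way around it. This is only a sketch under stated assumptions: the offset-and-pad encoding and the names OffsetEncodingDemo, encode and decode are hypothetical, and this is not the format that NumberTools produces.

```java
import java.util.Locale;

class OffsetEncodingDemo {
    // Shift the int range so Integer.MIN_VALUE maps to 0, then zero-pad.
    // The shifted range is 0..4294967295, which fits in 10 decimal digits.
    static String encode(int value) {
        long shifted = (long) value - Integer.MIN_VALUE;
        return String.format(Locale.ROOT, "%010d", shifted);
    }

    // Reverse the shift to recover the original int.
    static int decode(String term) {
        return (int) (Long.parseLong(term) + Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        // Plain decimal strings: "-1" sorts before "-2", the reverse of numeric order.
        System.out.println("-1".compareTo("-2") < 0);
        // Offset-encoded terms: -2 sorts before -1, as it should.
        System.out.println(encode(-2).compareTo(encode(-1)) < 0);
        System.out.println(decode(encode(-2)));
    }
}
```

The trade-off is the one raised in the thread: terms like these make range queries work for negative values, but they no longer parse as the original integers without the matching decode step.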

Re: NumberTools

2005-03-18 Thread John Patterson
Erik Hatcher writes: > Lucene's index works with any String. But, when dealing with numbers > and dates such that range queries work, they need to be formatted in a > way that makes them orderable. What I am suggesting here is storing numeric values as unsigned binary v

Re: NumberTools

2005-03-18 Thread Erik Hatcher
Using a zero-padded number like 0001 is fine for both range queries and for integer-based sorting. Are you finding otherwise? Erik On Mar 18, 2005, at 12:46 PM, Yonik Seeley wrote: There is prefix compression used on term values. So you could pad numbers with lots of leading zeros a

Re: NumberTools

2005-03-18 Thread Yonik Seeley
> There is prefix compression used on term values. So you could pad > numbers with lots of leading zeros and not incur much additional > size... 0001, for example. Interesting... One annoyance I have run across is the impedance mismatch between range queries and sorting. If your terms a

Re: NumberTools

2005-03-18 Thread Erik Hatcher
On Mar 18, 2005, at 11:21 AM, John Patterson wrote: Because Lucene deals with Strings lexicographically ordered. I thought lexicographical ordering simply used the Unicode value of the chars and so would also work with non-alphanumeric strings. Lucene's index works with any String. But, when deal

Re: NumberTools

2005-03-18 Thread John Patterson
> Because Lucene deals with Strings lexicographically ordered. I thought lexicographical ordering simply used the Unicode value of the chars and so would also work with non-alphanumeric strings. > Is there an issue you're encountering? No issue - I will soon need to add a lot of unstored numeric

Re: NumberTools

2005-03-18 Thread Erik Hatcher
Because Lucene deals with Strings lexicographically ordered. Is there an issue you're encountering? Erik On Mar 18, 2005, at 4:31 AM, John Patterson wrote: Hi all, I was wondering why NumberTools and DateTools create strings restricted to alphanumeric val

NumberTools

2005-03-18 Thread John Patterson
Hi all, I was wondering why NumberTools and DateTools create strings restricted to alphanumeric values? John.