Title: [jdjlist] Re: String List Compare

Nothing.  Except I have to do 45000 of these... and the data is quite dirty, as I mentioned, and I am in no hurry to try and clean it up... or to write the validation/cleanup routine for the new data... You could say I am somewhat lazy. hehehehe

I think that of all the Java String operations I've tested, indexOf is neither the slowest nor the fastest, clocking in at 15 milliseconds.  By contrast, String.equalsIgnoreCase is a hog at 28 milliseconds, and String.equals is an acceptable 7 milliseconds, which is still far from the coveted int == int at only 3 milliseconds.  (That is on my test strings; your results will vary.)

However, I wanted to make sure I wasn't missing something more obvious. For example, SQL LIKE is a great idea; however, my DB data structure may unfortunately prevent me from using it (I have to think some more, maybe I can use it).  I also like the HashSet lookup idea, but java.util.StringTokenizer is so unbelievably slow that I will have to write a tokenizer of my own if I am to implement this method. And those perl utilities are priceless!!
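
For reference, here is a minimal sketch of the set-lookup idea without StringTokenizer. It is only an illustration under my own assumptions: the class name StateSet and the method parse are hypothetical, and the splitting is done by hand with String.indexOf(int, int) and substring, trimming each piece so the inconsistent spacing does not matter. I have not timed it against the other approaches.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, not part of any library.
class StateSet {

    // Splits a comma-separated list by hand and trims each token,
    // so " CA,OR,TX, HI " yields the set {CA, OR, TX, HI}.
    static Set<String> parse(String states) {
        Set<String> result = new HashSet<>();
        int start = 0;
        while (start <= states.length()) {
            int comma = states.indexOf(',', start);
            int end = (comma == -1) ? states.length() : comma;
            String token = states.substring(start, end).trim();
            if (!token.isEmpty()) {
                result.add(token); // skip empty pieces from dirty data like "CA,,OR"
            }
            if (comma == -1) {
                break; // no more commas: the last token has been consumed
            }
            start = comma + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> set = parse(" CA,OR,TX, HI ");
        System.out.println(set.contains("CA"));
        System.out.println(set.contains("NY"));
    }
}
```

Building the set costs more than a single indexOf call, so this probably only pays off if the same states string is queried many times.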

Great ideas, everyone. Thank you so much!!!

Greg

-----Original Message-----
From: Barzilai Spinak [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 26, 2003 10:57 AM
To: jdjlist
Subject: [jdjlist] Re: String List Compare


I don't get it... what's wrong with String.indexOf(String)?
Is it not fast enough for you?  Or is there another problem that you
have not stated?  From what you said, the problem seems trivial and I
don't see a reason to bother with HashSets or Tokenizers... unless I
didn't understand your problem at all and I should smash my computer
and transform myself into a cowboy.

BarZ

David Rosenstrauch wrote:

>How about:
>
>1) use StringTokenizer to break the string into tokens (i.e., 2-letter state
>Strings)
>
>2) put each token into a HashSet
>
>3) query for membership using set.contains("CA");
>
>
>DR
>
>
>On Wednesday 26 March 2003 01:31 pm, Greg Nudelman wrote:
>
>>I have a DB field of user-entered 2-letter states, separated by a comma.
>>
>>
>>states = "CA, OR, TX";
>>
>>or it could be
>>
>>states = " CA,OR,TX, HI ";
>>
>>in other words, spacing is inconsistent, but case seems to be OK.
>>
>>I need to reliably and FAST! answer:
>>
>>is state = "CA" in states
>>
>>Any ideas?
>>
>>Greg
>>
>>----------------------------------------------
>>
>>P.S. this is what we got so far:
>>
>>1) run a perl script on DB that will remove the extra spacing
>>2) add the flanking commas to both:
>>
>>state = ",CA,";
>>states = ",CA,OR,TX,HI,";
>>
>>3) states.indexOf(state) != -1
>>   
>>
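
The three steps in the P.S. above can be sketched in Java as follows. This is an illustration only: the class name StateMatcher and the methods normalize and contains are hypothetical, and it assumes the whitespace cleanup is done in code rather than by a perl script on the DB.

```java
// Hypothetical helper class, not part of any library.
class StateMatcher {

    // Steps 1 and 2 for the list: drop all whitespace and add flanking commas,
    // so " CA,OR,TX, HI " becomes ",CA,OR,TX,HI,".
    static String normalize(String states) {
        StringBuilder sb = new StringBuilder(states.length() + 2);
        sb.append(',');
        for (int i = 0; i < states.length(); i++) {
            char c = states.charAt(i);
            if (!Character.isWhitespace(c)) {
                sb.append(c);
            }
        }
        sb.append(',');
        return sb.toString();
    }

    // Step 2 for the single state plus step 3: wrap it in commas and search.
    // The flanking commas prevent partial matches such as "A" inside "CA".
    static boolean contains(String normalizedStates, String state) {
        return normalizedStates.indexOf("," + state + ",") != -1;
    }

    public static void main(String[] args) {
        String states = normalize(" CA,OR,TX, HI ");
        System.out.println(states);
        System.out.println(contains(states, "CA"));
        System.out.println(contains(states, "A"));
    }
}
```

If the DB values were cleaned once and stored in the normalized form, only the contains step would remain per lookup.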



---
You are currently subscribed to jdjlist as: [EMAIL PROTECTED]
To unsubscribe send a blank email to [EMAIL PROTECTED]
http://www.sys-con.com/fusetalk
