It might be worth looking at Jakarta Lucene
(http://jakarta.apache.org/lucene)

It's a full-text search engine, so there's no point reinventing the wheel.

(I've never used it; I'm just about to start with it for my new ultra-secret project.)

Trenton D. Adams wrote:

>Yep, Perl is better for big text searches as far as the programmer is concerned, but 
>it's not really all that fast.  After all, regular expressions are 
>integrated right into the language: no function calls, and much easier to work with.
>
>
>I think it's primarily due to the precompiled nature of Java bytecode.  It's easier 
>to translate bytecode into native machine instructions than it 
>is to translate plain source text, such as Perl, into native machine instructions.
>
>Nope, no database backend.  I used a plain text file.  It has one verse per line.  
>Example below.
>Genesis;01;001;001; In the beginning God created the heaven and the earth.
>
>I am thinking of switching to a database backend though.
>
>I'm using the regular expression package in JDK 1.4.0.  I also have a split routine 
>that I wrote myself for splitting book;booknum;chapter;verse;versetext into 
>its separate parts.  The new String.split() isn't fast enough because it carries the 
>overhead of handling arbitrary regular expressions.  In fact, I reduced one search from about 
>1300 ms to less than 800 ms by switching to my own split.  The improvement is that big 
>because the split runs inside a loop.  I could do extra programming to avoid 
>splitting on every pass through the loop, but I figured it's not worth it.
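
For anyone curious what such a hand-rolled split might look like, here is a minimal sketch. It is not Trenton's actual routine, which wasn't posted; the class name VerseSplitter and the method splitFields are made up for illustration. It scans for a single delimiter character with indexOf instead of going through the regex engine, and it uses generics, so it assumes a newer JDK than the 1.4.0 mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the routine from the original post.
class VerseSplitter {

    /**
     * Splits a line on a single delimiter character without using regex.
     * Returns at most maxFields parts (maxFields is assumed to be >= 1);
     * the last part keeps any further delimiters, so semicolons inside
     * the verse text survive.
     */
    static String[] splitFields(String line, char delim, int maxFields) {
        List<String> parts = new ArrayList<>(maxFields);
        int start = 0;
        while (parts.size() < maxFields - 1) {
            int end = line.indexOf(delim, start);
            if (end < 0) {
                break; // fewer delimiters than expected
            }
            parts.add(line.substring(start, end));
            start = end + 1;
        }
        parts.add(line.substring(start)); // remainder of the line
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Sample line taken from the message above.
        String line = "Genesis;01;001;001; In the beginning God created the heaven and the earth.";
        String[] f = splitFields(line, ';', 5);
        // f = book, booknum, chapter, verse, versetext
        System.out.println(f[0] + " " + Integer.parseInt(f[2]) + ":" + Integer.parseInt(f[3]) + f[4]);
    }
}
```

Capping the field count at 5 is a design choice in this sketch: it stops scanning after the fourth semicolon, so the verse text is never split even when it contains semicolons of its own.
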
>
>It's mostly JSP for the HTML part, then I have a bibleSearch method with a simple 
>bean.
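
The bean itself wasn't posted, so the following is only a guess at its general shape: a small class holding the verse lines, with a bibleSearch method that compiles the pattern once and reuses one Matcher across the loop. The class name BibleSearchBean, the constructor, and the verseText helper are assumptions for illustration; only the method name bibleSearch and the line format come from the message above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical bean, sketched from the description in the thread.
class BibleSearchBean {
    private final List<String> lines;

    BibleSearchBean(List<String> lines) {
        this.lines = lines;
    }

    /** Returns the text after the fourth ';', or the whole line if malformed. */
    static String verseText(String line) {
        int pos = -1;
        for (int i = 0; i < 4; i++) {
            pos = line.indexOf(';', pos + 1);
            if (pos < 0) {
                return line;
            }
        }
        return line.substring(pos + 1);
    }

    /** Returns every line whose verse text matches the regex, ignoring case. */
    List<String> bibleSearch(String regex) {
        // Compile once, outside the loop, and reuse the matcher per line.
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher("");
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (matcher.reset(verseText(line)).find()) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        BibleSearchBean bean = new BibleSearchBean(Arrays.asList(
            "Genesis;01;001;001; In the beginning God created the heaven and the earth."));
        for (String hit : bean.bibleSearch("heaven")) {
            System.out.println(hit);
        }
    }
}
```

A JSP page would then only need to call bibleSearch and format the returned lines as HTML.
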
>
>
>
>Who knows, maybe it's not actually faster!  Maybe יהוה has made it faster without 
>my knowledge.  LOL
>
>-----Original Message-----
>From: August Detlefsen [mailto:[EMAIL PROTECTED]] 
>Sent: June 14, 2002 5:30 PM
>To: Tomcat Users List
>Subject: RE: Tomcat Performance is good!
>
>
>That is interesting. I always thought perl was better for big text searches...
>
>Did you give it a database backend? 
>What API are you using for the search? 
>Or is your performance improvement strictly due to the precompiled always-on nature 
>of servlets (as opposed to interpreted perl)?
>
>-August
>
>--- Sébastien_Dui <[EMAIL PROTECTED]> wrote:
>  
>
>>Let's hope that it will also work for applications without acts of 
>>god...
>>
>>
>>
>>-----Original Message-----
>>From: Trenton D. Adams [mailto:[EMAIL PROTECTED]]
>>Sent: Sat. June 15, 2002 0:06
>>To: 'Tomcat Users List'
>>Subject: Tomcat Performance is good!
>>
>>
>>I've recently been doing some fiddling with Tomcat.  I've built a King
>>James Bible search program.  I based it on my brother's Perl Bible
>>search program.  It outputs HTML identical to the Perl version.  It
>>was already outperforming it by a couple of seconds on one particular
>>search.  The Perl search took about 6 seconds and the search in the
>>J2EE version on Tomcat took approx. 4 seconds.  With some code efficiency
>>improvements, I've gotten that down to less than 3 seconds.
>>
>>Just thought you might all like to hear some good news about Tomcat.
>>
>>
>>Tomcat 4.0.3
>>Apache 1.3.20
>>Perl 5.6.0
>>
>>
>>--
>>To unsubscribe, e-mail: 
>><mailto:[EMAIL PROTECTED]>
>>For additional commands, e-mail: 
>><mailto:[EMAIL PROTECTED]>
>>
>
>
>



