Doron and Erick,

I found the problem that was slowing down indexing: it is our NFS file system.

Thanks for your help.
Tony

From: "Tony Qian" <[EMAIL PROTECTED]>
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: Index performance
Date: Mon, 16 Apr 2007 14:37:46 +0000


Doron,

I'll try that and let you know the result.

Thanks for the suggestions.
Tony

From: Doron Cohen <[EMAIL PROTECTED]>
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: Index performance
Date: Thu, 12 Apr 2007 13:40:07 -0700

To cover all possible non-indexing overhead, it is better to measure with
something like this:

   static long indexContents(IndexWriter writer, List storyContentList)
     throws IOException {
     // Dry run: touch every field that would be indexed, but add nothing
     // to the writer, so only the non-indexing overhead is measured.
     long res = 0;
     if (storyContentList != null && storyContentList.size() != 0) {
         try {
             Iterator itr = storyContentList.iterator();
             while (itr.hasNext()){
                 StoryContents content = (StoryContents) itr.next();
                 res += content.getStoryText().length();
                 res += String.valueOf(content.getStoryIdentity()).length();
                 res += String.valueOf(content.getHeadline1()).length();
             }
         }catch(Exception ex){
              System.out.println(" caught a " + ex.getClass() );
         }
     }
     // Return outside the if so that every path returns a value.
     return res;
   }
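
A minimal sketch of how such a dry run could be timed on its own, without
Lucene on the classpath. The StoryContents class here is a hypothetical
stand-in (the real one was not posted), the long type for the story identity
is an assumption, and the 90 documents of 1400 characters only mirror the
sizes mentioned in this thread:

```java
import java.util.ArrayList;
import java.util.List;

public class DryRunTimer {

    // Hypothetical stand-in for the poster's StoryContents class;
    // only the three getters used in the thread are modeled.
    static class StoryContents {
        private final String storyText;
        private final long storyIdentity; // assumed type
        private final String headline1;

        StoryContents(String storyText, long storyIdentity, String headline1) {
            this.storyText = storyText;
            this.storyIdentity = storyIdentity;
            this.headline1 = headline1;
        }

        String getStoryText() { return storyText; }
        long getStoryIdentity() { return storyIdentity; }
        String getHeadline1() { return headline1; }
    }

    // Same traversal as the dry run: read each field, index nothing.
    static long dryRun(List<StoryContents> stories) {
        long res = 0;
        if (stories == null) {
            return res;
        }
        for (StoryContents content : stories) {
            res += content.getStoryText().length();
            res += String.valueOf(content.getStoryIdentity()).length();
            res += String.valueOf(content.getHeadline1()).length();
        }
        return res;
    }

    public static void main(String[] args) {
        // Build 90 synthetic documents of 1400 characters each.
        String text = new String(new char[1400]).replace('\0', 'x');
        List<StoryContents> stories = new ArrayList<StoryContents>();
        for (int i = 0; i < 90; i++) {
            stories.add(new StoryContents(text, i, "headline " + i));
        }

        long start = System.nanoTime();
        long chars = dryRun(stories);
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        System.out.println(chars + " chars read in " + elapsedMs + " ms");
    }
}
```

If the same timing wrapped around the real indexing call takes seconds while
this pass takes milliseconds, the time is going into indexing or the storage
underneath it rather than into fetching the documents.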

Doron Cohen/Haifa/[EMAIL PROTECTED] wrote on 12/04/2007 13:26:34:

> > I tried to index it. It took from 7-10 seconds to index about 90
> > documents.
>
> That would be around 10 documents per second - way too slow. A Lucene
> perf test adding 12,000 docs sized similar to your sample doc (1400
> characters) on a not so strong machine shows a much faster pace - 146 docs
> per second, or 237 with a larger max-buffered-docs setting:
>
>  Operation     round maxBuf  runCnt  recsPerRun  rec/s  elapsedSec
>  AddDocs_12000     0     10       1       12000  146.3       82.04
>  AddDocs_12000     1   1000       1       12000  237.8       50.45
>
> As Otis suggested, a larger max-buffered-docs setting speeds things up.
> But even without that, the pace is about 14 times faster than your numbers.
>
> It might be interesting to measure without indexing at all, i.e. modify
> the method indexContents() to something like:
>
>   static int indexContents(IndexWriter writer, List storyContentList)
>     throws IOException {
>     int res = 0;
>     if (storyContentList != null && storyContentList.size() != 0) {
>         try {
>             Iterator itr = storyContentList.iterator();
>             while (itr.hasNext()){
>                 StoryContents content = (StoryContents) itr.next();
>                 res += content.length();
>             }
>         }catch(Exception ex){
>              System.out.println(" caught a " + ex.getClass() );
>         }
>     }
>     return res;
>   }
>
> This should show if there is other overhead involved, unrelated to
> indexing.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
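
For reference, the rates quoted above are simply documents divided by elapsed
seconds. A small sketch of that arithmetic, using only the figures that
appear in this thread (the class and method names are made up for
illustration):

```java
public class Throughput {

    // Documents per second for a run of `docs` documents in `elapsedSec` seconds.
    static double docsPerSecond(long docs, double elapsedSec) {
        return docs / elapsedSec;
    }

    public static void main(String[] args) {
        // Benchmark rows from the table: 12,000 docs in 82.04 s and 50.45 s.
        System.out.printf("maxBuf=10:   %.1f docs/s%n", docsPerSecond(12000, 82.04));
        System.out.printf("maxBuf=1000: %.1f docs/s%n", docsPerSecond(12000, 50.45));

        // Reported case: about 90 docs in 7 to 10 seconds.
        System.out.printf("reported:    %.1f to %.1f docs/s%n",
                docsPerSecond(90, 10.0), docsPerSecond(90, 7.0));
    }
}
```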

