To cover all possible non-indexing overhead, better measure with something like this:
static long indexContents(IndexWriter writer, List storyContentList) throws IOException { long res = 0; if (storyContentList != null && storyContentList.size() != 0) { try { Iterator itr = storyContentList.iterator(); while (itr.hasNext()){ StoryContents content = (StoryContents) itr.next(); res += content.getStoryText().length(); res += String.valueOf(content.getStoryIdentity()).length(); res += String.valueOf(content.getHeadline1()).length(); } }catch(Exception ex){ System.out.println(" caught a " + ex.getClass() ); } return res; } } Doron Cohen/Haifa/[EMAIL PROTECTED] wrote on 12/04/2007 13:26:34: > > I tried to index it. It took from 7-10 seconds to index about 90 > documents. > > That would be around 10 documents per second - way too slow. A Lucene's > perf test adding 12,000 docs sized similar to your sample doc (1400 > characters) on a not so strong machine shows much faster pace - 146 docs > per second, or 237 with larger mix-beffered-docs setting: > > Operation round maxBuf runCnt recsPerRun rec/s elapsedSec > AddDocs_12000 0 10 1 12000 146.3 82.04 > AddDocs_12000 1 1000 1 12000 237.8 50.45 > > As Otis suggested, a larger max-buffered docs speeds things up. > But even without that, pace is 14 times faster than your numbers. > > It might be interesting to measure without indexing at all, i.e. modify the > method indexContents() to something like: > > static int indexContents(IndexWriter writer, List storyContentList) > throws IOException { > int res = 0; > if (storyContentList != null && storyContentList.size() != 0) { > try { > Iterator itr = storyContentList.iterator(); > while (itr.hasNext()){ > StoryContents content = (StoryContents) itr.next(); > res += content.length(); > } > }catch(Exception ex){ > System.out.println(" caught a " + ex.getClass() ); > } > } > return res; > } > > This should show if there is other overhead involved, unrelated to > indexing. > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]