Hi Otis:

   Thanks for your reply.

   The problem is the setting of maxBufferedDocs (see the code snippet from
my previous email). I wouldn't think it would affect things but it does.

-John

On 5/30/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

John,
I can't spot anything wrong in the code.  I assume you are certain there
are no IOException, no issues with locks, and that your writer is really
getting close()d.  I haven't used IndexModifier, and since that's a layer on
top of the real IndexWriter with the same API exposed, I would just use
IndexWriter directly and see if that makes a difference.  If it does, then
maybe there is a bug in IndexModifier.

Otis

----- Original Message ----
From: John Wang <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, May 30, 2006 5:45:39 AM
Subject: merge factor and real time indexing

Hi folks:

    I am working on an application that requires real time indexing, e.g.
for every insert, I open the writer, add a document and then closes the
writer.

    I want to control the number of files created, and according to the
documentation, a small mergeFactor is desired. However, I am experiencing
the opposite, see the following code segment:

public static void main(String[] args) throws IOException{
        int mfactor=10;
        int mbuffer=1000;


        IndexModifier writer=null;
        File dir=new File("/tmp/john/");

        long start=System.currentTimeMillis();
        for (int i=0;i<5000;++i){
            try{
                boolean create=!IndexReader.indexExists(dir);
                writer=new IndexModifier(dir,new
StandardAnalyzer(),create);
                writer.setMergeFactor(mfactor);
                writer.setMaxBufferedDocs(mbuffer);
                    Document doc=new Document();
                    doc.add(new Field("test","this is a test doc",
Field.Store.YES,Field.Index.TOKENIZED,Field.TermVector.YES));
                    writer.addDocument(doc);
            }
            finally{
                if (writer!=null){
                    writer.close();
                }
            }

        }
        long end=System.currentTimeMillis();

        System.out.println("took: "+(end-start));

    }

If I set the mfactor value to a high number, e.g. 1000, indexing takes
much
longer but the number of files decreases dramatically.

Is this expected or are there any better ways of tuning the indexing
parameters so that I limit the number of open files while gettting a
decent
indexing speed?

Thanks

-John




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Reply via email to