Yikes! That's way too many files. Have you changed mergeFactor? Or implemented a custom DeletionPolicy or MergePolicy?

Or... does anyone know of something else in Solr's configuration that could lead to such an insane number of files?

Mike

Uwe Klosa wrote:

There are around 35,000 files in the index. When I started indexing 5 weeks ago with only 2,000 documents I did not have this issue. I first saw it with around 10,000 documents.

Before that I had been using the same instance on a Linux machine with up to 17,000 documents and I didn't see this issue at all. The original plan has always been to use Solr on Linux, but I'm still waiting for the new server.

Uwe

On Sat, Oct 4, 2008 at 12:06 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:


Hmm, OK, that seems like a possible explanation then. Still, it's spooky that it's taking 5 minutes. How many files are in the index at the time you call commit?
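One way to answer that question is to count the regular files in the index directory just before committing. The following is a minimal sketch in Python (not Solr or Lucene code); the function name `count_index_files` is made up for illustration, and it assumes the index files sit directly in a single directory.

```python
import os


def count_index_files(index_dir):
    """Return the number of regular files directly inside index_dir.

    Subdirectories are not descended into and are not counted.
    """
    with os.scandir(index_dir) as entries:
        return sum(1 for entry in entries if entry.is_file())
```

Calling `count_index_files` on the index directory right before each commit would show whether the file count grows along with the commit time.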

I wonder if you were to simply pause for say 30 seconds, before issuing the commit, whether you'd then see the commit go faster? On Windows at least such a silly trick does seem to improve performance, I think because it allows the OS to move the bytes from its write cache onto stable storage "on its own schedule" whereas when we commit we are demanding the OS move the bytes on our [arbitrary] schedule.
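The "pause, then commit" experiment can be sketched outside Solr as follows. This is only an illustration in Python under stated assumptions: `sync_after_pause` is a hypothetical helper, the files are assumed to have been written already, and `os.fsync` stands in for whatever the commit does internally. It sleeps first, then times only the sync phase, so runs with and without a pause can be compared.

```python
import os
import time


def sync_after_pause(paths, pause_seconds=30.0):
    """Sleep for pause_seconds, then fsync each path.

    Returns the seconds spent in the fsync phase only (the pause is
    excluded), so the result is comparable across different pauses.
    """
    # Give the OS time to flush its write cache on its own schedule.
    time.sleep(pause_seconds)

    start = time.perf_counter()
    for path in paths:
        # Open read-write: some platforms refuse to sync a read-only handle.
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
    return time.perf_counter() - start
```

Comparing `sync_after_pause(paths, 0)` against `sync_after_pause(paths, 30)` on freshly written files would show whether the pause helps on a given OS and filesystem.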

I really wish OSs would add an API that would just block & return once the file has made it to stable storage (letting the OS sync on its own optimal schedule), rather than demanding the file be fsync'd immediately.

I really haven't explored the performance of fsync on different filesystems. I think I've read that ReiserFS may have issues, though it could have been addressed by now. I *believe* ext3 is OK (at least, it didn't show the strange "sleep to get better performance" issue above, in my limited testing).

Mike


Uwe Klosa wrote:

Thanks Mike

The use of fsync() might be the answer to my problem, because, for lack of other options, I have installed Solr in a zone on Solaris with ZFS, which slows down when many fsync() calls are made. This will be fixed in an upcoming release of Solaris, but I will move the Solr instances to another server with a different file system as soon as possible. Would using a file system other than ext3 boost performance?

Uwe

On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:


Yonik Seeley wrote:

On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:


I have a big problem with one of my Solr instances. A commit can take up to 5 minutes. This time does not depend on the number of documents which are updated. The difference for 1 or 100 updated documents is only a few seconds.


Since Solr's commit logic really hasn't changed, I wonder if this could be Lucene-related somehow.


Lucene's commit logic has changed: we now fsync() each file in the index to ensure all bytes are on stable storage, before returning.

But I can't imagine that taking 5 minutes, unless there are somehow a great many files added to the index?
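To get a feel for how the cost of syncing every file grows with the number of files, here is a rough sketch in Python. It is not Lucene's implementation; `fsync_all` is a hypothetical helper that simply calls `os.fsync` on each regular file in a directory and reports how long that took.

```python
import os
import time


def fsync_all(index_dir):
    """fsync every regular file directly inside index_dir.

    Returns a (file_count, elapsed_seconds) tuple.
    """
    start = time.perf_counter()
    count = 0
    with os.scandir(index_dir) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            # Open read-write: some platforms refuse to sync a read-only handle.
            fd = os.open(entry.path, os.O_RDWR)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            count += 1
    return count, time.perf_counter() - start
```

Running this against a copy of the index on the filesystem in question gives a per-file sync cost; on a filesystem where each fsync() is slow, that cost multiplied by a large file count could plausibly add up to minutes.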

Uwe, what filesystem are you using?

Yonik, when Solr commits what does it actually do?

Mike



