This is not true. Memory mapping does not load the index into RAM, so you don't need that much physical memory. Paging happens only between the index files and RAM; that is what memory mapping is about.
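To make that concrete, a minimal sketch of what this looks like at the Lucene API level, assuming a hypothetical index path; the mapped index lives in virtual address space and the OS page cache decides which parts occupy physical RAM:

import java.nio.file.Paths;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.MMapDirectory;

public class MMapOpenSketch {
  public static void main(String[] args) throws Exception {
    // The index files are mapped into virtual address space; the OS pages
    // data in and out on demand, so physical RAM never has to hold the
    // whole multi-terabyte index at once.
    try (MMapDirectory dir = new MMapDirectory(Paths.get("/path/to/index"));
         DirectoryReader reader = DirectoryReader.open(dir)) {
      System.out.println("maxDoc=" + reader.maxDoc());
    }
  }
}

On a 64-bit JVM, FSDirectory.open() already picks MMapDirectory by default; the explicit constructor above is only there to make the point, and the path is made up.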
Please read the blog post: https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Uwe

Am March 15, 2021 7:43:29 PM UTC schrieb Rahul Goswami <[email protected]>:

>Mike,
>Yes I am using a 64 bit JVM on Windows. I haven't tried reproducing the issue on Linux yet. In the past we have had problems with mmap on Windows with the machine freezing. The rationale I gave to myself is the amount of disk and CPU activity for paging in and out must be intense for the OS while trying to map an index that large into 64 GB of heap. Also since it's an on-premise deployment, we can't expect the customers of the product to provide nodes with > 400 GB RAM which is what *I think* would be required to get a decent performance with mmap. Hence we had to switch to SimpleFSDirectory.
>
>As for the fsync behavior, you are right. I tried with NRTCachingDirectoryFactory as well which defaults to using mmap underneath and still makes fsync calls for already existing index files.
>
>Thanks,
>Rahul
>
>On Mon, Mar 15, 2021 at 3:15 PM Michael McCandless <[email protected]> wrote:
>
>> Thanks Rahul.
>>
>> > primary reason being that memory mapping multi-terabyte indexes is not feasible through mmap
>>
>> Hmm, that is interesting -- are you using a 64 bit JVM? If so, what goes wrong with such large maps? Lucene's MMapDirectory should chunk the mapping to deal with ByteBuffer int only address space.
>>
>> SimpleFSDirectory usually has substantially worse performance than MMapDirectory.
>>
>> Still, I suspect you would hit the same issue if you used other FSDirectory implementations -- the fsync behavior should be the same.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Fri, Mar 12, 2021 at 1:46 PM Rahul Goswami <[email protected]> wrote:
>>
>>> Thanks Michael. For your question...yes I am running Solr on Windows and running it with SimpleFSDirectoryFactory (primary reason being that memory mapping multi-terabyte indexes is not feasible through mmap). I will create a Jira later today with the details in this thread and assign it to myself. Will take a shot at the fix.
>>>
>>> Thanks,
>>> Rahul
>>>
>>> On Fri, Mar 12, 2021 at 10:00 AM Michael McCandless <[email protected]> wrote:
>>>
>>>> I think long ago we used to track which files were actually dirty (we had written bytes to) and only fsync those ones. But something went wrong with that, and at some point we "simplified" this logic, I think on the assumption that asking the OS to fsync a file that does in fact exist yet indeed has not changed would be harmless? But somehow it is not in your case? Are you on Windows?
>>>>
>>>> I tried to do a bit of digital archaeology and remember what happened here, and I came across this relevant looking issue: https://issues.apache.org/jira/browse/LUCENE-2328. That issue moved tracking of which files have been written but not yet fsync'd down from IndexWriter into FSDirectory.
>>>>
>>>> But there was another change that then removed staleFiles from FSDirectory entirely.... still trying to find that. Aha, found it! https://issues.apache.org/jira/browse/LUCENE-6150. Phew Uwe was really quite upset in that issue ;)
>>>>
>>>> I also came across this delightful related issue, showing how a massive hurricane (Irene) can lead to finding and fixing a bug in Lucene!
>>>> https://issues.apache.org/jira/browse/LUCENE-3418
>>>>
>>>> > The assumption is that while the commit point is saved, no changes happen to the segment files in the saved generation.
>>>>
>>>> This assumption should really be true. Lucene writes the files, append only, once, and then never changes them, once they are closed. Pulling a commit point from Solr should further ensure that, even as indexing continues and new segments are written, the old segments referenced in that commit point will not be deleted. But apparently this "harmless fsync" Lucene is doing is not so harmless in your use case. Maybe open an issue and pull out the details from this discussion onto it?
>>>>
>>>> Mike McCandless
>>>>
>>>> http://blog.mikemccandless.com
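As a reference point for the backup discussion, a hedged Lucene-level sketch (not the actual Solr /replication code path; the path, class name, and writer setup are made up for the example) of holding a commit point with SnapshotDeletionPolicy while the files it references are copied:

import java.nio.file.Paths;

import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
import org.apache.lucene.index.SnapshotDeletionPolicy;
import org.apache.lucene.store.FSDirectory;

public class SnapshotBackupSketch {
  public static void main(String[] args) throws Exception {
    SnapshotDeletionPolicy snapshotter =
        new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
    IndexWriterConfig iwc = new IndexWriterConfig().setIndexDeletionPolicy(snapshotter);
    try (FSDirectory dir = FSDirectory.open(Paths.get("/path/to/index"));
         IndexWriter writer = new IndexWriter(dir, iwc)) {
      // Assumes the index already has at least one commit.
      IndexCommit commit = snapshotter.snapshot();
      try {
        // While the snapshot is held, none of these files will be deleted,
        // even if indexing and new commits continue in parallel.
        for (String file : commit.getFileNames()) {
          System.out.println("would back up: " + file);
        }
      } finally {
        snapshotter.release(commit); // commit point becomes deletable again
        writer.deleteUnusedFiles();
      }
    }
  }
}

While the snapshot is held, the files listed by IndexCommit.getFileNames() stay on disk, which is the same guarantee the thread relies on when saving a commit point through Solr.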
>>>> On Fri, Mar 12, 2021 at 9:03 AM Michael Sokolov <[email protected]> wrote:
>>>>
>>>>> Also - I should have said - I think the first step here is to write a focused unit test that demonstrates the existence of the extra fsyncs that we want to eliminate. It would be awesome if you were able to create such a thing.
>>>>>
>>>>> On Fri, Mar 12, 2021 at 9:00 AM Michael Sokolov <[email protected]> wrote:
>>>>> >
>>>>> > Yes, please go ahead and open an issue. TBH I'm not sure why this is happening - there may be a good reason?? But let's explore it using an issue, thanks.
>>>>> >
>>>>> > On Fri, Mar 12, 2021 at 12:16 AM Rahul Goswami <[email protected]> wrote:
>>>>> > >
>>>>> > > I can create a Jira and assign it to myself if that's ok (?). I think this can help improve commit performance.
>>>>> > > Also, to answer your question, we have indexes sometimes going into multiple terabytes. Using the replication handler for backup would mean requiring a disk capacity more than 2x the index size on the machine at all times, which might not be feasible. So we directly back the index up from the Solr node to a remote repository.
>>>>> > >
>>>>> > > Thanks,
>>>>> > > Rahul
>>>>> > >
>>>>> > > On Thu, Mar 11, 2021 at 4:09 PM Michael Sokolov <[email protected]> wrote:
>>>>> > >>
>>>>> > >> Well, it certainly doesn't seem necessary to fsync files that are unchanged and have already been fsync'ed. Maybe there's an opportunity to improve it? On the other hand, support for external processes reading Lucene index files isn't likely to become a feature of Lucene. You might want to consider using Solr replication to power your backup?
>>>>> > >>
>>>>> > >> On Thu, Mar 11, 2021 at 2:52 PM Rahul Goswami <[email protected]> wrote:
>>>>> > >> >
>>>>> > >> > Thanks Michael. I thought since this discussion is closer to the code than most discussions on the solr-users list, it seemed like a more appropriate forum. Will be mindful going forward.
>>>>> > >> > On your point about new segments, I attached a debugger and tried to do a new commit (just pure Solr commit, no backup process running), and the code indeed does fsync on a pre-existing segment file. Hence I was a bit baffled since it challenged my fundamental understanding that segment files once written are immutable, no matter what (unless picked up for a merge of course). Hence I thought of reaching out, in case there are scenarios where this might happen which I might be unaware of.
>>>>> > >> >
>>>>> > >> > Thanks,
>>>>> > >> > Rahul
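On the focused unit test idea above: a rough sketch, with invented names, of a FilterDirectory wrapper that records every file name IndexWriter asks to sync. A test could commit once, clear the list, commit again without touching the old segments, and then assert that none of the previously synced files reappear:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FilterDirectory;

// Wraps any Directory and remembers every file name passed to sync().
public class SyncRecordingDirectory extends FilterDirectory {

  private final List<String> syncedFiles = new ArrayList<>();

  public SyncRecordingDirectory(Directory in) {
    super(in);
  }

  @Override
  public void sync(Collection<String> names) throws IOException {
    syncedFiles.addAll(names); // record what IndexWriter wants fsync'd
    super.sync(names);
  }

  public List<String> getSyncedFiles() {
    return syncedFiles;
  }
}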
>>>>> > >> > On Thu, Mar 11, 2021 at 2:38 PM Michael Sokolov <[email protected]> wrote:
>>>>> > >> >>
>>>>> > >> >> This isn't a support forum; solr-users@ might be more appropriate. On that list someone might have a better idea about how the replication handler gets its list of files. This would be a good list to try if you wanted to propose a fix for the problem you're having. But since you're here -- it looks to me as if IndexWriter indeed syncs all "new" files in the current segments being committed; look in IndexWriter.startCommit and SegmentInfos.files. Caveat: (1) I'm looking at this code for the first time, and (2) things may have been different in 7.7.2? Sorry I don't know for sure, but are you sure that your backup process is not attempting to copy one of the new files?
>>>>> > >> >>
>>>>> > >> >> On Thu, Mar 11, 2021 at 1:35 PM Rahul Goswami <[email protected]> wrote:
>>>>> > >> >> >
>>>>> > >> >> > Hello,
>>>>> > >> >> > Just wanted to follow up one more time to see if this is the right forum for my question? Or is it more suitable for some other mailing list?
>>>>> > >> >> >
>>>>> > >> >> > Best,
>>>>> > >> >> > Rahul
>>>>> > >> >> >
>>>>> > >> >> > On Sat, Mar 6, 2021 at 3:57 PM Rahul Goswami <[email protected]> wrote:
>>>>> > >> >> >>
>>>>> > >> >> >> Hello everyone,
>>>>> > >> >> >> Following up on my question in case anyone has any idea. Why it's important to know this is because I am thinking of allowing the backup process to not hold any lock on the index files, which should allow the fsync during parallel commits. BUT, in case doing an fsync on existing segment files in a saved commit point DOES have an effect, it might render the backed-up index in a corrupt state.
>>>>> > >> >> >>
>>>>> > >> >> >> Thanks,
>>>>> > >> >> >> Rahul
>>>>> > >> >> >>
>>>>> > >> >> >> On Fri, Mar 5, 2021 at 3:04 PM Rahul Goswami <[email protected]> wrote:
>>>>> > >> >> >>>
>>>>> > >> >> >>> Hello,
>>>>> > >> >> >>> We have a process which backs up the index (Solr 7.7.2) on a schedule. The way we do it is we first save a commit point on the index and then, using Solr's /replication handler, get the list of files in that generation. After the backup completes, we release the commit point. (Please note that this is a separate backup process outside of Solr and not the backup command of the /replication handler.)
>>>>> > >> >> >>> The assumption is that while the commit point is saved, no changes happen to the segment files in the saved generation.
>>>>> > >> >> >>>
>>>>> > >> >> >>> Now the issue... The backup process opens the index files in a shared READ mode, preventing writes. This is causing any parallel commits to fail, as they complain about the index files being locked by another process (the backup process). Upon debugging, I see that fsync is being called during commit on already existing segment files, which is not expected. So my question is: is there any reason for Lucene to call fsync on already existing segment files?
>>>>> > >> >> >>>
>>>>> > >> >> >>> The line of code I am referring to is as below:
>>>>> > >> >> >>>
>>>>> > >> >> >>> try (final FileChannel file = FileChannel.open(fileToSync, isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE))
>>>>> > >> >> >>>
>>>>> > >> >> >>> in method fsync(Path fileToSync, boolean isDir) of the class file lucene\core\src\java\org\apache\lucene\util\IOUtils.java
>>>>> > >> >> >>>
>>>>> > >> >> >>> Thanks,
>>>>> > >> >> >>> Rahul
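For context, the method quoted above amounts to roughly the following (a simplified sketch from memory, not a verbatim copy of the Lucene source). On Windows, it is the FileChannel.open with WRITE access that fails when another process holds the same file open without sharing write access:

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FsyncSketch {
  // Directories can only be opened for READ; regular files need WRITE so
  // that force() is allowed. This open() is what fails if another process
  // holds the file open without sharing write access.
  static void fsync(Path fileToSync, boolean isDir) throws IOException {
    try (FileChannel file = FileChannel.open(
        fileToSync, isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE)) {
      // Flush the OS buffers for this file (or directory) to stable storage,
      // whether or not anything was written to it in this session.
      file.force(true);
    } catch (IOException ioe) {
      if (isDir) {
        // Some platforms (e.g. Windows) cannot fsync a directory; ignore.
        return;
      }
      throw ioe;
    }
  }
}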
--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de
