Thanks Michael. Since this discussion is closer to the code than most
discussions on the solr-users list, I thought this list would be the more
appropriate forum. I will be mindful going forward.
On your point about new segments, I attached a debugger and tried to do a
new commit (just pure Solr commit, no backup process running), and the code
indeed does fsync on a pre-existing segment file. I was a bit baffled,
since it challenged my fundamental understanding that segment files, once
written, are immutable no matter what (unless picked up for a merge, of
course). Hence I thought of reaching out, in case there are scenarios I am
unaware of where this might happen.
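
To make the observation concrete, here is a minimal standalone sketch (plain
JDK, no Lucene) of the open-then-force pattern from the line quoted further
down in this thread. The class name FsyncSketch and the helper
unchangedAfterFsync are made up for illustration; only the READ/WRITE choice
is taken from that quoted line. It opens an already existing file for WRITE,
forces it to disk, and checks that the bytes are the same afterwards.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical demo class; not part of Lucene or Solr.
public class FsyncSketch {

    // Mirrors the open-mode choice in the quoted line:
    // directories are opened for READ, regular files for WRITE.
    static void fsync(Path fileToSync, boolean isDir) throws IOException {
        try (FileChannel file = FileChannel.open(fileToSync,
                isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE)) {
            // Flush file content and metadata to the storage device.
            file.force(true);
        }
    }

    // Reads the file, fsyncs it, reads it again, and compares the bytes.
    static boolean unchangedAfterFsync(Path existing) throws IOException {
        byte[] before = Files.readAllBytes(existing);
        fsync(existing, false);
        byte[] after = Files.readAllBytes(existing);
        return Arrays.equals(before, after);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a pre-existing segment file (assumption: any regular file).
        Path segment = Files.createTempFile("segment", ".bin");
        try {
            Files.write(segment, new byte[] {1, 2, 3, 4});
            System.out.println("unchanged after fsync: " + unchangedAfterFsync(segment));
        } finally {
            Files.deleteIfExists(segment);
        }
    }
}
```

In this sketch the WRITE open is only there to obtain a channel that can be
forced; nothing is written through it. That would explain why a commit needs
write access to a file it does not modify, but it says nothing about why
Lucene picks a pre-existing file for syncing in the first place.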

Thanks,
Rahul

On Thu, Mar 11, 2021 at 2:38 PM Michael Sokolov <[email protected]> wrote:

> This isn't a support forum; solr-users@ might be more appropriate. On
> that list someone might have a better idea about how the replication
> handler gets its list of files. This would be a good list to try if
> you wanted to propose a fix for the problem you're having. But since
> you're here -- it looks to me as if IndexWriter indeed syncs all "new"
> files in the current segments being committed; look in
> IndexWriter.startCommit and SegmentInfos.files. Caveats: (1) I'm
> looking at this code for the first time, and (2) things may have been
> different in 7.7.2. Sorry I don't know for sure, but are you sure that
> your backup process is not attempting to copy one of the new files?
>
> On Thu, Mar 11, 2021 at 1:35 PM Rahul Goswami <[email protected]>
> wrote:
> >
> > Hello,
> > Just wanted to follow up one more time to see if this is the right forum
> > for my question? Or is this suitable for some other mailing list?
> >
> > Best,
> > Rahul
> >
> > On Sat, Mar 6, 2021 at 3:57 PM Rahul Goswami <[email protected]>
> wrote:
> >>
> >> Hello everyone,
> >> Following up on my question in case anyone has any idea. This matters
> >> because I am thinking of allowing the backup process to not hold any lock
> >> on the index files, which should allow the fsync during parallel commits.
> >> BUT, if an fsync on existing segment files in a saved commit point DOES
> >> have an effect, it might leave the backed-up index in a corrupt state.
> >>
> >> Thanks,
> >> Rahul
> >>
> >> On Fri, Mar 5, 2021 at 3:04 PM Rahul Goswami <[email protected]>
> wrote:
> >>>
> >>> Hello,
> >>> We have a process which backs up the index (Solr 7.7.2) on a schedule.
> >>> We first save a commit point on the index and then, using Solr's
> >>> /replication handler, get the list of files in that generation. After the
> >>> backup completes, we release the commit point. (Please note that this is
> >>> a separate backup process outside of Solr and not the backup command of
> >>> the /replication handler.)
> >>> The assumption is that while the commit point is saved, no changes
> >>> happen to the segment files in the saved generation.
> >>>
> >>> Now the issue... The backup process opens the index files in a shared
> >>> READ mode, preventing writes. This causes any parallel commits to fail,
> >>> apparently because the index files are locked by another process (the
> >>> backup process). Upon debugging, I see that fsync is being called during
> >>> commit on already existing segment files, which is not expected. So my
> >>> question is: is there any reason for Lucene to call fsync on already
> >>> existing segment files?
> >>>
> >>> The line of code I am referring to is as below:
> >>> try (final FileChannel file = FileChannel.open(fileToSync, isDir ?
> >>> StandardOpenOption.READ : StandardOpenOption.WRITE))
> >>>
> >>> in method fsync(Path fileToSync, boolean isDir) of the class file
> >>>
> >>> lucene\core\src\java\org\apache\lucene\util\IOUtils.java
> >>>
> >>> Thanks,
> >>> Rahul
