Hi Mike,

Windows unfortunately has a crazy limitation on address space: the number of address bits is limited to 43, which is 8 terabytes. See my blog post at https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
On Linux the limit is 47 bits, and with later kernels and hardware it's even larger, so the universe fits into it. 😜

Uwe

On March 15, 2021 7:15:11 PM UTC, Michael McCandless <[email protected]> wrote:
>Thanks Rahul.
>
>> primary reason being that memory mapping multi-terabyte indexes is not feasible through mmap
>
>Hmm, that is interesting -- are you using a 64 bit JVM? If so, what goes wrong with such large maps? Lucene's MMapDirectory should chunk the mapping to deal with ByteBuffer int only address space.
>
>SimpleFSDirectory usually has substantially worse performance than MMapDirectory.
>
>Still, I suspect you would hit the same issue if you used other FSDirectory implementations -- the fsync behavior should be the same.
>
>Mike McCandless
>
>http://blog.mikemccandless.com
>
>
>On Fri, Mar 12, 2021 at 1:46 PM Rahul Goswami <[email protected]> wrote:
>
>> Thanks Michael. For your question...yes I am running Solr on Windows and running it with SimpleFSDirectoryFactory (primary reason being that memory mapping multi-terabyte indexes is not feasible through mmap). I will create a Jira later today with the details in this thread and assign it to myself. Will take a shot at the fix.
>>
>> Thanks,
>> Rahul
>>
>> On Fri, Mar 12, 2021 at 10:00 AM Michael McCandless <[email protected]> wrote:
>>
>>> I think long ago we used to track which files were actually dirty (we had written bytes to) and only fsync those ones. But something went wrong with that, and at some point we "simplified" this logic, I think on the assumption that asking the OS to fsync a file that does in fact exist yet indeed has not changed would be harmless? But somehow it is not in your case? Are you on Windows?
>>>
>>> I tried to do a bit of digital archaeology and remember what happened here, and I came across this relevant looking issue: https://issues.apache.org/jira/browse/LUCENE-2328.
>>> That issue moved tracking of which files have been written but not yet fsync'd down from IndexWriter into FSDirectory.
>>>
>>> But there was another change that then removed staleFiles from FSDirectory entirely.... still trying to find that. Aha, found it! https://issues.apache.org/jira/browse/LUCENE-6150. Phew Uwe was really quite upset in that issue ;)
>>>
>>> I also came across this delightful related issue, showing how a massive hurricane (Irene) can lead to finding and fixing a bug in Lucene! https://issues.apache.org/jira/browse/LUCENE-3418
>>>
>>> > The assumption is that while the commit point is saved, no changes happen to the segment files in the saved generation.
>>>
>>> This assumption should really be true. Lucene writes the files, append only, once, and then never changes them, once they are closed. Pulling a commit point from Solr should further ensure that, even as indexing continues and new segments are written, the old segments referenced in that commit point will not be deleted. But apparently this "harmless fsync" Lucene is doing is not so harmless in your use case. Maybe open an issue and pull out the details from this discussion onto it?
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Fri, Mar 12, 2021 at 9:03 AM Michael Sokolov <[email protected]> wrote:
>>>
>>>> Also - I should have said - I think the first step here is to write a focused unit test that demonstrates the existence of the extra fsyncs that we want to eliminate. It would be awesome if you were able to create such a thing.
>>>>
>>>> On Fri, Mar 12, 2021 at 9:00 AM Michael Sokolov <[email protected]> wrote:
>>>> >
>>>> > Yes, please go ahead and open an issue. TBH I'm not sure why this is happening - there may be a good reason?? But let's explore it using an issue, thanks.
>>>> >
>>>> > On Fri, Mar 12, 2021 at 12:16 AM Rahul Goswami <[email protected]> wrote:
>>>> > >
>>>> > > I can create a Jira and assign it to myself if that's ok (?). I think this can help improve commit performance.
>>>> > > Also, to answer your question, we have indexes sometimes going into multiple terabytes. Using the replication handler for backup would mean requiring a disk capacity more than 2x the index size on the machine at all times, which might not be feasible. So we directly back the index up from the Solr node to a remote repository.
>>>> > >
>>>> > > Thanks,
>>>> > > Rahul
>>>> > >
>>>> > > On Thu, Mar 11, 2021 at 4:09 PM Michael Sokolov <[email protected]> wrote:
>>>> > >>
>>>> > >> Well, it certainly doesn't seem necessary to fsync files that are unchanged and have already been fsync'ed. Maybe there's an opportunity to improve it? On the other hand, support for external processes reading Lucene index files isn't likely to become a feature of Lucene. You might want to consider using Solr replication to power your backup?
>>>> > >>
>>>> > >> On Thu, Mar 11, 2021 at 2:52 PM Rahul Goswami <[email protected]> wrote:
>>>> > >> >
>>>> > >> > Thanks Michael. I thought since this discussion is closer to the code than most discussions on the solr-users list, it seemed like a more appropriate forum. Will be mindful going forward.
>>>> > >> > On your point about new segments, I attached a debugger and tried to do a new commit (just pure Solr commit, no backup process running), and the code indeed does fsync on a pre-existing segment file. Hence I was a bit baffled since it challenged my fundamental understanding that segment files once written are immutable, no matter what (unless picked up for a merge of course).
>>>> > >> > Hence I thought of reaching out, in case there are scenarios where this might happen which I might be unaware of.
>>>> > >> >
>>>> > >> > Thanks,
>>>> > >> > Rahul
>>>> > >> >
>>>> > >> > On Thu, Mar 11, 2021 at 2:38 PM Michael Sokolov <[email protected]> wrote:
>>>> > >> >>
>>>> > >> >> This isn't a support forum; solr-users@ might be more appropriate. On that list someone might have a better idea about how the replication handler gets its list of files. This would be a good list to try if you wanted to propose a fix for the problem you're having. But since you're here -- it looks to me as if IndexWriter indeed syncs all "new" files in the current segments being committed; look in IndexWriter.startCommit and SegmentInfos.files. Caveat: (1) I'm looking at this code for the first time, and (2) things may have been different in 7.7.2? Sorry I don't know for sure, but are you sure that your backup process is not attempting to copy one of the new files?
>>>> > >> >>
>>>> > >> >> On Thu, Mar 11, 2021 at 1:35 PM Rahul Goswami <[email protected]> wrote:
>>>> > >> >> >
>>>> > >> >> > Hello,
>>>> > >> >> > Just wanted to follow up one more time to see if this is the right forum for my question? Or is this suitable for some other mailing list?
>>>> > >> >> >
>>>> > >> >> > Best,
>>>> > >> >> > Rahul
>>>> > >> >> >
>>>> > >> >> > On Sat, Mar 6, 2021 at 3:57 PM Rahul Goswami <[email protected]> wrote:
>>>> > >> >> >>
>>>> > >> >> >> Hello everyone,
>>>> > >> >> >> Following up on my question in case anyone has any idea. Why it's important to know this is because I am thinking of allowing the backup process to not hold any lock on the index files, which should allow the fsync during parallel commits.
>>>> > >> >> >> BUT, in case doing an fsync on existing segment files in a saved commit point DOES have an effect, it might render the backed up index in a corrupt state.
>>>> > >> >> >>
>>>> > >> >> >> Thanks,
>>>> > >> >> >> Rahul
>>>> > >> >> >>
>>>> > >> >> >> On Fri, Mar 5, 2021 at 3:04 PM Rahul Goswami <[email protected]> wrote:
>>>> > >> >> >>>
>>>> > >> >> >>> Hello,
>>>> > >> >> >>> We have a process which backs up the index (Solr 7.7.2) on a schedule. The way we do it is we first save a commit point on the index and then using Solr's /replication handler, get the list of files in that generation. After the backup completes, we release the commit point (Please note that this is a separate backup process outside of Solr and not the backup command of the /replication handler)
>>>> > >> >> >>> The assumption is that while the commit point is saved, no changes happen to the segment files in the saved generation.
>>>> > >> >> >>>
>>>> > >> >> >>> Now the issue... The backup process opens the index files in a shared READ mode, preventing writes. This is causing any parallel commits to fail as it seems to be complaining about the index files to be locked by another process(the backup process). Upon debugging, I see that fsync is being called during commit on already existing segment files which is not expected. So, my question is, is there any reason for lucene to call fsync on already existing segment files?
>>>> > >> >> >>>
>>>> > >> >> >>> The line of code I am referring to is as below:
>>>> > >> >> >>> try (final FileChannel file = FileChannel.open(fileToSync, isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE))
>>>> > >> >> >>>
>>>> > >> >> >>> in method fsync(Path fileToSync, boolean isDir) of the class file
>>>> > >> >> >>>
>>>> > >> >> >>> lucene\core\src\java\org\apache\lucene\util\IOUtils.java
>>>> > >> >> >>>
>>>> > >> >> >>> Thanks,
>>>> > >> >> >>> Rahul
>>>> > >> >>
>>>> > >> >> ---------------------------------------------------------------------
>>>> > >> >> To unsubscribe, e-mail: [email protected]
>>>> > >> >> For additional commands, e-mail: [email protected]
>>>> > >> >>
>>>>

--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de
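
For readers following the fsync part of the thread: the line Rahul quotes opens a regular file with WRITE access before syncing it, which is presumably why a backup process that holds the same file open in a write-denying share mode makes the commit fail on Windows. Below is a minimal sketch of that pattern using only java.nio. The class name FsyncSketch is hypothetical, and the force(true) call is an assumption about what follows the quoted line; this is not a copy of Lucene's IOUtils.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class, not part of Lucene.
class FsyncSketch {

  /** Opens the path and asks the OS to flush its content and metadata to storage. */
  static void fsync(Path fileToSync, boolean isDir) throws IOException {
    // Same open mode as the quoted line: READ for directories, WRITE for regular files.
    // The WRITE open is the step that can fail when another process holds the file
    // open without allowing write sharing.
    try (FileChannel file = FileChannel.open(
        fileToSync, isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE)) {
      file.force(true); // assumption: the sync itself is a plain force(true)
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("segment", ".tmp");
    Files.write(tmp, new byte[] {1, 2, 3});
    fsync(tmp, false); // the file content is unchanged; only durability is requested
    System.out.println("synced " + Files.size(tmp) + " bytes");
    Files.delete(tmp);
  }
}
```

Note that the sync does not modify the file, so the immutability assumption about segment files still holds; the conflict described in the thread comes from the access mode requested when opening the file.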

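
The address-space figures at the top of this thread, and the chunked mapping Mike mentions, come down to simple arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, and the chunk size of 2^30 bytes is an assumption chosen for the example (a single ByteBuffer cannot address more than Integer.MAX_VALUE bytes, so a large file has to be mapped in pieces). It is not MMapDirectory's actual code.

```java
// Hypothetical illustration class, not part of Lucene.
class AddressSpaceSketch {

  /** Number of bytes addressable with the given number of address bits. */
  static long addressableBytes(int bits) {
    return 1L << bits;
  }

  /** Number of chunks of size 2^chunkSizePower needed to cover fileLength bytes. */
  static long chunkCount(long fileLength, int chunkSizePower) {
    long chunkSize = 1L << chunkSizePower;
    // Round up so that a partial trailing chunk is counted as well.
    return (fileLength + chunkSize - 1) >>> chunkSizePower;
  }

  public static void main(String[] args) {
    long tib = 1L << 40;
    System.out.println("43 bits: " + addressableBytes(43) / tib + " TiB");
    System.out.println("47 bits: " + addressableBytes(47) / tib + " TiB");
    System.out.println("chunks for a 3 TiB file: " + chunkCount(3 * tib, 30));
  }
}
```

With 43 address bits the whole process has 8 TiB of virtual address space, so a multi-terabyte index leaves little headroom once heap, libraries, and any other mappings are counted; with 47 bits the same index is a small fraction of the space.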