Hi Emir,

Yes, I am running the merging on a Windows machine. The drive is an SSD
formatted with the NTFS file system.

Regards,
Edwin
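The "file system limitation" error on NTFS is usually attributed to a single
file accumulating so many fragments that NTFS runs out of room to track its
extents, which matches Emir's diagnosis below. Two commonly suggested
mitigations are sketched here; the drive letter and index path are
hypothetical, the /L switch requires Windows 8 / Server 2012 or later, and
format erases the volume:

    rem Reformat the index volume with large file record segments,
    rem which raises the per-file fragment limit (destroys data on D:)
    format D: /FS:NTFS /L

    rem Alternatively, defragment the existing index files in place
    rem with Sysinternals Contig (hypothetical index path)
    contig.exe D:\solr\data\index\*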
On 22 November 2017 at 16:50, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Hi Edwin,
> Quick googling suggests that this is an NTFS issue related to a large
> number of file fragments, caused by a large number of files in one
> directory or by huge files. Are you running this merging on a Windows
> machine?
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> On 22 Nov 2017, at 02:33, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>
>> Hi,
>>
>> I have encountered this error during the merging of the 3.5TB index.
>> What could be the cause that led to this?
>>
>> Exception in thread "main" Exception in thread "Lucene Merge Thread #8"
>> java.io.IOException: background merge hit exception: _6f(6.5.1):C7256757
>> _6e(6.5.1):C6462072 _6d(6.5.1):C3750777 _6c(6.5.1):C2243594
>> _6b(6.5.1):C1015431 _6a(6.5.1):C1050220 _69(6.5.1):c273879
>> _28(6.4.1):c79011/84:delGen=84 _26(6.4.1):c44960/8149:delGen=100
>> _29(6.4.1):c73855/68:delGen=68 _5(6.4.1):C46672/31:delGen=31
>> _68(6.5.1):c66 into _6g [maxNumSegments=1]
>>     at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1931)
>>     at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1871)
>>     at org.apache.lucene.misc.IndexMergeTool.main(IndexMergeTool.java:57)
>> Caused by: java.io.IOException: The requested operation could not be
>> completed due to a file system limitation
>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>     at sun.nio.ch.FileDispatcherImpl.write(Unknown Source)
>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>     at sun.nio.ch.IOUtil.write(Unknown Source)
>>     at sun.nio.ch.FileChannelImpl.write(Unknown Source)
>>     at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
>>     at java.nio.channels.Channels.writeFully(Unknown Source)
>>     at java.nio.channels.Channels.access$000(Unknown Source)
>>     at java.nio.channels.Channels$1.write(Unknown Source)
>>     at org.apache.lucene.store.FSDirectory$FSIndexOutput$1.write(FSDirectory.java:419)
>>     at java.util.zip.CheckedOutputStream.write(Unknown Source)
>>     at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
>>     at java.io.BufferedOutputStream.write(Unknown Source)
>>     at org.apache.lucene.store.OutputStreamIndexOutput.writeBytes(OutputStreamIndexOutput.java:53)
>>     at org.apache.lucene.store.RateLimitedIndexOutput.writeBytes(RateLimitedIndexOutput.java:73)
>>     at org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:52)
>>     at org.apache.lucene.codecs.lucene50.ForUtil.writeBlock(ForUtil.java:175)
>>     at org.apache.lucene.codecs.lucene50.Lucene50PostingsWriter.addPosition(Lucene50PostingsWriter.java:286)
>>     at org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:156)
>>     at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:866)
>>     at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:344)
>>     at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
>>     at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:164)
>>     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:216)
>>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:101)
>>     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4353)
>>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3928)
>>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
>>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:661)
>> org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException:
>> The requested operation could not be completed due to a file system limitation
>>     at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
>>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:683)
>> Caused by: java.io.IOException: The requested operation could not be
>> completed due to a file system limitation
>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>     at sun.nio.ch.FileDispatcherImpl.write(Unknown Source)
>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>>     at sun.nio.ch.IOUtil.write(Unknown Source)
>>     at sun.nio.ch.FileChannelImpl.write(Unknown Source)
>>     at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
>>     at java.nio.channels.Channels.writeFully(Unknown Source)
>>     at java.nio.channels.Channels.access$000(Unknown Source)
>>     at java.nio.channels.Channels$1.write(Unknown Source)
>>
>> Regards,
>> Edwin
>>
>> On 22 November 2017 at 00:10, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>>
>>> I am using the IndexMergeTool from Solr, with the command below:
>>>
>>> java -classpath lucene-core-6.5.1.jar;lucene-misc-6.5.1.jar
>>> org.apache.lucene.misc.IndexMergeTool
>>>
>>> The heap size is 32GB. There are more than 20 million documents in the
>>> two cores.
>>>
>>> Regards,
>>> Edwin
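For reference, IndexMergeTool takes the destination index directory as its
first argument, followed by two or more source index directories. A sketch of
the full invocation, where only the jar names and the class name come from
Edwin's command above, the paths are hypothetical, and ^ is the cmd.exe line
continuation:

    java -classpath lucene-core-6.5.1.jar;lucene-misc-6.5.1.jar ^
        org.apache.lucene.misc.IndexMergeTool ^
        D:\solr\merged\index D:\solr\core1\data\index D:\solr\core2\data\index

The tool adds the source indexes to an IndexWriter and then calls
forceMerge(1), which is why the stack trace above passes through
IndexWriter.forceMerge.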
>>> On 21 November 2017 at 21:54, Shawn Heisey <apa...@elyograg.org> wrote:
>>>
>>>> On 11/20/2017 9:35 AM, Zheng Lin Edwin Yeo wrote:
>>>>
>>>>> Does anyone know how long the merging in Solr usually takes?
>>>>>
>>>>> I am currently merging about 3.5TB of data, and it has been running
>>>>> for more than 28 hours and is not completed yet. The merging is
>>>>> running on an SSD disk.
>>>>
>>>> The following will apply if you mean Solr's "optimize" feature when
>>>> you say "merging".
>>>>
>>>> In my experience, merging proceeds at about 20 to 30 megabytes per
>>>> second, even if the disks are capable of far faster data transfer.
>>>> Merging is not just copying the data. Lucene is completely rebuilding
>>>> very large data structures, and *not* including data from deleted
>>>> documents as it does so. It takes a lot of CPU power and time.
>>>>
>>>> If we average the data rates I've seen to 25 MB/s, that would indicate
>>>> that an optimize of a 3.5TB index is going to take about 39 hours, and
>>>> it might take as long as 48 hours. And if you're running SolrCloud
>>>> with multiple replicas, multiply that by the number of copies of the
>>>> 3.5TB index. An optimize on a SolrCloud collection handles one shard
>>>> replica at a time and works its way through the entire collection.
>>>>
>>>> If you are merging different indexes *together*, which a later message
>>>> seems to state, then the actual Lucene operation is probably nearly
>>>> identical, but I'm not really familiar with it, so I cannot say for
>>>> sure.
>>>>
>>>> Thanks,
>>>> Shawn
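As a quick cross-check of Shawn's arithmetic, a minimal sketch assuming the
20 to 30 MB/s merge rates he cites, decimal units, and a single copy of the
3.5TB index:

    public class MergeTimeEstimate {
        public static void main(String[] args) {
            double indexSizeMB = 3.5e6; // 3.5 TB expressed in MB (decimal)
            for (double rateMBps : new double[] {20, 25, 30}) {
                double hours = indexSizeMB / rateMBps / 3600.0;
                // prints roughly 48.6, 38.9, and 32.4 hours
                System.out.printf("at %.0f MB/s: %.1f hours%n", rateMBps, hours);
            }
        }
    }

At 25 MB/s this lands on Shawn's ~39 hours, and at 20 MB/s it approaches his
48-hour upper bound, so a 3.5TB merge still running after 28 hours is not
necessarily stuck.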