[ https://issues.apache.org/jira/browse/LUCENE-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Earwin Burrfoot updated LUCENE-2814:
------------------------------------

    Attachment: LUCENE-2814.patch

Patch updated to trunk, no nocommits, no *.closeDocStore(), tests pass.

The SegmentWriteState vs. DocumentsWriter split bothers me. We track flushed files in both, and we inconsistently get the current segment from both of them.

> stop writing shared doc stores across segments
> ----------------------------------------------
>
>                 Key: LUCENE-2814
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2814
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 3.1, 4.0
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-2814.patch, LUCENE-2814.patch, LUCENE-2814.patch
>
>
> Shared doc stores enable the files for stored fields and term vectors to be shared across multiple segments. We've had this optimization since 2.1, I think.
> It works best against a new index, where you open an IW, add lots of docs, and then close it. In that case all of the written segments will reference slices of a single shared doc store segment.
> This was a good optimization because it means we never need to merge these files. But when you open another IW on that index, it writes a new set of doc stores, and then whenever merges take place across doc stores, they must now be merged.
> However, since we switched to shared doc stores, there have been two optimizations for merging the stores. First, we now bulk-copy the bytes in these files if the field name/number assignment is "congruent". Second, we now force congruent field name/number mapping in IndexWriter. This means the shared doc store optimization is much less potent than it used to be.
> Furthermore, the optimization adds *a lot* of hair to IndexWriter/DocumentsWriter; this has been the source of sneaky bugs over time, and causes odd behavior like a merge possibly forcing a flush when it starts.
> Finally, with DWPT (LUCENE-2324), which gets us truly concurrent flushing, we can no longer share doc stores.
> So, I think we should turn off the write side of shared doc stores to pave the way for DWPT to land on trunk and to simplify IW/DW. We still must support reading them (until 5.0), but the read side is far less hairy.
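
For readers unfamiliar with the "congruent" field name/number condition mentioned in the description, here is a minimal sketch of what such a check could look like. It is an illustration only, not the code in the attached patch or in Lucene itself: the class `CongruenceSketch`, the method `isCongruent`, and the representation of a segment's fields as a list of names indexed by field number are all assumptions made for this example. The idea is that bytes can be bulk-copied during a merge only when every field number in the source segment maps to the same field name in the merged segment.

```java
import java.util.Arrays;
import java.util.List;

public class CongruenceSketch {

    // Hypothetical helper, not a Lucene API.
    // Each list holds field names indexed by field number.
    // The segment is treated as congruent with the merged mapping when
    // every field number it uses refers to the same field name there.
    static boolean isCongruent(List<String> segmentFields, List<String> mergedFields) {
        if (segmentFields.size() > mergedFields.size()) {
            return false; // segment uses numbers the merged mapping lacks
        }
        for (int number = 0; number < segmentFields.size(); number++) {
            if (!segmentFields.get(number).equals(mergedFields.get(number))) {
                return false; // same number, different name: must re-encode
            }
        }
        return true; // numbering agrees, so a raw byte copy would be safe
    }

    public static void main(String[] args) {
        List<String> merged = Arrays.asList("id", "title", "body");
        List<String> sameOrder = Arrays.asList("id", "title");
        List<String> swapped = Arrays.asList("title", "id");

        System.out.println(isCongruent(sameOrder, merged)); // true
        System.out.println(isCongruent(swapped, merged));   // false
    }
}
```

Under this reading, forcing a consistent name/number mapping in IndexWriter makes the check succeed for nearly every segment, which is why merging separate doc stores becomes cheap and the shared-store optimization matters less.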