anuragrai16 opened a new pull request, #18885:
URL: https://github.com/apache/pinot/pull/18885

   When the realtime -> immutable conversion path runs with 
`reuseMutableIndex`, `convertMutableSegment` copies the mutable Lucene index to 
the v1 destination and opens the writer with `CREATE_OR_APPEND`. If a prior 
conversion attempt crashed or was killed mid-merge, the destination can still 
hold leftover Lucene segments, which `FileUtils.copyDirectory` preserves because 
it merges into the destination rather than replacing it. `CREATE_OR_APPEND` 
then opens the highest-numbered `segments_N` file, which may reference the 
stale segments, and the resulting Lucene index ends up with a different 
document count than the surrounding Pinot segment. At query time, 
`DocIdTranslator`'s mapping buffer (sized by the segment's `numDocs`) throws 
`ArrayIndexOutOfBoundsException` for orphan Lucene docIDs.
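
   The failure mode can be illustrated with a toy model. This is only a 
sketch: `DocIdMappingSketch`, `translate`, and `lookupFails` are hypothetical 
names, and the plain `int[]` stands in for the real mapping buffer, whose 
actual layout is not shown in this description. It assumes only what is stated 
above, namely that the buffer is sized by `numDocs` and indexed by Lucene docID.

```java
final class DocIdMappingSketch {
  private DocIdMappingSketch() {}

  /** Toy stand-in for a Lucene-docId to Pinot-docId lookup (hypothetical). */
  static int translate(int[] mapping, int luceneDocId) {
    // No bounds check: an ID beyond the buffer surfaces as an exception.
    return mapping[luceneDocId];
  }

  /** Returns true when looking up luceneDocId in a buffer of numDocs entries fails. */
  static boolean lookupFails(int numDocs, int luceneDocId) {
    int[] mapping = new int[numDocs]; // sized by the Pinot segment's doc count
    try {
      translate(mapping, luceneDocId);
      return false;
    } catch (ArrayIndexOutOfBoundsException e) {
      return true; // orphan docID contributed by a stale Lucene segment
    }
  }
}
```

   With three Pinot docs, a lookup for docID 2 succeeds, while a docID of 5 
(as an index holding extra stale documents could produce) fails.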
   
   **Changes:**
   - Clean the destination directory before copying in both 
`LuceneTextIndexCreator` and `MultiColumnLuceneTextIndexCreator`.
   - Add regression tests that prime the destination with a force-merged stale 
index (its bumped `segments_N` counter deterministically survives the copy) and 
assert that both creators wipe and rebuild correctly.
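
   The clean-before-copy idea can be sketched as follows. This is not the code 
in this PR: `CleanCopySketch` and `cleanAndCopy` are hypothetical names, and 
the sketch uses only `java.nio.file` rather than the creators' actual copy 
path, so treat it as an illustration under those assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class CleanCopySketch {
  private CleanCopySketch() {}

  /** Deletes dest recursively if it exists, then copies the src tree into a fresh dest. */
  static void cleanAndCopy(Path src, Path dest) throws IOException {
    if (Files.exists(dest)) {
      List<Path> toDelete;
      try (Stream<Path> walk = Files.walk(dest)) {
        // Reverse order so children are deleted before their parent directories.
        toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      }
      for (Path p : toDelete) {
        Files.delete(p);
      }
    }
    List<Path> toCopy;
    try (Stream<Path> walk = Files.walk(src)) {
      // Files.walk visits a directory before its entries, so parents are created first.
      toCopy = walk.collect(Collectors.toList());
    }
    for (Path p : toCopy) {
      Path target = dest.resolve(src.relativize(p).toString());
      if (Files.isDirectory(p)) {
        Files.createDirectories(target);
      } else {
        Files.copy(p, target, StandardCopyOption.REPLACE_EXISTING);
      }
    }
  }
}
```

   After `cleanAndCopy`, the destination contains only files that exist in the 
source, so a leftover higher-numbered `segments_N` from an earlier attempt 
cannot be picked up when the writer opens the directory.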
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

