vvivekiyer opened a new issue, #11130:
URL: https://github.com/apache/pinot/issues/11130
We noticed that index removal threw an exception during segment reload in
certain cases. This error is not deterministic and only occurs for some
segments.
**Stack-trace**
> 2023/07/17 23:33:53.329 INFO [InvertedIndexHandler]
[HelixTaskExecutor-message_handle_thread_25] [pinot-server] [] Creating new
inverted index for segment: <segmentName> column: <columnName>
> 2023/07/17 23:34:23.516 INFO [InvertedIndexHandler]
[HelixTaskExecutor-message_handle_thread_25] [pinot-server] [] Created inverted
index for segment: <segmentName>, column: <columnName>
> 2023/07/17 23:34:23.516 INFO [BloomFilterHandler]
[HelixTaskExecutor-message_handle_thread_25] [pinot-server] [] Removing
existing bloom filter from segment: <segmentName> column: <columnName>
> 2023/07/17 23:34:23.516 INFO [BloomFilterHandler]
[HelixTaskExecutor-message_handle_thread_25] [pinot-server] [] Removed existing
bloom filter from segment: <segmentName>, column: <columnName>
> 2023/07/17 23:34:28.344 ERROR [PinotDataBuffer]
[HelixTaskExecutor-message_handle_thread_25] [pinot-server] [] Caught exception
while mapping file: segmentDir/segment_2023-04-28_2023-04-28_0/v3/columns.psf
from offset: 2480743340 of size: 4049668988 with description:
SingleFileIndexDirectory.segmentDir/segment_2023-04-28_2023-04-28_0/v3/columns.psf.single_file_index.rw..2480743340.4049668988
>
> java.io.IOException: Bad file descriptor
> at java.io.RandomAccessFile.write0(Native Method) ~[?:?]
> at java.io.RandomAccessFile.write(RandomAccessFile.java:523) ~[?:?]
> at xerial.larray.mmap.MMapBuffer.<init>(MMapBuffer.java:87)
~[larray-mmap-0.4.1.jar:0.4.1]
> at
org.apache.pinot.segment.spi.memory.PinotNonNativeOrderLBuffer.mapFile(PinotNonNativeOrderLBuffer.java:49)
~[pinot-segment-spi-0.13.0-dev-852.jar:0.13.0-dev-852-0e88f54510e58d61ce00337ae01a860368683a73]
> at
org.apache.pinot.segment.spi.memory.PinotDataBuffer.mapFile(PinotDataBuffer.java:196)
[pinot-segment-spi-0.13.0-dev-852.jar:0.13.0-dev-852-0e88f54510e58d61ce00337ae01a860368683a73]
> at
org.apache.pinot.segment.local.segment.store.SingleFileIndexDirectory.mapAndSliceFile(SingleFileIndexDirectory.java:293)
[pinot-segment-local-0.13.0-dev-852.jar:0.13.0-dev-852-0e88f54510e58d61ce00337ae01a860368683a73]
> at
org.apache.pinot.segment.local.segment.store.SingleFileIndexDirectory.mapBufferEntries(SingleFileIndexDirectory.java:264)
[pinot-segment-local-0.13.0-dev-852.jar:0.13.0-dev-852-0e88f54510e58d61ce00337ae01a860368683a73]
Debugging this, it looks like the columns.psf file has incomplete data.
> ❯ ls -lirt
> total 11075616
> 112226639 -rw-r--r-- 1 vvaidyan 101 16 Jul 18 17:50 creation.meta
> 112226641 -rw-r--r-- 1 vvaidyan 101 4779 Jul 18 17:50 metadata.properties
> 112226689 -rw-r--r-- 1 vvaidyan 101 5666876107 Jul 18 17:51 columns.psf
> 112226700 -rw-r--r-- 1 vvaidyan 101 1301 Jul 18 17:51 index_map
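However, the index_map file refers to offsets beyond the size of columns.psf.
The col2.dictionary entry ends at 2480743340 + 4049668988 = 6530412328, while
columns.psf is only 5666876107 bytes: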
> col1.dictionary.startOffset = 0
> col1.dictionary.size = 728547872
> col1.forward_index.startOffset = 728547872
> col1.forward_index.size = 442462399
> col1.inverted_index.startOffset = 1171010271
> col1.inverted_index.size = 1309733069
> col2.dictionary.startOffset = 2480743340
> col2.dictionary.size = 4049668988
> col2.forward_index.startOffset = 6530412328
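To make the mismatch concrete, here is a minimal sketch of a bounds check over an index_map entry. The class and method names (`IndexMapCheck`, `fitsInFile`) are hypothetical and not part of Pinot; the numbers are the ones from the listing and index_map above.

```java
// Hypothetical helper, not Pinot code: checks whether an index_map entry
// lies entirely inside the backing columns.psf file.
final class IndexMapCheck {
  private IndexMapCheck() {
  }

  /** Returns true if the byte range [startOffset, startOffset + size) fits in a file of fileSize bytes. */
  static boolean fitsInFile(long startOffset, long size, long fileSize) {
    // Written as a subtraction so that startOffset + size cannot overflow.
    return startOffset >= 0 && size >= 0 && size <= fileSize && startOffset <= fileSize - size;
  }

  public static void main(String[] args) {
    long fileSize = 5666876107L;  // size of columns.psf from the ls output
    long start = 2480743340L;     // col2.dictionary.startOffset
    long size = 4049668988L;      // col2.dictionary.size
    System.out.println("end offset = " + (start + size));
    System.out.println("fits = " + fitsInFile(start, size, fileSize));
  }
}
```

With these values the entry ends at 6530412328, so the check reports that it does not fit, whereas col1.inverted_index (ending at 2480743340) does.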
Looking further, it appears that our call to `FileChannel.transferTo` at
https://github.com/apache/pinot/blob/8fff7eb70597eb74c05ee788378eb8fa1ed8efaf/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SingleFileIndexDirectory.java#L420
is incorrect.
[FileChannel.transferTo](https://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileChannel.html#transferTo-long-long-java.nio.channels.WritableByteChannel-)
is not guaranteed to transfer all of the requested bytes in a single call. It
returns the number of bytes actually transferred, so it should be called in a
loop until the full range has been copied.
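
A minimal sketch of what such a loop could look like is below. This is not the actual Pinot fix; `TransferUtil` and `transferFully` are hypothetical names, and treating a zero-byte transfer as an error is an assumption that suits a blocking file-to-file copy.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, not Pinot code: copies an exact byte range by
// retrying FileChannel.transferTo until every requested byte is written.
final class TransferUtil {
  private TransferUtil() {
  }

  /** Copies exactly count bytes from src, starting at position, into dest. */
  static void transferFully(FileChannel src, long position, long count, WritableByteChannel dest)
      throws IOException {
    long transferred = 0;
    while (transferred < count) {
      // transferTo may copy fewer bytes than requested; it returns the actual number.
      long n = src.transferTo(position + transferred, count - transferred, dest);
      if (n <= 0) {
        // Assumption: for a blocking file copy, no progress means the source is
        // shorter than expected, so fail loudly instead of spinning forever.
        throw new IOException("Transferred " + transferred + " of " + count + " bytes before stalling");
      }
      transferred += n;
    }
  }
}
```

A caller would replace a single `src.transferTo(position, count, dest)` call with `TransferUtil.transferFully(src, position, count, dest)`, so a short transfer either completes on a later iteration or surfaces as an exception rather than leaving a truncated columns.psf.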