swaminathanmanish commented on code in PR #12220:
URL: https://github.com/apache/pinot/pull/12220#discussion_r1454308821
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/mapper/SegmentMapper.java:
##########
@@ -129,40 +135,30 @@ public Map<String, GenericRowFileManager> map()
private Map<String, GenericRowFileManager> doMap()
throws Exception {
Consumer<Object> observer = _processorConfig.getProgressObserver();
- int totalCount = _recordReaderFileConfigs.size();
- int count = 1;
+ int count = _totalNumRecordReaders - _recordReaderFileConfigs.size() + 1;
GenericRow reuse = new GenericRow();
for (RecordReaderFileConfig recordReaderFileConfig : _recordReaderFileConfigs) {
- RecordReader recordReader = recordReaderFileConfig._recordReader;
- if (recordReader == null) {
- // We create and use the recordReader here.
- try {
- recordReader =
- RecordReaderFactory.getRecordReader(recordReaderFileConfig._fileFormat, recordReaderFileConfig._dataFile,
- recordReaderFileConfig._fieldsToRead, recordReaderFileConfig._recordReaderConfig);
- mapAndTransformRow(recordReader, reuse, observer, count, totalCount);
- } finally {
- if (recordReader != null) {
- recordReader.close();
- }
- }
- } else {
- mapAndTransformRow(recordReader, reuse, observer, count, totalCount);
+ RecordReader recordReader = recordReaderFileConfig.getRecordReader();
+ boolean shouldMapperTerminate = mapAndTransformRow(recordReader, reuse, observer, count, _totalNumRecordReaders);
+
+ // Terminate the map phase if intermediate file size has crossed the threshold.
+ if (shouldMapperTerminate) {
Review Comment:
Could you add a comment explaining why we terminate without finishing the
reader? We also don't close the reader here, so someone reading the code
might think there's a leak.
Ideally, I feel SegmentMapper should not be used outside the
SegmentProcessorFramework because they are closely related, but it's a public
class.
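
To illustrate the kind of comment I mean, here is a minimal, self-contained sketch of the pattern. The names (`EarlyTerminationSketch`, `ResumableReader`, `mapUntilThreshold`) are hypothetical and are not the actual Pinot classes; it only assumes that the caller owns the reader and calls the mapper again to resume.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class EarlyTerminationSketch {
  /** Hypothetical stand-in for a record reader whose lifecycle is owned by the caller. */
  static class ResumableReader implements AutoCloseable {
    private final Iterator<String> _rows;
    private boolean _closed = false;

    ResumableReader(List<String> rows) {
      _rows = rows.iterator();
    }

    boolean hasNext() {
      return _rows.hasNext();
    }

    String next() {
      return _rows.next();
    }

    boolean isClosed() {
      return _closed;
    }

    @Override
    public void close() {
      _closed = true;
    }
  }

  /**
   * Copies rows into 'out' until the reader is exhausted or the size budget is reached.
   * Returns true if the budget was reached.
   *
   * NOTE: the reader is intentionally NOT closed here. The caller owns it and may call
   * this method again to resume from the current position, so this is not a leak.
   */
  static boolean mapUntilThreshold(ResumableReader reader, long maxBytes, List<String> out) {
    long bytesWritten = 0;
    while (reader.hasNext()) {
      String row = reader.next();
      out.add(row);
      bytesWritten += row.length();
      if (bytesWritten >= maxBytes) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    ResumableReader reader = new ResumableReader(List.of("aaaa", "bbbb", "cccc"));
    List<String> firstBatch = new ArrayList<>();
    List<String> secondBatch = new ArrayList<>();

    // First call stops early; the reader stays open so the second call can resume.
    boolean stoppedEarly = mapUntilThreshold(reader, 8, firstBatch);
    boolean stoppedAgain = mapUntilThreshold(reader, 8, secondBatch);

    // The owner closes the reader once it is fully consumed.
    reader.close();
    System.out.println(stoppedEarly + " " + firstBatch + " " + stoppedAgain + " " + secondBatch);
  }
}
```

The point is just that the "no close on early return" is deliberate and documented next to the return, with the close happening in whoever owns the reader.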