gortiz commented on code in PR #15591:
URL: https://github.com/apache/pinot/pull/15591#discussion_r2052068409
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java:
##########
@@ -164,10 +172,40 @@ public Set<Integer> getPartitions() {
return _partitions;
}
- @Nullable
@Override
- public Map<IndexType<?, ?, ?>, Long> getIndexSizeMap() {
- return _indexSizeMap;
+ public long getIndexSizeFor(IndexType type) {
+ short indexId = IndexService.getInstance().getNumericId(type);
+ for (int i = 0, n = getIndexTypeSizesCount(); i < n; i++) {
+ if (indexId == getIndexType(i)) {
+ return getIndexSize(i);
+ }
+ }
+
+ return INDEX_NOT_FOUND;
+ }
+
+ // size should be non-negative 48-bit value
+ public void addIndexSize(short indexType, long size) {
+ if (size < 0 || size > SIZE_MASK) {
Review Comment:
Probably copilot is correct here. I'm even thinking whether we should fail
or not in this case.
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java:
##########
@@ -67,14 +70,19 @@ public class ColumnMetadataImpl implements ColumnMetadata {
private final int _totalNumberOfEntries;
private final PartitionFunction _partitionFunction;
private final Set<Integer> _partitions;
- private final Map<IndexType<?, ?, ?>, Long> _indexSizeMap;
+
+ /* List of longs, each encodes:
+ * 2 byte - numeric id of IndexType
+ * 6 byte - index size */
+ private final LongArrayList _indexTypeSizeList;
Review Comment:
Same as in
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/column/PhysicalColumnIndexContainer.java.
This is a change I find interesting, but it makes the code more fragile.
I recommend creating a new class that encapsulates the same logic while
keeping ColumnMetadataImpl and SegmentMetadataImpl clean. In this case, the new
code is not so aggressive in terms of new lines and state management, so
consider this an optional change.
##########
pinot-integration-test-base/src/test/java/org/apache/pinot/integration/tests/ClusterIntegrationTestUtils.java:
##########
@@ -295,7 +295,7 @@ public static void buildSegmentsFromAvro(List<File>
avroFiles, TableConfig table
if (numAvroFiles == 1) {
buildSegmentFromAvro(avroFiles.get(0), tableConfig, schema,
baseSegmentIndex, segmentDir, tarDir);
} else {
- ExecutorService executorService =
Executors.newFixedThreadPool(numAvroFiles);
Review Comment:
nit: Maybe we can use a number proportional to number of cores?
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/column/PhysicalColumnIndexContainer.java:
##########
@@ -39,7 +39,11 @@
public final class PhysicalColumnIndexContainer implements
ColumnIndexContainer {
private static final Logger LOGGER =
LoggerFactory.getLogger(PhysicalColumnIndexContainer.class);
- private final Map<IndexType, IndexReader> _readersByIndex;
+ private static final IndexReader[] EMPTY_READERS = new IndexReader[0];
+
Review Comment:
I like the idea of this change, but the current implementation makes this
class much more complex than required. Could we create a custom class that
implements this _map_ even if it doesn't actually implement java.util.Map
interface?
Even if it is an internal class here, it would make the code easier to
maintain, but I think it may be common to have maps indexed by index type and
given these index types are calculated at java static time, we can offer this
`IndexTypeMap` class as a poor man's EnumMap
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]