[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework
mcvsubbu commented on a change in pull request #6021: URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489089813 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java ## @@ -100,8 +110,11 @@ public void map() } // Partitioning - // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table config partitioner. Combine both to get final partition. - String partition = _partitioner.getPartition(reusableRow); + int p = 0; + for (Partitioner partitioner : _partitioners) { +partitions[p++] = partitioner.getPartition(reusableRow); + } + String partition = StringUtil.join("_", partitions); Review comment: The `"_"` here is very significant, right? It cannot be changed, and has to be used the same way across multiple components. Could you please declare it as. a final string in some Constants class as a partition separator or something? And then re-use in tests thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework
mcvsubbu commented on a change in pull request #6021: URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489138013 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java ## @@ -100,8 +110,11 @@ public void map() } // Partitioning - // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table config partitioner. Combine both to get final partition. - String partition = _partitioner.getPartition(reusableRow); + int p = 0; + for (Partitioner partitioner : _partitioners) { +partitions[p++] = partitioner.getPartition(reusableRow); + } + String partition = StringUtil.join("_", partitions); Review comment: We are partitioning the data, and the brokers have to construct the same partition id in the same order of columns and with the same partition function right? Also, what is the use case for partitioning on more than one column? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework
mcvsubbu commented on a change in pull request #6021: URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489551646 ## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java ## @@ -100,8 +110,11 @@ public void map() } // Partitioning - // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table config partitioner. Combine both to get final partition. - String partition = _partitioner.getPartition(reusableRow); + int p = 0; + for (Partitioner partitioner : _partitioners) { +partitions[p++] = partitioner.getPartition(reusableRow); + } + String partition = StringUtil.join("_", partitions); Review comment: Maybe I am missing something, I will try to find it in the design document. 1. It seems to me that there may be at most 2 partitioners, so is making that a Pair better? Or, there should be a comment and an assert some place that the size is two. 2. The word partition confused me into thinking that we are somehow respecting partitioning of data using a partition key with underscores (with brokers constructing some partition keys). That is clearly not the case here. Maybe rename this as a splitKey instead of partition? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org