[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework

2020-09-15 Thread GitBox


mcvsubbu commented on a change in pull request #6021:
URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489089813



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java
##
@@ -100,8 +110,11 @@ public void map()
   }
 
   // Partitioning
-  // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table 
config partitioner. Combine both to get final partition.
-  String partition = _partitioner.getPartition(reusableRow);
+  int p = 0;
+  for (Partitioner partitioner : _partitioners) {
+partitions[p++] = partitioner.getPartition(reusableRow);
+  }
+  String partition = StringUtil.join("_", partitions);

Review comment:
   The `"_"` here is very significant, right? It cannot be changed, and has 
to be used the same way across multiple components. Could you please declare it 
as. a final string in some Constants class as a partition separator or 
something? 
   
   And then re-use in tests
   
   thanks





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework

2020-09-15 Thread GitBox


mcvsubbu commented on a change in pull request #6021:
URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489138013



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java
##
@@ -100,8 +110,11 @@ public void map()
   }
 
   // Partitioning
-  // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table 
config partitioner. Combine both to get final partition.
-  String partition = _partitioner.getPartition(reusableRow);
+  int p = 0;
+  for (Partitioner partitioner : _partitioners) {
+partitions[p++] = partitioner.getPartition(reusableRow);
+  }
+  String partition = StringUtil.join("_", partitions);

Review comment:
   We are partitioning the data, and the brokers have to construct the same 
partition id in the same order of columns and  with the same partition function 
right?
   
   Also, what is the use case for partitioning on more than one column?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework

2020-09-16 Thread GitBox


mcvsubbu commented on a change in pull request #6021:
URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489551646



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java
##
@@ -100,8 +110,11 @@ public void map()
   }
 
   // Partitioning
-  // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table 
config partitioner. Combine both to get final partition.
-  String partition = _partitioner.getPartition(reusableRow);
+  int p = 0;
+  for (Partitioner partitioner : _partitioners) {
+partitions[p++] = partitioner.getPartition(reusableRow);
+  }
+  String partition = StringUtil.join("_", partitions);

Review comment:
   Maybe I am missing something, I will try to find it in the design 
document.
   1. It seems to me that there may be at most 2 partitioners, so is making 
that a Pair better? Or, there should be a comment and an assert some place that 
the size is two.
   2. The word partition confused me into thinking that we are somehow 
respecting partitioning of data using a partition key with underscores (with 
brokers constructing some partition keys). That is clearly not the case here. 
Maybe rename this as a splitKey instead of partition?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org