[GitHub] [incubator-pinot] npawar commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework
npawar commented on a change in pull request #6021:
URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489672177

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java ##

@@ -100,8 +110,11 @@ public void map()
       }

       // Partitioning
-      // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table config partitioner. Combine both to get final partition.
-      String partition = _partitioner.getPartition(reusableRow);
+      int p = 0;
+      for (Partitioner partitioner : _partitioners) {
+        partitions[p++] = partitioner.getPartition(reusableRow);
+      }
+      String partition = StringUtil.join("_", partitions);

Review comment:
Practically, for the use case I described, it will be 2. But it need not be (there could be more custom logic). Also, the JSON config spec takes a list of partitioners, so I simply kept it as a list.

None of this is set in stone yet. We will be continuously re-evaluating, optimizing and editing this framework as we begin using it (for minion and merge). It is difficult to predict the needs right now, so I prefer not to introduce restrictions on the number of partitioners.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org
[GitHub] [incubator-pinot] npawar commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework
npawar commented on a change in pull request #6021:
URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489182763

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java ##

@@ -100,8 +110,11 @@ public void map()
       }

       // Partitioning
-      // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table config partitioner. Combine both to get final partition.
-      String partition = _partitioner.getPartition(reusableRow);
+      int p = 0;
+      for (Partitioner partitioner : _partitioners) {
+        partitions[p++] = partitioner.getPartition(reusableRow);
+      }
+      String partition = StringUtil.join("_", partitions);

Review comment:
Use case: say the data in the input segments is spread across 3 days. In the resulting segments, we want to create a segment for each day. Additionally, we want partitioning on some id column for query purposes.

Partitioning by the time column is the first step. This doesn't affect segment metadata or broker routing. It is used only by the framework, and its scope ends with the framework. It merely helps create date-aligned input files for the segment generation stage.

Partitioning by the id column is the second step. This is for queries. It will be whatever is in the table config. Only this partition will get set in the segment metadata, and even that will happen during segment creation.

See this comment and discussion: https://github.com/apache/incubator-pinot/pull/5934#discussion_r486006754
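The two-step use case in the comment above can be sketched as follows. This is a minimal, self-contained illustration and not the framework's code: `RowPartitioner`, `CompositePartitionSketch`, `byDay`, `byModulo`, the column names `ts` and `id`, and the choice of a modulo function for the id column are all hypothetical stand-ins. Only the loop over a list of partitioners and the `"_"` join mirror the diff under review.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the framework's Partitioner; rows are plain maps here.
interface RowPartitioner {
  String getPartition(Map<String, Object> row);
}

class CompositePartitionSketch {
  static final long MILLIS_PER_DAY = 86_400_000L;

  // Step 1 (assumed): bucket rows by days since epoch, so each day gets its own files.
  static RowPartitioner byDay(String timeColumn) {
    return row -> String.valueOf(((Number) row.get(timeColumn)).longValue() / MILLIS_PER_DAY);
  }

  // Step 2 (assumed): modulo partitioning on an id column, standing in for
  // whatever partition function the table config specifies.
  static RowPartitioner byModulo(String idColumn, int numPartitions) {
    return row -> String.valueOf(
        Math.floorMod(((Number) row.get(idColumn)).longValue(), (long) numPartitions));
  }

  // Apply every partitioner in order and join the results with "_",
  // as the loop in the diff does. Works for any number of partitioners.
  static String partitionFor(Map<String, Object> row, List<RowPartitioner> partitioners) {
    String[] partitions = new String[partitioners.size()];
    int p = 0;
    for (RowPartitioner partitioner : partitioners) {
      partitions[p++] = partitioner.getPartition(row);
    }
    return String.join("_", partitions);
  }

  public static void main(String[] args) {
    Map<String, Object> row = new HashMap<>();
    row.put("ts", 1600300800000L); // 2020-09-17T00:00:00Z in epoch millis
    row.put("id", 7);

    List<RowPartitioner> partitioners = Arrays.asList(byDay("ts"), byModulo("id", 4));
    System.out.println(partitionFor(row, partitioners)); // prints 18522_3
  }
}
```

In this sketch the day component only groups rows into date-aligned files, while the id component is the one that would correspond to the table config partition.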
[GitHub] [incubator-pinot] npawar commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework
npawar commented on a change in pull request #6021:
URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489102318

## File path: pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java ##

@@ -100,8 +110,11 @@ public void map()
       }

       // Partitioning
-      // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table config partitioner. Combine both to get final partition.
-      String partition = _partitioner.getPartition(reusableRow);
+      int p = 0;
+      for (Partitioner partitioner : _partitioners) {
+        partitions[p++] = partitioner.getPartition(reusableRow);
+      }
+      String partition = StringUtil.join("_", partitions);

Review comment:
Actually, it is not significant at all. It can be changed, and it is not used by any other components. It won't even matter beyond the scope of that joiner line. Hence I don't think it needs to be scoped outside this class, or even outside this method.