Re: [I] Upsert table backfill enhancement: support externally partitioned data [pinot]
rohityadav1993 commented on issue #12987:
URL: https://github.com/apache/pinot/issues/12987#issuecomment-2075075108

Another approach that I believe can be utilized is defining a naming convention for uploaded segments, similar to LLC segment names. The segment name can capture the partition id. We already have a segment type UPLOADED, and `SegmentPartitionMetadataManager#getPartitionId` can be enhanced to extract the partition id from the name. This would not require any changes to existing contracts or ZK metadata.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org
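
As a rough illustration of the naming-convention idea in the comment above, the sketch below encodes a partition id in an uploaded segment's name and parses it back. The layout `uploaded__{tableName}__{partitionId}__{creationTime}__{suffix}`, the class name, and both method names are assumptions made for this example; the thread does not define a format, and this is not Pinot's implementation of `SegmentPartitionMetadataManager#getPartitionId`.

```java
import java.util.OptionalInt;

/** Hypothetical naming convention for uploaded segments; not an existing Pinot API. */
final class UploadedSegmentNameSketch {
    // Assumed layout: uploaded__{tableName}__{partitionId}__{creationTime}__{suffix}
    static final String PREFIX = "uploaded";
    static final String SEPARATOR = "__";

    private UploadedSegmentNameSketch() { }

    /** Builds a segment name that carries the partition id as its third field. */
    static String build(String tableName, int partitionId, String creationTime, String suffix) {
        return String.join(SEPARATOR, PREFIX, tableName,
                Integer.toString(partitionId), creationTime, suffix);
    }

    /**
     * Returns the partition id encoded in the name, or empty when the name
     * does not follow the assumed convention (e.g. an ordinary offline segment).
     */
    static OptionalInt parsePartitionId(String segmentName) {
        String[] parts = segmentName.split(SEPARATOR, -1);
        if (parts.length != 5 || !PREFIX.equals(parts[0])) {
            return OptionalInt.empty();
        }
        try {
            int partitionId = Integer.parseInt(parts[2]);
            return partitionId >= 0 ? OptionalInt.of(partitionId) : OptionalInt.empty();
        } catch (NumberFormatException e) {
            // Third field is not a number, so the name is not in the assumed format.
            return OptionalInt.empty();
        }
    }
}
```

Returning an empty result for names that do not match lets a caller fall back to whatever partition lookup it already uses, which is in keeping with the goal of leaving existing contracts untouched.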
Re: [I] Upsert table backfill enhancement: support externally partitioned data [pinot]
tibrewalpratik17 commented on issue #12987:
URL: https://github.com/apache/pinot/issues/12987#issuecomment-2071920309

> Provide partition id externally:
> Option 1: Provide partition id as http headers during segment upload
> Option 2: Provide partition id as part of uploaded segment metadata (not as columnPartitionMap) (metadata.properties)

IMO if we go with Option 2, then for consistency we should also add or update this metadata for all existing segments. Option 1 is better in that respect, as we already pass a lot of info as headers during segment upload and treat each header more as a config.
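
To make Option 1 from the comment above concrete, here is a minimal sketch of an upload request that carries the partition id as an HTTP header. The header name `X-Upsert-Partition-Id` and the class are hypothetical, since the thread does not name a header. The body is sent as raw bytes to keep the example short, which may differ from how a real segment upload encodes its payload.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch of Option 1: pass the partition id as a header on the upload request. */
final class SegmentUploadRequestSketch {
    // Hypothetical header name; the thread does not define one.
    static final String PARTITION_ID_HEADER = "X-Upsert-Partition-Id";

    private SegmentUploadRequestSketch() { }

    /** Builds (but does not send) a POST request tagged with the partition id. */
    static HttpRequest build(URI uploadUri, byte[] segmentBytes, int partitionId) {
        if (partitionId < 0) {
            throw new IllegalArgumentException("partitionId must be non-negative");
        }
        return HttpRequest.newBuilder(uploadUri)
                .header(PARTITION_ID_HEADER, Integer.toString(partitionId))
                .POST(HttpRequest.BodyPublishers.ofByteArray(segmentBytes))
                .build();
    }
}
```

With this shape the partition id lives only on the request, so segments that are already stored would not need their metadata rewritten, which is the consistency concern raised about Option 2.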
Re: [I] Upsert table backfill enhancement: support externally partitioned data [pinot]
Jackie-Jiang commented on issue #12987:
URL: https://github.com/apache/pinot/issues/12987#issuecomment-2071013541

For real-time ingested data, the partition must match the upstream partition id to uphold the upsert assumption that all data of the same partition is served by the same server, and I don't think we can loosen this requirement.

A `Partition function` is required for partition pruning. If partition pruning is not required, then we may allow a custom partition id without a partition function.
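
The link between the partition function and pruning in the comment above can be illustrated with a small sketch. It assumes a simple modulo partition function and a hypothetical in-memory map from segment name to partition id. The class and method names are invented for the example and do not mirror Pinot's pruning code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch of why pruning needs a partition function; names are illustrative only. */
final class PartitionPruningSketch {
    private PartitionPruningSketch() { }

    /** Assumed partition function: key modulo the number of partitions. */
    static int moduloPartition(long key, int numPartitions) {
        return (int) Math.floorMod(key, (long) numPartitions);
    }

    /**
     * For an equality filter on the partition column, keeps only the segments
     * whose recorded partition id matches the partition computed from the key.
     */
    static List<String> segmentsToQuery(Map<String, Integer> segmentToPartition,
                                        long filterKey, int numPartitions) {
        int target = moduloPartition(filterKey, numPartitions);
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : segmentToPartition.entrySet()) {
            if (entry.getValue() == target) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }
}
```

If the partition id is supplied externally with no function behind it, there is nothing to play the role of `moduloPartition`, so a filter value cannot be mapped to a partition and every segment has to be queried. The id can still be used to place segments of the same partition on the same server.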