Re: [I] [discussion] On Upsert Hybrid Tables [pinot]

via GitHub Thu, 11 Jan 2024 14:23:59 -0800


deemoliu commented on issue #12261:
URL: https://github.com/apache/pinot/issues/12261#issuecomment-1888065455


   imo, the right solution for each customer depends on that customer’s 
requirements and capabilities. 
   
   - Currently, open source doesn’t restrict the use of hybrid tables for 
upsert. You can set up a hybrid table with a realtime upsert table and a 
compacted offline table with the same name.
   - In this use case, we do not upsert the offline part in Pinot. It’s 
very expensive to hold the offline part’s PK indexes in memory. The 
customer does the upserting in a Spark job.
   - Since the algorithm is different in the realtime part and the offline 
part, the customer is willing to merge one record from the realtime part and 
one record from the offline part for their final result.
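
To make the last point concrete, here is a minimal sketch of what such a client-side merge could look like. It is only an illustration, not the customer's actual logic: the comment above does not say how the two records are reconciled, so the "greater comparison value wins, realtime wins ties" rule is an assumption, and `merge_hybrid`, `pk`, and `cmp_col` are hypothetical names rather than Pinot APIs.

```python
from typing import Any, Dict, Iterable

Record = Dict[str, Any]


def merge_hybrid(
    offline: Iterable[Record],
    realtime: Iterable[Record],
    pk: str = "id",        # hypothetical primary-key column
    cmp_col: str = "ts",   # hypothetical comparison column
) -> Dict[Any, Record]:
    """Keep one record per primary key across offline and realtime results.

    Assumes each input already holds at most one record per key
    (offline deduplicated upstream, realtime deduplicated by upsert).
    """
    # Start from the offline records, keyed by primary key.
    merged: Dict[Any, Record] = {row[pk]: row for row in offline}
    for row in realtime:
        current = merged.get(row[pk])
        # Assumed rule: the greater comparison value wins; realtime wins ties.
        if current is None or row[cmp_col] >= current[cmp_col]:
            merged[row[pk]] = row
    return merged


offline_rows = [{"id": 1, "ts": 10, "v": "a"}, {"id": 2, "ts": 20, "v": "b"}]
realtime_rows = [{"id": 2, "ts": 25, "v": "c"}, {"id": 3, "ts": 5, "v": "d"}]
result = merge_hybrid(offline_rows, realtime_rows)
```

With these inputs, key 2 resolves to the realtime record because its comparison value is greater, while keys 1 and 3 each come from the only side that has them.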
   
   We are happy to gather feedback and concerns from open source.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


