[ https://issues.apache.org/jira/browse/CALCITE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747228#comment-17747228 ]
LakeShen commented on CALCITE-5871:
-----------------------------------

Hi [~grandfisher], can you explain the problem you described with an SQL example? I think it would help people understand the problem better.

> Data distributions need to be combined and represented.
> -------------------------------------------------------
>
>                 Key: CALCITE-5871
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5871
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: grandfisher
>            Priority: Major
>
> For a distributed, partitioned database, the data may be partitioned by time
> and also hash partitioned by the `region` field.
> If an aggregate groups on "(Day,Region)", it is hard to express the distribution
> of the aggregate rel, which is range(Day) together with hash(region).
> In another case, a hash shuffle join `( L join R on L.a=R.c and L.b=R.d ) as T`,
> T satisfies two distributions: one is Hash(a,b) and the other is Hash(c,d);
> it is not Hash(a,b,c,d). But we must lose one of them, because a
> RelDistribution can only hold one distribution.
> We think this is common in time-series distributed databases.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
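
The join case in the quoted description can be checked on toy data. The following is only a sketch and does not use any Calcite API: the helper names (`partition_of`, `shuffle`, `shuffle_join`, `satisfies_hash`), the four-partition layout, and the sum-based hash function are all assumptions made for illustration. It shuffles L by (a,b) and R by (c,d), joins partition by partition, and then tests which hash distributions the result T satisfies.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed cluster size, for illustration only


def partition_of(key):
    """Toy hash: sum of the key columns modulo the partition count (an assumption)."""
    return sum(key) % NUM_PARTITIONS


def shuffle(rows, key_columns):
    """Hash-shuffle rows (dicts) onto partitions by the given columns."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        partitions[partition_of(key)].append(row)
    return partitions


def shuffle_join(left, right):
    """Join L and R on L.a = R.c AND L.b = R.d, one partition at a time."""
    left_parts = shuffle(left, ("a", "b"))
    right_parts = shuffle(right, ("c", "d"))
    joined = defaultdict(list)
    for p in range(NUM_PARTITIONS):
        for lrow in left_parts[p]:
            for rrow in right_parts[p]:
                if (lrow["a"], lrow["b"]) == (rrow["c"], rrow["d"]):
                    joined[p].append({**lrow, **rrow})
    return joined


def satisfies_hash(partitions, key_columns):
    """True if every row sits on the partition that hashing key_columns selects."""
    return all(
        partition_of(tuple(row[c] for c in key_columns)) == p
        for p, rows in partitions.items()
        for row in rows
    )


left = [{"a": i, "b": i + 1} for i in range(20)]
right = [{"c": i, "d": i + 1} for i in range(20)]
t = shuffle_join(left, right)

print(satisfies_hash(t, ("a", "b")))            # placement agrees with Hash(a,b)
print(satisfies_hash(t, ("c", "d")))            # placement agrees with Hash(c,d)
print(satisfies_hash(t, ("a", "b", "c", "d")))  # placement does not agree with Hash(a,b,c,d)
```

With this data, T is correctly placed for both Hash(a,b) and Hash(c,d) but not for Hash(a,b,c,d), which is the situation a single distribution trait cannot describe.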