Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread James
Hi Tyler, I think UPSERT is a good alternative: it is as concise as INSERT and has valid semantics. It's just that users seem to rarely use UPSERT either (perhaps because there is no UPDATE in batch big data processing). By *"INSERT will behave differently in batch & stream processing"* I mean, if we use the "IN

Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread James
Hi Jesse, Yeah, I know the insert...select grammar. In my scenario, each of the value columns is calculated separately (possibly from different data sources), so insert...select might not be sufficient. Jesse Anderson wrote on Thu, Jun 22, 2017 at 10:35 PM: > If I'm understanding correctly, Hive does that

Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread Tyler Akidau
Calcite appears to have UPSERT support; can we just use that instead? Also, I don't understand your statement that "INSERT will behave differently in batch & stream processing". Can you explain further? -Tyler On Thu, Jun 22, 2017 at 7:35 AM J

Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread Jesse Anderson
If I'm understanding correctly, Hive does that with an insert into followed by a select statement that does the aggregation. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries On Thu, Jun 22, 2017 at 1:32 AM James wrote: >

SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread James
Hi team, I am thinking about a problem related to SQL and stream computing, and would like to hear your opinions. In stream computing, there is a typical case like this: *We want to calculate a big wide result table, which has one rowkey and ten value columns:* *create table result (* *rowkey varchar(127