If I'm understanding correctly, Hive does that with a insert into followed
by a select statement that does the aggregation.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries

On Thu, Jun 22, 2017 at 1:32 AM James <xumingmi...@gmail.com> wrote:

> Hi team,
>
> I am thinking about a SQL and stream computing related problem, want to
> hear your opinions.
>
> In stream computing, there is a typical case like this:
>
> *We want to calculate a big wide result table, which has one rowkey and ten
> value columns:*
> *create table result (*
> *    rowkey varchar(127) PRIMARY KEY,*
> *    col1 int,*
> *    col2 int,*
> *    ...*
> *    col10 int*
> *);*
>
> Each of the value columns is calculated by a complex query, so there will
> be ten SQLs to calculate
> data for this table, for each sql:
>
> * First check whether there is a row for the specified `rowkey`.
> * If yes, then `update`, otherwise `insert`.
>
> There is actually a dedicated sql syntax called `MERGE` designed for
> this(SQL2008), a sample usage is:
>
> MERGE INTO result D
>    USING (SELECT rowkey, col1 FROM input WHERE flag = 80) S
>    ON (D.rowkey = S.rowkey)
>    WHEN MATCHED THEN UPDATE SET D.col1 = S.col1
>    WHEN NOT MATCHED THEN INSERT (D.rowkey, D.col1)
>
>
> *The semantic fits perfectly, but it is very verbose, and normal users
> rarely used this syntax.*
>
> So my colleagues invented a new syntax for this scenario (Or more
> precisely, a new interpretation for the INSERT statement). For the above
> scenario, user will always write `insert` statement:
>
> insert into result(rowkey, col1) values(...)
> insert into result(rowkey, col2) values(...)
>
> The sql interpreter will do a trick behind the scene: if the `rowkey`
> exists, then update, otherwise `insert`. This solution is very concise, but
> violates the semantics of `insert`, using this solution INSERT will behave
> differently in batch & stream processing.
>
> How do you guys think? which do you prefer? What's your reasoning?
>
> Looking forward to your opinions, thanks in advance.
>
-- 
Thanks,

Jesse

Reply via email to