I have seen Trafodion optimizer uses sort, a blocking operator for 
self-referencing updates to provide before image.

Selva 

-----Original Message-----
From: Rohit Jain [mailto:[email protected]] 
Sent: Wednesday, April 12, 2017 1:46 PM
To: [email protected]
Subject: RE: Embedding an insert/select inside an insert/select?

Unless the keys are the same and so is the partitioning, then a sort would be 
efficient, though does that matter as much for HBase as it does a B-tree?

-----Original Message-----
From: Dave Birdsall [mailto:[email protected]] 
Sent: Wednesday, April 12, 2017 3:24 PM
To: [email protected]
Subject: RE: Embedding an insert/select inside an insert/select?

Good thought. I did an experiment. With the 4171 error check commented out, I 
did:

>>prepare s1 from insert into t1 select * from (insert into t2 select * from t1)
+> as tx;

*** WARNING[6008] Statistics for column (A) from table TRAFODION.SEABASE.T2 
were not available. As a result, the access path chosen might not be the best 
possible.

--- SQL command prepared.
>>explain options 'f' s1;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     1.00E+006
7    .    8    esp_exchange                    1:4(hash2)            1.00E+006
5    6    7    tuple_flow                                            1.00E+006
.    .    6    trafodion_insert                T1                    1.00E+000
4    .    5    sort                                                  1.00E+006
2    3    4    nested_join                                           1.00E+006
.    .    3    trafodion_insert                T2                    1.00E+000
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.

I'm guessing this would work correctly because the sort operators would block 
until the scans are done. In fact, that might be why the sort operators are 
there (I had wondered about that in my original e-mail).

Dave


-----Original Message-----
From: Rohit Jain [mailto:[email protected]] 
Sent: Wednesday, April 12, 2017 1:19 PM
To: [email protected]
Subject: RE: Embedding an insert/select inside an insert/select?

Maybe because it will have to verify that none of the outer inserts are into 
the same table(s) as the inner insert(s) so that you don't run into a Halloween 
problem perhaps?

Rohit

-----Original Message-----
From: Dave Birdsall [mailto:[email protected]] 
Sent: Wednesday, April 12, 2017 2:54 PM
To: [email protected]
Subject: Embedding an insert/select inside an insert/select?

Hi,

Trafodion supports a notion of embedded IUD statements. I can, for example, do:

Select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This statement inserts all the rows from t1 into t2, and returns those rows 
where A = 5 as its result.

Just for fun, I tried the following:

Insert into t3 select * from (insert into t2 select * from t1) as tx where tx.a 
= 5;

This fails with the following error:

*** ERROR[4171] Embedded INSERT statements are not supported in INSERT 
statements.

This comes from GenericUpdate::bindNode in optimizer/BindRelExpr.cpp.

Just for fun, I commented out the code that raises this error. I got some 
reasonable plans. For example:

>>prepare ss from insert into t3 select * from (insert into t2 select * from t1)
+> as tx where tx.a = 5;

--- SQL command prepared.
>>explain options 'f' ss;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     5.00E+005
7    .    8    esp_exchange                    1:4(hash2)            5.00E+005
5    6    7    tuple_flow                                            5.00E+005
.    .    6    trafodion_insert                T3                    1.00E+000
4    .    5    sort                                                  5.00E+005
2    3    4    nested_join                                           5.00E+005
.    .    3    trafodion_insert                T2                    9.99E-006
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.
>>

In this example, T1 has 1 million rows. T1, T2 and T3 are simple tables with 
just two integer columns, A and B, primary key A. I had done UPDATE STATISTICS 
on t1 after populating it. I salted each table 4 ways on the primary key.

This looks like a nice parallel plan, though I wonder about the need for the 
sort nodes, since the partitions should line up.

This is a lot of background for my question: Why does the binder forbid such 
statements? Are there some holes in the handling of nested IUDs that need to be 
filled before such a feature is turned on?

Thanks,

Dave


Reply via email to