[jira] Updated: (DERBY-1482) Update triggers on tables with blob columns stream blobs into memory even when the blobs are not referenced/accessed.

Mamta A. Satoor (JIRA) Thu, 27 May 2010 13:21:58 -0700

     [ 
https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mamta A. Satoor updated DERBY-1482:
-----------------------------------

    Attachment: derby1482_patch3_diff.txt
                derby1482_patch3_stat.txt

Attaching patch derby1482_patch3_diff.txt which along with the core work also 
takes care of upgrade. Please share your comments on the patch.

Here are the patch comments
DERBY-1482 will improve on Derby's behavior about which columns should be read 
in when a trigger fires. The current behavior of Derby reads all the columns 
from the trigger table whether or not all the columns are needed for trigger 
execution. This can be troublesome when the trigger table has large data LOB 
columns causing Derby to run out of memory. With this patch, Derby will narrow 
down cases where it should be sufficient to read only a subset of columns from 
the trigger table.
eg of such a trigger would be the trigger below which has both trigger columns 
and trigger action columns through old/new transition variables
         * CREATE TRIGGER tr1 AFTER UPDATE OF c12 ON t1
         *    REFERENCING OLD AS oldt NEW AS newt
         *    FOR EACH ROW UPDATE t2 SET c24=oldt.j;
For a trigger like above, Derby knows exactly which columns are needed from the 
trigger table during trigger execution. A note of those columns to read will be 
made in SYSTRIGGERS table.

The general rules for what columns will be read in during a trigger execution 
will be as follows
Rule1)If trigger column information is null, then Derby will read all the 
columns from trigger table into memory irrespective of whether there is any 
trigger action column information. eg triggers
            create trigger tr1 after update on t1 for each row values(1); 
            create trigger tr1 after update on t1 referencing old as oldt for 
each row insert into t2 values(2,oldt.j,-2); 

Rule2)If trigger column information is available but no trigger action column 
information is found and no REFERENCES clause is used for the trigger, then 
read all the columns identified by the trigger column. eg trigger
            create trigger tr1 after update of c1 on t1 for each row values(1); 

Rule3)If trigger column information and trigger action column information both 
are not null, then only those columns will be read into memory. This will be a 
trigger created in 10.7 or higher level db. Because prior to that we did not 
collect trigger action column informatoin. eg
            create trigger tr1 after update of c1 on t1 referencing old as oldt 
for each row insert into t2 values(2,oldt.j,-2); 

Rule4)If trigger column information is available and no trigger action column 
information is found but REFERENCES clause is used for the trigger, then we 
will read all the columns from the trigger table. I believe that this will 
cover soft-upgrade and hard-
upgrade scenario during trigger execution of triggers created pre-10.7. This 
will prevent us from having special logic of soft-upgrade. 
Additionally, this logic makes invalidation of existing triggers unnecessary 
during hard-upgrade. The pre-10.7 created triggers will work just fine even 
though for some triggers, they would have trigger action columns missing. A 
user can choose to drop and recreate such triggers to take advantage of 
DERBY-1482 work which will avoid unnecssary column reads during trigger 
execution if the trigger action column information was available along with 
trigger column info. eg trigger created prior to 10.7
            create trigger tr1 after update of c1 on t1 referencing old as oldt 
for each row insert into t2 values(2,oldt.j,-2);
The Rule4) is there to cover triggers created with pre-10,7 releases but now 
that database has been hard/soft-upgraded to 10.7
or higher version. Prior to 10.7, we did not collect any information about 
trigger action columns. So, both of the 2 kinds of triggers shown above prior 
to 10.7 will not have any trigger action column info on them in SYSTRIGGERS 
table. Without our last rule about what columns should be read in, we will run 
into problems for soft-upgrade or hard-upgrade causing us to not read all the 
required columns.The trigger eg above if created prior to 10.7 mode will have 
trigger column and REFERENCING clause set in SYSTRIGGERS but will not have 
trigger action column info even      though there are columns referenced in 
trigger action through old/new transition variables. This same trigger if 
created in 10.7 or higher      version will have trigger column, REFERENCING 
clause and trigger action column set for it in SYSTRIGGERS. To handle this on a 
hard/soft-upgraded db, we will just choose to read all the columns from trigger 
table if SYSTRIGGER shows trigger column and REFERENCING clause for a trigger 
but no trigger action column. 

Rule5)The only place we will need special code for soft-upgrade is during 
trigger creation. If we are in soft-upgrade mode, we want to make sure that we 
do not save information about trigger action columns in SYSTRIGGERS because the 
releases prior to 10.7 do not understand trigger action column information.
        


Let's see with examples, when the rules described above kick in
Consider following 4 trigger scenarios 
            1) create trigger tr1 after update on t1 for each row values(1); 
            2) create trigger tr1 after update on t1 referencing old as oldt 
for each row insert into t2 values(2,oldt.j,-2); 
            3) create trigger tr1 after update of c1 on t1 for each row 
values(1); 
            4) create trigger tr1 after update of c1 on t1 referencing old as 
oldt for each row insert into t2 values(2,oldt.j,-2); 

And these different triggers could be created in following various combination 
of soft-upgrade, hard-upgrade, brand new db
        a)trigger is getting created in newly created 10.7 db 
        b)trigger already created in the pre-10.7db before soft upgrade 
        c)trigger is getting created while in soft upgrad mode with pre-10.7 db 
        d)trigger already created in the pre-10.7db before hard upgrade 
        e)trigger is getting created after pre-10.7db is hard upgraded 

All the db variations for trigger scenario 1 will be covered by Rule1).
All the db variations for trigger scenario 2 will be covered by Rule1)
All the db variations for trigger scenario 3 will be covered by Rule2).
For trigger scenario 4, different rules will apply depending on db variations
        For 4a), Rule3) will apply
        For 4b), Rule4) will apply
        For 4c), Rule5) will apply
        For 4d), Rule4) will apply
        For 4e), Rule3) will apply


> Update triggers on tables with blob columns stream blobs into memory even 
> when the blobs are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-1482
>                 URL: https://issues.apache.org/jira/browse/DERBY-1482
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.2.1.6
>            Reporter: Daniel John Debrunner
>            Assignee: Mamta A. Satoor
>            Priority: Minor
>         Attachments: derby1482_patch1_diff.txt, derby1482_patch1_stat.txt, 
> derby1482_patch2_diff.txt, derby1482_patch2_stat.txt, 
> derby1482_patch3_diff.txt, derby1482_patch3_stat.txt, 
> derby1482DeepCopyAfterTriggerOnLobColumn.java, derby1482Repro.java, 
> derby1482ReproVersion2.java, junitUpgradeTestFailureWithPatch1.out, 
> TriggerTests_ver1_diff.txt, TriggerTests_ver1_stat.txt
>
>
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger 
> "tr1" defined on that table, where the triggered-SQL-action for "tr1" does 
> NOT reference any of the blob columns in the table. [ Note that this is 
> different from DERBY-438 because DERBY-438 deals with triggers that _do_ 
> reference the blob column(s), whereas this issue deals with triggers that do 
> _not_ reference the blob columns--but I think they're related, so I'm 
> creating this as subtask to 438 ]. In such a case, if the trigger is fired, 
> the blob data will be streamed into memory and thus consume JVM heap, even 
> though it (the blob data) is never actually referenced/accessed by the 
> trigger statement.
> For example, suppose we have the following DDL:
>     create table t1 (id int, status smallint, bl blob(2G));
>     create table t2 (id int, updated int default 0);
>     create trigger tr1 after update of status on t1 referencing new as n_row 
> for each row mode db2sql update t2 set updated = updated + 1 where t2.id = 
> n_row.id;
> Then if t1 and t2 both have data and we make a call to:
>     update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be 
> streamed into memory for each row affected by the trigger. The result is 
> that, if the blob data is large, we end up using a lot of JVM memory when we 
> really shouldn't have to (at least, in _theory_ we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced, 
> and avoid streaming the lob into memory whenever possible (hence this is 
> probably more of an "enhancement" request than a bug)... 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DERBY-1482) Update triggers on tables with blob columns stream blobs into memory even when the blobs are not referenced/accessed.

Reply via email to