[
https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mamta A. Satoor updated DERBY-1482:
-----------------------------------
Attachment: derby1482_patch3_diff.txt
derby1482_patch3_stat.txt
Attaching patch derby1482_patch3_diff.txt which along with the core work also
takes care of upgrade. Please share your comments on the patch.
Here are the patch comments
DERBY-1482 will improve on Derby's behavior about which columns should be read
in when a trigger fires. The current behavior of Derby reads all the columns
from the trigger table whether or not all the columns are needed for trigger
execution. This can be troublesome when the trigger table has large data LOB
columns causing Derby to run out of memory. With this patch, Derby will narrow
down cases where it should be sufficient to read only a subset of columns from
the trigger table.
eg of such a trigger would be the trigger below which has both trigger columns
and trigger action columns through old/new transition variables
* CREATE TRIGGER tr1 AFTER UPDATE OF c12 ON t1
* REFERENCING OLD AS oldt NEW AS newt
* FOR EACH ROW UPDATE t2 SET c24=oldt.j;
For a trigger like above, Derby knows exactly which columns are needed from the
trigger table during trigger execution. A note of those columns to read will be
made in SYSTRIGGERS table.
The general rules for what columns will be read in during a trigger execution
will be as follows
Rule1)If trigger column information is null, then Derby will read all the
columns from trigger table into memory irrespective of whether there is any
trigger action column information. eg triggers
create trigger tr1 after update on t1 for each row values(1);
create trigger tr1 after update on t1 referencing old as oldt for
each row insert into t2 values(2,oldt.j,-2);
Rule2)If trigger column information is available but no trigger action column
information is found and no REFERENCES clause is used for the trigger, then
read all the columns identified by the trigger column. eg trigger
create trigger tr1 after update of c1 on t1 for each row values(1);
Rule3)If trigger column information and trigger action column information both
are not null, then only those columns will be read into memory. This will be a
trigger created in 10.7 or higher level db. Because prior to that we did not
collect trigger action column informatoin. eg
create trigger tr1 after update of c1 on t1 referencing old as oldt
for each row insert into t2 values(2,oldt.j,-2);
Rule4)If trigger column information is available and no trigger action column
information is found but REFERENCES clause is used for the trigger, then we
will read all the columns from the trigger table. I believe that this will
cover soft-upgrade and hard-
upgrade scenario during trigger execution of triggers created pre-10.7. This
will prevent us from having special logic of soft-upgrade.
Additionally, this logic makes invalidation of existing triggers unnecessary
during hard-upgrade. The pre-10.7 created triggers will work just fine even
though for some triggers, they would have trigger action columns missing. A
user can choose to drop and recreate such triggers to take advantage of
DERBY-1482 work which will avoid unnecssary column reads during trigger
execution if the trigger action column information was available along with
trigger column info. eg trigger created prior to 10.7
create trigger tr1 after update of c1 on t1 referencing old as oldt
for each row insert into t2 values(2,oldt.j,-2);
The Rule4) is there to cover triggers created with pre-10,7 releases but now
that database has been hard/soft-upgraded to 10.7
or higher version. Prior to 10.7, we did not collect any information about
trigger action columns. So, both of the 2 kinds of triggers shown above prior
to 10.7 will not have any trigger action column info on them in SYSTRIGGERS
table. Without our last rule about what columns should be read in, we will run
into problems for soft-upgrade or hard-upgrade causing us to not read all the
required columns.The trigger eg above if created prior to 10.7 mode will have
trigger column and REFERENCING clause set in SYSTRIGGERS but will not have
trigger action column info even though there are columns referenced in
trigger action through old/new transition variables. This same trigger if
created in 10.7 or higher version will have trigger column, REFERENCING
clause and trigger action column set for it in SYSTRIGGERS. To handle this on a
hard/soft-upgraded db, we will just choose to read all the columns from trigger
table if SYSTRIGGER shows trigger column and REFERENCING clause for a trigger
but no trigger action column.
Rule5)The only place we will need special code for soft-upgrade is during
trigger creation. If we are in soft-upgrade mode, we want to make sure that we
do not save information about trigger action columns in SYSTRIGGERS because the
releases prior to 10.7 do not understand trigger action column information.
Let's see with examples, when the rules described above kick in
Consider following 4 trigger scenarios
1) create trigger tr1 after update on t1 for each row values(1);
2) create trigger tr1 after update on t1 referencing old as oldt
for each row insert into t2 values(2,oldt.j,-2);
3) create trigger tr1 after update of c1 on t1 for each row
values(1);
4) create trigger tr1 after update of c1 on t1 referencing old as
oldt for each row insert into t2 values(2,oldt.j,-2);
And these different triggers could be created in following various combination
of soft-upgrade, hard-upgrade, brand new db
a)trigger is getting created in newly created 10.7 db
b)trigger already created in the pre-10.7db before soft upgrade
c)trigger is getting created while in soft upgrad mode with pre-10.7 db
d)trigger already created in the pre-10.7db before hard upgrade
e)trigger is getting created after pre-10.7db is hard upgraded
All the db variations for trigger scenario 1 will be covered by Rule1).
All the db variations for trigger scenario 2 will be covered by Rule1)
All the db variations for trigger scenario 3 will be covered by Rule2).
For trigger scenario 4, different rules will apply depending on db variations
For 4a), Rule3) will apply
For 4b), Rule4) will apply
For 4c), Rule5) will apply
For 4d), Rule4) will apply
For 4e), Rule3) will apply
> Update triggers on tables with blob columns stream blobs into memory even
> when the blobs are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: DERBY-1482
> URL: https://issues.apache.org/jira/browse/DERBY-1482
> Project: Derby
> Issue Type: Bug
> Components: SQL
> Affects Versions: 10.2.1.6
> Reporter: Daniel John Debrunner
> Assignee: Mamta A. Satoor
> Priority: Minor
> Attachments: derby1482_patch1_diff.txt, derby1482_patch1_stat.txt,
> derby1482_patch2_diff.txt, derby1482_patch2_stat.txt,
> derby1482_patch3_diff.txt, derby1482_patch3_stat.txt,
> derby1482DeepCopyAfterTriggerOnLobColumn.java, derby1482Repro.java,
> derby1482ReproVersion2.java, junitUpgradeTestFailureWithPatch1.out,
> TriggerTests_ver1_diff.txt, TriggerTests_ver1_stat.txt
>
>
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger
> "tr1" defined on that table, where the triggered-SQL-action for "tr1" does
> NOT reference any of the blob columns in the table. [ Note that this is
> different from DERBY-438 because DERBY-438 deals with triggers that _do_
> reference the blob column(s), whereas this issue deals with triggers that do
> _not_ reference the blob columns--but I think they're related, so I'm
> creating this as subtask to 438 ]. In such a case, if the trigger is fired,
> the blob data will be streamed into memory and thus consume JVM heap, even
> though it (the blob data) is never actually referenced/accessed by the
> trigger statement.
> For example, suppose we have the following DDL:
> create table t1 (id int, status smallint, bl blob(2G));
> create table t2 (id int, updated int default 0);
> create trigger tr1 after update of status on t1 referencing new as n_row
> for each row mode db2sql update t2 set updated = updated + 1 where t2.id =
> n_row.id;
> Then if t1 and t2 both have data and we make a call to:
> update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be
> streamed into memory for each row affected by the trigger. The result is
> that, if the blob data is large, we end up using a lot of JVM memory when we
> really shouldn't have to (at least, in _theory_ we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced,
> and avoid streaming the lob into memory whenever possible (hence this is
> probably more of an "enhancement" request than a bug)...
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.