[ 
https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mamta A. Satoor updated DERBY-1482:
-----------------------------------

    Attachment: derby1482ReproVersion2.java

The current repro attached to this jira, derby1482Repro.java, actually runs 
into OOM because it can't allocate enough page cache from the heap available to 
the JVM. That problem can easily be resolved by reducing the amount of page 
cache requested from the heap, which can be done by setting the property 
-Dderby.storage.pageCacheSize=smallerpagecache. Once the page cache requirement 
is lowered, derby1482Repro.java runs fine, without any problems with LOB 
streaming. The reason we don't run into OOM from LOB streaming with 
derby1482Repro.java is that the LOB data is not big enough: 
derby1482Repro.java inserts only 30K of data into the 2G LOB (notice I am 
talking about the data size, not the column size; the LOB column is declared as 
2G, but only 30K of data is loaded into it). Store does not stream data out of 
store when the data size is less than the page size (in Derby, page size is 
32K). In order to force the data to be read into memory, the data size has to 
be bigger. 
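The size threshold described above can be sketched roughly as follows. This is a minimal illustration of the decision, not Derby's actual store code; the class and method names are made up for this sketch, and the 32K constant is taken from the description above:

```java
public class LobStreamingThreshold {
    // Page size assumed from the description above (Derby's page size
    // is configurable; 32K is the value discussed in this comment).
    static final int PAGE_SIZE = 32 * 1024;

    // Per the comment: store does not stream data out of store when the
    // data size is less than the page size; larger values are streamed.
    static boolean wouldBeStreamed(long lobDataSize) {
        return lobDataSize >= PAGE_SIZE;
    }

    public static void main(String[] args) {
        long smallLob = 30 * 1024;          // 30K, as in derby1482Repro.java
        long largeLob = 320L * 1024 * 1024; // 320M, as in derby1482ReproVersion2.java
        System.out.println("30K streamed:  " + wouldBeStreamed(smallLob));
        System.out.println("320M streamed: " + wouldBeStreamed(largeLob));
    }
}
```

This is why the original repro's 30K of data never exercises the streaming path, while the 320M value in the new repro does.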

What we are really after is a repro program that runs into OOM while LOBs are 
being streamed into memory as part of trigger execution, even though the LOBs 
do not need to be streamed because they are not being accessed/updated. I have 
spent some time coming up with a repro for OOM as a result of LOB streaming; 
it is attached as derby1482ReproVersion2.java. This test case runs into OOM 
while streaming the LOB because the LOB data is now 320M in size, much larger 
than the page size of 32K. Since the BLOB is so large, Store will stream it 
into memory, thus running into OOM. 
To reproduce the problem, use the following steps: 
1) First, create the database with 3 populated tables using the attached java 
program derby1482ReproVersion2.java, as follows: 
java org.apache.derbyTesting.functionTests.tests.lang.derby1482ReproVersion2
Some information about the 3 tables: 
   a) Table t1 has a BLOB column along with other columns. 
   b) Table t3 is similar to t1 but without a BLOB column. 
   c) Table t1 has an update trigger defined on it which updates table t2 in 
the trigger action. 
   d) Table t3 also has an update trigger defined on it which updates table t2 
in the trigger action. 
2) To reproduce the OutOfMemory problem, connect to the database using ij with 
a very limited heap: 
java -Xmx4m -Dderby.storage.pageCacheSize=40 -Dij.exceptionTrace=true 
org.apache.derby.tools.ij 
connect 'jdbc:derby:testDB1482;create=true'; 
-- The following update of table t3 works with no problem; this is the table 
-- with no BLOB column. 
update t3 set status = 1 where id = 1; 
-- But the following update will fail with OOM; this update is on the table 
-- with the BLOB column. 
update t1 set status = 1 where id = 1; 
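As a side note on why streaming per se need not exhaust a 4m heap: the problem is materializing the whole 320M value, not reading it. The following self-contained sketch (not Derby code; the fakeLob helper is a hypothetical stand-in for a BLOB read from store) shows that consuming a 320M stream in 32K chunks keeps peak extra heap at one buffer, regardless of the LOB size:

```java
import java.io.IOException;
import java.io.InputStream;

public class StreamVsMaterialize {
    // An InputStream producing 'size' zero bytes lazily, standing in
    // for a large BLOB read from store (hypothetical, not Derby's API).
    static InputStream fakeLob(final long size) {
        return new InputStream() {
            long remaining = size;
            @Override public int read() {
                if (remaining <= 0) return -1;
                remaining--;
                return 0;
            }
            @Override public int read(byte[] b, int off, int len) {
                if (remaining <= 0) return -1;
                int n = (int) Math.min(len, remaining);
                remaining -= n;
                return n;
            }
        };
    }

    // Drain the stream through a single 32K buffer; memory use stays
    // constant no matter how large the underlying value is.
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[32 * 1024];
        long total = 0;
        for (int n; (n = in.read(buf)) != -1; ) total += n;
        return total;
    }

    public static void main(String[] args) throws IOException {
        long size = 320L * 1024 * 1024; // 320M, as in derby1482ReproVersion2.java
        System.out.println("bytes read: " + drain(fakeLob(size)));
    }
}
```

By contrast, holding the whole value as one byte[] requires 320M of contiguous heap at once, which cannot fit under -Xmx4m.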

> Update triggers on tables with blob columns stream blobs into memory even 
> when the blobs are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-1482
>                 URL: https://issues.apache.org/jira/browse/DERBY-1482
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.2.1.6
>            Reporter: Daniel John Debrunner
>            Priority: Minor
>         Attachments: derby1482DeepCopyAfterTriggerOnLobColumn.java, 
> derby1482Repro.java, derby1482ReproVersion2.java
>
>
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger 
> "tr1" defined on that table, where the triggered-SQL-action for "tr1" does 
> NOT reference any of the blob columns in the table. [ Note that this is 
> different from DERBY-438 because DERBY-438 deals with triggers that _do_ 
> reference the blob column(s), whereas this issue deals with triggers that do 
> _not_ reference the blob columns--but I think they're related, so I'm 
> creating this as subtask to 438 ]. In such a case, if the trigger is fired, 
> the blob data will be streamed into memory and thus consume JVM heap, even 
> though it (the blob data) is never actually referenced/accessed by the 
> trigger statement.
> For example, suppose we have the following DDL:
>     create table t1 (id int, status smallint, bl blob(2G));
>     create table t2 (id int, updated int default 0);
>     create trigger tr1 after update of status on t1 referencing new as n_row 
> for each row mode db2sql update t2 set updated = updated + 1 where t2.id = 
> n_row.id;
> Then if t1 and t2 both have data and we make a call to:
>     update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be 
> streamed into memory for each row affected by the trigger. The result is 
> that, if the blob data is large, we end up using a lot of JVM memory when we 
> really shouldn't have to (at least, in _theory_ we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced, 
> and avoid streaming the lob into memory whenever possible (hence this is 
> probably more of an "enhancement" request than a bug)... 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
