[ 
https://issues.apache.org/jira/browse/DERBY-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287884#comment-17287884
 ] 

Ondrej Bouda commented on DERBY-7099:
-------------------------------------

[~rhillegas]  - I was also thinking of corruption, but there are still several 
reasons to suspect a bug:
 * You need 2 SELECTs before the DELETE to reproduce the issue, which might 
point to a locking-related bug (see the JDBC sketch after this list).
 * If it were corruption, there would be a good chance that DELETE + COMMIT + 
RE-INSERT + COMMIT would fix the issue. However, the issue persists.
 * Maybe it's not corruption but rather some unusual physical data layout - 
valid but rarely occurring - which is simply not well covered by the current 
implementation of the SQL engine.
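
For completeness, here is a minimal JDBC sketch of the failing sequence we 
observe. The table T, its integer PK column ID, and the database path are 
placeholder names; the attached repro app is the authoritative version.

{code:java}
import java.sql.*;

public class DeleteSkipRepro {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:derby:/path/to/db")) {
            // Two SELECTs by PK appear to be required to trigger the problem.
            for (int i = 0; i < 2; i++) {
                try (PreparedStatement sel =
                        conn.prepareStatement("SELECT * FROM T WHERE ID = ?")) {
                    sel.setInt(1, 42);
                    try (ResultSet rs = sel.executeQuery()) {
                        while (rs.next()) { /* consume the row */ }
                    }
                }
            }
            // DELETE by the same PK; on an affected database this reports 0 rows.
            try (PreparedStatement del =
                    conn.prepareStatement("DELETE FROM T WHERE ID = ?")) {
                del.setInt(1, 42);
                System.out.println("DELETE affected " + del.executeUpdate() + " row(s)");
            }
            // The re-insert then fails with a duplicate-key error,
            // because the row was never actually removed.
            try (PreparedStatement ins =
                    conn.prepareStatement("INSERT INTO T (ID) VALUES (?)")) {
                ins.setInt(1, 42);
                ins.executeUpdate();
            }
        }
    }
}
{code}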

Regarding table compression - yes, we have a regular job for that, and it is 
one of our suspects (the calls the job makes are sketched after this list).
 * About a year ago, we switched from SYSCS_COMPRESS_TABLE to 
SYSCS_INPLACE_COMPRESS_TABLE - and in the past months we started to see an 
increased number of corrupted databases. We cannot identify the root cause, 
but we are reverting to SYSCS_COMPRESS_TABLE - just to be on the safe side.
 * I had already looked at [DERBY-5876] and [DERBY-5356] before filing this 
issue but found nothing that would clearly fit the reported use case.
 * Please note that we are a software development company, so there are many 
thousands of production databases and we have no control over them. The 
corruption cases may therefore stem from the usual causes, such as a hard 
restart or the Java/app/Derby process being killed - but we can neither 
confirm nor deny these options.
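
For reference, this is essentially how our maintenance job invokes the two 
system procedures over JDBC (a sketch only; "APP", "T", and the database path 
are placeholders):

{code:java}
import java.sql.*;

public class CompressJob {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:derby:/path/to/db")) {
            // Full rebuild - the variant we are reverting to.
            try (CallableStatement cs =
                    conn.prepareCall("CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)")) {
                cs.setString(1, "APP");
                cs.setString(2, "T");
                cs.setShort(3, (short) 1); // non-zero = sequential mode (less memory)
                cs.execute();
            }
            // In-place variant we had switched to about a year ago.
            try (CallableStatement cs = conn.prepareCall(
                    "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE(?, ?, ?, ?, ?)")) {
                cs.setString(1, "APP");
                cs.setString(2, "T");
                cs.setShort(3, (short) 1); // purge deleted rows
                cs.setShort(4, (short) 1); // defragment remaining rows
                cs.setShort(5, (short) 1); // truncate the end of the file
                cs.execute();
            }
        }
    }
}
{code}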

Actually, if my issue is caused by corruption, I would love the 
SYSCS_CHECK_TABLE function to be able to detect this kind of corruption. 
We could then implement a maintenance job to detect and fix the issue 
automatically.
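
The check we would build on is the documented SYSCS_CHECK_TABLE pattern over 
all user tables; a sketch of what such a maintenance job could run (database 
path is a placeholder):

{code:java}
import java.sql.*;

public class CheckAllTables {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:derby:/path/to/db");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT s.SCHEMANAME, t.TABLENAME, "
                     + "SYSCS_UTIL.SYSCS_CHECK_TABLE(s.SCHEMANAME, t.TABLENAME) "
                     + "FROM SYS.SYSSCHEMAS s, SYS.SYSTABLES t "
                     + "WHERE s.SCHEMAID = t.SCHEMAID AND t.TABLETYPE = 'T'")) {
            while (rs.next()) {
                // SYSCS_CHECK_TABLE returns 1 for a consistent table;
                // a detected inconsistency raises an SQLException instead.
                System.out.println(rs.getString(1) + "." + rs.getString(2)
                        + " -> " + rs.getInt(3));
            }
        }
    }
}
{code}

As noted in the issue, this check currently reports the affected table as 
healthy, so it would need to be extended to catch whatever state causes the 
skipped DELETE.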

> DELETE skips a row matching the WHERE clause (single delete by PK)
> ------------------------------------------------------------------
>
>                 Key: DERBY-7099
>                 URL: https://issues.apache.org/jira/browse/DERBY-7099
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.14.2.0, 10.15.2.0
>         Environment: Looks like an environment-independent bug - reproduced in 
> multiple environments (prod/test, win/linux, different Java and Derby 
> versions).
>            Reporter: Ondrej Bouda
>            Priority: Critical
>         Attachments: repro.zip
>
>
> We came across a bug in Derby where a DELETE by primary key does not 
> actually delete the row (although the row exists).
>  * We found the issue in a delete-reinsert use case when the reinsert was 
> failing with a "duplicate key" error.
>  * Note: The attached app to reproduce the issue follows this reinsert 
> pattern.
> To reproduce the issue, a SELECT (by PK) is required before the DELETE (by 
> PK).
>  * SELECT => SELECT => DELETE => RE-INSERT => duplicate key error.
>  * The DELETE works OK if zero or one SELECT is executed beforehand.
>  * The "failed" DELETE correctly reports "0 rows affected".
>  * Retrying the DELETE makes the delete really happen.
>  * The behavior is the same both for autocommit and manual commit mode.
>  * The behavior is 100% reproducible in the attached demo app.
> However, it is not reproducible from DBeaver (using the same queries).
> Notes regarding the attached database + demo app to reproduce the issue:
>  * The app takes 3 arguments: path to the database (string), how many times 
> to execute the initial SELECT (int) and an optional "retry" string to retry 
> the DELETE statement.
>  * Use "test_app.bat" => there are several execution scenarios; uncomment 
> the one you intend to test.
>  * With 0-1 SELECTs, the demo app works without any issues.
>  * With 2+ SELECTs, the demo app fails on "duplicate key" when trying to 
> reinsert the deleted row. The app reports that the DELETE affected 0 rows.
>  * In the "retry" mode, you can see that the 2nd DELETE is successful (even 
> for 3+ SELECTs).
> Further notes:
>  * The issue persists even after a successful delete-reinsert.
>  * Check for corruption (SYSCS_CHECK_TABLE) indicates that the table is 
> healthy.
>  * A full rebuild of the affected table (SYSCS_COMPRESS_TABLE) resolves the 
> issue, but there is no way to tell whether a table needs a rebuild to avoid 
> this kind of issue...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
