[
https://issues.apache.org/jira/browse/HAWQ-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244963#comment-16244963
]
Yi Jin commented on HAWQ-1530:
--
I pushed the fix, Grant. Thank you for your help.
To verify it, you can cancel or terminate a running query or insert at a random
point, then check that a subsequent DROP TABLE completes without hanging.
This is not easy to test, so let me explain what happened before the fix:
Process A accesses table T with a shared lock, and a FATAL error is raised when
it is cancelled or terminated. Unfortunately, if another ERROR then occurs and
is promoted to a FATAL error, it breaks the normal transaction cleanup logic,
so the lock counter structure is not cleaned up correctly.
Then process B accesses table T2 and reuses the lock counter structure that A
used. After finishing all of its logic and computation, B has to clean up the
lock counter structure again. Because the structure has been polluted by A, B
cannot bring the counter back to 0, so B is unable to release its lock on T2.
B then exits happily without reporting any error. This is why we can observe a
lock held by an unknown process.
When an unlucky process C comes along and tries to DROP TABLE, it cannot get
the exclusive lock, so C hangs.
The fix is to drop any potential ERROR raised while HAWQ is already exiting due
to a previous FATAL error.
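To make the sequence above easier to follow, here is a toy Python model of it.
It is only a sketch and not HAWQ code: LockCounter, run_process and
drop_table_can_proceed are hypothetical names, the shared counter is reduced to
a single integer, and the sketch only tracks whether T2 is still locked when C
arrives (what happens to A's own lock on T is not modelled).

```python
# Toy model only; all names are hypothetical and none of this is HAWQ code.

class LockCounter:
    """Lock counter structure that a later process reuses."""

    def __init__(self):
        self.count = 0


def run_process(counter, held_locks, table, cleanup_runs=True):
    """Take a shared lock on `table`, then clean up on exit.

    The lock is released only if the counter returns to 0.
    cleanup_runs=False models the broken exit path (an ERROR promoted to
    FATAL while a FATAL exit is already in progress).
    """
    held_locks.add(table)
    counter.count += 1
    if not cleanup_runs:
        return  # cleanup skipped: the counter stays polluted
    counter.count -= 1
    if counter.count == 0:
        held_locks.discard(table)


def drop_table_can_proceed(drop_error_during_exit):
    """Return True if C can take the exclusive lock on T2."""
    counter = LockCounter()
    held_locks = set()
    # A is killed while reading T. With the fix, the late ERROR is dropped
    # and normal cleanup runs; without it, cleanup is skipped.
    run_process(counter, held_locks, "T", cleanup_runs=drop_error_during_exit)
    # B reuses the same counter, reads T2 and exits normally.
    run_process(counter, held_locks, "T2")
    # C needs an exclusive lock on T2 for DROP TABLE.
    return "T2" not in held_locks


print(drop_table_can_proceed(False))  # before the fix: False, C would hang
print(drop_table_can_proceed(True))   # after the fix: True
```

In this model B itself does nothing wrong; it fails to release T2 only because
it inherits a counter that A left at 1, which matches the "lock held by an
unknown process" symptom described above.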
> Illegally killing a JDBC select query causes locking problems
> -
>
> Key: HAWQ-1530
> URL: https://issues.apache.org/jira/browse/HAWQ-1530
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Transaction
> Reporter: Grant Krieger
> Assignee: Yi Jin
> Fix For: 2.3.0.0-incubating
>
>
> Hi,
> When you run a long-running SELECT statement that joins 2 HAWQ tables from
> JDBC and illegally kill the JDBC client (CTRL ALT DEL) before the query
> completes, the 2 tables remain locked even after the query completes on the
> server.
> The lock is visible in pg_locks. One cannot kill the query via SELECT
> pg_terminate_backend(393937). The only way to get rid of it is to kill -9
> from Linux or restart HAWQ, but this can kill other things as well.
> The JDBC client I am using is Aqua Data Studio.
> I can provide exact steps to reproduce if required.
> Thank you
> Grant
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)