[ 
https://issues.apache.org/jira/browse/DERBY-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602354#comment-13602354
 ] 

Brett Bergquist commented on DERBY-5632:
----------------------------------------

I think the deadlocks are the same.  I think we are just triggering this one 
more often.  Release 10.8 added the index statistics daemon, but it turned out 
to have some issues (which I reported).  Because index statistics are 
important for our application and we have tables that are initially empty and 
quickly get filled with data, we disable the index stats daemon (or rather, 
don't enable it) and have a background job that runs an UPDATE_STATISTICS 
command across all tables that appear to have no statistics.  This runs every 
minute in the background and does not seem to have a negative effect, since it 
only runs against those tables that have no index statistics.

So there is probably a timing issue with the FREEZE and the UPDATE_STATISTICS. 

I am going to try to force a reproducible case by increasing the frequency of 
both of these in my utility.  I will remove the actual file system backup 
between the freeze/unfreeze and increase the rate at which the background 
UPDATE_STATISTICS is being done.  If I can get this to happen with a good deal 
of regularity, I will apply your patch and try again.

I will post the results.
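
For anyone who wants to try the same thing, here is a rough sketch of the 
kind of stress loop described above.  It is not my actual utility.  The 
JDBC URL, schema, and table are assumptions passed on the command line, and 
the class name is made up.  It relies only on the documented 
SYSCS_UTIL.SYSCS_FREEZE_DATABASE, SYSCS_UNFREEZE_DATABASE, and 
SYSCS_UPDATE_STATISTICS procedures, and it opens a fresh connection for 
every call, since the original hang showed up while connecting to unfreeze.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical stress harness: one thread updates statistics in a tight loop
// while the main thread freezes and unfreezes the database with no backup
// step in between.
final class FreezeStatsRepro {
    static final String FREEZE = "CALL SYSCS_UTIL.SYSCS_FREEZE_DATABASE()";
    static final String UNFREEZE = "CALL SYSCS_UTIL.SYSCS_UNFREEZE_DATABASE()";
    // A NULL index name asks Derby to update statistics for all indexes on the table.
    static final String UPDATE_STATS =
            "CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS(?, ?, NULL)";

    // Opens a new connection, runs one procedure call, and closes the connection.
    static void call(String url, String sql, String... args) throws SQLException {
        try (Connection c = DriverManager.getConnection(url);
             CallableStatement cs = c.prepareCall(sql)) {
            for (int i = 0; i < args.length; i++) {
                cs.setString(i + 1, args[i]);
            }
            cs.execute();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: FreezeStatsRepro <jdbc-url> <schema> <table>");
            return;
        }
        String url = args[0], schema = args[1], table = args[2];

        // Background UPDATE_STATISTICS at a much higher rate than once a minute.
        Thread stats = new Thread(() -> {
            try {
                while (true) {
                    call(url, UPDATE_STATS, schema, table);
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }, "stats");
        stats.setDaemon(true);
        stats.start();

        // Freeze/unfreeze cycles; if the hang occurs, the cycle output stops.
        for (int i = 0; ; i++) {
            call(url, FREEZE);
            call(url, UNFREEZE);
            System.out.println("cycle " + i + " ok");
        }
    }
}
```

The Derby client driver has to be on the classpath, and the loop runs until 
it is killed or stops making progress.  A thread dump taken at that point 
can be compared with the attached stack.txt.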
                
> Logical deadlock happened when freezing/unfreezing the database
> ---------------------------------------------------------------
>
>                 Key: DERBY-5632
>                 URL: https://issues.apache.org/jira/browse/DERBY-5632
>             Project: Derby
>          Issue Type: Bug
>          Components: Documentation, Services
>    Affects Versions: 10.8.2.2
>         Environment: Oracle M3000/Solaris 10
>            Reporter: Brett Bergquist
>            Assignee: Knut Anders Hatlen
>              Labels: derby_triage10_10
>             Fix For: 10.10.0.0
>
>         Attachments: experimental-v1.diff, experimental-v2.diff, stack.txt
>
>
> Tried to make a quick database backup by freezing the database, performing a 
> ZFS snapshot, and then unfreezing the database.   The database was frozen but 
> then a connection to the database could not be established to unfreeze the 
> database.
> Looking at the stack trace of the network server, I see 3 threads that are 
> trying to process a connection request.   Each of these is waiting on:
>                 at org.apache.derby.impl.store.access.RAMAccessManager.conglomCacheFind(Unknown Source)
>                 - waiting to lock <0xfffffffd3a7fcc68> (a org.apache.derby.impl.services.cache.ConcurrentCache)
> That object is owned by:
>                 - locked <0xfffffffd3a7fcc68> (a org.apache.derby.impl.services.cache.ConcurrentCache)
>                 at org.apache.derby.impl.store.access.RAMTransaction.findExistingConglomerate(Unknown Source)
>                 at org.apache.derby.impl.store.access.RAMTransaction.openGroupFetchScan(Unknown Source)
>                 at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(Unknown Source)
>                 at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.runExplicitly(Unknown Source)
>                 at org.apache.derby.impl.sql.execute.AlterTableConstantAction.updateStatistics(Unknown Source)
> which itself is waiting for the object:
>                 at java.lang.Object.wait(Native Method)
>                 - waiting on <0xfffffffd3ac1d608> (a org.apache.derby.impl.store.raw.log.LogToFile)
>                 at java.lang.Object.wait(Object.java:485)
>                 at org.apache.derby.impl.store.raw.log.LogToFile.flush(Unknown Source)
>                 - locked <0xfffffffd3ac1d608> (a org.apache.derby.impl.store.raw.log.LogToFile)
>                 at org.apache.derby.impl.store.raw.log.LogToFile.flush(Unknown Source)
>                 at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.flush(Unknown Source)
> So basically what I think is happening is this: the database is frozen, and 
> the statistics are being updated on another thread, which has the 
> "org.apache.derby.impl.services.cache.ConcurrentCache" locked and then waits 
> on the LogToFile object.  The connecting threads are waiting to lock 
> "org.apache.derby.impl.services.cache.ConcurrentCache" in order to connect, 
> and it is one of these connections that would unfreeze the database.  Not a 
> deadlock as far as the JVM is concerned, but it will never leave this state 
> either.
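
The shape of that hang can be imitated in plain Java.  The sketch below is 
only an illustration under stated assumptions, not Derby code: CACHE and LOG 
are stand-ins for the ConcurrentCache and LogToFile objects in the stack 
trace, and the class and thread names are made up.  One thread holds CACHE 
and waits on LOG until a "frozen" flag clears, while the only thread that 
would clear the flag is blocked trying to lock CACHE.  The JVM's monitor 
deadlock detector is then consulted to see whether it reports a cycle.

```java
import java.lang.management.ManagementFactory;

// Illustration only: a wait-based hang that is not a monitor deadlock cycle.
final class FreezeHangDemo {
    static final Object CACHE = new Object(); // stand-in for ConcurrentCache
    static final Object LOG = new Object();   // stand-in for LogToFile
    static boolean frozen;                    // guarded by LOG

    // Polls until the thread reaches the given state.
    static void awaitState(Thread t, Thread.State s) throws InterruptedException {
        while (t.getState() != s) {
            Thread.sleep(10);
        }
    }

    static String observe() {
        synchronized (LOG) {
            frozen = true;
        }
        // Like the statistics update: holds CACHE, then waits on LOG while frozen.
        Thread stats = new Thread(() -> {
            synchronized (CACHE) {
                synchronized (LOG) {
                    while (frozen) {
                        try {
                            LOG.wait(); // releases LOG but keeps CACHE locked
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            }
        }, "stats");
        // Like the connection that would unfreeze: needs CACHE before it can act.
        Thread connect = new Thread(() -> {
            synchronized (CACHE) {
                synchronized (LOG) {
                    frozen = false;
                    LOG.notifyAll();
                }
            }
        }, "connect");
        try {
            stats.start();
            awaitState(stats, Thread.State.WAITING);
            connect.start();
            awaitState(connect, Thread.State.BLOCKED);

            // Returns null when no cycle of threads blocked on monitors exists.
            long[] ids = ManagementFactory.getThreadMXBean().findMonitorDeadlockedThreads();
            String result = "stats=" + stats.getState()
                    + " connect=" + connect.getState()
                    + " jvmDeadlock=" + (ids != null);

            // Break the hang from outside so the demo terminates.
            synchronized (LOG) {
                frozen = false;
                LOG.notifyAll();
            }
            stats.join();
            connect.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(observe());
    }
}
```

The waiting thread is parked in Object.wait() rather than blocked on a 
monitor owned by another thread, so there is no lock cycle for the detector 
to find, which is consistent with the thread dump not flagging a deadlock.  
In the demo the main thread clears the flag without needing CACHE; in the 
reported scenario nothing plays that role.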

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
