[ 
https://issues.apache.org/jira/browse/DERBY-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529019#comment-13529019
 ] 

Knut Anders Hatlen commented on DERBY-5632:
-------------------------------------------

I think the way the conglomerate cache is accessed goes against how our 
generic cache implementation is intended to be used. Callers should not need 
to synchronize on the cache instance, as RAMAccessManager.conglomCacheFind() 
and some other callers do.

To take conglomCacheFind() as an example, I think it ideally should have been 
implemented like this:

Conglomerate conglom = null;
CacheableConglomerate entry =
    (CacheableConglomerate) conglom_cache.find(new Long(conglomid));
if (entry != null) {
    conglom = entry.getConglom();
    conglom_cache.release(entry);
}
return conglom;

That is, no explicit synchronization, and let the cache implementation take 
care of faulting in the conglomerate if it's not in the cache.
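
To make the idea concrete, here is a minimal, self-contained sketch of the 
find/release pattern with the cache faulting entries in on a miss. It is only 
an illustration: ToyCacheable, ToyConglomEntry, ToyCache and ConglomFinder are 
hypothetical stand-ins, not Derby's Cacheable, CacheableConglomerate or 
CacheManager, and a String stands in for the Conglomerate. The toy cache relies 
on ConcurrentHashMap.computeIfAbsent() to load each key at most once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for a cacheable object (not Derby's API).
interface ToyCacheable<V> {
    // Called by the cache on a miss; loads the object identified by key.
    ToyCacheable<V> setIdentity(Object key);
    V get();
}

// Toy entry whose setIdentity() actually faults the value in.
class ToyConglomEntry implements ToyCacheable<String> {
    static final AtomicInteger loads = new AtomicInteger(); // counts fault-ins
    private String conglom;

    @Override
    public ToyCacheable<String> setIdentity(Object key) {
        // A real implementation would read the conglomerate from storage here.
        loads.incrementAndGet();
        conglom = "conglomerate-" + key;
        return this;
    }

    @Override
    public String get() {
        return conglom;
    }
}

// Toy cache: the cache itself handles concurrency, so callers take no lock.
class ToyCache {
    private final ConcurrentHashMap<Object, ToyCacheable<String>> entries =
            new ConcurrentHashMap<>();

    ToyCacheable<String> find(Object key) {
        // On a miss, create the entry and let setIdentity() load it.
        return entries.computeIfAbsent(key, k -> new ToyConglomEntry().setIdentity(k));
    }

    void release(ToyCacheable<String> entry) {
        // Nothing to unpin in this toy; a real cache would drop its keep count.
    }
}

class ConglomFinder {
    static final ToyCache cache = new ToyCache();

    // Same shape as the snippet above: no synchronized block around the cache.
    static String conglomCacheFind(long conglomid) {
        String conglom = null;
        ToyCacheable<String> entry = cache.find(Long.valueOf(conglomid));
        if (entry != null) {
            conglom = entry.get();
            cache.release(entry);
        }
        return conglom;
    }
}

class ToyCacheDemo {
    public static void main(String[] args) {
        System.out.println(ConglomFinder.conglomCacheFind(42L));
        System.out.println("loads so far: " + ToyConglomEntry.loads.get());
    }
}
```

The point of the sketch is only that the loading logic lives behind find(), 
so the caller never has to hold a lock on the cache while the load runs.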

However, CacheableConglomerate.setIdentity(), which is where the code that 
faults in the conglomerate is supposed to be, is just an empty shell:

    public Cacheable setIdentity(Object key) throws StandardException
    {
        if (SanityManager.DEBUG) {
            SanityManager.THROWASSERT("not supported.");
        }

        return(null);
    }

I'll have a look and see if it's possible to rewrite the code in a way so that 
we can remove the explicit synchronization on the conglomerate cache instance. 
Hopefully, that would be enough to break the deadlock.
                
> Logical deadlock happened when freezing/unfreezing the database
> ---------------------------------------------------------------
>
>                 Key: DERBY-5632
>                 URL: https://issues.apache.org/jira/browse/DERBY-5632
>             Project: Derby
>          Issue Type: Bug
>          Components: Documentation, Services
>    Affects Versions: 10.8.2.2
>         Environment: Oracle M3000/Solaris 10
>            Reporter: Brett Bergquist
>              Labels: derby_triage10_10
>         Attachments: stack.txt
>
>
> Tried to make a quick database backup by freezing the database, performing a 
> ZFS snapshot, and then unfreezing the database.   The database was frozen but 
> then a connection to the database could not be established to unfreeze the 
> database.
> Looking at the stack trace of the network server, I see 3 threads that are 
> trying to process a connection request.   Each of these is waiting on:
>                 at org.apache.derby.impl.store.access.RAMAccessManager.conglomCacheFind(Unknown Source)
>                 - waiting to lock <0xfffffffd3a7fcc68> (a org.apache.derby.impl.services.cache.ConcurrentCache)
> That object is owned by:
>                 - locked <0xfffffffd3a7fcc68> (a org.apache.derby.impl.services.cache.ConcurrentCache)
>                 at org.apache.derby.impl.store.access.RAMTransaction.findExistingConglomerate(Unknown Source)
>                 at org.apache.derby.impl.store.access.RAMTransaction.openGroupFetchScan(Unknown Source)
>                 at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(Unknown Source)
>                 at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.runExplicitly(Unknown Source)
>                 at org.apache.derby.impl.sql.execute.AlterTableConstantAction.updateStatistics(Unknown Source)
> which itself is waiting for the object:
>                 at java.lang.Object.wait(Native Method)
>                 - waiting on <0xfffffffd3ac1d608> (a org.apache.derby.impl.store.raw.log.LogToFile)
>                 at java.lang.Object.wait(Object.java:485)
>                 at org.apache.derby.impl.store.raw.log.LogToFile.flush(Unknown Source)
>                 - locked <0xfffffffd3ac1d608> (a org.apache.derby.impl.store.raw.log.LogToFile)
>                 at org.apache.derby.impl.store.raw.log.LogToFile.flush(Unknown Source)
>                 at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.flush(Unknown Source)
> So basically what I think is happening is this: the database is frozen, and 
> the statistics are being updated on another thread, which holds the lock on 
> "org.apache.derby.impl.services.cache.ConcurrentCache" and then waits on the 
> LogToFile object. The connecting threads are waiting to lock 
> "org.apache.derby.impl.services.cache.ConcurrentCache" in order to connect, 
> and those connections are where the database would be unfrozen. It is not a 
> deadlock as far as the JVM is concerned, but it will never leave this state 
> either.
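
The shape of the hang described above can be reproduced in miniature. The 
following is a rough sketch under stated assumptions, not Derby code: 
FreezeHangDemo, cacheMonitor and the latches are hypothetical names, a plain 
Object stands in for the ConcurrentCache instance, and a CountDownLatch stands 
in for the log flush that cannot complete while the database is frozen. Unlike 
the real scenario, the demo releases the wait from the main thread so that it 
terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class FreezeHangDemo {
    // Returns the state observed for the "connecting" thread while the
    // "statistics" thread holds the cache monitor and waits for the unfreeze.
    static Thread.State run() {
        final Object cacheMonitor = new Object();                // stand-in for the cache instance
        final CountDownLatch unfrozen = new CountDownLatch(1);   // stand-in for the blocked log flush
        final CountDownLatch holding = new CountDownLatch(1);    // signals that the monitor is held

        Thread stats = new Thread(() -> {
            synchronized (cacheMonitor) {
                holding.countDown();
                try {
                    unfrozen.await(); // waits while still holding the cache monitor
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        Thread connect = new Thread(() -> {
            synchronized (cacheMonitor) {
                // In the reported scenario, this thread would go on to unfreeze.
            }
        });

        try {
            stats.start();
            holding.await();
            connect.start();

            // Poll (bounded to 5 seconds) until the connecting thread is stuck on the monitor.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (connect.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State observed = connect.getState();

            unfrozen.countDown(); // only the demo can do this; the real unfreeze never arrives
            stats.join();
            connect.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connecting thread state: " + run());
    }
}
```

In the reported case the only thread that could end the wait is one of the 
blocked connecting threads, which is why the JVM sees no deadlock cycle 
between monitors and yet nothing ever makes progress.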

