[ https://issues.apache.org/jira/browse/OPENJPA-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258402#comment-13258402 ]
Heath Thomann commented on OPENJPA-2139:
----------------------------------------

Hello Mark!

I worked with Albert and Rick on this issue and we came up with a slightly different change than yours, but I think it is effectively the same idea. Please review the patch I just uploaded; it is named 'OPENJPA-2139-1.2.x.patch'.

First, I used the OPENJPA-2139.mdr.patch which Rick provided a while back. This did not fix some of the issues Albert and I described in later posts to this JIRA (i.e. registration of a transformer). So our approach was to catch, and eat, the exception at the point where we actually attempt to connect to the DB (as well as log a warning message), specifically in JDBCConfigurationImpl.getDBDictionaryInstance. We took the approach that there is no need to allow a connection/SQLException to percolate up the call chain; we figured it best to catch the exception, eat it, and warn the user of the connection issue. In this way, and with Rick's previous patch, an attempt to connect to the DB can be made at a later time (when hopefully the DB is back up).

I noticed that you changed AbstractBrokerFactory.makeReadOnly. This method eventually causes (indirectly) JDBCConfigurationImpl.getDBDictionaryInstance to be called, and when the DB is down getDBDictionaryInstance will cause an exception to be thrown. It appears the 'catch' block you added to AbstractBrokerFactory.makeReadOnly will catch and re-throw this exception as well as reset some state. With my fix, you will no longer receive an exception in some cases (i.e. for the DB connection case where we catch/eat the exception).

I would like you to try my patch in your testing environment to see if it fixes your issue. At the very least I'd like you to describe your scenario so we can understand why you made the changes you made. Our fix has been tested in a JEE environment whereas I believe you are in a JSE env, right?
I've tested a few scenarios in my JEE server where I've cycled the DB at various times, and our fix handles these cases as we'd expect. Furthermore, our fix has been tested by an internal group (which uses our JEE server) that is performing some very rigorous DB failover scenarios, and our change fixes their issues. So, I think our fix is necessary, but the remaining question is whether or not your fix is needed in addition to ours.

Thanks,

Heath

> OpenJPA fails to recover from a broken database on startup
> ----------------------------------------------------------
>
>                 Key: OPENJPA-2139
>                 URL: https://issues.apache.org/jira/browse/OPENJPA-2139
>             Project: OpenJPA
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Mark Struberg
>            Assignee: Mark Struberg
>            Priority: Critical
>             Fix For: 2.3.0
>
>         Attachments: OPENJPA-2139-1.2.x.patch, OPENJPA-2139.mdr.patch, OPENJPA-2139.patch
>
>
> The following scenario:
> 1.) turn off the database
> 2.) perform a query against the database
> 3.) turn on the database
> 4.) try to re-run the query from 2.)
> In 4.) you will get the following Exception:
> <openjpa-2.2.0-r422266:1244990 nonfatal user error> org.apache.openjpa.persistence.ArgumentException: An error occurred while parsing the query filter "SELECT k FROM DbEnumKey AS k where k.type=:typ ORDER BY k.ordinal". Error message: The name "DbEnumKey" is not a recognized entity or identifier. Known entity names: []
> Basically the whole app is stale afterwards!
> Solution: caching the entities might only be done if a connection can be established.
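
For readers following the thread, the catch-and-warn approach described in the comment above can be illustrated with a minimal Java sketch. This is not the code from any of the attached patches and not OpenJPA's actual implementation: `DictionaryLoader` and `LazyDictionary` are hypothetical names invented for this illustration, and a plain `String` stands in for the dictionary instance. The only point it shows is that a failed connection attempt is logged and swallowed without caching anything, so a later call tries again once the DB is back up.

```java
import java.sql.SQLException;
import java.util.logging.Logger;

/** Hypothetical stand-in for work that needs a live DB connection. */
interface DictionaryLoader {
    String load() throws SQLException;
}

/** Hypothetical holder that resolves its value lazily and tolerates DB outages. */
class LazyDictionary {
    private static final Logger LOG =
            Logger.getLogger(LazyDictionary.class.getName());

    private final DictionaryLoader loader;
    private String dictionary; // stays null until a load succeeds

    LazyDictionary(DictionaryLoader loader) {
        this.loader = loader;
    }

    /** Returns the dictionary, or null if the database is unreachable right now. */
    synchronized String get() {
        if (dictionary == null) {
            try {
                dictionary = loader.load();
            } catch (SQLException e) {
                // Eat the exception and warn. Nothing is cached here, so the
                // next call will attempt to connect again.
                LOG.warning("Could not connect to the database; will retry later: "
                        + e.getMessage());
            }
        }
        return dictionary;
    }
}
```

In this sketch callers have to cope with a null result while the database is down. That is the trade-off of eating the exception rather than re-throwing it from a caller further up the chain.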