Hi all, I am facing the same DB issue and trying to fix it. I tried to find the file path for FILE_LOCK=FS to set on gocd DB. Could you help me to locate the file, please?
Thank you, Gulshen On Monday, May 23, 2022 at 3:06:59 AM UTC-4 [email protected] wrote: > When the issue happened (10:24) wrapper log only shows: > INFO | jvm 3 | 2022/05/18 10:23:44 | May 18, 2022 10:23:44 AM > jakarta.mail.Session loadResource > INFO | jvm 3 | 2022/05/18 10:23:44 | WARNING: expected resource not > found: /META-INF/javamail.default.address.map > INFO | jvm 3 | 2022/05/18 10:24:06 | May 18, 2022 10:24:05 AM > jakarta.mail.Session loadResource > INFO | jvm 3 | 2022/05/18 10:24:06 | WARNING: expected resource not > found: /META-INF/javamail.default.address.map > INFO | jvm 3 | 2022/05/18 10:24:18 | [70452.759s][warning][gc,alloc] > qtp1718515850-41: Retried waiting for GCLocker too often allocating 256 > words > INFO | jvm 3 | 2022/05/18 10:24:18 | [70452.759s][warning][gc,alloc] > 105@MessageListener for WorkFinder: Retried waiting for GCLocker too often > allocating 386 words > INFO | jvm 3 | 2022/05/18 10:29:42 | May 18, 2022 10:29:42 AM > jakarta.mail.Session loadResource > INFO | jvm 3 | 2022/05/18 10:29:42 | WARNING: expected resource not > found: /META-INF/javamail.default.address.map > > Having a closer look at the log showed that the OOM situation already > happened one day earlier than expected. On 05/17 at round about 11:30 the > wrapper-log shows the out of memory error messages: > > 2022-05-17 11:30:39,871 ERROR [104@MessageListener for WorkFinder] > JMSMessageListenerAdapter:87 - Exception thrown in message handling by > listener com.thoughtworks.go.server.messaging.scheduling.WorkFinder@4cf95d4c > org.springframework.transaction.TransactionSystemException: Could not roll > back JDBC transaction; nested exception is > org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been > closed [90098-200] > at > org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:329) > at > org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:857) > at > org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:834) > at > org.springframework.transaction.support.TransactionTemplate.rollbackOnException(TransactionTemplate.java:164) > at > org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:137) > at > com.thoughtworks.go.server.transaction.TransactionTemplate.execute(TransactionTemplate.java:28) > at > com.thoughtworks.go.server.service.ScheduleService.rescheduleAbandonedBuildIfNecessary(ScheduleService.java:640) > at > com.thoughtworks.go.server.service.BuildAssignmentService.assignWorkToAgent(BuildAssignmentService.java:183) > at > com.thoughtworks.go.server.service.BuildAssignmentService.assignWorkToAgent(BuildAssignmentService.java:165) > at > com.thoughtworks.go.server.messaging.scheduling.WorkFinder.onMessage(WorkFinder.java:60) > at > com.thoughtworks.go.server.messaging.scheduling.WorkFinder.onMessage(WorkFinder.java:32) > at > com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83) > at > com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63) > at java.base/java.lang.Thread.run(Unknown Source) > Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The > database has been closed [90098-200] > at org.h2.message.DbException.getJdbcSQLException(DbException.java:622) > at org.h2.message.DbException.getJdbcSQLException(DbException.java:429) > at org.h2.message.DbException.get(DbException.java:194) > at org.h2.engine.Session.getTransaction(Session.java:1792) > at org.h2.engine.Session.getStatementSavepoint(Session.java:1804) > at org.h2.engine.Session.setSavepoint(Session.java:915) > at org.h2.command.Command.executeUpdate(Command.java:244) > at > org.h2.jdbc.JdbcConnection.rollbackInternal(JdbcConnection.java:1530) > at org.h2.jdbc.JdbcConnection.rollback(JdbcConnection.java:561) > at > org.apache.commons.dbcp2.DelegatingConnection.rollback(DelegatingConnection.java:781) > at > org.apache.commons.dbcp2.DelegatingConnection.rollback(DelegatingConnection.java:781) > at > org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:326) > ... 13 common frames omitted > 2022-05-17 11:30:54,167 INFO [WrapperJarAppMain] Jetty9Server:199 - > Configuring Jetty using /etc/go/jetty.xml > 2022-05-17 11:30:54,236 WARN [WrapperJarAppMain] Server:357 - > ErrorPageMapper not supported for Server level Error Handling > 2022-05-17 11:30:54,372 WARN [WrapperJarAppMain] AbstractHandler:96 - No > Server set for ResourceHandler@51be8b61{STOPPED} > 2022-05-17 11:30:58,763 WARN [WrapperJarAppMain] ConnectionManager:117 - > The file /etc/go/db.properties specified by `go.db.config` does not exist. > 2022-05-17 11:30:59,154 INFO [WrapperJarAppMain] DatabaseMigrator:40 - > Upgrading database, this might take a while depending on the size of the > database. > 2022-05-17 11:30:59,154 INFO [WrapperJarAppMain] DatabaseMigrator:49 - > ************************************************************************ > 2022-05-17 11:30:59,155 INFO [WrapperJarAppMain] DatabaseMigrator:49 - > WARNING: Shutting down your server at this point will lead to a database > corruption. Please wait until the database upgrade completes. > 2022-05-17 11:30:59,155 INFO [WrapperJarAppMain] DatabaseMigrator:49 - > ************************************************************************ > 2022-05-17 11:31:01,356 INFO [WrapperJarAppMain] DatabaseMigrator:57 - > Database upgrade completed successfully. > 2022-05-17 11:31:01,357 INFO [WrapperJarAppMain] DataMigrationRunner:34 - > Running data migrations... > 2022-05-17 11:31:01,388 INFO [WrapperJarAppMain] DataMigrationRunner:49 - > Data migration took 3 ms > 2022-05-17 11:31:01,391 INFO [WrapperJarAppMain] DataMigrationRunner:49 - > Data migration took 0 ms > 2022-05-17 11:31:01,391 INFO [WrapperJarAppMain] DataMigrationRunner:39 - > Data migrations completed. > 2022-05-17 11:31:02,206 WARN [WrapperJarAppMain] ConfigurationFactory:136 > - No configuration found. Configuring ehcache from ehcache-failsafe.xml > found in the classpath: > jar:file:/var/lib/go-server/work/jetty-0_0_0_0-8153-cruise_war-_go-any-/webapp/WEB-INF/lib/ehcache-2.10.9.2.jar!/ehcache-failsafe.xml > 2022-05-17 11:31:02,383 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration [com.thoughtworks.go.domain.AccessToken]; > using defaults. > 2022-05-17 11:31:02,436 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration > [com.thoughtworks.go.domain.NotificationFilter]; using defaults. > 2022-05-17 11:31:02,538 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration [com.thoughtworks.go.domain.User]; using > defaults. > 2022-05-17 11:31:02,560 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration [com.thoughtworks.go.domain.Plugin]; using > defaults. > 2022-05-17 11:31:02,570 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration > [com.thoughtworks.go.domain.EnvironmentVariable]; using defaults. > 2022-05-17 11:31:02,705 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration > [com.thoughtworks.go.domain.User.notificationFilters]; using defaults. > 2022-05-17 11:31:02,789 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration [org.hibernate.cache.UpdateTimestampsCache]; > using defaults. > 2022-05-17 11:31:02,826 WARN [WrapperJarAppMain] EhCacheProvider:93 - > Could not find configuration [org.hibernate.cache.StandardQueryCache]; > using defaults. > 2022-05-17 11:31:04,815 WARN [WrapperJarAppMain] BrokerService:2163 - > Temporary Store limit is 51200 mb (current store usage is 0 mb). The data > directory: /var/lib/go-server only has 45488 mb of usable space. - > resetting to maximum available disk space: 45488 mb > 2022-05-17 11:31:04,961 INFO [WrapperJarAppMain] ConnectionManager:98 - > Done loading query extensions, found > com.thoughtworks.go.server.database.h2.H2QueryExtensions@6a25c389 > 2022-05-17 11:31:08,045 INFO [WrapperJarAppMain] GoConfigMigration:93 - > Upgrading config file from version 139 to version 139 > 2022-05-17 11:31:08,097 INFO [WrapperJarAppMain] GoConfigMigration:101 - > Finished upgrading config file > 2022-05-17 11:31:08,097 INFO [WrapperJarAppMain] GoConfigMigrator:106 - > [Config Save] Starting Config Save post upgrade using > FullConfigSaveNormalFlow > 2022-05-17 11:31:08,183 INFO [Thread-76] > DefaultPluginJarChangeListener:67 - Plugin load starting: > /var/lib/go-server/plugins/bundled/gocd-ldap-authentication-plugin.jar > > Afterwards it seems the server is running as usual, until the OOM and > "locked" issue is occurring 23 hours later. So I can imagine that you are > right and it is doing some "unclean restart". I suppose that GoCD is > innocent, since you can never plan how to act when out of memory occurs, > that's unpredictable. I assume that raising the size will solve the issue. > > Thanks again for all the support, this is really a great community. > Chad Wilson schrieb am Freitag, 20. Mai 2022 um 10:38:37 UTC+2: > >> No problem! >> >> Perhaps what is happening is that after the OOM has occurred, it is being >> detected by the "wrapper" process shipped with GoCD with and the Java >> process is being restarted. If so, you should be able to see that >> auto-restart happening in the logs, including the go-server-wrapper.log. >> >> Normally I'd expect that to ensure the file lock is cleanly released >> before the new process tries to acquire it, but perhaps it is not shutting >> down cleanly, or there is something unusual about the file system. >> >> If it keeps happening, in addition to lsof, perhaps lslocks (or looking >> at /proc/locks) is of use to see which PIDs have locked which files more >> explicitly. >> >> In any case, fixing the OOMs is probably a good idea regardless of this, >> so it seems sensible to try with a larger heap to address that issue first. >> >> -Chad >> >> >> On Fri, May 20, 2022 at 3:39 PM AquilaNiger <[email protected]> wrote: >> >>> Hi Chad, >>> >>> thanks for your support. >>> >>>> >>>> - Does the server recover after some time, or you need to restart >>>> GoCD or take some other action to fix it? >>>> >>>> No it does not recover, I have to restart GoCD. >>> >>>> >>>> - How are you running GoCD? i.e in which environment? Container? >>>> Standard server? >>>> >>>> Standard Server on a Ubuntu 18.04.6 LTS virtual machine >>> >>>> >>>> - Is your DB file on some kind of network mount or something like >>>> that? >>>> >>>> No, it isn't. >>> >>>> >>>> - Is there a way to verify there aren't multiple processes/GoCD >>>> Instances trying to access the file? >>>> - when it happens, are you able to use OS-level commands such as >>>> lsof to see if other/multiple processes have handles on the DB file >>>> (depends on whether storage is local) >>>> >>>> Currently this happens only once in a while (last time there were 6 >>> days between the database issues). lsof is a good idea! I'll try that the >>> next time it happens. >>> >>>> >>>> - Would be good to confirm you don't see GoCD crashing or getting >>>> auto-restarted in your logs to rule out GoCD itself having a different >>>> problem, and then this problem is being caused by a zombie GoCD process >>>> or >>>> some kind of stale lock which takes time to expire. >>>> >>>> Actually, we found out with yesterdays go-server.log that the root >>> cause seems to be Out of Memory of Java: >>> 2022-05-18 10:24:18,741 WARN [105@MessageListener for WorkFinder] >>> BasicDataSource:58 - An internal object pool swallowed an Exception. >>> org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been >>> closed; SQL statement: ROLLBACK [90098-200] >>> ... >>> 2022-05-18 10:24:18,742 WARN [105@MessageListener for WorkFinder] >>> BasicDataSource:58 - An internal object pool swallowed an Exception. >>> org.h2.jdbc.JdbcSQLNonTransientConnectionException: Out of memory. >>> [90108-200] >>> ... >>> Caused by: java.lang.OutOfMemoryError: Java heap space >>> >>> Actually we haven't touched heap size of GoCD up to now. Therefore, we >>> increased it now in wrapper-properties.conf and hope that the error will be >>> gone. I hope that does not only defer the error to some days later. >>> >>>> >>>> - Do you have any overrides to DB configuration, e.g a custom >>>> *config/db.properties* file? >>>> >>>> No. >>> >>> >>>> To answer your question on the trace files, I think you get two files >>>> when the main trace file reaches an H2-configured maximum size. I ask the >>>> above question on DB properties as I think GoCD sets that to 16MB by >>>> default, whereas yours seems to have got to 64MB which seems curious. >>>> >>> >>> Thanks, that explains a lot. You're right, the "old" file contains >>> timestamps from 3:16 to 6:38 and the new one from 6:38 to 8:16. >>> >>> >>>> There is a way to change the locking approach H2 uses >>>> <https://www.h2database.com/html/advanced.html#file_locking_protocols> >>>> (back to the older ;FILE_LOCK=FS - which creates the stale >>>> cruise.lock.db you have in your screenshot) if the issue is with the >>>> filesystem, however I imagine you'd want to rule out multiple processes or >>>> some other issue first. >>>> >>> >>> Thanks for the hint, I'll keep that in mind as "last resort". >>> >>> Thank you for your support. I think we'll wait now if the error occurs >>> again. >>> >>> Julia >>> >>> -- >>> >> You received this message because you are subscribed to the Google Groups >>> "go-cd" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> >> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/go-cd/8c61c76e-bb7b-474c-bfa5-1f623794fce4n%40googlegroups.com >>> >>> <https://groups.google.com/d/msgid/go-cd/8c61c76e-bb7b-474c-bfa5-1f623794fce4n%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >> -- You received this message because you are subscribed to the Google Groups "go-cd" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/go-cd/001fafdf-ac59-440d-a802-73cde8ee6ea4n%40googlegroups.com.
