Dear All,

Thank you for your suggestions.

Just an update: I successfully migrated the database from H2 to PostgreSQL (Amazon Aurora).

It has been two weeks since the migration, and the GoCD server self-restarts and DB lock issues have not happened again.

I am sharing my DB migration procedure below. I hope it helps anyone else facing this issue.

*Procedure*

*1. Provision the DB (Amazon Aurora)*

psql -h rdsdbhost.rds.amazonaws.com -U gocd -d postgres

Create the role and database in PostgreSQL:

CREATE ROLE "gocd_database_user" PASSWORD 'gocd_database_password' 
NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN;
CREATE DATABASE "gocd" ENCODING="UTF8" TEMPLATE="template0";
GRANT ALL PRIVILEGES ON DATABASE "gocd" TO "gocd_database_user";

*The migration user must be the database owner*

ALTER DATABASE gocd OWNER TO gocd_database_user;
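
To double-check before migrating, a quick query like the one below should show gocd_database_user as the owner of the gocd database (a sketch using the same names as above):

psql -h rdsdbhost.rds.amazonaws.com -U gocd -d postgres \
  -c "SELECT datname, pg_catalog.pg_get_userbyid(datdba) AS owner FROM pg_database WHERE datname = 'gocd';"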

*2. Provision an EC2 instance*

*3. Prepare the DB migration tool on the EC2 instance*

*Install Java*

yum install java-1.11.0-openjdk (Amazon Linux 2)
or
yum install java-11-openjdk  (Redhat)
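
Whichever package your distro provides, it may be worth confirming the runtime before running the migrator:

java -version   # should report a Java 11 runtime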

*Download and unpack the DB migration tool*

curl -L -o gocd-database-migrator-1.0.4.tgz \
  https://github.com/gocd/gocd-database-migrator/releases/download/1.0.4-229-exp/gocd-database-migrator-1.0.4.tgz

gunzip gocd-database-migrator-1.0.4.tgz

tar xf gocd-database-migrator-1.0.4.tar

*4. Mount EFS on the EC2 instance*

mkdir -p /gocd/godata

mount -t efs -o tls fs-efsid:/gocd/godata /gocd/godata

*5. Enable GoCD maintenance mode*
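
You can enable it from the GoCD UI or via the maintenance mode API. A sketch using curl (the admin credentials and server URL here are placeholders, adjust to your setup):

curl -u 'admin_user:admin_password' \
  -H 'X-GoCD-Confirm: true' \
  -H 'Accept: application/vnd.go.cd.v1+json' \
  -X POST 'https://gocd.example.com/go/api/admin/maintenance_mode/enable'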

*6. Stop the GoCD server*
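
In my case the server runs on EKS via the Helm chart, so stopping it means scaling the server deployment down to zero and waiting for the pod to terminate so the H2 file is no longer in use. A sketch (the namespace and deployment name are assumptions, check yours with kubectl get deploy):

kubectl -n gocd scale deployment gocd-server --replicas=0
kubectl -n gocd get pods   # wait until the server pod is gone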

*7. Run the DB migration tool*

cd gocd-database-migrator-1.0.4

./bin/gocd-database-migrator \
        --insert \
        --progress \
        --source-db-url='jdbc:h2:/gocd/godata/db/h2db/cruise' \
        --target-db-url='jdbc:postgresql://rdsdbhost.rds.amazonaws.com:5432/gocd' \
        --target-db-user='gocd_database_user' \
        --target-db-password='gocd_database_password'
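
Before switching GoCD over, it can be worth sanity-checking that the data actually landed in the new database, for example by comparing a few row counts with the old H2 data (a sketch using the credentials from step 1):

psql -h rdsdbhost.rds.amazonaws.com -U gocd_database_user -d gocd \
  -c 'SELECT COUNT(*) FROM pipelines;' \
  -c 'SELECT COUNT(*) FROM stages;'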

*8. Change the DB config*

cd /gocd/godata/config

*Create db.properties file*

db.driver=org.postgresql.Driver
db.url=jdbc:postgresql://rdsdbhost.rds.amazonaws.com:5432/gocd
db.user=gocd_database_user
db.password=gocd_database_password

*Set db.properties ownership and permissions*

chown 1000:1000 db.properties

chmod 644 db.properties
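
Before starting GoCD, it may also help to confirm that the credentials in db.properties can actually reach Aurora, for example:

PGPASSWORD='gocd_database_password' psql -h rdsdbhost.rds.amazonaws.com -p 5432 \
  -U gocd_database_user -d gocd -c '\conninfo'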

*9. Start the GoCD server*
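
This is the reverse of steps 5 and 6: scale the server deployment back up, wait for it to become healthy, then turn maintenance mode off again (same placeholder names/URL as above):

kubectl -n gocd scale deployment gocd-server --replicas=1

curl -u 'admin_user:admin_password' \
  -H 'X-GoCD-Confirm: true' \
  -H 'Accept: application/vnd.go.cd.v1+json' \
  -X POST 'https://gocd.example.com/go/api/admin/maintenance_mode/disable'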

Best Regards,
Komgrit


On Thursday, September 12, 2024 at 6:07:20 PM UTC+7 Sriram Narayanan wrote:

> On Thu, Sep 12, 2024 at 6:33 PM Chad Wilson <[email protected]> 
> wrote:
>
>> The warnings on materials not matching and upstream pipelines may be able 
>> to be ignored. Not relevant to this. Similar with the maximum backtracking 
>> limit (unrelated problem to this issue).
>>
>> But
>>
>>    1. is the GoCD server process being restarted after these "The 
>>    database has been closed" errors? *Before* you then start seeing the 
>>    locking errors? (e.g do you see it logging from something like 
>> Jetty9Server:199 
>>    - Configuring Jetty using /etc/go/jetty.xml again, which only happens 
>>    at )
>>    2. Are there other errors before "The database has been closed 
>>    [90098-200]" if you go back further looking for stack traces or "Out of 
>>    memory"?
>>
>>
>> There are other threads which describe very similar problems, and similar 
>> to them you probably need to keep finding your root cause:
>> https://groups.google.com/g/go-cd/c/KPyCqTpxS-k/m/61Ps4wHvDQAJ
>> https://groups.google.com/g/go-cd/c/4yuK8dx8m-Q/m/dpre3JAhAgAJ
>>
>> Note that the users both traced back H2DB issues to "Out of memory" 
>> errors. Switching to Postgres is unlikely to fix memory problems, which is 
>> why it's important to eliminate this, in my opinion.
>>
>
> After reading through the various messages, I am inclined to agree with 
> Chad. While switching to Postgres has its own benefits, it is wise to 
> identify and address the root cause.
>
> Komgrit, do you have backups configured? See: 
> https://docs.gocd.org/current/advanced_usage/one_click_backup.html
>  
>
>>
>> -Chad
>>
>> On Thu, Sep 12, 2024 at 4:08 PM Komgrit Aneksri <[email protected]> 
>> wrote:
>>
>>> Thank you Chad,
>>>
>>> I dug into the GoCD server logs from before the DB locked up.
>>>
>>> *I kept finding many error messages like the ones below.*
>>>
>>> 2024-09-10 08:14:09,413 WARN  [118@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: ar-eod-service-deploy-prod. Possible Reasons: (1) 
>>> Upstream pipelines have not been built yet. (2) Materials do not match 
>>> between configuration and build-cause.
>>> 2024-09-10 08:14:09,416 WARN  [120@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: thinslice-eligibility-service-deploy-nonProd. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:09,437 WARN  [122@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: mercury-bff-order-prod. Possible Reasons: (1) Upstream 
>>> pipelines have not been built yet. (2) Materials do not match between 
>>> configuration and build-cause.
>>> 2024-09-10 08:14:09,441 WARN  [121@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: digital-help-ios-payment-publish-6.11. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:09,446 WARN  [121@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-litigation-web-keyfong-ka-deploy-prod. 
>>> Possible Reasons: (1) Upstream pipelines have not been built yet. (2) 
>>> Materials do not match between configuration and build-cause.
>>> 2024-09-10 08:14:09,447 WARN  [113@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: sonarqube-venus-backend. Possible Reasons: (1) 
>>> Upstream pipelines have not been built yet. (2) Materials do not match 
>>> between configuration and build-cause.
>>> 2024-09-10 08:14:09,449 WARN  [118@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-enforcement-bff-deploy-qa. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:09,451 WARN  [122@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: ar-eod-cdc-deploy-prod. Possible Reasons: (1) Upstream 
>>> pipelines have not been built yet. (2) Materials do not match between 
>>> configuration and build-cause.
>>> 2024-09-10 08:14:09,472 WARN  [114@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: deep-product-deploy-prod. Possible Reasons: (1) 
>>> Upstream pipelines have not been built yet. (2) Materials do not match 
>>> between configuration and build-cause.
>>> 2024-09-10 08:14:09,480 WARN  [121@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: portal-goloyalty-digital-campaign-prod-deployment. 
>>> Possible Reasons: (1) Upstream pipelines have not been built yet. (2) 
>>> Materials do not match between configuration and build-cause.
>>> 2024-09-10 08:14:09,503 WARN  [117@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-frontend-notification-service-deploy-qa. 
>>> Possible Reasons: (1) Upstream pipelines have not been built yet. (2) 
>>> Materials do not match between configuration and build-cause.
>>> 2024-09-10 08:14:09,512 ERROR [117@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:220 - Error while 
>>> scheduling pipeline: ar-loan-job-deploy-prod
>>> com.thoughtworks.go.server.service.dd.MaxBackTrackLimitReachedException: 
>>> Maximum Backtracking limit reached while trying to resolve revisions for 
>>> material 
>>> DependencyMaterialConfig{pipelineName='ar-loan-job-deploy-nonProd', 
>>> stageName='Deployment-uat'}
>>> at 
>>> com.thoughtworks.go.server.service.dd.DependencyFanInNode.hasMoreInstances(DependencyFanInNode.java:233)
>>> at 
>>> com.thoughtworks.go.server.service.dd.DependencyFanInNode.fillNextRevisions(DependencyFanInNode.java:122)
>>> at 
>>> com.thoughtworks.go.server.service.dd.DependencyFanInNode.handleNeedMoreRevisions(DependencyFanInNode.java:83)
>>> at 
>>> com.thoughtworks.go.server.service.dd.DependencyFanInNode.initRevision(DependencyFanInNode.java:75)
>>> at 
>>> com.thoughtworks.go.server.service.dd.DependencyFanInNode.populateRevisions(DependencyFanInNode.java:61)
>>> at 
>>> com.thoughtworks.go.server.service.dd.FanInGraph.initChildren(FanInGraph.java:311)
>>> at 
>>> com.thoughtworks.go.server.service.dd.FanInGraph.computeRevisions(FanInGraph.java:174)
>>> at 
>>> com.thoughtworks.go.server.service.PipelineService.getRevisionsBasedOnDependencies(PipelineService.java:219)
>>> at 
>>> com.thoughtworks.go.server.service.AutoBuild.fanInOn(AutoBuild.java:108)
>>> at 
>>> com.thoughtworks.go.server.service.AutoBuild.onModifications(AutoBuild.java:67)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:191)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:148)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.autoSchedulePipeline(BuildCauseProducerService.java:110)
>>> at 
>>> com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:44)
>>> at 
>>> com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:24)
>>> at 
>>> com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
>>> at 
>>> com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
>>> at java.base/java.lang.Thread.run(Unknown Source)
>>>
>>>
>>> *And when the DB locked up, there were messages like these:*
>>>
>>> 2024-09-10 08:14:18,425 INFO  [qtp1814840342-32487118] Stage:236 - Stage 
>>> is being completed by transition id: 2129759
>>> 2024-09-10 08:14:18,623 WARN  [121@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: application-domain-security-group-prod. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,626 WARN  [117@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-enforcement-bff-deploy-prod. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,632 WARN  [118@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-enforcement-service-deploy-qa. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,637 WARN  [115@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-litigation-ka-bff-deploy-prod. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,641 WARN  [120@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-recovery-work-list-deploy-qa. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,644 WARN  [116@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, 
>>> SQLState: 90098
>>> 2024-09-10 08:14:18,644 WARN  [115@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, 
>>> SQLState: 90098
>>> 2024-09-10 08:14:18,660 ERROR [115@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been 
>>> closed [90098-200]
>>> 2024-09-10 08:14:18,658 ERROR [116@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been 
>>> closed [90098-200]
>>> 2024-09-10 08:14:18,647 WARN  [122@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-litigation-web-ka-deploy-prod. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,644 WARN  [121@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, 
>>> SQLState: 90098
>>> 2024-09-10 08:14:18,660 ERROR [121@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been 
>>> closed [90098-200]
>>> 2024-09-10 08:14:18,665 WARN  [122@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-onescreen-cdc-deploy-uat. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,686 WARN  [114@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, 
>>> SQLState: 90098
>>> 2024-09-10 08:14:18,696 WARN  [119@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: collections-recovery-web-ui-deploy-qa. Possible 
>>> Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do 
>>> not match between configuration and build-cause.
>>> 2024-09-10 08:14:18,686 ERROR [114@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been 
>>> closed [90098-200]
>>> 2024-09-10 08:14:18,706 WARN  [122@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, 
>>> SQLState: 90098
>>> 2024-09-10 08:14:18,706 ERROR [122@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been 
>>> closed [90098-200]
>>> 2024-09-10 08:14:18,713 WARN  [119@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:175 - Error while 
>>> scheduling pipeline: mercury-bff-report-test-env. Possible Reasons: (1) 
>>> Upstream pipelines have not been built yet. (2) Materials do not match 
>>> between configuration and build-cause.
>>> 2024-09-10 08:14:18,715 WARN  [118@MessageListener for 
>>> ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, 
>>> SQLState: 90098
>>> 2024-09-10 08:14:18,719 ERROR [116@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:220 - Error while 
>>> scheduling pipeline: collections-litigation-cdc-deploy-qa
>>> org.springframework.dao.DataAccessResourceFailureException: Hibernate 
>>> operation: could not execute query; SQL [SELECT materials.id FROM 
>>> pipelineMaterialRevisions INNER JOIN pipelines ON 
>>> pipelineMaterialRevisions.pipelineId = pipelines.id INNER JOIN 
>>> modifications on modifications.id  = 
>>> pipelineMaterialRevisions.torevisionId INNER JOIN materials on 
>>> modifications.materialId = materials.id WHERE materials.id = ? AND 
>>> pipelineMaterialRevisions.toRevisionId >= ? AND 
>>> pipelineMaterialRevisions.fromRevisionId <= ? AND pipelines.name = ? 
>>> GROUP BY materials.id;]; The database has been closed [90098-200]; 
>>> nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: The 
>>> database has been closed [90098-200]
>>> at 
>>> org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:79)
>>> at 
>>> org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
>>> at 
>>> org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:82)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateAccessor.convertJdbcAccessException(HibernateAccessor.java:428)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateAccessor.convertHibernateAccessException(HibernateAccessor.java:414)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:416)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateTemplate.execute(HibernateTemplate.java:342)
>>> at 
>>> com.thoughtworks.go.server.persistence.MaterialRepository.hasPipelineEverRunWith(MaterialRepository.java:853)
>>> at 
>>> com.thoughtworks.go.server.materials.MaterialChecker.hasPipelineEverRunWith(MaterialChecker.java:100)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:186)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:148)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.autoSchedulePipeline(BuildCauseProducerService.java:110)
>>> at 
>>> com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:44)
>>> at 
>>> com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:24)
>>> at 
>>> com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
>>> at 
>>> com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
>>> at java.base/java.lang.Thread.run(Unknown Source)
>>> Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The 
>>> database has been closed [90098-200]
>>> at org.h2.message.DbException.getJdbcSQLException(DbException.java:622)
>>> at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
>>> at org.h2.message.DbException.get(DbException.java:205)
>>> at org.h2.message.DbException.get(DbException.java:181)
>>> at org.h2.message.DbException.get(DbException.java:170)
>>> at org.h2.engine.Database.checkPowerOff(Database.java:506)
>>> at org.h2.command.Command.executeQuery(Command.java:224)
>>> at 
>>> org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:114)
>>> at 
>>> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>> at 
>>> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>> at 
>>> org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)
>>> at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
>>> at org.hibernate.loader.Loader.doQuery(Loader.java:802)
>>> at 
>>> org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
>>> at org.hibernate.loader.Loader.doList(Loader.java:2542)
>>> at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
>>> at org.hibernate.loader.Loader.list(Loader.java:2271)
>>> at org.hibernate.loader.custom.CustomLoader.list(CustomLoader.java:316)
>>> at org.hibernate.impl.SessionImpl.listCustomQuery(SessionImpl.java:1842)
>>> at 
>>> org.hibernate.impl.AbstractSessionImpl.list(AbstractSessionImpl.java:165)
>>> at org.hibernate.impl.SQLQueryImpl.list(SQLQueryImpl.java:157)
>>> at 
>>> com.thoughtworks.go.server.persistence.MaterialRepository.lambda$hasPipelineEverRunWith$10(MaterialRepository.java:876)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:411)
>>> ... 11 common frames omitted
>>> 2024-09-10 08:14:18,719 ERROR [121@MessageListener for 
>>> ScheduleCheckListener] BuildCauseProducerService:220 - Error while 
>>> scheduling pipeline: ar-loan-infrastructure-rds-snapshot-prod
>>> org.springframework.dao.DataAccessResourceFailureException: Hibernate 
>>> operation: could not execute query; SQL [SELECT materials.id FROM 
>>> pipelineMaterialRevisions INNER JOIN pipelines ON 
>>> pipelineMaterialRevisions.pipelineId = pipelines.id INNER JOIN 
>>> modifications on modifications.id  = 
>>> pipelineMaterialRevisions.torevisionId INNER JOIN materials on 
>>> modifications.materialId = materials.id WHERE materials.id = ? AND 
>>> pipelineMaterialRevisions.toRevisionId >= ? AND 
>>> pipelineMaterialRevisions.fromRevisionId <= ? AND pipelines.name = ? 
>>> GROUP BY materials.id;]; The database has been closed [90098-200]; 
>>> nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: The 
>>> database has been closed [90098-200]
>>> at 
>>> org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:79)
>>> at 
>>> org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
>>> at 
>>> org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:82)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateAccessor.convertJdbcAccessException(HibernateAccessor.java:428)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateAccessor.convertHibernateAccessException(HibernateAccessor.java:414)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:416)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateTemplate.execute(HibernateTemplate.java:342)
>>> at 
>>> com.thoughtworks.go.server.persistence.MaterialRepository.hasPipelineEverRunWith(MaterialRepository.java:853)
>>> at 
>>> com.thoughtworks.go.server.materials.MaterialChecker.hasPipelineEverRunWith(MaterialChecker.java:100)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:186)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:148)
>>> at 
>>> com.thoughtworks.go.server.scheduling.BuildCauseProducerService.autoSchedulePipeline(BuildCauseProducerService.java:110)
>>> at 
>>> com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:44)
>>> at 
>>> com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:24)
>>> at 
>>> com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
>>> at 
>>> com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
>>> at java.base/java.lang.Thread.run(Unknown Source)
>>> Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The 
>>> database has been closed [90098-200]
>>> at org.h2.message.DbException.getJdbcSQLException(DbException.java:622)
>>> at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
>>> at org.h2.message.DbException.get(DbException.java:194)
>>> at org.h2.engine.Session.getTransaction(Session.java:1792)
>>> at 
>>> org.h2.engine.Session.startStatementWithinTransaction(Session.java:1815)
>>> at org.h2.command.Command.executeQuery(Command.java:190)
>>> at 
>>> org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:114)
>>> at 
>>> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>> at 
>>> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>> at 
>>> org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)
>>> at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
>>> at org.hibernate.loader.Loader.doQuery(Loader.java:802)
>>> at 
>>> org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
>>> at org.hibernate.loader.Loader.doList(Loader.java:2542)
>>> at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
>>> at org.hibernate.loader.Loader.list(Loader.java:2271)
>>> at org.hibernate.loader.custom.CustomLoader.list(CustomLoader.java:316)
>>> at org.hibernate.impl.SessionImpl.listCustomQuery(SessionImpl.java:1842)
>>> at 
>>> org.hibernate.impl.AbstractSessionImpl.list(AbstractSessionImpl.java:165)
>>> at org.hibernate.impl.SQLQueryImpl.list(SQLQueryImpl.java:157)
>>> at 
>>> com.thoughtworks.go.server.persistence.MaterialRepository.lambda$hasPipelineEverRunWith$10(MaterialRepository.java:876)
>>> at 
>>> org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:411)
>>> ... 11 common frames omitted
>>>
>>> Best Regards,
>>> Komgrit
>>> On Wednesday, September 11, 2024 at 2:11:45 PM UTC+7 Chad Wilson wrote:
>>>
>>>> Before we get into memory stats, did you look at the logs to see if the 
>>>> server is being restarted internally as I described below? No point 
>>>> looking 
>>>> at memory stats unless we have evidence there is a memory problem.
>>>>
>>>> You generally cannot use OS-level stats to debug memory usage for Java 
>>>> applications like GoCD on its own - you need to look at the internal Java 
>>>> heap used/free stats. You might have available memory at container/host 
>>>> level, but the JVM is not using it due to the settings, and so you are 
>>>> still running out of memory. Furthermore, the increase you show is almost 
>>>> all file buffer/cache rather than application usage.
>>>>
>>>> If I recall correctly, by default the GoCD server only starts with a 
>>>> max heap size of 1G which is relatively small for a bigger server with 
>>>> perhaps hundreds of pipelines, however we should try to find evidence of 
>>>> that before randomly changing things or going deeper.
>>>>
>>>> -Chad
>>>>
>>>> On Wed, Sep 11, 2024 at 12:31 PM Komgrit Aneksri <[email protected]> 
>>>> wrote:
>>>>
>>>>> Thank you, Chad, for helping me investigate and for your suggestions.
>>>>>
>>>>> Here is more information about my GoCD server resources.
>>>>>
>>>>> We are using c6g.2xlarge worker nodes.
>>>>>
>>>>> Currently, CPU usage is 40 - 45 %
>>>>>
>>>>> After a restart, GoCD has around 3 GB of memory free; then the GoCD server runs.
>>>>>
>>>>> After restart
>>>>> bash-5.1$ free -m
>>>>>                total        used        free      shared  buff/cache   
>>>>> available
>>>>> Mem:           15678        3037        4500           3        8395   
>>>>>     12640
>>>>> Swap:              0           0           0
>>>>>
>>>>> After a while, free memory has dropped to 260 MB:
>>>>> bash-5.1$ free -m
>>>>>                total        used        free      shared  buff/cache   
>>>>> available
>>>>> Mem:           15678        3394         260           3       12278   
>>>>>     12283
>>>>> Swap:              0           0           0
>>>>>
>>>>> The JVM is using its default settings.
>>>>>
>>>>> Regards,
>>>>> Komgrit
>>>>>
>>>>> On Wednesday, September 11, 2024 at 9:38:20 AM UTC+7 Chad Wilson wrote:
>>>>>
>>>>>> If this has never happened before, and only just started happening, 
>>>>>> then *something* must have changed. Might be worth figuring that out.
>>>>>>
>>>>>> A database becomes locked like this only when two instances are 
>>>>>> trying to connect to the same H2 database file, or one crashed somehow 
>>>>>> without releasing the lock. Probably need to see the full error/stack 
>>>>>> trace 
>>>>>> to see the root cause, however usually it's something like "Caused by: 
>>>>>> java.lang.IllegalStateException: The file is locked: 
>>>>>> nio:/godata/db/h2db/cruise.mv.db [1.4.200/7]"
>>>>>>
>>>>>> I suggest you look inside the GoCD server log file more directly, not 
>>>>>> just k8s stats. GoCD runs as a multi-process container, and has its own 
>>>>>> process manager (Tanuki Java wrapper) so it is possible that even 
>>>>>> without 
>>>>>> Kubernetes showing container or pod restarts that GoCD itself has been 
>>>>>> restarted by the Tanuki process manager. it will log when it does so. 
>>>>>> I'd 
>>>>>> look for when the errors started, and then scroll back through the 
>>>>>> container logs to see if the process was restarted by Tanuki. It will 
>>>>>> restart the main JVM if it thinks the main server process is not 
>>>>>> responding, or due to OOM errors etc. Perhaps the lock is not being 
>>>>>> released fast enough. Anyway - your root problem may be heap 
>>>>>> size/memory/CPU constraints rather than the database itself.
>>>>>>
>>>>>> Even if you use Postgres, if you have cases where there are two GoCD 
>>>>>> server instances overlapping or trying to share the database file you 
>>>>>> will 
>>>>>> have other issues of some sort (due to race conditions) and if you have 
>>>>>> some other server stability issue causing restarts it's probably wise to 
>>>>>> understand how it is getting into this state first so you're addressing 
>>>>>> the 
>>>>>> right problem.
>>>>>>
>>>>>> As for migration to Postgres, the docs are at 
>>>>>> https://github.com/gocd/gocd-database-migrator . There's nothing 
>>>>>> specific for EKS/Kubernetes however generally speaking you'd need to
>>>>>>
>>>>>>    - prepare your postgres instance per 
>>>>>>    
>>>>>> https://docs.gocd.org/current/installation/configuring_database/postgres.html
>>>>>>    - (when ready to do the "proper" run) stop your GoCD server 
>>>>>>    instance
>>>>>>    - get your H2 DB file off EFS somewhere to run the migration tool 
>>>>>>    against
>>>>>>    - run the migrator tool
>>>>>>    - change GoCD server Helm chart to mount the db.properties that 
>>>>>>    tell it how to connect to Postgres
>>>>>>    - start the GoCD server instances against postgres
>>>>>>
>>>>>>
>>>>>> -Chad
>>>>>>
>>>>>> On Wed, Sep 11, 2024 at 9:56 AM Komgrit Aneksri <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>> I have not made any configuration changes.
>>>>>>>
>>>>>>> But we do add new users and new pipelines every day.
>>>>>>>
>>>>>>> The pod is still in Running status, and it has not been restarted, respawned, or evicted.
>>>>>>>
>>>>>>> If we have to migrate from H2 to PostgreSQL, do you have any migration documentation for K8s?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Komgrit
>>>>>>> On Tuesday, September 10, 2024 at 10:32:31 PM UTC+7 Chad Wilson 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> What changed in your setup when this started happening?
>>>>>>>>
>>>>>>>> Is your GoCD server pod crashing and being automatically restarted? 
>>>>>>>> Are nodes it is running on dying and the pod being re-scheduled 
>>>>>>>> elsewhere?
>>>>>>>>
>>>>>>>> On Tue, 10 Sept 2024, 17:02 Komgrit Aneksri, <[email protected]> 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi team,
>>>>>>>>>
>>>>>>>>> I am facing an issue with the database.
>>>>>>>>>
>>>>>>>>> Error message below
>>>>>>>>> Could not open JDBC Connection for transaction; nested exception 
>>>>>>>>> is org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may 
>>>>>>>>> be 
>>>>>>>>> already in use: null. Possible solutions: close all other 
>>>>>>>>> connection(s); 
>>>>>>>>> use the server mode [90020-200]
>>>>>>>>>
>>>>>>>>> I restarted the GoCD server and it is back to normal now.
>>>>>>>>>
>>>>>>>>> I am using GoCD version 23.1.0 running on EKS.
>>>>>>>>>
>>>>>>>>> Files and the database (H2) are stored on EFS.
>>>>>>>>>
>>>>>>>>> I have encountered this issue twice so far (last Thursday and today).
>>>>>>>>>
>>>>>>>>> Could you please advise what I should improve to fix this issue?
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Komgrit
>>>>>>>>>
>>>>>>>>>  
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>
