On Thu, Sep 12, 2024 at 6:33 PM Chad Wilson <[email protected]> wrote:
> The warnings about materials not matching and upstream pipelines can
> probably be ignored; they are not relevant to this. The same goes for the
> maximum backtracking limit (an unrelated problem). But:
>
> 1. Is the GoCD server process being restarted after these "The database
>    has been closed" errors, *before* you then start seeing the locking
>    errors? (e.g. do you see it logging something like "Jetty9Server:199 -
>    Configuring Jetty using /etc/go/jetty.xml" again, which only happens at
>    startup?)
> 2. Are there other errors before "The database has been closed
>    [90098-200]" if you go back further looking for stack traces or "Out of
>    memory"?
>
> There are other threads which describe very similar problems; like those
> users, you probably need to keep digging for your root cause:
> https://groups.google.com/g/go-cd/c/KPyCqTpxS-k/m/61Ps4wHvDQAJ
> https://groups.google.com/g/go-cd/c/4yuK8dx8m-Q/m/dpre3JAhAgAJ
>
> Note that both of those users traced their H2DB issues back to "Out of
> memory" errors. Switching to Postgres is unlikely to fix memory problems,
> which is why it is important to eliminate this first, in my opinion.

After reading through the various messages, I am inclined to agree with
Chad. While switching to Postgres has its own benefits, it is wise to
identify and address the root cause. Komgrit, do you have backups
configured? See:
https://docs.gocd.org/current/advanced_usage/one_click_backup.html

> -Chad
>
> On Thu, Sep 12, 2024 at 4:08 PM Komgrit Aneksri <[email protected]> wrote:
>
>> Thank you Chad,
>>
>> I dug into the GoCD server logs from before the DB locked.
>>
>> *I always found many WARN/ERROR messages like the ones below.*
>>
>> 2024-09-10 08:14:09,413 WARN [118@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: ar-eod-service-deploy-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,416 WARN [120@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: thinslice-eligibility-service-deploy-nonProd. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,437 WARN [122@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: mercury-bff-order-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,441 WARN [121@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: digital-help-ios-payment-publish-6.11. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,446 WARN [121@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-litigation-web-keyfong-ka-deploy-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,447 WARN [113@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: sonarqube-venus-backend. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,449 WARN [118@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-enforcement-bff-deploy-qa. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,451 WARN [122@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: ar-eod-cdc-deploy-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,472 WARN [114@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: deep-product-deploy-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,480 WARN [121@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: portal-goloyalty-digital-campaign-prod-deployment. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,503 WARN [117@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-frontend-notification-service-deploy-qa. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:09,512 ERROR [117@MessageListener for ScheduleCheckListener] BuildCauseProducerService:220 - Error while scheduling pipeline: ar-loan-job-deploy-prod
>> com.thoughtworks.go.server.service.dd.MaxBackTrackLimitReachedException: Maximum Backtracking limit reached while trying to resolve revisions for material DependencyMaterialConfig{pipelineName='ar-loan-job-deploy-nonProd', stageName='Deployment-uat'}
>>   at com.thoughtworks.go.server.service.dd.DependencyFanInNode.hasMoreInstances(DependencyFanInNode.java:233)
>>   at com.thoughtworks.go.server.service.dd.DependencyFanInNode.fillNextRevisions(DependencyFanInNode.java:122)
>>   at com.thoughtworks.go.server.service.dd.DependencyFanInNode.handleNeedMoreRevisions(DependencyFanInNode.java:83)
>>   at com.thoughtworks.go.server.service.dd.DependencyFanInNode.initRevision(DependencyFanInNode.java:75)
>>   at com.thoughtworks.go.server.service.dd.DependencyFanInNode.populateRevisions(DependencyFanInNode.java:61)
>>   at com.thoughtworks.go.server.service.dd.FanInGraph.initChildren(FanInGraph.java:311)
>>   at com.thoughtworks.go.server.service.dd.FanInGraph.computeRevisions(FanInGraph.java:174)
>>   at com.thoughtworks.go.server.service.PipelineService.getRevisionsBasedOnDependencies(PipelineService.java:219)
>>   at com.thoughtworks.go.server.service.AutoBuild.fanInOn(AutoBuild.java:108)
>>   at com.thoughtworks.go.server.service.AutoBuild.onModifications(AutoBuild.java:67)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:191)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:148)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.autoSchedulePipeline(BuildCauseProducerService.java:110)
>>   at com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:44)
>>   at com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:24)
>>   at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
>>   at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
>>   at java.base/java.lang.Thread.run(Unknown Source)
>>
>> *And when the DB locked, there were messages like these:*
>>
>> 2024-09-10 08:14:18,425 INFO [qtp1814840342-32487118] Stage:236 - Stage is being completed by transition id: 2129759
>> 2024-09-10 08:14:18,623 WARN [121@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: application-domain-security-group-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,626 WARN [117@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-enforcement-bff-deploy-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,632 WARN [118@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-enforcement-service-deploy-qa. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,637 WARN [115@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-litigation-ka-bff-deploy-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,641 WARN [120@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-recovery-work-list-deploy-qa. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,644 WARN [116@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, SQLState: 90098
>> 2024-09-10 08:14:18,644 WARN [115@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, SQLState: 90098
>> 2024-09-10 08:14:18,660 ERROR [115@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been closed [90098-200]
>> 2024-09-10 08:14:18,658 ERROR [116@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been closed [90098-200]
>> 2024-09-10 08:14:18,647 WARN [122@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-litigation-web-ka-deploy-prod. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,644 WARN [121@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, SQLState: 90098
>> 2024-09-10 08:14:18,660 ERROR [121@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been closed [90098-200]
>> 2024-09-10 08:14:18,665 WARN [122@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-onescreen-cdc-deploy-uat. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,686 WARN [114@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, SQLState: 90098
>> 2024-09-10 08:14:18,696 WARN [119@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: collections-recovery-web-ui-deploy-qa. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,686 ERROR [114@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been closed [90098-200]
>> 2024-09-10 08:14:18,706 WARN [122@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, SQLState: 90098
>> 2024-09-10 08:14:18,706 ERROR [122@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:234 - The database has been closed [90098-200]
>> 2024-09-10 08:14:18,713 WARN [119@MessageListener for ScheduleCheckListener] BuildCauseProducerService:175 - Error while scheduling pipeline: mercury-bff-report-test-env. Possible Reasons: (1) Upstream pipelines have not been built yet. (2) Materials do not match between configuration and build-cause.
>> 2024-09-10 08:14:18,715 WARN [118@MessageListener for ScheduleCheckListener] JDBCExceptionReporter:233 - SQL Error: 90098, SQLState: 90098
>> 2024-09-10 08:14:18,719 ERROR [116@MessageListener for ScheduleCheckListener] BuildCauseProducerService:220 - Error while scheduling pipeline: collections-litigation-cdc-deploy-qa
>> org.springframework.dao.DataAccessResourceFailureException: Hibernate operation: could not execute query; SQL [SELECT materials.id FROM pipelineMaterialRevisions INNER JOIN pipelines ON pipelineMaterialRevisions.pipelineId = pipelines.id INNER JOIN modifications on modifications.id = pipelineMaterialRevisions.torevisionId INNER JOIN materials on modifications.materialId = materials.id WHERE materials.id = ? AND pipelineMaterialRevisions.toRevisionId >= ? AND pipelineMaterialRevisions.fromRevisionId <= ? AND pipelines.name = ? GROUP BY materials.id;]; The database has been closed [90098-200]; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
>>   at org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:79)
>>   at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
>>   at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:82)
>>   at org.springframework.orm.hibernate3.HibernateAccessor.convertJdbcAccessException(HibernateAccessor.java:428)
>>   at org.springframework.orm.hibernate3.HibernateAccessor.convertHibernateAccessException(HibernateAccessor.java:414)
>>   at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:416)
>>   at org.springframework.orm.hibernate3.HibernateTemplate.execute(HibernateTemplate.java:342)
>>   at com.thoughtworks.go.server.persistence.MaterialRepository.hasPipelineEverRunWith(MaterialRepository.java:853)
>>   at com.thoughtworks.go.server.materials.MaterialChecker.hasPipelineEverRunWith(MaterialChecker.java:100)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:186)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:148)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.autoSchedulePipeline(BuildCauseProducerService.java:110)
>>   at com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:44)
>>   at com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:24)
>>   at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
>>   at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
>>   at java.base/java.lang.Thread.run(Unknown Source)
>> Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
>>   at org.h2.message.DbException.getJdbcSQLException(DbException.java:622)
>>   at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
>>   at org.h2.message.DbException.get(DbException.java:205)
>>   at org.h2.message.DbException.get(DbException.java:181)
>>   at org.h2.message.DbException.get(DbException.java:170)
>>   at org.h2.engine.Database.checkPowerOff(Database.java:506)
>>   at org.h2.command.Command.executeQuery(Command.java:224)
>>   at org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:114)
>>   at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>   at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>   at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)
>>   at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
>>   at org.hibernate.loader.Loader.doQuery(Loader.java:802)
>>   at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
>>   at org.hibernate.loader.Loader.doList(Loader.java:2542)
>>   at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
>>   at org.hibernate.loader.Loader.list(Loader.java:2271)
>>   at org.hibernate.loader.custom.CustomLoader.list(CustomLoader.java:316)
>>   at org.hibernate.impl.SessionImpl.listCustomQuery(SessionImpl.java:1842)
>>   at org.hibernate.impl.AbstractSessionImpl.list(AbstractSessionImpl.java:165)
>>   at org.hibernate.impl.SQLQueryImpl.list(SQLQueryImpl.java:157)
>>   at com.thoughtworks.go.server.persistence.MaterialRepository.lambda$hasPipelineEverRunWith$10(MaterialRepository.java:876)
>>   at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:411)
>>   ... 11 common frames omitted
>> 2024-09-10 08:14:18,719 ERROR [121@MessageListener for ScheduleCheckListener] BuildCauseProducerService:220 - Error while scheduling pipeline: ar-loan-infrastructure-rds-snapshot-prod
>> org.springframework.dao.DataAccessResourceFailureException: Hibernate operation: could not execute query; SQL [SELECT materials.id FROM pipelineMaterialRevisions INNER JOIN pipelines ON pipelineMaterialRevisions.pipelineId = pipelines.id INNER JOIN modifications on modifications.id = pipelineMaterialRevisions.torevisionId INNER JOIN materials on modifications.materialId = materials.id WHERE materials.id = ? AND pipelineMaterialRevisions.toRevisionId >= ? AND pipelineMaterialRevisions.fromRevisionId <= ? AND pipelines.name = ? GROUP BY materials.id;]; The database has been closed [90098-200]; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
>>   at org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:79)
>>   at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
>>   at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:82)
>>   at org.springframework.orm.hibernate3.HibernateAccessor.convertJdbcAccessException(HibernateAccessor.java:428)
>>   at org.springframework.orm.hibernate3.HibernateAccessor.convertHibernateAccessException(HibernateAccessor.java:414)
>>   at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:416)
>>   at org.springframework.orm.hibernate3.HibernateTemplate.execute(HibernateTemplate.java:342)
>>   at com.thoughtworks.go.server.persistence.MaterialRepository.hasPipelineEverRunWith(MaterialRepository.java:853)
>>   at com.thoughtworks.go.server.materials.MaterialChecker.hasPipelineEverRunWith(MaterialChecker.java:100)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:186)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.newProduceBuildCause(BuildCauseProducerService.java:148)
>>   at com.thoughtworks.go.server.scheduling.BuildCauseProducerService.autoSchedulePipeline(BuildCauseProducerService.java:110)
>>   at com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:44)
>>   at com.thoughtworks.go.server.scheduling.ScheduleCheckListener.onMessage(ScheduleCheckListener.java:24)
>>   at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
>>   at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
>>   at java.base/java.lang.Thread.run(Unknown Source)
>> Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
>>   at org.h2.message.DbException.getJdbcSQLException(DbException.java:622)
>>   at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
>>   at org.h2.message.DbException.get(DbException.java:194)
>>   at org.h2.engine.Session.getTransaction(Session.java:1792)
>>   at org.h2.engine.Session.startStatementWithinTransaction(Session.java:1815)
>>   at org.h2.command.Command.executeQuery(Command.java:190)
>>   at org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:114)
>>   at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>   at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>>   at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)
>>   at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
>>   at org.hibernate.loader.Loader.doQuery(Loader.java:802)
>>   at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
>>   at org.hibernate.loader.Loader.doList(Loader.java:2542)
>>   at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
>>   at org.hibernate.loader.Loader.list(Loader.java:2271)
>>   at org.hibernate.loader.custom.CustomLoader.list(CustomLoader.java:316)
>>   at org.hibernate.impl.SessionImpl.listCustomQuery(SessionImpl.java:1842)
>>   at org.hibernate.impl.AbstractSessionImpl.list(AbstractSessionImpl.java:165)
>>   at org.hibernate.impl.SQLQueryImpl.list(SQLQueryImpl.java:157)
>>   at com.thoughtworks.go.server.persistence.MaterialRepository.lambda$hasPipelineEverRunWith$10(MaterialRepository.java:876)
>>   at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:411)
>>   ... 11 common frames omitted
>>
>> Best Regards,
>> Komgrit
>>
>> On Wednesday, September 11, 2024 at 2:11:45 PM UTC+7 Chad Wilson wrote:
>>
>>> Before we get into memory stats, did you look at the logs to see if the
>>> server is being restarted internally, as I described below? There is no
>>> point looking at memory stats unless we have evidence of a memory problem.
>>>
>>> You generally cannot use OS-level stats on their own to debug memory
>>> usage for Java applications like GoCD; you need to look at the JVM's
>>> internal heap used/free stats. You might have available memory at the
>>> container/host level that the JVM is not using due to its settings, so you
>>> can still be running out of memory. Furthermore, the increase you show is
>>> almost all file buffer/cache rather than application usage.
>>>
>>> If I recall correctly, by default the GoCD server starts with a max heap
>>> size of only 1G, which is relatively small for a bigger server with
>>> perhaps hundreds of pipelines. However, we should try to find evidence of
>>> that before randomly changing things or going deeper.
>>>
>>> -Chad
>>>
>>> On Wed, Sep 11, 2024 at 12:31 PM Komgrit Aneksri <[email protected]> wrote:
>>>
>>>> Thank you Chad for helping me investigate, and for the suggestions.
>>>>
>>>> Here is more information about my GoCD server resources.
>>>>
>>>> We are using a c6g.2xlarge worker node.
>>>>
>>>> Currently, CPU usage is 40-45%.
>>>>
>>>> After a restart, GoCD has around 3 GB of memory free while the server runs:
>>>>
>>>> bash-5.1$ free -m
>>>>                total        used        free      shared  buff/cache   available
>>>> Mem:           15678        3037        4500           3        8395       12640
>>>> Swap:              0           0           0
>>>>
>>>> Some time later, free memory has dropped to 260 MB:
>>>>
>>>> bash-5.1$ free -m
>>>>                total        used        free      shared  buff/cache   available
>>>> Mem:           15678        3394         260           3       12278       12283
>>>> Swap:              0           0           0
>>>>
>>>> The JVM is on default settings.
>>>>
>>>> Regards,
>>>> Komgrit
>>>>
>>>> On Wednesday, September 11, 2024 at 9:38:20 AM UTC+7 Chad Wilson wrote:
>>>>
>>>>> If this has never happened before, and only just started happening,
>>>>> then *something* must have changed. It might be worth figuring that out.
>>>>>
>>>>> A database becomes locked like this only when two instances are trying
>>>>> to connect to the same H2 database file, or one crashed somehow without
>>>>> releasing the lock. We probably need to see the full error/stack trace to
>>>>> find the root cause; usually it is something like "Caused by:
>>>>> java.lang.IllegalStateException: The file is locked:
>>>>> nio:/godata/db/h2db/cruise.mv.db [1.4.200/7]".
>>>>>
>>>>> I suggest you look inside the GoCD server log file more directly, not
>>>>> just at k8s stats. GoCD runs as a multi-process container with its own
>>>>> process manager (the Tanuki Java wrapper), so it is possible that GoCD
>>>>> itself has been restarted by the Tanuki process manager even without
>>>>> Kubernetes showing container or pod restarts. It will log when it does
>>>>> so. I'd look for when the errors started, and then scroll back through
>>>>> the container logs to see if the process was restarted by Tanuki. It will
>>>>> restart the main JVM if it thinks the main server process is not
>>>>> responding, or due to OOM errors etc. Perhaps the lock is not being
>>>>> released fast enough.
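Chad's caution about OS-level stats can be illustrated with the `free -m` figures Komgrit quotes above: most of the "used-up" memory is reclaimable page cache, so the "available" column, not "free", is the number to watch. A small sketch that parses those figures (the `/tmp/free.txt` file name is made up for the example; the numbers are copied from the message):

```shell
# Capture of the second `free -m` output from the thread.
cat > /tmp/free.txt <<'EOF'
              total        used        free      shared  buff/cache   available
Mem:          15678        3394         260           3       12278       12283
Swap:             0           0           0
EOF

# "free" looks alarming, but "available" already subtracts reclaimable
# buff/cache, so the host is not actually short of memory here.
awk '/^Mem:/ {
    printf "free:       %5d MiB (looks alarming)\n", $4
    printf "buff/cache: %5d MiB (reclaimable page cache)\n", $6
    printf "available:  %5d MiB (what the kernel can actually hand out)\n", $7
}' /tmp/free.txt
```

Even so, as Chad says, this tells you nothing about the JVM heap: for that you would look at heap statistics from inside the JVM (for example with `jstat -gcutil <pid>` or `jcmd <pid> GC.heap_info`), since the heap ceiling is set by JVM flags, not by host memory.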
>>>>> Anyway, your root problem may be heap size/memory/CPU constraints
>>>>> rather than the database itself.
>>>>>
>>>>> Even if you use Postgres, if you have cases where two GoCD server
>>>>> instances overlap or try to share the database file, you will have other
>>>>> issues of some sort (due to race conditions). And if some other server
>>>>> stability issue is causing restarts, it is probably wise to understand
>>>>> how it gets into this state first, so you are addressing the right
>>>>> problem.
>>>>>
>>>>> As for migration to Postgres, the docs are at
>>>>> https://github.com/gocd/gocd-database-migrator . There is nothing
>>>>> specific to EKS/Kubernetes; generally speaking you'd need to:
>>>>>
>>>>> - prepare your Postgres instance per
>>>>>   https://docs.gocd.org/current/installation/configuring_database/postgres.html
>>>>> - (when ready to do the "proper" run) stop your GoCD server instance
>>>>> - get your H2 DB file off EFS to somewhere you can run the migration
>>>>>   tool against it
>>>>> - run the migrator tool
>>>>> - change the GoCD server Helm chart to mount the db.properties that
>>>>>   tells it how to connect to Postgres
>>>>> - start the GoCD server instance against Postgres
>>>>>
>>>>> -Chad
>>>>>
>>>>> On Wed, Sep 11, 2024 at 9:56 AM Komgrit Aneksri <[email protected]> wrote:
>>>>>
>>>>>> I have not made any configuration changes.
>>>>>>
>>>>>> But we do add new users and new pipelines every day.
>>>>>>
>>>>>> The pod is still in Running status, with no restarts, respawns, or
>>>>>> evictions.
>>>>>>
>>>>>> If we have to migrate H2 to PostgreSQL, do you have any migration
>>>>>> documentation for K8s?
>>>>>>
>>>>>> Regards,
>>>>>> Komgrit
>>>>>>
>>>>>> On Tuesday, September 10, 2024 at 10:32:31 PM UTC+7 Chad Wilson wrote:
>>>>>>
>>>>>>> What changed in your setup when this started happening?
>>>>>>>
>>>>>>> Is your GoCD server pod crashing and being automatically restarted?
>>>>>>> Are the nodes it is running on dying and the pod being re-scheduled
>>>>>>> elsewhere?
>>>>>>>
>>>>>>> On Tue, 10 Sept 2024, 17:02 Komgrit Aneksri, <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi team,
>>>>>>>>
>>>>>>>> I am facing an issue with the database. The error message is below:
>>>>>>>>
>>>>>>>> Could not open JDBC Connection for transaction; nested exception is
>>>>>>>> org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be
>>>>>>>> already in use: null. Possible solutions: close all other
>>>>>>>> connection(s); use the server mode [90020-200]
>>>>>>>>
>>>>>>>> I restarted the GoCD server and it is back to normal now.
>>>>>>>>
>>>>>>>> I am using GoCD version 23.1.0 running on EKS, with files and the
>>>>>>>> database (H2) stored on EFS.
>>>>>>>>
>>>>>>>> I have hit this issue twice so far (last Thursday and today).
>>>>>>>>
>>>>>>>> Could you please help me figure out what to improve to fix this
>>>>>>>> issue?
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Komgrit
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "go-cd" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/go-cd/9d1a20c3-b463-4dd3-a376-b2bcd014091cn%40googlegroups.com
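The log triage Chad recommends throughout the thread, finding the first H2 "database has been closed" error and then looking *earlier* for restart markers or OutOfMemoryError evidence, can be sketched as a couple of greps. The sample log below is fabricated for illustration (the file name and its three lines are not from any real server); in practice you would point the greps at your actual go-server.log or container logs:

```shell
# Build a tiny fabricated sample log so the sketch is self-contained.
cat > /tmp/go-server-sample.log <<'EOF'
2024-09-10 07:58:01,100 ERROR [pool-1] SomeService:42 - java.lang.OutOfMemoryError: Java heap space
2024-09-10 07:58:05,200 INFO  [main] Jetty9Server:199 - Configuring Jetty using /etc/go/jetty.xml
2024-09-10 08:14:18,660 ERROR [115@MessageListener] JDBCExceptionReporter:234 - The database has been closed [90098-200]
EOF

LOG=/tmp/go-server-sample.log

# Step 1: when did the H2 closure errors start?
first_h2=$(grep -n "The database has been closed" "$LOG" | head -n 1 | cut -d: -f1)
echo "first H2 closure error at line: $first_h2"

# Step 2: any restart marker (Jetty being re-configured, which happens at
# startup) or OOM evidence *before* that line?
head -n "$((first_h2 - 1))" "$LOG" | grep -E "Configuring Jetty|OutOfMemoryError"
```

If step 2 matches a "Configuring Jetty" line after the server's original start time, the Tanuki wrapper has restarted the JVM; if it matches OutOfMemoryError, the heap, not H2 or EFS, is the likelier root cause.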
