Hi Chad,

Thanks for your support.

>
>    - Does the server recover after some time, or you need to restart GoCD 
>    or take some other action to fix it?
>
No, it does not recover; I have to restart GoCD.

>
>    - How are you running GoCD? i.e in which environment? Container? 
>    Standard server?
>
Standard server on an Ubuntu 18.04.6 LTS virtual machine.

>
>    - Is your DB file on some kind of network mount or something like that?
>
No, it isn't.

>
>    - Is there a way to verify there aren't multiple processes/GoCD 
>    Instances trying to access the file?
>       - when it happens, are you able to use OS-level commands such as 
>       lsof to see if other/multiple processes have handles on the DB file 
>       (depends on whether storage is local)
>    
Currently this happens only once in a while (most recently, 6 days passed
between two database issues). lsof is a good idea! I'll try that the next
time it happens.
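
For the lsof check, something like the following rough sketch might do
while the server is stuck. The DB path is only my assumption about a
default package install and would need adjusting, and the helper names
parse_lsof_fields and holders_of are made up for illustration. It runs
`lsof -F pc`, which prints one field per line ("p" for the PID, "c" for
the command name).

```python
import subprocess

# Assumed location of the H2 database file; adjust to the actual install.
DB_FILE = "/var/lib/go-server/db/h2db/cruise.mv.db"


def parse_lsof_fields(output):
    """Turn `lsof -F pc` output into a list of (pid, command) tuples."""
    holders = []
    pid = None
    for line in output.splitlines():
        if line.startswith("p"):
            pid = int(line[1:])
        elif line.startswith("c") and pid is not None:
            holders.append((pid, line[1:]))
            pid = None
        # Any other field lines (e.g. "f" for file descriptors) are ignored.
    return holders


def holders_of(path):
    """Ask lsof which processes currently have `path` open."""
    result = subprocess.run(
        ["lsof", "-F", "pc", "--", path],
        capture_output=True, text=True, check=False,
    )
    # lsof exits non-zero when nothing has the file open; that yields [].
    return parse_lsof_fields(result.stdout)


if __name__ == "__main__":
    for pid, command in holders_of(DB_FILE):
        print(pid, command)
```

More than one entry, or a PID that is not the running GoCD server, would
point to a second process holding the file.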

>
>    - Would be good to confirm you don't see GoCD crashing or getting 
>    auto-restarted in your logs to rule out GoCD itself having a different 
>    problem, and then this problem is being caused by a zombie GoCD process or 
>    some kind of stale lock which takes time to expire.
>    
Actually, yesterday's go-server.log suggests that the root cause is Java
running out of memory:
2022-05-18 10:24:18,741 WARN [105@MessageListener for WorkFinder] 
BasicDataSource:58 - An internal object pool swallowed an Exception. 
org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been 
closed; SQL statement: ROLLBACK [90098-200]
...
2022-05-18 10:24:18,742 WARN [105@MessageListener for WorkFinder] 
BasicDataSource:58 - An internal object pool swallowed an Exception. 
org.h2.jdbc.JdbcSQLNonTransientConnectionException: Out of memory. 
[90108-200]
...
Caused by: java.lang.OutOfMemoryError: Java heap space

We hadn't touched GoCD's heap size until now, so we have increased it in
wrapper-properties.conf and hope that this resolves the error rather than
merely postponing it by a few days.
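
To see whether the error comes back after the heap change, a small
sketch like the one below could scan go-server.log for further
OutOfMemoryError entries. It assumes the timestamp format shown in the
excerpt above; the function name find_oom_events and the script name in
the usage comment are hypothetical.

```python
import re
import sys

# Matches the timestamp at the start of a go-server.log entry,
# e.g. "2022-05-18 10:24:18,741".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def find_oom_events(lines):
    """For each OutOfMemoryError line, return the timestamp of the most
    recent timestamped log entry seen so far (None if there was none)."""
    events = []
    last_timestamp = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            last_timestamp = match.group(1)
        if "java.lang.OutOfMemoryError" in line:
            events.append(last_timestamp)
    return events


if __name__ == "__main__":
    # Usage (hypothetical script name): python find_oom.py /path/to/go-server.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for timestamp in find_oom_events(log):
            print("OutOfMemoryError after entry at", timestamp)
```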

>
>    - Do you have any overrides to DB configuration, e.g a custom 
>    *config/db.properties* file?
>
No.
 

> To answer your question on the trace files, I think you get two files when 
> the main trace file reaches an H2-configured maximum size. I ask the above 
> question on DB properties as I think GoCD sets that to 16MB by default, 
> whereas yours seems to have got to 64MB which seems curious.
>
 
Thanks, that explains a lot. You're right, the "old" file contains 
timestamps from 3:16 to 6:38 and the new one from 6:38 to 8:16.


> There is a way to change the locking approach H2 uses 
> <https://www.h2database.com/html/advanced.html#file_locking_protocols> 
> (back to the older ;FILE_LOCK=FS - which creates the stale cruise.lock.db 
> you have in your screenshot) if the issue is with the filesystem, however I 
> imagine you'd want to rule out multiple processes or some other issue first.
>
 
Thanks for the hint, I'll keep that in mind as "last resort". 

Thank you for your support. I think we'll now wait and see whether the
error occurs again.

Julia
