Thank you so much for responding. That script is essentially what I was looking for, as I knew there had to be a way to view stuck reports.
The first time I ran *pgrep -af Clark* it returned 2 report names. However, since that first restart of Clark, after killing those two processes, I get nothing but the "Clark Kent, waiting for trouble" output. Yes, I removed the reporter-lock folder when restarting Clark and did everything as the opensrf user.

When I run *select * from reporter.currently_running* I get no output. Just to double-check, I ran it on both the production and replicated databases, with the same result. However, when I go to Reports in my version of the staff client, a stuck report from yesterday is still present in the queue.

Ideas?
-Jon

On Sun, Jun 2, 2019 at 5:34 AM Jason Stephenson <ja...@sigio.com> wrote:

> It sounds like you have dead reports that are preventing new reports from
> starting. When a report dies or is killed, it isn't cleaned up and
> Clark will think that it is still running.
>
> First, check if Clark is running and running any reports:
>
> pgrep -af Clark
>
> If you run that on the server where the reporter runs, you should get
> output like this:
>
> 7180 Clark Kent, waiting for trouble
>
> The number is the process ID, so will be different. If any reports are
> running, there will be additional lines similar to the above, but will
> have some portion of the report's name:
>
> 7201 Clark Kent reporting: [Report Name]
>
> If no reports are currently running, then it is safe to do the following
> steps.
>
> To check for dead reports, run the following query:
>
> select * from reporter.currently_running
>
> There can be up to "parallel" number of rows in that view, and when
> there are that many, Clark will not start new reports. ("Parallel" is
> the reporter/setup/parallel setting from opensrf.xml.)
>
> If you have any rows in that view, and no reports are currently running,
> it is advisable to clear them out. You do that by setting the
> complete_time on the listed reports. I have attached a SQL script that
> I use for this purpose.
> It not only sets the complete_time, but also
> sets the error_code and error_text to something semi-useful for our
> environment. You might want to change that to suit your situation.
>
> HtH,
> Jason
>
> On 6/1/19 6:12 PM, JonGeorg SageLibrary wrote:
> > Greetings, I've run into an issue where the reporting module does not
> > appear to want to restart.
> >
> > Reports are run on the log server against the replicated database
> > server.
> > Normally what I do is:
> >
> > * just restart it per
> >   http://docs.evergreen-ils.org/3.1/_starting_and_stopping_the_reporter_daemon.html
> >   as opensrf user
> >
> > I've also done the following:
> >
> > * Restarted all osrf services on the application and log servers along
> >   with ejabberd/memcached where applicable.
> > * Killed all processes on the database server older than 2 minutes.
> > * Re-ran replication of the production server to replicated database
> >   server. I did this just to rule out that there was not an issue with
> >   the replicated copy because we did have a fines issue that was
> >   related to the replication at one point.
> > * I ran "SELECT now()-query_start,pid,state,application_name,waiting
> >   FROM pg_stat_activity;" but had to remove ",waiting" as it threw an
> >   error.
> >   o That returns a list of processes like open-ils.cstore,
> >     open-ils.pcrud, open-ils.reporter-store and the like. I
> >     attempted to kill the old reporter-store processes with the
> >     command "SELECT pg_cancel_backend(backend_pid);" and Clark
> >     stopped, and while it returned a value of true showing the
> >     process was dead, when I re-ran it, it appears to still be
> >     present.
> >
> > I don't see anything else under
> > http://docs.evergreen-ils.org/reorg/3.1/command_line_admin/Evergreen_Documentation.pdf
> > or
> > https://wiki.evergreen-ils.org/doku.php?id=scratchpad:random_magic_spells
> > .
> >
> > The only thing I haven't tried, but shouldn't need to, is to actually
> > restart that server, but am waiting until there is someone physically
> > present in case it does not properly restart on its own.
> >
> > -Jon
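
For anyone who wants to script the first check Jason describes, here is a minimal sketch. It assumes the process titles look exactly like the two examples in his message ("Clark Kent, waiting for trouble" for the idle daemon and "Clark Kent reporting: ..." for a running report). The function names count_running_reports and safe_to_clean are hypothetical, made up for this illustration, and are not part of Evergreen.

```shell
#!/bin/sh
# Sketch only: helpers that read `pgrep -af Clark`-style output on stdin.
# Assumption: a running report shows up as "<pid> Clark Kent reporting: <name>",
# and the idle daemon as "<pid> Clark Kent, waiting for trouble".

# Print the number of lines that look like a running report.
# grep -c exits non-zero when the count is 0, so `|| true` keeps the
# function's exit status clean while still printing "0".
count_running_reports() {
    grep -c 'Clark Kent reporting:' || true
}

# Succeed (exit 0) only when no report lines are present on stdin,
# i.e. when it should be safe to look at reporter.currently_running
# and clear out dead entries.
safe_to_clean() {
    running=$(count_running_reports)
    [ "$running" -eq 0 ]
}
```

It could be used as `pgrep -af Clark | safe_to_clean && echo "no reports running"`. Note that empty input (Clark not running at all) also counts as "no reports running", so it is worth checking separately that the daemon line is there.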