I am not as much of a linearstore expert as Kim, but I have two ideas:

1) If you have thousands of durable queues, you can hit the kernel's limit on AIO operations and need to increase the fs.aio-max-nr parameter. For the calculation: I recall that on some systems (RHEL 6?) one durable queue required 33 AIO handlers; on RHEL 7 it seems to be fewer (half?), but take this as a rule of thumb only.
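To make the rule of thumb in point 1 concrete, here is a minimal Python sketch. It assumes the roughly 33 AIO handlers per durable queue recalled above (a remembered figure, not a documented constant); the function names and the 25% headroom are illustrative only. /proc/sys/fs/aio-max-nr and /proc/sys/fs/aio-nr are the standard Linux procfs views of the limit and of the current usage.

```python
from pathlib import Path

# Rule-of-thumb figure recalled above for RHEL 6; an assumption, not a documented constant.
AIO_PER_DURABLE_QUEUE = 33


def estimate_aio_requirement(durable_queues, per_queue=AIO_PER_DURABLE_QUEUE, headroom=0.25):
    """Estimate a value for fs.aio-max-nr for the given number of durable queues."""
    base = durable_queues * per_queue
    # Add some slack for other AIO users on the host (headroom is an arbitrary choice).
    return base + int(base * headroom)


def read_proc_value(name):
    """Return the integer stored in /proc/sys/fs/<name>, or None if it cannot be read."""
    try:
        return int(Path("/proc/sys/fs", name).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("estimated requirement:", estimate_aio_requirement(2000))
    print("current fs.aio-max-nr:", read_proc_value("aio-max-nr"))
    print("current fs.aio-nr    :", read_proc_value("aio-nr"))
```

If the estimate exceeds the current limit, the limit can be raised with sysctl (fs.aio-max-nr) and the broker restarted.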
2) It seems the journal file handler has not been initialized, as it is a null pointer. That could be a consequence of an improper shutdown (though still a bug). If you don't care about the data in the queue, you can replace the jrnl file(s) with an empty one (I can share the file). But I expect you would like to recover the data. In that case I would start by enabling trace logs by adding these broker options:

    log-enable=trace+:linearstore
    log-to-file=/path/to/file.log   # if not already logging somewhere, e.g. syslog (with trace logs not dropped)

and then observing how journal recovery proceeds for all jrnl files (or symlinks to them) under the directory

    /var/lib/qpidd/.qpidd/qls/jrnl2/440d04db-7fb6-3424-a83c-b70014fa32a0

(here I deduce that the UUID is a real queue name, per your error logs). I expect that recovery of one jrnl file (the most current one) would fail in some manner.

Kind regards,
Pavel

On Thu, May 30, 2019 at 11:57 PM Justin Ross <justin.r...@gmail.com> wrote:

> Kim?
>
> On Tue, May 14, 2019, 14:01 Gordon Sim <g...@redhat.com> wrote:
>
> > On 14/05/2019 10:46 am, Pål Skjager Løberg wrote:
> > > For a client, just getting "illegal-argument: Value for replyText is too
> > > large" back as an error when sending is not the most useful info and I
> > > suspect, especially after reading the mentioned thread from November, there
> > > might be a bug in how the error responses to the client is generated
> > > causing the actual error to be masked by another error.
> > >
> > > Also, there seems to be a possibility that the Qpid broker will start wth
> > > broken queues, causing it to fail only when messages are written to that
> > > queue, including some null pointer problems.
> > >
> > > Are any of these known issues or is it the expected behavior?
> >
> > No, neither of these is the correct behaviour.
> >
> > I have committed a fix for the first issue:
> > https://issues.apache.org/jira/browse/QPID-8313
> >
> > For the issue with the journal recovery, I'd need to defer to the
> > expert. Kim, can you recommend any diagnostics to figure out what would
> > cause the problems in the queues on recovery? i.e. errors such as:
> >
> > > jexception 0x010b LinearFileController::getCurrentSerial() threw
> > > JERR__NULL: Operation on null pointer
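For the journal inspection suggested in point 2, here is a minimal Python sketch that lists what is actually in a queue's journal directory, so that the file names can be matched against the trace log. It uses only the standard library. The function name is hypothetical, and the .jrnl suffix is an assumption based on the "jrnl files" wording above; adjust it if the files are named differently.

```python
import os
from pathlib import Path


def list_journal_files(queue_dir, suffix=".jrnl"):
    """Return (name, resolved target, size in bytes or None) for each journal entry.

    A size of None marks a dangling symlink or a file that cannot be stat'ed.
    """
    entries = []
    for entry in sorted(Path(queue_dir).iterdir()):
        if not entry.name.endswith(suffix):
            continue
        # Follow symlinks so the real location of the journal file is visible.
        target = os.path.realpath(entry)
        try:
            size = os.stat(target).st_size
        except OSError:
            size = None
        entries.append((entry.name, target, size))
    return entries


if __name__ == "__main__":
    # Directory taken from the message above; adjust to the queue in question.
    queue_dir = "/var/lib/qpidd/.qpidd/qls/jrnl2/440d04db-7fb6-3424-a83c-b70014fa32a0"
    if os.path.isdir(queue_dir):
        for name, target, size in list_journal_files(queue_dir):
            print(name, "->", target, size)
```

This only reports names, symlink targets and sizes; it does not parse the journal contents, so the trace log remains the place to see which file fails during recovery.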