Ansgar J. Sachs created ARTEMIS-2678:
----------------------------------------

             Summary: Incomplete records for pages under high load
                 Key: ARTEMIS-2678
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2678
             Project: ActiveMQ Artemis
          Issue Type: Bug
          Components: Broker
         Environment: Linux
            Reporter: Ansgar J. Sachs
         Attachments: Bildschirmfoto 2020-03-23 um 17.35.41.png

{quote}As a developer, I expect paging to be resource-saving and resilient to 
high load{quote}

h3. Current Situation

During a performance test with a throughput of ~25,000 messages per second that 
ran for multiple hours, some consumers were too slow to keep up and messages 
piled up on the broker. As a result, the broker started to page the messages of 
the growing queues. When we reduced the load on the broker, some queues stopped 
being consumed, and the broker logged the following errors:
{code}
AMQ222033: Page file 000000007.page had incomplete records at position 
39,795,399 at record number 13,952

target message no.16146 not found from start offset 46032883 and start message 
number 16146: java.lang.RuntimeException: target message no.16146 not found 
from start offset 46032883 and start message number 16146
{code}

It wasn't possible to recover from this state except by deleting the paging 
directory.

h3. Expected Situation

I would expect the paging mechanism to be resilient under high load and to 
recover from such errors without the paging directory having to be deleted.

h3. Scenario Setup

Master configuration:
{code:xml}
<ha-policy>
  <shared-store>
    <master>
      <failover-on-shutdown>true</failover-on-shutdown>
    </master>
  </shared-store>
</ha-policy>
<!-- ... -->
<address-setting match="#">
  <max-size-bytes>256Mb</max-size-bytes>
  <page-size-bytes>64Mb</page-size-bytes>
  <message-counter-history-day-limit>10</message-counter-history-day-limit>
  <address-full-policy>PAGE</address-full-policy>
</address-setting>
{code}

An extract of the monitoring of the Performance Test is attached to this issue.

h3. Workaround

For now we have disabled paging entirely and only use the heap. The heap is 
exhausted at about 5 million messages, which in our use case is still better 
than losing any of them.
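
The exact settings used for this workaround are not part of this report. As a 
rough sketch only, and assuming paging was turned off by moving the address 
setting away from the PAGE policy, the change might look something like the 
following. The values shown are illustrative assumptions, not the configuration 
that was actually deployed:
{code:xml}
<!-- Hypothetical sketch: not the reporter's actual configuration -->
<address-setting match="#">
  <!-- Assumption: -1 removes the per-address size limit -->
  <max-size-bytes>-1</max-size-bytes>
  <message-counter-history-day-limit>10</message-counter-history-day-limit>
  <!-- Assumption: BLOCK instead of PAGE, so that nothing is written to the
       paging directory; producers are blocked once a memory limit is hit -->
  <address-full-policy>BLOCK</address-full-policy>
</address-setting>
{code}
With a policy other than PAGE, messages stay in memory, so the broker's heap 
size becomes the effective limit on how many messages can be held.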




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
