from:"Alan Protasio \(JIRA\)"

[jira] [Commented] (AMQ-7221) Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge transaction file

2019-06-14 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864171#comment-16864171
 ] 

Alan Protasio commented on AMQ-7221:


[~cshannon] Thanks for fixing the test ;) Sorry for that! 

> Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge 
> transaction file
> --
>
> Key: AMQ-7221
> URL: https://issues.apache.org/jira/browse/AMQ-7221
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.16.0, 5.15.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hi,
> I tried to delete all my scheduled messages and noticed that the broker 
> stopped responding and a huge transaction file was created.
> The Scheduled.db file was ~150 MB and the transaction had already reached 
> 140 GB when I gave up and wiped the broker.
> Looking at the code, I noticed that the cause is mostly these two places:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L512]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L514]
> And
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L501]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L481]
> In the first case, we update a List inside a transaction, but in order to 
> do that we get its value from the BTree and put it back (with one added 
> element). Each time we do that, the whole list is serialized again and 
> written inside the transaction.
> Something similar happens in the second case: we decrement the number of 
> references to a journal file one at a time, when we could do it once at 
> the end and save lots of writes.
> After the proposed patch, the transaction that was previously more than 
> 140 GB (and never finished) was reduced to ~500 MB, and the broker could 
> recover in less than 1 minute (before, I gave up after ~15 minutes).
> In the modified test in the PR, the transaction is ~50 MB before the 
> change and ~500 KB after it.
> Before:
> 2019-06-03 23:58:19,931 [//localhost#1-1] - DEBUG Transaction - Committing 
> transaction 5177: Size 53320 kb
> After:
> 2019-06-03 23:59:13,578 [//localhost#1-2] - DEBUG Transaction - Committing 
> transaction 5178: Size 496 kb
>  
> PS: Keeping removedJobFileIds in memory will not increase memory usage, 
> because we were already loading it into memory when reading from the index.
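
The first case described above can be illustrated with a minimal sketch. This is not the KahaDB code: `FakeIndex` is a hypothetical stand-in for an index that serializes the whole value on every put, and the 4 bytes per entry is an assumed encoding. The sketch only counts how many bytes each strategy would write.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an index that serializes the whole value on each put. */
final class FakeIndex {
    private final Map<String, List<Integer>> values = new HashMap<>();
    private long bytesWritten = 0;

    List<Integer> get(String key) {
        List<Integer> stored = values.get(key);
        return stored == null ? new ArrayList<>() : new ArrayList<>(stored);
    }

    void put(String key, List<Integer> value) {
        values.put(key, new ArrayList<>(value));
        bytesWritten += 4L * value.size(); // assumed: 4 bytes per serialized entry
    }

    long bytesWritten() {
        return bytesWritten;
    }
}

final class ListRewriteCost {
    /** Get, append one element, put back: the full list is rewritten every time. */
    static long perElement(int n) {
        FakeIndex index = new FakeIndex();
        for (int i = 0; i < n; i++) {
            List<Integer> ids = index.get("removed");
            ids.add(i);
            index.put("removed", ids);
        }
        return index.bytesWritten();
    }

    /** Collect the elements in memory and write the list once at the end. */
    static long batched(int n) {
        FakeIndex index = new FakeIndex();
        List<Integer> ids = index.get("removed");
        for (int i = 0; i < n; i++) {
            ids.add(i);
        }
        index.put("removed", ids);
        return index.bytesWritten();
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println("per-element: " + perElement(n) + " bytes");
        System.out.println("batched:     " + batched(n) + " bytes");
    }
}
```

In this toy model the per-element variant writes a number of bytes that grows quadratically with the number of entries, while the batched variant grows linearly. That is the kind of blow-up the report describes, although the real sizes depend on how the index actually serializes its values.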



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (AMQ-7221) Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge transaction file

2019-06-11 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861247#comment-16861247
 ] 

Alan Protasio commented on AMQ-7221:


[~cshannon] Thanks a lot! :D





[jira] [Commented] (AMQ-7221) Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge transaction file

2019-06-10 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860207#comment-16860207
 ] 

Alan Protasio commented on AMQ-7221:


Hey [~gtully] [~cshannon]

Can you guys take a look at this change when you have some time? I think it 
is straightforward.

Thanks a lot! :D

> Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge 
> transaction file
> --
>
> Key: AMQ-7221
> URL: https://issues.apache.org/jira/browse/AMQ-7221
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7221) Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge transaction file

2019-06-04 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7221:
---
Description: 
Hi,

I tried to delete all my scheduled messages and noticed that the broker 
stopped responding and a huge transaction file was created.

The Scheduled.db file was ~150 MB and the transaction had already reached 
140 GB when I gave up and wiped the broker.

Looking at the code, I noticed that the cause is mostly these two places:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L512]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L514]

And

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L501]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L481]

In the first case, we update a List inside a transaction, but in order to 
do that we get its value from the BTree and put it back (with one added 
element). Each time we do that, the whole list is serialized again and 
written inside the transaction.

Something similar happens in the second case: we decrement the number of 
references to a journal file one at a time, when we could do it once at the 
end and save lots of writes.

After the proposed patch, the transaction that was previously more than 
140 GB (and never finished) was reduced to ~500 MB, and the broker could 
recover in less than 1 minute (before, I gave up after ~15 minutes).

In the modified test in the PR, the transaction is ~50 MB before the change 
and ~500 KB after it.

Before:

2019-06-03 23:58:19,931 [//localhost#1-1] - DEBUG Transaction - Committing 
transaction 5177: Size 53320 kb

After:
2019-06-03 23:59:13,578 [//localhost#1-2] - DEBUG Transaction - Committing 
transaction 5178: Size 496 kb

 

PS: Keeping removedJobFileIds in memory will not increase memory usage, 
because we were already loading it into memory when reading from the index.

  was:
Hi,

I tried to delete all my scheduled messages and I noticed that the broker stop 
responding and a huge transaction file was created.

The Scheduled.db file had ~150mb and the transaction was already 140GB when I 
give up and wiped the broker.

Looking at the code, I noticed that the cause of it is mostly this 2 lines:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L512]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L514]

And

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L501]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L481]

On the first case, we are updating an List inside a transaction but in 
order to do that, we get its value from the BTree and put back (with an added 
element). When we do that, the whole list is again serialized and written 
inside the transaction.

Similar thing is happening on the second case, We are decrementing one by a 
number of references to a journal file but we could do it only one time in the 
end saving lots of writes.

After the proposed patch, the transaction that was before more than 140GB (and 
it didn't finish) was reduced to ~500 mb, and the broker could recover in less 
than 1 minute (i gave up before with ~15 minutes).

On the modified test in the PR we can see that the transaction before the 
change is ~50mb and after the change is ~500k

Before:

2019-06-03 23:58:19,931 [//localhost#1-1] - DEBUG Transaction - Committing 
transaction 5177: Size 53320 kb

After:
2019-06-03 23:59:13,578 [//localhost#1-2] - DEBUG Transaction - Committing 
transaction 5178: Size 496 kb

 

 



[jira] [Updated] (AMQ-7221) Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge transaction file

2019-06-04 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7221:
---
Summary: Delete Scheduled messages causes ActiveMQ create/write a 
unnecessary huge transaction file  (was: Remove all Scheduled messages causes 
ActiveMQ create/write a unnecessary huge transaction file)

> Delete Scheduled messages causes ActiveMQ create/write a unnecessary huge 
> transaction file
> --
>
> Key: AMQ-7221
> URL: https://issues.apache.org/jira/browse/AMQ-7221
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Major
>




[jira] [Created] (AMQ-7221) Remove all Scheduled messages causes ActiveMQ create/write a unnecessary huge transaction file

2019-06-04 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7221:
--

 Summary: Remove all Scheduled messages causes ActiveMQ 
create/write a unnecessary huge transaction file
 Key: AMQ-7221
 URL: https://issues.apache.org/jira/browse/AMQ-7221
 Project: ActiveMQ
  Issue Type: Improvement
  Components: KahaDB
Reporter: Alan Protasio


Hi,

I tried to delete all my scheduled messages and noticed that the broker 
stopped responding and a huge transaction file was created.

The Scheduled.db file was ~150 MB and the transaction had already reached 
140 GB when I gave up and wiped the broker.

Looking at the code, I noticed that the cause is mostly these two places:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L512]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L514]

And

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L501]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L481]

In the first case, we update a List inside a transaction, but in order to 
do that we get its value from the BTree and put it back (with one added 
element). Each time we do that, the whole list is serialized again and 
written inside the transaction.

Something similar happens in the second case: we decrement the number of 
references to a journal file one at a time, when we could do it once at the 
end and save lots of writes.

After the proposed patch, the transaction that was previously more than 
140 GB (and never finished) was reduced to ~500 MB, and the broker could 
recover in less than 1 minute (before, I gave up after ~15 minutes).

In the modified test in the PR, the transaction is ~50 MB before the change 
and ~500 KB after it.

Before:

2019-06-03 23:58:19,931 [//localhost#1-1] - DEBUG Transaction - Committing 
transaction 5177: Size 53320 kb

After:
2019-06-03 23:59:13,578 [//localhost#1-2] - DEBUG Transaction - Committing 
transaction 5178: Size 496 kb
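
The second case (decrementing journal-file references one at a time) can be sketched as follows. This is an illustration under assumptions, not the actual patch: `RefCountStore` is a hypothetical store in which every update is assumed to cost one index write, and `removedJobFileIds` is used here simply as a list holding the journal file id of each removed job.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store of journal-file reference counts that counts its own writes. */
final class RefCountStore {
    private final Map<Integer, Integer> refs = new HashMap<>();
    private int writes = 0;

    void set(int fileId, int count) {
        refs.put(fileId, count);
    }

    int get(int fileId) {
        return refs.getOrDefault(fileId, 0);
    }

    void decrement(int fileId, int by) {
        refs.put(fileId, get(fileId) - by);
        writes++; // assumed: each update costs one index write
    }

    int writes() {
        return writes;
    }
}

final class BatchedDecrement {
    /** One write per removed job. */
    static void oneByOne(RefCountStore store, List<Integer> removedJobFileIds) {
        for (int fileId : removedJobFileIds) {
            store.decrement(fileId, 1);
        }
    }

    /** Tally the decrements in memory first, then issue one write per journal file. */
    static void batched(RefCountStore store, List<Integer> removedJobFileIds) {
        Map<Integer, Integer> pending = new HashMap<>();
        for (int fileId : removedJobFileIds) {
            pending.merge(fileId, 1, Integer::sum);
        }
        pending.forEach(store::decrement);
    }

    public static void main(String[] args) {
        List<Integer> removed = List.of(1, 1, 1, 2, 2);

        RefCountStore a = new RefCountStore();
        a.set(1, 3);
        a.set(2, 2);
        oneByOne(a, removed);

        RefCountStore b = new RefCountStore();
        b.set(1, 3);
        b.set(2, 2);
        batched(b, removed);

        System.out.println("one by one: " + a.writes() + " writes, batched: " + b.writes() + " writes");
    }
}
```

Both variants leave the same final counts; the batched one needs one write per journal file instead of one per removed job.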

 

 




[jira] [Commented] (AMQ-7219) ActiveMQ replays journal file on a clean/unclean shutdown with transacted session + Non persistent Messages

2019-05-31 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853228#comment-16853228
 ] 

Alan Protasio commented on AMQ-7219:


Thanks [~gtully]

 

I thought the same about skipping the write of the commit command for empty 
transactions. But it would be a larger change. We would probably have to 
create another visitor to execute the write and call it here:
[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1133]

This would be an optimization, though. I can create another Jira and do this 
if you think it is worth it!

> ActiveMQ replays journal file on a clean/unclean shutdown with transacted 
> session + Non persistent Messages
> ---
>
> Key: AMQ-7219
> URL: https://issues.apache.org/jira/browse/AMQ-7219
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.9
>Reporter: Alan Protasio
>Assignee: Gary Tully
>Priority: Major
> Fix For: 5.16.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hi,
>  
> Sending non-persistent messages on a transacted session causes ActiveMQ 
> to keep and replay (on startup) unnecessary journal files. If all messages 
> are sent this way, ActiveMQ will replay the whole journal file even after a 
> clean shutdown.
> The problem is that if the transaction has no persistent operation, 
> metadata.lastUpdate is never updated.
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1400]
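
The effect described above can be shown with a toy model. Everything here is hypothetical: `ToyJournal` is not the KahaDB implementation, and the `advanceOnEmptyCommit` flag is only an assumed way of modelling "the marker also moves for transactions without persistent operations", not necessarily how the issue was fixed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a journal plus a "last update" marker; not the KahaDB implementation. */
final class ToyJournal {
    private final List<String> records = new ArrayList<>();
    private final boolean advanceOnEmptyCommit;
    private int lastUpdate = 0; // index of the first record a restart would still replay

    ToyJournal(boolean advanceOnEmptyCommit) {
        this.advanceOnEmptyCommit = advanceOnEmptyCommit;
    }

    /** Commits a transaction that carried the given number of persistent operations. */
    void commit(int persistentOps) {
        records.add("COMMIT ops=" + persistentOps);
        if (persistentOps > 0 || advanceOnEmptyCommit) {
            lastUpdate = records.size(); // everything written so far is accounted for
        }
    }

    /** Number of records a restart would replay in this model. */
    int recordsToReplay() {
        return records.size() - lastUpdate;
    }

    public static void main(String[] args) {
        ToyJournal stuck = new ToyJournal(false);
        ToyJournal advancing = new ToyJournal(true);
        for (int i = 0; i < 1000; i++) {
            stuck.commit(0);     // transacted session, non-persistent messages only
            advancing.commit(0);
        }
        System.out.println("marker never advanced: replay " + stuck.recordsToReplay() + " records");
        System.out.println("marker advanced:       replay " + advancing.recordsToReplay() + " records");
    }
}
```

In this model, a workload made up only of such transactions leaves the marker at the start of the journal, so a restart replays everything even after a clean shutdown.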




[jira] [Updated] (AMQ-7219) ActiveMQ replays journal file on a clean/unclean shutdown with transacted session + Non persistent Messages

2019-05-30 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7219:
---
Description: 
Hi,

 

Sending non-persistent messages on a transacted session causes ActiveMQ to 
keep and replay (on startup) unnecessary journal files. If all messages are 
sent this way, ActiveMQ will replay the whole journal file even after a clean 
shutdown.

The problem is that if the transaction has no persistent operation, 
metadata.lastUpdate is never updated.

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1400]

  was:
Hi,

 

Sending non persistent messages on a Transacted session is causing activeMQ to 
replay unnecessary journal files. If all messages are in this situation 
activemq will replay the whole journal file even with a clean shutdown.

 The problem is because is if the transaction has no persistent operation, the 
metadata.lastUpdate is never updated.

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1400]


> ActiveMQ replays journal file on a clean/unclean shutdown with transacted 
> session + Non persistent Messages
> ---
>
> Key: AMQ-7219
> URL: https://issues.apache.org/jira/browse/AMQ-7219
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.9
>Reporter: Alan Protasio
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7219) ActiveMQ replays journal file on a clean/unclean shutdown with transacted session + Non persistent Messages

2019-05-30 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7219:
---
Description: 
Hi,

 

Sending non-persistent messages on a transacted session causes ActiveMQ to 
replay unnecessary journal files. If all messages are sent this way, 
ActiveMQ will replay the whole journal file even after a clean shutdown.

The problem is that if the transaction has no persistent operation, 
metadata.lastUpdate is never updated.

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1400]

  was:
Hi,

 

Sending non persistent messages on a Transacted session is causing activeMQ to 
replay unnecessary journal files. If all messages are in this situation 
activemq will replay the whole journal file even with a clean shutdown.

 The problem is because is the transaction has no persistent operation, the 
metadata.lastUpdate is never updated.

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1400]


> ActiveMQ replays journal file on a clean/unclean shutdown with transacted 
> session + Non persistent Messages
> ---
>
> Key: AMQ-7219
> URL: https://issues.apache.org/jira/browse/AMQ-7219
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.9
>Reporter: Alan Protasio
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7219) ActiveMQ replays journal file on a clean/unclean shutdown with transacted session + Non persistent Messages

2019-05-29 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7219:
---
Summary: ActiveMQ replays journal file on a clean/unclean shutdown with 
transacted session + Non persistent Messages  (was: ActiveMQ replaying journal 
file on a clean/unclean shutdown with transacted session + Non persistent 
Messages)

> ActiveMQ replays journal file on a clean/unclean shutdown with transacted 
> session + Non persistent Messages
> ---
>
> Key: AMQ-7219
> URL: https://issues.apache.org/jira/browse/AMQ-7219
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.9
>Reporter: Alan Protasio
>Priority: Major
>




[jira] [Created] (AMQ-7219) ActiveMQ replaying journal file on a clean/unclean shutdown with transacted session + Non persistent Messages

2019-05-29 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7219:
--

 Summary: ActiveMQ replaying journal file on a clean/unclean 
shutdown with transacted session + Non persistent Messages
 Key: AMQ-7219
 URL: https://issues.apache.org/jira/browse/AMQ-7219
 Project: ActiveMQ
  Issue Type: Bug
  Components: KahaDB
Affects Versions: 5.15.9
Reporter: Alan Protasio


Hi,

 

Sending non-persistent messages on a transacted session causes ActiveMQ to 
replay unnecessary journal files. If all messages are sent this way, 
ActiveMQ will replay the whole journal file even after a clean shutdown.

The problem is that if the transaction has no persistent operation, 
metadata.lastUpdate is never updated.

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1400]




[jira] [Commented] (AMQ-7196) During startup ActiveMq load all the scheduleDB.data on memory causing OOM

2019-05-13 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838883#comment-16838883
 ] 

Alan Protasio commented on AMQ-7196:


Thanks [~cshannon]!! :D:D

> During startup ActiveMq load all the scheduleDB.data on memory causing OOM 
> ---
>
> Key: AMQ-7196
> URL: https://issues.apache.org/jira/browse/AMQ-7196
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.16.0, 5.15.9
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.16.0, 5.15.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I had a broker with lots of scheduled messages and I noticed that during 
> startup (clean or unclean) the broker was reading the whole index file and 
> storing it in memory:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L665]
> In order to prevent the OOM, I changed this method to return an 
> Iterator instead of a List, to avoid loading all this data onto the heap.
>  
> I also noticed that during startup we read the index at least 3 times:
>  
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L829]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L857]
> and 
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L940]
>  
> We can probably optimize this, but that should be another Jira (also, maybe 
> we don't need to call recover(tx) after a clean shutdown):
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L787]
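
The List-to-Iterator change described above can be sketched roughly like this. The names are hypothetical (`LazyJobs`, `readJob`) and the jobs are plain strings; the real method and its element type are in the linked source, not reproduced here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class LazyJobs {
    /** Eager variant: every job is on the heap before the caller sees the first one. */
    static List<String> loadAll(int jobCount) {
        List<String> jobs = new ArrayList<>(jobCount);
        for (int i = 0; i < jobCount; i++) {
            jobs.add(readJob(i));
        }
        return jobs;
    }

    /** Lazy variant: each job is read only when the caller asks for it. */
    static Iterator<String> iterate(int jobCount) {
        return new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < jobCount;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return readJob(next++);
            }
        };
    }

    /** Stands in for reading one entry from the index file. */
    private static String readJob(int i) {
        return "job-" + i;
    }

    public static void main(String[] args) {
        Iterator<String> jobs = iterate(3);
        while (jobs.hasNext()) {
            System.out.println(jobs.next()); // only one job is held at a time
        }
    }
}
```

With the iterator, memory use depends on what the caller keeps while processing each job, not on the total number of scheduled jobs.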




[jira] [Commented] (AMQ-7196) During startup ActiveMq load all the scheduleDB.data on memory causing OOM

2019-05-10 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837551#comment-16837551
 ] 

Alan Protasio commented on AMQ-7196:


Thanks for your input [~tabish121] :D

> During startup ActiveMq load all the scheduleDB.data on memory causing OOM 
> ---
>
> Key: AMQ-7196
> URL: https://issues.apache.org/jira/browse/AMQ-7196
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.16.0, 5.15.9
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Commented] (AMQ-7196) During startup ActiveMq load all the scheduleDB.data on memory causing OOM

2019-05-08 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835861#comment-16835861
 ] 

Alan Protasio commented on AMQ-7196:


Yeah, and the broker cannot start up once it gets into this situation.

With this change the broker can at least start up, but it still reads the index 
file at least 3 times (even after a clean shutdown).

> During startup ActiveMq load all the scheduleDB.data on memory causing OOM 
> ---
>
> Key: AMQ-7196
> URL: https://issues.apache.org/jira/browse/AMQ-7196
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.16.0, 5.15.9
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Created] (AMQ-7196) During startup ActiveMq load all the scheduleDB.data on memory causing OOM

2019-05-08 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7196:
--

 Summary: During startup ActiveMq load all the scheduleDB.data on 
memory causing OOM 
 Key: AMQ-7196
 URL: https://issues.apache.org/jira/browse/AMQ-7196
 Project: ActiveMQ
  Issue Type: Bug
  Components: KahaDB
Affects Versions: 5.15.9, 5.16.0
Reporter: Alan Protasio


I had a broker with lots of scheduled messages and I noticed that during 
startup (clean or unclean) the broker was reading the whole index file and 
storing it in memory:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerImpl.java#L665]

To prevent the OOM, I changed this method to return an Iterator instead of a 
List, which avoids loading all of this data into the heap.
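
To illustrate the difference between the two return types, here is a minimal sketch. It is not the KahaDB code: the class LazyJobIndexSketch, its methods, and the String entries are hypothetical stand-ins for an on-disk index, used only to show that a List has to be fully built before the caller sees anything, while an Iterator can read one entry per call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for an on-disk index; not the KahaDB API. */
final class LazyJobIndexSketch {
    private final int size;
    private int entriesRead = 0; // counts simulated reads from disk

    LazyJobIndexSketch(int size) {
        this.size = size;
    }

    // Simulates reading a single entry from the index file.
    private String readEntry(int position) {
        entriesRead++;
        return "job-" + position;
    }

    int entriesRead() {
        return entriesRead;
    }

    /** Eager variant: every entry is on the heap before the caller sees one. */
    List<String> loadAll() {
        List<String> all = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            all.add(readEntry(i));
        }
        return all;
    }

    /** Lazy variant: one entry is read per next() call. */
    Iterator<String> iterate() {
        return new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < size;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return readEntry(next++);
            }
        };
    }

    public static void main(String[] args) {
        LazyJobIndexSketch index = new LazyJobIndexSketch(1_000_000);
        Iterator<String> jobs = index.iterate();
        jobs.next();
        jobs.next();
        // Only the entries actually consumed have been read so far.
        System.out.println("entries read so far: " + index.entriesRead());
    }
}
```

With the lazy variant the heap only ever holds the entry currently being processed, so memory use does not grow with the size of the index.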

 

I also noticed that during the startup we read the index at least 3 times:

 

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L829]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L857]

and 

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L940]

 

We can probably optimize this, but that should be a separate Jira (also, maybe we 
don't need to call recover(tx) after a clean shutdown)

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/scheduler/JobSchedulerStoreImpl.java#L787]




[jira] [Commented] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2019-03-15 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793783#comment-16793783
 ] 

Alan Protasio commented on AMQ-7112:


[~cshannon] Thanks! :D:D:D:D

> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
> Attachments: brokerA - After Change.png, brokerA - JMX View.png, 
> brokerB - After Change .png, brokerB - Before Change.png, brokerB - JMX 
> view.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false on the 
> local broker. I found this odd, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> While analyzing why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
> I added some screenshots from before and after the change.
>  
> BrokerA -> BrokerB (duplex)
> {quote} 
>  
>  {{         name="connector" password="admin" uri="static:(tcp://localhost:61616)" 
> userName="admin">}}
>  {{        }}
>   
> {quote}
>  
> So, in broker A the bean is:
> org.apache.activemq:type=Broker,brokerName=localhost,connector=networkConnectors,networkConnectorName=connector
>  
> In broker B, the bean name is:
> org.apache.activemq:brokerName=localhost,connector=duplexNetworkConnectors,networkConnectorName=#4,networkBridge=tcp_//127.0.0.1_49610,type=Broker
>  
> As we can see, the connector attribute is different 
> (connector=networkConnectors vs. connector=duplexNetworkConnectors)
>  
>  
>  
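
The difference between the two bean names quoted above can be checked with the standard javax.management.ObjectName class. This is only a sketch: the class ConnectorKeySketch and the lookup pattern are hypothetical, and it is an assumption (not something stated in the issue) that a console would select bridges with a pattern of this shape.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical helper; the two names are copied from the issue description. */
final class ConnectorKeySketch {
    static final String BROKER_A =
            "org.apache.activemq:type=Broker,brokerName=localhost,"
            + "connector=networkConnectors,networkConnectorName=connector";
    static final String BROKER_B =
            "org.apache.activemq:brokerName=localhost,"
            + "connector=duplexNetworkConnectors,networkConnectorName=#4,"
            + "networkBridge=tcp_//127.0.0.1_49610,type=Broker";
    // Assumed lookup pattern, for illustration only.
    static final String PATTERN =
            "org.apache.activemq:type=Broker,connector=networkConnectors,*";

    static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns the value of the "connector" key property. */
    static String connectorOf(String name) {
        return parse(name).getKeyProperty("connector");
    }

    /** True when the given name matches the given ObjectName pattern. */
    static boolean matches(String pattern, String name) {
        return parse(pattern).apply(parse(name));
    }

    public static void main(String[] args) {
        System.out.println("broker A connector: " + connectorOf(BROKER_A));
        System.out.println("broker B connector: " + connectorOf(BROKER_B));
        System.out.println("pattern matches A: " + matches(PATTERN, BROKER_A));
        System.out.println("pattern matches B: " + matches(PATTERN, BROKER_B));
    }
}
```

Any lookup keyed on connector=networkConnectors would therefore skip the bean registered on broker B, which is consistent with the missing information described in the issue.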




[jira] [Commented] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write/read latency

2019-03-12 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790938#comment-16790938
 ] 

Alan Protasio commented on AMQ-7159:


Thanks a lot [~cshannon] :)

> Adding a new attribute on PersistenceAdapterViewMBean to show information 
> about Storage write/read latency
> --
>
> Key: AMQ-7159
> URL: https://issues.apache.org/jira/browse/AMQ-7159
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Minor
> Fix For: 5.16.0, 5.15.9
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




[jira] [Commented] (AMQ-7163) Related to AMQ-7082 - If the broker had an unclean shutdown and number of free pages is Zero after the recovery, the next shutdown will also be "unclean"

2019-03-07 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787388#comment-16787388
 ] 

Alan Protasio commented on AMQ-7163:


Thanks again [~cshannon] :D

> Related to AMQ-7082 - If the broker had an unclean shutdown and number of 
> free pages is Zero after the recovery, the next shutdown will also be 
> "unclean"
> -
>
> Key: AMQ-7163
> URL: https://issues.apache.org/jira/browse/AMQ-7163
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7163) Related to AMQ-7082 - If the broker had an unclean shutdown and number of free pages is Zero after the recovery, the next shutdown will also be "unclean"

2019-03-06 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7163:
---
Description: 
Hi,

This is related to AMQ-7082.

If the broker had an unclean shutdown and the recovery thread didn't find any 
free pages (newFreePages is empty after the recovery), the broker will have a 
second unclean shutdown (and this will happen on every future restart as long 
as the number of free pages is 0)

 

See:

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L484]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L584]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L527]

 

  was:
Hi,

This is related to AMQ-7082.

If the broker had an unclean shutdown and the recovery thread didn't find any 
free pages (newFreePages is empty the recovery), the broker will have a second 
unclean shutdown (and this will happens to any future restart as long as the 
number of free pages is = 0;

 

See:

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L484]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L584]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L527]

 


> Related to AMQ-7082 - If the broker had an unclean shutdown and number of 
> free pages is Zero after the recovery, the next shutdown will also be 
> "unclean"
> -
>
> Key: AMQ-7163
> URL: https://issues.apache.org/jira/browse/AMQ-7163
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7163) Related to AMQ-7082 - If the broker had an unclean shutdown and number of free pages is Zero after the recovery, the next shutdown will also be "unclean"

2019-03-06 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7163:
---
Description: 
Hi,

This is related to AMQ-7082.

If the broker had an unclean shutdown and the recovery thread didn't find any 
free pages (newFreePages is empty after the recovery), the broker will have a 
second unclean shutdown - and this will happen on every future restart as long 
as the number of free pages is 0

 

See:

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L484]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L584]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L527]

 

  was:
Hi,

This is related to AMQ-7082.

If the broker had an unclean shutdown and the recovery thread didn't find any 
free pages (newFreePages is empty the recovery), the broker will have a second 
unclean shutdown (and this will happens to any future restart as long as the 
number of free pages is = 0)

 

See:

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L484]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L584]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L527]

 


> Related to AMQ-7082 - If the broker had an unclean shutdown and number of 
> free pages is Zero after the recovery, the next shutdown will also be 
> "unclean"
> -
>
> Key: AMQ-7163
> URL: https://issues.apache.org/jira/browse/AMQ-7163
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7163) Related to AMQ-7082 - If the broker had an unclean shutdown and number of free pages is Zero after the recovery, the next shutdown will also be "unclean"

2019-03-06 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7163:
---
Description: 
Hi,

This is related to AMQ-7082.

If the broker had an unclean shutdown and the recovery thread didn't find any 
free pages (newFreePages is empty after the recovery), the broker will have a 
second unclean shutdown (and this will happen on every future restart as long 
as the number of free pages is 0)

 

See:

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L484]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L584]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L527]

 

  was:
Hi,

This is related to AMQ-7082.

If the broker had an unclean shutdown and the recovery thread didn't find any 
free pages (newFreePages is empty the recovery), the broker will have a second 
unclean shutdown (and this will remain as to any number of restarts if the 
number of free pages is always = 0;

 

See:

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L484]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L584]

 

https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L527

 


> Related to AMQ-7082 - If the broker had an unclean shutdown and number of 
> free pages is Zero after the recovery, the next shutdown will also be 
> "unclean"
> -
>
> Key: AMQ-7163
> URL: https://issues.apache.org/jira/browse/AMQ-7163
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Created] (AMQ-7163) Related to AMQ-7082 - If the broker had an unclean shutdown and number of free pages is Zero after the recovery, the next shutdown will also be "unclean"

2019-03-06 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7163:
--

 Summary: Related to AMQ-7082 - If the broker had an unclean 
shutdown and number of free pages is Zero after the recovery, the next shutdown 
will also be "unclean"
 Key: AMQ-7163
 URL: https://issues.apache.org/jira/browse/AMQ-7163
 Project: ActiveMQ
  Issue Type: Bug
  Components: KahaDB
Reporter: Alan Protasio


Hi,

This is related to AMQ-7082.

If the broker had an unclean shutdown and the recovery thread didn't find any 
free pages (newFreePages is empty after the recovery), the broker will have a 
second unclean shutdown (and this will remain the case across any number of 
restarts if the number of free pages is always 0)

 

See:

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L484]

 

[https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L584]

 

https://github.com/apache/activemq/blob/9e6543551731ef0241967ca545c9a4956876cb86/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L527

 




[jira] [Commented] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write/read latency

2019-03-06 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786201#comment-16786201
 ] 

Alan Protasio commented on AMQ-7159:


Thank you very much [~cshannon] :D:D:D

> Adding a new attribute on PersistenceAdapterViewMBean to show information 
> about Storage write/read latency
> --
>
> Key: AMQ-7159
> URL: https://issues.apache.org/jira/browse/AMQ-7159
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Minor
> Fix For: 5.16.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




[jira] [Commented] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write/read latency

2019-03-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785032#comment-16785032
 ] 

Alan Protasio commented on AMQ-7159:


[~gtully] [~cshannon]

What do you guys think about this? :)

> Adding a new attribute on PersistenceAdapterViewMBean to show information 
> about Storage write/read latency
> --
>
> Key: AMQ-7159
> URL: https://issues.apache.org/jira/browse/AMQ-7159
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write/read latency

2019-02-28 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7159:
---
Description: 
Hi all,

I was trying to find a way to monitor the real storage write latency observed 
by ActiveMQ and I could not find one.

The only thing I found was a log line:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1142]

I think it would be really useful to have this information on the 
PersistenceAdapterViewMBean.

The proposed change creates a new attribute on this bean called "Statistics", 
which contains write and read time statistics, and an operation to reset it.
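
As a rough illustration of what such a statistic could track, here is a minimal sketch. It is not the code from the pull request: the class LatencyStatisticSketch and all of its method names are hypothetical, and the real attribute exposed on the MBean may be shaped quite differently.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical latency statistic with a reset operation; not the actual change. */
final class LatencyStatisticSketch {
    private long count;
    private long totalNanos;
    private long minNanos = Long.MAX_VALUE;
    private long maxNanos;

    /** Records one observed write or read duration. */
    synchronized void record(long nanos) {
        count++;
        totalNanos += nanos;
        minNanos = Math.min(minNanos, nanos);
        maxNanos = Math.max(maxNanos, nanos);
    }

    synchronized long getCount() {
        return count;
    }

    synchronized long getMinNanos() {
        return count == 0 ? 0 : minNanos;
    }

    synchronized long getMaxNanos() {
        return maxNanos;
    }

    synchronized double getAverageNanos() {
        return count == 0 ? 0.0 : (double) totalNanos / count;
    }

    /** Clears everything recorded so far, like the proposed reset operation. */
    synchronized void reset() {
        count = 0;
        totalNanos = 0;
        minNanos = Long.MAX_VALUE;
        maxNanos = 0;
    }

    public static void main(String[] args) {
        LatencyStatisticSketch writeTime = new LatencyStatisticSketch();
        long start = System.nanoTime();
        // ... a store write would happen here ...
        writeTime.record(System.nanoTime() - start);
        System.out.println("writes=" + writeTime.getCount()
                + " maxMicros=" + TimeUnit.NANOSECONDS.toMicros(writeTime.getMaxNanos()));
        writeTime.reset();
    }
}
```

One instance per direction (one for writes, one for reads) would be enough to back the kind of attribute described above, with reset() mapped to the management operation.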

 

OBS: This information can be extended to all other persistence adapters.

 

Thanks

  was:
Hi all,

I was trying to find a way to monitor the real storage write latency observed 
by ActiveMq and I could not find.

The only thing that I found was a log line:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1142]

 

I think would be really useful to have this information on the 
PersistenceAdapterViewMBean.

 

The change proposed create a new attribute on this bean called "storeTime" and 
a new operation to reset this statistics.

 

OBS: This information can be extended to all other persistence adapters.

 

Thanks


> Adding a new attribute on PersistenceAdapterViewMBean to show information 
> about Storage write/read latency
> --
>
> Key: AMQ-7159
> URL: https://issues.apache.org/jira/browse/AMQ-7159
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write/read latency

2019-02-28 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7159:
---
Summary: Adding a new attribute on PersistenceAdapterViewMBean to show 
information about Storage write/read latency  (was: Adding a new attribute on 
PersistenceAdapterViewMBean to show information about Storage write latency)

> Adding a new attribute on PersistenceAdapterViewMBean to show information 
> about Storage write/read latency
> --
>
> Key: AMQ-7159
> URL: https://issues.apache.org/jira/browse/AMQ-7159
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Commented] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write latency

2019-02-28 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781127#comment-16781127
 ] 

Alan Protasio commented on AMQ-7159:


PR: https://github.com/apache/activemq/pull/349

> Adding a new attribute on PersistenceAdapterViewMBean to show information 
> about Storage write latency
> -
>
> Key: AMQ-7159
> URL: https://issues.apache.org/jira/browse/AMQ-7159
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




[jira] [Updated] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write latency

2019-02-28 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7159:
---
Description: 
Hi all,

I was trying to find a way to monitor the real storage write latency observed 
by ActiveMQ and could not find one.

The only thing that I found was a log line:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1142]

I think it would be really useful to have this information on the 
PersistenceAdapterViewMBean.

The proposed change creates a new attribute on this bean called "storeTime" and 
a new operation to reset this statistic.

Note: This information can be extended to all other persistence adapters.

 

Thanks

  was:
Hi all,

I was trying to find a way to monitor the real storage write latency observed 
by ActiveMq and I could not find.


The only thing that I found was a log line:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1142]

 

I think would be really useful to have this information on the 
PersistenceAdapterViewMBean.

 

The change proposed create a new attribute on this bean called "storeTime" and 
a new operation to reset this statistics.

Thanks


> Adding a new attribute on PersistenceAdapterViewMBean to show information 
> about Storage write latency
> -
>
> Key: AMQ-7159
> URL: https://issues.apache.org/jira/browse/AMQ-7159
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Reporter: Alan Protasio
>Priority: Minor
>
> Hi all,
> I was trying to find a way to monitor the real storage write latency observed 
> by ActiveMQ and could not find one.
> The only thing that I found was a log line:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1142]
>  
> I think it would be really useful to have this information on the 
> PersistenceAdapterViewMBean.
>  
> The proposed change creates a new attribute on this bean called "storeTime" 
> and a new operation to reset this statistic.
>  
> Note: This information can be extended to all other persistence adapters.
>  
> Thanks




[jira] [Created] (AMQ-7159) Adding a new attribute on PersistenceAdapterViewMBean to show information about Storage write latency

2019-02-28 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7159:
--

 Summary: Adding a new attribute on PersistenceAdapterViewMBean to 
show information about Storage write latency
 Key: AMQ-7159
 URL: https://issues.apache.org/jira/browse/AMQ-7159
 Project: ActiveMQ
  Issue Type: Improvement
  Components: KahaDB
Reporter: Alan Protasio


Hi all,

I was trying to find a way to monitor the real storage write latency observed 
by ActiveMQ and could not find one.

The only thing that I found was a log line:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L1142]

I think it would be really useful to have this information on the 
PersistenceAdapterViewMBean.

The proposed change creates a new attribute on this bean called "storeTime" and 
a new operation to reset this statistic.

Thanks
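
To make the proposal concrete, here is a minimal sketch of the kind of
statistic such an attribute could expose. It is not the code from the PR:
the class StoreTimeStatistic, its method names and the string format are
hypothetical and only illustrate a "storeTime" attribute plus a reset
operation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of ActiveMQ: accumulates the time spent in
// storage writes so that a management bean could expose it.
class StoreTimeStatistic {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong maxNanos = new AtomicLong();

    // Called by the store after each write with the measured duration.
    void record(long elapsedNanos) {
        count.incrementAndGet();
        totalNanos.addAndGet(elapsedNanos);
        maxNanos.accumulateAndGet(elapsedNanos, Math::max);
    }

    long getCount() { return count.get(); }

    long getMaxNanos() { return maxNanos.get(); }

    double getAverageNanos() {
        long n = count.get();
        return n == 0 ? 0.0 : (double) totalNanos.get() / n;
    }

    // What a "storeTime" attribute might return (the format is an assumption).
    String getStoreTime() {
        return "count=" + getCount() + " maxNanos=" + getMaxNanos()
                + " avgNanos=" + getAverageNanos();
    }

    // What the proposed reset operation might do.
    void reset() {
        count.set(0);
        totalNanos.set(0);
        maxNanos.set(0);
    }
}
```

The store would call record() around each write, and the MBean would only
delegate to getStoreTime() and reset().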




[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-05 Thread Alan Protasio (JIRA)

[
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760958#comment-16760958
]

Alan Protasio edited comment on AMQ-7080 at 2/5/19 3:54 PM:

I'm trying to do it here... but every time I think about it, I find a case
where I can break it.

I am now implementing it with a HashIndex where FreePageKey is just a Long
with a customized hashCode. This would achieve the same goal to "restrict the
updates to the pages that have changed", and also the "preallocated block of
the first 10 pages or something like that", since the HashIndex preallocates
'binCapacity'. In this solution, pages 1-1000, for instance, would have the
same hash and so would be stored in the same bin (page) inside the HashIndex,
so if pages 1-1000 get modified, only the first page of the index would be
written.
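
The block-based hashing described above could look roughly like the sketch
below. It is an illustration only: the block size of 1000, the block()
helper and the BinDemo class are assumptions, and "hashCode modulo
binCapacity" is used as a stand-in for however the real index picks a bin.

```java
import java.util.TreeSet;

// Hypothetical key type: wraps a page id and hashes by block so that
// neighbouring pages share a bin. PAGES_PER_BIN = 1000 is an assumption.
final class FreePageKey {
    static final long PAGES_PER_BIN = 1000;
    final long pageId;

    FreePageKey(long pageId) { this.pageId = pageId; }

    // Pages 1..1000 map to block 0, 1001..2000 to block 1, and so on.
    long block() { return (pageId - 1) / PAGES_PER_BIN; }

    @Override public int hashCode() { return Long.hashCode(block()); }

    @Override public boolean equals(Object o) {
        return o instanceof FreePageKey && ((FreePageKey) o).pageId == pageId;
    }
}

final class BinDemo {
    // Returns the distinct bins touched by a set of modified pages, using
    // hashCode modulo binCapacity as a stand-in for the index's bin choice.
    static TreeSet<Integer> binsTouched(long[] pages, int binCapacity) {
        TreeSet<Integer> bins = new TreeSet<>();
        for (long p : pages) {
            bins.add(Math.floorMod(new FreePageKey(p).hashCode(), binCapacity));
        }
        return bins;
    }
}
```

With this scheme, modifying pages 1, 500 and 1000 touches a single bin,
while pages 1000 and 1001 fall into two different bins.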

Then I created a PageFile method to update this index:

void updatePageMap(Transaction tx, SequenceSet freeList, SequenceSet
allocatedList)

It is invoked on Transaction#commit.

The problem here is the circular self-dependency. When I call updatePageMap
inside transaction.commit, I pass the transaction's free and allocated
pages. But updatePageMap itself can allocate and write more pages using
the same transaction. If updatePageMap allocates one page, for instance, the
freeList will hold stale data.

I also cannot create another transaction to call updatePageMap, as this second
transaction itself has to be committed, creating yet another transaction.

To make it short...

If I use the same transaction to update the SequenceSet index, I can have stale
data:

Transaction.commit -> UpdateIndex with freePages data (this operation can
allocate or free pages) -> if UpdateIndex allocated/freed pages then we
need UpdateIndex again.

We could try to update the SequenceSet index until it stabilizes and stops
freeing or allocating pages (this does not seem right).

If I use another transaction to call UpdateIndex, this new transaction will
also be committed, creating a new transaction.

This will happen with any index implementation. :(
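
The "update until it stabilizes" idea can be shown with a toy model. None of
this is KahaDB code: ToyFreePageIndex, its page size of 4 entries and the
CommitLoop helper are hypothetical, and only show why one commit may need
several index updates.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not KahaDB code): an index that stores free-page records and
// must allocate one page of its own for every ENTRIES_PER_PAGE records.
final class ToyFreePageIndex {
    static final int ENTRIES_PER_PAGE = 4; // assumption for the demo
    private int entries = 0;
    private int pagesOwned = 0;
    private long nextPageId = 100;

    // Records the given page changes and returns the pages the index had to
    // allocate for itself while doing so.
    List<Long> update(List<Long> changedPages) {
        entries += changedPages.size();
        List<Long> allocated = new ArrayList<>();
        while (pagesOwned * ENTRIES_PER_PAGE < entries) {
            pagesOwned++;
            allocated.add(nextPageId++);
        }
        return allocated;
    }
}

final class CommitLoop {
    // Repeats the index update until it stops allocating pages and returns
    // the number of rounds that were needed.
    static int commit(ToyFreePageIndex index, List<Long> changedPages) {
        int rounds = 0;
        List<Long> pending = changedPages;
        while (!pending.isEmpty()) {
            pending = index.update(pending);
            rounds++;
        }
        return rounds;
    }
}
```

In this model the loop terminates because each round records fewer new
pages than the previous one, but every extra round is an extra write inside
the commit, which is the cost being questioned above.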

This is the change I was making. See:

[https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-1de6e29861ce2a712a3bd575dd25a013R663]

[https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-b36dac7750d0c2eb9cd7e102704bee77R817]

I know you are not comfortable with this change... that's why I'm asking
whether it would be better to put it behind a feature flag (config) and bring
back writing data.free on clean shutdown (so nothing changes at all if this
flag is not enabled).

was (Author: alanprot):
I'm trying to do it here... but every time that think about i find a case where
i can break it..

Then I created a pageFile Method to update this index:

void updatePageMap(Transaction tx, SequenceSet freeList, SequenceSet
allocatedList)

That was invoked on Transaction#commit

I cannot also create another transaction to call updatePageMap. As this second
transaction itself has to be committed creating another transaction.

To make short...

If i use the same transaction to update the SequenceSet index, I can have stale
data.

Transaction.commit -> UpdateIndex with freePages data (this operation can
allocate or free pages) -> If the UpdateIndex allocated/freed pages then we
need UpdateIndex again.

We can try to update the SequenceSet index until it stabilize and stop freeing
or allocating page (this seems not right).

If i use other transaction to call UpdateIndex, this new transaction will also
be committed creating a new transaction.

This will happen if i use any index implementation. :(

This is the change i was doing. See:

[https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-1de6e29861ce2a712a3bd575dd25a013R663]

[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760958#comment-16760958
 ] 

Alan Protasio edited comment on AMQ-7080 at 2/5/19 3:52 PM:


I'm trying to do it here... but every time that think about i find a case where 
i can break it..

 

Now I was implementing using HashIndex where FreePageKey 
is only a Long with a customized hashCode (and so, we would achieve the same 
goal to "restrict the updates to the pages that have changed" - and also " 
preallocated block of the first 10 pages or something like that." as the 
hashIndex preallocate 'binCapacity'); In this solution i was thinking that for 
instances pages from 1-1000 would have the same hash and so storage in the same 
Bin (page) inside the HashIndex - so if pages 1-1000 get modified, only the 
first page of the index would be written).

 

Then I created a pageFile Method to update this index:

void updatePageMap(Transaction tx, SequenceSet freeList, SequenceSet 
allocatedList)

That was invoked on Transaction#commit

The problem here is the circular self dependency. When I call updatePageMap 
inside the transaction.commit, i pass the transaction's free and allocated 
pages. But the "updatePageMap" itself can allocate and write more pages using 
the same transaction. If updatePageMap allocate one page for instance, the 
freeList will have a stale data.

I cannot also create another transaction to call updatePageMap. As this second 
transaction itself has to be committed creating another transaction. 

To make short... 

If i use the same transaction to update the SequenceSet index, I can have stale 
data.

Transaction.commit -> UpdateIndex with freePages data (this operation can 
allocate or free pages)  -> If the UpdateIndex allocated/freed pages then we 
need UpdateIndex again.

We can try to update the SequenceSet index until it stabilize and stop freeing 
or allocating page (this seems not right).

If i use other transaction to call UpdateIndex, this new transaction will also 
be committed creating a new transaction.

 

This will happen if i use any index implementation. :(

 

This is the change i was doing. See:

[https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-1de6e29861ce2a712a3bd575dd25a013R663]

[https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-b36dac7750d0c2eb9cd7e102704bee77R817]

 

 


was (Author: alanprot):
I'm trying to do it here... but every time that think about i find a case where 
i can break it..

 

Now I was implementing using HashIndex where FreePageKey 
is only a Long with a customized hashCode (and so, we would achieve the same 
goal to "restrict the updates to the pages that have changed" - and also " 
preallocated block of the first 10 pages or something like that." as the 
hashIndex preallocate 'binCapacity'); In this solution i was thinking that for 
instances pages from 1-1000 would have the same hash and so storage in the same 
Bin (page) inside the HashIndex - so if pages 1-1000 get modified, only the 
first page of the index would be written).

 

Then I created a pageFile Method to update this index:

void updatePageMap(Transaction tx, SequenceSet freeList, SequenceSet 
allocatedList)

That was invoked on Transaction#commit

The problem here is the circular self dependency. When I call updatePageMap 
inside the transaction.commit, i pass the transaction's free and allocated 
pages. But the "updatePageMap" itself can allocate and write more pages using 
the same transaction. If updatePageMap allocate one page for instance, the 
freeList will have a stale data.

I cannot also create another transaction to call updatePageMap. As this second 
transaction itself has to be committed creating another transaction. 

To make short... 

If i use the same transaction to update the SequenceSet index, I can have stale 
data.

Transaction.commit -> UpdateIndex with freePages data (this operation can 
allocate or free pages)  -> If the UpdateIndex allocated/freed pages then we 
need UpdateIndex again.

We can try to update the SequenceSet index until it stabilize and stop freeing 
or allocating page (this seems not right).

If i use other transaction to call UpdateIndex, this new transaction will also 
be committed creating a new transaction.

This is the change i was doing. See:

[https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-1de6e29861ce2a712a3bd575dd25a013R663]

https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-b36dac7750d0c2eb9cd7e102704bee77R817

 

 

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project:

[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760958#comment-16760958
 ] 

Alan Protasio commented on AMQ-7080:


I'm trying to do it here... but every time that think about i find a case where 
i can break it..

 

Now I was implementing using HashIndex where FreePageKey 
is only a Long with a customized hashCode (and so, we would achieve the same 
goal to "restrict the updates to the pages that have changed" - and also " 
preallocated block of the first 10 pages or something like that." as the 
hashIndex preallocate 'binCapacity'); In this solution i was thinking that for 
instances pages from 1-1000 would have the same hash and so storage in the same 
Bin (page) inside the HashIndex - so if pages 1-1000 get modified, only the 
first page of the index would be written).

 

Then I created a pageFile Method to update this index:

void updatePageMap(Transaction tx, SequenceSet freeList, SequenceSet 
allocatedList)

That was invoked on Transaction#commit

The problem here is the circular self dependency. When I call updatePageMap 
inside the transaction.commit, i pass the transaction's free and allocated 
pages. But the "updatePageMap" itself can allocate and write more pages using 
the same transaction. If updatePageMap allocate one page for instance, the 
freeList will have a stale data.

I cannot also create another transaction to call updatePageMap. As this second 
transaction itself has to be committed creating another transaction. 

To make short... 

If i use the same transaction to update the SequenceSet index, I can have stale 
data.

Transaction.commit -> UpdateIndex with freePages data (this operation can 
allocate or free pages)  -> If the UpdateIndex allocated/freed pages then we 
need UpdateIndex again.

We can try to update the SequenceSet index until it stabilize and stop freeing 
or allocating page (this seems not right).

If i use other transaction to call UpdateIndex, this new transaction will also 
be committed creating a new transaction.

This is the change i was doing. See:

[https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-1de6e29861ce2a712a3bd575dd25a013R663]

https://github.com/alanprot/activemq/commit/18730a14db2f88d5a2ac8765fa1de50864fd2c4c#diff-b36dac7750d0c2eb9cd7e102704bee77R817

 

 

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ reads 
> the whole index during shutdown, searching for free pages, and then saves the 
> db.free file. This operation can take a long time, making the failover 
> slower (during the shutdown, ActiveMQ will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> It is important to note that if the shutdown takes more than 
> ACTIVEMQ_KILL_MAXSECONDS seconds, any following shutdown will be unclean. 
> The broker will stay in this state unless the index is deleted (this state 
> means that every failover will take more than ACTIVEMQ_KILL_MAXSECONDS, so if 
> you increase this time to 5 minutes, your failover can take more than 5 
> minutes).
>  
> In order to prevent ActiveMQ from reading the whole index file to search for 
> free pages, we can keep track of them on every checkpoint. To do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have an attribute in the db.free page that is referenced by db.data.
> So during the checkpoint we:
> 1 - Save db.free and give it a freePageUniqueId
> 2 - Save this freePageUniqueId in db.data (metadata)
> After a crash, we can check whether db.data has the same freePageUniqueId as 
> db.free. If so, we can safely use the free page information contained in 
> db.free.
> Now, the only way we read the whole index file again is if the crash happens 
> between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, which can possibly increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in db.free 
> as it

[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760790#comment-16760790
 ] 

Alan Protasio edited comment on AMQ-7080 at 2/5/19 1:42 PM:


I see... I did it this way to keep it simple, with the minimum performance
impact.

What is your suggestion? I see some possibilities:
 * create a HashIndex.
 * update the bit map at the end of the pages (end of the index file)
 * write the bit map (or the sequenceSet) in pages (with overflow)

With the first solution, we will write way more bytes to keep track of the 
same information, and we will also have to read all those bytes during 
startup (key = pageFile and value = true if page is free).

With the second solution, every time we allocate a new page we will have to 
write the whole map again.

With the third one, we will always have to write the whole bitmap (or 
sequenceSet) into pages. I thought that this could cause more performance 
degradation than the proposed solution. Remember that with the proposed 
solution only the bytes that changed are updated.

The last option that I can think of is to create a sequenceset index. This can 
be a little better than the first but not by much in a fragmented pagefile.
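
To illustrate why fragmentation matters for a range-based structure, the
sketch below counts how many contiguous runs of free pages exist. It is only
a toy model built on java.util.BitSet, not the ActiveMQ SequenceSet; the
FreeListSize class is hypothetical.

```java
import java.util.BitSet;

final class FreeListSize {
    // Counts the contiguous runs of free pages, i.e. how many ranges a
    // range-based set would need to describe them.
    static int rangeCount(BitSet freePages) {
        int ranges = 0;
        int start = freePages.nextSetBit(0);
        while (start >= 0) {
            // First non-free page after the current run.
            int end = freePages.nextClearBit(start);
            ranges++;
            start = freePages.nextSetBit(end);
        }
        return ranges;
    }
}
```

A thousand contiguous free pages form a single range, while the same file
with every other page free needs 500 ranges, so the range-based form loses
most of its advantage once the pagefile is fragmented.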

Do you have any other idea [~gtully]?


was (Author: alanprot):
I see... I did this way to keep it simple with the minimum performance impact.

What is your suggestion? I see some possibilities:
 * create a listindex.
 * update the map bit at the end of the pages (end of the index file)

The first solution we will write way more bytes to keep track of the same 
information. And also, we will have to read all those bytes during startup.

The second solution, everything that we allocate a new page we will have to 
write the whole map again.

The last option that I can think of is create a sequenceset index. This can be 
a little better than the first but not a lot in a fragmented pagefile.

Do you have any other idea [~gtully]?

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In a event of an unclean shutdown, Activemq loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ read 
> the whole index during shutdown searching for free pages and then save the 
> db.free file. This operation can take a long time, making the failover 
> slower. (during the shutdown, activemq will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> Is important to note if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS 
> seconds, any following shutdown will be unclean. This broker will stay in 
> this state unless the index is deleted (this state means that every failover 
> will take more then ACTIVEMQ_KILL_MAXSECONDS, so, if you increase this time 
> to 5 minutes, you fail over can take more than 5 minutes).
>  
> In order to prevent ActiveMQ reading the whole index file to search for free 
> pages, we can keep track of those on every Checkpoint. In order to do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have a attribute in the db.free page that is referenced by the db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> In a crash, we can see if the db.data has the same freePageUniqueId as the 
> db.free. If this is the case we can safely use the free page information 
> contained in the db.free
> Now, the only way to read the whole index file again is IF the crash happens 
> btw step 1 and 2 (what is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, what can possibly increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in 
> db.free, as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 gets occupied.
> T3 -> Crash
> In the current scenario, after PageFile#load P1 will be free, and then 
> the replay will mark P1 as occupied or will occupy another page (now that 
> the recovery of free pages is done on shutdown).
> This change only make

[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760790#comment-16760790
 ] 

Alan Protasio edited comment on AMQ-7080 at 2/5/19 1:20 PM:


I see... I did this way to keep it simple with the minimum performance impact.

What is your suggestion? I see some possibilities:
 * create a listindex.
 * update the map bit at the end of the pages (end of the index file)

The first solution we will write way more bytes to keep track of the same 
information. And also, we will have to read all those bytes during startup.

The second solution, everything that we allocate a new page we will have to 
write the whole map again.

The last option that I can think of is create a sequenceset index. This can be 
a little better than the first but not a lot in a fragmented pagefile.

Do you have any other ideias [~gtully]?


was (Author: alanprot):
I see... I did this way to keep it simple with the minimum performance impact. 

What is your suggestion? I see some possibilities:

* create a listindex.
* update the map bit at the end of the pages (end of the index file)

The first solution we will write way more bytes to keep track of the same 
information. And also, we will have to read all those bytes during startup.

The second solution, everything that we allocate a new page we will have to 
write the whole map again.

The last option that I can think of is create a sequenceset index. This can be 
a little better than the first but not a lot in a fragmented pagefile.

Do you have any other ideias @gtully? 

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In a event of an unclean shutdown, Activemq loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ read 
> the whole index during shutdown searching for free pages and then save the 
> db.free file. This operation can take a long time, making the failover 
> slower. (during the shutdown, activemq will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> Is important to note if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS 
> seconds, any following shutdown will be unclean. This broker will stay in 
> this state unless the index is deleted (this state means that every failover 
> will take more then ACTIVEMQ_KILL_MAXSECONDS, so, if you increase this time 
> to 5 minutes, you fail over can take more than 5 minutes).
>  
> In order to prevent ActiveMQ reading the whole index file to search for free 
> pages, we can keep track of those on every Checkpoint. In order to do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have a attribute in the db.free page that is referenced by the db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> In a crash, we can see if the db.data has the same freePageUniqueId as the 
> db.free. If this is the case we can safely use the free page information 
> contained in the db.free
> Now, the only way to read the whole index file again is IF the crash happens 
> btw step 1 and 2 (what is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, what can possibly increase the checkpoint time.
> Is also important to note that we CAN (and should) have stale data in db.free 
> as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario after the  Pagefile#load the P1 will be free and then 
> the replay will mark P1 as occupied or will occupied another page (now that 
> the recovery of free pages is done on shutdown)
> This change only make sure that db.data and db.free are in sync and showing 
> the reality in T1 (checkpoint), If they are in sync we can trust the db.free.
> This is a really fast draft of what i'm suggesting... If you guys agree, i 
> can create the proper patch after:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to

[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760790#comment-16760790
 ] 

Alan Protasio edited comment on AMQ-7080 at 2/5/19 1:21 PM:


I see... I did this way to keep it simple with the minimum performance impact.

What is your suggestion? I see some possibilities:
 * create a listindex.
 * update the map bit at the end of the pages (end of the index file)

The first solution we will write way more bytes to keep track of the same 
information. And also, we will have to read all those bytes during startup.

The second solution, everything that we allocate a new page we will have to 
write the whole map again.

The last option that I can think of is create a sequenceset index. This can be 
a little better than the first but not a lot in a fragmented pagefile.

Do you have any other idea [~gtully]?


was (Author: alanprot):
I see... I did this way to keep it simple with the minimum performance impact.

What is your suggestion? I see some possibilities:
 * create a listindex.
 * update the map bit at the end of the pages (end of the index file)

The first solution we will write way more bytes to keep track of the same 
information. And also, we will have to read all those bytes during startup.

The second solution, everything that we allocate a new page we will have to 
write the whole map again.

The last option that I can think of is create a sequenceset index. This can be 
a little better than the first but not a lot in a fragmented pagefile.

Do you have any other ideias [~gtully]?

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In a event of an unclean shutdown, Activemq loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ read 
> the whole index during shutdown searching for free pages and then save the 
> db.free file. This operation can take a long time, making the failover 
> slower. (during the shutdown, activemq will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> Is important to note if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS 
> seconds, any following shutdown will be unclean. This broker will stay in 
> this state unless the index is deleted (this state means that every failover 
> will take more then ACTIVEMQ_KILL_MAXSECONDS, so, if you increase this time 
> to 5 minutes, you fail over can take more than 5 minutes).
>  
> In order to prevent ActiveMQ reading the whole index file to search for free 
> pages, we can keep track of those on every Checkpoint. In order to do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have a attribute in the db.free page that is referenced by the db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> In a crash, we can see if the db.data has the same freePageUniqueId as the 
> db.free. If this is the case we can safely use the free page information 
> contained in the db.free
> Now, the only way to read the whole index file again is IF the crash happens 
> btw step 1 and 2 (what is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, what can possibly increase the checkpoint time.
> Is also important to note that we CAN (and should) have stale data in db.free 
> as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario after the  Pagefile#load the P1 will be free and then 
> the replay will mark P1 as occupied or will occupied another page (now that 
> the recovery of free pages is done on shutdown)
> This change only make sure that db.data and db.free are in sync and showing 
> the reality in T1 (checkpoint), If they are in sync we can trust the db.free.
> This is a really fast draft of what i'm suggesting... If you guys agree, i 
> can create the proper patch after:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to

[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760790#comment-16760790
 ] 

Alan Protasio commented on AMQ-7080:


I see... I did it this way to keep it simple, with minimal performance impact.

What is your suggestion? I see some possibilities:

* create a ListIndex.
* update the bitmap at the end of the pages (end of the index file)

With the first solution, we will write far more bytes to keep track of the same
information, and we will also have to read all those bytes during startup.

With the second solution, every time we allocate a new page we will have to
write the whole map again.

The last option that I can think of is to create a SequenceSet index. This can
be a little better than the first, but not by much in a fragmented page file.

Do you have any other ideas @gtully?
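
To make the trade-off between these options a bit more concrete, here is a rough Java sketch. It is not KahaDB code: `FreePageTrackingSketch`, `bitmapBytes` and `rangeCount` are hypothetical names, and "one bit per page" is an assumed layout. It only illustrates that a bitmap has a fixed size for a given page count, while anything that stores free ranges grows as the free pages become fragmented.

```java
import java.util.BitSet;

final class FreePageTrackingSketch {
    // Bytes needed for a bitmap with one bit per page (assumed layout).
    static long bitmapBytes(long totalPages) {
        return (totalPages + 7) / 8;
    }

    // Number of contiguous free ranges a range-based structure would have to store.
    static int rangeCount(BitSet freePages) {
        int ranges = 0;
        int start = freePages.nextSetBit(0);
        while (start >= 0) {
            ranges++;
            int end = freePages.nextClearBit(start); // first non-free page after this range
            start = freePages.nextSetBit(end);       // -1 when no free page is left
        }
        return ranges;
    }

    public static void main(String[] args) {
        int totalPages = 1024;

        BitSet contiguous = new BitSet(totalPages);
        contiguous.set(0, 512); // pages 0..511 free: a single range

        BitSet fragmented = new BitSet(totalPages);
        for (int page = 0; page < totalPages; page += 2) {
            fragmented.set(page); // every other page free: worst case for ranges
        }

        System.out.println("bitmap bytes: " + bitmapBytes(totalPages));
        System.out.println("ranges (contiguous): " + rangeCount(contiguous));
        System.out.println("ranges (fragmented): " + rangeCount(fragmented));
    }
}
```

The same number of free pages costs one range in the first case and 512 ranges in the second, while the bitmap stays at 128 bytes in both.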

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the
> free pages in the index. In order to recover this information, ActiveMQ reads
> the whole index during shutdown, searching for free pages, and then saves the
> db.free file. This operation can take a long time, making the failover
> slower (during the shutdown, ActiveMQ still holds the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> It is important to note that if the shutdown takes more than
> ACTIVEMQ_KILL_MAXSECONDS seconds, any following shutdown will be unclean. The
> broker will stay in this state unless the index is deleted (this state means
> that every failover will take more than ACTIVEMQ_KILL_MAXSECONDS, so if you
> increase this time to 5 minutes, your failover can take more than 5 minutes).
>  
> In order to prevent ActiveMQ from reading the whole index file to search for
> free pages, we can keep track of them on every checkpoint. In order to do
> that, we need to be sure that db.data and db.free are in sync. To achieve
> that, we can have an attribute in the db.free page that is referenced by
> db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give it a freePageUniqueId
> 2 - Save this freePageUniqueId in db.data (metadata)
> After a crash, we can check whether db.data has the same freePageUniqueId as
> db.free. If this is the case, we can safely use the free page information
> contained in db.free.
> Now, the only way we have to read the whole index file again is if the crash
> happens between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we will have to save db.free
> during the checkpoint, which can possibly increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in
> db.free, as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario, after PageFile#load, P1 will be free, and then
> the replay will mark P1 as occupied or will occupy another page (now that
> the recovery of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and show
> the reality at T1 (checkpoint). If they are in sync, we can trust db.free.
> This is a really quick draft of what I'm suggesting... If you guys agree, I
> can create the proper patch after:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590
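
The two checkpoint steps and the post-crash check described in the quoted issue can be sketched roughly as follows. This is only an illustration in Java, not the code from the linked commit: `FreeListSyncSketch`, `FreeFile`, `DataMetadata` and their methods are hypothetical in-memory stand-ins for db.free and the db.data metadata, and the id is assumed to be a simple counter.

```java
import java.util.Set;
import java.util.TreeSet;

final class FreeListSyncSketch {
    // Hypothetical stand-in for the db.free file.
    static final class FreeFile {
        long freePageUniqueId = -1;
        Set<Long> freePages = new TreeSet<>();
    }

    // Hypothetical stand-in for the metadata stored in db.data.
    static final class DataMetadata {
        long freePageUniqueId = -2;
    }

    private long nextId = 1;

    // Step 1: save db.free and give it a fresh freePageUniqueId.
    long saveFreeFile(FreeFile free, Set<Long> currentFreePages) {
        long id = nextId++;
        free.freePages = new TreeSet<>(currentFreePages);
        free.freePageUniqueId = id;
        return id;
    }

    // Step 2: save the same freePageUniqueId in the db.data metadata.
    void saveMetadata(DataMetadata meta, long id) {
        meta.freePageUniqueId = id;
    }

    // After a crash: db.free is trusted only if both ids match.
    static boolean canTrustFreeFile(FreeFile free, DataMetadata meta) {
        return free.freePageUniqueId == meta.freePageUniqueId;
    }

    public static void main(String[] args) {
        FreeListSyncSketch store = new FreeListSyncSketch();
        FreeFile free = new FreeFile();
        DataMetadata meta = new DataMetadata();

        // T1: a complete checkpoint with P1, P2 and P3 free.
        long id = store.saveFreeFile(free, Set.of(1L, 2L, 3L));
        store.saveMetadata(meta, id);
        System.out.println("after full checkpoint: " + canTrustFreeFile(free, meta));

        // A crash between step 1 and step 2 leaves the ids out of sync,
        // so the whole index would have to be scanned again.
        store.saveFreeFile(free, Set.of(2L, 3L));
        System.out.println("after interrupted checkpoint: " + canTrustFreeFile(free, meta));
    }
}
```

When the ids match, the free list reflects the state at the last checkpoint (T1 in the timeline), and the journal replay then takes care of anything that changed afterwards.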



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-02-04 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760407#comment-16760407
 ] 

Alan Protasio commented on AMQ-7080:


Hi [~gtully]

Have you had a chance to look at the change and the perf numbers? I can put
this behind a feature flag if you think that makes more sense
(enablePageMapFile or something like that). :D

 

Thanks! :D





[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-31 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757609#comment-16757609
 ] 

Alan Protasio commented on AMQ-7080:


Hi...

All the tests seem to be OK!

We can put this behind a feature flag if you guys think that this can have a
performance impact. But in my tests the impact seems to be negligible.





[jira] [Commented] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, causing poor performance on high latency FS

2019-01-31 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757442#comment-16757442
 ] 

Alan Protasio commented on AMQ-7143:


Thanks [~cshannon] :)

> Temporary transaction file (PageFile) being opened and closed many times, 
> causing poor performance on high latency FS
> -
>
> Key: AMQ-7143
> URL: https://issues.apache.org/jira/browse/AMQ-7143
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>
> Hi,
> This is an optimization for the case where we have a transaction with many
> writes (bigger than 10 MB by default) and ActiveMQ creates a temporary
> transaction file (PageFile transaction).
> The problem is that when this transaction is committed, this temporary file
> is opened and closed many times (once per write inside the transaction),
> causing poor performance (this operation "freezes the world" during the
> checkpoint).
>  
> See:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]
>  
> Note that if the recoveryFile is enabled, we open and close the temporary
> file 3 times for each index write.
> There is also a small bug: if the transaction is rolled back, the
> temporary file is left there forever, see:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]
>  
> As the transaction is rolled back, no writes will be performed, and so the
> file will never be deleted.
> The proposed change is to use the same "RandomAccessFile" object used in the
> transaction and close it when all writes are done (so we stop opening and
> closing it for each write, and only do it once per transaction).
>  
> Thanks
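
The difference between the two access patterns can be illustrated with a small Java sketch. It is not the PageFile code: `TmpFileReuseSketch` and its methods are hypothetical, and it assumes that each write in the commit needs to read its payload back from the temporary file. The first method opens and closes the file once per write; the second shares a single `RandomAccessFile` and closes it once at the end.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

final class TmpFileReuseSketch {
    // Behaviour described in the issue: one open/close cycle per write.
    static byte[] readOpeningEachTime(File tmp, long[] offsets, int len) throws IOException {
        byte[] out = new byte[offsets.length * len];
        for (int i = 0; i < offsets.length; i++) {
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "r")) {
                raf.seek(offsets[i]);
                raf.readFully(out, i * len, len);
            }
        }
        return out;
    }

    // Proposed behaviour: one handle, closed once all writes are done.
    static byte[] readWithSharedHandle(File tmp, long[] offsets, int len) throws IOException {
        byte[] out = new byte[offsets.length * len];
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "r")) {
            for (int i = 0; i < offsets.length; i++) {
                raf.seek(offsets[i]);
                raf.readFully(out, i * len, len);
            }
        }
        return out;
    }

    // Writes a small temporary file, reads it both ways, and reports whether the results match.
    static boolean demo() {
        File tmp = null;
        try {
            tmp = File.createTempFile("tx-sketch", ".tmp");
            Files.write(tmp.toPath(), new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
            long[] offsets = {4, 0};
            byte[] a = readOpeningEachTime(tmp, offsets, 4);
            byte[] b = readWithSharedHandle(tmp, offsets, 4);
            return Arrays.equals(a, b) && a[0] == 5 && a[4] == 1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Delete the temporary file on every path, including failures.
            if (tmp != null) {
                tmp.delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("same result: " + demo());
    }
}
```

Both methods return the same bytes; only the number of open/close calls differs, which is what matters on a high-latency file system. Deleting the file in a `finally` block also mirrors the point about rollback: cleanup should not depend on the writes having happened.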




[jira] [Updated] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, causing poor performance on high latency FS

2019-01-31 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7143:
---
Summary: Temporary transaction file (PageFile) being opened and closed many 
times, causing poor performance on high latency FS  (was: Temporary transaction 
file (PageFile) being opened and closed many times, leading poor performance)





[jira] [Commented] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-31 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757246#comment-16757246
 ] 

Alan Protasio commented on AMQ-7143:


[~cshannon] Good catch! thanks! 

 

I updated the PR! :)





[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756477#comment-16756477
 ] 

Alan Protasio edited comment on AMQ-7080 at 1/30/19 11:11 PM:
--

Hi [~gtully]

I ran some benchmarks and it seems that the performance hit is negligible...
See the results here:
[https://github.com/alanprot/activemq/blob/master/AMQ-7080.benchmark.md]

I also did an extra test and sent 1 million messages to 1 queue. The result was
a db.data file of 346M and a db.map of only 16K.

346M data/kahadb/db.data
 16K data/kahadb/db.map

Keep in mind that no matter how fragmented the pages are, this file
will not grow with the fragmentation.

Also, during checkpoints, only the bytes that matter in db.map are
updated (not the whole file).

I also updated the PR. Now we also save db.map on a clean shutdown, and we
no longer save db.free. We still read db.free on startup for migration
purposes.

I'm running all the tests now!
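
As a rough illustration of why such a map stays small and why a checkpoint only needs to touch a few bytes, here is a Java sketch. It is not the db.map implementation from the PR: `PageMapSketch` is a hypothetical in-memory class, and both the one-bit-per-page layout and the 4096-byte page size used in `main` are assumptions.

```java
import java.util.TreeSet;

final class PageMapSketch {
    private final byte[] map;                                // one bit per page (assumed layout)
    private final TreeSet<Integer> dirtyBytes = new TreeSet<>();

    PageMapSketch(long totalPages) {
        map = new byte[(int) ((totalPages + 7) / 8)];
    }

    // Marks a page as free or occupied and remembers which byte changed.
    void setFree(long pageId, boolean free) {
        int idx = (int) (pageId >>> 3);
        int mask = 1 << (int) (pageId & 7);
        byte before = map[idx];
        map[idx] = (byte) (free ? (before | mask) : (before & ~mask));
        if (map[idx] != before) {
            dirtyBytes.add(idx);
        }
    }

    boolean isFree(long pageId) {
        return (map[(int) (pageId >>> 3)] & (1 << (int) (pageId & 7))) != 0;
    }

    int sizeInBytes() {
        return map.length;
    }

    // Byte offsets a checkpoint would have to rewrite; the set is reset afterwards.
    TreeSet<Integer> drainDirty() {
        TreeSet<Integer> result = new TreeSet<>(dirtyBytes);
        dirtyBytes.clear();
        return result;
    }

    public static void main(String[] args) {
        long pageSize = 4096;                                // assumed page size
        long totalPages = 346L * 1024 * 1024 / pageSize;     // pages in a 346M db.data
        PageMapSketch pageMap = new PageMapSketch(totalPages);

        pageMap.setFree(0, true);
        pageMap.setFree(1, true);
        pageMap.setFree(80000, true);

        System.out.println("pages: " + totalPages);
        System.out.println("map bytes: " + pageMap.sizeInBytes());
        System.out.println("bytes to rewrite: " + pageMap.drainDirty());
    }
}
```

Under these assumptions the map size depends only on the number of pages, not on how the free pages are distributed, and three page updates dirty just two bytes of the map.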

 

 


was (Author: alanprot):
Hi [~gtully]

I did some benchmark here and seems that the performance hit is negligible... 
See the result here: 
[https://github.com/alanprot/activemq/blob/master/AMQ-7080.benchmark.md]

Also i did a extra test and sent 1 million messages to 1 queue. The result was 
a db.data file with 346M and db.map with only 16K.

346M data/kahadb/db.data
 16K data/kahadb/db.map

Keep in mind that does not matter how much the pages is fragmented, this file 
will not grow with the fragmentation.

Also during checkpoints, only the bytes that matter in the db.map is being 
updated (not the whole file).

 

 


[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756477#comment-16756477
 ] 

Alan Protasio edited comment on AMQ-7080 at 1/30/19 7:33 PM:
-

Hi [~gtully]

I ran some benchmarks and it seems that the performance hit is negligible...
See the results here:
[https://github.com/alanprot/activemq/blob/master/AMQ-7080.benchmark.md]

I also did an extra test and sent 1 million messages to 1 queue. The result was
a db.data file of 346M and a db.map of only 16K.

346M data/kahadb/db.data
 16K data/kahadb/db.map

Keep in mind that no matter how fragmented the pages are, this file
will not grow with the fragmentation.

Also, during checkpoints, only the bytes that matter in db.map are
updated (not the whole file).

 

 


was (Author: alanprot):
Hi [~gtully]

I did some benchmark here and seems that the performance hit is negligible... 
See the result here: 
[https://github.com/alanprot/activemq/blob/master/AMQ-7080.benchmark.md]

Also i did a extra test and sent 1 million messages to 1 queue. The result was 
a db.data file with 346M and db.map with only 16K.

346M data/kahadb/db.data
 16K data/kahadb/db.map

Keep in mind that does not matter how much the pages is fragmented, this file 
will not grow with the fragmentation.

Also during checkpoints, only the bytes that matter in the db.map is being 
updated (no the whole file).

 

 





[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756477#comment-16756477
 ] 

Alan Protasio commented on AMQ-7080:


Hi [~gtully]

I ran some benchmarks and it seems that the performance hit is negligible...
See the results here:
[https://github.com/alanprot/activemq/blob/master/AMQ-7080.benchmark.md]

I also did an extra test and sent 1 million messages to 1 queue. The result was
a db.data file of 346M and a db.map of only 16K.

346M data/kahadb/db.data
 16K data/kahadb/db.map

Keep in mind that no matter how fragmented the pages are, this file
will not grow with the fragmentation.

Also, during checkpoints, only the bytes that matter in db.map are
updated (not the whole file).

 

 

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In a event of an unclean shutdown, Activemq loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ read 
> the whole index during shutdown searching for free pages and then save the 
> db.free file. This operation can take a long time, making the failover 
> slower. (during the shutdown, activemq will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> Is important to note if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS 
> seconds, any following shutdown will be unclean. This broker will stay in 
> this state unless the index is deleted (this state means that every failover 
> will take more then ACTIVEMQ_KILL_MAXSECONDS, so, if you increase this time 
> to 5 minutes, you fail over can take more than 5 minutes).
>  
> To prevent ActiveMQ from reading the whole index file to search for free 
> pages, we can keep track of them on every checkpoint. To do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have an attribute in the db.free page that is referenced by db.data.
> So during the checkpoint we:
> 1 - Save db.free and give it a freePageUniqueId
> 2 - Save this freePageUniqueId in db.data (metadata)
> After a crash, we can check whether db.data has the same freePageUniqueId as 
> db.free. If it does, we can safely use the free page information 
> contained in db.free.
> Now, the whole index file only has to be read again if the crash happens 
> between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we have to save db.free 
> during the checkpoint, which can increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in 
> db.free, as it references stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario, after PageFile#load P1 will be free, and then 
> the replay will mark P1 as occupied or will occupy another page (now that 
> the recovery of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and reflect 
> the state at T1 (checkpoint). If they are in sync, we can trust db.free.
> This is a really quick draft of what I'm suggesting... If you agree, I 
> can create the proper patch afterwards:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590
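
The two-step checkpoint described in the issue can be sketched as a small in-memory model. This is only an illustration, not ActiveMQ code: the class `FreeListHandshake` and all of its fields and methods are hypothetical stand-ins for db.free, the db.data metadata, and the freePageUniqueId comparison.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory model of the db.free / db.data handshake; not ActiveMQ code. */
final class FreeListHandshake {
    // Stand-in for db.free: the saved free pages plus the id stamped on them.
    long freeFileId = -1;
    Set<Long> freeFilePages = new TreeSet<>();

    // Stand-in for the db.data metadata: the id of the db.free it was checkpointed with.
    long dataFileFreeId = -1;

    private long nextId = 0;

    /** Step 1: save db.free and give it a fresh freePageUniqueId. */
    long saveFreeFile(Set<Long> freePages) {
        freeFilePages = new TreeSet<>(freePages);
        freeFileId = nextId++;
        return freeFileId;
    }

    /** Step 2: record that id in the db.data metadata. */
    void saveDataMetadata(long freePageUniqueId) {
        dataFileFreeId = freePageUniqueId;
    }

    /** After a crash: db.free is usable only if both files carry the same id. */
    boolean canTrustFreeFile() {
        return freeFileId != -1 && freeFileId == dataFileFreeId;
    }
}
```

If a crash lands between `saveFreeFile` and `saveDataMetadata`, the ids differ and `canTrustFreeFile()` returns false, which corresponds to the rare case where the whole index would still have to be scanned.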



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756363#comment-16756363
 ] 

Alan Protasio commented on AMQ-7143:


Hey [~cshannon],

 

I just updated the Pull request! :D

 

Thanks

> Temporary transaction file (PageFile) being opened and closed many times, 
> leading poor performance
> --
>
> Key: AMQ-7143
> URL: https://issues.apache.org/jira/browse/AMQ-7143
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
>
> Hi,
> This is an optimization for the case where a transaction has many writes (more 
> than 10 MB by default) and ActiveMQ creates a temporary transaction file 
> (PageFile transaction).
> The problem is that when this transaction is committed, the temporary file is 
> opened and closed many times (once per write inside the transaction), 
> causing poor performance (this operation "freezes the world" during the 
> checkpoint).
>  
> See:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]
>  
> Note that if the recoveryFile is enabled, we open and close the temporary 
> file 3 times for each index write.
> There is also a small bug: if the transaction is rolled back, the 
> temporary file is left there forever, see:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]
>  
> As the transaction is rolled back, no writes are performed, and so the 
> file is never deleted.
> The proposed change uses the same "RandomAccessFile" object throughout the 
> transaction and closes it when all writes are done (so we stop opening and 
> closing it for each write, and only do it once per transaction).
>  
> Thanks
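
A minimal sketch of the proposed pattern, assuming a helper of our own rather than the real PageFile/Transaction classes: `TxTempFile` and its methods are hypothetical names. The single `RandomAccessFile` handle is opened once, reused for every write and read, and closed (with the file deleted) once, whether the transaction commits or rolls back.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper, not the PageFile/Transaction code: one file handle per transaction. */
final class TxTempFile implements AutoCloseable {
    private final File file;
    private RandomAccessFile raf; // opened once, reused for every write and read

    TxTempFile(File file) throws IOException {
        this.file = file;
        this.raf = new RandomAccessFile(file, "rw");
    }

    /** Appends a chunk and returns the offset it was written at. */
    long append(byte[] data) throws IOException {
        long offset = raf.length();
        raf.seek(offset);
        raf.write(data);
        return offset;
    }

    /** Reads a chunk back through the same handle, without reopening the file. */
    byte[] read(long offset, int length) throws IOException {
        byte[] buf = new byte[length];
        raf.seek(offset);
        raf.readFully(buf);
        return buf;
    }

    /** Called on commit and on rollback, so the temporary file never lingers. */
    @Override
    public void close() throws IOException {
        if (raf != null) {
            raf.close();
            raf = null;
            file.delete();
        }
    }
}
```

Because cleanup lives in `close()`, a rolled-back transaction that never replays its writes still removes the file, which is the leak described in the issue.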




[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756337#comment-16756337
 ] 

Alan Protasio commented on AMQ-7080:


Gotcha... Yeah, I think we can keep it simple for now... 
Update a map of bits to reflect the state of the page file; as I'm 
updating only the relevant bits, the performance impact should be almost 
none. What do you think? I will still do some perf tests to prove that... 
I can run any tests that you think are relevant, or the JMSTools.

I agree that we don't need db.free anymore, as we can still save db.map 
on a clean shutdown!
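
The "map of bits" idea can be sketched as follows. This is an assumption-laden illustration, not the patch itself: `FreePageBitmap` and its methods are hypothetical, with one bit per page (1 = free) and a record of which bytes changed since the last flush.

```java
import java.util.BitSet;

/** Hypothetical free-page bitmap (the "db.map" idea): one bit per page, 1 = free. */
final class FreePageBitmap {
    private final byte[] bits;
    private final BitSet dirtyBytes = new BitSet(); // bytes changed since the last flush

    FreePageBitmap(int pageCount) {
        bits = new byte[(pageCount + 7) / 8];
    }

    void setFree(int pageId, boolean free) {
        int byteIndex = pageId >>> 3;
        int mask = 1 << (pageId & 7);
        byte old = bits[byteIndex];
        bits[byteIndex] = (byte) (free ? old | mask : old & ~mask);
        if (bits[byteIndex] != old) {
            dirtyBytes.set(byteIndex); // only bytes that really changed need rewriting
        }
    }

    boolean isFree(int pageId) {
        return (bits[pageId >>> 3] & (1 << (pageId & 7))) != 0;
    }

    /** Number of bytes a checkpoint would have to rewrite. */
    int dirtyByteCount() {
        return dirtyBytes.cardinality();
    }

    /** Pretend the dirty bytes were written out. */
    void markFlushed() {
        dirtyBytes.clear();
    }
}
```

With a million tracked pages, touching two pages dirties at most two bytes, so the cost of a checkpoint follows the number of modified pages rather than the size of the page file.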

 

Hey! Thanks a lot! haha :D


[jira] [Commented] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756319#comment-16756319
 ] 

Alan Protasio commented on AMQ-7143:


Hi [~cshannon]

 

I will update the tests... On my local SSD I cannot see a huge improvement, but 
when I run on a shared volume (NFS with higher latency) this makes a huge 
difference. See the results for a 10 MB transaction.

Before:

Running org.apache.activemq.store.kahadb.disk.page.TransactionTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.774 sec - in 
org.apache.activemq.store.kahadb.disk.page.TransactionTest

 

After:


Running org.apache.activemq.store.kahadb.disk.page.TransactionTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.834 sec - in 
org.apache.activemq.store.kahadb.disk.page.TransactionTest


[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756269#comment-16756269
 ] 

Alan Protasio edited comment on AMQ-7080 at 1/30/19 4:29 PM:
-

 
{quote}SequenceSet can be better at just serialising what has changed, ie 
tracking the modified pages
{quote}
I'm looking into that but I cannot see a good way... I can add a method to 
PageFile and call it from the Transaction like:

  [https://paste.ofcode.org/RXcLJ8GVLBeDJjNhMsMHH]

 

(this solves the problem of adding writes in the same transaction)

But I'm running out of ideas for how to write only what changed. I thought of 
adding a constructor to the Marshaller that takes the transaction's 
freeList and allocatedList as parameters and does a diff. But inside 
Marshaller#writePayload we only have the "DataOutput" object, which does not 
support seek, so we cannot update only the relevant byte. And even so, the 
diff logic can be tricky.
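
For contrast with a plain `DataOutput`, here is a sketch of what a seekable handle makes possible: rewriting the single byte that holds one page's bit. `BitmapFilePatcher` and its methods are hypothetical and are not part of the Marshaller API; the example assumes the bitmap lives in its own zero-initialised file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper: flips one page's bit in an on-disk bitmap in place. */
final class BitmapFilePatcher {
    /** Truncates the file and writes an explicit zero-filled bitmap for pageCount pages. */
    static void init(RandomAccessFile raf, long pageCount) throws IOException {
        raf.setLength(0);
        raf.seek(0);
        raf.write(new byte[(int) ((pageCount + 7) / 8)]);
    }

    /** Rewrites only the byte that contains pageId's bit; the rest of the file is untouched. */
    static void setPageBit(RandomAccessFile raf, long pageId, boolean free) throws IOException {
        long byteIndex = pageId >>> 3;
        int mask = 1 << (int) (pageId & 7);
        raf.seek(byteIndex);
        int current = raf.readUnsignedByte(); // throws EOFException if pageId is out of range
        int updated = free ? (current | mask) : (current & ~mask);
        raf.seek(byteIndex);
        raf.write(updated); // write(int) stores only the low 8 bits
    }
}
```

The `DataOutput` interface has no `seek`, so a marshaller that only receives a `DataOutput` has to re-serialise the whole structure; this kind of in-place update needs direct access to something like `RandomAccessFile`.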



[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756269#comment-16756269
 ] 

Alan Protasio edited comment on AMQ-7080 at 1/30/19 4:18 PM:
-

{quote}SequenceSet can be better at just serialising what has changed, ie 
tracking the modified pages
{quote}
I'm looking into that but I cannot see a good way... I can add a method to 
PageFile like:
{quote}void updatePageMap(Transaction tx, SequenceSet freeList, SequenceSet allocatedList) {
    freeListPage.set(freeList);
    tx.store(freeListPage, new SequenceSet.Marshaller(), true);
}
{quote}
And call it from commit or rollback (from the transaction object).

Like:
{quote}public void commit() throws IOException {
    if (writeTransactionId != -1) {
        if (tmpFile != null) {
            tmpFile.close();
            pageFile.removeTmpFile(getTempFile());
            tmpFile = null;
            txFile = null;
        }

        {color:#8eb021}pageFile.updatePageMap(this, freeList, allocateList);{color}
        // Actually do the page writes...
        pageFile.write(writes.entrySet());
        // Release the pages that were freed up in the transaction..
        freePages(freeList);

        freeList.clear();
        allocateList.clear();
        writes.clear();
        writeTransactionId = -1;
    } else {
        freePages(allocateList);
    }
    size = 0;
}
{quote}
But I'm running out of ideas for how to write only what changed. I thought of 
adding a constructor to the Marshaller that takes the transaction's 
freeList and allocatedList as parameters and does a diff. But inside 
Marshaller#writePayload we only have the "DataOutput" object, which does not 
support seek, so we cannot update only the relevant byte. And even so, the 
diff logic can be tricky.



[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756269#comment-16756269
 ] 

Alan Protasio edited comment on AMQ-7080 at 1/30/19 4:21 PM:
-

 
{quote}SequenceSet can be better at just serialising what has changed, ie 
tracking the modified pages
{quote}
I'm looking into that but I cannot see a good way... I can add a method to 
PageFile and call it from the Transaction like:

  https://paste.ofcode.org/RXcLJ8GVLBeDJjNhMsMHH

But I'm running out of ideas for how to write only what changed. I thought of 
adding a constructor to the Marshaller that takes the transaction's 
freeList and allocatedList as parameters and does a diff. But inside 
Marshaller#writePayload we only have the "DataOutput" object, which does not 
support seek, so we cannot update only the relevant byte. And even so, the 
diff logic can be tricky.



[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756269#comment-16756269
 ] 

Alan Protasio commented on AMQ-7080:


{quote}SequenceSet can be better at just serialising what has changed, ie 
tracking the modified pages
{quote}
I'm looking into that but I cannot see a good way... I can add a method to 
PageFile like:
{quote}void updatePageMap(Transaction tx, SequenceSet freeList, SequenceSet allocatedList) {
    freeListPage.set(freeList);
    tx.store(freeListPage, new SequenceSet.Marshaller(), true);
}
{quote}
And call it from commit or rollback (from the transaction object).

Like:
{quote}public void commit() throws IOException {
    if (writeTransactionId != -1) {
        if (tmpFile != null) {
            tmpFile.close();
            pageFile.removeTmpFile(getTempFile());
            tmpFile = null;
            txFile = null;
        }

        {color:#8eb021}pageFile.updatePageMap(this, freeList, allocateList);{color}
        // Actually do the page writes...
        pageFile.write(writes.entrySet());
        // Release the pages that were freed up in the transaction..
        freePages(freeList);

        freeList.clear();
        allocateList.clear();
        writes.clear();
        writeTransactionId = -1;
    } else {
        freePages(allocateList);
    }
    size = 0;
}
{quote}
But I'm running out of ideas for how to write only what changed. I thought of 
adding a constructor to the Marshaller that takes the transaction's 
freeList and allocatedList as parameters and does a diff. But inside 
Marshaller#writePayload we only have the "DataOutput" object, which does not 
support seek, so we cannot update only the relevant byte. And even so, the 
diff logic can be tricky.


[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-30 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756174#comment-16756174
 ] 

Alan Protasio commented on AMQ-7080:


Hi [~gtully] :D thanks for looking into that. :D

Here are my comments:
{quote}this is interesting. the async recovery will be expensive for sure, it 
may help to start that thread *after* normal recovery.
{quote}
That's true... we can start the recovery after the normal one... but it will 
still hurt performance (disk read/write latency will increase).
{quote}on the free list map, this is a replacement for the db.free that uses 
less space. There is probably no need for db.free at all.
{quote}
Yeah, I thought the same thing... I can do that. The reason I decided to 
keep db.free is that I'm only writing db.map IF the recoveryFile is 
enabled. That's because, if the recovery file is not enabled, I will not know 
the "nextTransactionId" after an unclean shutdown, and so I cannot tell whether 
db.map is in sync with db.data. But yeah, I can still save db.map on a clean 
shutdown (and in that case I will know the nextTransactionId).
{quote}One thought, the bit per page is good, it is very compact. 
The sequence set is a little more heavy weight, being the actual page Ids, 
until there are large gaps in the free pages, then tracking 1-1000 as free is 
nice.
{quote}
Yeah, and the main advantage I think is that I'm only writing the bytes that 
belong to the modified pages. With the current serialization I always have to 
write the whole sequence set (so, instead of O(m) worst case, where m = number 
of pages, we have O(n), where n = number of writes in this checkpoint, and m is 
always >> n).
{quote}I wonder, is it worth improving the sequence set in two ways:
1) having preallocated pages such that the pages are linear (this will avoid 
seeks around the page file)
2) having it keep track of modifications such that only modified pages (the 
contents of the sequence set are contained on pages) are written on a store.

In other words, I am wondering, would a better sequence set suffice? The 
sequence set is used in a few places that could benefit if that was the case.
{quote}
If I understood you correctly, you are saying that instead of having an extra 
file (db.map) we could use a preallocated space in db.data to store the same 
information, right?

If that's the case, I thought the same thing. The problem is that db.data can 
grow indefinitely, so I cannot know the size that I have to preallocate. Maybe 
we can preallocate a fixed size and, if db.data grows beyond that, allocate a 
new area 2x the previous size and copy the data (freeing the previous pages 
that were used). To do so I would have to keep in the metadata the page where 
I'm storing this information.

The problem with that is that I will have to add writes to the batch, and this 
seems odd:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1145]

I would be using page files to store information about page files, and to do 
so I would have to add writes to the batch without any transaction. It seems 
like a chicken-and-egg problem. I don't know if I could do it without a 
further hack.

Because of that, I thought the best solution was to have a separate space 
(db.map) to store this information.
{quote}The trade off here is an additional file sync on every batch write and 
checkpoint. If the sequenceSet can be improved the sync is on a single file.
{quote}
In my preliminary tests I could not see any significant performance hit. It's 
important to note that I'm now only writing the information that changed (only 
pages whose type changed). Do you suggest any test to run?

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In a event of an unclean shutdown, Activemq loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ read 
> the whole index during shutdown searching for free pages and then save the 
> db.free file. This operation can take a long time, making the failover 
> slower. (during the shutdown, activemq will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over

[jira] [Updated] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-29 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7143:
---
Description: 
Hi,

This is an optimization for the case where a transaction has many writes (more 
than 10 MB by default) and ActiveMQ creates a temporary transaction file 
(PageFile transaction).

The problem is that when this transaction is committed, the temporary file is 
opened and closed many times (once per write inside the transaction), causing 
poor performance (this operation "freezes the world" during the checkpoint).

 

See:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]

 

Note that if the recoveryFile is enabled, we open and close the temporary file 
3 times for each index write.

There is also a small bug: if the transaction is rolled back, the temporary 
file is left there forever, see:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]

 

As the transaction is rolled back, no writes will be performed, so the file 
will never be deleted.

The proposed change is to use the same "RandomAccessFile" object used in the 
transaction and close it when all writes are done (so we stop opening and 
closing it for each write, and only do it once per transaction).
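
To make the proposal concrete, here is a minimal sketch of the idea, not the 
actual KahaDB code or the patch in the PR. The class name TempTxFileSketch, 
its methods and the openCount field are all hypothetical and exist only for 
illustration. It assumes one temporary file per transaction, opened lazily, 
reused for every write and read, and closed and deleted once at the end, on 
commit and on rollback alike.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch: one temp-file handle per transaction instead of
// opening and closing the file for every write.
class TempTxFileSketch implements AutoCloseable {
    private final File tmpFile;
    private RandomAccessFile raf; // opened lazily, then reused
    int openCount = 0;            // only here so the demo can count opens

    TempTxFileSketch(File tmpFile) {
        this.tmpFile = tmpFile;
    }

    private RandomAccessFile handle() throws IOException {
        if (raf == null) {
            raf = new RandomAccessFile(tmpFile, "rw");
            openCount++;
        }
        return raf;
    }

    // Append one chunk of data and return its offset in the temp file.
    long write(byte[] data) throws IOException {
        RandomAccessFile f = handle();
        long offset = f.length();
        f.seek(offset);
        f.write(data);
        return offset;
    }

    // Read a chunk back (e.g. at commit time) through the same handle.
    byte[] read(long offset, int length) throws IOException {
        RandomAccessFile f = handle();
        byte[] buf = new byte[length];
        f.seek(offset);
        f.readFully(buf);
        return buf;
    }

    // Called once when the transaction ends, whether it committed or rolled
    // back, so the temp file never outlives the transaction.
    @Override
    public void close() throws IOException {
        if (raf != null) {
            raf.close();
            raf = null;
        }
        tmpFile.delete();
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("tx-", ".tmp");
        try (TempTxFileSketch tx = new TempTxFileSketch(tmp)) {
            tx.write(new byte[] {1, 2, 3});
            long second = tx.write(new byte[] {4, 5});
            System.out.println("second chunk starts at " + second);
            System.out.println("opens: " + tx.openCount);
        }
        System.out.println("temp file still exists: " + tmp.exists());
    }
}
```

Closing in a single place is what covers the rollback case as well: the 
cleanup no longer depends on any write being performed.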

 

Thanks

  was:
Hi,

This is just an optimization when we have a transaction with many writes 
(bigger than 10mb by default) and we create a temporary file.

The problem is when this transaction is committed, this temporary file is 
opened and closed many times (number of writes inside the transaction), causing 
poor performance (this is operation "freezes the world" during the checkpoint).

 

See:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]

 

Note that is the recoveryFile is enabled, we open and close the temporary 3 
times for each index write.

There is also a small bug that if the transaction is rolledBack, the temporary 
file is left there forever, see:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]

 

As the transaction is rolledback, no writes will be performed and so, the file 
will never be deleted.

The proposed change, use the same "RandomAccessFile" object used in the 
transaction and close if when all writes is done (so we stop opening and 
closing it for each write, and only do it one time per transaction).

 

Thanks


> Temporary transaction file (PageFile) being opened and closed many times, 
> leading poor performance
> --
>
> Key: AMQ-7143
> URL: https://issues.apache.org/jira/browse/AMQ-7143
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
>
> Hi,
> This is an optimization when we have a transaction with many writes (bigger 
> than 10mb by default) and activemq creates a temporary transaction file 
> (pageFile transaction).
> The problem is when this transaction is committed, this temporary file is 
> opened and closed many times (number of writes inside the transaction), 
> causing poor performance (this is operation "freezes the world" during the 
> checkpoint).
>  
> See:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]
>  
> Note that is the recoveryFile is enabled, we open and close the temporary 3 
> times for each index write.
> There is also a small bug that if the transaction is rolledBack, the 
> temporary file is left there forever, see:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]
>

[jira] [Comment Edited] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-29 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755721#comment-16755721
 ] 

Alan Protasio edited comment on AMQ-7143 at 1/30/19 6:08 AM:
-

This is the proposed patch:

[https://github.com/apache/activemq/pull/343]

All tests are succeeding: "mvn clean install -Dactivemq.tests=all"

 

Thanks :D

 


was (Author: alanprot):
This is the proposed patch:

[https://github.com/apache/activemq/pull/343]

All tests are succeeding: "mvn clean install -Dactivemq.tests=all"

 

Thanks

 

> Temporary transaction file (PageFile) being opened and closed many times, 
> leading poor performance
> --
>
> Key: AMQ-7143
> URL: https://issues.apache.org/jira/browse/AMQ-7143
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
>
> Hi,
> This is just an optimization when we have a transaction with many writes 
> (bigger than 10mb by default) and we create a temporary file.
> The problem is when this transaction is committed, this temporary file is 
> opened and closed many times (number of writes inside the transaction), 
> causing poor performance (this is operation "freezes the world" during the 
> checkpoint).
>  
> See:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]
>  
> Note that is the recoveryFile is enabled, we open and close the temporary 3 
> times for each index write.
> There is also a small bug that if the transaction is rolledBack, the 
> temporary file is left there forever, see:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]
>  
> As the transaction is rolledback, no writes will be performed and so, the 
> file will never be deleted.
> The proposed change, use the same "RandomAccessFile" object used in the 
> transaction and close if when all writes is done (so we stop opening and 
> closing it for each write, and only do it one time per transaction).
>  
> Thanks



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-29 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755723#comment-16755723
 ] 

Alan Protasio commented on AMQ-7080:


Thanks [~cshannon] :D Let me know if you have any questions! :D

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the 
> free pages in the index. To recover this information, ActiveMQ reads the 
> whole index during shutdown, searching for free pages, and then saves the 
> db.free file. This operation can take a long time, making the failover 
> slower (during the shutdown, ActiveMQ still holds the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> It is important to note that if the shutdown takes more than 
> ACTIVEMQ_KILL_MAXSECONDS seconds, any following shutdown will be unclean. The 
> broker will stay in this state unless the index is deleted (this state means 
> that every failover will take more than ACTIVEMQ_KILL_MAXSECONDS, so if you 
> increase this time to 5 minutes, your failover can take more than 5 minutes).
>  
> To prevent ActiveMQ from reading the whole index file to search for free 
> pages, we can keep track of them on every checkpoint. To do that we need to 
> be sure that db.data and db.free are in sync. To achieve that we can have an 
> attribute in db.free that is referenced by db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> After a crash, we can check whether db.data has the same freePageUniqueId as 
> db.free. If so, we can safely use the free page information contained in 
> db.free.
> Now, the only case where the whole index file must be read again is if the 
> crash happens between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, which can possibly increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in 
> db.free, as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario, after PageFile#load, P1 will be free, and then the 
> replay will mark P1 as occupied or will occupy another page (now that the 
> recovery of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and show the 
> state as of T1 (checkpoint). If they are in sync, we can trust db.free.
> This is a really quick draft of what I'm suggesting. If you agree, I can 
> create the proper patch afterwards:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590
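
The two-step checkpoint and the check after a crash described above can be 
sketched as follows. This is only an illustration under stated assumptions, 
not the code from the linked commit. The class FreeListSyncSketch and its 
fields are hypothetical stand-ins for the id stored in db.free and the id 
stored in the db.data metadata. A random UUID plays the role of 
freePageUniqueId.

```java
import java.util.UUID;

// Hypothetical sketch of the freePageUniqueId handshake between db.free and
// db.data. The two String fields stand in for values persisted in each file.
class FreeListSyncSketch {
    String dbFreeId;     // id saved together with the free list (db.free)
    String dbDataMetaId; // id saved in the page file metadata (db.data)

    // Checkpoint: step 1 saves db.free with a fresh id, step 2 records the
    // same id in db.data. The flag simulates a crash between the two steps.
    void checkpoint(boolean crashBetweenSteps) {
        String id = UUID.randomUUID().toString();
        dbFreeId = id;         // step 1
        if (crashBetweenSteps) {
            return;            // simulated crash, step 2 never happens
        }
        dbDataMetaId = id;     // step 2
    }

    // On load after a crash: db.free is trusted only if both ids match.
    // Otherwise the whole index has to be scanned for free pages.
    boolean canTrustFreeList() {
        return dbFreeId != null && dbFreeId.equals(dbDataMetaId);
    }

    public static void main(String[] args) {
        FreeListSyncSketch store = new FreeListSyncSketch();
        store.checkpoint(false);
        System.out.println("after clean checkpoint: " + store.canTrustFreeList());
        store.checkpoint(true);
        System.out.println("after crash between steps: " + store.canTrustFreeList());
    }
}
```

In the T0 to T3 timeline, the ids written at T1 still match at T3, so db.free 
is accepted even though P1 is stale in it, and the replay corrects P1 
afterwards.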




[jira] [Comment Edited] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-29 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755721#comment-16755721
 ] 

Alan Protasio edited comment on AMQ-7143 at 1/30/19 6:08 AM:
-

This is the proposed patch:

[https://github.com/apache/activemq/pull/343]

All tests are succeeding: "mvn clean install -Dactivemq.tests=all"

 

Thanks

 


was (Author: alanprot):
This is the patch:

[https://github.com/apache/activemq/pull/343]


All tests are succeeding: "mvn clean install -Dactivemq.tests=all"

 

> Temporary transaction file (PageFile) being opened and closed many times, 
> leading poor performance
> --
>
> Key: AMQ-7143
> URL: https://issues.apache.org/jira/browse/AMQ-7143
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
>
> Hi,
> This is just an optimization when we have a transaction with many writes 
> (bigger than 10mb by default) and we create a temporary file.
> The problem is when this transaction is committed, this temporary file is 
> opened and closed many times (number of writes inside the transaction), 
> causing poor performance (this is operation "freezes the world" during the 
> checkpoint).
>  
> See:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]
>  
> Note that is the recoveryFile is enabled, we open and close the temporary 3 
> times for each index write.
> There is also a small bug that if the transaction is rolledBack, the 
> temporary file is left there forever, see:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]
>  
> As the transaction is rolledback, no writes will be performed and so, the 
> file will never be deleted.
> The proposed change, use the same "RandomAccessFile" object used in the 
> transaction and close if when all writes is done (so we stop opening and 
> closing it for each write, and only do it one time per transaction).
>  
> Thanks




[jira] [Commented] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-29 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755721#comment-16755721
 ] 

Alan Protasio commented on AMQ-7143:


This is the patch:

[https://github.com/apache/activemq/pull/343]


All tests are succeeding: "mvn clean install -Dactivemq.tests=all"

 

> Temporary transaction file (PageFile) being opened and closed many times, 
> leading poor performance
> --
>
> Key: AMQ-7143
> URL: https://issues.apache.org/jira/browse/AMQ-7143
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
>
> Hi,
> This is just an optimization when we have a transaction with many writes 
> (bigger than 10mb by default) and we create a temporary file.
> The problem is when this transaction is committed, this temporary file is 
> opened and closed many times (number of writes inside the transaction), 
> causing poor performance (this is operation "freezes the world" during the 
> checkpoint).
>  
> See:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]
>  
> Note that is the recoveryFile is enabled, we open and close the temporary 3 
> times for each index write.
> There is also a small bug that if the transaction is rolledBack, the 
> temporary file is left there forever, see:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]
>  
> As the transaction is rolledback, no writes will be performed and so, the 
> file will never be deleted.
> The proposed change, use the same "RandomAccessFile" object used in the 
> transaction and close if when all writes is done (so we stop opening and 
> closing it for each write, and only do it one time per transaction).
>  
> Thanks




[jira] [Created] (AMQ-7143) Temporary transaction file (PageFile) being opened and closed many times, leading poor performance

2019-01-29 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7143:
--

 Summary: Temporary transaction file (PageFile) being opened and 
closed many times, leading poor performance
 Key: AMQ-7143
 URL: https://issues.apache.org/jira/browse/AMQ-7143
 Project: ActiveMQ
  Issue Type: Improvement
  Components: KahaDB
Affects Versions: 5.15.8
Reporter: Alan Protasio


Hi,

This is just an optimization for the case where a transaction has many writes 
(more than 10 MB by default) and we create a temporary file.

The problem is that when this transaction is committed, the temporary file is 
opened and closed many times (once per write inside the transaction), causing 
poor performance (this operation "freezes the world" during the checkpoint).

 

See:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1147]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L203]

 

Note that if the recoveryFile is enabled, we open and close the temporary file 
3 times for each index write.

There is also a small bug: if the transaction is rolled back, the temporary 
file is left there forever, see:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/Transaction.java#L686]

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/page/PageFile.java#L1200]

 

As the transaction is rolled back, no writes will be performed, so the file 
will never be deleted.

The proposed change is to use the same "RandomAccessFile" object used in the 
transaction and close it when all writes are done (so we stop opening and 
closing it for each write, and only do it once per transaction).

 

Thanks




[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-28 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754146#comment-16754146
 ] 

Alan Protasio commented on AMQ-7080:


Hey [~gtully] [~cshannon]

 

What do you think about this new change? Recovering the free pages in the 
background is a good change, but it still uses a lot of IO. This happens on an 
unclean shutdown. We noticed that in this case the background thread uses lots 
of IO and makes both the recovery since the last checkpoint and the 
"recoveryStatistics" (which also reads from the index file) really slow. So at 
the end of the day it improved the problem but did not solve it.

 

This new change should not hurt performance, as we are only writing the data 
that changed, and it is only 1 byte per page.


I can put it behind a feature flag if you think that is better.

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In a event of an unclean shutdown, Activemq loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ read 
> the whole index during shutdown searching for free pages and then save the 
> db.free file. This operation can take a long time, making the failover 
> slower. (during the shutdown, activemq will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> Is important to note if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS 
> seconds, any following shutdown will be unclean. This broker will stay in 
> this state unless the index is deleted (this state means that every failover 
> will take more then ACTIVEMQ_KILL_MAXSECONDS, so, if you increase this time 
> to 5 minutes, you fail over can take more than 5 minutes).
>  
> In order to prevent ActiveMQ reading the whole index file to search for free 
> pages, we can keep track of those on every Checkpoint. In order to do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have a attribute in the db.free page that is referenced by the db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> In a crash, we can see if the db.data has the same freePageUniqueId as the 
> db.free. If this is the case we can safely use the free page information 
> contained in the db.free
> Now, the only way to read the whole index file again is IF the crash happens 
> btw step 1 and 2 (what is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, what can possibly increase the checkpoint time.
> Is also important to note that we CAN (and should) have stale data in db.free 
> as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario after the  Pagefile#load the P1 will be free and then 
> the replay will mark P1 as occupied or will occupied another page (now that 
> the recovery of free pages is done on shutdown)
> This change only make sure that db.data and db.free are in sync and showing 
> the reality in T1 (checkpoint), If they are in sync we can trust the db.free.
> This is a really fast draft of what i'm suggesting... If you guys agree, i 
> can create the proper patch after:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590




[jira] [Comment Edited] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-25 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752534#comment-16752534
 ] 

Alan Protasio edited comment on AMQ-7080 at 1/25/19 7:27 PM:
-

Hi Guys

I created a PR with a change that I think makes sense.

Instead of updating the db.free file on checkpoints (which can be big and 
hurt performance), I created a new file called data.map.

This file is only a map of the state of the PageFile at the moment of the last 
checkpoint (bit 1 means free and 0 means occupied). During checkpoints, only 
the bytes that contain the bits of the pages being written in that checkpoint 
are written, and they are only written if a bit flipped.

Also, we keep track of the next transactionID in the header of the new file to 
make sure that this file is in sync with db.data (the PageFile itself).
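
As a rough illustration of the idea (not the code in the PR), the sketch 
below keeps one bit per page and remembers which bytes changed since the last 
checkpoint. The class FreePageMapSketch and its methods are hypothetical. 
Real file I/O is left out, and checkpoint() simply returns the indexes of the 
bytes that would be written. As a simplification, a bit that flips and then 
flips back before the checkpoint still leaves its byte marked as dirty.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical sketch of a free-page bitmap: 1 bit per page, and only the
// bytes whose bits flipped are written at the next checkpoint.
class FreePageMapSketch {
    private final byte[] bits; // bit = 1 means free, 0 means occupied
    private final TreeSet<Integer> dirtyBytes = new TreeSet<>();
    long nextTxId;             // header value used to check sync with db.data

    FreePageMapSketch(int pageCount) {
        bits = new byte[(pageCount + 7) / 8];
    }

    boolean isFree(int pageId) {
        return (bits[pageId >> 3] & (1 << (pageId & 7))) != 0;
    }

    void setFree(int pageId, boolean free) {
        if (isFree(pageId) == free) {
            return; // bit did not flip, nothing to write later
        }
        bits[pageId >> 3] ^= (1 << (pageId & 7));
        dirtyBytes.add(pageId >> 3);
    }

    // Returns the indexes of the bytes that a checkpoint would write, records
    // the next transaction id in the header, and clears the dirty set.
    int[] checkpoint(long nextTxId) {
        this.nextTxId = nextTxId;
        int[] written = dirtyBytes.stream().mapToInt(Integer::intValue).toArray();
        dirtyBytes.clear();
        return written;
    }

    public static void main(String[] args) {
        FreePageMapSketch map = new FreePageMapSketch(64);
        map.setFree(1, true);
        map.setFree(2, true); // same byte as page 1
        map.setFree(9, true); // second byte
        map.setFree(1, true); // no flip, so no extra work
        System.out.println(Arrays.toString(map.checkpoint(42L)));
    }
}
```

The amount written per checkpoint is bounded by the number of pages touched 
in that checkpoint, not by the size of the page file.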

 

This is the PR: [https://github.com/apache/activemq/pull/342]

 

All the tests ( mvn clean install -Dactivemq.tests=all ) are succeeding.

 

Thanks a lot! :D

 

 


was (Author: alanprot):
Hi Guys

I created a PR with a change that I think make sense.

Instead of updating the db.free file on the checkpoints (which can be big and 
hurt performance) i created a new file called data.map.

This file is only a map of the state of the PageFile at the moment of the last 
checkpoint (bit 1 means free and 0 means occupied). During checkpoints, only 
the byte that contains the bit of the pages being write on that checkpoint are 
written. And they are only written if the bit flipped.

Also, the next we keep track os the next transactionID in the header of the new 
file to make sure that this file is in sync with the db.data (pagefile itself).

 

This is the PR: [https://github.com/apache/activemq/pull/342]

 

Thanks a lot! :D

 

 

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In a event of an unclean shutdown, Activemq loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ read 
> the whole index during shutdown searching for free pages and then save the 
> db.free file. This operation can take a long time, making the failover 
> slower. (during the shutdown, activemq will still hold the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> Is important to note if the shutdown takes more than ACTIVEMQ_KILL_MAXSECONDS 
> seconds, any following shutdown will be unclean. This broker will stay in 
> this state unless the index is deleted (this state means that every failover 
> will take more then ACTIVEMQ_KILL_MAXSECONDS, so, if you increase this time 
> to 5 minutes, you fail over can take more than 5 minutes).
>  
> In order to prevent ActiveMQ reading the whole index file to search for free 
> pages, we can keep track of those on every Checkpoint. In order to do that we 
> need to be sure that db.data and db.free are in sync. To achieve that we can 
> have a attribute in the db.free page that is referenced by the db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give a freePageUniqueId
> 2 - Save this freePageUniqueId in the db.data (metadata)
> In a crash, we can see if the db.data has the same freePageUniqueId as the 
> db.free. If this is the case we can safely use the free page information 
> contained in the db.free
> Now, the only way to read the whole index file again is IF the crash happens 
> btw step 1 and 2 (what is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, what can possibly increase the checkpoint time.
> Is also important to note that we CAN (and should) have stale data in db.free 
> as it is referencing stale db.data:
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 got occupied.
> T3 -> Crash
> In the current scenario after the  Pagefile#load the P1 will be free and then 
> the replay will mark P1 as occupied or will occupied another page (now that 
> the recovery of free pages is done on shutdown)
> This change only make sure that db.data and db.free are in sync and showing 
> the reality in T1 (checkpoint), If they are in sync we can trust the db.free.
> This is a

[jira] [Commented] (AMQ-7080) Keep track of free pages - Update db.free file during checkpoints

2019-01-25 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752534#comment-16752534
 ] 

Alan Protasio commented on AMQ-7080:


Hi Guys

I created a PR with a change that I think makes sense.

Instead of updating the db.free file on checkpoints (which can be big and 
hurt performance), I created a new file called data.map.

This file is only a map of the state of the PageFile at the moment of the last 
checkpoint (bit 1 means free and 0 means occupied). During checkpoints, only 
the bytes that contain the bits of the pages being written in that checkpoint 
are written, and they are only written if a bit flipped.

Also, we keep track of the next transactionID in the header of the new file to 
make sure that this file is in sync with db.data (the PageFile itself).

 

This is the PR: [https://github.com/apache/activemq/pull/342]

 

Thanks a lot! :D

 

 

> Keep track of free pages - Update db.free file during checkpoints
> -
>
> Key: AMQ-7080
> URL: https://issues.apache.org/jira/browse/AMQ-7080
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.6
>Reporter: Alan Protasio
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: AMQ-7080-freeList-update.diff
>
>
> In the event of an unclean shutdown, ActiveMQ loses the information about the 
> free pages in the index. In order to recover this information, ActiveMQ reads 
> the whole index during shutdown searching for free pages and then saves the 
> db.free file. This operation can take a long time, making the failover 
> slower (during the shutdown, ActiveMQ still holds the lock).
> From http://activemq.apache.org/shared-file-system-master-slave.html
> {quote}"If you have a SAN or shared file system it can be used to provide 
> high availability such that if a broker is killed, another broker can take 
> over immediately."
> {quote}
> It is important to note that if the shutdown takes more than 
> ACTIVEMQ_KILL_MAXSECONDS seconds, any following shutdown will be unclean. The 
> broker will stay in this state unless the index is deleted (this state means 
> that every failover will take more than ACTIVEMQ_KILL_MAXSECONDS, so if you 
> increase this time to 5 minutes, your failover can take more than 5 minutes).
>  
> In order to prevent ActiveMQ from reading the whole index file to search for 
> free pages, we can keep track of them on every checkpoint. In order to do 
> that, we need to be sure that db.data and db.free are in sync. To achieve 
> that, we can have an attribute in the db.free page that is referenced by 
> db.data.
> So during the checkpoint we have:
> 1 - Save db.free and give it a freePageUniqueId
> 2 - Save this freePageUniqueId in db.data (metadata)
> After a crash, we can check whether db.data has the same freePageUniqueId as 
> db.free. If this is the case, we can safely use the free page information 
> contained in db.free.
> Now, the only case in which the whole index file has to be read again is if 
> the crash happens between steps 1 and 2 (which is very unlikely).
> The drawback of this implementation is that we will have to save db.free 
> during the checkpoint, which can increase the checkpoint time.
> It is also important to note that we CAN (and should) have stale data in 
> db.free, as it references stale db.data.
> Imagine the timeline:
> T0 -> P1, P2 and P3 are free.
> T1 -> Checkpoint
> T2 -> P1 gets occupied.
> T3 -> Crash
> In the current scenario, after PageFile#load, P1 will be free and then the 
> replay will mark P1 as occupied or will occupy another page (now that the 
> recovery of free pages is done on shutdown).
> This change only makes sure that db.data and db.free are in sync and reflect 
> the state at T1 (checkpoint). If they are in sync, we can trust db.free.
> This is a really fast draft of what I'm suggesting. If you guys agree, I 
> can create the proper patch afterwards:
> [https://github.com/alanprot/activemq/commit/18036ef7214ef0eaa25c8650f40644dd8b4632a5]
>  
> This is related to https://issues.apache.org/jira/browse/AMQ-6590
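
The two checkpoint steps and the check after a crash, as described in the 
issue above, can be sketched roughly as follows. This is only an illustration 
under stated assumptions, not the code from the linked draft commit: the class 
FreeListSync and its nested types are hypothetical, plain objects stand in for 
the db.free and db.data files, and a random UUID is just one possible way to 
produce a freePageUniqueId.

```java
import java.util.BitSet;
import java.util.UUID;

/** Hypothetical model of keeping db.free and db.data in sync via a shared id. */
final class FreeListSync {
    /** Stand-in for the db.free file. */
    static final class FreeFile {
        String uniqueId;
        BitSet freePages = new BitSet();
    }

    /** Stand-in for the metadata stored in db.data. */
    static final class DataFileMeta {
        String freePageUniqueId;
    }

    /** Step 1: save the free pages and tag them with a new unique id. */
    static String saveFreeFile(FreeFile free, BitSet currentFreePages) {
        free.freePages = (BitSet) currentFreePages.clone();
        free.uniqueId = UUID.randomUUID().toString();
        return free.uniqueId;
    }

    /** Step 2: record the same id in the db.data metadata. */
    static void saveMetadata(DataFileMeta meta, String freePageUniqueId) {
        meta.freePageUniqueId = freePageUniqueId;
    }

    /** After a crash: db.free is trusted only if both files carry the same id. */
    static boolean canTrustFreeFile(FreeFile free, DataFileMeta meta) {
        return free.uniqueId != null && free.uniqueId.equals(meta.freePageUniqueId);
    }
}
```

If a crash happens after step 1 but before step 2, the ids differ, 
canTrustFreeFile returns false, and the broker would have to fall back to 
scanning the whole index.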



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-16 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744355#comment-16744355
 ] 

Alan Protasio commented on AMQ-7132:


I saw it!! LGTM! :D

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.16.0
>
> Attachments: output.tgz
>
>
> Hi.
> We noticed that ActiveMQ reads lots of pages in the index file when it is 
> starting up, in order to recover the destination statistics:
> [https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/KahaDBStore.java#L819]
> Currently, in order to do that, ActiveMQ traverses the 
> storedDestination.locationIndex to get the messageCount and totalMessageSize 
> of each destination. For destinations with lots of messages this process can 
> take a while, making the startup take a long time.
> In the case of a master-slave broker, this prevents a fast failover 
> and does not meet what is stated on 
> [http://activemq.apache.org/shared-file-system-master-slave.html]:
> {quote}If you have a SAN or shared file system it can be used to provide 
> _high availability_ such that if a broker is killed, another broker can take 
> over immediately. 
> {quote}
> One solution for this is to keep track of the destination statistics summary 
> in the index file; by doing so, we don't need to read the whole locationIndex 
> on startup.
> The proposed code change is backward compatible but needs a bump of the KahaDB 
> version. If this information is not in the index, the broker will fall back 
> to the current implementation, which means that the first time people upgrade 
> to the new version, it will still have to read the locationIndex, but 
> subsequent restarts will be fast.
> This change should have a negligible performance impact during normal 
> ActiveMQ operation, as it adds only a few more bytes of data to the 
> index, and this information will be written on checkpoints. Also, this new 
> information is kept in sync with the locationIndex, as both are updated in 
> the same transaction.
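
The proposal above (keep a statistics summary next to the index, and fall back 
to a full traversal when it is missing) can be sketched as below. This is a 
simplified model, not the actual KahaDB change: the class 
StoredDestinationSketch is hypothetical, a LinkedHashMap stands in for 
storedDestination.locationIndex, and message keys are assumed to be unique.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of a destination that keeps a statistics summary with its index. */
final class StoredDestinationSketch {
    // Stand-in for storedDestination.locationIndex: message key -> message size in bytes.
    private final Map<String, Integer> locationIndex = new LinkedHashMap<>();
    // {messageCount, totalMessageSize}; null models an index written by an older version.
    private long[] summary;

    StoredDestinationSketch(boolean indexHasSummary) {
        this.summary = indexHasSummary ? new long[2] : null;
    }

    /** Adds a message; index and summary are updated together (same "transaction"). */
    void addMessage(String key, int sizeInBytes) {
        locationIndex.put(key, sizeInBytes);
        if (summary != null) {
            summary[0]++;
            summary[1] += sizeInBytes;
        }
    }

    /** Returns {messageCount, totalMessageSize}, traversing the index only if needed. */
    long[] recoverStatistics() {
        if (summary == null) {
            // Fallback for old indexes: one full traversal, then keep the summary.
            long count = 0;
            long size = 0;
            for (int messageSize : locationIndex.values()) {
                count++;
                size += messageSize;
            }
            summary = new long[] {count, size};
        }
        return summary.clone();
    }
}
```

Only the first recovery of an old index pays for the traversal; after that the 
summary is maintained incrementally and later restarts can read it directly.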




[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-16 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744351#comment-16744351
 ] 

Alan Protasio commented on AMQ-7132:


Thanks again [~cshannon]!!! :D:D:D


[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-16 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744328#comment-16744328
 ] 

Alan Protasio commented on AMQ-7132:


Thanks [~cshannon] :D

 

My only concern is whether tuning the SYNC to periodic can make the 
"UNCLEAN_SHUTDOWN" test brittle (in this test the files are copied to 
another folder and copied back to simulate an unclean shutdown, and now we 
would copy an "outdated" file).

If the test becomes brittle, we can send fewer messages (10K messages was an 
arbitrary number; there is no strong reason for it). I just wanted to make 
sure that we have more than one journal file when restarting the broker.


[jira] [Comment Edited] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-15 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743401#comment-16743401
 ] 

Alan Protasio edited comment on AMQ-7132 at 1/15/19 9:53 PM:
-

How do you run the tests? Like this?

mvn clean install -Dtest=RecoveryStatsBrokerTest

I was able to run it here on Fedora. This is my environment:

 
{quote}[fedora@ip-172-31-16-252 ~]$ cat /etc/os-release
 NAME=Fedora
 VERSION="29 (Cloud Edition)"
 ID=fedora
 VERSION_ID=29
 PLATFORM_ID="platform:f29"
 PRETTY_NAME="Fedora 29 (Cloud Edition)"
 ANSI_COLOR="0;34"
 CPE_NAME="cpe:/o:fedoraproject:fedora:29"
 HOME_URL="https://fedoraproject.org/"
 DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f29/system-administrators-guide/"
 SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
 BUG_REPORT_URL="https://bugzilla.redhat.com/"
 REDHAT_BUGZILLA_PRODUCT="Fedora"
 REDHAT_BUGZILLA_PRODUCT_VERSION=29
 REDHAT_SUPPORT_PRODUCT="Fedora"
 REDHAT_SUPPORT_PRODUCT_VERSION=29
 PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
 VARIANT="Cloud Edition"
 VARIANT_ID=cloud

[fedora@ip-172-31-16-252 ~]$ java -version
 openjdk version "1.8.0_191"
 OpenJDK Runtime Environment (build 1.8.0_191-b12)
 OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode) 
{quote}
 

The test also succeeded here:
{quote}---
 Test set: org.apache.activemq.broker.RecoveryStatsBrokerTest
 ---
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.349 sec - 
in org.apache.activemq.broker.RecoveryStatsBrokerTest
{quote}
 

It seems to take a little longer here, though (on my host it was taking 
10 s and here it takes ~30 s, so maybe just increasing the timeout will 
resolve this, as it is only 60 seconds).

I will attach the whole output to the ticket.

This ran on an AWS m5.large instance.


was (Author: alanprot):
How do you run the tests? Like this?

mvn clean install -Dtest=RecoveryStatsBrokerTest

I was able to run it here on Fedora. This is my environment:

 
{quote}[fedora@ip-172-31-16-252 ~]$ cat /etc/os-release
NAME=Fedora
VERSION="29 (Cloud Edition)"
ID=fedora
VERSION_ID=29
PLATFORM_ID="platform:f29"
PRETTY_NAME="Fedora 29 (Cloud Edition)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:29"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f29/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=29
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=29
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Cloud Edition"
VARIANT_ID=cloud


[fedora@ip-172-31-16-252 ~]$ java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode) 
{quote}
 

The test also succeeded here:
{quote}---
Test set: org.apache.activemq.broker.RecoveryStatsBrokerTest
---
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.349 sec - in 
org.apache.activemq.broker.RecoveryStatsBrokerTest
{quote}
 

I will attach the whole output to the ticket.

This ran on an AWS m5.large instance.


[jira] [Updated] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-15 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7132:
---
Attachment: output.tgz


[jira] [Comment Edited] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-15 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743401#comment-16743401
 ] 

Alan Protasio edited comment on AMQ-7132 at 1/15/19 9:51 PM:
-

How do you run the tests? Like this?

mvn clean install -Dtest=RecoveryStatsBrokerTest

I was able to run it here on Fedora. This is my environment:

 
{quote}[fedora@ip-172-31-16-252 ~]$ cat /etc/os-release
NAME=Fedora
VERSION="29 (Cloud Edition)"
ID=fedora
VERSION_ID=29
PLATFORM_ID="platform:f29"
PRETTY_NAME="Fedora 29 (Cloud Edition)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:29"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f29/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=29
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=29
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Cloud Edition"
VARIANT_ID=cloud


[fedora@ip-172-31-16-252 ~]$ java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode) 
{quote}
 

The test also succeeded here:
{quote}---
Test set: org.apache.activemq.broker.RecoveryStatsBrokerTest
---
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.349 sec - in 
org.apache.activemq.broker.RecoveryStatsBrokerTest
{quote}
 

I will attach the whole output to the ticket.

This ran on an AWS m5.large instance.


was (Author: alanprot):
How do you run the tests? Like this?

 

mvn clean install -Dtest=RecoveryStatsBrokerTest

 

I noticed that on my host I don't see this line:
{quote}WARN BrokerService - Memory Usage for the Broker (1024mb) is more than 
the maximum available for the JVM: 491 mb - resetting to 70% of maximum 
available: 343 mb
{quote}
 

But I manually set the memory to 343 MB and the test still succeeds.

 

 


[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-15 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743401#comment-16743401
 ] 

Alan Protasio commented on AMQ-7132:


How do you run the tests? Like this?

 

mvn clean install -Dtest=RecoveryStatsBrokerTest

 

I noticed that on my host I don't see this line:
{quote}WARN BrokerService - Memory Usage for the Broker (1024mb) is more than 
the maximum available for the JVM: 491 mb - resetting to 70% of maximum 
available: 343 mb
{quote}
 

But I manually set the memory to 343 MB and the test still succeeds.

 

 


[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-15 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743352#comment-16743352
 ] 

Alan Protasio commented on AMQ-7132:


Hi [~cshannon]

I just ran it on my Mac and on Linux, and all tests seem to be OK:

mvn clean install -Dtest=RecoveryStatsBrokerTest
---
 T E S T S
---

---
 T E S T S
---
Running org.apache.activemq.broker.RecoveryStatsBrokerTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.536 sec - in 
org.apache.activemq.broker.RecoveryStatsBrokerTest

Results :

Tests run: 3, Failures: 0, Errors: 0, Skipped: 0

 

What is weird is that the test timed out on a line before it does anything 
interesting; it is just creating the broker and sending messages:

[https://github.com/apache/activemq/blob/master/activemq-unit-tests/src/test/java/org/apache/activemq/broker/RecoveryStatsBrokerTest.java#L152]

 

At this stage, it does not make any difference whether it is a NORMAL, 
FULL_RECOVERY or UNCLEAN_SHUTDOWN test. Are all of them failing in your setup?

Are you using Java 11 already? I am running the tests with Java 8.

 


[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-14 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742249#comment-16742249
 ] 

Alan Protasio commented on AMQ-7132:


Thanks for the help guys!! :D


[jira] [Comment Edited] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-14 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742231#comment-16742231
 ] 

Alan Protasio edited comment on AMQ-7132 at 1/14/19 3:53 PM:
-

Hi [~cshannon], thanks.

The check you commented on is done in the StoredDestinationMarshaller:

[https://github.com/apache/activemq/pull/336/commits/eebcdbfe7b1681e6093439ff5be281a585a307c4#diff-e3b8fff8c2133dfd70999705bbb558b3R2580]

If we need to change the MessageStoreStatisticsMarshaller in the future, we
will need to add a KahaDB version check there. For now, though, the check
belongs in the StoredDestinationMarshaller, because the difference between the
versions is whether the StoredDestinationMarshaller has this information or not.
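
The version check described above can be pictured with a small sketch. It is an illustration only: the class, the field names, and the assumption that the statistics fields first appear in store version 7 are hypothetical and are not taken from the PR. The idea is that the optional fields are read and written only when the store version says they are present.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; this is NOT the real KahaDB StoredDestinationMarshaller.
final class VersionGatedMarshallerSketch {
    // Assumption for illustration: statistics were added in store version 7.
    static final int STATS_VERSION = 7;

    static final class Record {
        long locationIndexPageId;
        boolean hasStats;        // stays false when an older layout is read
        long messageCount;
        long totalMessageSize;
    }

    // Writes the record; the statistics fields are emitted only for new versions.
    static byte[] write(Record r, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(r.locationIndexPageId);
            if (version >= STATS_VERSION) {
                out.writeLong(r.messageCount);
                out.writeLong(r.totalMessageSize);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the record; for old versions the caller sees hasStats == false
    // and has to fall back to traversing the location index.
    static Record read(byte[] data, int version) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            Record r = new Record();
            r.locationIndexPageId = in.readLong();
            if (version >= STATS_VERSION) {
                r.hasStats = true;
                r.messageCount = in.readLong();
                r.totalMessageSize = in.readLong();
            }
            return r;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping the check in the outer marshaller means the nested statistics marshaller can stay version-agnostic until its own layout changes.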


was (Author: alanprot):
Hi [~cshannon], Thanks.

The check you commented there is being done in the StoredDestinationMarshaller:

[https://github.com/apache/activemq/pull/336/commits/eebcdbfe7b1681e6093439ff5be281a585a307c4#diff-e3b8fff8c2133dfd70999705bbb558b3R2580]

 If in the future we need to change the MessageStoreStatisticsMarshaller then 
we need to add a kahadb version check there. But for now, the check should be 
in the StoredDestinationMarshaller, as the difference btw the versions is if 
StoredDestinationMarshaller has or has not this information.

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.16.0
>
>

[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-14 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742231#comment-16742231
 ] 

Alan Protasio commented on AMQ-7132:


Hi [~cshannon], thanks.

The check you commented on is done in the StoredDestinationMarshaller:

[https://github.com/apache/activemq/pull/336/commits/eebcdbfe7b1681e6093439ff5be281a585a307c4#diff-e3b8fff8c2133dfd70999705bbb558b3R2580]

If we need to change the MessageStoreStatisticsMarshaller in the future, we
will need to add a KahaDB version check there. For now, though, the check
belongs in the StoredDestinationMarshaller, because the difference between the
versions is whether the StoredDestinationMarshaller has this information or not.

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Improvement
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Assignee: Christopher L. Shannon
>Priority: Major
> Fix For: 5.16.0
>
>

[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-11 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740697#comment-16740697
 ] 

Alan Protasio commented on AMQ-7132:


Sounds good! :D Thanks [~cshannon]

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Test
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>

[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-11 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740664#comment-16740664
 ] 

Alan Protasio commented on AMQ-7132:


Hi.

I've updated the PR with a KahaDBVersionTest case that tests the migration from
KahaDB version 6 to 7.

[~cshannon] I can definitely do for MessageStoreSubscriptionStatistics the same
thing that I did for MessageStoreStatistics and keep that information in the
index as well. But I would do it in a separate PR, to keep the PRs as simple
and understandable as possible. What do you think?

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Test
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>

[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-11 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740313#comment-16740313
 ] 

Alan Protasio commented on AMQ-7132:


Hi [~gtully]  :D

I ran all the tests (activemq.tests=all) before, and they all succeeded.

I just ran this one again specifically, and it is passing.

I also have a test that verifies normal restart and recovery (full, and after
an unclean shutdown):

https://github.com/apache/activemq/pull/336/commits/7467f1cbad4e6cfd3f24e02d7c4a941666740e13#diff-fd047ddbd84966d497d865453f73

I still have to add the test for the migration from version 6 to 7 here:

[https://github.com/apache/activemq/blob/master/activemq-unit-tests/src/test/java/org/apache/activemq/store/kahadb/KahaDBVersionTest.java#L59]

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Test
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>

[jira] [Comment Edited] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-11 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740187#comment-16740187
 ] 

Alan Protasio edited comment on AMQ-7132 at 1/11/19 10:16 AM:
--

With a 1 GB index file on NFS, we see the startup time improve from 2 minutes
to 10 seconds.

The OS cache was flushed before running the test (echo 3 >
/proc/sys/vm/drop_caches).

I think the GitHub integration didn't work here, so this is the PR:

[https://github.com/apache/activemq/pull/336]


was (Author: alanprot):
With 1GB index file in a NFS we can see a improvement from 2minutes to 10 
seconds on the start up.

Before doing the test the SO cache was flushed (echo 3 > 
/proc/sys/vm/drop_caches)

 

https://github.com/apache/activemq/pull/336

 

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Test
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>

[jira] [Commented] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-11 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740187#comment-16740187
 ] 

Alan Protasio commented on AMQ-7132:


With a 1 GB index file on NFS, we see the startup time improve from 2 minutes
to 10 seconds.

The OS cache was flushed before running the test (echo 3 >
/proc/sys/vm/drop_caches).

https://github.com/apache/activemq/pull/336

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Test
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>

[jira] [Updated] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or ungraceful shutdown)

2019-01-10 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7132:
---
Summary: ActiveMQ reads lots of index pages upon startup (after a graceful 
or ungraceful shutdown)  (was: ActiveMQ reads lots of index pages upon startup 
(after a graceful or un graceful shutdown))

> ActiveMQ reads lots of index pages upon startup (after a graceful or 
> ungraceful shutdown)
> -
>
> Key: AMQ-7132
> URL: https://issues.apache.org/jira/browse/AMQ-7132
> Project: ActiveMQ
>  Issue Type: Test
>  Components: KahaDB
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>

[jira] [Created] (AMQ-7132) ActiveMQ reads lots of index pages upon startup (after a graceful or un graceful shutdown)

2019-01-10 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7132:
--

 Summary: ActiveMQ reads lots of index pages upon startup (after a 
graceful or un graceful shutdown)
 Key: AMQ-7132
 URL: https://issues.apache.org/jira/browse/AMQ-7132
 Project: ActiveMQ
  Issue Type: Test
  Components: KahaDB
Affects Versions: 5.15.8
Reporter: Alan Protasio
 Fix For: 5.16.0, 5.15.9


Hi.

We noticed that ActiveMQ reads lots of pages in the index file when it is
starting up, in order to recover the destination statistics:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/KahaDBStore.java#L819]

Currently, to do that, ActiveMQ traverses the
storedDestination.locationIndex to get the messageCount and totalMessageSize of
each destination. For destinations with lots of messages this traversal can
take a while, which makes startup slow.

In a master-slave setup, this prevents the broker from failing over quickly
and does not meet what is stated at
[http://activemq.apache.org/shared-file-system-master-slave.html]:
{quote}If you have a SAN or shared file system it can be used to provide _high
availability_ such that if a broker is killed, another broker can take over
immediately.
{quote}
One solution is to keep a summary of the destination statistics in the index
file, so that we don't need to read the whole locationIndex on startup.

The proposed code change is backward compatible but needs a bump of the KahaDB
version. If this information is not in the index, the broker falls back to the
current implementation, which means that the first start after an upgrade to
the new version still has to read the locationIndex, but subsequent restarts
will be fast.

This change should have a negligible performance impact during normal ActiveMQ
operation, as it adds only a few more bytes of data to the index and this
information will be written on checkpoints. Also, the new information stays
synchronized with the locationIndex because both are updated in the same
transaction.
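
To make the proposal concrete, here is a minimal sketch of the idea, not the code from the PR. All names are hypothetical, and a TreeMap stands in for the persistent locationIndex. The running summary is updated in the same step as the index, so reading it costs a constant amount of work, while the traversal is kept as the fallback for stores that do not carry the summary yet.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the names do not come from the ActiveMQ code base.
final class DestinationStatsSketch {
    // Stand-in for storedDestination.locationIndex: sequence id -> message size.
    private final TreeMap<Long, Integer> locationIndex = new TreeMap<>();
    private long messageCount;
    private long totalMessageSize;

    // The index entry and the summary change together, so they cannot drift apart.
    void add(long sequenceId, int size) {
        Integer previous = locationIndex.put(sequenceId, size);
        if (previous != null) {
            totalMessageSize -= previous;   // entry replaced, count unchanged
        } else {
            messageCount++;
        }
        totalMessageSize += size;
    }

    void remove(long sequenceId) {
        Integer size = locationIndex.remove(sequenceId);
        if (size != null) {
            messageCount--;
            totalMessageSize -= size;
        }
    }

    // Fast path: constant time, no index pages are touched.
    long[] summary() {
        return new long[] {messageCount, totalMessageSize};
    }

    // Fallback: full traversal, as done when the summary is not in the index.
    long[] recomputeByTraversal() {
        long count = 0;
        long size = 0;
        for (Map.Entry<Long, Integer> entry : locationIndex.entrySet()) {
            count++;
            size += entry.getValue();
        }
        return new long[] {count, size};
    }
}
```

In the real store the traversal has to load index pages from disk, which is where the startup time goes; the sketch only shows why the two paths give the same answer.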




[jira] [Commented] (AMQ-7118) KahaDB store limit can be exceeded with durable subscribers.

2019-01-07 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736168#comment-16736168
 ] 

Alan Protasio commented on AMQ-7118:


Just to clarify: the hostname is part of the message ID, so it changes the ID
and, in turn, the number of journal files. For example:

 

CommandType: KAHA_ADD_MESSAGE_COMMAND - TOPIC (DestId: 1:test), MsgId: 
ID:hostnamehostname-61683-1546886442069-4:2:1:1:1.
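
A rough way to see why the hostname length can shift the file count is the toy model below. It is only a sketch under loud assumptions: the fixed per-record overhead, the file size, and the rule that a record is never split across files are invented for illustration and do not describe the real KahaDB journal format.

```java
// Hypothetical toy model; not the actual KahaDB journal layout.
final class JournalFileCountSketch {
    // Files needed when every record is fixedBytesPerRecord plus the message ID
    // length, and a new file is started once the next record no longer fits.
    static int filesNeeded(int recordCount, int fixedBytesPerRecord,
                           int messageIdLength, int maxFileLength) {
        int recordSize = fixedBytesPerRecord + messageIdLength;
        int recordsPerFile = maxFileLength / recordSize;
        if (recordsPerFile == 0) {
            throw new IllegalArgumentException("record does not fit in one file");
        }
        // Ceiling division: a partly filled last file still counts.
        return (recordCount + recordsPerFile - 1) / recordsPerFile;
    }
}
```

With these made-up numbers (1000 records, 100 fixed bytes, 5000-byte files), IDs of length 41, 49 and 57, which correspond to the three hostnames used in this thread, need 29, 31 and 33 files respectively, so an assertion on an exact file count would hold for only one hostname length.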

 

> KahaDB store limit can be exceeded with durable subscribers.
> 
>
> Key: AMQ-7118
> URL: https://issues.apache.org/jira/browse/AMQ-7118
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.16.0, 5.15.8
> Environment: JDK 8
>Reporter: Jamie goodyear
>Priority: Critical
> Fix For: 5.16.0, 5.15.8
>
> Attachments: kahaCommands.jpg
>
>
> KahaDB store limit can be exceeded with durable subscribers.
> AMQ with store limit set, we can observe that the usage continues to increase 
> AFTER PFC is engaged. Given time, this growth stabilizes. The issue of having 
> exceeded the store limit remains.
> See below output from KahaDB dump in attachments:
> This appears to be caused by checkpointAckMessageFileMap. The log files are 
> not GC'd, and the KAHA_ACK_MESSAGE_FILE_MAP_COMMAND is replicated and the DB 
> log files continue to expand - this can become exponential. Side effect of 
> also not checking storage size in checkpoint update can cause the DB log 
> files to exceed any set limits. The real critical part is the duplicated and 
> leaking Kaha messages which appears to happen with durable subscribers.
>  
>  
>  




[jira] [Comment Edited] (AMQ-7118) KahaDB store limit can be exceeded with durable subscribers.

2019-01-06 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735270#comment-16735270
 ] 

Alan Protasio edited comment on AMQ-7118 at 1/6/19 6:47 PM:


I think I just found the problem...

The test depends on the length of your host's hostname...

Ex (long hostname):

sudo hostname hostnamehostnamehostname

Running org.apache.activemq.bugs.AMQ7118Test
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.483 sec <<< 
FAILURE! - in org.apache.activemq.bugs.AMQ7118Test
 testCompaction(org.apache.activemq.bugs.AMQ7118Test) Time elapsed: 2.094 sec 
<<< FAILURE!
 java.lang.AssertionError: expected:<21> but was:<22>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:834)
 at org.junit.Assert.assertEquals(Assert.java:645)
 at org.junit.Assert.assertEquals(Assert.java:631)
 at org.apache.activemq.bugs.AMQ7118Test.checkFiles(AMQ7118Test.java:193)
 at org.apache.activemq.bugs.AMQ7118Test.testCompaction(AMQ7118Test.java:116)

 

Ex (short hostname):

sudo hostname hostname

Running org.apache.activemq.bugs.AMQ7118Test
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 15.718 sec <<< 
FAILURE! - in org.apache.activemq.bugs.AMQ7118Test
 testCompaction(org.apache.activemq.bugs.AMQ7118Test) Time elapsed: 15.372 sec 
<<< FAILURE!
 org.junit.ComparisonFailure: expected: but was:
 at org.junit.Assert.assertEquals(Assert.java:115)
 at org.junit.Assert.assertEquals(Assert.java:144)
 at org.apache.activemq.bugs.AMQ7118Test.checkFiles(AMQ7118Test.java:194)
 at org.apache.activemq.bugs.AMQ7118Test.testCompaction(AMQ7118Test.java:135)

 

But the test runs successfully if the hostname is set to "hostnamehostname"

sudo hostname hostnamehostname

---
 T E S T S
 ---
 Running org.apache.activemq.bugs.AMQ7118Test
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.281 sec - 
in org.apache.activemq.bugs.AMQ7118Test

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

 

 


was (Author: alanprot):
I think just found the problem...

The test is relying in the size of the hostname of your host...

Ex (long hostname):

sudo hostname hostnamehostnamehostname

Running org.apache.activemq.bugs.AMQ7118Test
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.483 sec <<< 
FAILURE! - in org.apache.activemq.bugs.AMQ7118Test
testCompaction(org.apache.activemq.bugs.AMQ7118Test) Time elapsed: 2.094 sec 
<<< FAILURE!
java.lang.AssertionError: expected:<21> but was:<22>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:834)
 at org.junit.Assert.assertEquals(Assert.java:645)
 at org.junit.Assert.assertEquals(Assert.java:631)
 at org.apache.activemq.bugs.AMQ7118Test.checkFiles(AMQ7118Test.java:193)
 at org.apache.activemq.bugs.AMQ7118Test.testCompaction(AMQ7118Test.java:116)

 

Ex (short hostname):

sudo hostname hostname

Running org.apache.activemq.bugs.AMQ7118Test
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 15.718 sec <<< 
FAILURE! - in org.apache.activemq.bugs.AMQ7118Test
testCompaction(org.apache.activemq.bugs.AMQ7118Test) Time elapsed: 15.372 sec 
<<< FAILURE!
org.junit.ComparisonFailure: expected: but was:
 at org.junit.Assert.assertEquals(Assert.java:115)
 at org.junit.Assert.assertEquals(Assert.java:144)
 at org.apache.activemq.bugs.AMQ7118Test.checkFiles(AMQ7118Test.java:194)
 at org.apache.activemq.bugs.AMQ7118Test.testCompaction(AMQ7118Test.java:135)

 

But the test runs successfully with if i set the hostname to "hostnamehostname"

sudo hostname hostnamehostname

---
 T E S T S
---
Running org.apache.activemq.bugs.AMQ7118Test
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.281 sec - in 
org.apache.activemq.bugs.AMQ7118Test

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

 

 

> KahaDB store limit can be exceeded with durable subscribers.
> 
>
> Key: AMQ-7118
> URL: https://issues.apache.org/jira/browse/AMQ-7118
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.16.0, 5.15.8
> Environment: JDK 8
>Reporter: Jamie goodyear
>Priority: Critical
> Fix For: 5.16.0, 5.15.8
>
> Attachments: kahaCommands.jpg
>
>
> KahaDB store limit can be exceeded with durable subscribers.
> AMQ with store limit set, we can observe that the usage continues to increase 
> AFTER PFC is engaged. Given time, this growth stabilizes. The issue of having 
> exceeded the store limit remains.
> See below output from KahaDB dump in

[jira] [Commented] (AMQ-7118) KahaDB store limit can be exceeded with durable subscribers.

2019-01-06 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735270#comment-16735270
 ] 

Alan Protasio commented on AMQ-7118:


I think I just found the problem...

The test depends on the length of your host's hostname...

Ex (long hostname):

sudo hostname hostnamehostnamehostname

Running org.apache.activemq.bugs.AMQ7118Test
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.483 sec <<< 
FAILURE! - in org.apache.activemq.bugs.AMQ7118Test
testCompaction(org.apache.activemq.bugs.AMQ7118Test) Time elapsed: 2.094 sec 
<<< FAILURE!
java.lang.AssertionError: expected:<21> but was:<22>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:834)
 at org.junit.Assert.assertEquals(Assert.java:645)
 at org.junit.Assert.assertEquals(Assert.java:631)
 at org.apache.activemq.bugs.AMQ7118Test.checkFiles(AMQ7118Test.java:193)
 at org.apache.activemq.bugs.AMQ7118Test.testCompaction(AMQ7118Test.java:116)

 

Ex (short hostname):

sudo hostname hostname

Running org.apache.activemq.bugs.AMQ7118Test
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 15.718 sec <<< 
FAILURE! - in org.apache.activemq.bugs.AMQ7118Test
testCompaction(org.apache.activemq.bugs.AMQ7118Test) Time elapsed: 15.372 sec 
<<< FAILURE!
org.junit.ComparisonFailure: expected: but was:
 at org.junit.Assert.assertEquals(Assert.java:115)
 at org.junit.Assert.assertEquals(Assert.java:144)
 at org.apache.activemq.bugs.AMQ7118Test.checkFiles(AMQ7118Test.java:194)
 at org.apache.activemq.bugs.AMQ7118Test.testCompaction(AMQ7118Test.java:135)

 

But the test runs successfully if I set the hostname to "hostnamehostname":

sudo hostname hostnamehostname

---
 T E S T S
---
Running org.apache.activemq.bugs.AMQ7118Test
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.281 sec - in 
org.apache.activemq.bugs.AMQ7118Test

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
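
Why would the hostname length matter? One possible explanation, which is an assumption and not something confirmed in this thread, is that identifiers stored with each message embed the host's name, so a longer hostname makes every journal record slightly larger and can push the data into one more journal file. The sketch below only illustrates that arithmetic. The ID layout, record count, payload size and file length are all hypothetical and are not taken from AMQ7118Test.

```java
import java.nio.charset.StandardCharsets;

class HostnameSizeSketch {
    // Hypothetical ID layout that embeds the hostname; not the real ActiveMQ format.
    static String fakeMessageId(String hostname, long sequence) {
        return "ID:" + hostname + "-0-0-1:1:1:1:" + sequence;
    }

    // Total bytes written for `count` records, each a fixed payload plus its ID.
    static long totalBytes(String hostname, int count, int payloadBytes) {
        long total = 0;
        for (long seq = 1; seq <= count; seq++) {
            total += payloadBytes
                    + fakeMessageId(hostname, seq).getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    // Number of fixed-size journal files needed to hold `totalBytes` (ceiling division).
    static long filesNeeded(long totalBytes, long maxFileLength) {
        return (totalBytes + maxFileLength - 1) / maxFileLength;
    }

    public static void main(String[] args) {
        String[] hosts = {"hostname", "hostnamehostname", "hostnamehostnamehostname"};
        for (String host : hosts) {
            long bytes = totalBytes(host, 1000, 100);
            System.out.println(host + " -> " + bytes + " bytes, "
                    + filesNeeded(bytes, 10_000) + " file(s)");
        }
    }
}
```

If something like this is what happens, an assertion on an exact number of journal files can only pass for hostnames in a narrow length range.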

 

 

> KahaDB store limit can be exceeded with durable subscribers.
> 
>
> Key: AMQ-7118
> URL: https://issues.apache.org/jira/browse/AMQ-7118
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.16.0, 5.15.8
> Environment: JDK 8
>Reporter: Jamie goodyear
>Priority: Critical
> Fix For: 5.16.0, 5.15.8
>
> Attachments: kahaCommands.jpg
>
>
> KahaDB store limit can be exceeded with durable subscribers.
> AMQ with store limit set, we can observe that the usage continues to increase 
> AFTER PFC is engaged. Given time, this growth stabilizes. The issue of having 
> exceeded the store limit remains.
> See below output from KahaDB dump in attachments:
> This appears to be caused by checkpointAckMessageFileMap. The log files are 
> not GC'd, and the KAHA_ACK_MESSAGE_FILE_MAP_COMMAND is replicated and the DB 
> log files continue to expand - this can become exponential. Side effect of 
> also not checking storage size in checkpoint update can cause the DB log 
> files to exceed any set limits. The real critical part is the duplicated and 
> leaking Kaha messages which appears to happen with durable subscribers.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (AMQ-7118) KahaDB store limit can be exceeded with durable subscribers.

2019-01-05 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735041#comment-16735041
 ] 

Alan Protasio commented on AMQ-7118:


The unit test `testCompaction` seems to be failing on master.

> KahaDB store limit can be exceeded with durable subscribers.
> 
>
> Key: AMQ-7118
> URL: https://issues.apache.org/jira/browse/AMQ-7118
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.16.0, 5.15.8
> Environment: JDK 8
>Reporter: Jamie goodyear
>Priority: Critical
> Fix For: 5.16.0, 5.15.8
>
> Attachments: kahaCommands.jpg
>
>
> KahaDB store limit can be exceeded with durable subscribers.
> AMQ with store limit set, we can observe that the usage continues to increase 
> AFTER PFC is engaged. Given time, this growth stabilizes. The issue of having 
> exceeded the store limit remains.
> See below output from KahaDB dump in attachments:
> This appears to be caused by checkpointAckMessageFileMap. The log files are 
> not GC'd, and the KAHA_ACK_MESSAGE_FILE_MAP_COMMAND is replicated and the DB 
> log files continue to expand - this can become exponential. Side effect of 
> also not checking storage size in checkpoint update can cause the DB log 
> files to exceed any set limits. The real critical part is the duplicated and 
> leaking Kaha messages which appears to happen with durable subscribers.
>  
>  
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Attachment: (was: brokerA - JMX View.png)

> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
> Attachments: brokerA - After Change.png, brokerA - JMX View.png, 
> brokerB - After Change .png, brokerB - Before Change.png, brokerB - JMX 
> view.png
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Description: 
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

https://issues.apache.org/jira/browse/AMQ-3109

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

I added some screenshots from before and after the change.

 

BrokerA -> BrokerB (duplex)
{quote} 
 
 {{        }}
 {{        }}
  
{quote}
 

So, in broker A the bean is:

org.apache.activemq:type=Broker,brokerName=localhost,connector=networkConnectors,networkConnectorName=connector

In broker B, the bean name is:

org.apache.activemq:brokerName=localhost,connector=duplexNetworkConnectors,networkConnectorName=#4,networkBridge=tcp_//127.0.0.1_49610,type=Broker

As we can see, the connector attribute is different (connector=networkConnectors 
vs. connector=duplexNetworkConnectors).
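
To make the difference concrete, here is a minimal sketch that parses the two bean names quoted above with the standard javax.management.ObjectName class and shows that a lookup restricted to connector=networkConnectors finds broker A's bean but not broker B's. Whether the web console performs exactly this kind of lookup is an assumption; the class and method names are only illustrative.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

class ConnectorKeySketch {
    // Parses a bean name, turning the checked exception into an unchecked one.
    static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(name, e);
        }
    }

    // Returns the value of the "connector" key property of a bean name.
    static String connectorKey(String beanName) {
        return parse(beanName).getKeyProperty("connector");
    }

    // True if the bean name matches the given ObjectName pattern.
    static boolean matches(String pattern, String beanName) {
        return parse(pattern).apply(parse(beanName));
    }

    public static void main(String[] args) {
        String brokerA = "org.apache.activemq:type=Broker,brokerName=localhost,"
                + "connector=networkConnectors,networkConnectorName=connector";
        String brokerB = "org.apache.activemq:brokerName=localhost,"
                + "connector=duplexNetworkConnectors,networkConnectorName=#4,"
                + "networkBridge=tcp_//127.0.0.1_49610,type=Broker";
        // Hypothetical lookup pattern that only knows about networkConnectors.
        String pattern = "org.apache.activemq:type=Broker,connector=networkConnectors,*";

        System.out.println("broker A connector=" + connectorKey(brokerA)
                + ", matched=" + matches(pattern, brokerA));
        System.out.println("broker B connector=" + connectorKey(brokerB)
                + ", matched=" + matches(pattern, brokerB));
    }
}
```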

 

 

 

  was:
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

https://issues.apache.org/jira/browse/AMQ-3109

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

I added some screenshots from before and after the change.

 

BrokerA -> BrokerB (duplex)
{quote} 

 {{        }}
 {{        }}
 
{quote}
 

 


> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
> Attachments: brokerA - After Change.png, brokerA - JMX View.png, 
> brokerB - After Change .png, brokerB - Before Change.png, brokerB - JMX 
> view.png
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
> I added some screenshots from before and after the change.
>  
> BrokerA -> BrokerB (duplex)
> {quote} 
>  
>  {{         name="connector" password="admin" uri="static:(tcp://localhost:61616)" 
> userName="admin">}}
>  {{        }}
>   
> {quote}
>  
> So, in broker A the bean is:
> org.apache.activemq:type=Broker,brokerName=localhost,connector=networkConnectors,networkConnectorName=connector
>  
> In broker B, the bean name is:
> org.apache.activemq:brokerName=localhost,connector=duplexNetworkConnectors,networkConnectorName=#4,networkBridge=tcp_//127.0.0.1_49610,type=Broker
>  
> As we can see, the connector attribute is different 
> (connector=networkConnectors vs. connector=duplexNetworkConnectors).
>  
>  
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Description: 
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

https://issues.apache.org/jira/browse/AMQ-3109

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

I added some screenshots from before and after the change.

 

BrokerA -> BrokerB (duplex)
{quote} 

 {{        }}
 {{        }}
 
{quote}
 

 

  was:
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

https://issues.apache.org/jira/browse/AMQ-3109

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

I added some screenshots from before and after the change.

 

BrokerA -> BrokerB (duplex)

 
{{ }}
{{        }}
{{        }}
{{ }}

 

 


> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
> Attachments: brokerA - After Change.png, brokerA - JMX View.png, 
> brokerB - After Change .png, brokerB - Before Change.png, brokerB - JMX 
> view.png
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
> I added some screenshots from before and after the change.
>  
> BrokerA -> BrokerB (duplex)
> {quote} 
> 
>  {{         name="connector" password="admin" uri="static:(tcp://localhost:61616)" 
> userName="admin">}}
>  {{        }}
>  
> {quote}
>  
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Description: 
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

https://issues.apache.org/jira/browse/AMQ-3109

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

I added some screenshots from before and after the change.

 

BrokerA -> BrokerB (duplex)

 
{{ }}
{{        }}
{{        }}
{{ }}

 

 

  was:
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

https://issues.apache.org/jira/browse/AMQ-3109

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

 


> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
> Attachments: brokerA - After Change.png, brokerA - JMX View.png, 
> brokerB - After Change .png, brokerB - Before Change.png, brokerB - JMX 
> view.png
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
> I added some screenshots from before and after the change.
>  
> BrokerA -> BrokerB (duplex)
>  
> {{ }}
> {{         name="connector" password="admin" uri="static:(tcp://localhost:61616)" 
> userName="admin">}}
> {{        }}
> {{ }}
>  
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Attachment: brokerA - JMX View.png

> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
> Attachments: brokerA - After Change.png, brokerA - JMX View.png, 
> brokerB - After Change .png, brokerB - Before Change.png, brokerB - JMX 
> view.png
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Attachment: brokerB - JMX view.png
brokerB - Before Change.png
brokerB - After Change .png
brokerA - JMX View.png
brokerA - After Change.png

> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
> Attachments: brokerA - After Change.png, brokerA - JMX View.png, 
> brokerB - After Change .png, brokerB - Before Change.png, brokerB - JMX 
> view.png
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Description: 
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

https://issues.apache.org/jira/browse/AMQ-3109

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

 

  was:
Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

 


> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> https://issues.apache.org/jira/browse/AMQ-3109
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
>  




[jira] [Updated] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker console

2018-11-26 Thread Alan Protasio (JIRA)



 [ 
https://issues.apache.org/jira/browse/AMQ-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Protasio updated AMQ-7112:
---
Summary: Network Bridges not showing Duplex bridges on the Remote broker 
console  (was: Network Bridges not showing Duplex bridges on the Remote broker)

> Network Bridges not showing Duplex bridges on the Remote broker console
> ---
>
> Key: AMQ-7112
> URL: https://issues.apache.org/jira/browse/AMQ-7112
> Project: ActiveMQ
>  Issue Type: Test
>  Components: networkbridge
>Affects Versions: 5.15.8
>Reporter: Alan Protasio
>Priority: Major
> Fix For: 5.16.0, 5.15.9
>
>
> Hi,
> I created a duplex network connector and noticed that the "[Created By 
> Duplex|http://localhost:8161/admin/network.jsp]" column was false in the 
> local broker. I found this weird, and then noticed that the column should have 
> the value true on the remote broker, as described here:
> [http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]
> After investigating why, I noticed that the name of the remote bean changed here:
> https://issues.apache.org/jira/browse/AMQ-4237
> [https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]
> After this change, this information stopped being displayed in the console.
>  




[jira] [Created] (AMQ-7112) Network Bridges not showing Duplex bridges on the Remote broker

2018-11-26 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7112:
--

 Summary: Network Bridges not showing Duplex bridges on the Remote 
broker
 Key: AMQ-7112
 URL: https://issues.apache.org/jira/browse/AMQ-7112
 Project: ActiveMQ
  Issue Type: Test
  Components: networkbridge
Affects Versions: 5.15.8
Reporter: Alan Protasio
 Fix For: 5.16.0, 5.15.9


Hi,

I created a duplex network connector and noticed that the "[Created By 
Duplex|http://localhost:8161/admin/network.jsp]" column was false in the local 
broker. I found this weird, and then noticed that the column should have the 
value true on the remote broker, as described here:

[http://sensatic.net/activemq/how-to-monitor-activemq-networks.html]

After investigating why, I noticed that the name of the remote bean changed here:

https://issues.apache.org/jira/browse/AMQ-4237

[https://svn.apache.org/viewvc/activemq/trunk/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java?r1=1425871=1425870=1425871]

After this change, this information stopped being displayed in the console.

 




[jira] [Created] (AMQ-7107) QueueBrowsingTest and UsageBlockedDispatchTest are failing with ConcurrentStoreAndDispachQueues=false

2018-11-21 Thread Alan Protasio (JIRA)

Alan Protasio created AMQ-7107:
--

Summary: QueueBrowsingTest and UsageBlockedDispatchTest are
failing with ConcurrentStoreAndDispachQueues=false
Key: AMQ-7107
URL: https://issues.apache.org/jira/browse/AMQ-7107
Project: ActiveMQ
Issue Type: Test
Reporter: Alan Protasio

Hi,

I was working towards https://issues.apache.org/jira/browse/AMQ-7028 and after
my change QueueBrowsingTest and UsageBlockedDispatchTest started failing.

QueueBrowsingTest was changed by https://issues.apache.org/jira/browse/AMQ-4495
and it tests whether a full page was paged in by the cursor.
The problem is that this test was only succeeding because of how
ConcurrentStoreAndDispatchQueues=true is implemented. When this flag is set to
true, we increase the memory usage when the async task starts and decrease
it when the task is done:

https://github.com/alanprot/activemq/blob/master/activemq-broker/src/main/java/org/apache/activemq/broker/region/Queue.java#L897

So, imagine this timeline:

1. Send message 1

2. The cursor gets full and the cache is disabled

3. Message 1 finishes and the memory is freed

4. Messages 2 to 100 are sent and the cache is skipped

5. We browse the queue and the cursor can page in messages because the
cursorMemory is not full

Now with ConcurrentStoreAndDispatchQueues=false:

1. Send message 1

2. Send message 2

3. The cursor gets full and the cache is disabled

4. Messages 3 to 100 are sent and the cache is skipped (memory still full)

5. We browse the queue and the cursor cannot page in messages because the
cursorMemory is full

So, to make this test work with ConcurrentStoreAndDispatchQueues=false,
I made a simple change to it: after sending all the messages, consume one of
them to make sure that the cursor has memory to page in messages.
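
The two timelines can be replayed with a toy model. This is only a sketch under simplifying assumptions, not ActiveMQ code: the class, its methods and the limit of two "message units" are hypothetical, and the async store task is modelled as one extra unit that is taken and immediately given back.

```java
class CursorMemoryToy {
    private final int limit;              // hypothetical cursor memory limit, in message units
    private int usage;                    // units currently held
    private boolean cacheEnabled = true;

    CursorMemoryToy(int limit) {
        this.limit = limit;
    }

    private void acquire() {
        usage++;
        if (usage >= limit) {
            cacheEnabled = false;         // cursor is full: stop caching new messages
        }
    }

    private void release() {
        usage--;
    }

    // Sends one message, with or without the concurrent (async) store path.
    void send(boolean concurrentStoreAndDispatch) {
        if (!cacheEnabled) {
            return;                       // cache skipped: the message only goes to the store
        }
        acquire();                        // the cached message itself
        if (concurrentStoreAndDispatch) {
            acquire();                    // extra usage held while the async store task runs
            release();                    // given back when the task is done
        }
    }

    // Consuming a cached message frees its unit of cursor memory.
    void consumeOne() {
        if (usage > 0) {
            release();
        }
    }

    boolean canPageIn() {
        return usage < limit;
    }

    static boolean canBrowseAfterSending(int messages, boolean concurrent, boolean consumeOneFirst) {
        CursorMemoryToy cursor = new CursorMemoryToy(2);
        for (int i = 0; i < messages; i++) {
            cursor.send(concurrent);
        }
        if (consumeOneFirst) {
            cursor.consumeOne();
        }
        return cursor.canPageIn();
    }

    public static void main(String[] args) {
        System.out.println("flag=true:                " + canBrowseAfterSending(100, true, false));
        System.out.println("flag=false:               " + canBrowseAfterSending(100, false, false));
        System.out.println("flag=false, one consumed: " + canBrowseAfterSending(100, false, true));
    }
}
```

In this model the browse only succeeds with the flag set to false once a message has been consumed, which is the change made to the test.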

A similar thing is happening with UsageBlockedDispatchTest.

I created 2 more tests that run the same scenarios with
ConcurrentStoreAndDispatchQueues=false, and changed them slightly so that they
work with this flag set to false.


[jira] [Commented] (AMQ-7093) KahaDB index, recover free pages in parallel with start (Continued)

2018-11-14 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686707#comment-16686707
 ] 

Alan Protasio commented on AMQ-7093:


[~cshannon]

 

It is probably this change:

https://github.com/apache/activemq/commit/8a1abd9bb2744de70af11053f1755116c40ec55f

> KahaDB index, recover free pages in parallel with start (Continued)
> ---
>
> Key: AMQ-7093
> URL: https://issues.apache.org/jira/browse/AMQ-7093
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.7
>Reporter: Jeff Genender
>Priority: Major
> Fix For: 5.16.0, 5.15.8
>
>
> AMQ-7082 was implemented to create a concurrent thread to handle the free 
> page recovery.  It was included as a part of 5.15.7.  There was some 
> additional add-on coding that was not a part of that release which had 
> introduced some potential bugs.  This was made to track the additional 
> commits for this.




[jira] [Comment Edited] (AMQ-7091) O(n) Memory consumption when broker has inactive durable subscribes causing OOM

2018-11-08 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678860#comment-16678860
 ] 

Alan Protasio edited comment on AMQ-7091 at 11/8/18 3:49 PM:
-

Hi [~gtully]

Thanks again! haha :D

 
{quote}the solution is to have the data in the pageFile and pageCache such that 
it can get flushed from memory at the cost of accessing from pages in normal 
operation.
{quote}
Did you see my last proposed change? This change does not require a new index 
version (kahadb) and will not write anything new to the index.

I don't know why the bot didn't add the comment (updated the PR) here, but I'm 
not adding a new BTree index anymore... I'm reusing the data that is already 
there:

[https://github.com/apache/activemq/pull/315/commits/98d1be6acaee79f9acd72d2c7c2bdb2358638e18]

We already have this information in another format in the ListIndex ackPositions, so I'm using it.

My earlier comment was that, today, to discover whether a message is still referenced by 
a subscriber we only do a sd.messageReferences.get(sequenceId) and check whether 
the reference count is 0.

With my change, when acking a message we will do something similar to what 
we already do when sending a message.

See:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L2924]

and

[https://github.com/apache/activemq/pull/315/commits/98d1be6acaee79f9acd72d2c7c2bdb2358638e18#diff-e3b8fff8c2133dfd70999705bbb558b3R2935]


[jira] [Commented] (AMQ-7091) O(n) Memory consumption when broker has inactive durable subscribes causing OOM

2018-11-08 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680307#comment-16680307
 ] 

Alan Protasio commented on AMQ-7091:


[~gtully] [~jgenender]

I did some performance tests before and after the change:

The test scenarios were:

1, 10, 100 consumers

Producing and consuming messages in parallel

Producing all messages before starting the consumers

The results are here:

[https://github.com/alanprot/activemq/blob/master/AMQ-7091.md]

I used KahaDBDurableTopicTest as the base for this test.

It seems that this change has no impact performance-wise.

> O(n) Memory consumption when broker has inactive durable subscribes causing 
> OOM
> ---
>
> Key: AMQ-7091
> URL: https://issues.apache.org/jira/browse/AMQ-7091
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.7
>Reporter: Alan Protasio
>Priority: Major
> Attachments: After.png, Before.png, 
> InactiveDurableSubscriberTest.java, memoryAllocation.jpg
>
>
> Hi :D
> One of our brokers was bouncing indefinitely due to OOM even though the load 
> (TPS) was pretty low.
> From the memory dump I could see that almost 90% of the memory was being 
> used by the 
> [messageReferences|https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L2368]
>  TreeMap, which keeps track of which messages were already acked by all 
> subscribers in order to delete them.
> This seems to be a problem: if the broker has an inactive durable 
> subscriber, the memory footprint increases proportionally (O(n)) with the number 
> of messages sent to the topic in question, causing the broker to die from OOM 
> sooner or later (the high memory footprint continues even after a restart).
> You can find attached (memoryAllocation.jpg) a screenshot showing my broker 
> using 90% of the memory to keep track of those messages, making it barely 
> usable.
> Looking at the code, I could change messageReferences to 
> use a BTreeIndex:
> final TreeMap messageReferences = new TreeMap<>();
>  + BTreeIndex messageReferences;
> With this change, the memory allocation of the broker stabilized and the 
> broker didn't run out of memory anymore.
> Attached you can see the code that I used to reproduce this scenario, as well as 
> the memory utilization (heap and GC graphs) before and after this change.
> Before the change the broker died in 5 minutes and I could send 48. After 
> the change the broker was still pretty healthy after 5 minutes and I could 
> send 2265000 to the topic (almost 5x more due to high GC pauses).
>  
> All tests are passing: mvn clean install -Dactivemq.tests=all
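
To make the growth described in the issue concrete, here is a minimal sketch of a per-message reference count kept in an in-memory TreeMap. It is not the MessageDatabase code: the class name, the method names and the Long/Long key and value types are assumptions made only for this illustration. It shows that while one durable subscription never acks, no entry is ever removed, so the map grows with every message sent.

```java
import java.util.TreeMap;

// Hypothetical stand-in for the in-memory bookkeeping described in the issue.
final class MessageReferencesSketch {
    // sequenceId -> number of durable subscriptions that have not acked it yet
    // (the key and value types are an assumption for this sketch)
    private final TreeMap<Long, Long> messageReferences = new TreeMap<>();
    private final long subscriptionCount;

    MessageReferencesSketch(long subscriptionCount) {
        this.subscriptionCount = subscriptionCount;
    }

    // Every new message starts out referenced by every durable subscription.
    void send(long sequenceId) {
        messageReferences.put(sequenceId, subscriptionCount);
    }

    // One subscription acks the message; the entry is dropped only when the
    // last reference goes away.
    void ack(long sequenceId) {
        Long count = messageReferences.get(sequenceId);
        if (count == null) {
            return;
        }
        if (count <= 1) {
            messageReferences.remove(sequenceId);
        } else {
            messageReferences.put(sequenceId, count - 1);
        }
    }

    int trackedMessages() {
        return messageReferences.size();
    }
}
```

With two subscriptions where only one of them acks, trackedMessages() stays equal to the number of messages sent, which is the O(n) heap usage reported above.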




[jira] [Commented] (AMQ-7093) KahaDB index, recover free pages in parallel with start (Continued)

2018-11-08 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679933#comment-16679933
 ] 

Alan Protasio commented on AMQ-7093:


Yeah... it seems like an important fix...

About AMQ-7091, it will not require an index upgrade anymore...

The last proposed change will get the information from sd.ackPositions (this is 
already in the index file today).

I could still keep a copy of this data 100% in memory, but I don't think it is 
worth it, as we already read this information when sending messages.


[jira] [Commented] (AMQ-7091) O(n) Memory consumption when broker has inactive durable subscribes causing OOM

2018-11-07 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678860#comment-16678860
 ] 

Alan Protasio commented on AMQ-7091:


Hi [~gtully]

Thanks again! haha :D

 
{quote}the solution is to have the data in the pageFile and pageCache such that 
it can get flushed from memory at the cost of accessing from pages in normal 
operation.
{quote}
Did you see my last proposed change?

I don't know why the bot didn't add the comment (updated the PR) here, but I'm 
not flushing the data to the pagefile anymore... I'm reusing the data that is 
already there:

[https://github.com/apache/activemq/pull/315/commits/98d1be6acaee79f9acd72d2c7c2bdb2358638e18]

We already have this information in another format in the ListIndex ackPositions, so I'm using this.

My earlier comment was that, today, to discover whether a message is still referenced by 
a subscriber we only do a sd.messageReferences.get(sequenceId) and check whether 
the reference count is 0.

With my change, we will have to do the same thing we already do when we are 
sending a message:

See:

[https://github.com/apache/activemq/blob/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/MessageDatabase.java#L2924]

and

[https://github.com/apache/activemq/pull/315/commits/98d1be6acaee79f9acd72d2c7c2bdb2358638e18#diff-e3b8fff8c2133dfd70999705bbb558b3R2935]

 

As we already have this code when sending a message, I don't know if this is a 
problem.

 

 

 

 


[jira] [Comment Edited] (AMQ-7091) O(n) Memory consumption when broker has inactive durable subscribes causing OOM

2018-11-07 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678599#comment-16678599
 ] 

Alan Protasio edited comment on AMQ-7091 at 11/7/18 8:49 PM:
-

[~gtully] [~jgoodyear] It seems that all tests succeed! 

If you think that getting this from the ListIndex 
ackPositions is expensive, I can create a TreeMap to keep 
it in memory... but I think having this duplicated information can be error 
prone. Also, we are already accessing this index in the code... the only 
overhead is that when you ack a message, instead of getting the sequenceSet only for 
the subscriber that is acking, we have to get the sequenceSet for all 
subscribers.

See the "isSequenceReferenced" method.
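
A rough sketch of the idea, for readers who do not want to open the PR: instead of a separate reference count, the per-subscriber pending sets are asked whether any of them still contains the sequence. This is not the code from the PR. The class, the in-memory Map and TreeSet (standing in for the ListIndex and its sequence sets) and every method other than the isSequenceReferenced name are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the per-subscriber ackPositions data.
final class AckPositionsSketch {
    // subscription key -> sequence ids that this subscription has not acked yet
    private final Map<String, TreeSet<Long>> ackPositions = new HashMap<>();

    void addSubscription(String subscriptionKey) {
        ackPositions.computeIfAbsent(subscriptionKey, key -> new TreeSet<>());
    }

    // Sending already touches every subscription's pending set.
    void send(long sequenceId) {
        for (TreeSet<Long> pending : ackPositions.values()) {
            pending.add(sequenceId);
        }
    }

    // Acking removes the sequence for one subscription, then checks all of them.
    // Returns true when no subscription references the message anymore,
    // i.e. when the message could be deleted.
    boolean ack(String subscriptionKey, long sequenceId) {
        TreeSet<Long> pending = ackPositions.get(subscriptionKey);
        if (pending != null) {
            pending.remove(sequenceId);
        }
        return !isSequenceReferenced(sequenceId);
    }

    // The extra cost on ack: every subscription's set is consulted.
    boolean isSequenceReferenced(long sequenceId) {
        for (TreeSet<Long> pending : ackPositions.values()) {
            if (pending.contains(sequenceId)) {
                return true;
            }
        }
        return false;
    }
}
```

The trade-off is the one described above: no duplicated bookkeeping to keep in sync, at the price of one lookup per subscription on each ack.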


was (Author: alanprot):
[~gtully] [~jgoodyear] Seems that all tests succeed! 


[jira] [Comment Edited] (AMQ-7082) KahaDB index, recover free pages in parallel with start

2018-11-07 Thread Alan Protasio (JIRA)



[ 
https://issues.apache.org/jira/browse/AMQ-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678344#comment-16678344
 ] 

Alan Protasio edited comment on AMQ-7082 at 11/7/18 8:26 PM:
-

[~gtully] Makes a lot of sense.

If pages were freed after the restart, they should be removed from 
recoveredFreeList before merging, as they are already tracked in freeList. This 
will prevent them from coming back at the end of the recovery.

Sounds really good to me!
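
As a sketch of the merge step described above (not the KahaDB PageFile code): the names freeList and recoveredFreeList come from the comment, while the class, the freedSinceStart set and the use of TreeSet are assumptions made only for this illustration. The point is that a page freed after the restart may have been handed out again, so it must not be re-added from the recovered list.

```java
import java.util.TreeSet;

// Hypothetical illustration of merging a recovered free list into the live one.
final class FreeListMergeSketch {
    /**
     * @param freeList          pages currently free in the running broker
     * @param recoveredFreeList pages the background recovery found to be free
     * @param freedSinceStart   pages freed after the restart (assumed to be
     *                          tracked separately while recovery is running)
     * @return the merged free list
     */
    static TreeSet<Long> merge(TreeSet<Long> freeList,
                               TreeSet<Long> recoveredFreeList,
                               TreeSet<Long> freedSinceStart) {
        // Work on copies so the caller's sets are left untouched.
        TreeSet<Long> recovered = new TreeSet<>(recoveredFreeList);
        // Pages freed after the restart are already handled by freeList and
        // may be in use again; drop them so they cannot come back as free.
        recovered.removeAll(freedSinceStart);

        TreeSet<Long> merged = new TreeSet<>(freeList);
        merged.addAll(recovered);
        return merged;
    }
}
```

For example, if page 5 was freed after the restart and then reallocated, it is absent from freeList but may still appear in recoveredFreeList; removing it before the merge keeps it from being treated as free.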


was (Author: alanprot):
[~gtully] Makes lots of sense

If the page was freed after the restart, they should be removed from 
recoveredFreeList before merging as they are already tracked in freeList. This 
will prevent it to came back at the end of the recovery..

Sound really good for me!

> KahaDB index, recover free pages in parallel with start
> ---
>
> Key: AMQ-7082
> URL: https://issues.apache.org/jira/browse/AMQ-7082
> Project: ActiveMQ
>  Issue Type: Bug
>  Components: KahaDB
>Affects Versions: 5.15.0
>Reporter: Gary Tully
>Assignee: Gary Tully
>Priority: Major
> Fix For: 5.16.0, 5.15.8
>
>
> AMQ-6590 fixes free page loss through recovery. The recovery process can be 
> time-consuming, which prevents fast failover. Doing recovery on shutdown is 
> preferable, but it is still not ideal because it will hold onto the kahadb lock. 
> It also can stall shutdown unexpectedly.
> AMQ-7080 is going to tackle checkpointing the free list. This should help 
> avoid the need for recovery but it may still be necessary. If the perf hit is 
> significant this may need to be optional.
> There will still be the need to walk the index to find the free list.
> It is possible to run with no free list and grow, and we can do that while we 
> recover the free list in parallel, then merge the two at a safe point. This 
> we can do at startup.
> In cases where the disk is the bottleneck this won't help much, but it will 
> help failover and it will help shutdown, with a bit of luck the recovery will 
> complete before we stop.
>  
> Initially I thought this would be too complex, but if we concede some growth 
> while we recover, i.e. start with an empty free list, it should be 
> straightforward to merge with a recovered one.


