Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-31 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221439
---


Ship it!




Ship It!

- Madhan Neethiraj


On July 31, 2020, 3:57 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 31, 2020, 3:57 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java a942b9f93 
>   intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
> 63116d467 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/12/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2076/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-31 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 31, 2020, 3:57 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java a942b9f93 
  intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
63116d467 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 


Diff: https://reviews.apache.org/r/72666/diff/12/

Changes: https://reviews.apache.org/r/72666/diff/11-12/


Testing (updated)
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2076/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-31 Thread Ashutosh Mestry via Review Board


> On July 30, 2020, 11:58 p.m., Sarath Subramanian wrote:
> > good memory improvement fix!

Thanks!


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221431
---


On July 30, 2020, 10:58 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 30, 2020, 10:58 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
> 63116d467 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/11/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-30 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221431
---


Fix it, then Ship it!




good memory improvement fix!


repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Line 411 (original), 407 (patched)


nit: revert whitespace only changes in lines:
407-411, 417, 424-425, 470-472, 486-489, 493


- Sarath Subramanian


On July 30, 2020, 3:58 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 30, 2020, 3:58 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
> 63116d467 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/11/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-30 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221430
---


Fix it, then Ship it!





intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 53 (patched)


#53 is not needed, as this is handled in reset().


- Madhan Neethiraj


On July 30, 2020, 10:58 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 30, 2020, 10:58 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
> 63116d467 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/11/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-30 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 30, 2020, 10:58 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
63116d467 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 


Diff: https://reviews.apache.org/r/72666/diff/11/

Changes: https://reviews.apache.org/r/72666/diff/10-11/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-30 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221422
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 59 (patched)


set length to 0 here?
  this.length = 0;



repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Lines 81 (patched)


withInitial() be called only on the first access from a thread. Calling 
reset() from here is not useful.



repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Line 93 (original), 100 (patched)


AUDIT_EVENTS_BUFFER.get() will not call ThreadLocal.withInitial(), hence 
will not have the list reset - #81. As suggested earlier, consider using a 
method like:
  FixedBufferList getAuditEventsList() {
FixedBufferList ret = AUDIT_EVENTS_BUFFER.get();

ret.reset();

return ret;
  }



webapp/src/main/java/org/apache/atlas/notification/EntityNotificationSender.java
Line 127 (original), 127 (patched)


This file has only whitespace changes. Please revert.


- Madhan Neethiraj


On July 30, 2020, 5:55 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 30, 2020, 5:55 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
> 63116d467 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationSender.java
>  c2ae5122b 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/10/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-30 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 30, 2020, 5:55 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
63116d467 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationSender.java
 c2ae5122b 


Diff: https://reviews.apache.org/r/72666/diff/10/

Changes: https://reviews.apache.org/r/72666/diff/9-10/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-29 Thread Ashutosh Mestry via Review Board


> On July 23, 2020, 9:16 p.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
> > Lines 409 (patched)
> > 
> >
> > should we reset/clear these values for EntityAuditEventV2? It might 
> > have remnant values when reused again?

This has been addressed by clearing the unused elements from the list.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221337
---


On July 29, 2020, 6:23 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 29, 2020, 6:23 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
> 63116d467 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationSender.java
>  c2ae5122b 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/9/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-29 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221413
---



What is the impact on the memory usage improvement, after FixedBufferList use 
is removed from EntityNotificationListenerV2?


intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 41 (patched)


"incrementCapacityBy == 0" -> "incrementCapacityBy <= 0"



repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Line 278 (original), 269 (patched)


In case of any exception in instanceConverter.getAndCacheEntity(), 'events' 
list might be left with some entries - since events.toList() may not be called. 
And these entries will be used in subsequent event dispatch from this thread.

I suggest the following:
 - add methods:
   class EntityAuditListenerV2 {
 ..
 FixedBufferList getEventsList() {
   FixedBufferList ret = AUDIT_EVENTS_BUFFER.get();
 
   ret.reset();
 
   return ret;
 }
   }

   class FixedBufferList {
  ..
  public void reset() {
for (int i = 0; i < length; i++) {
  buffer.get(i).clear();
}

length = 0;
  }
  
  public List toList() {
return this.buffer.subList(0, this.length); // no need to reset 
length here
  }
   }


- Madhan Neethiraj


On July 29, 2020, 6:23 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 29, 2020, 6:23 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
> 63116d467 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationSender.java
>  c2ae5122b 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/9/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-29 Thread Ashutosh Mestry via Review Board


> On July 24, 2020, 7:14 a.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 67 (patched)
> > 
> >
> > incrementCapacityBy is already an instance member, line #29 above. It 
> > is unnecessary to pass it as  parameter to methods at #67, #79. Please 
> > review and update.

I removed the initial capacity from the class altogether.


> On July 24, 2020, 7:14 a.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 73 (patched)
> > 
> >
> > This assumes that access is done in strict ascending order sequence, 
> > without any gaps. This is not a good assumption for a generic/reusable 
> > class implementation. This will be broken with the following access pattern:
> >   FixedBufferList list = new FixedBufferList<>(10, 1);
> >   
> >   list.getForUpdate(0);
> >   list.getForUpdate(15);
> > 
> > Please review my earlier comment and update to remove this assumption. 
> > Also, I suggest to collapse following methods into one, named 
> > ensureCapacity():
> >  - request()
> >  - ensureCapacity(): it not obvious what the return value of this 
> > method is, and it is critical for the caller at #74. All this noise, and 
> > confusion can be avoided by simply collapsing this method into previous 
> > method at #67.
> >  - instantiateItems(): this method is not callable from any other 
> > context, as it simply "adds()" to the buffer

I re-worked the interface. There is no way to pass index.


> On July 24, 2020, 7:14 a.m., Madhan Neethiraj wrote:
> > webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
> > Line 168 (original), 175 (patched)
> > 
> >
> > sendNotifications() can be called from the same thread multiple times - 
> > for example when a transaction involves create/update/delete of entities 
> > i.e. a graph transaction can call more than one of the following methods:
> > - onEntitiesAdded()
> > - onEntitiesUpdated()
> > - onEntitiesDeleted()
> > - onEntitiesPurged()
> > - onClassificationsAdded()
> > - onClassificationsUpdated()
> > - onClassificationsDeleted()
> > 
> > Each call would end up calling EntityNotificationSender.send() with the 
> > notification objects, which in turn stores the objects in a list. The 
> > second call in the same transaction will end up overwriting 
> > EntityNotificationV2 instances in list buffer - which will cause the 
> > notification objects stored in EntityNotificationSender.send() be updated 
> > as well. This will result in incorrect notification to be sent. Please 
> > review this carefully and update.

I addressed by eliminating usage of FixedBufferList from this class.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221351
---


On July 24, 2020, 5:18 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 24, 2020, 5:18 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a 

Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-29 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 29, 2020, 6:23 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/model/Clearable.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java 
63116d467 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationSender.java
 c2ae5122b 


Diff: https://reviews.apache.org/r/72666/diff/9/

Changes: https://reviews.apache.org/r/72666/diff/8-9/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-24 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221351
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 67 (patched)


incrementCapacityBy is already an instance member, line #29 above. It is 
unnecessary to pass it as  parameter to methods at #67, #79. Please review and 
update.



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 73 (patched)


This assumes that access is done in strict ascending order sequence, 
without any gaps. This is not a good assumption for a generic/reusable class 
implementation. This will be broken with the following access pattern:
  FixedBufferList list = new FixedBufferList<>(10, 1);
  
  list.getForUpdate(0);
  list.getForUpdate(15);

Please review my earlier comment and update to remove this assumption. 
Also, I suggest to collapse following methods into one, named ensureCapacity():
 - request()
 - ensureCapacity(): it not obvious what the return value of this method 
is, and it is critical for the caller at #74. All this noise, and confusion can 
be avoided by simply collapsing this method into previous method at #67.
 - instantiateItems(): this method is not callable from any other context, 
as it simply "adds()" to the buffer



webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
Line 168 (original), 175 (patched)


sendNotifications() can be called from the same thread multiple times - for 
example when a transaction involves create/update/delete of entities i.e. a 
graph transaction can call more than one of the following methods:
- onEntitiesAdded()
- onEntitiesUpdated()
- onEntitiesDeleted()
- onEntitiesPurged()
- onClassificationsAdded()
- onClassificationsUpdated()
- onClassificationsDeleted()

Each call would end up calling EntityNotificationSender.send() with the 
notification objects, which in turn stores the objects in a list. The second 
call in the same transaction will end up overwriting EntityNotificationV2 
instances in list buffer - which will cause the notification objects stored in 
EntityNotificationSender.send() be updated as well. This will result in 
incorrect notification to be sent. Please review this carefully and update.


- Madhan Neethiraj


On July 24, 2020, 5:18 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 24, 2020, 5:18 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.or

Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 24, 2020, 5:18 a.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/7/


Testing (updated)
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221337
---




repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Lines 409 (patched)


should we reset/clear these values for EntityAuditEventV2? It might have 
remnant values when reused again?


- Sarath Subramanian


On July 23, 2020, 11:48 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 23, 2020, 11:48 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/7/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221333
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 51 (patched)


this assumes that the elements are accessed in ascending oeder - since 
getAndIncrementLength() simply increments the length by 1. I suggest to get rid 
of getAndIncrementLength() and update getForUpdate() as below:

  public T getForUpdate(int index) {
ensureCapacity(index + 1);

T ret = buffer.get(index);

// update length only if the accessed index is beyond the current length
if (this.length <= index) {
  this.length = index + 1;
}

return ret;
  }



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 55 (patched)


To make it easier to read, I suggest to get rid of method 
resetCurrentLength() and update toList() as below:
  public List toList(boolean resetList) {
List ret = this.buffer.subList(0, this.length);

if (resetList) {
  this.length = 0;
}
  }



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 60 (patched)


- is 'incrementCapacityBy' argument needed, given this is available in 
instance memeber this.incrementCapacityBy?
- consider renaming request() => ensureCapacity()
- consider folding adjustCapacity(), instantiateItems() into this method, 
as below:

  private void ensureCapacity(int capacity) {
if (capacity > this.buffer.size()) {
  int currCapacity = this.buffer.size();
  int newCapacity  = currCapacity + incrementCapacityBy;

  while (newCapacity < capacity) {
newCapacity += incrementCapacityBy;
  }

  this.buffer.ensureCapacity(newCapacity);

  // initialize new entries
  for (int i = currCapacity; i < newCapacity; i++) {
this.buffer.add(itemClass.newInstance());
  }
}
  }



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 71 (patched)


adjustCapacity() => ensureCapacity()



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 87 (patched)


Consider moving instantiateItems() into adjustCapacity() implementation; 
this will avoid callers of adjustCapacity() to call instantiateItems() as well .



intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java
Lines 44 (patched)


Why does this block need to be under synchronized? No state is updated in 
this block.



intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java
Lines 33 (patched)


verifyPeriodicPurge() - the method name doesn't seem to related the 
implementation. Please review and update.



repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Lines 90 (patched)


FixedBufferListAccessor class doesn't seem to add much value. Consider the 
following simper usage:

  private static final ThreadLocal> 
AUDIT_EVENTS_BUFFER =
  new ThreadLocal.withInitial(() -> new 
FixedBufferList(FIXED_BUFFER_INITIAL_SIZE_DEFAULT, 
FIXED_BUFFER_INCREMENT_DEFAULT));

Note the use of 'static' above; fixedBufferListAccessor can also be 
eliminated. You might consider a simple accesser method in 
EntityAuditListenerV2:

  private FixedBufferList getAuditEventsBuffer() {
return AUDIT_EVENTS_BUFFER.get();
  }

Same applies for EntityNotificationListenerV2 as well.


- Madhan Neethiraj


On July 23, 2020, 6:48 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 23, 2020, 6:48 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixe

Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 23, 2020, 6:48 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/7/

Changes: https://reviews.apache.org/r/72666/diff/6-7/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Ashutosh Mestry via Review Board


> On July 23, 2020, 5:59 p.m., Sarath Subramanian wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 33 (patched)
> > 
> >
> > ArrayList => List

I would prefer to keep it buffer since this implements the *fixed buffer* 
algorithm to handle the problem.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221322
---


On July 22, 2020, 4:16 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 22, 2020, 4:16 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/6/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221322
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 33 (patched)


ArrayList => List



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 59 (patched)


request => initializeBuffer()



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 105 (patched)


getActualTypeArguments() might return empty array. consider checking length



repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Lines 97 (patched)


do we need to create a new instance and pass to the constructor? The 
constructor just needs the class.

consider updating constructor of FixedBufferListAccessor to take class:

this.fixedBufferListAccessor = new 
FixedBufferListAccessor<>(EntityAuditEventV2FixedList.class);

same for line EntityNotificationListenerV2.java line#94


- Sarath Subramanian


On July 22, 2020, 9:16 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 22, 2020, 9:16 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/6/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-22 Thread Nikhil Bonte

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221317
---


Ship it!




Ship It!

- Nikhil Bonte


On July 22, 2020, 4:16 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 22, 2020, 4:16 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/6/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour. Without out of memory error.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-22 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 22, 2020, 4:16 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Added configuration property 
atlas.notification.fixed.buffer.items.increment.count with default set to 10.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/6/

Changes: https://reviews.apache.org/r/72666/diff/5-6/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-20 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 20, 2020, 11:46 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: 
- Additional refactoring.
- Buffer increments by additive factor (rather than multiplicative factor).


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/5/

Changes: https://reviews.apache.org/r/72666/diff/4-5/


Testing (updated)
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour. Without out of memory error.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-13 Thread Ashutosh Mestry via Review Board


> On July 10, 2020, 8:11 p.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 38 (patched)
> > 
> >
> > 'bufferElementClass' can be derived from the generic, as below. With 
> > this, bufferElementClass argument can be eliminated.
> > 
> >   ParameterizedType genericSuperclass = (ParameterizedType) 
> > getClass().getGenericSuperclass();
> >   Type  type  = 
> > genericSuperclass.getActualTypeArguments()[0];
> > 
> >   if (type instanceof ParameterizedType) {
> > this.bufferElementClass = (Class) ((ParameterizedType) 
> > type).getRawType();
> >   } else {
> > this.bufferElementClass = (Class) type;
> >   }

This inference mechanism works only when a concreate class is derived from the 
gerneric. I have done that. This leads to 2 additional classes but makes the 
implementation cleaner.


> On July 10, 2020, 8:11 p.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 56 (patched)
> > 
> >
> > - set() => add() - to be consistent with ArrayList
> > - better yet, consider collapsing both get() and set() with the 
> > following:
> > public T getForUpdate(int idx) {
> >   if (this.length <= idx) {
> > this.length = (idx + 1);
> > 
> > request(this.length);
> >   }
> > 
> >   return buffer.get(idx);
> > }

I hve implemented getForUpdate that makes add or set unnecessary.


> On July 10, 2020, 8:11 p.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 62 (patched)
> > 
> >
> > Calling toList() effectively clears the buffer (since the 
> > currentIndex/length is reset). This is not intutive. Consider adding a 
> > separate method, reset(), for this.

Adding reset will lead to additional call in the usage.


> On July 10, 2020, 8:11 p.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 77 (patched)
> > 
> >
> > It might help to defer instantiation until an entry is accessed. 
> > Consider handling this in get(idx):
> >   public T get(int i) {
> > request(i + 1);
> > 
> > T ret = buffer.get(i);
> > 
> > if (ret == null) {
> >   ret = elementClass.newInstance();
> >   
> >   buffer.set(i, ret);
> > }
> > 
> > return ret;
> >   }
> > 
> > With this change, instantiateElements() will not be needed.

I prefer keeping the list ready when accessed.


> On July 10, 2020, 8:11 p.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 83 (patched)
> > 
> >
> > The intrepretation of 'incrementByFactor' doesn't look right. Shouldn't 
> > it be something like:
> > 
> >   // incrementByFactor should be a float, with value > 0
> >   // incrCapacity can be initialized in constructor as well
> >   int incrCapacity = (int) (this.capacity * incrementByFactor) + 1.0F;
> > 
> >   int newCapacity = this.capacity; 
> > 
> >   for (; newCapacity < requestedSize; newCapacity += incrCapacity);
> >   
> >   this.buffer.ensureCapacity(newCapacity);
> >   
> >   return newCapacity;

incrementByFactor increments the current capacity.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221193
---


On July 13, 2020, 7:45 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 13, 2020, 7:45 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> nu

Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-13 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 13, 2020, 7:45 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/4/

Changes: https://reviews.apache.org/r/72666/diff/3-4/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-10 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221193
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 30 (patched)


collectionName - doesn't seem to be used, except for the log statement at 
#79. Consider removing this, if its not needed.



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 31 (patched)


bufferElementClass => elementClass



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 35 (patched)


bufferMaxSize => capacity



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 36 (patched)


currentIndex => length
  looks like currentIndex is used to determine number of valid entries in 
the buffer. Consider renaming to 'length'



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 38 (patched)


'bufferElementClass' can be derived from the generic, as below. With this, 
bufferElementClass argument can be eliminated.

  ParameterizedType genericSuperclass = (ParameterizedType) 
getClass().getGenericSuperclass();
  Type  type  = 
genericSuperclass.getActualTypeArguments()[0];

  if (type instanceof ParameterizedType) {
this.bufferElementClass = (Class) ((ParameterizedType) 
type).getRawType();
  } else {
this.bufferElementClass = (Class) type;
  }



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 56 (patched)


- set() => add() - to be consistent with ArrayList
- better yet, consider collapsing both get() and set() with the following:
public T getForUpdate(int idx) {
  if (this.length <= idx) {
this.length = (idx + 1);

request(this.length);
  }

  return buffer.get(idx);
}



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 57 (patched)


Given incrementByFactor is an instance member, is it necessary to send as 
argument?



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 62 (patched)


Calling toList() effectively clears the buffer (since the 
currentIndex/length is reset). This is not intutive. Consider adding a separate 
method, reset(), for this.



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 67 (patched)


Given incrementByFactor is specified in the constructor, why increment by 
'1' here? Consider merging #66 and #70 into a single method.



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 77 (patched)


It might help to defer instantiation until an entry is accessed. Consider 
handling this in get(idx):
  public T get(int i) {
request(i + 1);

T ret = buffer.get(i);

if (ret == null) {
  ret = elementClass.newInstance();
  
  buffer.set(i, ret);
}

return ret;
  }

With this change, instantiateElements() will not be needed.



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 83 (patched)


The intrepretation of 'incrementByFactor' doesn't look right. Shouldn't it 
be something like:

  // incrementByFactor should be a float, with value > 0
  // incrCapacity can be initialized in constructor as well
  int incrCapacity = (int) (this.capacity * incrementByFactor) + 1.0F;

  int newCapacity = this.capacity; 

  for (; newCapacity < requestedSize; newCapacity += incrCapacity);
  
  this.buffer.ensureCapacity(newCapacity);
  
  return newCapacity;



intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java
Lines 25 (patched)


Instead of class FixedBufferListAccessor, consider using the follownig:

  ThreadLocal> buffer = ThreadLocal.withInitial(() -> 
new FixedBufferList(FIXED_BUFFER_INITIAL_SIZE_DEFAULT, 
FIXED_BUFFER_INCREMENT_DEFAULT));


- Madhan Neethiraj


On July 10, 2020, 4:43 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> -

Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-10 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221179
---




intg/src/main/java/org/apache/atlas/utils/NamedFixedBufferList.java
Lines 26 (patched)


Consider using thread-local ThreadLocal.


- Madhan Neethiraj


On July 10, 2020, 4:43 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 10, 2020, 4:43 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/3/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-10 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 10, 2020, 4:43 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Addressed review comments.
- Renamed *NamedFixedBufferList* to *FixedBufferListAccessor*.
- Added unit tests for *FixedBufferListAccessor*.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/3/

Changes: https://reviews.apache.org/r/72666/diff/2-3/


Testing (updated)
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements in batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-10 Thread Nikhil Bonte

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221192
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 85 (patched)


Refactor to arrange methods such that public methods appear earlier to 
private methods.



webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
Lines 77 (patched)


NamedFixedBufferList: Does not have a mechanism to remove objects


- Nikhil Bonte


On July 9, 2020, 6:17 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 9, 2020, 6:17 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modifed *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, the memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation was obvserved using memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/NamedFixedBufferList.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/2/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-09 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 9, 2020, 6:17 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Improvement to initial buffer allocation.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/NamedFixedBufferList.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/2/

Changes: https://reviews.apache.org/r/72666/diff/1-2/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry



Review Request 72666: Notification: Solution to Memory Build-up

2020-07-08 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that large number of 
notification objects were created. These stayed in memory and later were 
promoted to higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, list is populated with default values. Features:
- Setting of values to these pre-allocated objects is achieved by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to fetch the right length for the returning array.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modifed *EntityNotificationListener* Uses the new classes.

**Verification**
- Using the test setup, the memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation was obvserved using memory profiler.


Diffs
-

  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/NamedFixedBufferList.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/1/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry