[
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000047#comment-14000047
]
Siddharth Seth commented on TEZ-776:
------------------------------------
Couple of options to fix the memory usage.
The first two options move event tracking into the Vertices and Edges
(specifically option1 to the vertexpluginmanager/edgepluginmanager and option2
to the vertexplugin/edgeplugin). The event fetching logic would need to change
to go to the individual components, instead of pulling the events from the task
itself.
1. EdgeManager/VertexManager keep track of events, as well as indices for
individual tasks. CompositeEvents are not exploded up front, and instead target
tasks are maintained via bitsets. This would work fairly well for
CompositeEvents and reduces memory by about 30X for the event storage. For non
composite events, it can end up being more expensive than storing the events in
tasks itself - depending on the number of consumer tasks for the events. Have
to work out some more details on this though.
2. Push event tracking into the actual user plugin itself. This would be the
most efficient in terms of memory usage, since the routing logic can be used to
minimize storage. This has the fairly big drawback of pushing this tracking
into user code.
3. A hybrid between option1 and option2, where a user plugin could choose to
implement the tracking logic, but if it doesn't - we fall back to option 1.
4. Go to disk. Don't particularly like this options, since it can slow down
tasks significantly.
I'm in favour of getting started with Option1, and maybe move to option 3 in
the future. [~bikassaha], [~hitesh] - thoughts ?
> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
> Key: TEZ-776
> URL: https://issues.apache.org/jira/browse/TEZ-776
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks
> that can be processed.
--
This message was sent by Atlassian JIRA
(v6.2#6252)