[jira] [Updated] (TEZ-2901) Flag to limit the number of inmemory segments in Tez Merge Manager
[ https://issues.apache.org/jira/browse/TEZ-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2901: Description: incorporate a new flag "tez.runtime.shuffle.in-memory.segments.max" to track and limit number of inmemory segments before merge-spill to disk. (was: incorporate a new flag "tez.runtime.shuffle.in-memory.segments.max" to track number of inmemory segments before merge-spill to disk.) > Flag to limit the number of inmemory segments in Tez Merge Manager > -- > > Key: TEZ-2901 > URL: https://issues.apache.org/jira/browse/TEZ-2901 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat > Attachments: TEZ-2850.1.sample.patch > > > incorporate a new flag "tez.runtime.shuffle.in-memory.segments.max" to track > and limit number of inmemory segments before merge-spill to disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2901) Flag to limit the number of inmemory segments in Tez Merge Manager
[ https://issues.apache.org/jira/browse/TEZ-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2901: Attachment: TEZ-2850.1.sample.patch [~jeagles] attached a sample patch. > Flag to limit the number of inmemory segments in Tez Merge Manager > -- > > Key: TEZ-2901 > URL: https://issues.apache.org/jira/browse/TEZ-2901 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat > Attachments: TEZ-2850.1.sample.patch > > > incorporate a new flag "tez.runtime.shuffle.in-memory.segments.max" to track > number of inmemory segments before merge-spill to disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2901) Flag to limit the number of inmemory segments in Tez Merge Manager
Saikat created TEZ-2901: --- Summary: Flag to limit the number of inmemory segments in Tez Merge Manager Key: TEZ-2901 URL: https://issues.apache.org/jira/browse/TEZ-2901 Project: Apache Tez Issue Type: Improvement Reporter: Saikat incorporate a new flag "tez.runtime.shuffle.in-memory.segments.max" to track number of inmemory segments before merge-spill to disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961167#comment-14961167 ] Saikat commented on TEZ-2850: - [~jeagles] Unassigning myself due to the time-critical nature of this bug. I will pick the bug up again if it is still unresolved. > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, > TEZ-2850.2.patch, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2850: Assignee: (was: Saikat) > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, > TEZ-2850.2.patch, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955234#comment-14955234 ] Saikat commented on TEZ-2850: - [~sseth] If my understanding is correct, when we call the InMemoryReader constructor, which in turn calls the IFile.Reader superclass constructor, we should pass a flag telling it not to allocate the IFileInputStream object, since checksumIn is not used when the data is already in memory. > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat >Assignee: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, > TEZ-2850.2.patch, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
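The constructor idea in the comment above could look roughly like the following. This is only a sketch: BaseReaderSketch and InMemoryReaderSketch are hypothetical stand-ins for IFile.Reader and InMemoryReader, the boolean constructor parameter is an assumption, and a plain byte array stands in for the buffer that IFileInputStream holds. None of this is the actual Tez code.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical stand-in for IFile.Reader; not the Tez class.
class BaseReaderSketch {
    // 4096 is the hard-coded buffer size discussed in this thread.
    static final int CHECKSUM_BUFFER_SIZE = 4096;

    protected final InputStream in;
    // Stays null when the caller says no checksum stream is needed.
    protected final byte[] checksumBuffer;

    BaseReaderSketch(InputStream in, boolean allocateChecksumBuffer) {
        this.in = in;
        this.checksumBuffer = allocateChecksumBuffer ? new byte[CHECKSUM_BUFFER_SIZE] : null;
    }

    // Bytes held by this reader beyond the payload itself.
    int extraBytesHeld() {
        return checksumBuffer == null ? 0 : checksumBuffer.length;
    }
}

// Hypothetical stand-in for InMemoryReader; not the Tez class.
final class InMemoryReaderSketch extends BaseReaderSketch {
    InMemoryReaderSketch(byte[] data) {
        // The data is already in memory, so ask the superclass to skip the buffer.
        super(new ByteArrayInputStream(data), false);
    }

    public static void main(String[] args) {
        BaseReaderSketch onDisk =
            new BaseReaderSketch(new ByteArrayInputStream(new byte[100]), true);
        BaseReaderSketch inMemory = new InMemoryReaderSketch(new byte[100]);
        System.out.println("on-disk reader extra bytes: " + onDisk.extraBytesHeld());
        System.out.println("in-memory reader extra bytes: " + inMemory.extraBytesHeld());
    }
}
```

Passing the decision through the superclass constructor keeps the on-disk path unchanged while letting the in-memory path avoid the per-segment allocation.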
[jira] [Updated] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2850: Attachment: TEZ-2850.2.patch > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat >Assignee: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, > TEZ-2850.2.patch, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2850: Attachment: TEZ-2850.1.patch [~gopalv] [~sseth] Submitted patch TEZ-2850.1 with a new flag "tez.runtime.shuffle.in-memory.segments.max" to track the number of in-memory segments before merge-spill to disk. I will look into the constructor for IFileInputStream next. > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat >Assignee: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, > TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908260#comment-14908260 ] Saikat commented on TEZ-2850: - [~sseth] Some questions about the approach that you mentioned: 1. "We should try capping the value based on a rough estimate of the size of segments." How do we estimate the size of the segments, since it may vary for each map output? And what percentage should be set as the default? 2. What should the default number of segments be (should it be 0, so that 0 means ignore this setting)? The check would then be: (commitMemory > mergeThreshold || (inMemMergeSegmentsThreshold != 0 && inMemoryMapOutputs.size() > inMemMergeSegmentsThreshold)) 3. What should the flag be named? Hadoop has something like "mapreduce.reduce.merge.inmem.threshold". > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat >Assignee: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
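The trigger condition proposed in question 2 can be written out as a small standalone sketch. The class and method names here are hypothetical, the parameters simply mirror the names used in the comment, and the ~500 MB and ~20 MB figures are the ones quoted in the TEZ-2850 description; this is not the MergeManager implementation.

```java
// Hypothetical sketch of the proposed check; not the actual Tez MergeManager code.
final class MergeTriggerSketch {
    /**
     * Returns true when an in-memory merge should be started.
     * A segment limit of 0 is treated as "limit disabled", as proposed in the comment.
     */
    static boolean shouldStartInMemoryMerge(long commitMemory, long mergeThreshold,
                                            int inMemorySegments, int maxSegments) {
        boolean memoryExceeded = commitMemory > mergeThreshold;
        boolean segmentsExceeded = maxSegments != 0 && inMemorySegments > maxSegments;
        return memoryExceeded || segmentsExceeded;
    }

    public static void main(String[] args) {
        long mergeThreshold = 500L * 1024 * 1024; // ~500 MB, as quoted in TEZ-2850
        long commitMemory = 20L * 1024 * 1024;    // ~20 MB, as quoted in TEZ-2850

        // Limit disabled (0): only the memory threshold matters, so no merge starts.
        System.out.println(shouldStartInMemoryMerge(commitMemory, mergeThreshold, 1500, 0));
        // Limit of 1000 segments (an arbitrary value for illustration): 1500 segments trigger a merge.
        System.out.println(shouldStartInMemoryMerge(commitMemory, mergeThreshold, 1500, 1000));
    }
}
```

Treating 0 as "disabled" keeps the existing behaviour for jobs that never set the new flag.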
[jira] [Assigned] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat reassigned TEZ-2850: --- Assignee: Saikat > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat >Assignee: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904597#comment-14904597 ] Saikat commented on TEZ-2850: - Thanks [~gopalv] [~sseth] for the explanation. How should we go about handling this scenario? Can we have a Tez config flag to turn this optimization feature on or off? If so, what name should be used for the flag? I can submit a patch for review. Either this IFileInputStream optimization flag, or tweaking the shuffle.merge.percent flag, or both, could resolve this problem. (Unless this optimization is turned off, we might need to set a very low value of around 0.01 for shuffle.merge.percent.) > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903498#comment-14903498 ] Saikat commented on TEZ-2850: - adding [~jeagles] [~rohini] for watch > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2850: Attachment: TEZ-2850_test.patch > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903497#comment-14903497 ] Saikat commented on TEZ-2850: - [~hitesh] I was going through Hadoop's IFileInputStream implementation (hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/IFileInputStream.java) and found that the buffer[4096] allocation is present only in Tez, not in Hadoop. I submitted a tentative patch in which Tez's IFileInputStream behaves exactly like Hadoop's. Can you please shed some light on what this added buffer does? > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903481#comment-14903481 ] Saikat edited comment on TEZ-2850 at 9/22/15 9:35 PM: -- This is an unusual scenario that we faced while running a Tez job. A reducer vertex task fetches around 20 map outputs, each of roughly 100 bytes, so the total map output size is around 20 * 100 ~ 20 MB. The MergeManager has a merge threshold check: once committed memory crosses this threshold, the in-memory merger is triggered, and it merges and spills the fetched in-memory map outputs to disk to free up memory. In our scenario, mergeThreshold (~500 MB) >> commitMemory (~20 MB), so the in-memory merger never gets triggered. Finally, when finalMerge() is called in close(), MergeManager calls createInMemorySegments() to do the final merge. At that point, when Tez creates an IFileInputStream object for the InMemoryReader, the IFileInputStream allocates a hard-coded buffer of 4096 bytes. Thus the total size of a single in-memory segment comes to around 5 KB, even though the data in the segment is only on the order of 100 bytes. So, for 20 map outputs, the total size is 20 * 5000 ~ 1 GB, which causes the OOM. Attached is a snapshot of the heap dump that shows this scenario. was (Author: saikatr): This is a unique scenario, that we faced, while running a Tez Job. A reducer vertex task fetches around 20 map outputs, each of around ~100 odd bytes. So total mapoutput size is around 20 * 100 ~ 20Mb. The MergeManager has a merge threshold check, where if it crosses this threshold, InmemoryMerger will be triggered and it will spill the inmemory fetched map outputs to disk to free up memory. In our scenario, mergethreshold(~500mb) >> commitMemory(~20mb), So inMemory merger never gets triggerd. Finally when the finalMerge() is called in close(), MergeManager calls createInMemorySegments() to do the final merge. 
In this, when Tez creates a IFileInputStream object for the InMemoryReader, the IFileInputStream allocates a buffer of size 4096(hard coded). Thus the total size of a single inmemory segment comes to around 5kb, even though data in this segment is only in order of 100 bytes. So, for 20 map outputs, the total size is 20 * 5000 ~ 1G, which causes OOM! Attached is a snapshot of the heap dump which shows this scenario. > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
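The overhead described in this comment can be checked with a few lines of arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, the 4096-byte buffer and ~100-byte payload come from the comment, and the segment count of 200,000 is an assumption chosen because it is the count at which the quoted totals (about 20 MB of payload, on the order of 1 GB of heap) line up; the comment itself says "around 20".

```java
// Hypothetical back-of-the-envelope calculation; not part of Tez.
final class SegmentOverheadSketch {
    // Hard-coded IFileInputStream buffer size cited in the comment.
    static final long STREAM_BUFFER_BYTES = 4096;

    // Bytes of actual map output data across all segments.
    static long payloadBytes(long segments, long bytesPerSegment) {
        return segments * bytesPerSegment;
    }

    // Bytes held when every segment also carries one stream buffer.
    static long heapBytesWithBuffers(long segments, long bytesPerSegment) {
        return segments * (bytesPerSegment + STREAM_BUFFER_BYTES);
    }

    public static void main(String[] args) {
        long segments = 200_000;    // ASSUMPTION: not stated in the comment
        long bytesPerSegment = 100; // from the comment

        System.out.println("payload bytes: " + payloadBytes(segments, bytesPerSegment));
        System.out.println("heap bytes with buffers: "
            + heapBytesWithBuffers(segments, bytesPerSegment));
    }
}
```

With these assumed inputs the payload stays at about 20 MB, well below a ~500 MB merge threshold, while the per-segment buffers add roughly 800 MB on top, which is the shape of the blow-up the comment describes.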
[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903481#comment-14903481 ] Saikat commented on TEZ-2850: - This is a unique scenario, that we faced, while running a Tez Job. A reducer vertex task fetches around 20 map outputs, each of around ~100 odd bytes. So total mapoutput size is around 20 * 100 ~ 20Mb. The MergeManager has a merge threshold check, where if it crosses this threshold, InmemoryMerger will be triggered and it will spill the inmemory fetched map outputs to disk to free up memory. In our scenario, mergethreshold(~500mb) >> commitMemory(~20mb), So inMemory merger never gets triggerd. Finally when the finalMerge() is called in close(), MergeManager calls createInMemorySegments() to do the final merge. In this, when Tez creates a IFileInputStream object for the InMemoryReader, the IFileInputStream allocates a buffer of size 4096(hard coded). Thus the total size of a single inmemory segment comes to around 5kb, even though data in this segment is only in order of 100 bytes. So, for 20 map outputs, the total size is 20 * 5000 ~ 1G, which causes OOM! Attached is a snapshot of the heap dump which shows this scenario. > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2850: Attachment: OOM_3.png OOM_2.png OOM_1.png > Tez MergeManager OOM for small Map Outputs > -- > > Key: TEZ-2850 > URL: https://issues.apache.org/jira/browse/TEZ-2850 > Project: Apache Tez > Issue Type: Bug >Reporter: Saikat > Attachments: OOM_1.png, OOM_2.png, OOM_3.png > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2850) Tez MergeManager OOM for small Map Outputs
Saikat created TEZ-2850: --- Summary: Tez MergeManager OOM for small Map Outputs Key: TEZ-2850 URL: https://issues.apache.org/jira/browse/TEZ-2850 Project: Apache Tez Issue Type: Bug Reporter: Saikat -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.8.patch > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch, TEZ-2658.5.patch, TEZ-2658.6.patch, TEZ-2658.7.patch, > TEZ-2658.8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.7.patch > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch, TEZ-2658.5.patch, TEZ-2658.6.patch, TEZ-2658.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745584#comment-14745584 ] Saikat edited comment on TEZ-2658 at 9/15/15 3:11 PM: -- 1. Fixed compilation issues. 2. created a readme file. (docs/src/site/markdown/tez-cli-tool.md) was (Author: saikatr): Fixed compilation issues. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch, TEZ-2658.5.patch, TEZ-2658.6.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.6.patch Fixed compilation issues. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch, TEZ-2658.5.patch, TEZ-2658.6.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738054#comment-14738054 ] Saikat commented on TEZ-2658: - [~rohini] Will do that. I also need to resolve some compile issues that I am seeing with this patch on the latest tip of the master branch. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch, TEZ-2658.5.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738048#comment-14738048 ] Saikat commented on TEZ-2643: - [~jeagles] > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Fix For: 0.8.1 > > Attachments: TEZ-2643.1.patch, TEZ-2643.2.patch, TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2791) Reduce tez.runtime.shuffle.fetch.buffer.percent default value to avoid corner case OOMs
[ https://issues.apache.org/jira/browse/TEZ-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738026#comment-14738026 ] Saikat commented on TEZ-2791: - [~jeagles] I believe this is the same setting I was talking about, which had bloated the old-gen heap space. Reducing it will ensure that fetched map outputs are merged and spilled to disk earlier by the merge manager. > Reduce tez.runtime.shuffle.fetch.buffer.percent default value to avoid corner > case OOMs > --- > > Key: TEZ-2791 > URL: https://issues.apache.org/jira/browse/TEZ-2791 > Project: Apache Tez > Issue Type: Bug >Reporter: Rajesh Balamohan > > Default value for tez.runtime.shuffle.fetch.buffer.percent is set to 0.9. In > corner cases & based on scheduling & data sizes, it is possible that JVM > crosses old-gen threshold and ends up throwing OOM. It would be better to set > the default value to 0.6. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
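To make the effect of the proposed default concrete, here is a rough sketch of how the fetch buffer budget changes between 0.9 and 0.6. It assumes the percentage is applied directly to the task's maximum heap, which this thread does not spell out; the class and method names are hypothetical and the 1 GB heap is an arbitrary example value.

```java
// Hypothetical illustration; assumes the percent is a fraction of the max heap.
final class FetchBufferBudgetSketch {
    // Bytes that fetched map outputs may occupy in memory under the assumed rule.
    static long budgetBytes(long maxHeapBytes, double fetchBufferPercent) {
        if (fetchBufferPercent <= 0.0 || fetchBufferPercent > 1.0) {
            throw new IllegalArgumentException("percent must be in (0, 1]: " + fetchBufferPercent);
        }
        return (long) (maxHeapBytes * fetchBufferPercent);
    }

    public static void main(String[] args) {
        long maxHeap = 1024L * 1024 * 1024; // example heap of 1 GB, chosen arbitrarily

        long current = budgetBytes(maxHeap, 0.9);  // default quoted in TEZ-2791
        long proposed = budgetBytes(maxHeap, 0.6); // default proposed in TEZ-2791

        System.out.println("budget at 0.9: " + current + " bytes");
        System.out.println("budget at 0.6: " + proposed + " bytes");
        System.out.println("heap left for everything else grows by: " + (current - proposed) + " bytes");
    }
}
```

A smaller budget means less of the heap can be held by fetched outputs at once, which leaves more headroom for the other allocations made during the merge.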
[jira] [Comment Edited] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571 ] Saikat edited comment on TEZ-2643 at 9/8/15 9:18 PM: - Thanks [~rajesh.balamohan] for the review comments. Made the following changes in patchset 2643.2. Comment 2: a. Moved the spillrecords init to after the check for ignoreSpillIfNeeded. b. Renamed it to ignoreEmptySpills. Comment 1: I didn't want to put sendPipelinedShuffleEvents() inside spill() for the following reasons: a. In flush(), if sendPipelinedShuffleEvents() is called in spill(), then we need to change the if (!isFinalMergeEnabled) {} block, where only one event is sent out for the last spill. b. Also, sendPipelinedShuffleEvents() hard-codes isLastEvent to false, so we would need to pass a lastEvent flag to sendPipelinedShuffleEvents(). That would be too many changes, so instead I return a boolean value from spill() to let the caller know whether a spill actually happened; the caller can then decide whether to send events and whether it is the last event. was (Author: saikatr): Thanks [~rajesh.balamohan] for the review comments. Made the following changes in patchset 2643.1 comment 2: move the spillrecords init to after the check for ignoreSpillIfNeeded. Comment 1: I didnt want to put the sendPipelinedShuffleEvents() inside spill because of the following scenario: a. in flush(), if sendPipelinedShuffleEvents() is called in spill(), then we need to change the if (!isFinalMergeEnabled) {} where only one event is sent out for last spill. b. Also, in sendPipelinedShuffleEvents() hard codes isLastEvent to false, so we wud need to pass lastEvent flag to sendPipelinedShuffleEvents(). There would be too many changes, so I return a boolean value from spill, to let the caller know it there was actually a spill, and then the caller can take a decision to send events and if its a last event etc. 
> Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.1.patch, TEZ-2643.2.patch, TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
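The design choice described in this comment, where spill() reports whether it actually spilled and the caller decides about events, can be sketched as follows. Everything here is a simplified, hypothetical model: the class, the in-memory list standing in for the sort buffer, and the event strings are inventions for illustration and do not reflect the PipelinedSorter code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "spill returns a boolean"; not the Tez PipelinedSorter.
final class SpillSketch {
    private final List<String> buffered = new ArrayList<>();
    private final List<String> eventsSent = new ArrayList<>();
    private int spillCount = 0;

    void write(String record) {
        buffered.add(record);
    }

    /** Returns true only if there was data to spill; an empty spill is skipped. */
    boolean spill() {
        if (buffered.isEmpty()) {
            return false;
        }
        buffered.clear(); // stands in for writing the buffered records to disk
        spillCount++;
        return true;
    }

    /** Caller-side decision: send an event only when a spill really happened. */
    void spillAndNotify(boolean isLastEvent) {
        if (spill()) {
            eventsSent.add("spill-" + spillCount + (isLastEvent ? "-last" : ""));
        }
    }

    List<String> eventsSent() {
        return eventsSent;
    }

    public static void main(String[] args) {
        SpillSketch sorter = new SpillSketch();
        sorter.spillAndNotify(false); // nothing buffered, so no event is sent
        sorter.write("k1=v1");
        sorter.spillAndNotify(true);  // one real spill, flagged as the last event
        System.out.println(sorter.eventsSent());
    }
}
```

Keeping the event logic in the caller means spill() does not need to know whether it is producing the last event, which is the coupling the comment wanted to avoid.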
[jira] [Updated] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2643: Attachment: TEZ-2643.2.patch > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.1.patch, TEZ-2643.2.patch, TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571 ] Saikat edited comment on TEZ-2643 at 9/8/15 8:41 PM: - Thanks [~rajesh.balamohan] for the review comments. Made the following changes in patchset 2643.1 comment 2: move the spillrecords init to after the check for ignoreSpillIfNeeded. Comment 1: I didnt want to put the sendPipelinedShuffleEvents() inside spill because of the following scenario: a. in flush(), if sendPipelinedShuffleEvents() is called in spill(), then we need to change the if (!isFinalMergeEnabled) {} where only one event is sent out for last spill. b. Also, in sendPipelinedShuffleEvents() hard codes isLastEvent to false, so we wud need to pass lastEvent flag to sendPipelinedShuffleEvents(). There would be too many chanes, so I return a boolean value from spill, to let the caller know it there was actually a spill, and then the caller can take a decision to send events and if its a last event etc. was (Author: saikatr): Thanks [~rajesh.balamohan] for the review comments. Made the following changes in patchset 2643.1 comment 2: move the spillrecords init to after the check for ignoreSpillIfNeeded. Comment 1: I didnt want to put the sendPipelinedShuffleEvents() inside spill because of the following scenario: a. in flush(), if sendPipelinedShuffleEvents() is called in spill(), then we need to change the if (!isFinalMergeEnabled) {} where only event is sent out for last spill. b. Also, in sendPipelinedShuffleEvents() hard codes isLastEvent to false, so we wud need to pass lastEvent flag to sendPipelinedShuffleEvents(). There would be too many chanes, so I return a boolean value from spill, to let the caller know it there was actually a spill, and then the caller can take a decision to send events and if its a last event etc. 
> Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.1.patch, TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571 ] Saikat edited comment on TEZ-2643 at 9/8/15 8:42 PM: - Thanks [~rajesh.balamohan] for the review comments. Made the following changes in patchset 2643.1 comment 2: move the spillrecords init to after the check for ignoreSpillIfNeeded. Comment 1: I didnt want to put the sendPipelinedShuffleEvents() inside spill because of the following scenario: a. in flush(), if sendPipelinedShuffleEvents() is called in spill(), then we need to change the if (!isFinalMergeEnabled) {} where only one event is sent out for last spill. b. Also, in sendPipelinedShuffleEvents() hard codes isLastEvent to false, so we wud need to pass lastEvent flag to sendPipelinedShuffleEvents(). There would be too many changes, so I return a boolean value from spill, to let the caller know it there was actually a spill, and then the caller can take a decision to send events and if its a last event etc. was (Author: saikatr): Thanks [~rajesh.balamohan] for the review comments. Made the following changes in patchset 2643.1 comment 2: move the spillrecords init to after the check for ignoreSpillIfNeeded. Comment 1: I didnt want to put the sendPipelinedShuffleEvents() inside spill because of the following scenario: a. in flush(), if sendPipelinedShuffleEvents() is called in spill(), then we need to change the if (!isFinalMergeEnabled) {} where only one event is sent out for last spill. b. Also, in sendPipelinedShuffleEvents() hard codes isLastEvent to false, so we wud need to pass lastEvent flag to sendPipelinedShuffleEvents(). There would be too many chanes, so I return a boolean value from spill, to let the caller know it there was actually a spill, and then the caller can take a decision to send events and if its a last event etc. 
> Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.1.patch, TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571 ] Saikat commented on TEZ-2643: - Thanks [~rajesh.balamohan] for the review comments. Made the following changes in patchset 2643.1 comment 2: move the spillrecords init to after the check for ignoreSpillIfNeeded. Comment 1: I didnt want to put the sendPipelinedShuffleEvents() inside spill because of the following scenario: a. in flush(), if sendPipelinedShuffleEvents() is called in spill(), then we need to change the if (!isFinalMergeEnabled) {} where only event is sent out for last spill. b. Also, in sendPipelinedShuffleEvents() hard codes isLastEvent to false, so we wud need to pass lastEvent flag to sendPipelinedShuffleEvents(). There would be too many chanes, so I return a boolean value from spill, to let the caller know it there was actually a spill, and then the caller can take a decision to send events and if its a last event etc. > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.1.patch, TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2643: Attachment: TEZ-2643.1.patch > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.1.patch, TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723960#comment-14723960 ] Saikat commented on TEZ-2643: - [~rajesh.balamohan] gentle reminder to review the patch. > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723474#comment-14723474 ] Saikat commented on TEZ-2726: - [~bikassaha] If you are ok with this approach I can submit an initial patch for review. > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Affects Versions: 0.7.0 >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701705#comment-14701705 ] Saikat commented on TEZ-2726: - [~bikassaha] Yes. Is this a proper place to raise the exception? We could throw an AMUserCodeException by checking this condition in sendTezEventToDestinationTasks() in Edge.java, before sending out the CDMEs for a scatter-gather edge. > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Affects Versions: 0.7.0 >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701505#comment-14701505 ] Saikat commented on TEZ-2726: - [~rajesh.balamohan] [~bikassaha] There are no empty partitions in the example I mentioned. The source vertex has 1 task (it used an UnorderedKVOutput, so it produced only 1 partition) and the sink vertex has 3 tasks. The edge is of type SCATTER-GATHER. When the HTTP fetchers sent requests for the map outputs, the ShuffleHandler caught the IOException thrown by getIndexInformation() in IndexCache.java for the condition [info.mapSpillRecord.size() <= reduce]: 2015-08-10 12:36:42,314 [New I/O worker #32] ERROR mapred.ShuffleHandler: Shuffle error in populating headers : java.io.IOException: Invalid request Map Id = attempt_1437478617943_17839_1_05_00_0_10003 Reducer = 1 Index Info Length = 1 at org.apache.hadoop.mapred.IndexCache.getIndexInformation(IndexCache.java:84) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.getMapOutputInfo(ShuffleHandler.java:855) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.populateHeaders(ShuffleHandler.java:875) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:793) I'll try to get an excerpt of the Fetcher logs for the DMEs and post it here. > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
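The condition quoted in the comment above can be reproduced in isolation. The following is only a sketch, not the Hadoop IndexCache code: the class SpillIndexLookup, its recordCount field, and the method names are hypothetical stand-ins. It assumes only what the log shows, namely that a request is rejected when the number of index records is not greater than the requested reducer index.

```java
import java.io.IOException;

// Hypothetical stand-in for the index lookup; NOT the Hadoop IndexCache class.
final class SpillIndexLookup {
    // Number of index records in the spill file (one per partition written).
    private final int recordCount;

    SpillIndexLookup(int recordCount) {
        this.recordCount = recordCount;
    }

    // True when an index record exists for the given reducer.
    boolean hasPartition(int reduce) {
        return reduce >= 0 && reduce < recordCount;
    }

    // Same shape as the quoted condition: recordCount <= reduce is invalid.
    int lookup(String mapId, int reduce) throws IOException {
        if (recordCount <= reduce) {
            throw new IOException("Invalid request Map Id = " + mapId
                + " Reducer = " + reduce
                + " Index Info Length = " + recordCount);
        }
        return reduce; // a real lookup would return offset/length details
    }

    public static void main(String[] args) {
        SpillIndexLookup index = new SpillIndexLookup(1); // source wrote 1 partition
        for (int reducer = 0; reducer < 3; reducer++) {   // sink has 3 tasks
            try {
                index.lookup("attempt_x", reducer);
                System.out.println("reducer " + reducer + ": ok");
            } catch (IOException e) {
                System.out.println("reducer " + reducer + ": " + e.getMessage());
            }
        }
    }
}
```

With one record and three reducers, only reducer 0 is served; reducers 1 and 2 hit the same "Invalid request" path as in the log.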
[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700237#comment-14700237 ] Saikat commented on TEZ-2726: - One possible place to raise a proper exception can be in sendTezEventToDestinationTasks() in Edge.java before sending out the DME. We can raise AMUserCodeException with source as edgemanager, and appropriate message. > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
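A rough sketch of the check proposed in this comment, written outside of Tez so it can run on its own. Everything here is hypothetical: ScatterGatherRouteCheck and InvalidPartitionCountException are illustration-only names, and the unchecked exception merely stands in for the AMUserCodeException mentioned above, whose constructor is not shown in this thread. The sketch also assumes that only "fewer source partitions than destination tasks" is the invalid case.

```java
// Hypothetical exception standing in for AMUserCodeException.
final class InvalidPartitionCountException extends RuntimeException {
    InvalidPartitionCountException(String message) {
        super(message);
    }
}

// Hypothetical helper; in the proposal this check would run before the
// events are routed to the destination tasks of a scatter-gather edge.
final class ScatterGatherRouteCheck {
    private ScatterGatherRouteCheck() {
    }

    static void validate(String edgeName, int numSourcePartitions, int numDestinationTasks) {
        // Assumption: every destination task needs a partition to fetch.
        if (numSourcePartitions < numDestinationTasks) {
            throw new InvalidPartitionCountException(
                "Edge " + edgeName + ": source produced " + numSourcePartitions
                + " partition(s) but the destination has " + numDestinationTasks
                + " task(s)");
        }
    }

    public static void main(String[] args) {
        validate("v1->v2", 3, 3); // passes silently
        try {
            validate("v1->v2", 1, 3); // the M = 1, N = 3 case from the issue
        } catch (InvalidPartitionCountException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early like this would surface the misconfiguration in the AM instead of as invalid headers on the fetcher side.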
[jira] [Comment Edited] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700237#comment-14700237 ] Saikat edited comment on TEZ-2726 at 8/17/15 9:10 PM: -- One possible place to raise a proper exception can be in sendTezEventToDestinationTasks() in Edge.java before sending out the DME(for a scattergather edgemanger). We can raise AMUserCodeException with source as edgemanager, and appropriate message. was (Author: saikatr): One possible place to raise a proper exception can be in sendTezEventToDestinationTasks() in Edge.java before sending out the DME. We can raise AMUserCodeException with source as edgemanager, and appropriate message. > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699942#comment-14699942 ] Saikat edited comment on TEZ-2726 at 8/17/15 6:00 PM: -- Adding [~jlowe] [~rohini] [~jeagles] [~rajesh.balamohan] for watch and comments. was (Author: saikatr): Adding [~jlowe] [~rohini] [~jeagles] for watch and comments. > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699942#comment-14699942 ] Saikat commented on TEZ-2726: - Adding [~jlowe] [~rohini] [~jeagles] for watch and comments. > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat reassigned TEZ-2726: --- Assignee: Saikat > Handle invalid number of partitions for SCATTER-GATHER edge > --- > > Key: TEZ-2726 > URL: https://issues.apache.org/jira/browse/TEZ-2726 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > > Encountered an issue where the source vertex has M task and sink vertex has N > tasks (N > M), [e.g. M = 1, N = 3]and the edge is of type SCATTER -GATHER. > This resulted in sink vertex receiving DMEs with non existent targetIds. > The fetchers for the sink vertex tasks then try to retrieve the map outputs > and retrieve invalid headers due to exception in the ShuffleHandler. > Possible fixes: > 1. raise proper Tez Exception to indicate this invalid scenario. > 2. or write appropriate empty partition bits, for the missing partitions > before sending out the DMEs to sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
Saikat created TEZ-2726: --- Summary: Handle invalid number of partitions for SCATTER-GATHER edge Key: TEZ-2726 URL: https://issues.apache.org/jira/browse/TEZ-2726 Project: Apache Tez Issue Type: Improvement Reporter: Saikat Encountered an issue where the source vertex has M tasks and the sink vertex has N tasks (N > M) [e.g. M = 1, N = 3], and the edge is of type SCATTER-GATHER. This resulted in the sink vertex receiving DMEs with nonexistent targetIds. The fetchers for the sink vertex tasks then try to retrieve the map outputs and receive invalid headers because of an exception in the ShuffleHandler. Possible fixes: 1. Raise a proper Tez exception to indicate this invalid scenario. 2. Or write appropriate empty-partition bits for the missing partitions before sending out the DMEs to the sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
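Fix option 2 from the description could look roughly like the sketch below. It is an illustration under assumptions, not Tez code: EmptyPartitionPadding and markMissingAsEmpty are hypothetical names, and the sketch assumes the empty-partition information is available as a java.util.BitSet with one bit per partition. How that bit set is actually encoded into the DME payload is not covered in this thread.

```java
import java.util.BitSet;

// Hypothetical helper: marks partitions the source never produced as empty.
final class EmptyPartitionPadding {
    private EmptyPartitionPadding() {
    }

    static BitSet markMissingAsEmpty(BitSet emptyPartitions,
                                     int producedPartitions,
                                     int expectedPartitions) {
        // Work on a copy so the caller's bit set is left untouched.
        BitSet padded = (BitSet) emptyPartitions.clone();
        if (expectedPartitions > producedPartitions) {
            // BitSet.set(from, to) sets bits from 'from' (inclusive) to 'to' (exclusive).
            padded.set(producedPartitions, expectedPartitions);
        }
        return padded;
    }

    public static void main(String[] args) {
        // M = 1 source partition, N = 3 sink tasks, partition 0 has data.
        BitSet padded = markMissingAsEmpty(new BitSet(), 1, 3);
        System.out.println(padded); // bits 1 and 2 are marked empty
    }
}
```

A sink task whose partition is flagged as empty would then have nothing to fetch, so it would not send a request for a partition that does not exist.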
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.5.patch patch 5: 1. Added vertex counters 2. added check for $TEZ_JARS in tez.sh script > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch, TEZ-2658.5.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681896#comment-14681896 ] Saikat commented on TEZ-2658: - [~jlowe] [~hitesh] request you to review the patch. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.4.patch fixed findbug warning for VA_FORMAT_STRING_USES_NEWLINE. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch, > TEZ-2658.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.3.patch > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch, TEZ-2658.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2618) In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before returning a failure
[ https://issues.apache.org/jira/browse/TEZ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2618: Attachment: TEZ-2618.1.patch rebased patch on top of TEZ-2172 > In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before > returning a failure > > > Key: TEZ-2618 > URL: https://issues.apache.org/jira/browse/TEZ-2618 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2618.1.patch, TEZ-2618.patch > > > In setupLocalDiskFetch() method[this is invoked when the fetcher is in the > same host as the target map host], first try to check if we can open the > target spill file using the localDirAllocator.getLocalPathToRead(). The > localDirAllocator searches through the list of configured dirs for the file. > In disk full scenarios, if the path is not found, fetcher should to try an > http fetch. > proposed solution: > in local fetch mode, the fetcher should first try getLocalPathToRead() for > all the pending maps. and So local fetch gets divided into 2 stages: first > the maps for which path was found via LocalDirAllocator and second construct > a http fallback fetch list for the maps which couldnt be found via > LocalDirAllocator.getLocalPathToRead() and do an http fetch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.2.patch fixed findbugs and release audit warnings and removed some logs > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch, TEZ-2658.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679460#comment-14679460 ] Saikat commented on TEZ-2658: - [~hitesh] I submitted the patch on 6th Aug, but I don't see a TezQA build being triggered. Why is that? As you can see in the patch, I added a new project, tez-cli-tools. Does that have something to do with it? > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660847#comment-14660847 ] Saikat edited comment on TEZ-2658 at 8/7/15 5:29 PM: - Approach: create an instance of DagClientImpl to get the dag status/progress/counters etc. Limitation: For listing all dags for an appid, DagClient doesnot expose any api. Hence hooked directly into DAGClientRPCImpl to get all dags. Similar api needed for DAGClientTimelineImpl to fetch all dags from timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM.(since this command talks via RPC layer to fetch the list of dags from the live am) Added a README.txt to list current capabilities. was (Author: saikatr): Approach: create an instance of DagClientImpl to get the dag status/progress/counters etc. Limitation: For listing all dags for an appid, DagClient doesnot expose any api. Hence hooked directly into DAGClientRPCImpl to get all dags. Similar api needed for DAGClientTimelineImpl to fetch all dags from timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM. Added a README.txt to list current capabilities. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662116#comment-14662116 ] Saikat edited comment on TEZ-2658 at 8/7/15 5:27 PM: - [~rohini] request you to give some feedback on the tool and what other additional options will be useful (for future versions). I have tested the tool for running dag status/progress/counters on my local setup. There is a README.txt in tez-cli-tools project folder on instructions about how to use the tool was (Author: saikatr): [~rohini] request you to give some feedback on the tool and what other additional options will be useful (for future versions). I have tested the tool for running dag status/progress/counters on my local setup. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662116#comment-14662116 ] Saikat commented on TEZ-2658: - [~rohini] request you to give some feedback on the tool and what other additional options will be useful (for future versions). I have tested the tool for running dag status/progress/counters on my local setup. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660847#comment-14660847 ] Saikat edited comment on TEZ-2658 at 8/6/15 9:37 PM: - Approach: create an instance of DagClientImpl to get the dag status/progress/counters etc. For listing all dags for an appid, DagClient doesnot expose any api. Hence hooked directly into DAGClientRPCImpl to get all dags. Similar api needed for DAGClientTimelineImpl to fetch all dags from timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM. Added a README.txt to list current capabilities. was (Author: saikatr): Approach: create an instance of DagClientImpl to get the dag status. For listing all dags for an appid, DagClient doesnot expose any api. Hence hooked directly into DAGClientRPCImpl to get all dags. Similar api needed for DAGClientTimelineImpl to fetch all dags from timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM. Added a README.txt to list current capabilities. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660847#comment-14660847 ] Saikat edited comment on TEZ-2658 at 8/6/15 9:37 PM: - Approach: create an instance of DagClientImpl to get the dag status/progress/counters etc. Limitation: For listing all dags for an appid, DagClient doesnot expose any api. Hence hooked directly into DAGClientRPCImpl to get all dags. Similar api needed for DAGClientTimelineImpl to fetch all dags from timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM. Added a README.txt to list current capabilities. was (Author: saikatr): Approach: create an instance of DagClientImpl to get the dag status/progress/counters etc. For listing all dags for an appid, DagClient doesnot expose any api. Hence hooked directly into DAGClientRPCImpl to get all dags. Similar api needed for DAGClientTimelineImpl to fetch all dags from timeline server. Need a jira for this. Because of this limitation: tez.sh dag -status can only fetch all dags for a live AM. Added a README.txt to list current capabilities. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660847#comment-14660847 ] Saikat commented on TEZ-2658: - Approach: create an instance of DagClientImpl to get the DAG status. For listing all DAGs for an appId, DagClient does not expose any API, so I hooked directly into DAGClientRPCImpl to get all DAGs. A similar API is needed in DAGClientTimelineImpl to fetch all DAGs from the timeline server; that needs a separate JIRA. Because of this limitation, tez.sh dag -status can only fetch all DAGs for a live AM. Added a README.txt that lists the current capabilities. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2658: Attachment: TEZ-2658.1.patch > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2658.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2618) In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before returning a failure
[ https://issues.apache.org/jira/browse/TEZ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651980#comment-14651980 ] Saikat edited comment on TEZ-2618 at 8/4/15 2:28 PM: - The patch for this jira needs to be rebased on top of TEZ-2172. I will submit a new patch for it. was (Author: saikatr): The patch for this jira needs to be rebased on top of TEZ-2613. I will submit a new patch for it. > In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before > returning a failure > > > Key: TEZ-2618 > URL: https://issues.apache.org/jira/browse/TEZ-2618 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2618.patch > > > In setupLocalDiskFetch() method[this is invoked when the fetcher is in the > same host as the target map host], first try to check if we can open the > target spill file using the localDirAllocator.getLocalPathToRead(). The > localDirAllocator searches through the list of configured dirs for the file. > In disk full scenarios, if the path is not found, fetcher should to try an > http fetch. > proposed solution: > in local fetch mode, the fetcher should first try getLocalPathToRead() for > all the pending maps. and So local fetch gets divided into 2 stages: first > the maps for which path was found via LocalDirAllocator and second construct > a http fallback fetch list for the maps which couldnt be found via > LocalDirAllocator.getLocalPathToRead() and do an http fetch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2172) FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651978#comment-14651978 ] Saikat commented on TEZ-2172: - Hi [~rajesh.balamohan] TEZ-2172.1.patch is already the rebased one on top of TEZ-2613 > FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > > > Key: TEZ-2172 > URL: https://issues.apache.org/jira/browse/TEZ-2172 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Saikat > Attachments: TEZ-2172.1.patch, TEZ-2172.patch > > > As part of fixing TEZ-2001, FetcherOrderedGrouped stores > InputAttemptIdentifier in List. This can lead to some inefficiency - since > the size of this list can be ~30, and remove() calls can be expensive. > Option 1: by using the spillId in the hashCode - or a wrapping structure for > just this. However, SpillId can not be added to the hashCode as it would > break ShuffleScheduler shuffleInfoEventsMap. > Option 2: consider using Map with an identifier. > Need to consider other options as well. Creating this jira as a placeholder > to fix this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2618) In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before returning a failure
[ https://issues.apache.org/jira/browse/TEZ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651980#comment-14651980 ] Saikat edited comment on TEZ-2618 at 8/3/15 3:23 PM: - The patch for this jira needs to be rebased on top of TEZ-2613. I will submit a new patch for it. was (Author: saikatr): The patch for this jeera needs to be rebased on top of TEZ-2613. I will submit a new patch for it. > In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before > returning a failure > > > Key: TEZ-2618 > URL: https://issues.apache.org/jira/browse/TEZ-2618 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2618.patch > > > In setupLocalDiskFetch() method[this is invoked when the fetcher is in the > same host as the target map host], first try to check if we can open the > target spill file using the localDirAllocator.getLocalPathToRead(). The > localDirAllocator searches through the list of configured dirs for the file. > In disk full scenarios, if the path is not found, fetcher should to try an > http fetch. > proposed solution: > in local fetch mode, the fetcher should first try getLocalPathToRead() for > all the pending maps. and So local fetch gets divided into 2 stages: first > the maps for which path was found via LocalDirAllocator and second construct > a http fallback fetch list for the maps which couldnt be found via > LocalDirAllocator.getLocalPathToRead() and do an http fetch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2618) In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before returning a failure
[ https://issues.apache.org/jira/browse/TEZ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651980#comment-14651980 ] Saikat commented on TEZ-2618: - The patch for this jeera needs to be rebased on top of TEZ-2613. I will submit a new patch for it. > In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before > returning a failure > > > Key: TEZ-2618 > URL: https://issues.apache.org/jira/browse/TEZ-2618 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2618.patch > > > In setupLocalDiskFetch() method[this is invoked when the fetcher is in the > same host as the target map host], first try to check if we can open the > target spill file using the localDirAllocator.getLocalPathToRead(). The > localDirAllocator searches through the list of configured dirs for the file. > In disk full scenarios, if the path is not found, fetcher should to try an > http fetch. > proposed solution: > in local fetch mode, the fetcher should first try getLocalPathToRead() for > all the pending maps. and So local fetch gets divided into 2 stages: first > the maps for which path was found via LocalDirAllocator and second construct > a http fallback fetch list for the maps which couldnt be found via > LocalDirAllocator.getLocalPathToRead() and do an http fetch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2618) In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before returning a failure
[ https://issues.apache.org/jira/browse/TEZ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2618: Description: In setupLocalDiskFetch() method[this is invoked when the fetcher is in the same host as the target map host], first try to check if we can open the target spill file using the localDirAllocator.getLocalPathToRead(). The localDirAllocator searches through the list of configured dirs for the file. In disk full scenarios, if the path is not found, fetcher should to try an http fetch. proposed solution: in local fetch mode, the fetcher should first try getLocalPathToRead() for all the pending maps. and So local fetch gets divided into 2 stages: first the maps for which path was found via LocalDirAllocator and second construct a http fallback fetch list for the maps which couldnt be found via LocalDirAllocator.getLocalPathToRead() and do an http fetch. > In Ordered Fetcher, if Local Fetch fails, fallback and try http Fetch before > returning a failure > > > Key: TEZ-2618 > URL: https://issues.apache.org/jira/browse/TEZ-2618 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2618.patch > > > In setupLocalDiskFetch() method[this is invoked when the fetcher is in the > same host as the target map host], first try to check if we can open the > target spill file using the localDirAllocator.getLocalPathToRead(). The > localDirAllocator searches through the list of configured dirs for the file. > In disk full scenarios, if the path is not found, fetcher should to try an > http fetch. > proposed solution: > in local fetch mode, the fetcher should first try getLocalPathToRead() for > all the pending maps. and So local fetch gets divided into 2 stages: first > the maps for which path was found via LocalDirAllocator and second construct > a http fallback fetch list for the maps which couldnt be found via > LocalDirAllocator.getLocalPathToRead() and do an http fetch. 
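The two-stage local fetch proposed for TEZ-2618 can be sketched roughly as follows. This is not the attached patch: PathResolver is a hypothetical stand-in for LocalDirAllocator.getLocalPathToRead(), the map ids and paths are made up, and the sketch assumes a failed lookup surfaces as an IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalFetchFallbackSketch {

  /** Hypothetical stand-in for LocalDirAllocator.getLocalPathToRead(). */
  interface PathResolver {
    String resolve(String mapId) throws IOException;
  }

  /** Outcome of stage one: what can be read locally, what needs HTTP. */
  static final class FetchPlan {
    final List<String> localPaths = new ArrayList<>();
    final List<String> httpFallback = new ArrayList<>();
  }

  /** Tries the local lookup for every pending map before giving up on any. */
  static FetchPlan plan(List<String> pendingMaps, PathResolver resolver) {
    FetchPlan plan = new FetchPlan();
    for (String mapId : pendingMaps) {
      try {
        plan.localPaths.add(resolver.resolve(mapId));
      } catch (IOException e) {
        // Not found in any configured dir: queue it for the HTTP stage
        // instead of failing the whole fetch.
        plan.httpFallback.add(mapId);
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    // Illustrative resolver: pretend map_2 lives on a full or missing disk.
    PathResolver resolver = mapId -> {
      if (mapId.equals("map_2")) {
        throw new IOException("no local path for " + mapId);
      }
      return "/local/" + mapId + "/file.out";
    };
    FetchPlan p = plan(Arrays.asList("map_1", "map_2", "map_3"), resolver);
    System.out.println("local=" + p.localPaths + " http=" + p.httpFallback);
  }
}
```

Collecting the fallback list first, rather than switching to HTTP on the first miss, keeps the local reads and the HTTP fetch as two separate passes, which is the split the description asks for.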
[jira] [Updated] (TEZ-2172) FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2172: Attachment: TEZ-2172.1.patch > FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > > > Key: TEZ-2172 > URL: https://issues.apache.org/jira/browse/TEZ-2172 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Saikat > Attachments: TEZ-2172.1.patch, TEZ-2172.patch > > > As part of fixing TEZ-2001, FetcherOrderedGrouped stores > InputAttemptIdentifier in List. This can lead to some inefficiency - since > the size of this list can be ~30, and remove() calls can be expensive. > Option 1: by using the spillId in the hashCode - or a wrapping structure for > just this. However, SpillId can not be added to the hashCode as it would > break ShuffleScheduler shuffleInfoEventsMap. > Option 2: consider using Map with an identifier. > Need to consider other options as well. Creating this jira as a placeholder > to fix this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
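Option 2 above ("Map with an identifier") might look something like the sketch below. The Attempt class is a hypothetical, simplified stand-in for InputAttemptIdentifier, and the string key format is an assumption made for illustration; the point is only that the spill id can take part in the lookup key without being added to hashCode().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RemainingAttemptsSketch {

  /** Hypothetical, simplified stand-in for InputAttemptIdentifier. */
  static final class Attempt {
    final int inputIndex;
    final int attemptNumber;
    final int spillId;

    Attempt(int inputIndex, int attemptNumber, int spillId) {
      this.inputIndex = inputIndex;
      this.attemptNumber = attemptNumber;
      this.spillId = spillId;
    }

    /** Assumed key format; includes spillId, leaves hashCode() untouched. */
    String key() {
      return inputIndex + "_" + attemptNumber + "_" + spillId;
    }
  }

  // LinkedHashMap keeps insertion order and removes by key without a scan.
  private final Map<String, Attempt> remaining = new LinkedHashMap<>();

  void add(Attempt a) {
    remaining.put(a.key(), a);
  }

  /** Returns true if the attempt was still pending. */
  boolean remove(Attempt a) {
    return remaining.remove(a.key()) != null;
  }

  List<String> remainingKeys() {
    return new ArrayList<>(remaining.keySet());
  }

  public static void main(String[] args) {
    RemainingAttemptsSketch s = new RemainingAttemptsSketch();
    s.add(new Attempt(0, 0, 0));
    s.add(new Attempt(0, 0, 1)); // same input attempt, different spill
    s.add(new Attempt(1, 0, 0));
    s.remove(new Attempt(0, 0, 1));
    System.out.println(s.remainingKeys());
  }
}
```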
[jira] [Commented] (TEZ-2172) FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648007#comment-14648007 ] Saikat commented on TEZ-2172: - rebased the patch after TEZ-2613 has been merged. > FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > > > Key: TEZ-2172 > URL: https://issues.apache.org/jira/browse/TEZ-2172 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Saikat > Attachments: TEZ-2172.1.patch, TEZ-2172.patch > > > As part of fixing TEZ-2001, FetcherOrderedGrouped stores > InputAttemptIdentifier in List. This can lead to some inefficiency - since > the size of this list can be ~30, and remove() calls can be expensive. > Option 1: by using the spillId in the hashCode - or a wrapping structure for > just this. However, SpillId can not be added to the hashCode as it would > break ShuffleScheduler shuffleInfoEventsMap. > Option 2: consider using Map with an identifier. > Need to consider other options as well. Creating this jira as a placeholder > to fix this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2574) Make a better Metadata Value split choice in Pipeline sort
[ https://issues.apache.org/jira/browse/TEZ-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647152#comment-14647152 ] Saikat commented on TEZ-2574: - need to rebase this patch on top of TEZ-2575, TEZ-2602, TEZ-2643 > Make a better Metadata Value split choice in Pipeline sort > -- > > Key: TEZ-2574 > URL: https://issues.apache.org/jira/browse/TEZ-2574 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2574.1.patch, TEZ-2574.2.patch, TEZ-2574.3.patch, > TEZ-2574.patch > > > In the current implementation of pipeline sort, when a new sort span object > is created with a hard coded value of 1M items and 16 bytes per item. > According to the present code logic, > int metasize = METASIZE*maxItems; > int dataSize = maxItems * perItem; > if(capacity < (metasize+dataSize)) { > // try to allocate less meta space, because we have sample data > metasize = METASIZE*(capacity/(perItem+METASIZE)); > } > if capacity is less than 32mb, the buffer will be halved into meta and value > buffers, which is not efficient. > We need a more generic split, based on the KV pair size written to the buffer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
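To make the arithmetic in the description concrete, the quoted split logic can be wrapped in a small standalone method. METASIZE = 16 is an assumption here, inferred from the "less than 32mb" remark (1M items at 16 bytes of metadata plus 16 bytes of data each); the class and method names are made up for the sketch.

```java
public class SortSpanSplitSketch {

  // Assumed value, inferred from the 32 MB figure in the description.
  static final int METASIZE = 16;

  /** Same logic as the snippet quoted in the issue, returning the meta size. */
  static int metaSize(int capacity, int maxItems, int perItem) {
    int metasize = METASIZE * maxItems;
    int dataSize = maxItems * perItem;
    if (capacity < (metasize + dataSize)) {
      // Not enough room for the defaults: scale the meta space down.
      metasize = METASIZE * (capacity / (perItem + METASIZE));
    }
    return metasize;
  }

  public static void main(String[] args) {
    int capacity = 16 * 1024 * 1024; // a 16 MB span
    int maxItems = 1024 * 1024;      // hard-coded 1M items

    // With the hard-coded 16 bytes per item, half the span becomes metadata.
    System.out.println(metaSize(capacity, maxItems, 16));

    // With a sampled KV size of 240 bytes, metadata needs far less room.
    System.out.println(metaSize(capacity, maxItems, 240));
  }
}
```

With perItem fixed at 16, the fallback branch always yields capacity / 2, which is the even split the issue calls inefficient; feeding in an observed KV size changes the ratio.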
[jira] [Assigned] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat reassigned TEZ-2658: --- Assignee: Saikat > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647148#comment-14647148 ] Saikat commented on TEZ-2658: - adding [~jlowe] [~jeagles] for watch > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
[ https://issues.apache.org/jira/browse/TEZ-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647147#comment-14647147 ] Saikat commented on TEZ-2658: - Create a tool similar to "mapred" to be able to track DAG counters and other important stats. > Create a CLI utility tool to track Tez DAG/Application Stats > > > Key: TEZ-2658 > URL: https://issues.apache.org/jira/browse/TEZ-2658 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2658) Create a CLI utility tool to track Tez DAG/Application Stats
Saikat created TEZ-2658: --- Summary: Create a CLI utility tool to track Tez DAG/Application Stats Key: TEZ-2658 URL: https://issues.apache.org/jira/browse/TEZ-2658 Project: Apache Tez Issue Type: Improvement Reporter: Saikat -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646236#comment-14646236 ] Saikat commented on TEZ-2613: - Done. Also removed some unused imports. > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.5.patch, TEZ-2613.6.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: TEZ-2613.6.patch > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.5.patch, TEZ-2613.6.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2172) FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2172: Attachment: TEZ-2172.patch > FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > > > Key: TEZ-2172 > URL: https://issues.apache.org/jira/browse/TEZ-2172 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Saikat > Attachments: TEZ-2172.patch > > > As part of fixing TEZ-2001, FetcherOrderedGrouped stores > InputAttemptIdentifier in List. This can lead to some inefficiency - since > the size of this list can be ~30, and remove() calls can be expensive. > Option 1: by using the spillId in the hashCode - or a wrapping structure for > just this. However, SpillId can not be added to the hashCode as it would > break ShuffleScheduler shuffleInfoEventsMap. > Option 2: consider using Map with an identifier. > Need to consider other options as well. Creating this jira as a placeholder > to fix this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2172) FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621362#comment-14621362 ] Saikat edited comment on TEZ-2172 at 7/28/15 2:21 PM: -- Using an approach similar to TEZ-2613. A linked hashmap was (Author: saikatr): LinkedHashSet seems to be a good option as it also retains the order in which the items are inserted into the set and provides constant time performance for add, contains and remove. But for pipelined shuffle can have multiple spill ids(which is not used in the equals.) So we could pass an indication to fetchers that the input attempts are all for a pipelined shuffle type fetch (which would then include spill id also for comparison in a custom comparator wrapper) else ignore the spill id and use default equals for inputAttemptIdentifier. This approach may not work if in future a task can switch from pipelined shuffle to final merger type or vice versa. (or decide to send out spills CDMEs if data is too skewed). In current implementation, the configuration of pipeline shuffle enable for a task is static. > FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > > > Key: TEZ-2172 > URL: https://issues.apache.org/jira/browse/TEZ-2172 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Saikat > > As part of fixing TEZ-2001, FetcherOrderedGrouped stores > InputAttemptIdentifier in List. This can lead to some inefficiency - since > the size of this list can be ~30, and remove() calls can be expensive. > Option 1: by using the spillId in the hashCode - or a wrapping structure for > just this. However, SpillId can not be added to the hashCode as it would > break ShuffleScheduler shuffleInfoEventsMap. > Option 2: consider using Map with an identifier. > Need to consider other options as well. Creating this jira as a placeholder > to fix this issue. 
[jira] [Commented] (TEZ-2627) Support for Tez Job Priorities
[ https://issues.apache.org/jira/browse/TEZ-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643390#comment-14643390 ] Saikat commented on TEZ-2627: - Done. Thanks for the detailed review [~hitesh] > Support for Tez Job Priorities > -- > > Key: TEZ-2627 > URL: https://issues.apache.org/jira/browse/TEZ-2627 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2627.1.patch, TEZ-2627.2.patch, TEZ-2627.3.patch, > TEZ-2627.4.patch, TEZ-2627.5.patch, TEZ-2627.patch > > > When a Tez Job is submitted via TezClient, an ApplicationSubmissionContext is > created before submitting the job. ApplicationSubmissionContext has a > priority field which can be used to provide a priority for the job. > There is an ongoing effort in the Yarn Community to enable application > priorities(https://issues.apache.org/jira/browse/YARN-1963). > https://issues.apache.org/jira/browse/YARN-2003 implements the necessary > changes in RM and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2627) Support for Tez Job Priorities
[ https://issues.apache.org/jira/browse/TEZ-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2627: Attachment: TEZ-2627.5.patch > Support for Tez Job Priorities > -- > > Key: TEZ-2627 > URL: https://issues.apache.org/jira/browse/TEZ-2627 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2627.1.patch, TEZ-2627.2.patch, TEZ-2627.3.patch, > TEZ-2627.4.patch, TEZ-2627.5.patch, TEZ-2627.patch > > > When a Tez Job is submitted via TezClient, an ApplicationSubmissionContext is > created before submitting the job. ApplicationSubmissionContext has a > priority field which can be used to provide a priority for the job. > There is an ongoing effort in the Yarn Community to enable application > priorities(https://issues.apache.org/jira/browse/YARN-1963). > https://issues.apache.org/jira/browse/YARN-2003 implements the necessary > changes in RM and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641135#comment-14641135 ] Saikat edited comment on TEZ-2643 at 7/27/15 8:34 PM: -- [~rajesh.balamohan] the bug in TEZ-2602 occurred while trying to optimize the number of empty spills (missed that testcase scenario!). I think the correct place to put the check is in the merger.ready() and spill() methods. The idea is that if the merger heap is empty, we know the spill will be empty and can ignore that spill. For example: without this patch, testKVExceedsBuffer() spills out 9 files. With this patch, the pipelined sorter spills only 2 (which is a 4x improvement in the worst-case scenario where all KVs are larger than the buffer allotted to the sorter). The patch also passes all the newly added testcases (TEZ-2602) in pipelinedsorter. was (Author: saikatr): [~rajesh.balamohan] the bug in TEZ-2602 occurred while trying to optimize the number of empty spills (missed that testcase scenario!). I think the correct place to put the check is in the merger.ready() and spill() methods. The idea is that if the merger heap is empty, we know the spill will be empty and can ignore that spill. For example: without this patch, testKVExceedsBuffer() spills out 9 files. With this patch, the pipelined sorter spills only 2 (which is a 4x improvement in the worst-case scenario where all KVs are larger than the buffer allotted to the sorter). The patch also passes all the testcases in pipelinedsorter. > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
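The check described in the comment above can be illustrated with a toy sorter. Everything here is hypothetical: the class, its methods and the PriorityQueue standing in for the merger heap are not the PipelinedSorter code, they only show the "empty heap means skip the spill" idea.

```java
import java.util.PriorityQueue;

public class EmptySpillSketch {

  // Stand-in for the merger heap mentioned in the comment.
  private final PriorityQueue<String> mergerHeap = new PriorityQueue<>();
  private int spillCount = 0;

  void add(String key) {
    mergerHeap.add(key);
  }

  /** Spills only when there is something to write; returns whether it did. */
  boolean spill() {
    if (mergerHeap.isEmpty()) {
      // Nothing buffered: a spill now would only produce an empty file.
      return false;
    }
    while (!mergerHeap.isEmpty()) {
      mergerHeap.poll(); // stands in for writing the record to the spill file
    }
    spillCount++;
    return true;
  }

  int spillCount() {
    return spillCount;
  }

  public static void main(String[] args) {
    EmptySpillSketch sorter = new EmptySpillSketch();
    sorter.spill();   // skipped, heap is empty
    sorter.add("k1");
    sorter.spill();   // written
    sorter.spill();   // skipped again
    System.out.println("spills written: " + sorter.spillCount());
  }
}
```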
[jira] [Comment Edited] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641135#comment-14641135 ] Saikat edited comment on TEZ-2643 at 7/27/15 8:33 PM: -- [~rajesh.balamohan] the bug in TEZ-2602 occurred while trying to optimize the number of empty spills (missed that testcase scenario!). I think the correct place to put the check is in the merger.ready() and spill() methods. The idea is that if the merger heap is empty, we know the spill will be empty and can ignore that spill. For example: without this patch, testKVExceedsBuffer() spills out 9 files. With this patch, the pipelined sorter spills only 2 (which is a 4x improvement in the worst-case scenario where all KVs are larger than the buffer allotted to the sorter). The patch also passes all the testcases in pipelinedsorter. was (Author: saikatr): [~rajesh.balamohan] the bug in TEZ-2602 occurred while trying to optimize the number of empty spills (missed that testcase scenario!). I think the correct place to put the check is in the merger.ready() and spill() methods. The idea is that if the merger heap is empty, we know the spill will be empty and can ignore that spill. For example: without this patch, testKVExceedsBuffer() spills out 9 files. With this patch, the pipelined sorter spills only 2. The patch also passes all the testcases in pipelinedsorter. > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2643: Comment: was deleted (was: adding [~rohini] for watch) > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641138#comment-14641138 ] Saikat edited comment on TEZ-2643 at 7/27/15 7:57 PM: -- adding [~ozawa] [~jeagles] [~rohini] for watch and comments. was (Author: saikatr): adding [~ozawa] [~jeagles] for watch and comments. > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter
[ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643286#comment-14643286 ] Saikat commented on TEZ-2643: - adding [~rohini] for watch > Minimize number of empty spills in Pipelined Sorter > --- > > Key: TEZ-2643 > URL: https://issues.apache.org/jira/browse/TEZ-2643 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2643.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2627) Support for Tez Job Priorities
[ https://issues.apache.org/jira/browse/TEZ-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643285#comment-14643285 ] Saikat commented on TEZ-2627: - [~hitesh] addressed all your review comments. Thanks. > Support for Tez Job Priorities > -- > > Key: TEZ-2627 > URL: https://issues.apache.org/jira/browse/TEZ-2627 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2627.1.patch, TEZ-2627.2.patch, TEZ-2627.3.patch, > TEZ-2627.4.patch, TEZ-2627.patch > > > When a Tez Job is submitted via TezClient, an ApplicationSubmissionContext is > created before submitting the job. ApplicationSubmissionContext has a > priority field which can be used to provide a priority for the job. > There is an ongoing effort in the Yarn Community to enable application > priorities(https://issues.apache.org/jira/browse/YARN-1963). > https://issues.apache.org/jira/browse/YARN-2003 implements the necessary > changes in RM and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2627) Support for Tez Job Priorities
[ https://issues.apache.org/jira/browse/TEZ-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2627: Attachment: TEZ-2627.4.patch > Support for Tez Job Priorities > -- > > Key: TEZ-2627 > URL: https://issues.apache.org/jira/browse/TEZ-2627 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2627.1.patch, TEZ-2627.2.patch, TEZ-2627.3.patch, > TEZ-2627.4.patch, TEZ-2627.patch > > > When a Tez Job is submitted via TezClient, an ApplicationSubmissionContext is > created before submitting the job. ApplicationSubmissionContext has a > priority field which can be used to provide a priority for the job. > There is an ongoing effort in the Yarn Community to enable application > priorities(https://issues.apache.org/jira/browse/YARN-1963). > https://issues.apache.org/jira/browse/YARN-2003 implements the necessary > changes in RM and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642844#comment-14642844 ] Saikat edited comment on TEZ-2613 at 7/27/15 7:08 PM: -- Hi [~rajesh.balamohan] thanks for your review comments. Changed the iterator in getNextRemainingAttempt For the setupConnection, I had a doubt. Shouldn't we use the remaining identifier while setting up connection.(because setup connection may also be called again in the retry scenario line 528). My suggestion would be to pass the remainingSrcAttemptsList as an Iterable to setupConnection.That we can avoid creating a new linkedlist. e.g. function prototype for setupConnection- private HostFetchResult setupConnection(Iterable attempts); //called as HostFetchResult connectionsWithRetryResult = setupConnection(srcAttemptsRemaining.values()); Patchset 5 shows the suggested change. was (Author: saikatr): Hi [~rajesh.balamohan] thanks for your review comments. Changed the iterator in getNextRemainingAttempt For the setupConnection, I had a doubt. Shouldn't we use the remaining identifier while setting up connection.(because setup connection may also be called again in the retry scenario line 528). My suggestion would be to pass the remainingSrcAttemptsList as an Iterable to setupConnection.That we can avoid creating a new linkedlist. e.g. function prototype for setupConnection- private HostFetchResult setupConnection(Iterable attempts); //called as HostFetchResult connectionsWithRetryResult = setupConnection(srcAttemptsRemaining.values()); Patchset 4 shows the suggested change. 
> Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.5.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
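The suggestion to pass the remaining attempts as an Iterable can be sketched as below. This is an illustration only: the real setupConnection returns a HostFetchResult and works on InputAttemptIdentifier, whereas this hypothetical version uses String ids and just counts what it was handed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IterableAttemptsSketch {

  /**
   * Hypothetical simplification of setupConnection: it only needs to walk
   * the attempts, so an Iterable is enough and no list copy is required.
   */
  static int setupConnection(Iterable<String> attempts) {
    int count = 0;
    for (String attempt : attempts) {
      count++; // stands in for adding the attempt to the connection request
    }
    return count;
  }

  public static void main(String[] args) {
    Map<String, String> srcAttemptsRemaining = new LinkedHashMap<>();
    srcAttemptsRemaining.put("a0", "attempt_0");
    srcAttemptsRemaining.put("a1", "attempt_1");
    srcAttemptsRemaining.put("a2", "attempt_2");

    // values() is a live view of the map, not a copy.
    System.out.println(setupConnection(srcAttemptsRemaining.values()));

    // After one attempt completes, a retry sees only what is left.
    srcAttemptsRemaining.remove("a1");
    System.out.println(setupConnection(srcAttemptsRemaining.values()));
  }
}
```

Because values() reflects later removals, a retry that calls setupConnection again automatically works on the attempts still outstanding, which matches the retry concern raised in the comment.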
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: TEZ-2613.5.patch > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.5.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Comment: was deleted (was: resubmitted as patch4) > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.5.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: (was: TEZ-2613.4.patch) > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: TEZ-2613.4.patch TEZ-2613.4.patch > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642970#comment-14642970 ] Saikat commented on TEZ-2613: - resubmitted as patch4 > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: TEZ-2613.4.patch > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: (was: TEZ-2613.5.patch) > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: TEZ-2613.5.patch > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: (was: TEZ-2613.4.patch) > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642844#comment-14642844 ] Saikat edited comment on TEZ-2613 at 7/27/15 3:39 PM: -- Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. My suggestion would be to pass the remainingSrcAttemptsList as an Iterable to setupConnection. That way we can avoid creating a new LinkedList. e.g. function prototype for setupConnection: private HostFetchResult setupConnection(Iterable attempts); // called as: HostFetchResult connectionsWithRetryResult = setupConnection(srcAttemptsRemaining.values()); Patchset 4 shows the suggested change. was (Author: saikatr): Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt in patchset 4. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. My suggestion would be to pass the remainingSrcAttemptsList as an Iterable to setupConnection. That way we can avoid creating a new LinkedList. e.g. function prototype for setupConnection: private HostFetchResult setupConnection(Iterable attempts); // called as: HostFetchResult connectionsWithRetryResult = setupConnection(srcAttemptsRemaining.values()); > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642844#comment-14642844 ] Saikat edited comment on TEZ-2613 at 7/27/15 3:39 PM: -- Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? My suggestion would be to pass the remainingSrcAttemptsList as an Iterable to setupConnection. That way we can avoid creating a new LinkedList. e.g. function prototype for setupConnection: private HostFetchResult setupConnection(Iterable attempts); // called as: HostFetchResult connectionsWithRetryResult = setupConnection(srcAttemptsRemaining.values()); Patchset 4 shows the suggested change. was (Author: saikatr): Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. My suggestion would be to pass the remainingSrcAttemptsList as an Iterable to setupConnection. That way we can avoid creating a new LinkedList. e.g. function prototype for setupConnection: private HostFetchResult setupConnection(Iterable attempts); // called as: HostFetchResult connectionsWithRetryResult = setupConnection(srcAttemptsRemaining.values()); Patchset 4 shows the suggested change. > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
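The idea discussed in the comment above can be sketched roughly as follows. This is a minimal illustration, not the code from any TEZ-2613 patch: AttemptId is a hypothetical stand-in for InputAttemptIdentifier, buildFetchQuery is a hypothetical analogue of setupConnection, and keying the map by a String is an assumption made only for this example. It shows remaining attempts kept in an insertion-ordered map, so that removing a completed attempt is a hash lookup rather than the linear scan done by List.remove(Object), and the live values() view passed as an Iterable so that no new list is built on retry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RemainingAttemptsSketch {

    /** Hypothetical stand-in for Tez's InputAttemptIdentifier; not the real class. */
    static final class AttemptId {
        final int inputIndex;
        final int attemptNumber;

        AttemptId(int inputIndex, int attemptNumber) {
            this.inputIndex = inputIndex;
            this.attemptNumber = attemptNumber;
        }

        @Override
        public String toString() {
            return "input-" + inputIndex + "_" + attemptNumber;
        }
    }

    /** Hypothetical analogue of setupConnection: reads the attempts without copying them. */
    static String buildFetchQuery(Iterable<AttemptId> attempts) {
        StringBuilder sb = new StringBuilder();
        for (AttemptId id : attempts) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(id);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Insertion-ordered map (assumed String key): remove(key) does not scan the collection.
        Map<String, AttemptId> srcAttemptsRemaining = new LinkedHashMap<>();
        for (int i = 0; i < 4; i++) {
            AttemptId id = new AttemptId(i, 0);
            srcAttemptsRemaining.put(id.toString(), id);
        }

        // The fetch for input 1 succeeded: drop it from the remaining set.
        srcAttemptsRemaining.remove("input-1_0");

        // On a retry, pass the live view of what is left; no new list is allocated.
        System.out.println(buildFetchQuery(srcAttemptsRemaining.values()));
    }
}
```

Because values() is a view backed by the map, a retry sees only the attempts that are still outstanding, which is the behaviour the comment asks about for the retry scenario.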
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: TEZ-2613.4.patch > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: (was: TEZ-2613.4.patch) > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642844#comment-14642844 ] Saikat edited comment on TEZ-2613 at 7/27/15 3:34 PM: -- Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt in patchset 4. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. My suggestion would be to pass the remainingSrcAttemptsList as an Iterable to setupConnection. That way we can avoid creating a new LinkedList. e.g. function prototype for setupConnection: private HostFetchResult setupConnection(Iterable attempts); // called as: HostFetchResult connectionsWithRetryResult = setupConnection(srcAttemptsRemaining.values()); was (Author: saikatr): Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt in patchset 4. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642844#comment-14642844 ] Saikat edited comment on TEZ-2613 at 7/27/15 3:28 PM: -- Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt in patchset 4. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. was (Author: saikatr): Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt in patchset 4. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642844#comment-14642844 ] Saikat edited comment on TEZ-2613 at 7/27/15 3:25 PM: -- Hi [~rajesh.balamohan], thanks for your review comments. Changed the iterator in getNextRemainingAttempt in patchset 4. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? Once I get clarification on this behaviour, I will submit a change accordingly. was (Author: saikatr): Hi [~rajesh.balamohan], thanks for your review comments. I will change the iterator in getNextRemainingAttempt. For setupConnection, I had a doubt: shouldn't we use the remaining identifiers while setting up the connection (because setupConnection may also be called again in the retry scenario, line 528)? I wanted some clarification on this behaviour. > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2613) Fetcher(unordered) using List to store InputAttemptIdentifier can lead to some inefficiency during remove() operation
[ https://issues.apache.org/jira/browse/TEZ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saikat updated TEZ-2613: Attachment: TEZ-2613.4.patch > Fetcher(unordered) using List to store InputAttemptIdentifier can lead to > some inefficiency during remove() operation > - > > Key: TEZ-2613 > URL: https://issues.apache.org/jira/browse/TEZ-2613 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2613.1.patch, TEZ-2613.2.patch, TEZ-2613.3.patch, > TEZ-2613.4.patch, TEZ-2613.patch > > > remove() operation on the remaining list can be inefficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2627) Support for Tez Job Priorities
[ https://issues.apache.org/jira/browse/TEZ-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642846#comment-14642846 ] Saikat edited comment on TEZ-2627 at 7/27/15 3:16 PM: -- [~hitesh] submitted patchset 3 according to your latest review comments. was (Author: saikatr): [~hitesh] submitted patch according to your latest review comments. > Support for Tez Job Priorities > -- > > Key: TEZ-2627 > URL: https://issues.apache.org/jira/browse/TEZ-2627 > Project: Apache Tez > Issue Type: Improvement >Reporter: Saikat >Assignee: Saikat > Attachments: TEZ-2627.1.patch, TEZ-2627.2.patch, TEZ-2627.3.patch, > TEZ-2627.patch > > > When a Tez job is submitted via TezClient, an ApplicationSubmissionContext is > created before submitting the job. ApplicationSubmissionContext has a > priority field which can be used to provide a priority for the job. > There is an ongoing effort in the YARN community to enable application > priorities (https://issues.apache.org/jira/browse/YARN-1963). > https://issues.apache.org/jira/browse/YARN-2003 implements the necessary > changes in RM and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)