[jira] [Created] (TEZ-1432) TEZ_AM_CANCEL_DELEGATION_TOKEN is named incorrectly
Siddharth Seth created TEZ-1432: --- Summary: TEZ_AM_CANCEL_DELEGATION_TOKEN is named incorrectly Key: TEZ-1432 URL: https://issues.apache.org/jira/browse/TEZ-1432 Project: Apache Tez Issue Type: Bug Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker TEZ_AM_CANCEL_DELEGATION_TOKEN is currently tez.am.am.complete.cancel.delegation.tokens -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1432) TEZ_AM_CANCEL_DELEGATION_TOKEN is named incorrectly
[ https://issues.apache.org/jira/browse/TEZ-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-1432: Attachment: TEZ-1432.1.txt Trivial patch renames the parameter. Committing. TEZ_AM_CANCEL_DELEGATION_TOKEN is named incorrectly -- Key: TEZ-1432 URL: https://issues.apache.org/jira/browse/TEZ-1432 Project: Apache Tez Issue Type: Bug Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker Attachments: TEZ-1432.1.txt TEZ_AM_CANCEL_DELEGATION_TOKEN is currently tez.am.am.complete.cancel.delegation.tokens -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (TEZ-1432) TEZ_AM_CANCEL_DELEGATION_TOKEN is named incorrectly
[ https://issues.apache.org/jira/browse/TEZ-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved TEZ-1432. - Resolution: Fixed Fix Version/s: 0.5.0 Hadoop Flags: Incompatible change Committed to master. TEZ_AM_CANCEL_DELEGATION_TOKEN is named incorrectly -- Key: TEZ-1432 URL: https://issues.apache.org/jira/browse/TEZ-1432 Project: Apache Tez Issue Type: Bug Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1432.1.txt TEZ_AM_CANCEL_DELEGATION_TOKEN is currently tez.am.am.complete.cancel.delegation.tokens -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1065) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order
[ https://issues.apache.org/jira/browse/TEZ-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098366#comment-14098366 ] Jeff Zhang commented on TEZ-1065: - [~bikassaha] I have updated the patch; please help review. DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order - Key: TEZ-1065 URL: https://issues.apache.org/jira/browse/TEZ-1065 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Bikas Saha Assignee: Jeff Zhang Labels: newbie Attachments: TEZ-1065.1.patch, Tez-1065-2.patch, Tez-1065-3.patch, Tez-1065-4.patch, Tez-1065.patch They should maintain the incoming vertex order. In VertexProgress e.g. lets use LinkedHashMap instead of HashMap. -- This message was sent by Atlassian JIRA (v6.2#6252)
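The ordering point in the TEZ-1065 description can be illustrated with a small sketch. It is not the Tez patch: the class VertexOrderSketch, the method iterationOrder, and the vertex names are hypothetical. It only shows that java.util.LinkedHashMap iterates in insertion order, while java.util.HashMap makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a per-vertex progress table; not the Tez class. */
final class VertexOrderSketch {

    /** Inserts the names into the given map and returns the order the map iterates them in. */
    static List<String> iterationOrder(Map<String, Integer> map, List<String> vertexNames) {
        for (String name : vertexNames) {
            map.put(name, 0); // 0 is a placeholder progress value
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Assumed vertex names, listed in the order the DAG declares them.
        List<String> dagOrder = Arrays.asList("scan", "filter", "join", "aggregate", "sink");

        // LinkedHashMap: iteration follows insertion order, so the DAG order is kept.
        System.out.println(iterationOrder(new LinkedHashMap<String, Integer>(), dagOrder));

        // HashMap: iteration order is unspecified and may differ from the DAG order.
        System.out.println(iterationOrder(new HashMap<String, Integer>(), dagOrder));
    }
}
```

The trade-off is a small amount of extra memory per entry for the linked list that LinkedHashMap maintains.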
[jira] [Commented] (TEZ-1117) Option to make YARN application failed on dag failure
[ https://issues.apache.org/jira/browse/TEZ-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098736#comment-14098736 ] Nick Dimiduk commented on TEZ-1117: --- I'm seeing a similar thing with Tez-based Hive jobs. Launch a job as {{hive -e ...}} that fails for some reason. Hive tells me it failed, but the job is reported on the cluster as SUCCESS. I'm struggling to imagine when I'd want a partial DAG failure to be anything but FAIL. Maybe if I coded the DAG directly, I'd want to mark some portions optional? When the DAG is generated by a planner (Pig, Hive, Cascading), the entries are there because the planner says they're necessary; they should all complete, or it's a failure. This is equivalent to MR step 3 of 5 in my query failing and the job still being marked as SUCCESS. Option to make YARN application failed on dag failure - Key: TEZ-1117 URL: https://issues.apache.org/jira/browse/TEZ-1117 Project: Apache Tez Issue Type: Improvement Reporter: Rohini Palaniswamy Can we have an configuration to make the Application status FAILED on termination if one of the DAGs fail? It is very confusing for users to see the application SUCCEEDED. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1334) Annotate all non public classes in tez-runtime-library with @private
[ https://issues.apache.org/jira/browse/TEZ-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098773#comment-14098773 ] Bikas Saha commented on TEZ-1334: -
bq. Looks like the previous comment was referring to InputAlreadyClosedException.
Yes. That is what it was referring to. Unless it's programmatically actionable by the user, we should just replace this with a TezUncheckedException with a diagnostic message.
bq. Maybe even LimitedPrivate for Hive
Similar to other places in the code. If it's necessary for users to write code, then it should be public. If we are not sure about the API itself, it should also be unstable.
Annotate all non public classes in tez-runtime-library with @private Key: TEZ-1334 URL: https://issues.apache.org/jira/browse/TEZ-1334 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Hitesh Shah Priority: Blocker Attachments: TEZ-1334.1.patch, TEZ-1334.2.patch This prevents javadoc from being generated. Alternative would be to mark classes explicitly public using annotation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1065) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order
[ https://issues.apache.org/jira/browse/TEZ-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098789#comment-14098789 ] Bikas Saha commented on TEZ-1065: - Looks good. Committing. Thanks for your contribution! DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order - Key: TEZ-1065 URL: https://issues.apache.org/jira/browse/TEZ-1065 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Bikas Saha Assignee: Jeff Zhang Labels: newbie Attachments: TEZ-1065.1.patch, Tez-1065-2.patch, Tez-1065-3.patch, Tez-1065-4.patch, Tez-1065.patch They should maintain the incoming vertex order. In VertexProgress e.g. lets use LinkedHashMap instead of HashMap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1338) Support submission of multiple applications with LocalRunner from within the same JVM
[ https://issues.apache.org/jira/browse/TEZ-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098802#comment-14098802 ] Bikas Saha commented on TEZ-1338: - FrameworkClient should probably have a refcounter to ensure that stop() takes effect only after all users have called stop(). Each new user can call start() to ensure that it acquires a reference. The first start() can start the real connection. We probably have this issue in the current patch, but because localclient.stop() is empty it does not matter. I think if we make the above fix, then both local and non-local clients could be shared between TezClient and DAGClient. The local vs non-local check would not be necessary. Later we can make an extension that allows the last refcounted user to actually shut down the session. This way we don't need to keep tezclient alive until all the processing is complete. Only the DAGClient needs to stay alive, which seems natural. Support submission of multiple applications with LocalRunner from within the same JVM - Key: TEZ-1338 URL: https://issues.apache.org/jira/browse/TEZ-1338 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Critical Attachments: TEZ-1338.1.txt A single DAGAM is currently setup, which is used for all clients. In non-session mode this AM would end up in a final state and will not accept another submission. -- This message was sent by Atlassian JIRA (v6.2#6252)
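The reference-counting idea in the comment above might look roughly like the following. This is a minimal sketch under stated assumptions, not the FrameworkClient code: the class RefCountedClientSketch and its fields are hypothetical, and a boolean flag stands in for the real connection.

```java
/** Hypothetical reference-counted client wrapper; names are illustrative, not Tez APIs. */
final class RefCountedClientSketch {
    private int refCount = 0;
    private boolean connected = false;

    /** Each user calls start() to acquire a reference; only the first call connects. */
    synchronized void start() {
        if (refCount == 0) {
            connected = true; // stands in for opening the real connection
        }
        refCount++;
    }

    /** Each user calls stop() to release its reference; only the last release disconnects. */
    synchronized void stop() {
        if (refCount == 0) {
            throw new IllegalStateException("stop() called without a matching start()");
        }
        refCount--;
        if (refCount == 0) {
            connected = false; // stands in for closing the real connection
        }
    }

    synchronized boolean isConnected() {
        return connected;
    }
}
```

With this shape, two holders of the same client can each call start() and stop() independently, and the connection stays open until the second stop().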
[jira] [Resolved] (TEZ-1065) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order
[ https://issues.apache.org/jira/browse/TEZ-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha resolved TEZ-1065. - Resolution: Fixed Fix Version/s: 0.5.0 Hadoop Flags: Reviewed commit 8515a0fbab8eb9c08d695b4ff4f913dd5811543d Author: Bikas Saha bi...@apache.org Date: Fri Aug 15 10:41:57 2014 -0700 TEZ-1065. DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order (Jeff Zhang via bikas) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order - Key: TEZ-1065 URL: https://issues.apache.org/jira/browse/TEZ-1065 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Bikas Saha Assignee: Jeff Zhang Labels: newbie Fix For: 0.5.0 Attachments: TEZ-1065.1.patch, Tez-1065-2.patch, Tez-1065-3.patch, Tez-1065-4.patch, Tez-1065.patch They should maintain the incoming vertex order. In VertexProgress e.g. lets use LinkedHashMap instead of HashMap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1320) Remove getApplicationId from DAGClient
[ https://issues.apache.org/jira/browse/TEZ-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-1320: Attachment: TEZ-1320.4.patch Attaching final commit patch. Remove getApplicationId from DAGClient -- Key: TEZ-1320 URL: https://issues.apache.org/jira/browse/TEZ-1320 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Jonathan Eagles Priority: Blocker Attachments: TEZ-1320-v1.patch, TEZ-1320-v2.patch, TEZ-1320-v3.patch, TEZ-1320.4.patch We should either get rid of this, or convert it to a String. Not sure why this API needs to be exposed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1429) Avoid sysexit in the DAGAM in case of local mode
[ https://issues.apache.org/jira/browse/TEZ-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098878#comment-14098878 ] Jonathan Eagles commented on TEZ-1429: -- Looks good in general.
# Simplify this logic
{code}
if (isLocal) {
  conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, false);
} else {
  conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true);
}
{code}
to this
{code}
conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, !isLocal);
{code}
# Can we create a method for the state to final state transition, like convertDAGAppMasterState?
# Consider moving to a centralized system exit mechanism. Obviously an opinion, but ideally I see Tez moving away from calling System.exit directly to a system like ExitUtils in Hadoop. This simplifies testing and removes the need for programmers to handle every use case. This JIRA probably isn't the time for this, however.
Avoid sysexit in the DAGAM in case of local mode Key: TEZ-1429 URL: https://issues.apache.org/jira/browse/TEZ-1429 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Critical Attachments: TEZ-1429.1.txt This jira is to investigate if there's a simple way where the sysexit in the DAGAM can be avoided in case of local mode. This is critical to actually making use of localmode. TEZ-1191 will be the proper fix for this. -- This message was sent by Atlassian JIRA (v6.2#6252)
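The centralized-exit suggestion in the review comment above can be sketched as follows. This is an illustration only, loosely inspired by the idea the comment describes; it is neither Hadoop's utility nor anything in Tez. The class ExitSketch, its methods, and ExitException are all hypothetical names.

```java
/** Hypothetical centralized exit helper; a sketch, not Hadoop's or Tez's API. */
final class ExitSketch {

    /** Thrown in place of a JVM exit when real exits are disabled (local mode, tests). */
    static final class ExitException extends RuntimeException {
        final int status;

        ExitException(int status, String message) {
            super(message);
            this.status = status;
        }
    }

    private static volatile boolean exitEnabled = true;

    /** Mirrors the "!isLocal" simplification: local mode must never kill the JVM. */
    static void setLocalMode(boolean isLocal) {
        exitEnabled = !isLocal;
    }

    /** Single place that decides between a real exit and a catchable exception. */
    static void terminate(int status, String message) {
        if (exitEnabled) {
            System.exit(status);
        }
        throw new ExitException(status, message);
    }
}
```

Callers replace direct System.exit calls with ExitSketch.terminate, so a test or an in-JVM local runner can catch ExitException and inspect the status instead of losing the process.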
[jira] [Updated] (TEZ-1334) Annotate all non public classes in tez-runtime-library with @private
[ https://issues.apache.org/jira/browse/TEZ-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-1334: - Attachment: TEZ-1334.3.patch Addressed most comments except for the handling of the exception. It is being used in places and the docs also call it out as a potential exception that could be thrown. Annotate all non public classes in tez-runtime-library with @private Key: TEZ-1334 URL: https://issues.apache.org/jira/browse/TEZ-1334 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Hitesh Shah Priority: Blocker Attachments: TEZ-1334.1.patch, TEZ-1334.2.patch, TEZ-1334.3.patch This prevents javadoc from being generated. Alternative would be to mark classes explicitly public using annotation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1431) Fix use of synchronized for certain functions in TezClient
[ https://issues.apache.org/jira/browse/TEZ-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098890#comment-14098890 ] Hitesh Shah commented on TEZ-1431: -- [~bikassaha] preWarm() calls submitDAG() which also throws Interrupted so cannot remove it for now unless we remove Interrupted from submit APIs. Fix use of synchronized for certain functions in TezClient -- Key: TEZ-1431 URL: https://issues.apache.org/jira/browse/TEZ-1431 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Priority: Blocker Attachments: TEZ-1431.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1338) Support submission of multiple applications with LocalRunner from within the same JVM
[ https://issues.apache.org/jira/browse/TEZ-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098896#comment-14098896 ] Siddharth Seth commented on TEZ-1338: - I haven't targeted sharing the FrameworkClient between TezClient and DAGClient in this patch. That's a slightly more involved change (and I believe I filed a separate jira for this earlier). This one is targeted only for LocalMode and doesn't affect non local-mode execution at all. Support submission of multiple applications with LocalRunner from within the same JVM - Key: TEZ-1338 URL: https://issues.apache.org/jira/browse/TEZ-1338 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Critical Attachments: TEZ-1338.1.txt A single DAGAM is currently setup, which is used for all clients. In non-session mode this AM would end up in a final state and will not accept another submission. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1320) Remove getApplicationId from DAGClient
[ https://issues.apache.org/jira/browse/TEZ-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098924#comment-14098924 ] Siddharth Seth commented on TEZ-1320: - Please commit. I'll need to rebase TEZ-1338 after this goes in. Remove getApplicationId from DAGClient -- Key: TEZ-1320 URL: https://issues.apache.org/jira/browse/TEZ-1320 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Jonathan Eagles Priority: Blocker Attachments: TEZ-1320-v1.patch, TEZ-1320-v2.patch, TEZ-1320-v3.patch, TEZ-1320.4.patch We should either get rid of this, or convert it to a String. Not sure why this API needs to be exposed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1065) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order
[ https://issues.apache.org/jira/browse/TEZ-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098931#comment-14098931 ] Bikas Saha commented on TEZ-1065: - commit 2422ca7584353f69bbc3ed2c1ce22374124d68e7 Author: Bikas Saha bi...@apache.org Date: Fri Aug 15 11:51:31 2014 -0700 TEZ-1065 addendum to fix broken test (bikas) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order - Key: TEZ-1065 URL: https://issues.apache.org/jira/browse/TEZ-1065 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Bikas Saha Assignee: Jeff Zhang Labels: newbie Fix For: 0.5.0 Attachments: TEZ-1065.1.patch, TEZ-1065.addendum.patch, Tez-1065-2.patch, Tez-1065-3.patch, Tez-1065-4.patch, Tez-1065.patch They should maintain the incoming vertex order. In VertexProgress e.g. lets use LinkedHashMap instead of HashMap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1411) Address initial feedback on swimlanes
[ https://issues.apache.org/jira/browse/TEZ-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098956#comment-14098956 ] Jonathan Eagles commented on TEZ-1411: -- [~gopalv], can we make the whole SVG rect for a task attempt clickable instead of just the link? For long-running jobs, this will make it much easier to get to the logs. Address initial feedback on swimlanes - Key: TEZ-1411 URL: https://issues.apache.org/jira/browse/TEZ-1411 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Gopal V Priority: Blocker Fix For: 0.5.0 Few other good to have things 1) A wrapper script that takes care of the command chaining with a single appId as input from the user. 2) Legend in the README or in the svg itself about what is what. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1411) Address initial feedback on swimlanes
[ https://issues.apache.org/jira/browse/TEZ-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098965#comment-14098965 ] Gopal V commented on TEZ-1411: -- [~jeagles]: sure, that shouldn't be a big issue. The reason to use SVG was to have the clickable links. Address initial feedback on swimlanes - Key: TEZ-1411 URL: https://issues.apache.org/jira/browse/TEZ-1411 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Gopal V Priority: Blocker Fix For: 0.5.0 Few other good to have things 1) A wrapper script that takes care of the command chaining with a single appId as input from the user. 2) Legend in the README or in the svg itself about what is what. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1117) Option to make YARN application failed on dag failure
[ https://issues.apache.org/jira/browse/TEZ-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14098983#comment-14098983 ] Nick Dimiduk commented on TEZ-1117: --- This is a simple select count(*) from foo where bar like 'baz%' kind of query. I don't think it involves multiple DAGs, but I could be mistaken. Option to make YARN application failed on dag failure - Key: TEZ-1117 URL: https://issues.apache.org/jira/browse/TEZ-1117 Project: Apache Tez Issue Type: Improvement Reporter: Rohini Palaniswamy Can we have an configuration to make the Application status FAILED on termination if one of the DAGs fail? It is very confusing for users to see the application SUCCEEDED. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1427) Change remaining classes that are using byte[] to UserPayload
[ https://issues.apache.org/jira/browse/TEZ-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-1427: Attachment: TEZ-1427.1.txt Trivial patch. [~bikassaha] - please review. Change remaining classes that are using byte[] to UserPayload - Key: TEZ-1427 URL: https://issues.apache.org/jira/browse/TEZ-1427 Project: Apache Tez Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker Attachments: TEZ-1427.1.txt EdgeManagerPluginContext is the most important. SleepProcessor configuration All the Configurers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1429) Avoid sysexit in the DAGAM in case of local mode
[ https://issues.apache.org/jira/browse/TEZ-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-1429: Attachment: TEZ-1429.2.txt Updated patch. [~jeagles] - do you want to take another look ? Avoid sysexit in the DAGAM in case of local mode Key: TEZ-1429 URL: https://issues.apache.org/jira/browse/TEZ-1429 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker Attachments: TEZ-1429.1.txt, TEZ-1429.2.txt This jira is to investigate if there's a simple way where the sysexit in the DAGAM can be avoided in case of local mode. This is critical to actually making use of localmode. TEZ-1191 will be the proper fix for this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1427) Change remaining classes that are using byte[] to UserPayload
[ https://issues.apache.org/jira/browse/TEZ-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099066#comment-14099066 ] Bikas Saha commented on TEZ-1427: - +1 Change remaining classes that are using byte[] to UserPayload - Key: TEZ-1427 URL: https://issues.apache.org/jira/browse/TEZ-1427 Project: Apache Tez Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker Attachments: TEZ-1427.1.txt EdgeManagerPluginContext is the most important. SleepProcessor configuration All the Configurers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1423) Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput
[ https://issues.apache.org/jira/browse/TEZ-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-1423: Attachment: TEZ-1423.1.txt Patch adds configuration for serializers and Compression. Keeping them separate has the advantage of providing exact confs set by users to individual components. From an API perspective, I think this is a better approach. The alternate is to use a single method. If we go down that route, the individual confs on methods should be removed though - since we wouldn't know which order the configs should be applied in - potentially overriding specific configs. Possibly easier to use, but in most cases users should not have to provide a conf - in which case I like the first approach better. [~bikassaha], [~oae] - please take a look. Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput - Key: TEZ-1423 URL: https://issues.apache.org/jira/browse/TEZ-1423 Project: Apache Tez Issue Type: Improvement Reporter: Johannes Zillmann Attachments: TEZ-1423.1.txt Using OnFileUnorderedPartitionedKVOutput there is no way of passing custom properties to its keySerializer class given that this class implements Configurable. For OnFileSortedOutput this is possible because comparatorConf and partitionerConf touching both sides, the input and the output. Possible solutions could be either passing a map to keySerializer configuration as well or have custom properties for the input and the output of an edge. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1338) Support submission of multiple applications with LocalRunner from within the same JVM
[ https://issues.apache.org/jira/browse/TEZ-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099124#comment-14099124 ] Siddharth Seth commented on TEZ-1338: - Thanks for taking a look, [~bikassaha] and [~jeagles]. Will leave concurrent DAGs to a separate jira. I thought [~airbots] had opened a jira for that, but couldn't find it. Support submission of multiple applications with LocalRunner from within the same JVM - Key: TEZ-1338 URL: https://issues.apache.org/jira/browse/TEZ-1338 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker Attachments: TEZ-1338.1.txt, TEZ-1338.2.txt A single DAGAM is currently setup, which is used for all clients. In non-session mode this AM would end up in a final state and will not accept another submission. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1117) Option to make YARN application failed on dag failure
[ https://issues.apache.org/jira/browse/TEZ-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099135#comment-14099135 ] Hitesh Shah commented on TEZ-1117: -- [~ndimiduk] As [~bikassaha] mentioned, the issue is with how Hive interacts with the Tez APIs and the assumptions of a user expecting the application status to behave similarly to a MapReduce job. Option to make YARN application failed on dag failure - Key: TEZ-1117 URL: https://issues.apache.org/jira/browse/TEZ-1117 Project: Apache Tez Issue Type: Improvement Reporter: Rohini Palaniswamy Can we have an configuration to make the Application status FAILED on termination if one of the DAGs fail? It is very confusing for users to see the application SUCCEEDED. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1117) Option to make YARN application failed on dag failure
[ https://issues.apache.org/jira/browse/TEZ-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099149#comment-14099149 ] Nick Dimiduk commented on TEZ-1117: --- So then this ticket should be closed as invalid and new tickets be opened in Hive and Pig instead? Option to make YARN application failed on dag failure - Key: TEZ-1117 URL: https://issues.apache.org/jira/browse/TEZ-1117 Project: Apache Tez Issue Type: Improvement Reporter: Rohini Palaniswamy Can we have an configuration to make the Application status FAILED on termination if one of the DAGs fail? It is very confusing for users to see the application SUCCEEDED. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1331) Investigate : interrupts being swallowed by TezClient/DAGClient methods
[ https://issues.apache.org/jira/browse/TEZ-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099156#comment-14099156 ] Bikas Saha commented on TEZ-1331: - +1 for the patch after the waitTillReady() deletion is reverted. Investigate : interrupts being swallowed by TezClient/DAGClient methods --- Key: TEZ-1331 URL: https://issues.apache.org/jira/browse/TEZ-1331 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Priority: Blocker Attachments: TEZ-1331.1.patch TEZ-1278 fixes waitTillReady to not ignore interrupts. This jira is to look through other APIs to figure out whether interrupts handling needs to be fixed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1331) Investigate : interrupts being swallowed by TezClient/DAGClient methods
[ https://issues.apache.org/jira/browse/TEZ-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099155#comment-14099155 ] Bikas Saha commented on TEZ-1331: - Please keep the waitTillReady(). That's needed until we transparently allow the prewarm to get superseded by the next real DAG. Approach sounds good to me. Essentially, where the API explicitly asks the user to wait, it expects the user to handle interruptions. In other cases it avoids potentially nagging handling of interrupted exceptions. This follows most of the existing Java standard API code, where all the wait methods throw InterruptedException but not every API that may block does. Later we can look at setting the thread's interrupted status upon interruption so that users can inspect it for that condition. That's another way to expose the interrupted status. It won't be backwards incompatible. Investigate : interrupts being swallowed by TezClient/DAGClient methods --- Key: TEZ-1331 URL: https://issues.apache.org/jira/browse/TEZ-1331 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Priority: Blocker Attachments: TEZ-1331.1.patch TEZ-1278 fixes waitTillReady to not ignore interrupts. This jira is to look through other APIs to figure out whether interrupts handling needs to be fixed. -- This message was sent by Atlassian JIRA (v6.2#6252)
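The two styles of interrupt handling discussed in the comment above can be contrasted in a short sketch. The class InterruptSketch and the method submitQuietly are hypothetical, and Thread.sleep stands in for whatever blocking work the real client methods do; only the method name waitTillReady is borrowed from the thread, and this body is not its implementation.

```java
/** Hypothetical client calls illustrating two interrupt-handling styles; not Tez code. */
final class InterruptSketch {

    /** Explicit wait API: the caller asked to wait, so the interruption is propagated. */
    static void waitTillReady(long millis) throws InterruptedException {
        Thread.sleep(millis); // stand-in for polling until the session is ready
    }

    /**
     * Non-wait API: no checked exception in the signature. On interruption the
     * thread's interrupted flag is restored so the caller can still detect it.
     * Returns true only if the call finished without being interrupted.
     */
    static boolean submitQuietly(long millis) {
        try {
            Thread.sleep(millis); // stand-in for a blocking RPC
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // re-set the flag rather than swallow it
            return false;
        }
    }
}
```

A caller of submitQuietly can check Thread.currentThread().isInterrupted() afterwards, which is the "another way to expose the interrupted status" mentioned in the comment.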
[jira] [Commented] (TEZ-1117) Option to make YARN application failed on dag failure
[ https://issues.apache.org/jira/browse/TEZ-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099162#comment-14099162 ] Bikas Saha commented on TEZ-1117: - This ticket will allow users to set the final status of the session when they close the session. Or add a config that will make the session fail if any dag has failed (defaulting to current behavior of not doing so). Users who care can use that config. I would prefer the former approach because it allows the user to make the decision allowing for complex cases that the boolean config will not allow. Option to make YARN application failed on dag failure - Key: TEZ-1117 URL: https://issues.apache.org/jira/browse/TEZ-1117 Project: Apache Tez Issue Type: Improvement Reporter: Rohini Palaniswamy Can we have an configuration to make the Application status FAILED on termination if one of the DAGs fail? It is very confusing for users to see the application SUCCEEDED. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (TEZ-1435) Fix unused imports
[ https://issues.apache.org/jira/browse/TEZ-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah resolved TEZ-1435. -- Resolution: Fixed Fix Version/s: 0.5.0 Committed to master Fix unused imports --- Key: TEZ-1435 URL: https://issues.apache.org/jira/browse/TEZ-1435 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Priority: Trivial Fix For: 0.5.0 Attachments: TEZ-1435.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1338) Support submission of multiple applications with LocalRunner from within the same JVM
[ https://issues.apache.org/jira/browse/TEZ-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099220#comment-14099220 ] Chen He commented on TEZ-1338: -- This is really helpful. Thanks, [~seth.siddha...@gmail.com]. The multiDAG issue is: TEZ-1376. Support submission of multiple applications with LocalRunner from within the same JVM - Key: TEZ-1338 URL: https://issues.apache.org/jira/browse/TEZ-1338 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1338.1.txt, TEZ-1338.2.txt A single DAGAM is currently setup, which is used for all clients. In non-session mode this AM would end up in a final state and will not accept another submission. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1065) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order
[ https://issues.apache.org/jira/browse/TEZ-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-1065: Attachment: TEZ-1065.addendum-1.patch Committing another test failure addendum. commit 5212b926c2bfd16d0f63d89814e9710a97bc4362 Author: Bikas Saha bi...@apache.org Date: Fri Aug 15 14:49:19 2014 -0700 TEZ-1065 addendum-1 to fix broken test (bikas) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order - Key: TEZ-1065 URL: https://issues.apache.org/jira/browse/TEZ-1065 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Bikas Saha Assignee: Jeff Zhang Labels: newbie Fix For: 0.5.0 Attachments: TEZ-1065.1.patch, TEZ-1065.addendum-1.patch, TEZ-1065.addendum.patch, Tez-1065-2.patch, Tez-1065-3.patch, Tez-1065-4.patch, Tez-1065.patch They should maintain the incoming vertex order. In VertexProgress e.g. lets use LinkedHashMap instead of HashMap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-625) DAG/Vertex names should be restricted to using [A-Za-z0-9_] and limited to a defined character limit
[ https://issues.apache.org/jira/browse/TEZ-625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-625: Assignee: (was: Jonathan Eagles) DAG/Vertex names should be restricted to using [A-Za-z0-9_] and limited to a defined character limit Key: TEZ-625 URL: https://issues.apache.org/jira/browse/TEZ-625 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Labels: newbie -- This message was sent by Atlassian JIRA (v6.2#6252)
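A check matching the TEZ-625 title could be as small as the sketch below. The issue does not state the character limit, so MAX_NAME_LENGTH = 64 is an assumed placeholder; the class NameValidatorSketch and the method isValidName are likewise hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical DAG/vertex name check; the length limit is an assumption, not from the issue. */
final class NameValidatorSketch {
    /** Assumed placeholder; TEZ-625 only says "a defined character limit". */
    static final int MAX_NAME_LENGTH = 64;

    /** One or more characters from the set named in the issue title. */
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_]+");

    static boolean isValidName(String name) {
        return name != null
                && name.length() <= MAX_NAME_LENGTH
                && ALLOWED.matcher(name).matches();
    }
}
```

Using matches() rather than find() requires the whole name to consist of allowed characters, so a name such as "Map 1" is rejected because of the space.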
[jira] [Resolved] (TEZ-1331) Investigate : interrupts being swallowed by TezClient/DAGClient methods
[ https://issues.apache.org/jira/browse/TEZ-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah resolved TEZ-1331. -- Resolution: Implemented Investigate : interrupts being swallowed by TezClient/DAGClient methods --- Key: TEZ-1331 URL: https://issues.apache.org/jira/browse/TEZ-1331 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Priority: Blocker Attachments: TEZ-1331.1.patch, TEZ-1331.2.patch TEZ-1278 fixes waitTillReady to not ignore interrupts. This jira is to look through other APIs to figure out whether interrupts handling needs to be fixed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1331) Investigate : interrupts being swallowed by TezClient/DAGClient methods
[ https://issues.apache.org/jira/browse/TEZ-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099241#comment-14099241 ] Hitesh Shah commented on TEZ-1331: -- Looks like all aspects have been addressed for now. Closing this out. Investigate : interrupts being swallowed by TezClient/DAGClient methods --- Key: TEZ-1331 URL: https://issues.apache.org/jira/browse/TEZ-1331 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Priority: Blocker Attachments: TEZ-1331.1.patch, TEZ-1331.2.patch TEZ-1278 fixes waitTillReady to not ignore interrupts. This jira is to look through other APIs to figure out whether interrupts handling needs to be fixed. -- This message was sent by Atlassian JIRA (v6.2#6252)
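For readers unfamiliar with what "swallowing" an interrupt means, here is a generic Java sketch of the two usual remedies: propagate InterruptedException, or restore the interrupt flag when the method cannot throw. It does not reproduce the TezClient/DAGClient code or the attached patches; all names are hypothetical.

```java
import java.util.function.BooleanSupplier;

// Generic illustration of interrupt handling; not the TezClient/DAGClient implementation.
final class InterruptSketch {

    // Polls until ready and lets InterruptedException propagate to the caller.
    static void waitUntilReady(BooleanSupplier ready, long pollMillis) throws InterruptedException {
        while (!ready.getAsBoolean()) {
            Thread.sleep(pollMillis); // throws if the thread is interrupted
        }
    }

    // Variant for callers that cannot throw: restores the interrupt flag instead of dropping it.
    static boolean waitQuietly(BooleanSupplier ready, long pollMillis) {
        try {
            waitUntilReady(ready, pollMillis);
            return true;
        } catch (InterruptedException e) {
            // Re-set the flag so code further up the stack can still observe the interrupt.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();
        boolean finished = waitQuietly(() -> false, 10L);
        System.out.println("finished=" + finished + ", interrupted=" + Thread.interrupted());
    }
}
```

An empty catch block in place of the interrupt() call is the pattern the issue title describes: the wait ends, but the caller can no longer tell that cancellation was requested.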
[jira] [Updated] (TEZ-1436) Fix javadoc warnings
[ https://issues.apache.org/jira/browse/TEZ-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-1436: - Assignee: (was: Hitesh Shah) Fix javadoc warnings Key: TEZ-1436 URL: https://issues.apache.org/jira/browse/TEZ-1436 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (TEZ-1055) Rename tez-mapreduce-examples to tez-examples
[ https://issues.apache.org/jira/browse/TEZ-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah reassigned TEZ-1055: Assignee: Hitesh Shah (was: Rekha Joshi) Rename tez-mapreduce-examples to tez-examples - Key: TEZ-1055 URL: https://issues.apache.org/jira/browse/TEZ-1055 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Hitesh Shah Priority: Blocker And also the internal classes where applicable to remove MR references. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1411) Address initial feedback on swimlanes
[ https://issues.apache.org/jira/browse/TEZ-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099297#comment-14099297 ] Gopal V commented on TEZ-1411: -- [~jeagles]: You can produce a zoomed out view by modifying the -t variable. I intend to rewrite this tool, without needing regex based log-parsing and pull all the information from ATS/SimpleLoggingHistoryService directly. The latter is trivial to use, just add this to the tez-site.xml - to log ATS-like info into HDFS. {code} <property> <name>tez.simple.history.logging.dir</name> <value>${fs.default.name}/tez-history/</value> </property> {code} I will encourage you to use either of those, because I'll try to push out more tooling I have built for post-hoc analysis from that data. Address initial feedback on swimlanes - Key: TEZ-1411 URL: https://issues.apache.org/jira/browse/TEZ-1411 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Gopal V Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1411.1.patch, large.am.history.txt Few other good to have things 1) A wrapper script that takes care of the command chaining with a single appId as input from the user. 2) Legend in the README or in the svg itself about what is what. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1390) Replace byte[] with ByteBuffer as the type of user payload in the API
[ https://issues.apache.org/jira/browse/TEZ-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099313#comment-14099313 ] Siddharth Seth commented on TEZ-1390: - Configurers have already been changed to use userPayload, so that should not be a problem. Replace byte[] with ByteBuffer as the type of user payload in the API - Key: TEZ-1390 URL: https://issues.apache.org/jira/browse/TEZ-1390 Project: Apache Tez Issue Type: Improvement Reporter: Bikas Saha Assignee: Tsuyoshi OZAWA Priority: Blocker Attachments: TEZ-1390.1.patch, pig.payload.txt This is just an API change. Internally we can continue to use byte[] since that's a much bigger change. The translation from ByteBuffer to byte[] in the API layer should not have perf impact. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1418) Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH
[ https://issues.apache.org/jira/browse/TEZ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-1418: Attachment: TEZ-1418.1.patch Attaching commit patch with edit suggested by [~hitesh]. Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH --- Key: TEZ-1418 URL: https://issues.apache.org/jira/browse/TEZ-1418 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Subroto Sanyal Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1418.1.patch, TEZ-1418.patch As part of the fix for the issue TEZ-1127 two new configurations have been introduced: # _TEZ_AM_LAUNCH_ENV_ # _TEZ_TASK_LAUNCH_ Ideally these properties should be configured with default value of: LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native as in the case for _mapreduce.admin.user.env_ The default value for these properties are set to (empty string). Now user has to explicitly set these values from the application code to use the native libs (like for compression). From Hitesh: {quote}As commented on TEZ-1127, it is a question as to what the default should be - whether HADOOP_COMMON_HOME or HADOOP_PREFIX and to some extent, it needs to handle Windows deployments too.{quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1418) Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH
[ https://issues.apache.org/jira/browse/TEZ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099326#comment-14099326 ] Bikas Saha commented on TEZ-1418: - commit 4897b25fe2ec0c9b0108f8a8f96b60f7b5f8ad06 Author: Bikas Saha bi...@apache.org Date: Fri Aug 15 16:03:00 2014 -0700 TEZ-1418. Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH (Subroto Sanyal via bikas) Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH --- Key: TEZ-1418 URL: https://issues.apache.org/jira/browse/TEZ-1418 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Subroto Sanyal Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1418.1.patch, TEZ-1418.patch As part of the fix for the issue TEZ-1127 two new configurations have been introduced: # _TEZ_AM_LAUNCH_ENV_ # _TEZ_TASK_LAUNCH_ Ideally these properties should be configured with default value of: LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native as in the case for _mapreduce.admin.user.env_ The default value for these properties are set to (empty string). Now user has to explicitly set these values from the application code to use the native libs (like for compression). From Hitesh: {quote}As commented on TEZ-1127, it is a question as to what the default should be - whether HADOOP_COMMON_HOME or HADOOP_PREFIX and to some extent, it needs to handle Windows deployments too.{quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1418) Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH
[ https://issues.apache.org/jira/browse/TEZ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-1418: Assignee: Subroto Sanyal Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH --- Key: TEZ-1418 URL: https://issues.apache.org/jira/browse/TEZ-1418 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Subroto Sanyal Assignee: Subroto Sanyal Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1418.1.patch, TEZ-1418.patch As part of the fix for the issue TEZ-1127 two new configurations have been introduced: # _TEZ_AM_LAUNCH_ENV_ # _TEZ_TASK_LAUNCH_ Ideally these properties should be configured with default value of: LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native as in the case for _mapreduce.admin.user.env_ The default value for these properties are set to (empty string). Now user has to explicitly set these values from the application code to use the native libs (like for compression). From Hitesh: {quote}As commented on TEZ-1127, it is a question as to what the default should be - whether HADOOP_COMMON_HOME or HADOOP_PREFIX and to some extent, it needs to handle Windows deployments too.{quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
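The fallback behaviour requested in TEZ-1418 can be sketched as follows. This is not the committed patch: the class and method names are hypothetical, and the default string is simply the value proposed in the issue description, where the choice between HADOOP_COMMON_HOME and HADOOP_PREFIX was still being discussed.

```java
// Hypothetical helper; not the code committed for TEZ-1418.
final class LaunchEnvSketch {

    // Value proposed in the issue description; treated here as an assumption.
    static final String DEFAULT_ENV = "LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native";

    // Returns the configured launch environment, or the default when none is set.
    static String effectiveEnv(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_ENV;
        }
        return configured;
    }

    public static void main(String[] args) {
        System.out.println(effectiveEnv(""));          // falls back to the default
        System.out.println(effectiveEnv("FOO=bar"));   // explicit setting wins
    }
}
```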
[jira] [Commented] (TEZ-1390) Replace byte[] with ByteBuffer as the type of user payload in the API
[ https://issues.apache.org/jira/browse/TEZ-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099354#comment-14099354 ] Tsuyoshi OZAWA commented on TEZ-1390: - OK. I also plan to rename to/fromByteArray to to/fromByteBuffer. Replace byte[] with ByteBuffer as the type of user payload in the API - Key: TEZ-1390 URL: https://issues.apache.org/jira/browse/TEZ-1390 Project: Apache Tez Issue Type: Improvement Reporter: Bikas Saha Assignee: Tsuyoshi OZAWA Priority: Blocker Attachments: TEZ-1390.1.patch, pig.payload.txt This is just an API change. Internally we can continue to use byte[] since that's a much bigger change. The translation from ByteBuffer to byte[] in the API layer should not have perf impact. -- This message was sent by Atlassian JIRA (v6.2#6252)
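The ByteBuffer-to-byte[] translation mentioned in the TEZ-1390 description could be done roughly as below. This is a sketch using only java.nio, not the attached patch; the class and method names are hypothetical. It copies the remaining bytes through a duplicate view so the caller's position and limit are left untouched.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical translation helpers for an API layer; not the TEZ-1390 patch.
final class PayloadSketch {

    // Copies the buffer's remaining bytes without moving the caller's position.
    static byte[] toByteArray(ByteBuffer payload) {
        if (payload == null) {
            return null;
        }
        ByteBuffer view = payload.duplicate(); // independent position/limit, shared content
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    // Wraps an existing array; no copy is made.
    static ByteBuffer fromByteArray(byte[] bytes) {
        return bytes == null ? null : ByteBuffer.wrap(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer payload = fromByteArray(new byte[] {1, 2, 3, 4});
        payload.position(1);
        System.out.println(Arrays.toString(toByteArray(payload)) + " position=" + payload.position());
    }
}
```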
[jira] [Commented] (TEZ-1065) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order
[ https://issues.apache.org/jira/browse/TEZ-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099372#comment-14099372 ] Jeff Zhang commented on TEZ-1065: - [~bikassaha] Thanks for fixing the test failure, will run tests locally before submitting patch next time :) DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order - Key: TEZ-1065 URL: https://issues.apache.org/jira/browse/TEZ-1065 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Bikas Saha Assignee: Jeff Zhang Labels: newbie Fix For: 0.5.0 Attachments: TEZ-1065.1.patch, TEZ-1065.addendum-1.patch, TEZ-1065.addendum.patch, Tez-1065-2.patch, Tez-1065-3.patch, Tez-1065-4.patch, Tez-1065.patch They should maintain the incoming vertex order. In VertexProgress e.g. lets use LinkedHashMap instead of HashMap. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (TEZ-1438) Annotate add java doc for tez-runtime-library and tez-mapreduce
Bikas Saha created TEZ-1438: --- Summary: Annotate add java doc for tez-runtime-library and tez-mapreduce Key: TEZ-1438 URL: https://issues.apache.org/jira/browse/TEZ-1438 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.5.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1438) Annotate add java doc for tez-runtime-library and tez-mapreduce
[ https://issues.apache.org/jira/browse/TEZ-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-1438: Attachment: TEZ-1438.1.patch [~hitesh] Please review. Annotate add java doc for tez-runtime-library and tez-mapreduce --- Key: TEZ-1438 URL: https://issues.apache.org/jira/browse/TEZ-1438 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1438.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1055) Rename tez-mapreduce-examples to tez-examples
[ https://issues.apache.org/jira/browse/TEZ-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-1055: - Attachment: mod-list.txt TEZ-1055.1.patch mod-list.txt has a git status which you can use to reference the relevant files to review. All other files which are renamed can be ignored. Main items to review are the changes in docs, the pom.xml, INSTALL/BUILDING guides and the assembly descriptors. Rename tez-mapreduce-examples to tez-examples - Key: TEZ-1055 URL: https://issues.apache.org/jira/browse/TEZ-1055 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Hitesh Shah Priority: Blocker Attachments: TEZ-1055.1.patch, mod-list.txt And also the internal classes where applicable to remove MR references. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1438) Annotate add java doc for tez-runtime-library and tez-mapreduce
[ https://issues.apache.org/jira/browse/TEZ-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099394#comment-14099394 ] Hitesh Shah commented on TEZ-1438: -- Comments: - TezGroupedSplitsInputFormat - this supports multiple input formats I believe - should call that out. - +@org.apache.hadoop.classification.InterfaceAudience.Private - in other places @ Private is used. - MRHelpers unstable? Rest looks good. Annotate add java doc for tez-runtime-library and tez-mapreduce --- Key: TEZ-1438 URL: https://issues.apache.org/jira/browse/TEZ-1438 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1438.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1055) Rename tez-mapreduce-examples to tez-examples
[ https://issues.apache.org/jira/browse/TEZ-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099406#comment-14099406 ] Bikas Saha commented on TEZ-1055: - Minor comment. Should this just be Tez job instead of MRR? Example of using an MRR job in the IntersectDataGen/Example/Validate should move back to tez-examples. +1. Looks good. Rename tez-mapreduce-examples to tez-examples - Key: TEZ-1055 URL: https://issues.apache.org/jira/browse/TEZ-1055 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Hitesh Shah Priority: Blocker Attachments: TEZ-1055.1.patch, mod-list.txt And also the internal classes where applicable to remove MR references. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1438) Annotate add java doc for tez-runtime-library and tez-mapreduce
[ https://issues.apache.org/jira/browse/TEZ-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-1438: - Attachment: TEZ-1438.2.patch Addressed minor nits. Annotate add java doc for tez-runtime-library and tez-mapreduce --- Key: TEZ-1438 URL: https://issues.apache.org/jira/browse/TEZ-1438 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1438.1.patch, TEZ-1438.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1438) Annotate add java doc for tez-runtime-library and tez-mapreduce
[ https://issues.apache.org/jira/browse/TEZ-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-1438: - Attachment: TEZ-1438.3.patch Rebased patch Annotate add java doc for tez-runtime-library and tez-mapreduce --- Key: TEZ-1438 URL: https://issues.apache.org/jira/browse/TEZ-1438 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1438.1.patch, TEZ-1438.2.patch, TEZ-1438.3.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (TEZ-1439) IntersectDataGen/Example/Validate should move back to tez-examples.
Hitesh Shah created TEZ-1439: Summary: IntersectDataGen/Example/Validate should move back to tez-examples. Key: TEZ-1439 URL: https://issues.apache.org/jira/browse/TEZ-1439 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Priority: Blocker -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1400) Reducers stuck when enabling auto-reduce parallelism (MRR case)
[ https://issues.apache.org/jira/browse/TEZ-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-1400: -- Attachment: TEZ-1400.3.patch Thanks Bikas. Will make the tez-site related changes in this JIRA and add separate patch for TEZ-744. Reducers stuck when enabling auto-reduce parallelism (MRR case) --- Key: TEZ-1400 URL: https://issues.apache.org/jira/browse/TEZ-1400 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Rajesh Balamohan Assignee: Rajesh Balamohan Labels: performance Attachments: TEZ-1400.1.patch, TEZ-1400.2.patch, TEZ-1400.3.patch, dag.dot In M - R1 - R2 case, if R1 is optimized by auto-parallelism R2 gets stuck waiting for events. e.g Map 1: 0/1 Map 2: -/- Map 5: 0/1 Map 6: 0/1 Map 7: 0/1 Reducer 3: 0/23 Reducer 4: 0/1 ... ... Map 1: 1/1 Map 2: 148(+13)/161 Map 5: 1/1 Map 6: 1/1 Map 7: 1/1 Reducer 3: 0(+3)/3 Reducer 4: 0(+1)/1 == Auto reduce parallelism kicks in .. Map 1: 1/1 Map 2: 161/161 Map 5: 1/1 Map 6: 1/1 Map 7: 1/1 Reducer 3: 3/3 Reducer 4: 0(+1)/1 Job is stuck waiting for events in Reducer 4. [fetcher [Reducer_3] #23] org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler: copy(3 of 23 at 0.02 MB/s) === *Waiting for 20 more partitions, even though Reducer3 has been optimized to use 3 reducers -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1132) Consistent naming of Input and Outputs
[ https://issues.apache.org/jira/browse/TEZ-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099444#comment-14099444 ] Bikas Saha commented on TEZ-1132: - LocalOnFileSorterOutput , LocalMergedInput moved to mapreduce tests OnFileSortedOutput/Configurer to OrderedPartitionedKVOutput/Configurer OnFileUnorderedPartitionedKVOutput/Configurer to UnorderedPartitionedKVOutput/Configurer UnorderedKVOutput/Configurer to UnorderedKVOutput/Configurer SortedGroupedMergedInput/Reader to OrderedGroupedMergedKVInput/Reader ShuffledMergedInput to OrderedGroupedKVInput ShuffledUnorderedKVInput to UnorderedKVOutput UnorderedUnpartitionedKVEdgeConfigurer to UnorderedKVEdgeConfigurer Consistent naming of Input and Outputs -- Key: TEZ-1132 URL: https://issues.apache.org/jira/browse/TEZ-1132 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Some places we should Sorted Partitioned. In others we should Shuffled. We should use a consistent naming scheme based on Sorted, Grouped, Partitioned sub-terms so that the function is clear from the name. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (TEZ-1132) Consistent naming of Input and Outputs
[ https://issues.apache.org/jira/browse/TEZ-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099444#comment-14099444 ] Bikas Saha edited comment on TEZ-1132 at 8/16/14 1:51 AM: -- Changes made LocalOnFileSorterOutput , LocalMergedInput moved to mapreduce tests OnFileSortedOutput/Configurer to OrderedPartitionedKVOutput/Configurer OnFileUnorderedPartitionedKVOutput/Configurer to UnorderedPartitionedKVOutput/Configurer UnorderedKVOutput/Configurer to UnorderedKVOutput/Configurer SortedGroupedMergedInput/Reader to OrderedGroupedMergedKVInput/Reader ShuffledMergedInput to OrderedGroupedKVInput ShuffledUnorderedKVInput to UnorderedKVOutput UnorderedUnpartitionedKVEdgeConfigurer to UnorderedKVEdgeConfigurer was (Author: bikassaha): LocalOnFileSorterOutput , LocalMergedInput moved to mapreduce tests OnFileSortedOutput/Configurer to OrderedPartitionedKVOutput/Configurer OnFileUnorderedPartitionedKVOutput/Configurer to UnorderedPartitionedKVOutput/Configurer UnorderedKVOutput/Configurer to UnorderedKVOutput/Configurer SortedGroupedMergedInput/Reader to OrderedGroupedMergedKVInput/Reader ShuffledMergedInput to OrderedGroupedKVInput ShuffledUnorderedKVInput to UnorderedKVOutput UnorderedUnpartitionedKVEdgeConfigurer to UnorderedKVEdgeConfigurer Consistent naming of Input and Outputs -- Key: TEZ-1132 URL: https://issues.apache.org/jira/browse/TEZ-1132 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Some places we should Sorted Partitioned. In others we should Shuffled. We should use a consistent naming scheme based on Sorted, Grouped, Partitioned sub-terms so that the function is clear from the name. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (TEZ-1367) Review and clean packages in the API
[ https://issues.apache.org/jira/browse/TEZ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14099464#comment-14099464 ] Bikas Saha commented on TEZ-1367: - Took a pass over all packages. Looked fine. Recent refactorings seem to have put things in order. Closing. Review and clean packages in the API Key: TEZ-1367 URL: https://issues.apache.org/jira/browse/TEZ-1367 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (TEZ-1367) Review and clean packages in the API
[ https://issues.apache.org/jira/browse/TEZ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha resolved TEZ-1367. - Resolution: Done Review and clean packages in the API Key: TEZ-1367 URL: https://issues.apache.org/jira/browse/TEZ-1367 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1132) Consistent naming of Input and Outputs
[ https://issues.apache.org/jira/browse/TEZ-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-1132: Attachment: TEZ-1132.1.patch git.txt Attaching patch. Since this is a refactor and there was prior agreement on the refactoring changes I am going to commit this patch. It is fragile to other changes. It also unblocks other blocker patches that will severely conflict with this. Consistent naming of Input and Outputs -- Key: TEZ-1132 URL: https://issues.apache.org/jira/browse/TEZ-1132 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Attachments: TEZ-1132.1.patch, git.txt Some places we should Sorted Partitioned. In others we should Shuffled. We should use a consistent naming scheme based on Sorted, Grouped, Partitioned sub-terms so that the function is clear from the name. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (TEZ-1132) Consistent naming of Input and Outputs
[ https://issues.apache.org/jira/browse/TEZ-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha resolved TEZ-1132. - Resolution: Fixed Fix Version/s: 0.5.0 Hadoop Flags: Incompatible change commit 469bf9052f7a51e67c82b4db300d71992d5fd874 Author: Bikas Saha bi...@apache.org Date: Fri Aug 15 20:05:26 2014 -0700 TEZ-1132. Consistent naming of Input and Outputs (bikas) Consistent naming of Input and Outputs -- Key: TEZ-1132 URL: https://issues.apache.org/jira/browse/TEZ-1132 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.5.0 Attachments: TEZ-1132.1.patch, git.txt Some places we should Sorted Partitioned. In others we should Shuffled. We should use a consistent naming scheme based on Sorted, Grouped, Partitioned sub-terms so that the function is clear from the name. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1423) Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput
[ https://issues.apache.org/jira/browse/TEZ-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-1423: Attachment: TEZ-1423.2.patch Attaching rebased patch for commit. Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput - Key: TEZ-1423 URL: https://issues.apache.org/jira/browse/TEZ-1423 Project: Apache Tez Issue Type: Improvement Reporter: Johannes Zillmann Assignee: Siddharth Seth Attachments: TEZ-1423.1.txt, TEZ-1423.2.patch Using OnFileUnorderedPartitionedKVOutput there is no way of passing custom properties to its keySerializer class given that this class implements Configurable. For OnFileSortedOutput this is possible because comparatorConf and partitionerConf touching both sides, the input and the output. Possible solutions could be either passing a map to keySerializer configuration as well or have custom properties for the input and the output of an edge. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (TEZ-1423) Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput
[ https://issues.apache.org/jira/browse/TEZ-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha resolved TEZ-1423. - Resolution: Fixed Fix Version/s: 0.5.0 Hadoop Flags: Incompatible change,Reviewed commit 98875db09ece09acee71190489d40ce17ba239c3 Author: Bikas Saha bi...@apache.org Date: Fri Aug 15 20:25:12 2014 -0700 TEZ-1423. Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput (Siddharth Seth via Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput - Key: TEZ-1423 URL: https://issues.apache.org/jira/browse/TEZ-1423 Project: Apache Tez Issue Type: Improvement Reporter: Johannes Zillmann Assignee: Siddharth Seth Fix For: 0.5.0 Attachments: TEZ-1423.1.txt, TEZ-1423.2.patch Using OnFileUnorderedPartitionedKVOutput there is no way of passing custom properties to its keySerializer class given that this class implements Configurable. For OnFileSortedOutput this is possible because comparatorConf and partitionerConf touching both sides, the input and the output. Possible solutions could be either passing a map to keySerializer configuration as well or have custom properties for the input and the output of an edge. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1395) TestSecureShuffle fails
[ https://issues.apache.org/jira/browse/TEZ-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-1395: - Attachment: TestSecureShuffle-logs.tgz [~rajesh.balamohan] attached mini cluster logs as well as test output logs. I can reproduce this consistently on a mac with java 1.7. TestSecureShuffle fails Key: TEZ-1395 URL: https://issues.apache.org/jira/browse/TEZ-1395 Project: Apache Tez Issue Type: Bug Reporter: Tsuyoshi OZAWA Assignee: Rajesh Balamohan Attachments: TestSecureShuffle-logs.tgz, org.apache.tez.test.TestSecureShuffle-output.txt {quote} Running org.apache.tez.test.TestSecureShuffle Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 262.405 sec FAILURE! testSecureShuffle[test[sslInCluster:true, sslInTez:true, expectedResult:0]](org.apache.tez.test.TestSecureShuffle) Time elapsed: 75.061 sec FAILURE! java.lang.AssertionError: expected:0 but was:1 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.tez.test.TestSecureShuffle.testSecureShuffle(TestSecureShuffle.java:148) {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (TEZ-1363) Make use of the regular scheduler when running in LocalMode
[ https://issues.apache.org/jira/browse/TEZ-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-1363: - Attachment: TEZ-1363-v1.patch I assume we can use this ticket to start using regular task scheduler for local mode and the hybrid local/remote/other containers will be another sub-task. Starter Delegate Abstraction for TezAMRMClientAsync patch based on assumption. Still need to create the TezLocalAMRMClientAsync. Make use of the regular scheduler when running in LocalMode --- Key: TEZ-1363 URL: https://issues.apache.org/jira/browse/TEZ-1363 Project: Apache Tez Issue Type: Sub-task Reporter: Siddharth Seth Attachments: TEZ-1363-v1.patch In TEZ-708, we decided to introduce a new scheduler for local mode - to keep things simple initially, and get local mode working. Eventually, however, scheduling should go through the regular task scheduler - which should be able to get containers from YARN / LocalAllocator / other sources - and treat them as a regular container for scheduling purposes. -- This message was sent by Atlassian JIRA (v6.2#6252)