[ https://issues.apache.org/jira/browse/TEZ-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180725#comment-15180725 ]
Hitesh Shah commented on TEZ-3155: ---------------------------------- Some general overal comments: - patch looks decent for the most part - please do not change code lines if they are not needed ( minor edits, new lines, etc). Small changes are sometimes ok but numerous modifications make it hard for a reviewer. Some comments on the patch: TezClient - why 0.8 for largeDAGPlanRatio ? Why does it need to be so high for a 64 MB max limit? Shoudnt it be more like a fixed number to account for the overhead? - please consider using DAGPlan.writeTo* to write the output stream instead of the objectoutputstream - output stream needs to be closed? {code} FileSystem fs = dagPlanPath.getFileSystem(dagClientConf); {code} - the above should use amCOnfig.getTezConf() and should be done once per TezClient and the fs object re-used. {code} SubmitDAGRequestProto request = requestBuilder.build(); SubmitDAGResponseProto response = proxy.submitDAG(null, request); {code} - any reason why this code change was needed? - requestBuilder.build() is getting called twice - should be fixed to be called once. - SERIALIZED_DAGPLAN_NAME_PREFIX can be replaced by TEZ_PB_PLAN_BINARY_NAME {code} requestBuilder.setSerializedDagPlanPath(dagPlanPath.toString()); {code} - this should be a fully resolved path ( check fs.resolve...) TestTezClient: - there should be a test for the negative case i.e. one where the dag plan is within the defined threshold DAGClientServer: {code} DAGClientAMProtocolBlockingPBServerImpl service = DAGClientAMProtocolBlockingPBServerImpl(realInstance, conf); {code} - why was the above needed? DAGClientAMProtocolBlockingPBServerImpl: - it might be better to move this aspect of the code into DAGAppMaster where the staging fs object can be reused. - also the stream needs to be closed in a finally block > Support a way to submit DAGs to a session where the DAG plan exceeds hadoop > ipc limits > --------------------------------------------------------------------------------------- > > Key: TEZ-3155 > URL: https://issues.apache.org/jira/browse/TEZ-3155 > Project: Apache Tez > Issue Type: Bug > Reporter: Hitesh Shah > Assignee: Zhiyuan Yang > Attachments: TEZ-3155.1.patch, TEZ-3155.2.patch, TEZ-3155.3.patch > > > Currently, dag submissions fail if the dag plan exceeds the hadoop ipc > limits. One option would be to fall back to local resources if the dag plan > is too large. -- This message was sent by Atlassian JIRA (v6.3.4#6332)