Hi Madhusudan,

The UI is trying to fetch real-time data from the Tez Application Master (via the RM), and that request is running into a CORS (cross-origin resource sharing) restriction. Please enable CORS support in the RM as described at https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Enabling_CORS_support, and the request should then succeed.

- Browsers do not let a page read data from a different origin unless the response carries specific headers. In your case the UI is loaded from http://tez-dev.internal:8080, and it is trying to access data from http://rm:8088. Setting those configurations adds the required headers to the response from the server.
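To make the browser-side rule concrete, here is a minimal sketch of the check a browser applies to a simple cross-origin GET. It is an illustration only, not code from Tez UI or Hadoop: the function name `cors_allows` and the sample header dictionaries are hypothetical, and the sketch ignores preflight requests and credentials.

```python
def cors_allows(page_origin, response_headers):
    """Return True if a browser would let a page at page_origin read the response.

    Simplified: only the Access-Control-Allow-Origin check for a
    non-credentialed request is modelled here.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    if allowed is None:
        # No CORS header at all: the browser blocks the read.
        return False
    allowed = allowed.strip()
    return allowed == "*" or allowed == page_origin


ui_origin = "http://tez-dev.internal:8080"

# A response that carries no CORS header.
without_cors = {"Content-Type": "application/json"}

# Hypothetical response once CORS support is enabled on the RM.
with_cors = {
    "Content-Type": "application/json",
    "Access-Control-Allow-Origin": "http://tez-dev.internal:8080",
}

print(cors_allows(ui_origin, without_cors))  # False
print(cors_allows(ui_origin, with_cors))     # True
```

On the server side, the linked page describes adding org.apache.hadoop.security.HttpCrossOriginFilterInitializer to hadoop.http.filter.initializers in core-site.xml and setting yarn.resourcemanager.webapp.cross-origin.enabled to true in yarn-site.xml. Please verify the exact property names against the documentation for your Hadoop version.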
Is proxy supposed to redirect http://rm:8088/proxy/application_1475091857089_0021/ws/v1/tez/dagProgress?dagID=1&_=1475104772228 to http://timelineserver:8188/ws/v1/tez/dagProgress?dagID=1&_=1475104772228

> The proxy is for redirecting to the respective Tez Application Master, i.e.
> http://rm:8088/proxy/application_1475091857089_0021/ points to the application master of app application_1475091857089_0021,
> ws/v1/tez/dagProgress points to a REST endpoint in the AM,
> and the request should return real-time data for dagID=1.

Thanks,
- Sreenath

From: Madhusudan Ramanna <m.rama...@ymail.com>
Reply-To: "user@tez.apache.org" <user@tez.apache.org>, Madhusudan Ramanna <m.rama...@ymail.com>
Date: Thursday, September 29, 2016 at 4:58 AM
To: Hitesh Shah <hit...@apache.org>
Cc: "user@tez.apache.org" <user@tez.apache.org>
Subject: Re: Zip Exception since commit da4098b9

Well, I don't see much in the YARN logs.

However, in the browser console of tez-ui we see:

>>>>>>>>
XMLHttpRequest cannot load http://rm:8088/proxy/application_1475091857089_0021/ws/v1/tez/dagProgress?dagID=1&_=1475104772228. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://tez-dev.internal:8080' is therefore not allowed access.
<<<<<<<

Is proxy supposed to redirect http://rm:8088/proxy/application_1475091857089_0021/ws/v1/tez/dagProgress?dagID=1&_=1475104772228 to http://timelineserver:8188/ws/v1/tez/dagProgress?dagID=1&_=1475104772228

On Wednesday, September 28, 2016 3:02 PM, Hitesh Shah <hit...@apache.org> wrote:

Ok thanks - so it does look like it is publishing data correctly for the most part. You may wish to start digging through the YARN app logs of the applications whose data is not showing up, as well as the YARN timeline logs, to see if there are any exceptions being thrown.

-- Hitesh

> On Sep 28, 2016, at 2:50 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
>
> Here is history.txt.appattempt_1475091857089_0015_000001 (clipped)
>
> {"entity":"tez_application_1475091857089_0015","entitytype":"TEZ_APPLICATION","otherinfo":{"user":"apxqueue","config":""}
> {"entity":"tez_appattempt_1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ATTEMPT","relatedEntities":[{"entity":"application_1475091857089_0015","entitytype":"applicationId"},{"entity":"appattempt_1475091857089_0015_000001","entitytype":"applicationAttemptId"}],"events":[{"ts":1475098751116,"eventtype":"AM_LAUNCHED"}],"otherinfo":{"appSubmitTime":1475098748512}}
> {"entity":"tez_appattempt_1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ATTEMPT","relatedEntities":[{"entity":"application_1475091857089_0015","entitytype":"applicationId"},{"entity":"appattempt_1475091857089_0015_000001","entitytype":"applicationAttemptId"}],"events":[{"ts":1475098753324,"eventtype":"AM_STARTED"}]}
{"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID","relatedEntities":[{"entity":"tez_application_1475091857089_0015","entitytype":"TEZ_APPLICATION"},{"entity":"tez_appattempt_1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ATTEMPT"},{"entity":"application_1475091857089_0015","entitytype":"applicationId"},{"entity":"appattempt_1475091857089_0015_000001","entitytype":"applicationAttemptId"},{"entity":"apxqueue","entitytype":"user"}],"primaryfilters":{"dagName":"pager:0.1.3-SNAPSHOT","callerId":"application_1475091857089_0015","callerType":"Coordinator"},"events":[{"ts":1475098772855,"eventtype":"DAG_SUBMITTED"}],"otherinfo":{"dagPlan":{"dagName":"pager:0.1.3-SNAPSHOT","dagContext":{"callerId":"application_1475091857089_0015","callerType":"Coordinator","context":"Coordinator","description":"Tez > graph > 'pager:0.1.3-SNAPSHOT'"},"version":2,"vertices":[{"vertexName":"parser","processorClass":"com.xyz.cv2.mrv2.ShimMapper","outEdgeIds":["772105978"],"additionalInputs":[{"name":"_initial","class":"org.apache.tez.mapreduce.input.MRInput","initializer":"org.apache.tez.mapreduce.common.MRInputAMSplitGenerator"}]},{"vertexName":"pager","processorClass":"com.xyz.cv2.mrv2.ShimMapper","inEdgeIds":["772105978"]}],"edges":[{"edgeId":"772105978","inputVertexName":"parser","outputVertexName":"pager","dataMovementType":"ONE_TO_ONE","dataSourceType":"PERSISTED","schedulingType":"SEQUENTIAL","edgeSourceClass":"org.apache.tez.runtime.library.output.UnorderedKVOutput","edgeDestinationClass":"org.apache.tez.runtime.library.input.UnorderedKVInput"}]},"callerId":"application_1475091857089_0015","callerType":"Coordinator"}} > {"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID","events":[{"ts":1475098773464,"eventtype":"DAG_INITIALIZED"}],"otherinfo":{"vertexNameIdMapping":{"pager":"vertex_1475091857089_0015_1_01","parser":"vertex_1475091857089_0015_1_00"}}} > 
{"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID","events":[{"ts":1475098773467,"eventtype":"DAG_STARTED"}]} > {"entity":"vertex_1475091857089_0015_1_00","entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}],"events":[{"ts":1475098773628,"eventtype":"VERTEX_INITIALIZED"}],"otherinfo":{"vertexName":"parser","initRequestedTime":1475098773472,"initTime":1475098773628,"numTasks":1,"processorClassName":"com.xyz.cv2.mrv2.ShimMapper","servicePlugin":{"taskSchedulerName":"TezYarn","taskSchedulerClassName":"org.apache.tez.dag.app.rm.YarnTaskSchedulerService","taskCommunicatorName":"TezYarn","taskCommunicatorClassName":"org.apache.tez.dag.app.TezTaskCommunicatorImpl","containerLauncherName":"TezYarn","containerLauncherClassName":"org.apache.tez.dag.app.launcher.TezContainerLauncherImpl"}}} > {"entity":"vertex_1475091857089_0015_1_00","entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}],"events":[{"ts":1475098773630,"eventtype":"VERTEX_STARTED"}],"otherinfo":{"startRequestedTime":1475098773507,"startTime":1475098773630}} > {"entity":"vertex_1475091857089_0015_1_00","entitytype":"TEZ_VERTEX_ID","events":[{"ts":0,"eventtype":"VERTEX_CONFIGURE_DONE","eventinfo":{"numTasks":1}}],"otherinfo":{}} > 
{"entity":"vertex_1475091857089_0015_1_01","entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}],"events":[{"ts":1475098773639,"eventtype":"VERTEX_INITIALIZED"}],"otherinfo":{"vertexName":"pager","initRequestedTime":1475098773507,"initTime":1475098773639,"numTasks":1,"processorClassName":"com.xyz.cv2.mrv2.ShimMapper","servicePlugin":{"taskSchedulerName":"TezYarn","taskSchedulerClassName":"org.apache.tez.dag.app.rm.YarnTaskSchedulerService","taskCommunicatorName":"TezYarn","taskCommunicatorClassName":"org.apache.tez.dag.app.TezTaskCommunicatorImpl","containerLauncherName":"TezYarn","containerLauncherClassName":"org.apache.tez.dag.app.launcher.TezContainerLauncherImpl"}}} > {"entity":"vertex_1475091857089_0015_1_01","entitytype":"TEZ_VERTEX_ID","events":[{"ts":0,"eventtype":"VERTEX_CONFIGURE_DONE","eventinfo":{"numTasks":1,"updatedEdgeManagers":{"parser":{"schedulingType":"SEQUENTIAL","edgeSourceClass":"org.apache.tez.runtime.library.output.UnorderedKVOutput","dataMovementType":"ONE_TO_ONE","edgeDestinationClass":"org.apache.tez.runtime.library.input.UnorderedKVInput","dataSourceType":"PERSISTED"}}}}],"otherinfo":{}} > {"entity":"vertex_1475091857089_0015_1_01","entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}],"events":[{"ts":1475098773642,"eventtype":"VERTEX_STARTED"}],"otherinfo":{"startRequestedTime":1475098773636,"startTime":1475098773642}} > {"entity":"task_1475091857089_0015_1_00_000000","entitytype":"TEZ_TASK_ID","relatedEntities":[{"entity":"vertex_1475091857089_0015_1_00","entitytype":"TEZ_VERTEX_ID"}],"events":[{"ts":1475098773645,"eventtype":"TASK_STARTED"}],"otherinfo":{"startTime":1475098773645,"scheduledTime":1475098773645}} > 
{"entity":"tez_container_1475091857089_0015_01_000002","entitytype":"TEZ_CONTAINER_ID","relatedEntities":[{"entity":"appattempt_1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ATTEMPT"},{"entity":"container_1475091857089_0015_01_000002","entitytype":"containerId"}],"events":[{"ts":1475098775614,"eventtype":"CONTAINER_LAUNCHED"}]} > {"entity":"attempt_1475091857089_0015_1_00_000000_0","entitytype":"TEZ_TASK_ATTEMPT_ID","relatedEntities":[{"entity":"ip-10-1-2-173.us-west-2.compute.internal:8041","entitytype":"nodeId"},{"entity":"container_1475091857089_0015_01_000002","entitytype":"containerId"},{"entity":"task_1475091857089_0015_1_00_000000","entitytype":"TEZ_TASK_ID"}],"events":[{"ts":1475098782472,"eventtype":"TASK_ATTEMPT_STARTED"}],"otherinfo":{"inProgressLogsURL":"ip-10-1-2-173.us-west-2.compute.internal:8042\/node\/containerlogs\/container_1475091857089_0015_01_000002\/apxqueue","completedLogsURL":"http:\/\/ip-10-1-3-71.us-west-2.compute.internal:19888\/jobhistory\/logs\/\/ip-10-1-2-173.us-west-2.compute.internal:8041\/container_1475091857089_0015_01_000002\/v_parser_attempt_1475091857089_0015_1_00_000000_0\/apxqueue"}} > {"entity":"attempt_1475091857089_0015_1_00_000000_0","entitytype":"TEZ_TASK_ATTEMPT_ID","events":[{"ts":1475098785740,"eventtype":"TASK_ATTEMPT_FINISHED"}],"otherinfo":{"creationTime":1475098773667,"allocationTime":1475098775535,"startTime":1475098782472,"endTime":1475098785740,"timeTaken":3268,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{"counterGroupName":"org.apache.tez.common.counters.DAGCounter","counters":[{"counterName":"DATA_LOCAL_TASKS","counterValue":1}]},{"counterGroupName":"org.apache.tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System > 
Counters","counters":[{"counterName":"FILE_BYTES_WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ","counterValue":14297},{"counterName":"HDFS_READ_OPS","counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters.TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS","counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue":7860},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{"counterName":"VIRTUAL_MEMORY_BYTES","counterValue":5392384000},{"counterName":"COMMITTED_HEAP_BYTES","counterValue":265814016},{"counterName":"INPUT_RECORDS_PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_PHYSICAL","counterValue":34}]}]},"lastDataEvents":{"lastDataEvents":[{"TEZ_TASK_ATTEMPT_ID":"","ts":1475098773609}]}}} > {"entity":"task_1475091857089_0015_1_00_000000","entitytype":"TEZ_TASK_ID","events":[{"ts":1475098785751,"eventtype":"TASK_FINISHED"}],"otherinfo":{"startTime":1475098782472,"endTime":1475098785751,"timeTaken":3279,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{"counterGroupName":"org.apache.tez.common.counters.DAGCounter","counters":[{"counterName":"DATA_LOCAL_TASKS","counterValue":1}]},{"counterGroupName":"org.apache.tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System > 
Counters","counters":[{"counterName":"FILE_BYTES_WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ","counterValue":14297},{"counterName":"HDFS_READ_OPS","counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters.TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS","counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue":7860},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{"counterName":"VIRTUAL_MEMORY_BYTES","counterValue":5392384000},{"counterName":"COMMITTED_HEAP_BYTES","counterValue":265814016},{"counterName":"INPUT_RECORDS_PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_PHYSICAL","counterValue":34}]}]},"successfulAttemptId":"attempt_1475091857089_0015_1_00_000000_0"}} > {"entity":"vertex_1475091857089_0015_1_00","entitytype":"TEZ_VERTEX_ID","events":[{"ts":1475098785759,"eventtype":"VERTEX_FINISHED"}],"otherinfo":{"endTime":1475098785759,"timeTaken":12129,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{"counterGroupName":"org.apache.tez.common.counters.DAGCounter","counters":[{"counterName":"DATA_LOCAL_TASKS","counterValue":1}]},{"counterGroupName":"org.apache.tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System > 
Counters","counters":[{"counterName":"FILE_BYTES_WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ","counterValue":14297},{"counterName":"HDFS_READ_OPS","counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters.TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS","counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue":7860},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{"counterName":"VIRTUAL_MEMORY_BYTES","counterValue":5392384000},{"counterName":"COMMITTED_HEAP_BYTES","counterValue":265814016},{"counterName":"INPUT_RECORDS_PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_PHYSICAL","counterValue":34}]}]},"stats":{"firstTaskStartTime":1475098782472,"firstTasksToStart":["task_1475091857089_0015_1_00_000000"],"lastTaskFinishTime":1475098785740,"lastTasksToFinish":["task_1475091857089_0015_1_00_000000"],"minTaskDuration":3268,"maxTaskDuration":3268,"avgTaskDuration":3268,"shortestDurationTasks":["task_1475091857089_0015_1_00_000000"],"longestDurationTasks":["task_1475091857089_0015_1_00_000000"]},"numFailedTaskAttempts":0,"numKilledTaskAttempts":0,"numCompletedTasks":1,"numSucceededTasks":1,"numKilledTasks":0,"numFailedTasks":0,"servicePlugin":{"taskSchedulerName":"TezYarn","taskSchedulerClassName":"org.apache.tez.dag.app.rm.YarnTaskSchedulerService","taskCommunicatorName":"TezYarn","taskCommunicatorClassName":"org.apache.tez.dag.app.TezTaskCommunicatorImpl","containerLauncherName":"TezYarn","containerLauncherClassName":"org.apache.tez.dag.app.launcher.TezContainerLauncherImpl"}}} > {"entity":"task_1475091857089_0015_1_01_000000","entitytype":"TEZ_TASK_ID","relatedEntities":[{"entity":"vertex_1475091857089_0015_1_01","entitytype":"TEZ_VERTEX_ID"}],"events":[{"ts":1475098785787,"eventtype":"TASK_STARTED"}],"otherinfo":{"startTime":1475098785787,"scheduledTime":1475098785787}} > 
{"entity":"attempt_1475091857089_0015_1_01_000000_0","entitytype":"TEZ_TASK_ATTEMPT_ID","relatedEntities":[{"entity":"ip-10-1-2-173.us-west-2.compute.internal:8041","entitytype":"nodeId"},{"entity":"container_1475091857089_0015_01_000002","entitytype":"containerId"},{"entity":"task_1475091857089_0015_1_01_000000","entitytype":"TEZ_TASK_ID"}],"events":[{"ts":1475098785847,"eventtype":"TASK_ATTEMPT_STARTED"}],"otherinfo":{"inProgressLogsURL":"ip-10-1-2-173.us-west-2.compute.internal:8042\/node\/containerlogs\/container_1475091857089_0015_01_000002\/apxqueue","completedLogsURL":"http:\/\/ip-10-1-3-71.us-west-2.compute.internal:19888\/jobhistory\/logs\/\/ip-10-1-2-173.us-west-2.compute.internal:8041\/container_1475091857089_0015_01_000002\/v_pager_attempt_1475091857089_0015_1_01_000000_0\/apxqueue"}} > {"entity":"attempt_1475091857089_0015_1_01_000000_0","entitytype":"TEZ_TASK_ATTEMPT_ID","events":[{"ts":1475098785988,"eventtype":"TASK_ATTEMPT_FINISHED"}],"otherinfo":{"creationTime":1475098785788,"allocationTime":1475098785793,"startTime":1475098785847,"endTime":1475098785988,"timeTaken":141,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{"counterGroupName":"org.apache.tez.common.counters.DAGCounter","counters":[{"counterName":"OTHER_LOCAL_TASKS","counterValue":1}]},{"counterGroupName":"org.apache.tez.common.counters.TaskCounter","counters":[{"counterName":"CPU_MILLISECONDS","counterValue":330},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{"counterName":"VIRTUAL_MEMORY_BYTES","counterValue":5404876800},{"counterName":"COMMITTED_HEAP_BYTES","counterValue":265814016},{"counterName":"SHUFFLE_PHASE_TIME","counterValue":27},{"counterName":"FIRST_EVENT_RECEIVED","counterValue":25},{"counterName":"LAST_EVENT_RECEIVED","counterValue":25}]}]},"lastDataEvents":{"lastDataEvents":[{"TEZ_TASK_ATTEMPT_ID":"attempt_1475091857089_0015_1_00_000000_0","ts":1475098785739}]}}} > 
{"entity":"task_1475091857089_0015_1_01_000000","entitytype":"TEZ_TASK_ID","events":[{"ts":1475098785989,"eventtype":"TASK_FINISHED"}],"otherinfo":{"startTime":1475098785847,"endTime":1475098785989,"timeTaken":142,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{"counterGroupName":"org.apache.tez.common.counters.DAGCounter","counters":[{"counterName":"OTHER_LOCAL_TASKS","counterValue":1}]},{"counterGroupName":"org.apache.tez.common.counters.TaskCounter","counters":[{"counterName":"CPU_MILLISECONDS","counterValue":330},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{"counterName":"VIRTUAL_MEMORY_BYTES","counterValue":5404876800},{"counterName":"COMMITTED_HEAP_BYTES","counterValue":265814016},{"counterName":"SHUFFLE_PHASE_TIME","counterValue":27},{"counterName":"FIRST_EVENT_RECEIVED","counterValue":25},{"counterName":"LAST_EVENT_RECEIVED","counterValue":25}]}]},"successfulAttemptId":"attempt_1475091857089_0015_1_01_000000_0"}} > {"entity":"vertex_1475091857089_0015_1_01","entitytype":"TEZ_VERTEX_ID","events":[{"ts":1475098785990,"eventtype":"VERTEX_FINISHED"}],"otherinfo":{"endTime":1475098785990,"timeTaken":12348,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{"counterGroupName":"org.apache.tez.common.counters.DAGCounter","counters":[{"counterName":"OTHER_LOCAL_TASKS","counterValue":1}]},{"counterGroupName":"org.apache.tez.common.counters.TaskCounter","counters":[{"counterName":"CPU_MILLISECONDS","counterValue":330},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{"counterName":"VIRTUAL_MEMORY_BYTES","counterValue":5404876800},{"counterName":"COMMITTED_HEAP_BYTES","counterValue":265814016},{"counterName":"SHUFFLE_PHASE_TIME","counterValue":27},{"counterName":"FIRST_EVENT_RECEIVED","counterValue":25},{"counterName":"LAST_EVENT_RECEIVED","counterValue":25}]}]},"stats":{"firstTaskStartTime":1475098785847,"firstTasksToStart":["task_1475091857089_0015_1_01_000000"],"lastTaskFinishTime":1475098785988
,"lastTasksToFinish":["task_1475091857089_0015_1_01_000000"],"minTaskDuration":141,"maxTaskDuration":141,"avgTaskDuration":141,"shortestDurationTasks":["task_1475091857089_0015_1_01_000000"],"longestDurationTasks":["task_1475091857089_0015_1_01_000000"]},"numFailedTaskAttempts":0,"numKilledTaskAttempts":0,"numCompletedTasks":1,"numSucceededTasks":1,"numKilledTasks":0,"numFailedTasks":0,"servicePlugin":{"taskSchedulerName":"TezYarn","taskSchedulerClassName":"org.apache.tez.dag.app.rm.YarnTaskSchedulerService","taskCommunicatorName":"TezYarn","taskCommunicatorClassName":"org.apache.tez.dag.app.TezTaskCommunicatorImpl","containerLauncherName":"TezYarn","containerLauncherClassName":"org.apache.tez.dag.app.launcher.TezContainerLauncherImpl"}}} > {"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID","events":[{"ts":1475098785994,"eventtype":"DAG_FINISHED"}],"otherinfo":{"startTime":1475098773467,"endTime":1475098785994,"timeTaken":12527,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{"counterGroupName":"org.apache.tez.common.counters.DAGCounter","counters":[{"counterName":"NUM_SUCCEEDED_TASKS","counterValue":2},{"counterName":"TOTAL_LAUNCHED_TASKS","counterValue":2},{"counterName":"OTHER_LOCAL_TASKS","counterValue":1},{"counterName":"DATA_LOCAL_TASKS","counterValue":1},{"counterName":"AM_CPU_MILLISECONDS","counterValue":1730},{"counterName":"AM_GC_TIME_MILLIS","counterValue":82}]},{"counterGroupName":"org.apache.tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System > 
Counters","counters":[{"counterName":"FILE_BYTES_WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ","counterValue":14297},{"counterName":"HDFS_READ_OPS","counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters.TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS","counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue":8190},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":531628032},{"counterName":"VIRTUAL_MEMORY_BYTES","counterValue":10797260800},{"counterName":"COMMITTED_HEAP_BYTES","counterValue":531628032},{"counterName":"INPUT_RECORDS_PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_PHYSICAL","counterValue":34},{"counterName":"SHUFFLE_PHASE_TIME","counterValue":27},{"counterName":"FIRST_EVENT_RECEIVED","counterValue":25},{"counterName":"LAST_EVENT_RECEIVED","counterValue":25}]}]},"completionApplicationAttemptId":"appattempt_1475091857089_0015_000001","numFailedTaskAttempts":0,"numKilledTaskAttempts":0,"numCompletedTasks":2,"numSucceededTasks":2,"numKilledTasks":0,"numFailedTasks":0}} > {"entity":"tez_container_1475091857089_0015_01_000002","entitytype":"TEZ_CONTAINER_ID","relatedEntities":[{"entity":"appattempt_1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ATTEMPT"},{"entity":"container_1475091857089_0015_01_000002","entitytype":"containerId"}],"events":[{"ts":1475098794497,"eventtype":"CONTAINER_STOPPED"}],"otherinfo":{"exitStatus":0}} > > > > > On Wednesday, September 28, 2016 1:52 PM, Hitesh Shah > <hit...@apache.org<mailto:hit...@apache.org>> wrote: > > > To pinpoint the issue, one approach would be to change the history logger to > SimpleHistoryLogger . i.e comment out the property for > tez.history.logging.service.class in the configs so that it falls back to the > default value. 
This should generate a history log file as part of the application logs, which should help us understand whether Tez itself is not generating the data or YARN timeline is somehow losing it. Any exceptions in the DAGAppMaster log and/or the YARN timeline logs when this job runs?
>
> -- Hitesh
>
> > On Sep 28, 2016, at 1:30 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
> >
> > Hitesh,
> >
> > Some information like appId is getting through to timeline server, but not all. See attached.
> >
> > Here is the output of
> >
> > http://timelinehost:port/ws/v1/timeline/TEZ_DAG_ID/
> > {"entities":[{"events":[{"timestamp":1475094093409,"eventtype":"DAG_FINISHED","eventinfo":{}},{"timestamp":1475094062692,"eventtype":"DAG_STARTED","eventinfo":{}},{"timestamp":1475094062688,"eventtype":"DAG_INITIALIZED","eventinfo":{}},{"timestamp":1475094062055,"eventtype":"DAG_SUBMITTED","eventinfo":{}}],"entitytype":"TEZ_DAG_ID","entity":"dag_1475091857089_0007_1","starttime":1475094062055,"domain":"DEFAULT","relatedentities":{},"primaryfilters":{},"otherinfo":{}}]}
> >
> > http://host:8188/ws/v1/timeline/TEZ_DAG_ID/dag_1475091857089_0007_1
> >
> > {"events":[{"timestamp":1475094093409,"eventtype":"DAG_FINISHED","eventinfo":{}},{"timestamp":1475094062692,"eventtype":"DAG_STARTED","eventinfo":{}},{"timestamp":1475094062688,"eventtype":"DAG_INITIALIZED","eventinfo":{}},{"timestamp":1475094062055,"eventtype":"DAG_SUBMITTED","eventinfo":{}}],"entitytype":"TEZ_DAG_ID","entity":"dag_1475091857089_0007_1","starttime":1475094062055,"domain":"DEFAULT","relatedentities":{},"primaryfilters":{},"otherinfo":{}}
> >
> > On Wednesday, September 28, 2016 8:44 AM, Hitesh Shah <hit...@apache.org> wrote:
> >
> > Hello Madhusudan,
> >
> > Thanks for the patience. Let us take this to a jira where, once you attach more logs, we can root cause the issue.
> >
> > A few things to attach to the jira:
> > - yarn-site.xml
> > - tez-site.xml
> > - hadoop version
> > - timeline server log for the time period in question
> > - application logs for any tez app which fails to display
> > - output of http://timelinehost:port/ws/v1/timeline/TEZ_DAG_ID/<dag_id>/ (e.g. dag_1475014682883_0027_1)
> >
> > thanks
> > -- Hitesh
> >
> > > On Sep 27, 2016, at 10:42 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
> > >
> > > So I downloaded Tez commit 91a397b0ba and built the dist package. We're not seeing the zip exception anymore.
> > >
> > > However, now Tez UI is completely broken. Not at all sure what is happening here. Please see attached screenshots.
> > >
> > > 2016-09-28 05:11:40,903 [INFO] [main] |web.WebUIService|: Tez UI History URL: http://dev-cv2.aws:8080/tez-ui/#/tez-app/application_1475014682883_0027
> > > 2016-09-28 05:11:40,908 [INFO] [main] |history.HistoryEventHandler|: Initializing HistoryEventHandler withrecoveryEnabled=true, historyServiceClassName=org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService
> > > 2016-09-28 05:11:41,474 [INFO] [main] |impl.TimelineClientImpl|: Timeline service address: http://ts-ip.aws:8188/ws/v1/timeline/
> > > 2016-09-28 05:11:41,474 [INFO] [main] |ats.ATSHistoryLoggingService|: Initializing ATSHistoryLoggingService with maxEventsPerBatch=5, maxPollingTime(ms)=10, waitTimeForShutdown(ms)=-1, TimelineACLManagerClass=org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager
> > > 2016-09-28 05:11:41,644 [INFO] [main] |impl.TimelineClientImpl|: Timeline service address: http://ts-ip.aws:8188/ws/v1/timeline/
> > >
> > > >>> DAG Execution
> > >
> > > 2016-09-28 05:11:52,779 [INFO] [IPC Server handler 0 on 44039] |history.HistoryEventHandler|: [HISTORY][DAG:dag_1475014682883_0027_1][Event:DAG_SUBMITTED]: dagID=dag_1475014682883_0027_1, submitTime=1475039511185
> > >
> > > Timeline server is up and running. However, Tez UI is not able to display the DAG and other details.
> > >
> > > thanks,
> > > Madhu
> > >
> > > On Saturday, September 24, 2016 12:01 PM, Hitesh Shah <hit...@apache.org> wrote:
> > >
> > > tez-dist tarballs are not published to Maven today; only the module-specific jars are. But yes, you could just try a local build to see if you can reproduce the issue with the commit in question.
> > >
> > > -- Hitesh
> > >
> > > > On Sep 23, 2016, at 6:23 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
> > > >
> > > > Hitesh and Zhiyuan,
> > > >
> > > > Apache snapshots don't seem to have tez-dist:
> > > >
> > > > http://repository.apache.org/content/groups/snapshots/org/apache/tez/tez-dist/
> > > >
> > > > The last one seems to be 0.2.0-SNAPSHOT.
> > > >
> > > > Should I just download based on the commit and recompile?
> > > >
> > > > thanks,
> > > > Madhu
> > > >
> > > > On Friday, September 23, 2016 5:19 PM, Hitesh Shah <hit...@apache.org> wrote:
> > > >
> > > > Hello Madhusudan,
> > > >
> > > > If you look at the MANIFEST.MF inside any of the tez jars, it will provide the commit hash via the SCM-Revision field.
> > > >
> > > > The tez client and the DAGAppMaster also log this info at runtime.
> > > >
> > > > -- Hitesh
> > > >
> > > > > On Sep 23, 2016, at 4:08 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
> > > > >
> > > > > Zhiyuan,
> > > > >
> > > > > We just pulled down the latest snapshot from the Apache repository. The question is: how can I figure out the branch and commit information from the snapshot artifact?
> > > > >
> > > > > thanks,
> > > > > Madhu
> > > > >
> > > > > On Friday, September 23, 2016 10:38 AM, zhiyuan yang <sjtu....@gmail.com> wrote:
> > > > >
> > > > > Hi Madhu,
> > > > >
> > > > > It looks like an Inflater-Deflater mismatch to me. From the stack traces I see you cherry-picked this patch instead of using the master branch. Would you mind double-checking whether the patch was cherry-picked correctly?
> > > > >
> > > > > Thanks!
> > > > > Zhiyuan
> > > > >
> > > > >> On Sep 23, 2016, at 10:21 AM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
> > > > >>
> > > > >> Hello,
> > > > >>
> > > > >> We're using the Apache snapshot repository to pull the latest tez snapshots.
> > > > >>
> > > > >> We've started seeing this exception:
> > > > >>
> > > > >> org.apache.tez.dag.api.TezUncheckedException: java.util.zip.ZipException: incorrect header check
> > > > >>     at org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager.handleVertexManagerEvent(ShuffleVertexManager.java:622)
> > > > >>     at org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager.onVertexManagerEventReceived(ShuffleVertexManager.java:579)
> > > > >>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventReceived.invoke(VertexManager.java:606)
> > > > >>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:647)
> > > > >>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:642)
> > > > >>     at java.security.AccessController.doPrivileged(Native Method)
> > > > >>     at javax.security.auth.Subject.doAs(Subject.java:422)
> > > > >>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> > > > >>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:642)
> > > > >>     at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:631)
> > > > >>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > > > >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > > >>     at java.lang.Thread.run(Thread.java:745)
> > > > >> Caused by: java.util.zip.ZipException: incorrect header check
> > > > >>     at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
> > > > >>     at java.io.FilterInputStream.read(FilterInputStream.java:107)
> > > > >>     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1792)
> > > > >>     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
> > > > >>     at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
> > > > >>     at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:462)
> > > > >>
> > > > >> since this commit
> > > > >>
> > > > >> https://github.com/apache/tez/commit/da4098b9d6f72e6d4aacc1623622a0875408d2ba
> > > > >>
> > > > >> Wanted to bring this to your attention. For now we've locked the snapshot version down.
> > > > >>
> > > > >> thanks,
> > > > >> Madhu
> > >
> > > <Screen Shot 2016-09-27 at 10.27.02 PM.png><Screen Shot 2016-09-27 at 10.27.13 PM.png><Screen Shot 2016-09-27 at 10.39.20 PM.png>
> >
> > <Screen Shot 2016-09-28 at 1.26.35 PM.png>
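
For readers unfamiliar with the "incorrect header check" message in the stack trace above: it is the error zlib reports when a reader that expects a zlib-wrapped stream is handed data written in a different framing, which is one way an Inflater-Deflater mismatch can show up. The sketch below reproduces that situation with Python's zlib module. It is an illustration under assumptions, not a statement about what the Tez commit changed; the payload and variable names are made up.

```python
import zlib

payload = b"vertex manager event payload"

# Writer A: raw DEFLATE stream. Negative wbits means no zlib header or checksum.
raw_writer = zlib.compressobj(wbits=-15)
raw_bytes = raw_writer.compress(payload) + raw_writer.flush()

# Writer B: zlib-wrapped stream. The default settings add a two-byte header.
wrapped_bytes = zlib.compress(payload)

# Matching writer/reader pairs round-trip without trouble.
assert zlib.decompress(wrapped_bytes) == payload
assert zlib.decompress(raw_bytes, wbits=-15) == payload

# Mismatch: a reader expecting the zlib header receives raw DEFLATE data.
try:
    zlib.decompress(raw_bytes)
    mismatch_error = None
except zlib.error as exc:
    mismatch_error = str(exc)

print(mismatch_error)
```

The same kind of mismatch can occur between two builds of a system when the side producing compressed payloads and the side consuming them come from different revisions, which is why checking that every jar on the classpath comes from the same commit is a reasonable first step.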