1. Is it possible to determine from the tez history logs, what the bottleneck for a task/vertex is? Whether it is compute, disk or network?
- Vertex counters and task counters for the vertex can be looked at to determine this. If you have enabled ATS, they are available in the Tez UI itself; otherwise they are available in the job logs. However, the slowness is not always directly related to compute/disk/network. Sometimes the vertex is delayed because it has to wait for data from a source vertex (think of it more as a data dependency), sometimes due to re-execution of tasks in the source vertex after failures (e.g. bad disks), or due to cluster slot unavailability, and so on. You can also look at the CriticalPathAnalyzer (an early version is available in 0.8.x), which helps determine the critical path of the DAG and hence whether a vertex was slow due to one of these conditions. E.g.

HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$HADOOP_CLASSPATH yarn jar $TEZ_HOME/tez-job-analyzer-0.8.2-SNAPSHOT.jar CriticalPath --outputDir=/tmp/ --dagId=dag_1443665985063_58064_1

2. What are the common ways to get Tez work on data in memory, as opposed to reading from HDFS. This is to minimize the duration mappers spend in reading from HDFS or disk.

- Not sure if you are trying to compare this with Spark's way of loading data into memory and working on it. Tez does not have a direct equivalent for that, but it does have an ObjectRegistry (look at BroadcastAndOneToOneExample <https://github.com/apache/tez/blob/b153035b076d4603eb6bc771d675d64181eb02e9/tez-tests/src/main/java/org/apache/tez/mapreduce/examples/BroadcastAndOneToOneExample.java> in the Tez codebase) where data can be cached in memory and shared between tasks; a rough sketch is included at the end of this mail.

~Rajesh.B

On Tue, Dec 1, 2015 at 12:33 AM, Raajay <[email protected]> wrote:
> Hello,
>
> Two questions
>
> 1. Is it possible to determine from the tez history logs, what the
> bottleneck for a task/vertex is? Whether it is compute, disk or network?
>
> 2. What are the common ways to get Tez work on data in memory, as opposed
> to reading from HDFS. This is to minimize the duration mappers spend in
> reading from HDFS or disk.
>
> Thanks
> Raajay
>
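P.S. Here is a minimal sketch of how a custom processor could use the ObjectRegistry. It assumes a DAG-scoped cache key ("sharedLookupTable") and a hypothetical buildLookupTable() helper that loads the data once; later tasks of the same DAG that land on the same container can then pick the object up from memory instead of re-reading it from HDFS. Take it as an illustration, not as the exact pattern used in BroadcastAndOneToOneExample.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.tez.runtime.api.AbstractLogicalIOProcessor;
import org.apache.tez.runtime.api.Event;
import org.apache.tez.runtime.api.LogicalInput;
import org.apache.tez.runtime.api.LogicalOutput;
import org.apache.tez.runtime.api.ObjectRegistry;
import org.apache.tez.runtime.api.ProcessorContext;

public class CachingProcessor extends AbstractLogicalIOProcessor {

  public CachingProcessor(ProcessorContext context) {
    super(context);
  }

  @Override
  public void initialize() throws Exception {
    // nothing to set up in this sketch
  }

  @Override
  @SuppressWarnings("unchecked")
  public void run(Map<String, LogicalInput> inputs,
      Map<String, LogicalOutput> outputs) throws Exception {
    ObjectRegistry registry = getContext().getObjectRegistry();

    // Check whether an earlier task of this DAG (in the same container)
    // already loaded the data; "sharedLookupTable" is just an example key.
    Map<String, String> table =
        (Map<String, String>) registry.get("sharedLookupTable");
    if (table == null) {
      table = buildLookupTable(); // hypothetical loader, e.g. reads a small file once
      // Keep it in memory for the lifetime of the DAG so later tasks scheduled
      // on this container can reuse it without going back to HDFS.
      registry.cacheForDAG("sharedLookupTable", table);
    }

    // ... process inputs/outputs using 'table' ...
  }

  @Override
  public void handleEvents(List<Event> processorEvents) {
    // no custom events in this sketch
  }

  @Override
  public void close() throws Exception {
    // nothing to clean up
  }

  private Map<String, String> buildLookupTable() {
    return new HashMap<String, String>(); // placeholder for the real load
  }
}

Note that the registry is scoped to the container JVM, so it only helps when tasks are scheduled on the same (reused) container; it is not a cluster-wide in-memory store.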
