[jira] [Commented] (TEZ-3074) Multithreading issue java.lang.ArrayIndexOutOfBoundsException: -1 while working with Tez
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230029#comment-15230029 ] Alina Abramova commented on TEZ-3074:

After investigating, I see that the cause of this issue is concurrent writing to the files from which Tez is trying to compute splits: the files start being read before writing and/or the split calculation finishes. Are splits calculated separately from reading files in Tez?

> Multithreading issue java.lang.ArrayIndexOutOfBoundsException: -1 while working with Tez
>
> Key: TEZ-3074
> URL: https://issues.apache.org/jira/browse/TEZ-3074
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.3
> Reporter: Oleksiy Sayankin
> Fix For: 0.5.3
> Attachments: tempsource.data
>
> *STEP 1. Install and configure Tez on YARN*
> *STEP 2. Configure Hive for Tez*
> *STEP 3. Create test tables in Hive and fill them with data*
> Enable dynamic partitioning in Hive. Add the following to {{hive-site.xml}} and restart Hive.
> {code:xml}
> <property>
>   <name>hive.exec.dynamic.partition</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.exec.dynamic.partition.mode</name>
>   <value>nonstrict</value>
> </property>
> <property>
>   <name>hive.exec.max.dynamic.partitions.pernode</name>
>   <value>2000</value>
> </property>
> <property>
>   <name>hive.exec.max.dynamic.partitions</name>
>   <value>2000</value>
> </property>
> {code}
> Execute on the command line:
> {code}
> hadoop fs -put tempsource.data /
> {code}
> Execute in Hive, using the attached file {{tempsource.data}}:
> {code}
> hive> CREATE TABLE test3 (x INT, y STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> hive> CREATE TABLE ptest1 (x INT, y STRING) PARTITIONED BY (z STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> hive> CREATE TABLE tempsource (x INT, y STRING, z STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> hive> LOAD DATA INPATH '/tempsource.data' OVERWRITE INTO TABLE tempsource;
> hive> INSERT OVERWRITE TABLE ptest1 PARTITION (z) SELECT x,y,z FROM tempsource;
> {code}
> *STEP 4. Mount NFS on cluster*
> *STEP 5. Run teragen test application*
> Use a separate console:
> {code}
> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.5.1.jar teragen -Dmapred.map.tasks=7 -Dmapreduce.map.disk=0 -Dmapreduce.map.cpu.vcores=0 10 /user/hdfs/input
> {code}
> *STEP 6. Create many test files*
> Use a separate console:
> {code}
> cd /hdfs/cluster/user/hive/warehouse/ptest1/z=66
> for i in `seq 1 1`; do dd if=/dev/urandom of=tempfile$i bs=1M count=1; done
> {code}
> *STEP 7. Run the following query repeatedly in another console*
> {code}
> hive> insert overwrite table test3 select x,y from (select x,y,z from (select x,y,z from ptest1 where x > 5 and x < 1000 union all select x,y,z from ptest1 where x > 5 and x < 1000) a) b;
> {code}
> After some time it fails with an exception:
> {noformat}
> Status: Failed
> Vertex failed, vertexName=Map 3, vertexId=vertex_1443452487059_0426_1_01, diagnostics=[Vertex vertex_1443452487059_0426_1_01 [Map 3] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: ptest1 initializer failed, vertex=vertex_1443452487059_0426_1_01 [Map 3], java.lang.ArrayIndexOutOfBoundsException: -1
>         at org.apache.hadoop.mapred.FileInputFormat.getBlockIndex(FileInputFormat.java:395)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplitHostsAndCachedHosts(FileInputFormat.java:579)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:359)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:300)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:402)
>         at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:132)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
>         at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> ]
> Vertex killed, vertexName=Map 1, vertexId=vertex_1443452487059_0426_1_00, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1443452487059_0426_1_00 [Map 1] killed/failed due to:null]
> DAG failed due to vertex failure. failedVertices:1
> {noformat}
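For context, the failing frame, FileInputFormat.getBlockIndex, walks a file's block locations and falls back to indexing the last element when the offset is past every block. The following is a simplified, self-contained stand-in (an illustration modeled on the stack trace, not Hadoop's actual source) showing how an empty block-location array, e.g. for a file whose blocks are not yet reported because it is still being written, yields exactly ArrayIndexOutOfBoundsException with index -1:

```java
// Simplified stand-in for FileInputFormat.getBlockIndex (illustrative only).
// Each "block" is {offset, length}.
final class GetBlockIndexDemo {
    static int getBlockIndex(long[][] blkLocations, long offset) {
        for (int i = 0; i < blkLocations.length; i++) {
            long start = blkLocations[i][0];
            long len = blkLocations[i][1];
            if (start <= offset && offset < start + len) {
                return i; // offset falls inside this block
            }
        }
        // Fall-through handling indexes the last block. With an empty array
        // (no block locations reported yet), length - 1 == -1 and this throws.
        long[] last = blkLocations[blkLocations.length - 1];
        throw new IllegalArgumentException(
            "Offset " + offset + " is past the last block at " + last[0]);
    }

    public static void main(String[] args) {
        long[][] noBlocksYet = new long[0][]; // file still being written
        try {
            getBlockIndex(noBlocksYet, 0L);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught ArrayIndexOutOfBoundsException for index "
                + (noBlocksYet.length - 1));
        }
    }
}
```

This matches the hypothesis in the comment above: if split calculation races with file writes, the split generator can see a non-zero file length while the block-location array is still empty.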
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141602#comment-15141602 ] Siddharth Seth commented on TEZ-3074:

Are these client nodes? Do you mean configuring Hive to run with MR when launched from Node3 and Node4, and running Hive with Tez when running from Node1 and Node2? It's simple enough to change the configuration on a single node, or within the Hive script, to run with either MR or Tez.
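Switching engines per session, as suggested above, needs only a {{SET}} inside the Hive session or script. A sketch, assuming the standard {{hive.execution.engine}} property:

{code}
hive> SET hive.execution.engine=mr;   -- this session's queries run on MapReduce
hive> SET hive.execution.engine=tez;  -- this session's queries run on Tez
{code}

This avoids any cluster-wide reconfiguration when comparing the two engines on the same nodes.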
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140692#comment-15140692 ] Oleksiy Sayankin commented on TEZ-3074:

Siddharth, is it acceptable to configure a cluster with Tez in the following manner:
* Node 1. Configured for Tez
* Node 2. Configured for Tez
* Node 3. Configured for MapReduce
* Node 4. Configured for MapReduce
and then run the job in this configuration?
PS: I remember your questions; our test team is preparing the logs for you.
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122115#comment-15122115 ] Siddharth Seth commented on TEZ-3074:

These methods aren't used by Hive or Tez when AM split generation is enabled; they're primarily for client-side split generation. The trace shows AM split generation being used. What's happening here is that Hive submits a payload containing the paths to the Tez AM. It then runs the HiveSplitGenerator, which actually invokes getSplits. These splits are sent to tasks via RPC. localFiles are not used anywhere in this process.

The ideal place to log this would be FileInputFormat itself. Tez includes a version of the hadoop-mapreduce jars in its assembly, so this would involve recompiling Hadoop and rebuilding Tez with the new Hadoop bits. You could also try fetching the list of files for which splits are being generated by logging conf.get("mapred.input.dir") in HiveInputFormat before it invokes getSplits, or alternately by invoking inputFormat.getInputPaths in HiveInputFormat.

Another possible option is to set "mapreduce.input.fileinputformat.list-status.num-threads" to 1, which will cause the splits to be generated in a single thread in FileInputFormat. This is the default behaviour though.
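The last workaround above can be applied straight from the Hive session; a sketch (the value 1 is an assumption, matching "single thread" and "default behaviour" in the comment):

{code}
hive> SET mapreduce.input.fileinputformat.list-status.num-threads=1;
{code}

Since 1 is already the default, this mainly rules out multi-threaded file listing as a contributing factor rather than changing behaviour.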
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15121261#comment-15121261 ] Oleksiy Sayankin commented on TEZ-3074:

Yes, turning off Tez and using plain MapReduce fixes the issue. But our customer wants to use Tez to speed up Hive queries. Note that these steps only simulate the production cluster's behaviour; they do not reproduce it exactly. They were found by our support team. To figure out what is going on with the block locations and why blkLocations.length = 0, we added logging statements to the Tez sources. Here they are:

{code:title=org.apache.tez.dag.api.DAG.java|borderStyle=solid}
public synchronized DAG addTaskLocalFiles(Map<String, LocalResource> localFiles) {
  Preconditions.checkNotNull(localFiles);
  logLocalFiles(localFiles);
  logCommonTaskLocalFiles(commonTaskLocalFiles);
  TezCommonUtils.addAdditionalLocalResources(localFiles, commonTaskLocalFiles, "DAG " + getName());
  return this;
}

private static void logLocalFiles(Map<String, LocalResource> localFiles) {
  LOG.info("###@@@ localFiles:");
  for (Map.Entry<String, LocalResource> entry : localFiles.entrySet()) {
    String key = entry.getKey();
    LocalResource localResource = entry.getValue();
    LOG.info("###@@@001 key = " + key
        + ", localResource.getSize() = " + localResource.getSize()
        + ", localResource.getType() = " + localResource.getType()
        + ", localResource.getVisibility() = " + localResource.getVisibility());
  }
}

private static void logCommonTaskLocalFiles(Map<String, LocalResource> commonTaskLocalFiles) {
  LOG.info("###@@@ commonTaskLocalFiles:");
  for (Map.Entry<String, LocalResource> entry : commonTaskLocalFiles.entrySet()) {
    String key = entry.getKey();
    LocalResource localResource = entry.getValue();
    LOG.info("###@@@002 key = " + key
        + ", localResource.getSize() = " + localResource.getSize()
        + ", localResource.getType() = " + localResource.getType()
        + ", localResource.getVisibility() = " + localResource.getVisibility());
  }
}
{code}

and

{code:title=org.apache.tez.mapreduce.hadoop.MRInputHelpers.java|borderStyle=solid}
private static void updateLocalResourcesForInputSplits(
    FileSystem fs, InputSplitInfo inputSplitInfo,
    Map<String, LocalResource> localResources) throws IOException {
  if (localResources.containsKey(JOB_SPLIT_RESOURCE_NAME)) {
    throw new RuntimeException("LocalResources already contains a"
        + " resource named " + JOB_SPLIT_RESOURCE_NAME);
  }
  if (localResources.containsKey(JOB_SPLIT_METAINFO_RESOURCE_NAME)) {
    throw new RuntimeException("LocalResources already contains a"
        + " resource named " + JOB_SPLIT_METAINFO_RESOURCE_NAME);
  }
  LOG.info("###@@@003 inputSplitInfo.getSplitsFile() = " + inputSplitInfo.getSplitsFile());
{code}

But it gave nothing: the exception happened before any {noformat}###@@@{noformat} tag was printed.
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117590#comment-15117590 ] Hitesh Shah commented on TEZ-3074:

\cc [~hagleitn] [~sseth]
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117588#comment-15117588 ] Hitesh Shah commented on TEZ-3074:

What version of Hive was this run against?
Create many test files* > Use separate console > {code} > cd /hdfs/cluster/user/hive/warehouse/ptest1/z=66 > for i in `seq 1 1`; do dd if=/dev/urandom of=tempfile$i bs=1M count=1; > done > {code} > *STEP 7. Run the following query repeatedly in other console* > Use separate console > {code} > hive> insert overwrite table test3 select x,y from ( select x,y,z from > (select x,y,z from ptest1 where x > 5 and x < 1000 union all select x,y,z > from ptest1 where x > 5 and x < 1000) a)b; > {code} > After some time of working it gives an exception. > {noformat} > Status: Failed > Vertex failed, vertexName=Map 3, vertexId=vertex_1443452487059_0426_1_01, > diagnostics=[Vertex vertex_1443452487059_0426_1_01 [Map 3] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: ptest1 initializer failed, > vertex=vertex_1443452487059_0426_1_01 [Map 3], > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.hadoop.mapred.FileInputFormat.getBlockIndex(FileInputFormat.java:395) > at > org.apache.hadoop.mapred.FileInputFormat.getSplitHostsAndCachedHosts(FileInputFormat.java:579) > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:359) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:300) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:402) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:132) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) > at > 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > ] > Vertex killed, vertexName=Map 1, vertexId=vertex_1443452487059_0426_1_00, > diagnostics=[Vertex received Kill in INITED state., Vertex > vertex_1443452487059_0426_1_00 [Map 1] killed/failed due to:null] > DAG failed due to vertex failure.
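The thread's working hypothesis is that split calculation reads file metadata while the files in step 6 are still being written. The following is a hypothetical single-process sketch (plain java.nio on a local temp file, not Hadoop code) of the general hazard: a metadata snapshot taken mid-write no longer matches the finished file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the suspected race: a "split calculation"
// that snapshots file metadata while the writer is still appending sees
// a size that no longer matches the file once writing finishes.
public class MidWriteSnapshot {
    static long[] sizes() throws IOException {
        Path f = Files.createTempFile("tempfile", ".data");
        try (OutputStream out = Files.newOutputStream(f)) {
            out.write(new byte[512]);
            out.flush();
            long snapshot = Files.size(f);   // metadata read mid-write
            out.write(new byte[512]);        // the writer keeps appending
            out.flush();
            long finished = Files.size(f);   // metadata after writing is done
            return new long[]{snapshot, finished};
        } finally {
            Files.deleteIfExists(f);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] s = sizes();
        System.out.println("snapshot=" + s[0] + " finished=" + s[1]);
    }
}
```

In the reported scenario the stale view is worse than a wrong length: the initializer can observe a file whose block-location list is not yet populated, which is what the stack trace's ArrayIndexOutOfBoundsException points at.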
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117653#comment-15117653 ]

Oleksiy Sayankin commented on TEZ-3074:
---------------------------------------

Hive-1.0
[ https://issues.apache.org/jira/browse/TEZ-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118393#comment-15118393 ]

Siddharth Seth commented on TEZ-3074:
-------------------------------------

[~osayankin] - this looks to be an issue in FileInputFormat in MapReduce itself. BlockLocations being an empty array would trigger this at {code}BlockLocation last = blkLocations[blkLocations.length - 1];{code} Have you tried running this with MR?
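The quoted line explains the {{-1}} in the exception: when the block-location array is empty, {{blkLocations.length - 1}} evaluates to -1. A minimal sketch of that arithmetic, using a hypothetical helper with plain {{String[]}} values in place of Hadoop's {{BlockLocation[]}}:

```java
// Hypothetical stand-in for the failing line in FileInputFormat: there is
// no guard against an empty array, so a file with zero recorded block
// locations (e.g. one still being written when splits are computed) makes
// length - 1 evaluate to -1 and throws ArrayIndexOutOfBoundsException.
public class LastBlockLookup {
    static String last(String[] blkLocations) {
        return blkLocations[blkLocations.length - 1]; // length 0 -> index -1
    }

    public static void main(String[] args) {
        System.out.println(last(new String[]{"blk0", "blk1"})); // prints blk1
        try {
            last(new String[0]); // empty, as for an in-flight file
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught ArrayIndexOutOfBoundsException");
        }
    }
}
```

A guard such as returning an empty host list when the array has length zero would avoid the exception, though whether splits should silently skip in-flight files is a separate design question.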