[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-10-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941729#comment-14941729
 ] 

Hitesh Shah commented on TEZ-2602:
--

bq. This is specific to 0.8 codebase and does not need a backport to 0.7.

[~rajesh.balamohan] the earlier comment seems to indicate that this is not 
needed in branch 0.7, but a later comment indicates that it was also 
committed to branch 0.7. Could you please clarify? 

> Throwing EOFException when launching MR job
> ---
>
> Key: TEZ-2602
> URL: https://issues.apache.org/jira/browse/TEZ-2602
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.0-alpha
>Reporter: Tsuyoshi Ozawa
>Assignee: Rajesh Balamohan
> Fix For: 0.8.0-alpha, 0.7.1
>
> Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch
>
>
> {quote}
> $hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount -Dmapreduce.framework.name=yarn-tez -Dmapred.reduce.tasks=15 -Dtez.runtime.sort.threads=1 wc10g tezwc10g5
> 15/07/07 13:24:30 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8081
> 15/07/07 13:24:30 INFO client.AHSProxy: Connecting to Application History server at /0.0.0.0:10200
> 15/07/07 13:24:30 INFO mapreduce.Job: The url to track the job: http://ip-172-31-4-8.ap-northeast-1.compute.internal:8088/proxy/application_1435943097882_0019/
> 15/07/07 13:24:30 INFO mapreduce.Job: Running job: job_1435943097882_0019
> 15/07/07 13:24:31 INFO mapreduce.Job: Job job_1435943097882_0019 running in uber mode : false
> 15/07/07 13:24:31 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/07 13:24:59 INFO mapreduce.Job: Job job_1435943097882_0019 failed with state FAILED due to: Vertex failed, vertexName=initialmap, vertexId=vertex_1435943097882_0019_1_00, diagnostics=[Task failed, taskId=task_1435943097882_0019_1_00_05, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:197)
>     at org.apache.hadoop.io.Text.readWithKnownLength(Text.java:319)
>     at org.apache.hadoop.io.Text.readFields(Text.java:291)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
>     at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:142)
>     at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
>     at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
>     at org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191)
>     at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115)
>     at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:285)
>     at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:463)
>
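The top frames of the trace (DataInputStream.readFully called from Text.readWithKnownLength) are what one would expect when a length-prefixed record declares more bytes than the stream still holds. The following is a minimal sketch of that failure mode using only java.io. It is not the Tez or Hadoop code path: the class TruncatedRecordDemo and its methods are hypothetical, and the fixed 4-byte length prefix is a simplification chosen for illustration (Hadoop's Text uses a variable-length integer prefix).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Tez or Hadoop.
class TruncatedRecordDemo {

    // Writes a record as a 4-byte length followed by the UTF-8 payload.
    static byte[] encode(String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(payload.length);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads one record back. readFully throws EOFException when the
    // stream ends before the declared number of bytes has been read.
    static String decode(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int length = in.readInt();
            byte[] payload = new byte[length];
            in.readFully(payload);
            return new String(payload, StandardCharsets.UTF_8);
        }
    }

    // Returns true only if decoding fails with EOFException.
    static boolean failsWithEof(byte[] bytes) {
        try {
            decode(bytes);
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] full = encode("wordcount");
        // Drop the last three payload bytes to simulate a short buffer.
        byte[] truncated = Arrays.copyOf(full, full.length - 3);
        System.out.println(decode(full));
        System.out.println(failsWithEof(truncated));
    }
}
```

The sketch only shows that a mismatch between the declared length and the bytes actually available surfaces as java.io.EOFException at readFully; it says nothing about why the sorter's spill handed the combiner a short buffer in this issue.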

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-10-02 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942099#comment-14942099
 ] 

Rajesh Balamohan commented on TEZ-2602:
---

[~hitesh] This is a corner case introduced by the fix for TEZ-2575. Since 
TEZ-2575 was backported to 0.7, this had to be backported as well.

> Throwing EOFException when launching MR job
> ---
>
> Key: TEZ-2602
> URL: https://issues.apache.org/jira/browse/TEZ-2602
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.0-alpha
>Reporter: Tsuyoshi Ozawa
>Assignee: Rajesh Balamohan
> Fix For: 0.8.0-alpha, 0.7.1
>
> Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-09-02 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727410#comment-14727410
 ] 

Rajesh Balamohan commented on TEZ-2602:
---

committed to branch-0.7
>>>
commit 52f8cd94081fbb34405bdda4d7cf4919c221208e
>>>

> Throwing EOFException when launching MR job
> ---
>
> Key: TEZ-2602
> URL: https://issues.apache.org/jira/browse/TEZ-2602
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.0-alpha
>Reporter: Tsuyoshi Ozawa
>Assignee: Rajesh Balamohan
> Fix For: 0.8.0-alpha, 0.7.1
>
> Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-30 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648437#comment-14648437
 ] 

Siddharth Seth commented on TEZ-2602:
-

[~rajesh.balamohan] - I've updated the affects version based on your comment. 
Please revert the change if the update is not correct.

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Fix For: 0.8.0

 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-26 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642132#comment-14642132
 ] 

Rajesh Balamohan commented on TEZ-2602:
---

This is specific to 0.8 codebase and does not need a backport to 0.7.

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Fix For: 0.8.0

 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-23 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638342#comment-14638342
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

Thanks Gopal for the review and thanks Rajesh for the fix.

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Fix For: 0.8.0

 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637895#comment-14637895
 ] 

Gopal V commented on TEZ-2602:
--

The patch LGTM - +1.

In a later pass, we can optimize this further to work better for large 
value spills of 2 GB or more.
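The thread does not say why 2 GB is the threshold. One plausible reason, offered here purely as an assumption, is that Java arrays and ByteBuffers are indexed by int, so any size above Integer.MAX_VALUE bytes cannot be held in a single buffer and silently wraps if narrowed with a cast. A minimal sketch, with the hypothetical class SpillSizeDemo standing in for whatever size bookkeeping a sorter might do:

```java
// Hypothetical demo class, not part of Tez.
class SpillSizeDemo {

    // Largest byte count a single Java array or ByteBuffer can address.
    static final long MAX_INT_ADDRESSABLE = Integer.MAX_VALUE;

    // True when a size can be represented as a non-negative int index.
    static boolean fitsInIntIndexedBuffer(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= MAX_INT_ADDRESSABLE;
    }

    // A plain narrowing cast keeps only the low 32 bits, so it wraps
    // silently instead of failing.
    static int uncheckedNarrow(long sizeInBytes) {
        return (int) sizeInBytes;
    }

    // Math.toIntExact throws ArithmeticException on overflow, which makes
    // the problem visible at the point of conversion.
    static boolean checkedNarrowFails(long sizeInBytes) {
        try {
            Math.toIntExact(sizeInBytes);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long twoGiB = 2L * 1024 * 1024 * 1024;
        System.out.println(fitsInIntIndexedBuffer(twoGiB));
        System.out.println(uncheckedNarrow(twoGiB));
        System.out.println(checkedNarrowFails(twoGiB));
    }
}
```

Code that tracks sizes as long and converts with a checked narrowing fails loudly at the boundary rather than producing a negative length.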

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-13 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625727#comment-14625727
 ] 

Rajesh Balamohan commented on TEZ-2602:
---

Thanks [~hitesh]

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch


 {quote}
 $hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount -Dmapreduce.framework.name=yarn-tez -Dmapred.reduce.tasks=15 -Dtez.runtime.sort.threads=1 wc10g tezwc10g5
 15/07/07 13:24:30 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8081
 15/07/07 13:24:30 INFO client.AHSProxy: Connecting to Application History server at /0.0.0.0:10200
 15/07/07 13:24:30 INFO mapreduce.Job: The url to track the job: http://ip-172-31-4-8.ap-northeast-1.compute.internal:8088/proxy/application_1435943097882_0019/
 15/07/07 13:24:30 INFO mapreduce.Job: Running job: job_1435943097882_0019
 15/07/07 13:24:31 INFO mapreduce.Job: Job job_1435943097882_0019 running in uber mode : false
 15/07/07 13:24:31 INFO mapreduce.Job:  map 0% reduce 0%
 15/07/07 13:24:59 INFO mapreduce.Job: Job job_1435943097882_0019 failed with state FAILED due to: Vertex failed, vertexName=initialmap, vertexId=vertex_1435943097882_0019_1_00, diagnostics=[Task failed, taskId=task_1435943097882_0019_1_00_05, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.io.EOFException
     at java.io.DataInputStream.readFully(DataInputStream.java:197)
     at org.apache.hadoop.io.Text.readWithKnownLength(Text.java:319)
     at org.apache.hadoop.io.Text.readFields(Text.java:291)
     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
     at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:142)
     at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
     at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
     at org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191)
     at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115)
     at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:285)
     at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:463)
     at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:219)
     at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:311)

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-13 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625768#comment-14625768
 ] 

TezQA commented on TEZ-2602:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12744709/TEZ-2602.1.patch
  against master revision 0804088.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/897//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/897//console

This message is automatically generated.

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-13 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625723#comment-14625723
 ] 

Hitesh Shah commented on TEZ-2602:
--

[~rajesh.balamohan] Re-triggering precommit. The build failure was due to an 
issue in the patch committed for TEZ-2616. Not sure why its precommit did not 
catch it. 

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-13 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625599#comment-14625599
 ] 

TezQA commented on TEZ-2602:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12744709/TEZ-2602.1.patch
  against master revision fd522ce.

{color:red}-1 patch{color}.  master compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/896//console

This message is automatically generated.

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
Assignee: Rajesh Balamohan
 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-13 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624310#comment-14624310
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

[~rajesh.balamohan] +1 (non-binding). I verified that the new patch works well in my local 
environment and that the new test passes only with the fix.

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
 Attachments: TEZ-2602.1.patch, TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-10 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622040#comment-14622040
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

[~rajesh.balamohan] it works well with the patch! 

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Tsuyoshi Ozawa
 Attachments: TEZ-2602.WIP.1.patch



[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-09 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620098#comment-14620098
 ] 

Rajesh Balamohan commented on TEZ-2602:
---

Not yet, [~ozawa]. I tried different settings on a local VM. Would you mind sharing the 
sample input?

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa


[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-09 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619961#comment-14619961
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

[~rajesh.balamohan] could you reproduce the problem?

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa


[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-09 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620106#comment-14620106
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

The data was generated by randomtextwriter included in hadoop-examples.jar.

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa


[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-09 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620104#comment-14620104
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

[~rajesh.balamohan] Oh, sorry, my input size was still large - 10GB. 

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa


[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-09 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620165#comment-14620165
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

Trying to reproduce the problem with smaller data. Please wait a moment...

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa


[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-09 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620164#comment-14620164
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

Trying to reproduce the problem with smaller data. Please wait a moment...

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa


[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-09 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620174#comment-14620174
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

[~rajesh.balamohan] I was able to reproduce the problem with 500 MB of data. The data was 
generated as follows:
{code}
$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar 
randomtextwriter   -Dmapreduce.framework.name=yarn-tez 
-Dmapreduce.randomtextwriter.totalbytes=5 wc500mb  
{code}

The job was launched as follows:
{code}
$ time hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar 
wordcount   -Dmapreduce.framework.name=yarn-tez -Dmapred.reduce.tasks=15 
-Dtez.runtime.sort.threads=1 -Dmapreduce.map.sort.spill.percent=0.1 
-Dio.sort.mb=1 wc500mb3 tezdebug/10  
{code}

{quote}
15/07/09 09:20:11 INFO mapreduce.Job: Running job: job_1435943097882_0035
15/07/09 09:20:12 INFO mapreduce.Job: Job job_1435943097882_0035 running in uber mode : false
15/07/09 09:20:12 INFO mapreduce.Job:  map 0% reduce 0%
15/07/09 09:20:18 INFO mapreduce.Job: Job job_1435943097882_0035 failed with 
state FAILED due to: Vertex failed, vertexName=initialmap, 
vertexId=vertex_1435943097882_0035_1_00, diagnostics=[Task failed, 
taskId=task_1435943097882_0035_1_00_03, diagnostics=[TaskAttempt 0 failed, 
info=[Error: Failure while running task:java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at org.apache.hadoop.io.Text.readWithKnownLength(Text.java:319)
at org.apache.hadoop.io.Text.readFields(Text.java:291)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
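
For readers unfamiliar with the top frames of this trace, the following is a minimal sketch of how DataInputStream.readFully raises EOFException when a length-prefixed record is read from a buffer that ends early. It is not the Tez code path and does not reproduce this bug. The class name TruncatedRecordDemo, its helper methods, and the 4-byte length prefix are assumptions made for illustration only (Hadoop's Text uses a variable-length integer prefix), and only java.io classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Hadoop or Tez.
public class TruncatedRecordDemo {

    // Writes a record as a 4-byte length prefix followed by the UTF-8 payload.
    // (Simplified framing: an assumption for this sketch.)
    static byte[] writeRecord(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            byte[] payload = value.getBytes(StandardCharsets.UTF_8);
            out.writeInt(payload.length);
            out.write(payload);
        }
        return buffer.toByteArray();
    }

    // Reads one record from the first `visibleLength` bytes of `data`.
    // Throws EOFException if the visible bytes end before the payload does.
    static String readRecord(byte[] data, int visibleLength) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data, 0, visibleLength))) {
            int payloadLength = in.readInt();
            byte[] payload = new byte[payloadLength];
            in.readFully(payload); // fails here when fewer than payloadLength bytes remain
            return new String(payload, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] record = writeRecord("wordcount");
        // The complete record reads back unchanged.
        System.out.println(readRecord(record, record.length));
        try {
            // The reader is handed a view that is 3 bytes too short.
            readRecord(record, record.length - 3);
        } catch (EOFException e) {
            System.out.println("EOFException on truncated record");
        }
    }
}
```

The sketch only shows that an EOFException at readFully means the reader was given fewer bytes than the length prefix promised. Which component supplied the short buffer in this job is not something the sketch can say.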




  

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-07 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617299#comment-14617299
 ] 

Hitesh Shah commented on TEZ-2602:
--

/cc [~rajesh.balamohan]

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa

 {quote}
 $ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar 
 wordcount   -Dmapreduce.framework.name=yarn-tez -Dmapred.reduce.tasks=15 
 -Dtez.runtime.sort.threads=1 wc10g tezwc10g5 
 15/07/07 13:24:30 INFO client.RMProxy: Connecting to ResourceManager at 
 /127.0.0.1:8081
 15/07/07 13:24:30 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 15/07/07 13:24:30 INFO mapreduce.Job: The url to track the job: 
 http://ip-172-31-4-8.ap-northeast-1.compute.internal:8088/proxy/application_1435943097882_0019/
 15/07/07 13:24:30 INFO mapreduce.Job: Running job: job_1435943097882_0019
 15/07/07 13:24:31 INFO mapreduce.Job: Job job_1435943097882_0019 running in 
 uber mode : false
 15/07/07 13:24:31 INFO mapreduce.Job:  map 0% reduce 0%
 15/07/07 13:24:59 INFO mapreduce.Job: Job job_1435943097882_0019 failed with 
 state FAILED due to: Vertex failed, vertexName=initialmap, 
 vertexId=vertex_1435943097882_0019_1_00, diagnostics=[Task failed, 
 taskId=task_1435943097882_0019_1_00_05, diagnostics=[TaskAttempt 0 
 failed, info=[Error: Failure while running task:java.io.EOFException
 at java.io.DataInputStream.readFully(DataInputStream.java:197)
 at org.apache.hadoop.io.Text.readWithKnownLength(Text.java:319)
 at org.apache.hadoop.io.Text.readFields(Text.java:291)
 at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
 at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
 at 
 org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:142)
 at 
 org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
 at 
 org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
 at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115)
 at 
 org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:285)
 at 
 org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:463)
 at 
 org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:219)
 at 
 org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:311)

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-07 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617594#comment-14617594
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

[~hitesh] No, the job does not fail with the LEGACY sorter. The following 
command succeeds:

{quote}
$ time hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar 
wordcount   -Dmapreduce.framework.name=yarn-tez -Dmapred.reduce.tasks=15 
-Dtez.runtime.sort.threads=1 -Dtez.runtime.sorter.class=LEGACY wc10g tezwc10g9
{quote}

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-07 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616764#comment-14616764
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

When tez.runtime.ifile.readahead is set to false 
(-Dtez.runtime.ifile.readahead=false), an IndexOutOfBoundsException is thrown 
instead:

{quote}
15/07/07 14:16:35 INFO mapreduce.Job: Job job_1435943097882_0022 failed with 
state FAILED due to: Vertex failed, vertexName=initialmap, 
vertexId=vertex_1435943097882_0022_1_00, diagnostics=[Task failed, 
taskId=task_1435943097882_0022_1_00_05, diagnostics=[TaskAttempt 0 failed, 
info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException
at java.io.DataInputStream.readFully(DataInputStream.java:192)
at org.apache.hadoop.io.Text.readWithKnownLength(Text.java:319)
at org.apache.hadoop.io.Text.readFields(Text.java:291)
at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:142)
at 
org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
at 
org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
at 
org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191)
at 
org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115)
at 
org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:285)
at 
org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.spill(PipelinedSorter.java:463)
at 
org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:219)
at 
org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:311)
at 
org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:267)
at 
org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:164)
at 
org.apache.tez.mapreduce.processor.map.MapProcessor$NewOutputCollector.write(MapProcessor.java:363)
at 
org.apache.tez.mapreduce.hadoop.mapreduce.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:90)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at 
org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:47)
at 
org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:36)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at 
org.apache.tez.mapreduce.processor.map.MapProcessor.runNewMapper(MapProcessor.java:237)
at 
org.apache.tez.mapreduce.processor.map.MapProcessor.run(MapProcessor.java:124)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
killedTasks:89, Vertex vertex_1435943097882_0022_1_00 [initialmap] 
killed/failed due to:null]. Vertex killed, vertexName=finalreduce, 
vertexId=vertex_1435943097882_0022_1_01, diagnostics=[Vertex received Kill 
while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, 
failedTasks:0 killedTasks:15, Vertex vertex_1435943097882_0022_1_01 
[finalreduce] killed/failed due to:null]. DAG did not succeed due to 
VERTEX_FAILURE. failedVertices:1 killedVertices:1
15/07/07 14:16:35 INFO mapreduce.Job: Counters: 0
{quote}
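
Both traces end in DataInputStream.readFully, called while a length-prefixed Text value is being read back. As a rough illustration only, and not a diagnosis of this bug, the sketch below shows how a reader of length-prefixed records can hit either exception depending on what the length prefix contains. The class ReadFullyDemo and its record layout (a 4-byte length followed by UTF-8 bytes) are hypothetical and are not the format used by Hadoop's Text or Tez's IFile.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Hadoop or Tez.
class ReadFullyDemo {

    // Reads one record: a 4-byte big-endian length, then that many UTF-8 bytes.
    static String readRecord(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int len = in.readInt();
            // Allocate at least an empty buffer so a negative length reaches readFully.
            byte[] buf = new byte[Math.max(len, 0)];
            in.readFully(buf, 0, len);
            return new String(buf, StandardCharsets.UTF_8);
        }
    }

    // Returns "ok" on success, otherwise the simple name of the exception thrown.
    static String outcome(byte[] data) {
        try {
            readRecord(data);
            return "ok";
        } catch (EOFException | IndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        } catch (IOException e) {
            return "IOException";
        }
    }

    public static void main(String[] args) {
        // Length prefix matches the payload.
        System.out.println(outcome(new byte[] {0, 0, 0, 2, 'h', 'i'}));
        // Length prefix claims 5 bytes, but only 2 follow: the stream ends early.
        System.out.println(outcome(new byte[] {0, 0, 0, 5, 'h', 'i'}));
        // Length prefix decodes to -1: readFully rejects the negative length.
        System.out.println(outcome(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF}));
    }
}
```

In this sketch a length prefix larger than the remaining data produces an EOFException, and a negative length produces an IndexOutOfBoundsException. Whether the two failures reported here come from the same kind of mismatch is an assumption that the traces alone do not confirm.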

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-07 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617917#comment-14617917
 ] 

Rajesh Balamohan commented on TEZ-2602:
---

[~Tsuyoshi Ozawa] - I am trying to reproduce the issue on my local VM. I tried 
a 700 MB text file with a 100 MB sort buffer for PipelinedSorter to ensure 
multiple spills. The job completed without error. Are there any other settings 
you enable or disable to hit this issue?
I have yet to run this at scale (~10 GB).

[jira] [Commented] (TEZ-2602) Throwing EOFException when launching MR job

2015-07-07 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617955#comment-14617955
 ] 

Tsuyoshi Ozawa commented on TEZ-2602:
-

[~rajesh.balamohan] I could reproduce the error with a smaller input and 
smaller values of mapreduce.map.sort.spill.percent and io.sort.mb, because the 
error is a combiner-related issue.

{quote}
$ time hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar 
wordcount   -Dmapreduce.framework.name=yarn-tez -Dmapred.reduce.tasks=15 
-Dtez.runtime.sort.threads=1 -Dmapreduce.map.sort.spill.percent=0.1 
-Dio.sort.mb=10 wc500mb tezdebug/7
{quote}
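
A back-of-the-envelope sketch of why the smaller settings make the problem easier to hit: under classic MapReduce semantics, a spill begins once the sort buffer fills past io.sort.mb multiplied by the spill percent, and the combiner runs on each spill. The class SpillEstimate and the 128 MB of map output per task are hypothetical, record metadata overhead is ignored, and Tez's PipelinedSorter may account for its buffer differently, so treat the numbers as an order-of-magnitude guide only.

```java
// Hypothetical helper; not part of Hadoop or Tez.
class SpillEstimate {

    // Buffer fill level (in MB) at which a spill would start, assuming MapReduce semantics.
    static double spillThresholdMb(int ioSortMb, double spillPercent) {
        return ioSortMb * spillPercent;
    }

    // Rough number of spills needed to drain the given amount of map output.
    static long estimatedSpills(double mapOutputMb, int ioSortMb, double spillPercent) {
        return (long) Math.ceil(mapOutputMb / spillThresholdMb(ioSortMb, spillPercent));
    }

    public static void main(String[] args) {
        double mapOutputMb = 128.0; // assumed output of a single map task

        // Settings from the command above: io.sort.mb=10, spill percent 0.1.
        System.out.println(estimatedSpills(mapOutputMb, 10, 0.1));

        // A larger buffer with a 0.8 spill percent, for comparison.
        System.out.println(estimatedSpills(mapOutputMb, 100, 0.8));
    }
}
```

With these assumptions the small-buffer settings yield on the order of a hundred spills per task versus a handful for the larger buffer, which would give the combiner path many more chances to run.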

For reference, this is the complete Tez configuration file I'm using:
https://gist.github.com/oza/3ab356c25ec64a2298e0

 Throwing EOFException when launching MR job
 ---

 Key: TEZ-2602
 URL: https://issues.apache.org/jira/browse/TEZ-2602
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Tsuyoshi Ozawa

 {quote}
 $hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar 
 wordcount   -Dmapreduce.framework.name=yarn-tez -Dmapr
 ed.reduce.tasks=15 -Dtez.runtime.sort.threads=1 wc10g tezwc10g5 
 15/07/07 13:24:30 INFO client.RMProxy: Connecting to ResourceManager at 
 /127.0.0.1:8081   
   
   
 15/07/07 13:24:30 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 15/07/07 13:24:30 INFO mapreduce.Job: The url to track the job: 
 http://ip-172-31-4-8.ap-northeast-1.compute.internal:8088/proxy/application_1435943097882_0019/
   
  
 15/07/07 13:24:30 INFO mapreduce.Job: Running job: job_1435943097882_0019
 15/07/07 13:24:31 INFO mapreduce.Job: Job job_1435943097882_0019 running in 
 uber mode : false 
   
   
 15/07/07 13:24:31 INFO mapreduce.Job:  map 0% reduce 0%
 15/07/07 13:24:59 INFO mapreduce.Job: Job job_1435943097882_0019 failed with 
 state FAILED due to: Vertex failed, vertexName=initialmap, 
 vertexId=vertex_1435943097882_0019_1_00, diagnostics=[Task failed, 
 taskId=task_1435943097882_0019_1_00_05, diagnostics=[TaskAttempt 0 
 failed, info=[Error: Failure while running task:java.io.EOFException
 at java.io.DataInputStream.readFully(DataInputStream.java:197)
   
   
 
 at org.apache.hadoop.io.Text.readWithKnownLength(Text.java:319)
 at org.apache.hadoop.io.Text.readFields(Text.java:291)
   
   
 
 at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
 at 
 org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
   
 
 at 
 org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:142)
 at 
 org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
   
   
  
 at 
 org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
 at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)  
   
   
 
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runNewCombiner(MRCombiner.java:191)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:115)  
   
   
  
 at 
 org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.runCombineProcessor(ExternalSorter.java:285)
 at