[jira] [Commented] (MAPREDUCE-4164) Hadoop 22 Exception thrown after task completion causes its reexecution

2012-04-17 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256089#comment-13256089
 ] 

Mayank Bansal commented on MAPREDUCE-4164:
--

1. The TaskReporter thread periodically sends status updates or pings to the 
TaskTracker. If it has task progress to report, it sends a STATUS_UPDATE 
message to the TaskTracker. Otherwise, it sends a PING to check whether the 
TaskTracker is alive.

2. When the map/reduce phase is over, stopCommunicationThread() is called, 
which interrupts the ping/status-update thread.

3. If that thread is communicating with the server when the interrupt 
arrives, the connection to the server is broken. Because the interrupt was 
issued, the stream throws ClosedByInterruptException.

4. However, in Client.java, the Client keeps waiting for the response, 
eventually times out, and re-throws this exception.
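
The interrupt behavior in step 3 can be reproduced outside Hadoop. The following is a minimal sketch, not the Hadoop code path: it assumes a java.nio Pipe in place of the RPC socket, and the class and method names are hypothetical. It relies on the documented rule for interruptible channels: if a thread's interrupt status is already set when it starts a blocking I/O operation, the channel is closed and the thread receives ClosedByInterruptException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;

// Hypothetical demo class; not part of Hadoop.
class InterruptedWriteDemo {

    // Returns true if a write attempted with the interrupt status set fails
    // with ClosedByInterruptException and leaves the channel closed.
    static boolean writeWhileInterrupted() {
        try {
            Pipe pipe = Pipe.open();
            Pipe.SinkChannel sink = pipe.sink();
            try {
                // Stand-in for stopCommunicationThread() interrupting the reporter.
                Thread.currentThread().interrupt();
                sink.write(ByteBuffer.wrap(new byte[] {1}));
                return false; // not expected: the write should have been rejected
            } catch (ClosedByInterruptException e) {
                // The channel was closed as a side effect of the interrupt.
                return !sink.isOpen();
            } finally {
                Thread.interrupted(); // clear the flag so later code is unaffected
                sink.close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("closed by interrupt: " + writeWhileInterrupted());
    }
}
```

The point of the sketch is that the interrupt does more than stop the thread: it also closes the channel it was using, which is why a caller still waiting on that connection sees a failure.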


> Hadoop 22 Exception thrown after task completion causes its reexecution
> ---
>
> Key: MAPREDUCE-4164
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4164
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4164.patch
>
>
> 2012-02-28 19:17:08,504 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 1969310 bytes
> 2012-02-28 19:17:08,694 INFO org.apache.hadoop.mapred.Task: Task:attempt_201202272306_0794_m_94_0 is done. And is in the process of commiting
> 2012-02-28 19:18:08,774 INFO org.apache.hadoop.mapred.Task: Communication exception: java.io.IOException: Call to /127.0.0.1:35400 failed on local exception: java.nio.channels.ClosedByInterruptException
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1094)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1062)
>     at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
>     at $Proxy0.statusUpdate(Unknown Source)
>     at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:650)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:60)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:151)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:112)
>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>     at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>     at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:769)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1040)
>     ... 4 more
> 2012-02-28 19:18:08,825 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201202272306_0794_m_94_0' done.
> >> SHOULD be <++
> 2012-02-28 19:17:02,214 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 1974104 bytes
> 2012-02-28 19:17:02,408 INFO org.apache.hadoop.mapred.Task: Task:attempt_201202272306_0794_m_00_0 is done. And is in the process of commiting
> 2012-02-28 19:17:02,519 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201202272306_0794_m_00_0' done.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-03-13 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228571#comment-13228571
 ] 

Mayank Bansal commented on MAPREDUCE-3837:
--

Thanks Arun for your reply.

a) It reads the user ID from the job token stored in the system directory and 
submits the job as that user, so the actual job runs as that user.
b) Yes, you are right. I will add the documentation and include it in the patch.

Thanks,
Mayank

> Hadoop 22 Job tracker is not able to recover job in case of crash and after 
> that no user can submit job.
> 
>
> Key: MAPREDUCE-3837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Fix For: 0.24.0, 0.22.1, 0.23.2
>
> Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
> PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch
>
>
> If the JobTracker crashes while jobs are running and its property 
> mapreduce.jobtracker.restart.recover is true, it should recover those jobs.
> However, the current behavior is as follows: the JobTracker tries to restore 
> the jobs but cannot. After that, the JobTracker closes its handle to HDFS and 
> nobody else can submit a job.
> Thanks,
> Mayank





[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-03-02 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221253#comment-13221253
 ] 

Mayank Bansal commented on MAPREDUCE-3837:
--

Hi Alejandro,

Thanks for your help testing this patch. I am really sorry about the 
confusion; I missed one function in the patch. I have attached a new patch, 
tested it, and it works fine in my local environment. I am not sure how I 
missed that before.

Please let me know if you find any more issues with that.

Arun,

I believe the issues were about recovering jobs from the point at which they 
crashed. What I am doing here is a very simple approach: I read the job token 
file and resubmit the jobs after a crash and recovery. I am not trying to 
resume from the point where the last run stopped.

In this scenario it is a new run of the job, and it works well. The downside 
is that the whole job reruns; the upside is that users don't need to resubmit 
their jobs.
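
The resubmit-on-recovery idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patch: every class, field, and method name here is hypothetical, and a stored (job ID, user) pair stands in for whatever is read back from the job token file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are Hadoop classes.
class ResubmitOnRecoverySketch {

    // Stand-in for the information read back from a stored job token file.
    static final class StoredJob {
        final String oldJobId;
        final String user;

        StoredJob(String oldJobId, String user) {
            this.oldJobId = oldJobId;
            this.user = user;
        }
    }

    // Stand-in for a freshly submitted job.
    static final class Submission {
        final String newJobId;
        final String user;

        Submission(String newJobId, String user) {
            this.newJobId = newJobId;
            this.user = user;
        }
    }

    private int nextId = 1;

    // Submits a brand-new job on behalf of the given user.
    Submission submitAs(String user) {
        return new Submission("recovered_job_" + nextId++, user);
    }

    // Resubmits every stored job from scratch; no progress is carried over.
    List<Submission> recover(List<StoredJob> stored) {
        List<Submission> resubmitted = new ArrayList<>();
        for (StoredJob job : stored) {
            resubmitted.add(submitAs(job.user));
        }
        return resubmitted;
    }

    public static void main(String[] args) {
        List<StoredJob> stored = new ArrayList<>();
        stored.add(new StoredJob("job_old_1", "alice"));
        stored.add(new StoredJob("job_old_2", "bob"));
        for (Submission s : new ResubmitOnRecoverySketch().recover(stored)) {
            System.out.println(s.newJobId + " submitted as " + s.user);
        }
    }
}
```

Each recovered job gets a new identity and starts over, which matches the trade-off described above: the work is repeated, but the original submitter does not have to act.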

Please let me know your thoughts.

Thanks,
Mayank 

> Hadoop 22 Job tracker is not able to recover job in case of crash and after 
> that no user can submit job.
> 
>
> Key: MAPREDUCE-3837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Fix For: 0.24.0, 0.22.1, 0.23.2
>
> Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch, 
> PATCH-TRUNK-MAPREDUCE-3837.patch





[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-02-27 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217460#comment-13217460
 ] 

Mayank Bansal commented on MAPREDUCE-3837:
--

I have attached the patch for Hadoop-1; please review it.

Thanks,
Mayank

> Hadoop 22 Job tracker is not able to recover job in case of crash and after 
> that no user can submit job.
> 
>
> Key: MAPREDUCE-3837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Fix For: 0.24.0, 0.22.1, 0.23.2
>
> Attachments: PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
> PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch





[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-02-09 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204944#comment-13204944
 ] 

Mayank Bansal commented on MAPREDUCE-3837:
--

PATCH-MAPREDUCE-3837.patch is for the 22 branch. Please review it. I will 
post the same patch for trunk shortly.

Thanks,
Mayank


> Hadoop 22 Job tracker is not able to recover job in case of crash and after 
> that no user can submit job.
> 
>
> Key: MAPREDUCE-3837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Attachments: PATCH-MAPREDUCE-3837.patch





[jira] [Commented] (MAPREDUCE-3725) Hadoop 22 hadoop job -list returns user name as NULL

2012-02-01 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198321#comment-13198321
 ] 

Mayank Bansal commented on MAPREDUCE-3725:
--

Tests ran fine on 22

> Hadoop 22 hadoop job -list returns user name as NULL
> 
>
> Key: MAPREDUCE-3725
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3725
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 0.22.1
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Attachments: patch-MAPREDUCE-3725.patch
>
>
> Hadoop 22 hadoop job -list returns user name as NULL





[jira] [Commented] (MAPREDUCE-3593) MAPREDUCE Impersonation is not working in 22

2012-01-09 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183085#comment-13183085
 ] 

Mayank Bansal commented on MAPREDUCE-3593:
--

All tests pass on the 22 branch.

> MAPREDUCE Impersonation is not working in 22
> 
>
> Key: MAPREDUCE-3593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3593
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: job submission
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Fix For: 0.22.1
>
> Attachments: MAPREDUCE-3593.patch
>
>






[jira] [Commented] (MAPREDUCE-3593) MAPREDUCE Impersonation is not working in 22

2011-12-28 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176790#comment-13176790
 ] 

Mayank Bansal commented on MAPREDUCE-3593:
--

Please ignore the above comment by Hadoop QA, as this patch is only for the 22 branch.

> MAPREDUCE Impersonation is not working in 22
> 
>
> Key: MAPREDUCE-3593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3593
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: job submission
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Fix For: 0.22.1
>
> Attachments: MAPREDUCE-3593.patch
>
>






[jira] [Commented] (MAPREDUCE-3593) MAPREDUCE Impersonation is not working in 22

2011-12-21 Thread Mayank Bansal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174443#comment-13174443
 ] 

Mayank Bansal commented on MAPREDUCE-3593:
--

In Hadoop 22, when we submit a job through a proxy user that is configured in 
core-site.xml to impersonate other users, the submission fails because of 
permission issues on the staging directory. The problem is that the job is 
submitted as the proxy user, which is not able to impersonate the actual user.

> MAPREDUCE Impersonation is not working in 22
> 
>
> Key: MAPREDUCE-3593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3593
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: job submission
> Affects Versions: 0.22.0
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
>

