[jira] [Commented] (HDFS-12984) BlockPoolSlice can leak in a mini dfs cluster
[ https://issues.apache.org/jira/browse/HDFS-12984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324092#comment-16324092 ]

Robert Joseph Evans commented on HDFS-12984:
--------------------------------------------

+1 for committing it.

> BlockPoolSlice can leak in a mini dfs cluster
> ---------------------------------------------
>
>                 Key: HDFS-12984
>                 URL: https://issues.apache.org/jira/browse/HDFS-12984
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.5
>            Reporter: Robert Joseph Evans
>            Assignee: Ajay Kumar
>         Attachments: HDFS-12984.001.patch, Screen Shot 2018-01-05 at 4.38.06 PM.png, Screen Shot 2018-01-05 at 5.26.54 PM.png, Screen Shot 2018-01-05 at 5.31.52 PM.png
>
> When running some unit tests for Storm we found that we would occasionally get out-of-memory errors in the HDFS integration tests.
> When I got a heap dump I found that the ShutdownHookManager was full of BlockPoolSlice$1 instances, which hold a reference to the BlockPoolSlice, which in turn holds a reference to the DataNode, and so on.
> It looks like when shutdown is called on the BlockPoolSlice there is no way to remove the shutdown hook, because no reference to it is saved.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12984) BlockPoolSlice can leak in a mini dfs cluster
[ https://issues.apache.org/jira/browse/HDFS-12984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317259#comment-16317259 ]

Robert Joseph Evans commented on HDFS-12984:
--------------------------------------------

Thanks [~ajayydv], looks good to me; I am +1.

[~kihwal], it has been a long time since I checked anything into Hadoop. Would you be willing to merge this in, and preferably take a look at it too?

> BlockPoolSlice can leak in a mini dfs cluster
> ---------------------------------------------
[jira] [Commented] (HDFS-12984) BlockPoolSlice can leak in a mini dfs cluster
[ https://issues.apache.org/jira/browse/HDFS-12984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316626#comment-16316626 ]

Robert Joseph Evans commented on HDFS-12984:
--------------------------------------------

[~ajayydv], I also ran into issues trying to reproduce this in some environments. Specifically, I could never make it happen on my MBP, and I don't know why. But if you look at the code inside BlockPoolSlice

https://github.com/apache/hadoop/blob/01f3f2167ec20b52a18bc2cf250fb4229cfd2c14/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java#L165-L173

if an instance of this is ever created, it can never be collected. I am not sure why BlockPoolSlice instances are created sometimes by a MiniDFSCluster and not others; I am not familiar enough with the internals of the DataNode to say off the top of my head.

Glad to see you going in the right direction, and I agree that removing everything from the ShutdownHooksManager is far from ideal, but I didn't see this happening, at least not with 2.7.5 and 2.6.2.

> BlockPoolSlice can leak in a mini dfs cluster
> ---------------------------------------------
[jira] [Created] (HDFS-12984) BlockPoolSlice can leak in a mini dfs cluster
Robert Joseph Evans created HDFS-12984:
------------------------------------------

             Summary: BlockPoolSlice can leak in a mini dfs cluster
                 Key: HDFS-12984
                 URL: https://issues.apache.org/jira/browse/HDFS-12984
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.7.5
            Reporter: Robert Joseph Evans

When running some unit tests for Storm we found that we would occasionally get out-of-memory errors in the HDFS integration tests.

When I got a heap dump I found that the ShutdownHookManager was full of BlockPoolSlice$1 instances, which hold a reference to the BlockPoolSlice, which in turn holds a reference to the DataNode, and so on.

It looks like when shutdown is called on the BlockPoolSlice there is no way to remove the shutdown hook, because no reference to it is saved.
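The fix implied by the description — keep a reference to the registered hook so shutdown can deregister it — can be sketched as follows. This is a minimal illustration, not Hadoop's actual code: `HookRegistry` is a hypothetical stand-in for Hadoop's ShutdownHookManager.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the leak pattern and its fix. HookRegistry is a hypothetical
// stand-in for Hadoop's ShutdownHookManager; the real API differs.
public class HookLeakSketch {
    static class HookRegistry {
        private final Set<Runnable> hooks = new HashSet<>();
        void register(Runnable r) { hooks.add(r); }
        void unregister(Runnable r) { hooks.remove(r); }
        int size() { return hooks.size(); }
    }

    static class Slice {
        private final HookRegistry registry;
        private final Runnable hook; // keep a reference to the hook...

        Slice(HookRegistry registry) {
            this.registry = registry;
            // A hook registered inline (new Runnable() { ... }) with no saved
            // reference can never be unregistered, so it pins the slice --
            // and everything the slice references -- until JVM exit.
            this.hook = () -> { /* flush dirty state, etc. */ };
            registry.register(hook);
        }

        void shutdown() {
            registry.unregister(hook); // ...so shutdown() can remove it
        }
    }

    public static void main(String[] args) {
        HookRegistry registry = new HookRegistry();
        Slice slice = new Slice(registry);
        slice.shutdown();
        System.out.println(registry.size()); // 0: nothing left pinning the slice
    }
}
```

In a long-lived process that repeatedly creates and tears down mini clusters (as in the Storm tests above), the unregister step is what lets each slice become collectable.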
[jira] [Updated] (HDFS-3594) ListPathsServlet should not log a warning for paths that do not exist
[ https://issues.apache.org/jira/browse/HDFS-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-3594:
--------------------------------------
    Resolution: Duplicate
        Status: Resolved  (was: Patch Available)

> ListPathsServlet should not log a warning for paths that do not exist
> ---------------------------------------------------------------------
>
>                 Key: HDFS-3594
>                 URL: https://issues.apache.org/jira/browse/HDFS-3594
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.23.3
>            Reporter: Robert Joseph Evans
>         Attachments: HDFS-3594.patch, HDFS-3594.patch
>
> ListPathsServlet logs a warning message every time someone requests a listing for a directory that does not exist. This should be a debug or at most an info message, because this is expected behavior: people will ask for things that do not exist.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-3594) ListPathsServlet should not log a warning for paths that do not exist
[ https://issues.apache.org/jira/browse/HDFS-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969564#comment-13969564 ]

Robert Joseph Evans commented on HDFS-3594:
-------------------------------------------

Yup, HADOOP-10015 makes the log statement a debug, so it is no longer an issue.

> ListPathsServlet should not log a warning for paths that do not exist
> ---------------------------------------------------------------------
[jira] [Commented] (HDFS-5852) Change the colors on the hdfs UI
[ https://issues.apache.org/jira/browse/HDFS-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886702#comment-13886702 ]

Robert Joseph Evans commented on HDFS-5852:
-------------------------------------------

+1 for HDFS-5852.best.txt. I love purple (Y!) :)

> Change the colors on the hdfs UI
> --------------------------------
>
>                 Key: HDFS-5852
>                 URL: https://issues.apache.org/jira/browse/HDFS-5852
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>              Labels: webui
>             Fix For: 2.3.0
>         Attachments: HDFS-5852.best.txt, HDFS-5852v2.txt, HDFS-5852v3-dkgreen.txt, color-rationale.png, compromise_gray.png, dkgreen.png, hdfs-5852.txt, new_hdfsui_colors.png
>
> The HDFS UI colors are too close to HWX green. Here is a patch that steers clear of vendor colors. I made it a blocker thinking this is something we'd want to fix before we release Apache Hadoop 2.3.0.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HDFS-4948) mvn site for hadoop-hdfs-nfs fails
Robert Joseph Evans created HDFS-4948:
------------------------------------------

             Summary: mvn site for hadoop-hdfs-nfs fails
                 Key: HDFS-4948
                 URL: https://issues.apache.org/jira/browse/HDFS-4948
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 3.0.0
            Reporter: Robert Joseph Evans

Running mvn site on trunk results in the following error.

{noformat}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (default) on project hadoop-hdfs-nfs: An Ant BuildException has occured: Warning: Could not find file /home/evans/src/hadoop-git/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/resources/hdfs-nfs-default.xml to copy. -> [Help 1]
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-4731) Loading data from HDFS to tape
[ https://issues.apache.org/jira/browse/HDFS-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans resolved HDFS-4731.
---------------------------------------
    Resolution: Invalid

Is this a question about how you would go about doing this? If so, please use u...@hadoop.apache.org instead; JIRA is not the place for this. If you are proposing a new feature to be added to Hadoop, please provide better details about the new feature. As this JIRA sounds a lot more like the former, I am closing it as invalid. If I am wrong and this is a new feature request, please feel free to reopen the JIRA.

> Loading data from HDFS to tape
> ------------------------------
>
>                 Key: HDFS-4731
>                 URL: https://issues.apache.org/jira/browse/HDFS-4731
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: prashanthi
>
> I want to load my HDFS data directly to a tape or external storage device. Please let me know if there is any way to do this.
[jira] [Commented] (HDFS-4632) globStatus using backslash for escaping does not work on Windows
[ https://issues.apache.org/jira/browse/HDFS-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13612831#comment-13612831 ]

Robert Joseph Evans commented on HDFS-4632:
-------------------------------------------

HADOOP-8139 is another JIRA that had a lot of discussion about this subject. I believe that it was never resolved specifically because of the Windows issue. You probably want to read through the discussion there as well.

> globStatus using backslash for escaping does not work on Windows
> ----------------------------------------------------------------
>
>                 Key: HDFS-4632
>                 URL: https://issues.apache.org/jira/browse/HDFS-4632
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>
> {{Path}} normalizes backslashes to forward slashes on Windows. Later, when passed to {{FileSystem#globStatus}}, the path is no longer treated as an escape sequence.
[jira] [Commented] (HDFS-4199) Provide test for HdfsVolumeId
[ https://issues.apache.org/jira/browse/HDFS-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504832#comment-13504832 ]

Robert Joseph Evans commented on HDFS-4199:
-------------------------------------------

The changes look good to me. I am a +1 on them, but I would like feedback from Andrew on this before I check it in.

> Provide test for HdfsVolumeId
> -----------------------------
>
>                 Key: HDFS-4199
>                 URL: https://issues.apache.org/jira/browse/HDFS-4199
>             Project: Hadoop HDFS
>          Issue Type: Test
>    Affects Versions: 3.0.0, 2.0.3-alpha
>            Reporter: Ivan A. Veselovsky
>            Assignee: Ivan A. Veselovsky
>            Priority: Minor
>         Attachments: HADOOP-9053.patch, HDFS-4199--b.patch, HDFS-4199--c.patch, HDFS-4199--d.patch, HDFS-4199.patch
>
> Provide test for HdfsVolumeId to improve the code coverage.
[jira] [Commented] (HDFS-4199) Provide test for HdfsVolumeId
[ https://issues.apache.org/jira/browse/HDFS-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500615#comment-13500615 ]

Robert Joseph Evans commented on HDFS-4199:
-------------------------------------------

The changes look fairly simple and straightforward, and they match the code. However, I am just a bit concerned that we are testing, and thereby locking in, functionality that is arguably wrong. We are testing that new HdfsVolumeId(A, false).equals(new HdfsVolumeId(A, true)). If you look at how the code actually works, it starts out by creating a bunch of invalid ids with null for the id, then goes off and replaces them with valid IDs once it finds them. I personally don't think that a valid volume ID should ever be equal to an invalid one. I added Andrew, who originally wrote this code, to see if he can take a look at it and tell us whether this is expected behavior or not.

> Provide test for HdfsVolumeId
> -----------------------------
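The equality concern above can be made concrete with a small sketch. `VolumeId` below is a hypothetical simplification for illustration, not the real HdfsVolumeId class: it shows the alternative semantics being argued for, where the validity flag participates in equals() and hashCode() so a valid id never equals an invalid one.

```java
import java.util.Arrays;

// Hypothetical simplification of HdfsVolumeId. If equals() compared only
// the id bytes, new VolumeId(A, false) would equal new VolumeId(A, true);
// including the validity flag in equals()/hashCode() avoids that.
public final class VolumeId {
    private final byte[] id;
    private final boolean valid;

    public VolumeId(byte[] id, boolean valid) {
        this.id = id == null ? null : id.clone();
        this.valid = valid;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof VolumeId)) return false;
        VolumeId other = (VolumeId) o;
        // A valid volume id is never equal to an invalid one.
        return valid == other.valid && Arrays.equals(id, other.id);
    }

    @Override
    public int hashCode() {
        // Keep hashCode() consistent with equals(): mix in the flag too.
        return 31 * Arrays.hashCode(id) + (valid ? 1 : 0);
    }
}
```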
[jira] [Updated] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-4182:
--------------------------------------
    Attachment: HDFS-4182.txt

This patch fixes an NPE that was found by the existing unit tests and adds some more tests to validate that the changes are working. I also manually brought up a cluster and saw that the NameCache moved out of initializing.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
>
>                 Key: HDFS-4182
>                 URL: https://issues.apache.org/jira/browse/HDFS-4182
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.4, 3.0.0, 2.0.2-alpha
>            Reporter: Todd Lipcon
>            Assignee: Robert Joseph Evans
>            Priority: Critical
>         Attachments: HDFS-4182.txt, HDFS-4182.txt
>
> We recently saw an issue where a 2NN ran out of memory, even though it had a relatively small fsimage. When we looked at the heap dump, we saw that all of the memory had gone to entries in the NameCache. It appears that the NameCache is staying in initializing mode forever, and therefore a long-running 2NN leaks entries.
[jira] [Updated] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-4182:
--------------------------------------
    Attachment: HDFS-4182.txt

Adds the requested annotation.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Updated] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-4182:
--------------------------------------
        Status: Patch Available  (was: Open)

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Updated] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-4182:
--------------------------------------
    Attachment: HDFS-4182.txt

Updated to address Suresh's comments.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Commented] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497363#comment-13497363 ]

Robert Joseph Evans commented on HDFS-4182:
-------------------------------------------

Jenkins came back with a +1, and with a +1 from Suresh and a +1 from Kihwal, I will check this in.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Comment Edited] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497363#comment-13497363 ]

Robert Joseph Evans edited comment on HDFS-4182 at 11/14/12 7:17 PM:
---------------------------------------------------------------------

Jenkins came back with a +1, and with a +1 from Suresh and a +1 from Kihwal, I will check this in.

was (Author: revans2):
Jenkins came back with a +1 and with a +1 for Surash and a +1 for Kihwal, I will check this in.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Commented] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497438#comment-13497438 ]

Robert Joseph Evans commented on HDFS-4182:
-------------------------------------------

The patch is in for trunk and branch-2. I am working on a patch for branch-0.23 because there were merge conflicts. It looks like the leak exists there too.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Updated] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-4182:
--------------------------------------
    Attachment: HDFS-4182-branch-0.23.txt

I am attaching the upmerged patch for review. I am still running the unit tests and manual tests to be sure that the leak is plugged.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Commented] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497529#comment-13497529 ]

Robert Joseph Evans commented on HDFS-4182:
-------------------------------------------

All of the unit tests for branch-0.23 passed, so with Daryn's +1 I'll check this into branch-0.23 too.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Updated] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-4182:
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 0.23.5
                   2.0.3-alpha
                   3.0.0
           Status: Resolved  (was: Patch Available)

I put this into trunk, branch-2, and branch-0.23.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Commented] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496262#comment-13496262 ]

Robert Joseph Evans commented on HDFS-4182:
-------------------------------------------

Todd, are you working on a patch for this? It seems critical enough that I really would like to get a patch in for 0.23.5, but I also don't want to start working on something if you are already doing it.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Updated] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HDFS-4182:
--------------------------------------
    Attachment: HDFS-4182.txt

The patch I am attaching does not include any tests yet; I wanted to see if the direction I was going in seemed OK. I changed FSDirectory.reset to also reset the NameCache and mark the directory as not ready. Then, in the SecondaryNameNode, after loading the new image it informs the FSDirectory that the image was loaded. I am going to run some manual tests and then see if I can write some unit tests for it.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
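The reset-on-image-load approach described above can be sketched like this. `InternCache` is a hypothetical, heavily simplified model of the NameCache's "initializing" mode (the promotion logic is omitted), used only to show why a cache that never leaves initialization leaks and why resetting it on each checkpoint releases the entries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of the NameCache's initializing mode: while
// initializing, every name seen is remembered so frequent ones can later be
// interned. If initialization never ends (as in the long-running 2NN), the
// transient map grows with every checkpoint; resetting it whenever a fresh
// image is about to be loaded releases those entries.
public class InternCache {
    private Map<String, Integer> transientCounts = new HashMap<>();
    private boolean initialized = false;

    public void put(String name) {
        if (!initialized) {
            // Leak source: unbounded growth if initialized never becomes true.
            transientCounts.merge(name, 1, Integer::sum);
        }
    }

    public int transientSize() {
        return transientCounts.size();
    }

    // Called from a (hypothetical) FSDirectory.reset() before each
    // checkpoint image load: drop all transient entries.
    public void reset() {
        transientCounts = new HashMap<>();
        initialized = false;
    }

    // Called once the image has been loaded: stop tracking new names.
    public void imageLoaded() {
        initialized = true;
    }
}
```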
[jira] [Commented] (HDFS-4182) SecondaryNameNode leaks NameCache entries
[ https://issues.apache.org/jira/browse/HDFS-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496586#comment-13496586 ]

Robert Joseph Evans commented on HDFS-4182:
-------------------------------------------

Ya, I thought about disabling the NameCache, because it is not really needed. If you think that would be less of an impact I am happy to switch over to that instead.

> SecondaryNameNode leaks NameCache entries
> -----------------------------------------
[jira] [Commented] (HDFS-4172) namenode does not URI-encode parameters when building URI for datanode request
[ https://issues.apache.org/jira/browse/HDFS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495441#comment-13495441 ]

Robert Joseph Evans commented on HDFS-4172:
-------------------------------------------

I only have two very minor comments:
# There are tabs in the code in a few places (mostly in toValueString()).
# In StringParam.toValueString() it is probably not necessary to call value.toString(). Very minor.

> namenode does not URI-encode parameters when building URI for datanode request
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-4172
>                 URL: https://issues.apache.org/jira/browse/HDFS-4172
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.4
>            Reporter: Derek Dagit
>            Assignee: Derek Dagit
>            Priority: Minor
>         Attachments: HDFS-4172.patch
>
> Param values such as foo&bar or foo=bar are not escaped in Param.toSortedString(). When these are given as, say, token parameter values, a string like token=foo&bar&token=foo=bar is returned.
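The ambiguity described in the issue can be seen with the standard `java.net.URLEncoder`. `buildQuery` below is a hypothetical helper for illustration, not the actual WebHDFS Param code: it shows that percent-encoding the value before joining it into the query string removes the ambiguity.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch of the bug: a parameter value containing '&' or '=' must be
// percent-encoded before it is joined into a query string, or the result
// is ambiguous to the receiving datanode.
public class ParamEncoding {
    static String buildQuery(String name, String value)
            throws UnsupportedEncodingException {
        return name + "=" + URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // Unescaped, token=foo=bar leaves the value ambiguous (is it "foo"
        // or "foo=bar"?); encoded, the '=' inside the value is unmistakable.
        System.out.println(buildQuery("token", "foo=bar")); // token=foo%3Dbar
        System.out.println(buildQuery("token", "foo&bar")); // token=foo%26bar
    }
}
```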
[jira] [Commented] (HDFS-4172) namenode does not URI-encode parameters when building URI for datanode request
[ https://issues.apache.org/jira/browse/HDFS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13495447#comment-13495447 ] Robert Joseph Evans commented on HDFS-4172: --- I am not sure what happened with the test failure. It timed out in Jenkins, but when I run it manually with the patch it passes. I ran it 4 times to be sure. namenode does not URI-encode parameters when building URI for datanode request -- Key: HDFS-4172 URL: https://issues.apache.org/jira/browse/HDFS-4172 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.4 Reporter: Derek Dagit Assignee: Derek Dagit Priority: Minor Attachments: HDFS-4172.patch Param values such as foobar or foo=bar Are not escaped in Param.toSortedString() When these are given as, say, token parameter values, a string like token=foobartoken=foo=bar is returned. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4172) namenode does not URI-encode parameters when building URI for datanode request
[ https://issues.apache.org/jira/browse/HDFS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13495501#comment-13495501 ] Robert Joseph Evans commented on HDFS-4172: --- The new patch looks good to me. I am +1. I'll check it in. namenode does not URI-encode parameters when building URI for datanode request -- Key: HDFS-4172 URL: https://issues.apache.org/jira/browse/HDFS-4172 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.4 Reporter: Derek Dagit Assignee: Derek Dagit Priority: Minor Attachments: HDFS-4172.patch Param values such as foobar or foo=bar Are not escaped in Param.toSortedString() When these are given as, say, token parameter values, a string like token=foobartoken=foo=bar is returned. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4172) namenode does not URI-encode parameters when building URI for datanode request
[ https://issues.apache.org/jira/browse/HDFS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-4172: -- Resolution: Fixed Fix Version/s: 0.23.5 2.0.3-alpha 3.0.0 Status: Resolved (was: Patch Available) Thanks Derek, I put this into trunk, branch-2, and branch-0.23 namenode does not URI-encode parameters when building URI for datanode request -- Key: HDFS-4172 URL: https://issues.apache.org/jira/browse/HDFS-4172 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.4 Reporter: Derek Dagit Assignee: Derek Dagit Priority: Minor Fix For: 3.0.0, 2.0.3-alpha, 0.23.5 Attachments: HDFS-4172.patch Param values such as foobar or foo=bar Are not escaped in Param.toSortedString() When these are given as, say, token parameter values, a string like token=foobartoken=foo=bar is returned. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4172) namenode does not URI-encode parameters when building URI for datanode request
[ https://issues.apache.org/jira/browse/HDFS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13494416#comment-13494416 ] Robert Joseph Evans commented on HDFS-4172: --- The patch looks good to me. +1 pending Jenkins. namenode does not URI-encode parameters when building URI for datanode request -- Key: HDFS-4172 URL: https://issues.apache.org/jira/browse/HDFS-4172 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.4 Reporter: Derek Dagit Assignee: Derek Dagit Priority: Minor Attachments: HDFS-4172.patch Param values such as foobar or foo=bar Are not escaped in Param.toSortedString() When these are given as, say, token parameter values, a string like token=foobartoken=foo=bar is returned. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK
[ https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans reopened HDFS-3809: --- Branch-2 is failing with {noformat} main: [exec] bkjournal.proto:30:12: NamespaceInfoProto is not defined. {noformat} after this was merged in. Please either fix it or revert the change. Make BKJM use protobufs for all serialization with ZK - Key: HDFS-3809 URL: https://issues.apache.org/jira/browse/HDFS-3809 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 3.0.0, 2.0.3-alpha Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, HDFS-3809.diff HDFS uses protobufs for serialization in many places. Protobufs allow fields to be added without breaking bc or requiring new parsing code to be written. For this reason, we should use them in BKJM also. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK
[ https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13487094#comment-13487094 ] Robert Joseph Evans commented on HDFS-3809: --- Thanks for doing that Uma. It looks like there is something about the build scripts that is causing it, because hdfs.proto, where NameSpaceInfoProto is defined, is more or less identical between trunk and branch-2. Make BKJM use protobufs for all serialization with ZK - Key: HDFS-3809 URL: https://issues.apache.org/jira/browse/HDFS-3809 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 3.0.0, 2.0.3-alpha Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, HDFS-3809.diff HDFS uses protobufs for serialization in many places. Protobufs allow fields to be added without breaking bc or requiring new parsing code to be written. For this reason, we should use them in BKJM also. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3996) Add debug log removed in HDFS-3873 back
[ https://issues.apache.org/jira/browse/HDFS-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3996: -- Fix Version/s: 0.23.5 I pulled this into branch-0.23 too. Add debug log removed in HDFS-3873 back --- Key: HDFS-3996 URL: https://issues.apache.org/jira/browse/HDFS-3996 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 2.0.3-alpha, 0.23.5 Attachments: hdfs-3996.txt Per HDFS-3873 let's add the debug log back. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3483) Better error message when hdfs fsck is run against a ViewFS config
[ https://issues.apache.org/jira/browse/HDFS-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3483: -- Fix Version/s: 0.23.5 I pulled this into branch-0.23 too. Better error message when hdfs fsck is run against a ViewFS config -- Key: HDFS-3483 URL: https://issues.apache.org/jira/browse/HDFS-3483 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Stephen Chu Assignee: Stephen Fritz Labels: newbie Fix For: 2.0.3-alpha, 0.23.5 Attachments: core-site.xml, HDFS-3483.patch, hdfs-site.xml I'm running a HA + secure + federated cluster. When I run hdfs fsck /nameservices/ha-nn-uri/, I see the following: bash-3.2$ hdfs fsck /nameservices/ha-nn-uri/ FileSystem is viewfs://oracle/ DFSck exiting. Any path I enter will return the same message. Attached are my core-site.xml and hdfs-site.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4016) back-port HDFS-3582 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477051#comment-13477051 ] Robert Joseph Evans commented on HDFS-4016: --- The patch looks good to me. I am running the unit tests and I will try to bring up a small cluster. If everything goes OK I'll check it in. +1 Thanks for the work Ivan. back-port HDFS-3582 to branch-0.23 -- Key: HDFS-4016 URL: https://issues.apache.org/jira/browse/HDFS-4016 Project: Hadoop HDFS Issue Type: Bug Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Priority: Minor Attachments: HDFS-4016-branch-0.23.patch We suggest a patch that back-ports the change https://issues.apache.org/jira/browse/HDFS-3582 to branch 0.23. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4016) back-port HDFS-3582 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-4016: -- Resolution: Fixed Fix Version/s: 0.23.5 Status: Resolved (was: Patch Available) Thanks again Ivan. I put this into branch-0.23 back-port HDFS-3582 to branch-0.23 -- Key: HDFS-4016 URL: https://issues.apache.org/jira/browse/HDFS-4016 Project: Hadoop HDFS Issue Type: Bug Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Priority: Minor Fix For: 0.23.5 Attachments: HDFS-4016-branch-0.23.patch We suggest a patch that back-ports the change https://issues.apache.org/jira/browse/HDFS-3582 to branch 0.23. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3224) Bug in check for DN re-registration with different storage ID
[ https://issues.apache.org/jira/browse/HDFS-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13474215#comment-13474215 ] Robert Joseph Evans commented on HDFS-3224: --- It looks like a clean port to 0.23. +1 feel free to check it in. Bug in check for DN re-registration with different storage ID - Key: HDFS-3224 URL: https://issues.apache.org/jira/browse/HDFS-3224 Project: Hadoop HDFS Issue Type: Bug Reporter: Eli Collins Assignee: Jason Lowe Priority: Minor Fix For: 2.0.3-alpha Attachments: HDFS-3224-branch0.23.patch, HDFS-3224.patch, HDFS-3224.patch, HDFS-3224.patch, HDFS-3224.patch DatanodeManager#registerDatanode checks the host to node map using an IP:port key, however the map is keyed on IP, so this check will always fail. It's performing the check to determine if a DN with the same IP and storage ID has already registered, and if so to remove this DN from the map and indicate that eg it's no longer hosting these blocks. This bug has been here forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4016) back-port HDFS-3582 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-4016: -- Target Version/s: 0.23.5 back-port HDFS-3582 to branch-0.23 -- Key: HDFS-4016 URL: https://issues.apache.org/jira/browse/HDFS-4016 Project: Hadoop HDFS Issue Type: Bug Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Priority: Minor Attachments: HDFS-4016-branch-0.23.patch We suggest a patch that back-ports the change https://issues.apache.org/jira/browse/HDFS-3582 to branch 0.23. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4016) back-port HDFS-3582 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471652#comment-13471652 ] Robert Joseph Evans commented on HDFS-4016: --- There seems to be a lot in this patch that is not in the original. It looks like you pulled in Time.java so that you could also update GenericTestUtils.waitFor, which is not related to HDFS-3582. In fact, it looks like GenericTestUtils does not need to be updated at all. Please revert it and Time.java, so that if we ever do decide to port HDFS-3641 and others it will not be so confusing or difficult. The same goes for DFSConfigKeys.java: none of the changes in there are used at all. Also, the isActive method inside FSEditLog.java looks like it can still be marked as private. Other than that, the port looks good. back-port HDFS-3582 to branch-0.23 -- Key: HDFS-4016 URL: https://issues.apache.org/jira/browse/HDFS-4016 Project: Hadoop HDFS Issue Type: Bug Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Priority: Minor Attachments: HDFS-4016-branch-0.23.patch We suggest a patch that back-ports the change https://issues.apache.org/jira/browse/HDFS-3582 to branch 0.23.
[jira] [Updated] (HDFS-3919) MiniDFSCluster:waitClusterUp can hang forever
[ https://issues.apache.org/jira/browse/HDFS-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3919: -- Fix Version/s: 0.23.5 I just pulled this into branch-0.23 MiniDFSCluster:waitClusterUp can hang forever - Key: HDFS-3919 URL: https://issues.apache.org/jira/browse/HDFS-3919 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.1-alpha Reporter: Andy Isaacson Assignee: Andy Isaacson Priority: Minor Fix For: 2.0.3-alpha, 0.23.5 Attachments: hdfs3919.txt A test run hung due to a known system config issue, but the hang was interesting: {noformat} 2012-09-11 13:22:41,888 WARN hdfs.MiniDFSCluster (MiniDFSCluster.java:waitClusterUp(925)) - Waiting for the Mini HDFS Cluster to start... 2012-09-11 13:22:42,889 WARN hdfs.MiniDFSCluster (MiniDFSCluster.java:waitClusterUp(925)) - Waiting for the Mini HDFS Cluster to start... 2012-09-11 13:22:43,889 WARN hdfs.MiniDFSCluster (MiniDFSCluster.java:waitClusterUp(925)) - Waiting for the Mini HDFS Cluster to start... 2012-09-11 13:22:44,890 WARN hdfs.MiniDFSCluster (MiniDFSCluster.java:waitClusterUp(925)) - Waiting for the Mini HDFS Cluster to start... {noformat} The MiniDFSCluster should give up after a few seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3373) FileContext HDFS implementation can leak socket caches
[ https://issues.apache.org/jira/browse/HDFS-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465670#comment-13465670 ] Robert Joseph Evans commented on HDFS-3373: --- The 0.23 patch looks like a fairly straightforward port of the trunk version, but what happened to hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSocketCache.java? FileContext HDFS implementation can leak socket caches -- Key: HDFS-3373 URL: https://issues.apache.org/jira/browse/HDFS-3373 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Assignee: John George Fix For: 2.0.3-alpha Attachments: HDFS-3373.branch-23.patch, HDFS-3373.branch23.patch, HDFS-3373.trunk.patch, HDFS-3373.trunk.patch.1, HDFS-3373.trunk.patch.2, HDFS-3373.trunk.patch.3, HDFS-3373.trunk.patch.3, HDFS-3373.trunk.patch.4 As noted by Nicholas in HDFS-3359, FileContext doesn't have a close() method, and thus never calls DFSClient.close(). This means that, until finalizers run, DFSClient will hold on to its SocketCache object and potentially have a lot of outstanding sockets/fds held on to.
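The leak pattern in HDFS-3373 — a cached resource released only in close(), behind an API that exposes no close() — can be modeled in a few lines. This is a toy sketch, not the DFSClient or FileContext code:

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

public class LeakySocketCacheDemo {
    // Toy model of the leak: a client holds a socket cache that is only
    // released in close(). An API with no close() method (like FileContext
    // before this fix) keeps the cache alive until the finalizer
    // eventually runs -- if it ever does.
    static final AtomicInteger openCaches = new AtomicInteger();

    static class Client implements Closeable {
        Client() { openCaches.incrementAndGet(); }
        @Override public void close() { openCaches.decrementAndGet(); }
    }

    public static void main(String[] args) {
        // With an explicit close() available, try-with-resources releases
        // the cache deterministically instead of waiting on finalization.
        try (Client c = new Client()) {
            // work with the client...
        }
        System.out.println(openCaches.get());
    }
}
```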
[jira] [Commented] (HDFS-3373) FileContext HDFS implementation can leak socket caches
[ https://issues.apache.org/jira/browse/HDFS-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465721#comment-13465721 ] Robert Joseph Evans commented on HDFS-3373: --- Makes sense. Because it is such a straightforward patch, I feel OK checking the code in. Thanks for the work John. FileContext HDFS implementation can leak socket caches -- Key: HDFS-3373 URL: https://issues.apache.org/jira/browse/HDFS-3373 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Assignee: John George Fix For: 2.0.3-alpha Attachments: HDFS-3373.branch-23.patch, HDFS-3373.branch23.patch, HDFS-3373.trunk.patch, HDFS-3373.trunk.patch.1, HDFS-3373.trunk.patch.2, HDFS-3373.trunk.patch.3, HDFS-3373.trunk.patch.3, HDFS-3373.trunk.patch.4 As noted by Nicholas in HDFS-3359, FileContext doesn't have a close() method, and thus never calls DFSClient.close(). This means that, until finalizers run, DFSClient will hold on to its SocketCache object and potentially have a lot of outstanding sockets/fds held on to.
[jira] [Updated] (HDFS-3373) FileContext HDFS implementation can leak socket caches
[ https://issues.apache.org/jira/browse/HDFS-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3373: -- Resolution: Fixed Fix Version/s: 0.23.4 Target Version/s: 2.0.0-alpha, 0.23.3 (was: 0.23.3, 2.0.0-alpha) Status: Resolved (was: Patch Available) I pulled this into branch-0.23 too FileContext HDFS implementation can leak socket caches -- Key: HDFS-3373 URL: https://issues.apache.org/jira/browse/HDFS-3373 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Assignee: John George Fix For: 0.23.4, 2.0.3-alpha Attachments: HDFS-3373.branch-23.patch, HDFS-3373.branch23.patch, HDFS-3373.trunk.patch, HDFS-3373.trunk.patch.1, HDFS-3373.trunk.patch.2, HDFS-3373.trunk.patch.3, HDFS-3373.trunk.patch.3, HDFS-3373.trunk.patch.4 As noted by Nicholas in HDFS-3359, FileContext doesn't have a close() method, and thus never calls DFSClient.close(). This means that, until finalizers run, DFSClient will hold on to its SocketCache object and potentially have a lot of outstanding sockets/fds held on to. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3831) Failure to renew tokens due to test-sources left in classpath
[ https://issues.apache.org/jira/browse/HDFS-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13464895#comment-13464895 ] Robert Joseph Evans commented on HDFS-3831: --- The change looks fine to me. It is simple and removes the need for Mockito/Junit from the FakeRenewer. +1 I would like to ultimately see the tests removed from the classpath. But that can happen later as we try to clean up the classpath in general. I'll check this in. Failure to renew tokens due to test-sources left in classpath - Key: HDFS-3831 URL: https://issues.apache.org/jira/browse/HDFS-3831 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: HDFS-3831.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3831) Failure to renew tokens due to test-sources left in classpath
[ https://issues.apache.org/jira/browse/HDFS-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3831: -- Resolution: Fixed Fix Version/s: 2.0.3-alpha 3.0.0 0.23.4 Status: Resolved (was: Patch Available) Thanks Jason, I put this into trunk, branch-2, and branch-0.23 Failure to renew tokens due to test-sources left in classpath - Key: HDFS-3831 URL: https://issues.apache.org/jira/browse/HDFS-3831 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Fix For: 0.23.4, 3.0.0, 2.0.3-alpha Attachments: HDFS-3831.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
[ https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3860: -- Fix Version/s: 0.23.4 I pulled this into branch-0.23 too HeartbeatManager#Monitor may wrongly hold the writelock of namesystem - Key: HDFS-3860 URL: https://issues.apache.org/jira/browse/HDFS-3860 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Fix For: 0.23.4, 2.0.2-alpha Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch In HeartbeatManager#heartbeatCheck, if some dead datanode is found, the monitor thread will acquire the write lock of namesystem, and recheck the safemode. If it is in safemode, the monitor thread will return from the heartbeatCheck function without release the write lock. This may cause the monitor thread wrongly holding the write lock forever. The attached test case tries to simulate this bad scenario. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3626) Creating file with invalid path can corrupt edit log
[ https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3626: -- Fix Version/s: (was: 0.23.3) 0.23.4 Creating file with invalid path can corrupt edit log Key: HDFS-3626 URL: https://issues.apache.org/jira/browse/HDFS-3626 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 2.0.0-alpha Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.23.4, 3.0.0, 2.0.2-alpha Attachments: hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt Joris Bontje reports the following: The following command results in a corrupt NN editlog (note the double slash and reading from stdin): $ cat /usr/share/dict/words | hadoop fs -put - hdfs://localhost:8020//path/file After this, restarting the namenode will result into the following fatal exception: {code} 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188 expecting start txid #173 2012-07-10 06:29:19,912 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation MkdirOp [length=0, path=/, timestamp=1341915658216, permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182] java.lang.ArrayIndexOutOfBoundsException: -1 {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
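The trigger for HDFS-3626 is a path containing a doubled slash (`//path/file`) reaching the edit log and producing a bogus MkdirOp for `/`. A minimal sketch of the kind of normalization check that prevents this class of bug — illustrative only, not the actual NameNode path-validation code:

```java
public class PathNormalizer {
    // Hypothetical illustration: collapse repeated separators before a
    // path reaches the edit log, so "//path/file" cannot be logged as a
    // directory operation on "/".
    static String normalize(String path) {
        String collapsed = path.replaceAll("/+", "/");
        // keep the root itself, but strip any other trailing slash
        if (collapsed.length() > 1 && collapsed.endsWith("/")) {
            collapsed = collapsed.substring(0, collapsed.length() - 1);
        }
        return collapsed;
    }

    public static void main(String[] args) {
        System.out.println(normalize("//path/file"));
    }
}
```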
[jira] [Updated] (HDFS-3553) Hftp proxy tokens are broken
[ https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3553: -- Fix Version/s: (was: 0.23.3) 0.23.4 Hftp proxy tokens are broken Key: HDFS-3553 URL: https://issues.apache.org/jira/browse/HDFS-3553 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Fix For: 0.23.4, 3.0.0, 2.0.2-alpha Attachments: HDFS-3553-1.branch-1.0.patch, HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, HDFS-3553.branch-1.0.patch, HDFS-3553.branch-23.patch, HDFS-3553.trunk.patch Proxy tokens are broken for hftp. The impact is systems using proxy tokens, such as oozie jobs, cannot use hftp. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3108) [UI] Few Namenode links are not working
[ https://issues.apache.org/jira/browse/HDFS-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3108: -- Fix Version/s: (was: 0.23.3) 0.23.4 [UI] Few Namenode links are not working --- Key: HDFS-3108 URL: https://issues.apache.org/jira/browse/HDFS-3108 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.23.1 Reporter: Brahma Reddy Battula Priority: Minor Fix For: 0.23.4 Attachments: Scenario2_Trace.txt Scenario 1 == Once tail a file from UI and click on Go Back to File View,I am getting HTTP ERROR 404 Scenario 2 === Frequently I am getting following execption If a click on (BrowseFileSystem or anyfile)java.lang.IllegalArgumentException: java.net.UnknownHostException: HOST-10-18-40-24 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3087) Decomissioning on NN restart can complete without blocks being replicated
[ https://issues.apache.org/jira/browse/HDFS-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3087: -- Fix Version/s: (was: 0.23.3) 0.23.4 Decomissioning on NN restart can complete without blocks being replicated - Key: HDFS-3087 URL: https://issues.apache.org/jira/browse/HDFS-3087 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.0, 0.24.0, 0.23.2, 0.23.4 If a data node is added to the exclude list and the name node is restarted, the decomissioning happens right away on the data node registration. At this point the initial block report has not been sent, so the name node thinks the node has zero blocks and the decomissioning completes very quick, without replicating the blocks on that node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3973) Old trash directories are never deleted on upgrade from 1.x
Robert Joseph Evans created HDFS-3973: - Summary: Old trash directories are never deleted on upgrade from 1.x Key: HDFS-3973 URL: https://issues.apache.org/jira/browse/HDFS-3973 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3, 2.0.2-alpha Reporter: Robert Joseph Evans The older format of the trash checkpoint for 1.x is yyMMddHHmm; the new format is yyMMddHHmmss(-\d+)?. So if you upgrade from an old cluster to a new one, all of the entries in .trash will never be deleted, because they are currently always ignored on deletion. We should support deleting the older format as well.
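The two checkpoint-directory formats from HDFS-3973 can be matched side by side. This is a sketch of the recognition logic only — hypothetical class and method names, not the actual Trash emptier code:

```java
import java.util.regex.Pattern;

public class TrashCheckpointFormats {
    // Old (1.x) checkpoint directory names look like yyMMddHHmm
    // (10 digits); newer ones look like yyMMddHHmmss with an optional
    // -N suffix. A deleter that honors both formats could match names
    // against these patterns.
    static final Pattern OLD_FORMAT = Pattern.compile("^\\d{10}$");
    static final Pattern NEW_FORMAT = Pattern.compile("^\\d{12}(-\\d+)?$");

    static boolean isCheckpoint(String name) {
        return OLD_FORMAT.matcher(name).matches()
            || NEW_FORMAT.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCheckpoint("1210011430"));     // old format
        System.out.println(isCheckpoint("121001143059-1")); // new format
    }
}
```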
[jira] [Commented] (HDFS-3971) Add a resume feature to the copyFromLocal and put commands
[ https://issues.apache.org/jira/browse/HDFS-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463192#comment-13463192 ] Robert Joseph Evans commented on HDFS-3971: --- It almost sounds like you want to turn this into something like rsync. I think it would be much more useful to just add in an rsync command with a similar set of features and flags rather than trying to reinvent it piecemeal. Then it can look at time stamps on the files, and possibly checksums as well, to pick up where it left off on a failure. Add a resume feature to the copyFromLocal and put commands -- Key: HDFS-3971 URL: https://issues.apache.org/jira/browse/HDFS-3971 Project: Hadoop HDFS Issue Type: New Feature Components: tools Affects Versions: 2.0.1-alpha Reporter: Adam Muise Priority: Minor Fix For: 2.0.1-alpha Add a resume feature to the copyFromLocal command. Failures in large transfers result in a great deal of wasted time. For large files, it would be good to be able to continue from the last good block onwards. The file would have to be unavailable to other clients for reads or regular writes until the resume process was completed.
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448781#comment-13448781 ] Robert Joseph Evans commented on HDFS-3731: --- Thanks for reassigning this to me. I have been distracted by a number of other things, but I should get back to it shortly. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Robert Joseph Evans Priority: Blocker Fix For: 2.2.0-alpha Attachments: hadoop1-bbw.tgz, HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory.
[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails
[ https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13446386#comment-13446386 ] Robert Joseph Evans commented on HDFS-3873: --- It looks good to me. I am not really an expert on HFTP, but this is a simple enough change that I feel OK giving it a +1; please use your discretion before checking it in. I am not sure why Jenkins ran the tests again and they failed, but when I run them with your patch they pass, except for TestHftpDelegationToken, which is a known issue. Hftp assumes security is disabled if token fetch fails -- Key: HDFS-3873 URL: https://issues.apache.org/jira/browse/HDFS-3873 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch Hftp ignores all exceptions generated while trying to get a token, based on the assumption that it means security is disabled. Debugging problems is excruciatingly difficult when security is enabled but something goes wrong. Job submissions succeed, but tasks fail because the NN rejects the user as unauthenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13443221#comment-13443221 ] Robert Joseph Evans commented on HDFS-3731: --- Any update on branch-0.23? Do you want me to look into it? 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Fix For: 2.2.0-alpha Attachments: hadoop1-bbw.tgz, HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13443551#comment-13443551 ] Robert Joseph Evans commented on HDFS-3731: --- Do you have a list of ones you know about? If not I can start pulling on that thread tomorrow. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Fix For: 2.2.0-alpha Attachments: hadoop1-bbw.tgz, HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3841) Port HDFS-3835 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans resolved HDFS-3841. --- Resolution: Fixed Fix Version/s: 0.23.3 Daryn just checked this in to branch-0.23 Port HDFS-3835 to branch-0.23 - Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Fix For: 0.23.3 Attachments: HDFS-3841.txt, HDFS-3841.txt, HDFS-3841.txt HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3841) Port HDFS-3835 to branch-0.23
Robert Joseph Evans created HDFS-3841: - Summary: Port HDFS-3835 to branch-0.23 Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3841) Port HDFS-3835 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3841: -- Attachment: HDFS-3841.txt This patch only applies to branch-0.23. The main difference between this patch and HDFS-3835 is that the DelegationTokenSecretManager is in a different location, so FSImage was modified to use the new location. Also, the tests do not compile because HDFS-2579 is not part of 0.23, so DFS_NAMENODE_DELEGATION_TOKEN_ALWAYS_USE_KEY does not work. In response I removed the test. It seemed riskier to try to pull out DFS_NAMENODE_DELEGATION_TOKEN_ALWAYS_USE_KEY support than to simply remove the test. Port HDFS-3835 to branch-0.23 - Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3841.txt HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3841) Port HDFS-3835 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3841: -- Status: Patch Available (was: Open) test-patch is not going to work. I have run several of the HDFS tests manually without any failures. I will update the JIRA once my test run completes. Port HDFS-3835 to branch-0.23 - Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3841.txt HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3841) Port HDFS-3835 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439823#comment-13439823 ] Robert Joseph Evans commented on HDFS-3841: --- You are correct, my bad. I commented it out to validate that it runs, and I forgot to remove it. I'll upload a new patch. Thanks for the catch. Port HDFS-3835 to branch-0.23 - Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3841.txt HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3841) Port HDFS-3835 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3841: -- Attachment: HDFS-3841.txt Patch updated without test. Port HDFS-3835 to branch-0.23 - Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3841.txt, HDFS-3841.txt HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3841) Port HDFS-3835 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3841: -- Attachment: HDFS-3841.txt Patch with space between if and ( Port HDFS-3835 to branch-0.23 - Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3841.txt, HDFS-3841.txt, HDFS-3841.txt HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3843) Large dist cache can block tasktracker heartbeat
Robert Joseph Evans created HDFS-3843: - Summary: Large dist cache can block tasktracker heartbeat Key: HDFS-3843 URL: https://issues.apache.org/jira/browse/HDFS-3843 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.0.0, 0.20.205.0 Reporter: Robert Joseph Evans -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3843) Large dist cache can block tasktracker heartbeat
[ https://issues.apache.org/jira/browse/HDFS-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439874#comment-13439874 ] Robert Joseph Evans commented on HDFS-3843: --- MAPREDUCE-2494 introduced a new lock when releasing a dist cache entry, and that lock introduced this problem. Thanks to Koji for finding and debugging this. Essentially the heartbeat thread holds a lock on the TaskTracker object. So does the job cleanup thread, which also holds the TrackerDistributedCacheManager's big list lock (this is the lock that MAPREDUCE-2494 added). The thread that deletes things from the dist cache also grabs that big lock, and at the same time grabs locks in turn for every entry in the dist cache. While an entry in the dist cache is being downloaded, the downloading thread also holds the lock for that dist cache entry. So this can result in the following chain: the downloading thread holds a dist cache entry lock, which blocks the dist cache delete thread, which holds the full dist cache map lock, which blocks the job cleanup thread, which holds the TaskTracker lock, which blocks the heartbeat thread. This can be seen below. I think it is probably best to change the dist cache entries' locks so that, when we go to delete an entry whose lock is held, we skip that entry instead of blocking on it. {noformat} Here, tracing from the heartbeat thread. 
1= main prio=10 tid=0x0875c400 nid=0x3fca waiting for monitor entry [0xf73e6000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1790) - waiting to lock 0xb4299248 (a org.apache.hadoop.mapred.TaskTracker) at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653) at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503) at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744) Looking for lock 0xb4299248 2= taskCleanup daemon prio=10 tid=0x0949ac00 nid=0x405c waiting for monitor entry [0xadead000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.filecache.TrackerDistributedCacheManager$CacheStatus.decRefCount(TrackerDistributedCacheManager.java:597) - waiting to lock 0xb4214308 (a java.util.LinkedHashMap) at org.apache.hadoop.filecache.TrackerDistributedCacheManager.releaseCache(TrackerDistributedCacheManager.java:233) at org.apache.hadoop.filecache.TaskDistributedCacheManager.release(TaskDistributedCacheManager.java:254) at org.apache.hadoop.mapred.TaskTracker.purgeJob(TaskTracker.java:2066) - locked 0xb51e5d78 (a org.apache.hadoop.mapred.TaskTracker$RunningJob) - locked 0xb4299248 (a org.apache.hadoop.mapred.TaskTracker) at org.apache.hadoop.mapred.TaskTracker$1.run(TaskTracker.java:439) at java.lang.Thread.run(Thread.java:619) Looking for the lock 0xb4214308 3= Thread-27 prio=10 tid=0xae501400 nid=0x4021 waiting for monitor entry [0xae4ad000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.filecache.TrackerDistributedCacheManager$BaseDirManager.checkAndCleanup(TrackerDistributedCacheManager.java:1019) - waiting to lock 0xb52776c0 (a org.apache.hadoop.filecache.TrackerDistributedCacheManager$CacheStatus) - locked 0xb4214308 (a java.util.LinkedHashMap) at org.apache.hadoop.filecache.TrackerDistributedCacheManager$CleanupThread.run(TrackerDistributedCacheManager.java:948) Looking for the lock 0xb52776c0 
4= Thread-187419 daemon prio=10 tid=0xaa103400 nid=0x3758 runnable [0xad75c000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked 0xb52998d0 (a sun.nio.ch.Util$1) - locked 0xb52998e0 (a java.util.Collections$UnmodifiableSet) - locked 0xb5299880 (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) at java.io.BufferedInputStream.read(BufferedInputStream.java:317) - locked 0xb5505ec8 (a java.io.BufferedInputStream) at java.io.DataInputStream.read(DataInputStream.java:132) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:153) at
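The proposed fix — skip cache entries whose lock is currently held instead of blocking on them — can be sketched with {{ReentrantLock.tryLock()}}. The class and field names below are illustrative stand-ins, not the actual TrackerDistributedCacheManager internals:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: the cleanup thread tries each entry's lock and skips entries that
// are busy (e.g. being downloaded), so it never joins the blocking chain
// that ends at the heartbeat thread.
public class CacheCleanupSketch {
    static class CacheEntry {
        final ReentrantLock lock = new ReentrantLock();
        volatile boolean deleted = false;
    }

    static int cleanup(Map<String, CacheEntry> cache) {
        int cleaned = 0;
        for (CacheEntry e : cache.values()) {
            if (e.lock.tryLock()) {       // acquire only if uncontended
                try {
                    e.deleted = true;     // stand-in for the on-disk deletion
                    cleaned++;
                } finally {
                    e.lock.unlock();
                }
            }
            // else: entry is busy; leave it for the next cleanup pass
        }
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, CacheEntry> cache = new ConcurrentHashMap<>();
        CacheEntry idle = new CacheEntry();
        CacheEntry busy = new CacheEntry();
        busy.lock.lock();                 // simulate an in-progress download
        cache.put("idle", idle);
        cache.put("busy", busy);
        System.out.println(cleanup(cache)); // prints 1: busy entry skipped
    }
}
```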
[jira] [Commented] (HDFS-3843) Large dist cache can block tasktracker heartbeat
[ https://issues.apache.org/jira/browse/HDFS-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439877#comment-13439877 ] Robert Joseph Evans commented on HDFS-3843: --- I forgot to add in that I tested this on 0.23, and mrv2 does not have this issue at all. I added in a dist cache entry that takes 30 min to download and the job succeeded. Large dist cache can block tasktracker heartbeat Key: HDFS-3843 URL: https://issues.apache.org/jira/browse/HDFS-3843 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.205.0, 1.0.0 Reporter: Robert Joseph Evans -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3841) Port HDFS-3835 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439895#comment-13439895 ] Robert Joseph Evans commented on HDFS-3841: --- Thanks for the reviews. The HDFS unit tests all pass. I asked Daryn to check it in when he gets a chance. Port HDFS-3835 to branch-0.23 - Key: HDFS-3841 URL: https://issues.apache.org/jira/browse/HDFS-3841 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3841.txt, HDFS-3841.txt, HDFS-3841.txt HDFS-3835 does not cleanly merge into branch-0.23. This is to port it over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2745) unclear to users which command to use to access the filesystem
[ https://issues.apache.org/jira/browse/HDFS-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438027#comment-13438027 ] Robert Joseph Evans commented on HDFS-2745: --- The changes look good to me +1, non-binding. unclear to users which command to use to access the filesystem -- Key: HDFS-2745 URL: https://issues.apache.org/jira/browse/HDFS-2745 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0, 1.2.0, 2.2.0-alpha Reporter: Thomas Graves Assignee: Andrew Wang Priority: Critical Attachments: hdfs-2745-1.patch It's unclear to users which command to use to access the filesystem. Need some background and then we can fix accordingly. We have 3 choices: hadoop dfs - says it's deprecated and to use hdfs. If I run hdfs usage it doesn't list any options like -ls in the usage, although there is an hdfs dfs command hdfs dfs - not in the usage of hdfs. If we recommend it when running hadoop dfs it should at least be in the usage. hadoop fs - seems like the one to use; it appears generic for any filesystem. Any input on what is the recommended way to do this? Based on that we can fix up the other issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434190#comment-13434190 ] Robert Joseph Evans commented on HDFS-3731: --- I am not an HDFS expert but the patch looks good to me. +1 non-binding. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431254#comment-13431254 ] Robert Joseph Evans commented on HDFS-3731: --- Is there any update on this? There has been no activity for about a week, and this seems fairly critical to fix. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3751) DN should log warnings for lengthy disk IOs
[ https://issues.apache.org/jira/browse/HDFS-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427320#comment-13427320 ] Robert Joseph Evans commented on HDFS-3751: --- If we are collecting this data to be able to output a warning, it would be good to also keep metrics for each disk. This would potentially give us the ability in the future to have an admin look at the disk metrics and look for outliers. They could then investigate further and possibly remove the failing disk. DN should log warnings for lengthy disk IOs --- Key: HDFS-3751 URL: https://issues.apache.org/jira/browse/HDFS-3751 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 1.2.0, 2.1.0-alpha Reporter: Todd Lipcon Assignee: Colin Patrick McCabe Occasionally failing disks or other OS-and-below issues cause a single IO to take tens of seconds, or even minutes in the case of failures. This often results in timeout exceptions at the client side which are hard to diagnose. It would be easier to root-cause these issues if the DN logged a WARN like IO of 64kb to volume /data/1/dfs/dn for block 12345234 client 1.2.3.4 took 61.3 seconds or somesuch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
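The per-disk metrics idea above could look something like this sketch: record each IO's latency per volume, emit the WARN past a threshold, and keep a running maximum so outlier disks stand out. All names here are illustrative, not actual DataNode APIs:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-volume latency tracker, not a real Hadoop class.
public class DiskLatencySketch {
    static final long WARN_THRESHOLD_MS = 30_000;
    static final Map<String, Long> maxLatencyMs = new ConcurrentHashMap<>();

    static void recordIo(String volume, long elapsedMs) {
        maxLatencyMs.merge(volume, elapsedMs, Math::max); // keep the worst case
        if (elapsedMs > WARN_THRESHOLD_MS) {
            System.err.println("WARN: IO to volume " + volume
                + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) {
        recordIo("/data/1/dfs/dn", 12);
        recordIo("/data/1/dfs/dn", 61_300); // triggers the warning
        recordIo("/data/2/dfs/dn", 9);
        // An admin scanning these maxima would flag /data/1 as the outlier.
        System.out.println(maxLatencyMs);
    }
}
```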
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425798#comment-13425798 ] Robert Joseph Evans commented on HDFS-3731: --- I thought that hardlinks to directories are not typically supported. HFS+ on the Mac is the only filesystem I know of that allows them. I am nervous about implementing an upgrade path that will only work on a Mac. Did you actually mean a symbolic link, or did you intend to hardlink all of the files in the directories? 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Todd Lipcon Priority: Blocker Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
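The per-file alternative raised in the comment above — hardlink each block file rather than the directory — can be sketched with standard java.nio.file calls. The class and method names are illustrative, not the actual DataNode upgrade code:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: hardlink every regular file from a source directory (e.g. a 1.x
// blocksBeingWritten dir) into a destination directory (e.g. the block
// pool's rbw dir). Directory hardlinks are avoided entirely.
public class HardlinkSketch {
    static int linkAll(Path srcDir, Path dstDir) throws IOException {
        Files.createDirectories(dstDir);
        int linked = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(srcDir)) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) {
                    Files.createLink(dstDir.resolve(f.getFileName()), f);
                    linked++;
                }
            }
        }
        return linked;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempDirectory("bbw");
        Path dst = src.resolveSibling(src.getFileName() + "-rbw");
        Files.write(src.resolve("blk_1"), new byte[]{1, 2, 3});
        System.out.println(linkAll(src, dst)); // prints 1
    }
}
```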
[jira] [Commented] (HDFS-3696) Create files with WebHdfsFileSystem goes OOM when file size is big
[ https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423124#comment-13423124 ] Robert Joseph Evans commented on HDFS-3696: --- Thanks for the patch for branch-0.23. +1 (non-binding) for it. I reviewed the change and ran the tests. Create files with WebHdfsFileSystem goes OOM when file size is big -- Key: HDFS-3696 URL: https://issues.apache.org/jira/browse/HDFS-3696 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Kihwal Lee Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Fix For: 0.23.3 Attachments: h3696_20120724.patch, h3696_20120724_0.23.patch, h3696_20120724_b-1.patch When doing fs -put to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM if the file size is large. When I tested, 20MB files were fine, but 200MB didn't work. I also tried reading a large file by issuing -cat and piping to a slow sink in order to force buffering. The read path didn't have this problem. The memory consumption stayed the same regardless of progress. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423182#comment-13423182 ] Robert Joseph Evans commented on HDFS-3731: --- I am a bit confused by this, as I am not an expert on HDFS. I am mainly concerned with whether this impacts 0.23 (I assume it does) and, if so, what that impact is. Does it mean that the datanode could drop the last block from a file because that block is in a bbw file as the datanode is upgraded? You mention HBase here; does this only impact a block that is being written with hsync? 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Priority: Blocker Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423303#comment-13423303 ] Robert Joseph Evans commented on HDFS-3731: --- Thanks for the clarification Suresh. If it is simple to put this into 0.23 I really would appreciate it. If not I can do the porting myself when the time comes. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Priority: Blocker Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13421678#comment-13421678 ] Robert Joseph Evans commented on HDFS-3667: --- Nicholas, If it is too much of a pain to separate them, that is OK. I want the OOM fix in 0.23, but I realize that is not a priority for a lot of others and I can port it over myself once this goes into branch-2. Add retry support to WebHdfsFileSystem -- Key: HDFS-3667 URL: https://issues.apache.org/jira/browse/HDFS-3667 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3667_20120718.patch, h3667_20120721.patch, h3667_20120722.patch DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it retries on exceptions such as connection failure, safemode. WebHdfsFileSystem should have similar retry support. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3626) Creating file with invalid path can corrupt edit log
[ https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417343#comment-13417343 ] Robert Joseph Evans commented on HDFS-3626: --- This is especially true with ViewFs. A symbolic link for one client could point to a totally different file/directory for another client. Creating file with invalid path can corrupt edit log Key: HDFS-3626 URL: https://issues.apache.org/jira/browse/HDFS-3626 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 2.0.0-alpha Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Attachments: hdfs-3626.txt, hdfs-3626.txt Joris Bontje reports the following: The following command results in a corrupt NN editlog (note the double slash and reading from stdin): $ cat /usr/share/dict/words | hadoop fs -put - hdfs://localhost:8020//path/file After this, restarting the namenode will result into the following fatal exception: {code} 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188 expecting start txid #173 2012-07-10 06:29:19,912 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation MkdirOp [length=0, path=/, timestamp=1341915658216, permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182] java.lang.ArrayIndexOutOfBoundsException: -1 {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3318) Hftp hangs on transfers >2GB
[ https://issues.apache.org/jira/browse/HDFS-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415140#comment-13415140 ] Robert Joseph Evans commented on HDFS-3318: --- Yes this probably also impacts branch-1. Hftp hangs on transfers >2GB Key: HDFS-3318 URL: https://issues.apache.org/jira/browse/HDFS-3318 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0, 0.23.3, 2.0.0-alpha Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Fix For: 0.23.3 Attachments: HDFS-3318-1.patch, HDFS-3318.patch Hftp transfers >2GB hang after the transfer is complete. The problem appears to be caused by java internally using an int for the content length. When it overflows 2GB, it won't check the bounds of the reads on the input stream. The client continues reading after all data is received, and the client blocks until the server times out the connection -- _many_ minutes later. In conjunction with hftp timeouts, all transfers >2G fail with a read timeout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
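The overflow the description blames can be shown with a few lines of arithmetic. This is a sketch, not the Hftp client's actual code — the names are invented — but it illustrates both the bug (an int silently wrapping past 2 GB) and the fix pattern (tracking the expected length in a long and stopping once it is consumed):

```java
// Sketch: why an int content length breaks transfers over 2 GB.
public class ContentLengthOverflow {
    // A 3 GB transfer does not fit in a signed 32-bit int.
    static final long THREE_GB = 3L * 1024 * 1024 * 1024;

    // What an int-based length field effectively does: silently wraps
    // negative once the value exceeds 2^31 - 1, defeating bounds checks.
    static int asInt(long contentLength) {
        return (int) contentLength;
    }

    // Fix pattern: keep the expected length in a long and stop reading
    // once that many bytes have been consumed, instead of waiting for
    // the server to time out the connection.
    static boolean shouldKeepReading(long bytesReadSoFar, long contentLength) {
        return bytesReadSoFar < contentLength;
    }
}
```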
[jira] [Commented] (HDFS-3622) Backport HDFS-3541 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13415197#comment-13415197 ] Robert Joseph Evans commented on HDFS-3622: --- HDFS-3541 also depends on HDFS-2878, so I am going to include that here too. It is just a fix to some tests. Backport HDFS-3541 to branch-0.23 - Key: HDFS-3622 URL: https://issues.apache.org/jira/browse/HDFS-3622 Project: Hadoop HDFS Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans HDFS-3541 Deadlock between recovery, xceiver and packet responder does not apply directly to branch-0.23, but the bug exists there too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB
[ https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3577: -- Target Version/s: 1.1.0, 0.23.3, 2.1.0-alpha (was: 1.1.0, 2.1.0-alpha) Affects Version/s: 0.23.3 This impacts branch-0.23 as well. I really would like to see whatever fix happens go into branch-0.23 as well. I applied the latest patch and it looks to apply fairly cleanly. If it does not apply cleanly when the final fix is checked in, I will be happy to port it. WebHdfsFileSystem can not read files larger than 24KB - Key: HDFS-3577 URL: https://issues.apache.org/jira/browse/HDFS-3577 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Tsz Wo (Nicholas), SZE Priority: Blocker Attachments: h3577_20120705.patch, h3577_20120708.patch, h3577_20120714.patch If reading a file large enough for which the httpserver running webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of webhdfs), then the WebHdfsFileSystem client fails with an IOException with message *Content-Length header is missing*. It looks like WebHdfsFileSystem is delegating opening of the inputstream to the *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length* header, but when using chunked transfer encoding the *Content-Length* header is not present and the *URLOpener.openInputStream()* method throws an exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
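The failure mode — unconditionally requiring Content-Length even though chunked responses legitimately omit it — can be sketched like this. The header handling below is hypothetical; the real ByteRangeInputStream.URLOpener differs in detail, but the decision it gets wrong is the same:

```java
// Sketch of header handling that tolerates chunked transfer encoding.
import java.util.Map;

public class ResponseLength {
    /**
     * Returns the expected stream length, or -1 when the response uses
     * chunked transfer encoding and the length is legitimately unknown.
     * Only when neither header is present is it a genuine error.
     */
    static long expectedLength(Map<String, String> headers) {
        String te = headers.get("Transfer-Encoding");
        if (te != null && te.equalsIgnoreCase("chunked")) {
            return -1L; // length unknown until the stream ends; do not fail
        }
        String cl = headers.get("Content-Length");
        if (cl == null) {
            throw new IllegalStateException("Content-Length header is missing");
        }
        return Long.parseLong(cl);
    }
}
```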
[jira] [Updated] (HDFS-3622) Backport HDFS-3541 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3622: -- Attachment: HDFS-3622.txt Backport HDFS-3541 to branch-0.23 - Key: HDFS-3622 URL: https://issues.apache.org/jira/browse/HDFS-3622 Project: Hadoop HDFS Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3622.txt HDFS-3541 Deadlock between recovery, xceiver and packet responder does not apply directly to branch-0.23, but the bug exists there too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3622) Backport HDFS-3541 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3622: -- Status: Patch Available (was: Open) Backport HDFS-3541 to branch-0.23 - Key: HDFS-3622 URL: https://issues.apache.org/jira/browse/HDFS-3622 Project: Hadoop HDFS Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3622.txt HDFS-3541 Deadlock between recovery, xceiver and packet responder does not apply directly to branch-0.23, but the bug exists there too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3622) Backport HDFS-3541 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13415355#comment-13415355 ] Robert Joseph Evans commented on HDFS-3622: --- I ran all of the HDFS tests on branch-0.23 and they all pass. Backport HDFS-3541 to branch-0.23 - Key: HDFS-3622 URL: https://issues.apache.org/jira/browse/HDFS-3622 Project: Hadoop HDFS Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3622.txt HDFS-3541 Deadlock between recovery, xceiver and packet responder does not apply directly to branch-0.23, but the bug exists there too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3486) offlineimageviewer can't read fsimage files that contain persistent delegation tokens
[ https://issues.apache.org/jira/browse/HDFS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3486: -- Fix Version/s: 0.23.3 offlineimageviewer can't read fsimage files that contain persistent delegation tokens - Key: HDFS-3486 URL: https://issues.apache.org/jira/browse/HDFS-3486 Project: Hadoop HDFS Issue Type: Bug Components: security, tools Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 0.23.3, 2.0.1-alpha Attachments: HDFS-3486.001.patch, HDFS-3486.002.patch OfflineImageViewer (oiv) crashes when trying to read fsimage files that contain persistent delegation tokens. Example stack trace: {code} Caused by: java.lang.IndexOutOfBoundsException at java.io.DataInputStream.readFully(DataInputStream.java:175) at org.apache.hadoop.io.Text.readFields(Text.java:284) at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier.readFields(AbstractDelegationTokenIdentifier.java:178) at org.apache.hadoop.hdfs.tools.offlineImageViewer.ImageLoaderCurrent.processDelegationTokens(ImageLoaderCurrent.java:222) at org.apache.hadoop.hdfs.tools.offlineImageViewer.ImageLoaderCurrent.loadImage(ImageLoaderCurrent.java:186) at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.go(OfflineImageViewer.java:129) {code} The oiv and loadFSImage code paths are separate. The issue here seems to be that the loadFSImage code path has diverged from the oiv code path. 
On the loadFSImage code path (from FSImageFormat#loadCurrentTokens): {code} /** * Private helper methods to load Delegation tokens from fsimage */ private synchronized void loadCurrentTokens(DataInputStream in) throws IOException { int numberOfTokens = in.readInt(); for (int i = 0; i < numberOfTokens; i++) { DelegationTokenIdentifier id = new DelegationTokenIdentifier(); id.readFields(in); long expiryTime = in.readLong(); addPersistedDelegationToken(id, expiryTime); } } {code} Notice how it loads an 8-byte expiry long after every DelegationTokenIdentifier. On the oiv code path (from ImageLoaderCurrent#processDelegationTokens): {code} int numDTokens = in.readInt(); v.visitEnclosingElement(ImageElement.DELEGATION_TOKENS, ImageElement.NUM_DELEGATION_TOKENS, numDTokens); for(int i=0; i<numDTokens; i++){ DelegationTokenIdentifier id = new DelegationTokenIdentifier(); id.readFields(in); v.visit(ImageElement.DELEGATION_TOKEN_IDENTIFIER, id.toString()); } {code} Notice how it does *not* load that 8-byte expiry long after every DelegationTokenIdentifier. This bug seems to have been introduced by change 916534, the same change which introduced persistent delegation tokens. So I don't think oiv was ever able to decode them in the past. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
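The fix implied by the two excerpts is for the viewer to consume the expiry long as well; skipping it leaves every subsequent read off by 8 bytes per token. A self-contained round-trip sketch (stub types, not Hadoop's — identifiers are reduced to a single int to keep the example short):

```java
// Round-trip sketch of the record layout: (int id, long expiry) pairs.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class TokenStreamSkew {
    // Serialize numTokens records of (int id, long expiry), matching the
    // layout loadCurrentTokens writes and expects.
    static byte[] writeTokens(int numTokens) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(numTokens);
            for (int i = 0; i < numTokens; i++) {
                out.writeInt(i);          // stand-in for the identifier
                out.writeLong(1000L + i); // the expiry the oiv path skipped
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Correct reader: mirrors loadCurrentTokens by consuming the expiry.
    static long readLastExpiry(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int n = in.readInt();
            long expiry = -1;
            for (int i = 0; i < n; i++) {
                in.readInt();           // identifier
                expiry = in.readLong(); // the 8 bytes oiv forgot to read
            }
            return expiry;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```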
[jira] [Updated] (HDFS-2978) The NameNode should expose name dir statuses via JMX
[ https://issues.apache.org/jira/browse/HDFS-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-2978: -- Fix Version/s: 0.23.3 The NameNode should expose name dir statuses via JMX Key: HDFS-2978 URL: https://issues.apache.org/jira/browse/HDFS-2978 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.23.0, 1.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 1.0.2, 0.23.3, 2.0.0-alpha Attachments: HDFS-2978-branch-1.patch, HDFS-2978.patch, HDFS-2978.patch We currently display this info on the NN web UI, so users who wish to monitor this must either do it manually or parse HTML. We should publish this information via JMX. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3581) FSPermissionChecker#checkPermission sticky bit check missing range check
[ https://issues.apache.org/jira/browse/HDFS-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3581: -- Fix Version/s: 0.23.3 FSPermissionChecker#checkPermission sticky bit check missing range check - Key: HDFS-3581 URL: https://issues.apache.org/jira/browse/HDFS-3581 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 0.23.3, 2.0.1-alpha Attachments: hdfs-3581.txt The checkStickyBit call in FSPermissionChecker#checkPermission is missing a range check which results in an index out of bounds when accessing root. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
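The bug class here — indexing the parent of the last path component without first checking the path is deep enough — underflows for the root path. An illustrative sketch only (FSPermissionChecker actually walks an INode[] array; these names are invented):

```java
// Sketch of the missing range check: for "/" the parent index is -1.
public class StickyBitCheck {
    // The "parent of the last component" index is pathLength - 2, which
    // is negative when the path is just the root. This is the check the
    // original code was missing.
    static boolean parentIndexInRange(int pathLength) {
        return pathLength - 2 >= 0;
    }

    static String parentOf(String[] components) {
        if (!parentIndexInRange(components.length)) {
            return null; // root has no parent; nothing to sticky-check
        }
        return components[components.length - 2];
    }
}
```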
[jira] [Created] (HDFS-3622) Backport HDFS-3541 to branch-0.23
Robert Joseph Evans created HDFS-3622: - Summary: Backport HDFS-3541 to branch-0.23 Key: HDFS-3622 URL: https://issues.apache.org/jira/browse/HDFS-3622 Project: Hadoop HDFS Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans HDFS-3541 Deadlock between recovery, xceiver and packet responder does not apply directly to branch-0.23, but the bug exists there too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409849#comment-13409849 ] Robert Joseph Evans commented on HDFS-3541: --- @Uma, Sorry it took me so long to respond. Yes, I would be happy to look into doing the porting, as the patch does not apply cleanly. I filed HDFS-3622 to track this work. Deadlock between recovery, xceiver and packet responder --- Key: HDFS-3541 URL: https://issues.apache.org/jira/browse/HDFS-3541 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.3, 2.0.1-alpha Reporter: suja s Assignee: Vinay Fix For: 2.0.1-alpha, 3.0.0 Attachments: DN_dump.rar, HDFS-3541-2.patch, HDFS-3541.patch Block Recovery initiated while write in progress at Datanode side. Found a deadlock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3594) ListPathsServlet should not log a warning for paths that do not exist
Robert Joseph Evans created HDFS-3594: - Summary: ListPathsServlet should not log a warning for paths that do not exist Key: HDFS-3594 URL: https://issues.apache.org/jira/browse/HDFS-3594 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.3 Reporter: Robert Joseph Evans ListPathsServlet logs a warning message every time someone requests a listing for a directory that does not exist. This should be a debug or at most an info message, because this is expected behavior. People will ask for things that do not exist. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3442) Incorrect count for Missing Replicas in FSCK report
[ https://issues.apache.org/jira/browse/HDFS-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3442: -- Fix Version/s: (was: 2.0.1-alpha) 0.23.3 Incorrect count for Missing Replicas in FSCK report --- Key: HDFS-3442 URL: https://issues.apache.org/jira/browse/HDFS-3442 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: suja s Assignee: Andrew Wang Priority: Minor Fix For: 0.23.3 Attachments: HDFS-3442-2.patch, HDFS-3442-3.patch, HDFS-3442.patch Scenario: Cluster running in HA mode with 2 DNs. Files are written with replication factor as 3. There are 7 blocks in cluster. FSCK report is including all blocks in UnderReplicated Blocks as well as Missing Replicas. HOST-XX-XX-XX-102:/home/Apr4/hadoop-2.0.0-SNAPSHOT/bin # ./hdfs fsck / Connecting to namenode via http://XX.XX.XX.55:50070 FSCK started by root (auth:SIMPLE) from /XX.XX.XX.102 for path / at Wed Apr 04 17:28:37 IST 2012 . /1: Under replicated BP-534619337-XX.XX.XX.55-1333526344705:blk_2551710840802340037_1002. Target Replicas is 3 but found 2 replica(s). . /2: Under replicated BP-534619337-XX.XX.XX.55-1333526344705:blk_-3851276776144500288_1004. Target Replicas is 3 but found 2 replica(s). . /3: Under replicated BP-534619337-XX.XX.XX.55-1333526344705:blk_-3210606555285049524_1006. Target Replicas is 3 but found 2 replica(s). . /4: Under replicated BP-534619337-XX.XX.XX.55-1333526344705:blk_4028835120510075310_1008. Target Replicas is 3 but found 2 replica(s). . /5: Under replicated BP-534619337-XX.XX.XX.55-1333526344705:blk_-5238093749956876969_1010. Target Replicas is 3 but found 2 replica(s). . /testrenamed/file1renamed: Under replicated BP-534619337-XX.XX.XX.55-1333526344705:blk_-5669194716756513504_1012. Target Replicas is 3 but found 2 replica(s). . /testrenamed/file2: Under replicated BP-534619337-XX.XX.XX.55-1333526344705:blk_8510284478280941311_1014. Target Replicas is 3 but found 2 replica(s). 
Status: HEALTHY Total size:33215 B Total dirs:3 Total files: 7 (Files currently being written: 1) Total blocks (validated): 7 (avg. block size 4745 B) Minimally replicated blocks: 7 (100.0 %) Over-replicated blocks:0 (0.0 %) Under-replicated blocks: 7 (100.0 %) Mis-replicated blocks: 0 (0.0 %) Default replication factor:3 Average block replication: 2.0 Corrupt blocks:0 Missing replicas: 7 (50.0 %) Number of data-nodes: 2 Number of racks: 1 FSCK ended at Wed Apr 04 17:28:37 IST 2012 in 2 milliseconds The filesystem under path '/' is HEALTHY Also it indicates a measure as 50% in brackets (There are only 7 blocks in cluster and so if all 7 are included as Missing replicas it should be 100%) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
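The puzzling "(50.0 %)" appears consistent with fsck dividing missing replicas by replicas actually found rather than by replicas expected — an inference from the numbers in the report, not a statement about the fixed code: 7 blocks x target 3 = 21 expected, 7 x 2 = 14 found, 21 - 14 = 7 missing, and 7/14 = 50%. The arithmetic for both candidate denominators:

```java
// Two candidate denominators for the fsck "Missing replicas" percentage
// (method names invented for illustration).
public class MissingReplicaRatio {
    static double percentOfFound(long missing, long found) {
        return 100.0 * missing / found;    // 7 / 14 -> 50.0, as reported
    }

    static double percentOfExpected(long missing, long expected) {
        return 100.0 * missing / expected; // 7 / 21 -> ~33.3
    }
}
```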
[jira] [Created] (HDFS-3591) Backport HDFS-3357 to branch-0.23
Robert Joseph Evans created HDFS-3591: - Summary: Backport HDFS-3357 to branch-0.23 Key: HDFS-3591 URL: https://issues.apache.org/jira/browse/HDFS-3591 Project: Hadoop HDFS Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans I would like to have HDFS-3357 in branch-0.23, but it is not a trivial upmerge. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3591) Backport HDFS-3357 to branch-0.23
[ https://issues.apache.org/jira/browse/HDFS-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3591: -- Status: Patch Available (was: Open) This patch does not apply to trunk or branch-2; it only applies to branch-0.23, as it is backporting code already in trunk and branch-2. I ran all of the HDFS and common tests, and they all passed for me. I also brought up a small 3 node cluster and ran a few tests on it, and they all passed. Backport HDFS-3357 to branch-0.23 - Key: HDFS-3591 URL: https://issues.apache.org/jira/browse/HDFS-3591 Project: Hadoop HDFS Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: HDFS-3357-branch-0.23.txt I would like to have HDFS-3357 in branch-0.23, but it is not a trivial upmerge. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3554) TestRaidNode is failing
[ https://issues.apache.org/jira/browse/HDFS-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400500#comment-13400500 ] Robert Joseph Evans commented on HDFS-3554: --- It looks like there is no history server up and running. In Yarn there is a race in the client. If the client asks for status while the AM is still up and running, it will talk to the AM. If the AM has exited, which it tends to do when the MR job has completed, the client will fall over to the history server. It looks like while you are running with the minicluster there is no corresponding history server to fulfill the request. TestRaidNode is failing --- Key: HDFS-3554 URL: https://issues.apache.org/jira/browse/HDFS-3554 Project: Hadoop HDFS Issue Type: Bug Components: contrib/raid, test Affects Versions: 3.0.0 Reporter: Jason Lowe Assignee: Weiyan Wang After MAPREDUCE-3868 re-enabled raid, TestRaidNode has been failing in Jenkins builds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13397800#comment-13397800 ] Robert Joseph Evans commented on HDFS-3549: --- +1 the change looks good to me, but I am not an HDFS committer so you are going to need someone else to +1 and commit it. dist tar build fails in hadoop-hdfs-raid project Key: HDFS-3549 URL: https://issues.apache.org/jira/browse/HDFS-3549 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 3.0.0 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: HDFS-3549.patch Trying to build the distribution tarball in a clean tree via {{mvn install -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: {noformat} main: [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file or directory [exec] tar: Exiting with failure status due to previous errors {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3541: -- Target Version/s: 0.23.3 Affects Version/s: 0.23.3 I really would like to see this fixed in 0.23 as well. Deadlock between recovery, xceiver and packet responder --- Key: HDFS-3541 URL: https://issues.apache.org/jira/browse/HDFS-3541 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.3, 2.0.1-alpha Reporter: suja s Assignee: Vinay Attachments: DN_dump.rar Block Recovery initiated while write in progress at Datanode side. Found a deadlock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3527) Distributed cache object changed Error
[ https://issues.apache.org/jira/browse/HDFS-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395962#comment-13395962 ] Robert Joseph Evans commented on HDFS-3527: --- Just guessing here from the name of the JIRA, but a Distributed cache object changed error typically happens when the file on HDFS changes in-between the submission of a job and a container being launched that is going to download the file. It sounds to me like it may be a test error where you are submitting lots of jobs and, in between them, changing an object in HDFS that is shared by all of the jobs. I cannot be sure without more information though. Distributed cache object changed Error -- Key: HDFS-3527 URL: https://issues.apache.org/jira/browse/HDFS-3527 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.0.0-alpha Reporter: Sujay Rau I'm writing some automation test code that basically runs the teragen, terasort, teravalidate sequence while repeatedly doing a manual failover throughout. About every fourth time I run the test script, the terasort phase crashes and returns the following error: http://c0405.hal.cloudera.com:50030/jobtasks.jsp?jobid=job_201206041130_1482&type=setup&pagenum=1&state=killed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3483) hdfs fsck doesn't run with ViewFS path
[ https://issues.apache.org/jira/browse/HDFS-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288652#comment-13288652 ] Robert Joseph Evans commented on HDFS-3483: --- Daryn, I tend to disagree that we don't want to expose the mapping. I think it is incredibly useful to be able to know what is happening here, and expose it to the end user so they can then reason about what they want to have happen. For example doing a mv from one federated namespace to another will either be very slow, or it will fail; I don't remember which it is right now. In either case it would be good to expose the mounting to both the end user, and also programmatically so that appropriate steps can be taken in those situations. Even if the step is just to call up ops and complain that they have the namespaces all wrong for what they want to do. All OSes expose it: type mount on Linux and it will list where each file system is mounted. hdfs fsck doesn't run with ViewFS path -- Key: HDFS-3483 URL: https://issues.apache.org/jira/browse/HDFS-3483 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Stephen Chu Labels: newbie Attachments: core-site.xml, hdfs-site.xml I'm running a HA + secure + federated cluster. When I run hdfs fsck /nameservices/ha-nn-uri/, I see the following: bash-3.2$ hdfs fsck /nameservices/ha-nn-uri/ FileSystem is viewfs://oracle/ DFSck exiting. Any path I enter will return the same message. Attached are my core-site.xml and hdfs-site.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269864#comment-13269864 ] Robert Joseph Evans commented on HDFS-3376: --- Hey Todd, I have been trying to follow some of the fixes you have been putting into the HDFS socket caching. I was wondering if you would be willing to pull HDFS-3357 and this one, HDFS-3376, into branch-0.23. They both seem to apply cleanly, but I am not an HDFS committer, so I cannot do this myself. DFSClient fails to make connection to DN if there are many unusable cached sockets -- Key: HDFS-3376 URL: https://issues.apache.org/jira/browse/HDFS-3376 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 2.0.0 Attachments: hdfs-3376.txt After fixing the datanode side of keepalive to properly disconnect stale clients, (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
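The failure mode described in the issue can be modeled in a few lines (names invented; the real DFSClient code differs): when the cache holds more stale sockets than the retry budget, every attempt burns a dead socket and the replica node is wrongly abandoned.

```java
// Model of "try cached sockets up to maxRetries times before giving up".
// true = socket still usable, false = closed by the datanode side.
import java.util.Deque;

public class CachedSocketRetry {
    static boolean connect(Deque<Boolean> cache, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Boolean cached = cache.poll();
            if (cached == null) {
                return true; // cache empty: fall through to a fresh socket
            }
            if (cached) {
                return true; // usable cached socket
            }
            // stale socket: discard and retry
        }
        return false; // budget exhausted on stale sockets only -> node marked dead
    }
}
```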
[jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270022#comment-13270022 ] Robert Joseph Evans commented on HDFS-3376: --- Todd, You are much more of an expert on this than I am. I think HADOOP-8280 and HADOOP-8350 look fine to pull in too. Thanks for the help with this. Aaron, I spoke with Suresh off-line about it when I took over as release manager for branch-0.23, as I was curious about it. He thought that I could not. I don't really see it being too much of a problem just yet, because there have not been very many HDFS issues that are applicable to branch-0.23. Although I am in the process of going through the full HDFS list to see if I have missed anything. DFSClient fails to make connection to DN if there are many unusable cached sockets -- Key: HDFS-3376 URL: https://issues.apache.org/jira/browse/HDFS-3376 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 2.0.0 Attachments: hdfs-3376.txt After fixing the datanode side of keepalive to properly disconnect stale clients, (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3359) DFSClient.close should close cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated HDFS-3359: -- Attachment: hdfs-3359-branch-0.23.txt I really would like to see this fix go into 0.23 as well. The patch did not apply cleanly so I have created my own. If someone could please review and commit this to 0.23 I really would appreciate it. DFSClient.close should close cached sockets --- Key: HDFS-3359 URL: https://issues.apache.org/jira/browse/HDFS-3359 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.22.0, 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 2.0.0 Attachments: hdfs-3359-branch-0.23.txt, hdfs-3359.txt, hdfs-3359.txt Some applications like the TT/JT (pre-2.0) and probably the RM/NM cycle through DistributedFileSystem objects reasonably frequently. So long as they call close() it isn't a big problem, except that currently DFSClient.close() doesn't explicitly close the SocketCache. So unless a full GC runs (causing the references to get finalized), many SocketCaches can get orphaned, each with many open sockets inside. We should fix the close() function to close all cached sockets. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
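The fix pattern the comment asks to backport — close() explicitly draining the socket cache instead of relying on finalization — in a minimal hedged sketch (hypothetical names, not the actual DFSClient API):

```java
// Sketch: a client whose close() drains its socket cache so orphaned
// instances do not hold sockets open until a full GC runs.
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ClientWithSocketCache implements Closeable {
    private final List<Closeable> socketCache = new ArrayList<>();

    void cache(Closeable socket) { socketCache.add(socket); }

    int cachedCount() { return socketCache.size(); }

    @Override
    public void close() {
        // Explicitly close every cached socket instead of waiting for
        // finalization, which may never run before fd exhaustion.
        for (Closeable s : socketCache) {
            try {
                s.close();
            } catch (IOException ignored) {
                // best effort: keep closing the rest of the cache
            }
        }
        socketCache.clear();
    }
}
```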