[jira] Commented: (HDFS-1035) Generate Eclipse's .classpath file from Ivy config

2010-10-22 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923790#action_12923790
 ] 

Nigel Daley commented on HDFS-1035:
---

Good call on needing to update the document.  The reason you didn't notice any 
change is that it produces the exact same .classpath file from the Ivy deps as 
the one you previously had from the template file.  If you blow away your 
.classpath file and run this, it should create the exact same file.

I agree that the rest of the comments are nits.  Feel free to fix them and 
upload a new patch.  Otherwise I'll commit this and MAPREDUCE-1592 this weekend.

 Generate Eclipse's .classpath file from Ivy config
 --

 Key: HDFS-1035
 URL: https://issues.apache.org/jira/browse/HDFS-1035
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: build
Reporter: Tom White
Assignee: Nigel Daley
 Attachments: HDFS-1035.patch


 HDFS companion issue for HADOOP-6407.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1035) Generate Eclipse's .classpath file from Ivy config

2010-10-22 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923894#action_12923894
 ] 

Konstantin Boudnik commented on HDFS-1035:
--

# Applied the patch
# Removed the .classpath file
# Ran {{ant eclipse}}
# {{-rw-r--r--  1 xxx users 0 2010-10-22 09:11 .classpath}}

Have I done something wrong?

bq. Feel free to fix them and upload a new patch

Awesome! As soon as this one actually works ;)




[jira] Commented: (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs

2010-10-22 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923923#action_12923923
 ] 

Suresh Srinivas commented on HDFS-1073:
---

+1 for option A.

 Simpler model for Namenode's fs Image and edit Logs 
 

 Key: HDFS-1073
 URL: https://issues.apache.org/jira/browse/HDFS-1073
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Todd Lipcon
 Attachments: hdfs-1073.txt, hdfs1073.pdf


 The naming and handling of the NN's fsImage and edit logs can be significantly 
 improved, resulting in simpler and more robust code.




[jira] Commented: (HDFS-903) NN should verify images and edit logs on startup

2010-10-22 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923930#action_12923930
 ] 

Suresh Srinivas commented on HDFS-903:
--

bq. It would have been nice if the contents of the VERSION file were stored as a header record at the beginning of the fsimage file itself

Currently, VERSION creation signals the end of snapshot creation, independent of 
fsimage and edits creation. Moving VERSION into fsimage would complicate the 
current design.

 NN should verify images and edit logs on startup
 

 Key: HDFS-903
 URL: https://issues.apache.org/jira/browse/HDFS-903
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Reporter: Eli Collins
Assignee: Hairong Kuang
Priority: Critical
 Fix For: 0.22.0


 I was playing around with corrupting fsimage and edits logs when there are 
 multiple dfs.name.dirs specified. I noticed that:
  * As long as the corruption does not make the image invalid (e.g. changing an 
 opcode so it becomes an invalid opcode), HDFS doesn't notice and happily uses 
 the corrupt image or applies the corrupt edit.
  * If the first image in dfs.name.dir is valid, it replaces the copies in the 
 other name.dirs with this first image, even if they are different, i.e. if the 
 first image actually contains invalid/old/corrupt metadata then you've lost 
 your valid metadata, which can result in data loss if the namenode garbage 
 collects blocks that it thinks are no longer used.
 How about we maintain a checksum as part of the image and edit log, check 
 those on startup, and refuse to start up if they differ? Or at least provide 
 a configuration option to do so if people are worried about the overhead of 
 maintaining checksums of these files. Even if we assume dfs.name.dir is 
 reliable storage, this guards against operator errors.
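
The verify-on-startup idea above can be sketched in plain Java. This is only an 
illustration, not the actual NameNode code: the class name, helper methods, and 
the choice of CRC32 are all assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

public class ImageChecksum {
    // Compute a CRC32 over the whole file, the way a stored image
    // checksum could be maintained alongside the image.
    static long checksumOf(Path file) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(Files.readAllBytes(file));
        return crc.getValue();
    }

    // Refuse to "start up" if the on-disk bytes no longer match the
    // checksum recorded when the file was last written.
    static void verifyOnStartup(Path file, long expected) throws IOException {
        long actual = checksumOf(file);
        if (actual != expected) {
            throw new IOException("Corrupt image " + file
                    + ": expected crc " + expected + ", got " + actual);
        }
    }

    public static void main(String[] args) throws IOException {
        Path image = Files.createTempFile("fsimage", ".bin");
        Files.write(image, new byte[] {1, 2, 3, 4});
        long stored = checksumOf(image);         // recorded at write time
        verifyOnStartup(image, stored);          // passes: bytes unchanged

        Files.write(image, new byte[] {1, 2, 9, 4});  // simulate corruption
        try {
            verifyOnStartup(image, stored);
            System.out.println("corruption missed");
        } catch (IOException e) {
            System.out.println("corruption detected");
        }
    }
}
```

A single flipped byte would be caught here, which is exactly the class of silent 
corruption the description complains about.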




[jira] Updated: (HDFS-1435) Provide an option to store fsimage compressed

2010-10-22 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1435:


Attachment: trunkImageCompress4.patch

This patch cleans up some unnecessary indentation and blank-line changes and 
fixes a test failure caused by the previous patch.

I ran ant test-patch and it succeeded.

I ran ant test and saw the following tests fail:
TestFileStatus, TestHdfsTrash (timeout), TestHDFSFileContextMainOperations, 
TestPipelines, TestBlockTokenWithDFS, and TestLargeBlock (timeout).

They do not seem related to my patch. If nobody objects, I will commit this 
patch later today.

 

 Provide an option to store fsimage compressed
 -

 Key: HDFS-1435
 URL: https://issues.apache.org/jira/browse/HDFS-1435
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.22.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.22.0

 Attachments: checkpoint-limitandcompress.patch, 
 trunkImageCompress.patch, trunkImageCompress1.patch, 
 trunkImageCompress2.patch, trunkImageCompress3.patch, 
 trunkImageCompress4.patch


 Our HDFS has an fsimage as big as 20 GB. It consumes a lot of network 
 bandwidth when the secondary NN uploads a new fsimage to the primary NN.
 If we could store the fsimage compressed, the problem would be greatly 
 alleviated.
 I plan to provide a new configuration property, hdfs.image.compressed, with a 
 default value of false. If it is set to true, the fsimage is stored compressed.
 The fsimage will have a new layout with a new compressed field in its 
 header, indicating whether the namespace is stored compressed or not.
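
The header-flag-plus-optional-compression layout described above can be sketched 
in plain Java. This is a hypothetical illustration, not the patch's code: the 
class name is made up, and GZIP stands in for whatever codec the real layout 
would use.

```java
import java.io.*;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedImageSketch {
    // Write a header containing a "compressed" boolean, then the body,
    // optionally run through a codec (GZIP here as a stand-in).
    static byte[] write(byte[] namespace, boolean compressed) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeBoolean(compressed);  // new header field
        OutputStream body = compressed ? new GZIPOutputStream(buf) : buf;
        body.write(namespace);
        body.flush();
        if (compressed) ((GZIPOutputStream) body).finish();
        return buf.toByteArray();
    }

    // A reader first checks the header flag, then picks the right stream,
    // so both old-style (uncompressed) and new images can be loaded.
    static byte[] read(byte[] image) throws IOException {
        ByteArrayInputStream buf = new ByteArrayInputStream(image);
        boolean compressed = new DataInputStream(buf).readBoolean();
        InputStream body = compressed ? new GZIPInputStream(buf) : buf;
        return body.readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        byte[] ns = "namespace-entry ".repeat(200).getBytes();
        byte[] back = read(write(ns, true));
        System.out.println("roundtrip ok: " + java.util.Arrays.equals(back, ns));
    }
}
```

Because the flag lives in the header, the reader never has to guess the format, 
which is presumably why the jira calls for a layout version bump.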




[jira] Created: (HDFS-1475) Want a -d flag in hadoop dfs -ls : Do not expand directories

2010-10-22 Thread Greg Connor (JIRA)
Want a -d flag in hadoop dfs -ls : Do not expand directories


 Key: HDFS-1475
 URL: https://issues.apache.org/jira/browse/HDFS-1475
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.20.1
 Environment: any
Reporter: Greg Connor
Priority: Minor


I would really love it if dfs -ls had a -d flag, like unix ls -d, which would 
list the directories matching the name or pattern but *not* their contents.

Current behavior is to expand every matching dir and list its contents, which 
is awkward if I just want to see the matching dirs themselves (and their 
permissions).  Worse, if a directory exists but is empty, -ls simply returns no 
output at all, which is unhelpful.  

So far we have used some ugly workarounds to this in various scripts, such as
  -ls /path/to | grep dir    # wasteful, and problematic if dir is a substring of the path
  -stat /path/to/dir Exists  # stat has no way to get back the full path, sadly
  -count /path/to/dir        # works but is probably overkill

Really there is no reliable replacement for ls -d -- the above hacks will work 
but only for certain isolated contexts.  (I'm not a java programmer, or else I 
would probably submit a patch for this, or make my own jar file to do this 
since I need it a lot.)
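
The requested -d semantics (return the matching entries themselves, never their 
contents) can be sketched in plain Java using java.nio glob matching. This is 
only an illustration of the behavior being asked for, not HDFS shell code; the 
class and method names are made up.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class LsD {
    // "ls -d"-style listing: return the entries under parent that match
    // the glob, without expanding directories into their contents.
    static List<Path> lsD(Path parent, String glob) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(parent, glob)) {
            for (Path p : ds) matches.add(p);  // the dir itself, not its children
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("lsd-demo");
        Files.createDirectory(root.resolve("logs1"));
        Files.createDirectory(root.resolve("logs2"));
        for (Path p : lsD(root, "logs*")) System.out.println(p.getFileName());
    }
}
```

An empty matching directory still shows up in the result, which fixes the 
"no output at all" complaint above.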




[jira] Commented: (HDFS-1475) Want a -d flag in hadoop dfs -ls : Do not expand directories

2010-10-22 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924062#action_12924062
 ] 

Aaron T. Myers commented on HDFS-1475:
--

+1

This would be extremely useful. For that matter, it would allow one to see the 
permissions, etc. of the root directory, which I presently know of no way to 
view, aside from the OIV. :)




[jira] Commented: (HDFS-1472) Refactor DFSck to allow programmatic access to output

2010-10-22 Thread Ramkumar Vadali (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924067#action_12924067
 ] 

Ramkumar Vadali commented on HDFS-1472:
---

Test results:

ant test-patch:


 [exec]
 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 2 new or modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec]
 [exec] +1 system tests framework.  The patch passed system tests framework compile.
 [exec]
 [exec] ======================================================================
 [exec] ======================================================================
 [exec] Finished build.
 [exec] ======================================================================
 [exec] ======================================================================

ant test:
Some tests failed, but I verified that these fail in a clean checkout as well.
[junit] Test org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery FAILED
[junit] Test org.apache.hadoop.hdfs.TestFileStatus FAILED
[junit] Test org.apache.hadoop.hdfs.TestHDFSTrash FAILED (timeout)
[junit] Test org.apache.hadoop.fs.TestHDFSFileContextMainOperations FAILED


 Refactor DFSck to allow programmatic access to output
 -

 Key: HDFS-1472
 URL: https://issues.apache.org/jira/browse/HDFS-1472
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Reporter: Ramkumar Vadali
 Attachments: HDFS-1472.patch


 DFSck prints the list of corrupt files to stdout. This jira proposes that it 
 write to a PrintStream object that is passed to the constructor. This will 
 allow components like RAID to programmatically get a list of corrupt files.
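
The proposed refactor can be sketched in plain Java: take the output stream in 
the constructor, and a programmatic caller captures it with a 
ByteArrayOutputStream while the CLI keeps passing System.out. This is a 
hypothetical illustration, not the actual DFSck code; the class and method 
names are made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.List;

public class FsckSketch {
    private final PrintStream out;

    // Instead of printing straight to System.out, write to whatever
    // stream the caller chooses; System.out stays the CLI default.
    FsckSketch(PrintStream out) {
        this.out = out;
    }

    void reportCorrupt(List<String> corruptFiles) {
        for (String f : corruptFiles) {
            out.println("CORRUPT: " + f);
        }
    }

    public static void main(String[] args) {
        // A caller like RAID can capture the output programmatically.
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        new FsckSketch(new PrintStream(captured))
                .reportCorrupt(List.of("/a/part-0001", "/b/part-0002"));
        System.out.print(captured);
    }
}
```

The CLI behavior is unchanged (pass System.out), while library callers parse 
the captured text instead of scraping stdout.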




[jira] Commented: (HDFS-1462) Refactor edit log loading to a separate class from edit log writing

2010-10-22 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924094#action_12924094
 ] 

Jitendra Nath Pandey commented on HDFS-1462:


This patch removes the references to the namesystem object from FSEditLog, 
which is a good thing.
FSEditLog still refers to FSNamesystem.LOG; if the LOG object could be passed 
as a parameter, we could remove all references to FSNamesystem from FSEditLog.
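
The pass-the-logger suggestion amounts to constructor injection. A minimal 
sketch, with java.util.logging standing in for the logging library Hadoop 
actually uses, and a made-up class name:

```java
import java.util.logging.Logger;

public class EditLogSketch {
    private final Logger log;

    // Accept the logger instead of hard-coding a reference like
    // FSNamesystem.LOG, so this class carries no compile-time tie
    // to the namesystem at all.
    EditLogSketch(Logger log) {
        this.log = log;
    }

    void logSync() {
        log.info("synced edit log");
    }

    public static void main(String[] args) {
        new EditLogSketch(Logger.getLogger("FSEditLog")).logSync();
    }
}
```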

 Refactor edit log loading to a separate class from edit log writing
 ---

 Key: HDFS-1462
 URL: https://issues.apache.org/jira/browse/HDFS-1462
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-1462.txt, hdfs-1462.txt, hdfs-1462.txt


 Before the major work in HDFS-1073, I'd like to do this refactor to clean up 
 the monster FSEditLog class. We can separate all the functions that take care 
 of loading edits into an FSN from the functions that take care of writing 
 edits, rolling, etc.




[jira] Commented: (HDFS-1462) Refactor edit log loading to a separate class from edit log writing

2010-10-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924109#action_12924109
 ] 

Todd Lipcon commented on HDFS-1462:
---

Sure, we could pass in LOG, or just create our own logger for FSEditLog. I left 
it as is to keep this a straight refactor, but as long as people don't mind the 
slightly breaking change to the log category, I'm fine with changing it. 
Thoughts?




[jira] Commented: (HDFS-1472) Refactor DFSck to allow programmatic access to output

2010-10-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924117#action_12924117
 ] 

Hadoop QA commented on HDFS-1472:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12457711/HDFS-1472.patch
  against trunk revision 1026178.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

+1 system tests framework.  The patch passed system tests framework compile.

Test results: 
http://hudson.zones.apache.org/hudson/job/PreCommit-HDFS-Build/2/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/PreCommit-HDFS-Build/2/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/PreCommit-HDFS-Build/2/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/PreCommit-HDFS-Build/2/console

This message is automatically generated.
