[jira] [Commented] (HDFS-2513) Bump jetty to 6.1.26

2011-11-15 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150999#comment-13150999
 ] 

Hudson commented on HDFS-2513:
--

Integrated in Hadoop-Hdfs-22-branch #109 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-22-branch/109/])
HDFS-2513. contrib/hdfsproxy is missing redundant dependencies. Fixed

cos : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1202505
Files : 
* /hadoop/common/branches/branch-0.22/hdfs/src/contrib/hdfsproxy/ivy.xml


> Bump jetty to 6.1.26
> 
>
> Key: HDFS-2513
> URL: https://issues.apache.org/jira/browse/HDFS-2513
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
> Fix For: 0.22.0
>
> Attachments: HADOOP-2513.patch, HDFS-2513.patch
>
>
> HDFS part of Hadoop-7450

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2552) Add WebHdfs Forrest doc

2011-11-15 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2552:
-

Attachment: h2552_2015b.pdf
h2552_2015b.patch

h2552_2015b.patch
h2552_2015b.pdf

About 40% done.  Also attached a pdf to ease the review.  To generate the doc, you may
- run {{forrest}} on {{hadoop-hdfs-project/hadoop-hdfs/src/main/docs}}; or
- run {{mvn package -DskipTests -Pdocs}} on {{hadoop-hdfs-project/hadoop-hdfs}}.

> Add WebHdfs Forrest doc
> ---
>
> Key: HDFS-2552
> URL: https://issues.apache.org/jira/browse/HDFS-2552
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2552_2015.patch, h2552_2015b.patch, 
> h2552_2015b.pdf
>
>






[jira] [Updated] (HDFS-2513) Bump jetty to 6.1.26

2011-11-15 Thread Konstantin Boudnik (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-2513:
-

Attachment: HDFS-2513.patch

Corrected patch addressing the hdfsproxy dependencies (which is somewhat 
pointless, since hdfsproxy seems to be abandoned anyway).

> Bump jetty to 6.1.26
> 
>
> Key: HDFS-2513
> URL: https://issues.apache.org/jira/browse/HDFS-2513
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
> Fix For: 0.22.0
>
> Attachments: HADOOP-2513.patch, HDFS-2513.patch
>
>
> HDFS part of Hadoop-7450





[jira] [Commented] (HDFS-2513) Bump jetty to 6.1.26

2011-11-15 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150957#comment-13150957
 ] 

Hudson commented on HDFS-2513:
--

Integrated in Hadoop-Hdfs-22-branch #107 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-22-branch/107/])
HDFS-2513. Bump jetty to 6.1.26. Contributed by Konstantin Boudnik

cos : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1202503
Files : 
* /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.22/hdfs/ivy/libraries.properties
* 
/hadoop/common/branches/branch-0.22/hdfs/src/contrib/hdfsproxy/ivy/libraries.properties


> Bump jetty to 6.1.26
> 
>
> Key: HDFS-2513
> URL: https://issues.apache.org/jira/browse/HDFS-2513
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
> Fix For: 0.22.0
>
> Attachments: HADOOP-2513.patch
>
>
> HDFS part of Hadoop-7450





[jira] [Resolved] (HDFS-2513) Bump jetty to 6.1.26

2011-11-15 Thread Konstantin Boudnik (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik resolved HDFS-2513.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

I have just committed it.

> Bump jetty to 6.1.26
> 
>
> Key: HDFS-2513
> URL: https://issues.apache.org/jira/browse/HDFS-2513
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
> Fix For: 0.22.0
>
> Attachments: HADOOP-2513.patch
>
>
> HDFS part of Hadoop-7450





[jira] [Commented] (HDFS-2513) Bump jetty to 6.1.26

2011-11-15 Thread Konstantin Shvachko (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150946#comment-13150946
 ] 

Konstantin Shvachko commented on HDFS-2513:
---

+1

> Bump jetty to 6.1.26
> 
>
> Key: HDFS-2513
> URL: https://issues.apache.org/jira/browse/HDFS-2513
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
> Fix For: 0.22.0
>
> Attachments: HADOOP-2513.patch
>
>
> HDFS part of Hadoop-7450





[jira] [Updated] (HDFS-2552) Add WebHdfs Forrest doc

2011-11-15 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2552:
-

Attachment: h2552_2015.patch

h2552_2015.patch: for preview; not done yet.

> Add WebHdfs Forrest doc
> ---
>
> Key: HDFS-2552
> URL: https://issues.apache.org/jira/browse/HDFS-2552
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2552_2015.patch
>
>






[jira] [Created] (HDFS-2554) Add separate metrics for missing blocks with desired replication level 1

2011-11-15 Thread Todd Lipcon (Created) (JIRA)
Add separate metrics for missing blocks with desired replication level 1


 Key: HDFS-2554
 URL: https://issues.apache.org/jira/browse/HDFS-2554
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Todd Lipcon
Priority: Minor


Some users set the replication level to 1 for datasets that are unimportant 
and can be lost without worry (e.g. the output of terasort tests), while other 
data on the cluster is important and must not be lost. It would be useful to 
break the missing-blocks metric down by the desired replication level of those 
blocks, so that one could ignore missing blocks at replication 1 while still 
alerting on missing blocks with a higher desired replication level.
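
As a rough illustration of the idea (class and method names below are made up, 
not the actual NameNode metrics code), one could keep a per-replication-level 
count of missing blocks and expose both the existing total and a 
"replication greater than 1" view for alerting:

{code:java}
// Hypothetical sketch only -- not the real NameNode metrics code.
// Tracks missing blocks per desired replication level so that
// repl=1 blocks can be excluded from alerting.
import java.util.HashMap;
import java.util.Map;

public class MissingBlockCounters {
  // desired replication -> number of missing blocks at that level
  private final Map<Integer, Long> missingByReplication =
      new HashMap<Integer, Long>();

  public synchronized void blockBecameMissing(int desiredReplication) {
    Long cur = missingByReplication.get(desiredReplication);
    missingByReplication.put(desiredReplication, cur == null ? 1L : cur + 1L);
  }

  public synchronized void blockRecovered(int desiredReplication) {
    Long cur = missingByReplication.get(desiredReplication);
    if (cur != null && cur > 0) {
      missingByReplication.put(desiredReplication, cur - 1L);
    }
  }

  /** The metric that exists today: all missing blocks. */
  public synchronized long getMissingBlocks() {
    long total = 0;
    for (long c : missingByReplication.values()) {
      total += c;
    }
    return total;
  }

  /** Missing blocks at desired replication 1 (terasort-style data). */
  public synchronized long getMissingBlocksWithReplicationOne() {
    Long cur = missingByReplication.get(1);
    return cur == null ? 0L : cur;
  }

  /** Missing blocks with desired replication > 1 -- what an operator
   *  would actually alert on. */
  public synchronized long getMissingReplicatedBlocks() {
    return getMissingBlocks() - getMissingBlocksWithReplicationOne();
  }
}
{code}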





[jira] [Updated] (HDFS-1445) Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file

2011-11-15 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-1445:
--

Fix Version/s: 0.20.204.0

I was wrong. This one does seem to be present in 0.20.204, though it is not 
listed in CHANGES.txt.

> Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it 
> once per directory instead of once per file
> --
>
> Key: HDFS-1445
> URL: https://issues.apache.org/jira/browse/HDFS-1445
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.20.2
>Reporter: Matt Foley
>Assignee: Matt Foley
> Fix For: 0.20.204.0, 0.23.0
>
> Attachments: HDFS-1445-trunk.v22_hdfs_2-of-2.patch
>
>
> It was a bit of a puzzle why we can do a full scan of a disk in about 30 
> seconds during FSDir() or getVolumeMap(), but the same disk took 11 minutes 
> to do Upgrade replication via hardlinks.  It turns out that the 
> org.apache.hadoop.fs.FileUtil.createHardLink() method does an outcall to 
> Runtime.getRuntime().exec(), to utilize native filesystem hardlink 
> capability.  So it is forking a full-weight external process, and we call it 
> on each individual file to be replicated.
> As a simple check on the possible cost of this approach, I built a Perl test 
> script (under Linux on a production-class datanode).  Perl also uses a 
> compiled and optimized p-code engine, and it has both native support for 
> hardlinks and the ability to do "exec".  
> -  A simple script to create 256,000 files in a directory tree organized like 
> the Datanode, took 10 seconds to run.
> -  Replicating that directory tree using hardlinks, the same way as the 
> Datanode, took 12 seconds using native hardlink support.
> -  The same replication using outcalls to exec, one per file, took 256 
> seconds!
> -  Batching the calls, and doing 'exec' once per directory instead of once 
> per file, took 16 seconds.
> Obviously, your mileage will vary based on the number of blocks per volume.  
> A volume with less than about 4000 blocks will have only 65 directories.  A 
> volume with more than 4K and less than about 250K blocks will have 4200 
> directories (more or less).  And there are two files per block (the data file 
> and the .meta file).  So the average number of files per directory may vary 
> from 2:1 to 500:1.  A node with 50K blocks and four volumes will have 25K 
> files per volume, or an average of about 6:1.  So this change may be expected 
> to take it down from, say, 12 minutes per volume to 2.
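
To make the batching concrete, here is a minimal, hypothetical sketch of the 
per-file vs. per-directory exec strategies. It assumes GNU {{ln}}, which 
accepts several sources followed by a target directory; the actual fix lives 
in FileUtil.createHardLink() and DataStorage and differs in detail.

{code:java}
// Hypothetical illustration of the batching idea only.
import java.io.File;
import java.io.IOException;

public class HardLinkBatcher {
  /**
   * One exec per file: forks an external process for every block file,
   * which is what made the hardlink upgrade slow.
   */
  public static void linkOnePerFile(File[] srcFiles, File dstDir)
      throws IOException, InterruptedException {
    for (File src : srcFiles) {
      Process p = Runtime.getRuntime().exec(new String[] {
          "ln", src.getAbsolutePath(),
          new File(dstDir, src.getName()).getAbsolutePath() });
      if (p.waitFor() != 0) {
        throw new IOException("ln failed for " + src);
      }
    }
  }

  /**
   * Batched approach: a single fork per directory.  GNU ln accepts
   * multiple sources followed by a target directory.
   * (A very large directory could exceed the kernel's argument-size
   * limit; this sketch ignores that.)
   */
  public static void linkOnePerDirectory(File[] srcFiles, File dstDir)
      throws IOException, InterruptedException {
    String[] cmd = new String[srcFiles.length + 2];
    cmd[0] = "ln";
    for (int i = 0; i < srcFiles.length; i++) {
      cmd[i + 1] = srcFiles[i].getAbsolutePath();
    }
    cmd[cmd.length - 1] = dstDir.getAbsolutePath();
    Process p = Runtime.getRuntime().exec(cmd);
    if (p.waitFor() != 0) {
      throw new IOException("ln failed for directory " + dstDir);
    }
  }
}
{code}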





[jira] [Updated] (HDFS-1445) Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file

2011-11-15 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-1445:
--

Fix Version/s: (was: 0.20.204.0)

This isn't in 0.20.204. Not sure how that got into the Fix Version. Neither is 
HADOOP-7133, which is a co-patch.

Removing from Fix Version.

CHANGES.txt does not carry these names, so we're good release-wise. Must've 
been a JIRA field error.

> Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it 
> once per directory instead of once per file
> --
>
> Key: HDFS-1445
> URL: https://issues.apache.org/jira/browse/HDFS-1445
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.20.2
>Reporter: Matt Foley
>Assignee: Matt Foley
> Fix For: 0.23.0
>
> Attachments: HDFS-1445-trunk.v22_hdfs_2-of-2.patch
>
>
> It was a bit of a puzzle why we can do a full scan of a disk in about 30 
> seconds during FSDir() or getVolumeMap(), but the same disk took 11 minutes 
> to do Upgrade replication via hardlinks.  It turns out that the 
> org.apache.hadoop.fs.FileUtil.createHardLink() method does an outcall to 
> Runtime.getRuntime().exec(), to utilize native filesystem hardlink 
> capability.  So it is forking a full-weight external process, and we call it 
> on each individual file to be replicated.
> As a simple check on the possible cost of this approach, I built a Perl test 
> script (under Linux on a production-class datanode).  Perl also uses a 
> compiled and optimized p-code engine, and it has both native support for 
> hardlinks and the ability to do "exec".  
> -  A simple script to create 256,000 files in a directory tree organized like 
> the Datanode, took 10 seconds to run.
> -  Replicating that directory tree using hardlinks, the same way as the 
> Datanode, took 12 seconds using native hardlink support.
> -  The same replication using outcalls to exec, one per file, took 256 
> seconds!
> -  Batching the calls, and doing 'exec' once per directory instead of once 
> per file, took 16 seconds.
> Obviously, your mileage will vary based on the number of blocks per volume.  
> A volume with less than about 4000 blocks will have only 65 directories.  A 
> volume with more than 4K and less than about 250K blocks will have 4200 
> directories (more or less).  And there are two files per block (the data file 
> and the .meta file).  So the average number of files per directory may vary 
> from 2:1 to 500:1.  A node with 50K blocks and four volumes will have 25K 
> files per volume, or an average of about 6:1.  So this change may be expected 
> to take it down from, say, 12 minutes per volume to 2.





[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150598#comment-13150598
 ] 

Hadoop QA commented on HDFS-234:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12503763/hdfs_tpt_lat.pdf
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1560//console

This message is automatically generated.

> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Fix For: HA branch (HDFS-1623), 0.24.0
>
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Updated] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread Ivan Kelly (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-234:


Attachment: hdfs_tpt_lat.pdf

> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Fix For: HA branch (HDFS-1623), 0.24.0
>
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150589#comment-13150589
 ] 

Hadoop QA commented on HDFS-234:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12503761/HDFS-234.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed the unit tests build

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1559//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1559//console

This message is automatically generated.

> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Fix For: HA branch (HDFS-1623), 0.24.0
>
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Updated] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread Ivan Kelly (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-234:


Attachment: (was: hdfs_tpt_lat.pdf)

> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Fix For: HA branch (HDFS-1623), 0.24.0
>
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Updated] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread Ivan Kelly (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-234:


Attachment: hdfs_tpt_lat.pdf

Added a latency/throughput (tpt) diagram.

> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Fix For: HA branch (HDFS-1623), 0.24.0
>
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150586#comment-13150586
 ] 

jirapos...@reviews.apache.org commented on HDFS-234:



---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2835/
---

Review request for hadoop-hdfs.


Summary
---

BookKeeper is a system to reliably log streams of records 
(https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
natural target for such a system for being the metadata repository of the 
entire file system for HDFS. 


This addresses bug HDFS-234.
http://issues.apache.org/jira/browse/HDFS-234


Diffs
-

  hadoop-hdfs-project/hadoop-hdfs/pom.xml 1bcc372 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/BookKeeperEditLogInputStream.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/TestBookKeeperJournalManager.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/EditLogLedgerMetadata.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/MaxTxId.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/WriteLock.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/BookKeeperEditLogOutputStream.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/bkjournal/BookKeeperJournalManager.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2835/diff


Testing
---


Thanks,

Ivan



> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Fix For: HA branch (HDFS-1623), 0.24.0
>
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Updated] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread Ivan Kelly (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-234:


Fix Version/s: 0.24.0
   HA branch (HDFS-1623)
   Status: Patch Available  (was: Open)

> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Fix For: HA branch (HDFS-1623), 0.24.0
>
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Updated] (HDFS-234) Integration with BookKeeper logging system

2011-11-15 Thread Ivan Kelly (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-234:


Attachment: HDFS-234.diff

This patch depends on HDFS-1580.

> Integration with BookKeeper logging system
> --
>
> Key: HDFS-234
> URL: https://issues.apache.org/jira/browse/HDFS-234
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Luca Telloli
>Assignee: Ivan Kelly
> Attachments: HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, 
> HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234.diff, HDFS-234.patch, 
> create.png, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar
>
>
> BookKeeper is a system to reliably log streams of records 
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a 
> natural target for such a system for being the metadata repository of the 
> entire file system for HDFS. 





[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-11-15 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150480#comment-13150480
 ] 

Hudson commented on HDFS-2476:
--

Integrated in Hadoop-Mapreduce-trunk #898 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/898/])
HDFS-2476. More CPU efficient data structure for under-replicated, 
over-replicated, and invalidated blocks. Contributed by Tomasz Nykiel.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1201991
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightHashSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightLinkedSet.java


> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Fix For: 0.24.0
>
> Attachments: hashStructures.patch, hashStructures.patch-2, 
> hashStructures.patch-3, hashStructures.patch-4, hashStructures.patch-5, 
> hashStructures.patch-6, hashStructures.patch-7, hashStructures.patch-8, 
> hashStructures.patch-9
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks. 
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> -cluster instability times, when these queues (especially under-replicated) 
> tend to grow quite drastically,
> -initial cluster startup, when the queues are initialized, after leaving 
> safemode,
> -block reports,
> -explicit acks for block addition and deletion
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike the current 
> O(log n) for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.
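
For readers unfamiliar with the "poll a number of elements" idea, the snippet 
below illustrates the get+remove semantics using a plain 
java.util.LinkedHashSet. The patch's LightWeightHashSet and 
LightWeightLinkedSet are custom, lower-overhead implementations; this is only 
an illustration of the behavior, not the patch code.

{code:java}
// Illustration of batched polling (get+remove) from an insertion-ordered set.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;

public class PollingSetDemo {
  /** Remove and return up to n elements in insertion order. */
  public static <T> List<T> pollN(LinkedHashSet<T> set, int n) {
    List<T> polled = new ArrayList<T>(n);
    Iterator<T> it = set.iterator();
    while (it.hasNext() && polled.size() < n) {
      polled.add(it.next());
      it.remove();
    }
    return polled;
  }

  public static void main(String[] args) {
    LinkedHashSet<Long> underReplicated = new LinkedHashSet<Long>();
    for (long blockId = 0; blockId < 10; blockId++) {
      underReplicated.add(blockId);   // O(1) add, unlike TreeSet's O(log n)
    }
    // Hand a batch of work to the replication monitor in one call.
    System.out.println(pollN(underReplicated, 4));  // [0, 1, 2, 3]
    System.out.println(underReplicated.size());     // 6
  }
}
{code}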





[jira] [Commented] (HDFS-2532) TestDfsOverAvroRpc timing out in trunk

2011-11-15 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150476#comment-13150476
 ] 

Hudson commented on HDFS-2532:
--

Integrated in Hadoop-Mapreduce-trunk #898 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/898/])
HDFS-2532. Add timeout to TestDfsOverAvroRpc

This test is timing out on trunk, causing tests below it to
fail spuriously. This patch doesn't fix the issue -- just adds
a JUnit timeout so that the failure is properly attributed
to this test.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1201963
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDfsOverAvroRpc.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLocalDFS.java
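
For context, a JUnit 4 per-test timeout is just the {{timeout}} attribute on 
{{@Test}}. The example below is a generic, made-up test showing the mechanism, 
not the actual TestDfsOverAvroRpc change.

{code:java}
// Generic JUnit 4 timeout example; class and method names are hypothetical.
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class TimeoutExampleTest {
  // If the body hangs (e.g. waiting on a mini-cluster that never starts),
  // JUnit fails this test after 120 seconds instead of stalling the run
  // and letting the failure bleed into later tests.
  @Test(timeout = 120000)
  public void testOperationFinishesInTime() throws Exception {
    assertTrue(doWorkThatMightHang());
  }

  private boolean doWorkThatMightHang() {
    return true;  // placeholder for the real work
  }
}
{code}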


> TestDfsOverAvroRpc timing out in trunk
> --
>
> Key: HDFS-2532
> URL: https://issues.apache.org/jira/browse/HDFS-2532
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-2532-make-timeout.txt
>
>
> "java.io.IOException: java.io.IOException: Unknown protocol: 
> org.apache.hadoop.ipc.AvroRpcEngine$TunnelProtocol" occurs while starting up 
> the DN, and then it hangs waiting for the MiniCluster to start.





[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-11-15 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150439#comment-13150439
 ] 

Hudson commented on HDFS-2476:
--

Integrated in Hadoop-Hdfs-trunk #864 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/864/])
HDFS-2476. More CPU efficient data structure for under-replicated, 
over-replicated, and invalidated blocks. Contributed by Tomasz Nykiel.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1201991
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightHashSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightLinkedSet.java


> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Fix For: 0.24.0
>
> Attachments: hashStructures.patch, hashStructures.patch-2, 
> hashStructures.patch-3, hashStructures.patch-4, hashStructures.patch-5, 
> hashStructures.patch-6, hashStructures.patch-7, hashStructures.patch-8, 
> hashStructures.patch-9
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks. 
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> -cluster instability times, when these queues (especially under-replicated) 
> tend to grow quite drastically,
> -initial cluster startup, when the queues are initialized, after leaving 
> safemode,
> -block reports,
> -explicit acks for block addition and deletion
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike the current 
> O(log n) for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.





[jira] [Commented] (HDFS-2532) TestDfsOverAvroRpc timing out in trunk

2011-11-15 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150435#comment-13150435
 ] 

Hudson commented on HDFS-2532:
--

Integrated in Hadoop-Hdfs-trunk #864 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/864/])
HDFS-2532. Add timeout to TestDfsOverAvroRpc

This test is timing out on trunk, causing tests below it to
fail spuriously. This patch doesn't fix the issue -- just adds
a JUnit timeout so that the failure is properly attributed
to this test.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1201963
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDfsOverAvroRpc.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLocalDFS.java


> TestDfsOverAvroRpc timing out in trunk
> --
>
> Key: HDFS-2532
> URL: https://issues.apache.org/jira/browse/HDFS-2532
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-2532-make-timeout.txt
>
>
> "java.io.IOException: java.io.IOException: Unknown protocol: 
> org.apache.hadoop.ipc.AvroRpcEngine$TunnelProtocol" occurs while starting up 
> the DN, and then it hangs waiting for the MiniCluster to start.
