[jira] Commented: (HDFS-472) Document hdfsproxy design and set-up guide
[ https://issues.apache.org/jira/browse/HDFS-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755387#action_12755387 ]

Hudson commented on HDFS-472:
-----------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #34 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/34/])

Document hdfsproxy design and set-up guide
------------------------------------------

    Key: HDFS-472
    URL: https://issues.apache.org/jira/browse/HDFS-472
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: contrib/hdfsproxy
    Reporter: zhiyong zhang
    Assignee: zhiyong zhang
    Fix For: 0.21.0
    Attachments: HDFS-472.patch, HDFS-472.patch, HDFS-472.patch, hdfsproxy.pdf, hdfsproxy.pdf

Currently hdfsproxy has only a README file that does not follow the code closely. More documentation is needed on the design, build, and set-up guide.
[jira] Commented: (HDFS-385) Design a pluggable interface to place replicas of blocks in HDFS
[ https://issues.apache.org/jira/browse/HDFS-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755388#action_12755388 ]

Hudson commented on HDFS-385:
-----------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #34 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/34/]). Add support for an API that allows a module external to HDFS to specify how HDFS blocks should be placed. (dhruba)

Design a pluggable interface to place replicas of blocks in HDFS
----------------------------------------------------------------

    Key: HDFS-385
    URL: https://issues.apache.org/jira/browse/HDFS-385
    Project: Hadoop HDFS
    Issue Type: Improvement
    Reporter: dhruba borthakur
    Assignee: dhruba borthakur
    Fix For: 0.21.0
    Attachments: BlockPlacementPluggable.txt, BlockPlacementPluggable2.txt, BlockPlacementPluggable3.txt, BlockPlacementPluggable4.txt, BlockPlacementPluggable4.txt, BlockPlacementPluggable5.txt, BlockPlacementPluggable6.txt, BlockPlacementPluggable7.txt

The current HDFS code typically places one replica on the local rack, the second replica on a random remote rack, and the third replica on a random node of that remote rack. This algorithm is baked into the NameNode's code. It would be nice to make the block placement algorithm a pluggable interface. This would allow experimentation with different placement algorithms based on workloads, availability guarantees, and failure models.
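As a rough illustration, a pluggable policy could look something like the following. The class name BlockPlacementPolicy also appears in HDFS-623's build output later in this digest, but the method and its signature here are assumptions, not the committed HDFS-385 API:

{code}
import org.apache.hadoop.hdfs.server.namenode.DatanodeDescriptor;

// Illustrative sketch only; the real API's methods and parameters may differ.
public abstract class BlockPlacementPolicy {
  /**
   * Choose numReplicas datanodes to host the replicas of a new block of the
   * given file, preferring the writer's own node for the first replica.
   */
  public abstract DatanodeDescriptor[] chooseTarget(String srcPath,
      int numReplicas, DatanodeDescriptor writer, long blockSize);
}
{code}

A custom policy (for example, one tuned to a specific failure model) would then subclass this and be selected by configuration instead of being hard-coded in the NameNode.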
[jira] Commented: (HDFS-612) FSDataset should not use org.mortbay.log.Log
[ https://issues.apache.org/jira/browse/HDFS-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755496#action_12755496 ]

Hudson commented on HDFS-612:
-----------------------------

Integrated in Hadoop-Hdfs-trunk #84 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/84/])

FSDataset should not use org.mortbay.log.Log
--------------------------------------------

    Key: HDFS-612
    URL: https://issues.apache.org/jira/browse/HDFS-612
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: data-node
    Affects Versions: 0.21.0
    Reporter: Tsz Wo (Nicholas), SZE
    Assignee: Tsz Wo (Nicholas), SZE
    Fix For: 0.21.0
    Attachments: h612_20090911.patch, h612_20090911b.patch

Some code in FSDataset uses org.mortbay.log.Log.
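The kind of change implied is swapping Jetty's logger for the class's own Commons Logging logger; a hedged sketch, with the class and call site below invented for illustration:

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class FSDatasetExample {
  static final Log LOG = LogFactory.getLog(FSDatasetExample.class);

  void reportProblem(String msg) {
    // before: org.mortbay.log.Log.warn(msg);  // Jetty's logger, wrong dependency
    LOG.warn(msg);                             // Hadoop's Commons Logging logger
  }
}
{code}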
[jira] Commented: (HDFS-202) Add a bulk FileSystem.getFileBlockLocations
[ https://issues.apache.org/jira/browse/HDFS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755567#action_12755567 ]

Sanjay Radia commented on HDFS-202:
-----------------------------------

bq. Maybe we should punt that until someone develops an append-savvy distcp?

+1

Why is DetailedFileStatus[] better than Map<FileStatus, BlockLocation[]>? The latter seems more transparent. I was holding out on a file system interface returning a map, but that is old school. Fine, I am convinced. I suspect you also want the RPC signature to return a map (that makes me more nervous because most RPCs do not support that - but ours does, I guess).

Wrt the new FileContext API, my proposal is that it provide a single getBlockLocations method:

  Map<FileStatus, BlockLocation[]> getBlockLocations(Path[] path)

and abandon the BlockLocation[] getBlockLocations(path, start, end). (Of course FileSystem will continue to support the old getBlockLocations.)

Add a bulk FileSystem.getFileBlockLocations
-------------------------------------------

    Key: HDFS-202
    URL: https://issues.apache.org/jira/browse/HDFS-202
    Project: Hadoop HDFS
    Issue Type: New Feature
    Reporter: Arun C Murthy
    Assignee: Jakob Homan

Currently map-reduce applications (specifically file-based input-formats) use FileSystem.getFileBlockLocations to compute splits. However, they are forced to call it once per file. The downsides are multiple:
# Even with a few thousand files to process, the number of RPCs quickly starts getting noticeable.
# The current implementation of getFileBlockLocations is too slow, since each call results in a 'search' in the namesystem. Assuming a few thousand input files, it results in that many RPCs and 'searches'.

It would be nice to have a FileSystem.getFileBlockLocations which can take in a directory, and return the block-locations for all files in that directory. We could eliminate both the per-file RPC and also the 'search' by a 'scan'. When I tested this for terasort, a moderate job with 8000 input files, the runtime halved from the current 8s to 4s. Clearly this is much more important for latency-sensitive applications...
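A self-contained sketch of the bulk call under discussion; the interface name is invented for illustration, and only the Map-returning method shape comes from the comment above:

{code}
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

// Hypothetical interface; not a committed Hadoop API.
interface BulkBlockLocations {
  /** One RPC/scan returning block locations for every file under the given paths. */
  Map<FileStatus, BlockLocation[]> getBlockLocations(Path[] paths)
      throws IOException;
}
{code}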
[jira] Updated: (HDFS-598) Eclipse launch task for HDFS
[ https://issues.apache.org/jira/browse/HDFS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Zeyliger updated HDFS-598:
---------------------------------

    Attachment: HDFS-598-v2.patch

Yep, Tom, didn't check that one after the backport. The move into hdfs and hdfs-with-mr confused it, too. I've updated the patch to work.

Eclipse launch task for HDFS
----------------------------

    Key: HDFS-598
    URL: https://issues.apache.org/jira/browse/HDFS-598
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: build
    Environment: Eclipse 3.5
    Reporter: Eli Collins
    Assignee: Eli Collins
    Priority: Trivial
    Attachments: HDFS-598-v2.patch, hdfs-598.patch

Porting HDFS launch task from HADOOP-5911. See MAPREDUCE-905.
[jira] Updated: (HDFS-617) Support for non-recursive create() in HDFS
[ https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HDFS-617:
---------------------------

    Status: Open  (was: Patch Available)

Support for non-recursive create() in HDFS
------------------------------------------

    Key: HDFS-617
    URL: https://issues.apache.org/jira/browse/HDFS-617
    Project: Hadoop HDFS
    Issue Type: Improvement
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Attachments: h617-01.patch, h617-02.patch, h617-03.patch, h617-04.patch

HADOOP-4952 calls for a create call that doesn't automatically create missing parent directories.
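A hedged sketch of the intended semantics: fail when the parent directory is missing instead of silently creating it. The method name mirrors FileSystem#createNonRecursive, but the exact parameter list below is an assumption, not necessarily this patch's API (types are from org.apache.hadoop.fs and java.io):

{code}
// Sketch only: parameter list assumed for illustration.
void createStrict(FileSystem fs) throws IOException {
  try {
    FSDataOutputStream out = fs.createNonRecursive(new Path("/a/b/file"),
        true /* overwrite */, 4096, (short) 3, 64L << 20, null /* progress */);
    out.close();
  } catch (FileNotFoundException e) {
    // Parent /a/b is missing; unlike the classic create(), it is NOT
    // created automatically and the call fails instead.
  }
}
{code}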
[jira] Commented: (HDFS-222) Support for concatenating of files into a single file
[ https://issues.apache.org/jira/browse/HDFS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755575#action_12755575 ]

Sanjay Radia commented on HDFS-222:
-----------------------------------

Clearly this is a hack to support parallel copies of large files in distcp. (It is an embarrassment that Hadoop does not support this.) The proper way to do this is to create a first-class abstraction for a file as a container of blocks, but that is a long project. So the new concat method would be marked as limited-private.

Breaking the FileSystem abstraction issue - I don't get it: all file system impls can support a concat of files, though most cannot do it atomically.

Owen, are you proposing that we add this to DistributedFileSystem and not FileSystem, and that distcp does a class narrow to use it if it is available? I am fine with that.

Support for concatenating of files into a single file
------------------------------------------------------

    Key: HDFS-222
    URL: https://issues.apache.org/jira/browse/HDFS-222
    Project: Hadoop HDFS
    Issue Type: New Feature
    Reporter: Venkatesh S
    Assignee: Boris Shkolnik

An API to concatenate files of the same size and replication factor on HDFS into a single larger file.
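A sketch of the "class narrow" idea being discussed, assuming a concat method on DistributedFileSystem; the method signature is an assumption, not a committed API:

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

class ConcatExample {
  void concatIfPossible(Configuration conf, Path target, Path[] parts)
      throws IOException {
    FileSystem fs = target.getFileSystem(conf);
    if (fs instanceof DistributedFileSystem) {
      // Narrow to the HDFS-specific type to reach the non-portable call.
      ((DistributedFileSystem) fs).concat(target, parts);  // assumed signature
    } else {
      // Fall back to byte-copying the parts into target sequentially.
    }
  }
}
{code}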
[jira] Commented: (HDFS-598) Eclipse launch task for HDFS
[ https://issues.apache.org/jira/browse/HDFS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755595#action_12755595 ]

Eli Collins commented on HDFS-598:
----------------------------------

Thanks Phil!

Eclipse launch task for HDFS
----------------------------

    Key: HDFS-598
    URL: https://issues.apache.org/jira/browse/HDFS-598
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: build
    Environment: Eclipse 3.5
    Reporter: Eli Collins
    Assignee: Eli Collins
    Priority: Trivial
    Attachments: HDFS-598-v2.patch, hdfs-598.patch

Porting HDFS launch task from HADOOP-5911. See MAPREDUCE-905.
[jira] Commented: (HDFS-573) Porting libhdfs to Windows
[ https://issues.apache.org/jira/browse/HDFS-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755610#action_12755610 ]

Faisal Khan commented on HDFS-573:
----------------------------------

I ran unit tests on Ziliang's patch for libhdfs on Linux, and here is the output: http://pages.cs.wisc.edu/~faisal/libhdfs_testresult.txt . Tests look ok.

Porting libhdfs to Windows
--------------------------

    Key: HDFS-573
    URL: https://issues.apache.org/jira/browse/HDFS-573
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: hdfs client
    Environment: Windows, Visual Studio 2008
    Reporter: Ziliang Guo
    Original Estimate: 336h
    Remaining Estimate: 336h

The current C code in libhdfs is written using C99 conventions and also uses a few POSIX-specific functions such as hcreate, hsearch, and pthread mutex locks. To compile it using Visual Studio would require converting the code in hdfsJniHelper.c and hdfs.c to C89 and replacing/reimplementing the POSIX functions. The code also uses the stdint.h header, which is not part of the original C89, but there is what appears to be a BSD-licensed reimplementation floating around, written to be compatible with MSVC. I have already done the other necessary conversions, created a simplistic hash bucket for use with hcreate and hsearch, and successfully built a DLL of libhdfs. Further testing is needed to see if it is usable by other programs to actually access HDFS, which will likely happen in the next few weeks as the Condor Project continues with its file transfer work.

In the process, I've removed a few consts that I believe are extraneous and also fixed an incorrect array initialization where someone was attempting to initialize with something like this:

  JavaVMOption options[noArgs];

where noArgs was being incremented in the code above. This was in the hdfsJniHelper.c file, in the getJNIEnv function.
[jira] Commented: (HDFS-222) Support for concatenating of files into a single file
[ https://issues.apache.org/jira/browse/HDFS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755619#action_12755619 ]

Doug Cutting commented on HDFS-222:
-----------------------------------

bq. add this to DistributedFileSystem and not FileSystem and that distcp does a class narrow to use it if it is available

+1 This sounds like a reasonable plan.

Support for concatenating of files into a single file
------------------------------------------------------

    Key: HDFS-222
    URL: https://issues.apache.org/jira/browse/HDFS-222
    Project: Hadoop HDFS
    Issue Type: New Feature
    Reporter: Venkatesh S
    Assignee: Boris Shkolnik

An API to concatenate files of the same size and replication factor on HDFS into a single larger file.
[jira] Commented: (HDFS-202) Add a bulk FileSystem.getFileBlockLocations
[ https://issues.apache.org/jira/browse/HDFS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755621#action_12755621 ]

dhruba borthakur commented on HDFS-202:
---------------------------------------

+1

Add a bulk FileSystem.getFileBlockLocations
-------------------------------------------

    Key: HDFS-202
    URL: https://issues.apache.org/jira/browse/HDFS-202
    Project: Hadoop HDFS
    Issue Type: New Feature
    Reporter: Arun C Murthy
    Assignee: Jakob Homan

Currently map-reduce applications (specifically file-based input-formats) use FileSystem.getFileBlockLocations to compute splits. However, they are forced to call it once per file. The downsides are multiple:
# Even with a few thousand files to process, the number of RPCs quickly starts getting noticeable.
# The current implementation of getFileBlockLocations is too slow, since each call results in a 'search' in the namesystem. Assuming a few thousand input files, it results in that many RPCs and 'searches'.

It would be nice to have a FileSystem.getFileBlockLocations which can take in a directory, and return the block-locations for all files in that directory. We could eliminate both the per-file RPC and also the 'search' by a 'scan'. When I tested this for terasort, a moderate job with 8000 input files, the runtime halved from the current 8s to 4s. Clearly this is much more important for latency-sensitive applications...
[jira] Commented: (HDFS-516) Low Latency distributed reads
[ https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755627#action_12755627 ]

Raghu Angadi commented on HDFS-516:
-----------------------------------

bq. somehow, from 213 seconds to 112 seconds to stream 1GB from a remote HDFS file.

This is 5 MBps for HDFS and 9 MBps for RadFS. Assuming 9 MBps is probably the 100 Mbps network limit (is it?), 5 MBps is too low for any FS. Since both reads are from the same physical files, this may not be hardware related. Could you check what is causing this delay? It might be affecting other benchmarks as well. Checking netstat on the client while this read is going on might help.

Regarding reads in RadFS: does the client fetch 32KB each time (single RPC), or does it pipeline (multiple requests for a single client's stream)?

@Todd, I essentially see this as a POC of what could/should be improved in HDFS for addressing latency issues. Contrib makes sense, but I would not expect this to go to production in this form, and it should be marked 'Experimental'. The benchmarks also help greatly in setting priorities for features. I don't think this needs a branch since it does not touch core at all.

Low Latency distributed reads
-----------------------------

    Key: HDFS-516
    URL: https://issues.apache.org/jira/browse/HDFS-516
    Project: Hadoop HDFS
    Issue Type: New Feature
    Reporter: Jay Booth
    Priority: Minor
    Attachments: hdfs-516-20090912.patch
    Original Estimate: 168h
    Remaining Estimate: 168h

I created a method for low latency random reads using NIO on the server side and simulated OS paging with LRU caching and lookahead on the client side. Some applications could include Lucene searching (term-doc and doc-offset mappings are likely to be in local cache, thus much faster than Nutch's current FsDirectory impl) and binary search through record files (bytes at 1/2, 1/4, 1/8 marks are likely to be cached).
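For readers unfamiliar with the client-side technique the description mentions, here is a toy LRU chunk cache in the spirit of that design; the class and field names are invented for illustration and are not from the attached patch:

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache of fixed-size chunks keyed by file offset.
class ChunkCache extends LinkedHashMap<Long, byte[]> {
  private final int maxChunks;

  ChunkCache(int maxChunks) {
    super(16, 0.75f, true);  // accessOrder = true gives LRU iteration order
    this.maxChunks = maxChunks;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
    return size() > maxChunks;  // evict the least-recently-used chunk
  }
}
{code}

Lookahead would then amount to populating the cache with the next few chunks whenever a read misses.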
[jira] Updated: (HDFS-617) Support for non-recursive create() in HDFS
[ https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HDFS-617:
---------------------------

    Status: Patch Available  (was: Open)

Support for non-recursive create() in HDFS
------------------------------------------

    Key: HDFS-617
    URL: https://issues.apache.org/jira/browse/HDFS-617
    Project: Hadoop HDFS
    Issue Type: Improvement
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Attachments: h617-01.patch, h617-02.patch, h617-03.patch, h617-04.patch, h617-06.patch

HADOOP-4952 calls for a create call that doesn't automatically create missing parent directories.
[jira] Commented: (HDFS-516) Low Latency distributed reads
[ https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755640#action_12755640 ]

Todd Lipcon commented on HDFS-516:
----------------------------------

bq. I essentially see this as POC of what could/should be improved in HDFS for addressing latency issues. Contrib makes sense, but I would not expect this to go to production in this form and should be marked 'Experimental'.

+1

Low Latency distributed reads
-----------------------------

    Key: HDFS-516
    URL: https://issues.apache.org/jira/browse/HDFS-516
    Project: Hadoop HDFS
    Issue Type: New Feature
    Reporter: Jay Booth
    Priority: Minor
    Attachments: hdfs-516-20090912.patch
    Original Estimate: 168h
    Remaining Estimate: 168h

I created a method for low latency random reads using NIO on the server side and simulated OS paging with LRU caching and lookahead on the client side. Some applications could include Lucene searching (term-doc and doc-offset mappings are likely to be in local cache, thus much faster than Nutch's current FsDirectory impl) and binary search through record files (bytes at 1/2, 1/4, 1/8 marks are likely to be cached).
[jira] Updated: (HDFS-616) Create functional tests for new design of the block report
[ https://issues.apache.org/jira/browse/HDFS-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated HDFS-616:
------------------------------------

    Attachment: HDFS-616.patch

This patch adds BlockReport03 through BlockReport08 cases according to HDFS-551's test plan. The modifications are done against the Append branch.

Create functional tests for new design of the block report
-----------------------------------------------------------

    Key: HDFS-616
    URL: https://issues.apache.org/jira/browse/HDFS-616
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Reporter: Konstantin Boudnik
    Assignee: Konstantin Boudnik
    Attachments: HDFS-616.patch
[jira] Updated: (HDFS-617) Support for non-recursive create() in HDFS
[ https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-617:
----------------------------------------

    Component/s: name-node
                 hdfs client
    Fix Version/s: 0.21.0
    Hadoop Flags: [Reviewed]

+1 the patch is perfect!

Support for non-recursive create() in HDFS
------------------------------------------

    Key: HDFS-617
    URL: https://issues.apache.org/jira/browse/HDFS-617
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: hdfs client, name-node
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Fix For: 0.21.0
    Attachments: h617-01.patch, h617-02.patch, h617-03.patch, h617-04.patch, h617-06.patch

HADOOP-4952 calls for a create call that doesn't automatically create missing parent directories.
[jira] Updated: (HDFS-617) Support for non-recursive create() in HDFS
[ https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-617:
----------------------------------------

    Resolution: Fixed
    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed])
    Status: Resolved  (was: Patch Available)

I have committed this. Thanks, Kan!

Please add a release note.

Support for non-recursive create() in HDFS
------------------------------------------

    Key: HDFS-617
    URL: https://issues.apache.org/jira/browse/HDFS-617
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: hdfs client, name-node
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Fix For: 0.21.0
    Attachments: h617-01.patch, h617-02.patch, h617-03.patch, h617-04.patch, h617-06.patch

HADOOP-4952 calls for a create call that doesn't automatically create missing parent directories.
[jira] Commented: (HDFS-621) Exposing MiniDFS and MiniMR clusters as a single process command-line
[ https://issues.apache.org/jira/browse/HDFS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755716#action_12755716 ]

Tsz Wo (Nicholas), SZE commented on HDFS-621:
---------------------------------------------

I guess you need a separate MAPREDUCE patch for MiniMR.

Exposing MiniDFS and MiniMR clusters as a single process command-line
----------------------------------------------------------------------

    Key: HDFS-621
    URL: https://issues.apache.org/jira/browse/HDFS-621
    Project: Hadoop HDFS
    Issue Type: New Feature
    Components: test, tools
    Reporter: Philip Zeyliger
    Priority: Minor

It's hard to test non-Java programs that rely on significant mapreduce functionality. The patch I'm proposing shortly will let you just type

  bin/hadoop jar hadoop-hdfs-hdfswithmr-test.jar minicluster

to start a cluster (internally, it's using Mini{MR,HDFS}Cluster) with a specified number of daemons, etc. A test that checks how some external process interacts with Hadoop might start minicluster as a subprocess, run through its thing, and then simply kill the java subprocess. I've been using just such a system for a couple of weeks, and I like it. It's significantly easier than developing a lot of scripts to start a pseudo-distributed cluster and then clean up after it. I figure others might find it useful as well.

I'm at a bit of a loss as to where to put it in 0.21. hdfs-with-mr tests have all the required libraries, so I've put it there. I could conceivably split this into minimr and minihdfs, but it's specifically the fact that they're configured to talk to each other that I like about having them together. And one JVM is better than two for my test programs.
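Roughly what such an entry point wraps, for readers who haven't used the mini clusters; the constructor arguments below are from memory of the 0.20-era test APIs and may not match this patch exactly:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.mapred.MiniMRCluster;

class MiniClusterExample {
  void start() throws Exception {
    Configuration conf = new Configuration();
    // One-datanode HDFS cluster, freshly formatted.
    MiniDFSCluster dfs = new MiniDFSCluster(conf, 1, true, null);
    // One-tasktracker MR cluster pointed at the mini HDFS.
    MiniMRCluster mr =
        new MiniMRCluster(1, dfs.getFileSystem().getUri().toString(), 1);
  }
}
{code}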
[jira] Updated: (HDFS-621) Exposing MiniDFS and MiniMR clusters as a single process command-line
[ https://issues.apache.org/jira/browse/HDFS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Zeyliger updated HDFS-621:
---------------------------------

    Assignee: Philip Zeyliger
    Status: Patch Available  (was: Open)

Exposing MiniDFS and MiniMR clusters as a single process command-line
----------------------------------------------------------------------

    Key: HDFS-621
    URL: https://issues.apache.org/jira/browse/HDFS-621
    Project: Hadoop HDFS
    Issue Type: New Feature
    Components: test, tools
    Reporter: Philip Zeyliger
    Assignee: Philip Zeyliger
    Priority: Minor
    Attachments: HDFS-621.patch

It's hard to test non-Java programs that rely on significant mapreduce functionality. The patch I'm proposing shortly will let you just type

  bin/hadoop jar hadoop-hdfs-hdfswithmr-test.jar minicluster

to start a cluster (internally, it's using Mini{MR,HDFS}Cluster) with a specified number of daemons, etc. A test that checks how some external process interacts with Hadoop might start minicluster as a subprocess, run through its thing, and then simply kill the java subprocess. I've been using just such a system for a couple of weeks, and I like it. It's significantly easier than developing a lot of scripts to start a pseudo-distributed cluster and then clean up after it. I figure others might find it useful as well.

I'm at a bit of a loss as to where to put it in 0.21. hdfs-with-mr tests have all the required libraries, so I've put it there. I could conceivably split this into minimr and minihdfs, but it's specifically the fact that they're configured to talk to each other that I like about having them together. And one JVM is better than two for my test programs.
[jira] Commented: (HDFS-621) Exposing MiniDFS and MiniMR clusters as a single process command-line
[ https://issues.apache.org/jira/browse/HDFS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755743#action_12755743 ]

Tsz Wo (Nicholas), SZE commented on HDFS-621:
---------------------------------------------

bq. If you would, take a quick look to see how I use MiniMRCluster. Do you feel I'm abusing the fact that hdfs-hdfswithmr-test exists?

No more mapreduce code in hdfs, please. Having hdfs-with-mr in hdfs is a mistake. It leads to a circular dependency. Indeed, we should move hdfs-with-mr to mapreduce.

Exposing MiniDFS and MiniMR clusters as a single process command-line
----------------------------------------------------------------------

    Key: HDFS-621
    URL: https://issues.apache.org/jira/browse/HDFS-621
    Project: Hadoop HDFS
    Issue Type: New Feature
    Components: test, tools
    Reporter: Philip Zeyliger
    Assignee: Philip Zeyliger
    Priority: Minor
    Attachments: HDFS-621.patch

It's hard to test non-Java programs that rely on significant mapreduce functionality. The patch I'm proposing shortly will let you just type

  bin/hadoop jar hadoop-hdfs-hdfswithmr-test.jar minicluster

to start a cluster (internally, it's using Mini{MR,HDFS}Cluster) with a specified number of daemons, etc. A test that checks how some external process interacts with Hadoop might start minicluster as a subprocess, run through its thing, and then simply kill the java subprocess. I've been using just such a system for a couple of weeks, and I like it. It's significantly easier than developing a lot of scripts to start a pseudo-distributed cluster and then clean up after it. I figure others might find it useful as well.

I'm at a bit of a loss as to where to put it in 0.21. hdfs-with-mr tests have all the required libraries, so I've put it there. I could conceivably split this into minimr and minihdfs, but it's specifically the fact that they're configured to talk to each other that I like about having them together. And one JVM is better than two for my test programs.
[jira] Updated: (HDFS-621) Exposing MiniDFS and MiniMR clusters as a single process command-line
[ https://issues.apache.org/jira/browse/HDFS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Zeyliger updated HDFS-621:
---------------------------------

    Attachment: HDFS-621-0.20-patch

Attaching the 0.20 version. Conveniently, where to place it is not a problem there :)

Exposing MiniDFS and MiniMR clusters as a single process command-line
----------------------------------------------------------------------

    Key: HDFS-621
    URL: https://issues.apache.org/jira/browse/HDFS-621
    Project: Hadoop HDFS
    Issue Type: New Feature
    Components: test, tools
    Reporter: Philip Zeyliger
    Assignee: Philip Zeyliger
    Priority: Minor
    Attachments: HDFS-621-0.20-patch, HDFS-621.patch

It's hard to test non-Java programs that rely on significant mapreduce functionality. The patch I'm proposing shortly will let you just type

  bin/hadoop jar hadoop-hdfs-hdfswithmr-test.jar minicluster

to start a cluster (internally, it's using Mini{MR,HDFS}Cluster) with a specified number of daemons, etc. A test that checks how some external process interacts with Hadoop might start minicluster as a subprocess, run through its thing, and then simply kill the java subprocess. I've been using just such a system for a couple of weeks, and I like it. It's significantly easier than developing a lot of scripts to start a pseudo-distributed cluster and then clean up after it. I figure others might find it useful as well.

I'm at a bit of a loss as to where to put it in 0.21. hdfs-with-mr tests have all the required libraries, so I've put it there. I could conceivably split this into minimr and minihdfs, but it's specifically the fact that they're configured to talk to each other that I like about having them together. And one JVM is better than two for my test programs.
[jira] Updated: (HDFS-618) Support for non-recursive mkdir in HDFS
[ https://issues.apache.org/jira/browse/HDFS-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HDFS-618:
---------------------------

    Attachment: h618-06.patch

Support for non-recursive mkdir in HDFS
---------------------------------------

    Key: HDFS-618
    URL: https://issues.apache.org/jira/browse/HDFS-618
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: hdfs client, name-node
    Affects Versions: 0.21.0
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Attachments: h618-03.patch, h618-04.patch, h618-06.patch

The existing mkdirs call automatically creates missing parent directories. HADOOP-4952 calls for a mkdir call that doesn't.
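As a rough illustration of the semantics in question (fc is assumed to be an org.apache.hadoop.fs.FileContext; the parameter shape is an assumption, not necessarily this patch's API):

{code}
// With createParent == false, the call is expected to fail when /a/b is
// missing, rather than silently creating the missing parents.
fc.mkdir(new Path("/a/b/c"), FsPermission.getDefault(), false /* createParent */);
{code}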
[jira] Commented: (HDFS-618) Support for non-recursive mkdir in HDFS
[ https://issues.apache.org/jira/browse/HDFS-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755757#action_12755757 ]

Kan Zhang commented on HDFS-618:
--------------------------------

Attached a new patch for the latest trunk. Also updated the test to check for FileAlreadyExistsException.

Support for non-recursive mkdir in HDFS
---------------------------------------

    Key: HDFS-618
    URL: https://issues.apache.org/jira/browse/HDFS-618
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: hdfs client, name-node
    Affects Versions: 0.21.0
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Attachments: h618-03.patch, h618-04.patch, h618-06.patch

The existing mkdirs call automatically creates missing parent directories. HADOOP-4952 calls for a mkdir call that doesn't.
[jira] Updated: (HDFS-574) Hadoop Doc Split: HDFS Docs
[ https://issues.apache.org/jira/browse/HDFS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Corinne Chandel updated HDFS-574:
---------------------------------

    Attachment: hdfs-logo.jpg

New logo file: hdfs-logo.jpg

Hadoop Doc Split: HDFS Docs
---------------------------

    Key: HDFS-574
    URL: https://issues.apache.org/jira/browse/HDFS-574
    Project: Hadoop HDFS
    Issue Type: Task
    Components: documentation
    Affects Versions: 0.21.0
    Reporter: Corinne Chandel
    Assignee: Owen O'Malley
    Priority: Blocker
    Attachments: Hadoop-Doc-Split-2.doc, Hadoop-Doc-Split.doc, HDFS-574-hdfs.patch, hdfs-logo.jpg

Hadoop Doc Split: HDFS Docs

Please note that I am unable to directly check all of the new links. Some links may break and will need to be updated.
[jira] Updated: (HDFS-617) Support for non-recursive create() in HDFS
[ https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HDFS-617:
---------------------------

    Release Note: Support for non-recursive create()

Support for non-recursive create() in HDFS
------------------------------------------

    Key: HDFS-617
    URL: https://issues.apache.org/jira/browse/HDFS-617
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: hdfs client, name-node
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Fix For: 0.21.0
    Attachments: h617-01.patch, h617-02.patch, h617-03.patch, h617-04.patch, h617-06.patch

HADOOP-4952 calls for a create call that doesn't automatically create missing parent directories.
[jira] Updated: (HDFS-616) Create functional tests for new design of the block report
[ https://issues.apache.org/jira/browse/HDFS-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated HDFS-616:
------------------------------------

    Attachment: HDFS-616.patch

All test cases are now in place. Three of them keep failing because some of the functionality isn't implemented yet.

Create functional tests for new design of the block report
-----------------------------------------------------------

    Key: HDFS-616
    URL: https://issues.apache.org/jira/browse/HDFS-616
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Components: test
    Affects Versions: Append Branch
    Reporter: Konstantin Boudnik
    Assignee: Konstantin Boudnik
    Attachments: HDFS-616.patch, HDFS-616.patch
[jira] Updated: (HDFS-551) Create new functional test for a block report.
[ https://issues.apache.org/jira/browse/HDFS-551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated HDFS-551:
------------------------------------

    Attachment: BlockReportTestPlan.html

The last three test cases were removed because they essentially re-enforce the behavior of BlockReport_09.

Create new functional test for a block report.
-----------------------------------------------

    Key: HDFS-551
    URL: https://issues.apache.org/jira/browse/HDFS-551
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Components: test
    Affects Versions: 0.21.0
    Reporter: Konstantin Boudnik
    Assignee: Konstantin Boudnik
    Fix For: 0.21.0
    Attachments: BlockReportTestPlan.html, BlockReportTestPlan.html, BlockReportTestPlan.html, BlockReportTestPlan.html, HDFS-551.patch, HDFS-551.patch, HDFS-551.patch, HDFS-551.patch, HDFS-551.patch, HDFS-551.patch, HDFS-551.patch, HDFS-551.patch

It turned out that there's no test for block report functionality. One would be extremely valuable.
[jira] Updated: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-592:
-------------------------------

    Attachment: newGS2.patch

This patch incorporates Kan's review comments and adds a unit test for the new API in ClientProtocol.

I understand your concern about the naming of the API, and what you said makes good sense. But I still do not like the name pipelineRecovery or recoverPipeline. This patch uses the name getNewStampForPipeline. If you still do not like the name, could we resolve the naming issue later? I will keep this in mind.

Allow client to get a new generation stamp from NameNode
---------------------------------------------------------

    Key: HDFS-592
    URL: https://issues.apache.org/jira/browse/HDFS-592
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Components: name-node
    Affects Versions: Append Branch
    Reporter: Hairong Kuang
    Assignee: Hairong Kuang
    Fix For: Append Branch
    Attachments: newGS.patch, newGS1.patch, newGS2.patch

This issue aims to add an API to ClientProtocol that fetches a new generation stamp and an access token from NameNode to support append or pipeline recovery.
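For context, the rough shape of the call being named; only the method name getNewStampForPipeline comes from the comment above, while the parameters and return type are assumptions:

{code}
import java.io.IOException;

import org.apache.hadoop.hdfs.protocol.Block;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;

// Hypothetical sketch of the ClientProtocol addition under discussion.
interface ClientProtocolAddition {
  /**
   * Fetch a new generation stamp (and access token) for the given block,
   * to support append or pipeline recovery.
   */
  LocatedBlock getNewStampForPipeline(Block block, String clientName)
      throws IOException;
}
{code}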
[jira] Commented: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755800#action_12755800 ]

Hairong Kuang commented on HDFS-592:
------------------------------------

    [exec] +1 overall.
    [exec]
    [exec]     +1 @author. The patch does not contain any @author tags.
    [exec]
    [exec]     +1 tests included. The patch appears to include 6 new or modified tests.
    [exec]
    [exec]     +1 javadoc. The javadoc tool did not generate any warning messages.
    [exec]
    [exec]     +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    [exec]
    [exec]     +1 findbugs. The patch does not introduce any new Findbugs warnings.
    [exec]
    [exec]     +1 release audit. The applied patch does not increase the total number of release audit warnings.

Allow client to get a new generation stamp from NameNode
---------------------------------------------------------

    Key: HDFS-592
    URL: https://issues.apache.org/jira/browse/HDFS-592
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Components: name-node
    Affects Versions: Append Branch
    Reporter: Hairong Kuang
    Assignee: Hairong Kuang
    Fix For: Append Branch
    Attachments: newGS.patch, newGS1.patch, newGS2.patch

This issue aims to add an API to ClientProtocol that fetches a new generation stamp and an access token from NameNode to support append or pipeline recovery.
[jira] Commented: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755816#action_12755816 ]

Kan Zhang commented on HDFS-592:
--------------------------------

bq. If you still do not like the name, could we resolve the naming issue later?

Well, I have said enough on this. It's your call.

I have a further comment on the following lease checking. It seems that if the client sends NULL for clientName, the checking is bypassed, which could become a security loophole.

{code}
+    if (clientName != null
+        && !pendingFile.getClientName().equals(clientName)) {
+      throw new LeaseExpiredException("Lease mismatch: " + block +
+          " owned by " + pendingFile.getClientName() +
+          " but is accessed by " + clientName);
+    }
{code}

Allow client to get a new generation stamp from NameNode
---------------------------------------------------------

    Key: HDFS-592
    URL: https://issues.apache.org/jira/browse/HDFS-592
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Components: name-node
    Affects Versions: Append Branch
    Reporter: Hairong Kuang
    Assignee: Hairong Kuang
    Fix For: Append Branch
    Attachments: newGS.patch, newGS1.patch, newGS2.patch

This issue aims to add an API to ClientProtocol that fetches a new generation stamp and an access token from NameNode to support append or pipeline recovery.
[jira] Updated: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-592:
-------------------------------

    Attachment: newGS3.patch

Kan, thanks for catching this. ClientName should never be null in the real system. But anyway, the new patch checks the null case and adds a new unit test for this.

Allow client to get a new generation stamp from NameNode
---------------------------------------------------------

    Key: HDFS-592
    URL: https://issues.apache.org/jira/browse/HDFS-592
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Components: name-node
    Affects Versions: Append Branch
    Reporter: Hairong Kuang
    Assignee: Hairong Kuang
    Fix For: Append Branch
    Attachments: newGS.patch, newGS1.patch, newGS2.patch, newGS3.patch

This issue aims to add an API to ClientProtocol that fetches a new generation stamp and an access token from NameNode to support append or pipeline recovery.
[jira] Commented: (HDFS-516) Low Latency distributed reads
[ https://issues.apache.org/jira/browse/HDFS-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755823#action_12755823 ]

Jay Booth commented on HDFS-516:
--------------------------------

Yeah, I was puzzled by the performance too. I dug through the DFS code and I'm saving a bit on new socket and object creation, maybe a couple instructions here and there, but that shouldn't add up to 100 seconds for a gigabyte (approx 20 blocks). I'm calling read() a bajillion times in a row, so it's conceivable (although unlikely) that I'm pegging the CPU and that's the limiting factor. I'm busy for a couple days but will get back to you with some figures from netstat, top, and whatever else I can think of, along with another streaming case that works with read(b, off, len) to see if that changes things. I'll do a little more digging into DFS as well to see if I can isolate the cause.

I definitely did run them several times on the same machine, and another time on a different cluster with similar results, so it wasn't simply bad luck on the rack placement on EC2 (well, maybe, but unlikely). Will report back when I have more numbers.

After I get those, my roadmap for this is to add checksum support and better DatanodeInfo caching. User groups would come after that.

Low Latency distributed reads
-----------------------------

    Key: HDFS-516
    URL: https://issues.apache.org/jira/browse/HDFS-516
    Project: Hadoop HDFS
    Issue Type: New Feature
    Reporter: Jay Booth
    Priority: Minor
    Attachments: hdfs-516-20090912.patch
    Original Estimate: 168h
    Remaining Estimate: 168h

I created a method for low latency random reads using NIO on the server side and simulated OS paging with LRU caching and lookahead on the client side. Some applications could include Lucene searching (term-doc and doc-offset mappings are likely to be in local cache, thus much faster than Nutch's current FsDirectory impl) and binary search through record files (bytes at 1/2, 1/4, 1/8 marks are likely to be cached).
[jira] Commented: (HDFS-618) Support for non-recursive mkdir in HDFS
[ https://issues.apache.org/jira/browse/HDFS-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755825#action_12755825 ]

Hadoop QA commented on HDFS-618:
--------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12419697/h618-06.patch
  against trunk revision 815496.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 6 new or modified tests.

    -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/8/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/8/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/8/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/8/console

This message is automatically generated.

Support for non-recursive mkdir in HDFS
---------------------------------------

    Key: HDFS-618
    URL: https://issues.apache.org/jira/browse/HDFS-618
    Project: Hadoop HDFS
    Issue Type: Improvement
    Components: hdfs client, name-node
    Affects Versions: 0.21.0
    Reporter: Kan Zhang
    Assignee: Kan Zhang
    Attachments: h618-03.patch, h618-04.patch, h618-06.patch

The existing mkdirs call automatically creates missing parent directories. HADOOP-4952 calls for a mkdir call that doesn't.
[jira] Created: (HDFS-622) checkMinReplication should count only live node.
checkMinReplication should count only live node.
------------------------------------------------

    Key: HDFS-622
    URL: https://issues.apache.org/jira/browse/HDFS-622
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: name-node
    Affects Versions: 0.21.0
    Reporter: Konstantin Shvachko
    Assignee: Konstantin Shvachko
    Fix For: 0.21.0

{{BlockManager.checkMinReplication(Block)}} currently counts all replicas of the block, even if they are corrupt. Corrupt replicas should be excluded.
[jira] Updated: (HDFS-622) checkMinReplication should count only live node.
[ https://issues.apache.org/jira/browse/HDFS-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-622:
-------------------------------------

    Attachment: liveReplicas.patch

Here is a simple patch which replaces {{blocksMap.numNodes(block) >= minReplication}} with {{countNodes(block).liveReplicas() >= minReplication}}.

checkMinReplication should count only live node.
------------------------------------------------

    Key: HDFS-622
    URL: https://issues.apache.org/jira/browse/HDFS-622
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: name-node
    Affects Versions: 0.21.0
    Reporter: Konstantin Shvachko
    Assignee: Konstantin Shvachko
    Fix For: 0.21.0
    Attachments: liveReplicas.patch

{{BlockManager.checkMinReplication(Block)}} currently counts all replicas of the block, even if they are corrupt. Corrupt replicas should be excluded.
[jira] Updated: (HDFS-622) checkMinReplication should count only live node.
[ https://issues.apache.org/jira/browse/HDFS-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-622:
-------------------------------------

    Status: Patch Available  (was: Open)

checkMinReplication should count only live node.
------------------------------------------------

    Key: HDFS-622
    URL: https://issues.apache.org/jira/browse/HDFS-622
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: name-node
    Affects Versions: 0.21.0
    Reporter: Konstantin Shvachko
    Assignee: Konstantin Shvachko
    Fix For: 0.21.0
    Attachments: liveReplicas.patch

{{BlockManager.checkMinReplication(Block)}} currently counts all replicas of the block, even if they are corrupt. Corrupt replicas should be excluded.
[jira] Created: (HDFS-623) hdfs jar-test ant target fails with the latest commons jar's from the common trunk
hdfs jar-test ant target fails with the latest commons jars from the common trunk
----------------------------------------------------------------------------------

    Key: HDFS-623
    URL: https://issues.apache.org/jira/browse/HDFS-623
    Project: Hadoop HDFS
    Issue Type: Bug
    Reporter: Giridharan Kesavan

    [javac] somelocation/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestReplicationPolicy.java:67: incompatible types
    [javac] found   : org.apache.hadoop.hdfs.server.namenode.ReplicationTargetChooser
    [javac] required: org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicy
    [javac]     replicator = fsNamesystem.blockManager.replicator;
    [javac]                                            ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] 5 errors