[jira] [Resolved] (HDFS-4717) Change the parameter type of the snapshot methods in HdfsAdmin to Path

2013-04-18 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-4717.
--

   Resolution: Fixed
Fix Version/s: Snapshot (HDFS-2802)
 Hadoop Flags: Reviewed

Thanks Jing for reviewing the patch.

I have committed this.

> Change the parameter type of the snapshot methods in HdfsAdmin to Path
> --
>
> Key: HDFS-4717
> URL: https://issues.apache.org/jira/browse/HDFS-4717
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: h4717_20130418.patch
>
>
> In HdfsAdmin, the path parameter type in allowSnapshot(String path) and 
> disallowSnapshot(String path) should be Path, not String.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs

2013-04-18 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li resolved HDFS-4489.
--

Resolution: Fixed

Close this JIRA since all its sub-issues have been resolved. 

> Use InodeID as an identifier of a file in HDFS protocols and APIs
> 
>
> Key: HDFS-4489
> URL: https://issues.apache.org/jira/browse/HDFS-4489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brandon Li
>Assignee: Brandon Li
>
> The benefits of using InodeID to uniquely identify a file are manifold. Here 
> are a few of them:
> 1. uniquely identifying a file across renames; related JIRAs include 
> HDFS-4258 and HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been 
> replaced or renamed, the combination of file name and size is not reliable, 
> but the combination of file id and size is unique.
> 3. id-based protocol support (e.g., NFS)
> 4. making the pluggable block placement policy use file id instead of 
> file name (HDFS-385).
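A self-contained sketch of item 2 above: an identity check keyed on (file id, size) survives a rename, while one keyed on (path, size) does not. The `FileKey` class and its methods are hypothetical illustrations of the idea, not HDFS or distcp code.

```java
// Illustrative model of item 2 above, not actual HDFS/distcp code:
// a file identified by (path, size) looks "changed" after a rename,
// while (inode id, size) still matches the same file.
import java.util.Objects;

final class FileKey {
    final long inodeId;   // stable across renames, like an HDFS inode id
    final long size;
    String path;          // mutable: a rename changes it

    FileKey(long inodeId, long size, String path) {
        this.inodeId = inodeId;
        this.size = size;
        this.path = path;
    }

    /** Name-based check: breaks when the file is renamed or replaced. */
    static boolean sameByPath(FileKey a, FileKey b) {
        return Objects.equals(a.path, b.path) && a.size == b.size;
    }

    /** Id-based check: survives renames. */
    static boolean sameById(FileKey a, FileKey b) {
        return a.inodeId == b.inodeId && a.size == b.size;
    }
}
```

Renaming the file invalidates the path-based comparison but leaves the id-based one intact, which is exactly why a tool like distcp benefits from a stable file id.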



[jira] [Resolved] (HDFS-4716) TestAllowFormat#testFormatShouldBeIgnoredForNonFileBasedDirs fails on Windows

2013-04-18 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth resolved HDFS-4716.
-

Resolution: Duplicate

> TestAllowFormat#testFormatShouldBeIgnoredForNonFileBasedDirs fails on Windows
> -
>
> Key: HDFS-4716
> URL: https://issues.apache.org/jira/browse/HDFS-4716
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, test
>Affects Versions: 3.0.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-4716.1.patch
>
>
> {{TestAllowFormat#testFormatShouldBeIgnoredForNonFileBasedDirs}} fails on 
> Windows due to an incorrectly initialized name dir in the {{Configuration}} 
> used by the test.



Re: Heads up - Snapshots feature merge into trunk

2013-04-18 Thread Aaron T. Myers
On Fri, Apr 19, 2013 at 6:53 AM, Tsz Wo Sze  wrote:

> HdfsAdmin is also for admin operations.  However, createSnapshot etc
> methods aren't.
>

I agree that they're not administrative operations in the sense that they
don't strictly require super user privilege, but they are "administrative"
in the sense that they will most often be used by those administering HDFS.
The HdfsAdmin class should not be construed to contain only operations
which require super user privilege, even though that happens to be the case
right now. It's intended as just a public API for HDFS-specific operations.

Regardless, my point is not necessarily that these operations should go
into the HdfsAdmin class, but rather that they shouldn't go into the
FileSystem class, since the snapshots API doesn't seem to me like it will
generalize to other FileSystem implementations.

--
Aaron T. Myers
Software Engineer, Cloudera


[jira] [Created] (HDFS-4717) Change the parameter type of the snapshot methods in HdfsAdmin to Path

2013-04-18 Thread Tsz Wo (Nicholas), SZE (JIRA)
Tsz Wo (Nicholas), SZE created HDFS-4717:


 Summary: Change the parameter type of the snapshot methods in 
HdfsAdmin to Path
 Key: HDFS-4717
 URL: https://issues.apache.org/jira/browse/HDFS-4717
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE


In HdfsAdmin, the path parameter type in allowSnapshot(String path) and 
disallowSnapshot(String path) should be Path, not String.
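A minimal sketch of why a typed Path parameter beats a raw String: validation happens once at construction, so every method accepting the type can assume a well-formed absolute path. `TypedPath` is a hypothetical stand-in, not org.apache.hadoop.fs.Path or the real HdfsAdmin API.

```java
// Hypothetical sketch, not the real Path or HdfsAdmin classes: a typed
// path validates itself once at construction, so methods that accept it
// (allowSnapshot, disallowSnapshot, ...) never need to re-check a raw
// String at each call site.
final class TypedPath {
    private final String value;

    TypedPath(String value) {
        if (value == null || !value.startsWith("/")) {
            throw new IllegalArgumentException("absolute path required: " + value);
        }
        this.value = value;
    }

    @Override public String toString() { return value; }
}
```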



[jira] [Created] (HDFS-4716) TestAllowFormat#testFormatShouldBeIgnoredForNonFileBasedDirs fails on Windows

2013-04-18 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-4716:
---

 Summary: 
TestAllowFormat#testFormatShouldBeIgnoredForNonFileBasedDirs fails on Windows
 Key: HDFS-4716
 URL: https://issues.apache.org/jira/browse/HDFS-4716
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, test
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth


{{TestAllowFormat#testFormatShouldBeIgnoredForNonFileBasedDirs}} fails on 
Windows due to an incorrectly initialized name dir in the {{Configuration}} 
used by the test.



Re: Heads up - Snapshots feature merge into trunk

2013-04-18 Thread Tsz Wo Sze
Hi Aaron,

You are right that the arguments should be Path, not String.  Will fix it.

HdfsAdmin is also for admin operations.  However, createSnapshot etc methods 
aren't.

Tsz-Wo





 From: Aaron T. Myers 
To: "hdfs-dev@hadoop.apache.org" ; Tsz Wo Sze 
 
Sent: Thursday, April 18, 2013 1:49 PM
Subject: Re: Heads up - Snapshots feature merge into trunk
 


On Fri, Apr 19, 2013 at 4:48 AM, Tsz Wo Sze  wrote:

Currently, allowSnapshot(..) and disallowSnapshot(..) are already in HdfsAdmin.

Ah, my bad. Not sure how I missed those. Good to see. Though, now that I look 
at them, those methods should really be taking Paths as arguments, not Strings. 
This is obviously quite minor, though.
 

  The other operations createSnapshot(..), renameSnapshot(..) and 
deleteSnapshot(..) are actually user operations and they are declared in 
FileSystem.  Users can take snapshots for their own directories once admin has 
allowed snapshots for those directories.  Snapshot is not an HDFS-specific 
operation.  Many other file systems do support it.  No?
>
Certainly other "file systems" support it, e.g. WAFL, ZFS, etc, but do other 
"FileSystem" (the Hadoop class) implementations, e.g. LocalFileSystem, 
S3FileSystem, etc? Will they ever? If they do, will they support sub-tree 
snapshots like HDFS does? Snapshots in general seem like something whose 
implementation, interface, etc. are highly file system-specific, and thus I 
don't think it makes a ton of sense to put that API in what is intended to be a 
broad, stable interface. If we were to move these operations into the HdfsAdmin 
interface, there's nothing to stop users from using that interface instead of 
FileSystem. After all, that was the point of adding the HdfsAdmin class in the 
first place - to have a public API for performing HDFS-specific operations.


--Aaron T. Myers
Software Engineer, Cloudera

[jira] [Created] (HDFS-4715) Backport HDFS-3577 and other related WebHDFS JIRAs to branch-1

2013-04-18 Thread Tsz Wo (Nicholas), SZE (JIRA)
Tsz Wo (Nicholas), SZE created HDFS-4715:


 Summary: Backport HDFS-3577 and other related WebHDFS JIRAs to branch-1
 Key: HDFS-4715
 URL: https://issues.apache.org/jira/browse/HDFS-4715
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Mark Wagner


The related JIRAs are HDFS-3577, HDFS-3318, and HDFS-3788.  Backporting them 
can fix some WebHDFS performance issues in branch-1.



Re: collision in the naming of '.snapshot' directory between hdfs snapshot and hbase snapshot

2013-04-18 Thread Tsz Wo Sze
@HBase-dev, thanks for yielding the reserved word ".snapshot" to HDFS and for 
the fast fix addressing the problem.  You guys have done a great job!


@Harsh, it seems that more people think that .snapshot is better.


Tsz-Wo




 From: Andrew Purtell 
To: "hdfs-dev@hadoop.apache.org"  
Cc: "d...@hbase.apache.org"  
Sent: Wednesday, April 17, 2013 9:14 AM
Subject: Re: collision in the naming of '.snapshot' directory between hdfs 
snapshot and hbase snapshot
 

Thanks for the consideration but we've just committed a change to address
this as HBASE-8352


On Wednesday, April 17, 2013, Harsh J wrote:

> Pardon my late inquisition here, but since HBase already shipped out
> with the name .snapshots/, why do we force them to change it, and not
> rename HDFS' snapshots to use .hdfs-snapshots, given that HDFS
> Snapshots has not been released to any users yet? The way I see it,
> that would be much easier than making a workaround for a done
> deal in HBase, which already has its snapshot users.
>
> @Tsz-Wo - If the snapshots in HDFS aren't a 'generic' feature
> applicable to other FileSystem interface implementations as well, then
> .hdfs-snapshots should be fine for it - no?
>
> On Wed, Apr 17, 2013 at 2:32 AM, Ted Yu  wrote:
> > Hi,
> > Please take a look at patch v5 attached to HBASE-8352.
> >
> > It would be nice to resolve this blocker today so that the 0.94.7 RC can
> > be cut.
> >
> > Thanks
> >
> > On Tue, Apr 16, 2013 at 10:12 AM, lars hofhansl 
> wrote:
> >
> >> Please see my last comment on the jira. We can make this work without
> >> breaking users who are using HDFS snapshots.
> >>
> >>   --
> >>  *From:* Ted Yu 
> >> *To:* d...@hbase.apache.org
> >> *Cc:* hdfs-dev@hadoop.apache.org; lars hofhansl 
> >> *Sent:* Tuesday, April 16, 2013 10:00 AM
> >> *Subject:* Re: collision in the naming of '.snapshot' directory between
> >> hdfs snapshot and hbase snapshot
> >>
> >> Let's get proper release notes for HBASE-8352 .
> >>
> >> Either Lars or I can send out notification to user mailing list so that
> >> there is enough preparation for this change.
> >>
> >> Cheers
> >>
> >> On Tue, Apr 16, 2013 at 8:46 AM, Jonathan Hsieh 
> wrote:
> >>
> >> I was away from keyboard when I asserted that hdfs snapshot was a hadoop
> >> 2.1 or 3.0 feature.  Apparently it is targeted as a hadoop 2.0.5 feature.
> >> (I'm a little surprised -- expected this to be a hadoop2 compat breaking
> >> feature) -- so I agree that this is a bit more urgent.
> >>
> >> Anyway, I agree that the fs .snapshot naming convention is long standing
> >> and should win.
> >>
> >> My concern is with breaking compatibility in 0.94 again -- if we don't
> >> go down the conf variable route, I consider docs/release notes that
> >> properly document how to do the upgrade, and the caveats of doing so, a
> >> blocker for hbase 0.94.7 (specifically mentioning upgrades from 0.94.6
> >> to 0.94.7, and possibly to 0.95).
> >>
> >> Jon.
> >>
> >> On Mon, Apr 15, 2013 at 9:00 PM, Ted Yu  wrote:
> >>
> >> > bq. Alternatively, we can detect the underlying Hadoop version, and use
> >> > either .snapshot or .hbase_snapshot in 0.94 depending on h1 & h2.
> >> >
> >> > I think this would introduce more confusion, especially for operations.
> >> >
> >> > Cheers
> >> >
> >> > On Mon, Apr 15, 2013 at 8:52 PM, Enis Söztutar 
> >> wrote:
> >> >
> >> > > Because HDFS exposes the snapshots so that the normal file system
> >> > > operations are mapped inside snapshot dirs, I think HDFS reserving the
> >> > > .snapshot name makes sense. OTOH, nothing is specific about the dir
> >> > > name that is chosen by HBase.
> >> > >
> >> > > I would prefer to change the dir name in 0.94 as well, since 0.94 is
> >> > > also being run on top of hadoop 2. Alternatively, we can detect the
> >> > > underlying Hadoop version, and use either .snapshot or .hbase_snapshot
> >> > > in 0.94 depending on h1 & h2.
> >> > >
> >> > > Enis
> >> > >
> >> > >
> >> > > On Mon, Apr 15, 2013 at 8:31 PM, Ted Yu 
> wrote:
> >> > >
> >> > > > bq. let's make the hbase snapshot for a conf variable.
> >> > > >
> >> > > > Once we decide on the new name of the snapshot directory, we should
> >> > > > still use a hardcoded value. This aligns with the current code base:
> >> > > > See this snippet from HConstants:
> --
> Harsh J
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: Heads up - Snapshots feature merge into trunk

2013-04-18 Thread Aaron T. Myers
On Fri, Apr 19, 2013 at 4:48 AM, Tsz Wo Sze  wrote:

> Currently, allowSnapshot(..) and disallowSnapshot(..) are already in
> HdfsAdmin.
>

Ah, my bad. Not sure how I missed those. Good to see. Though, now that I
look at them, those methods should really be taking Paths as arguments, not
Strings. This is obviously quite minor, though.


>   The other operations createSnapshot(..), renameSnapshot(..) and
> deleteSnapshot(..) are actually user operations and they are declared in
> FileSystem.  Users can take snapshots for their own directories once admin
> has allowed snapshots for those directories.  Snapshot is not an
> HDFS-specific operation.  Many other file systems do support it.  No?
>

Certainly other "file systems" support it, e.g. WAFL, ZFS, etc, but do
other "FileSystem" (the Hadoop class) implementations, e.g.
LocalFileSystem, S3FileSystem, etc? Will they ever? If they do, will they
support sub-tree snapshots like HDFS does? Snapshots in general seem like
something whose implementation, interface, etc. are highly file
system-specific, and thus I don't think it makes a ton of sense to put that
API in what is intended to be a broad, stable interface. If we were to move
these operations into the HdfsAdmin interface, there's nothing to stop
users from using that interface instead of FileSystem. After all, that was
the point of adding the HdfsAdmin class in the first place - to have a
public API for performing HDFS-specific operations.

--
Aaron T. Myers
Software Engineer, Cloudera


[jira] [Resolved] (HDFS-4711) Can not change replication factor of file during moving or deleting.

2013-04-18 Thread Vladimir Barinov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Barinov resolved HDFS-4711.


Resolution: Won't Fix

> Can not change replication factor of file during moving or deleting.
> 
>
> Key: HDFS-4711
> URL: https://issues.apache.org/jira/browse/HDFS-4711
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Vladimir Barinov
>Assignee: Colin Patrick McCabe
>Priority: Minor
>
> I don't know if it is a feature or a bug.
> According to hdfs dfs -help, we can use the -D key to set specific options
> for an action.
> When copying or uploading a file to HDFS, we can override the replication
> factor with -D dfs.replication=N. That works well.
> But it doesn't work for moving or removing (to trash) a file.
> Steps to reproduce:
> Uploading file
> hdfs dfs -put somefile /tmp/somefile
> Copying with changing replication:
> hdfs dfs -D dfs.replication=1 -mv /tmp/somefile /tmp/somefile2
> hadoop version:
> Hadoop 2.0.0-cdh4.1.2



Re: [VOTE] Release Apache Hadoop 0.23.7

2013-04-18 Thread Thomas Graves
Thanks everyone for trying 0.23.7 out and voting.

The vote passes with 13 +1s (8 binding and 5 non-binding) and no -1s.

I'll push the release.

Tom


On 4/11/13 2:55 PM, "Thomas Graves"  wrote:

>I've created a release candidate (RC0) for hadoop-0.23.7 that I would like
>to release.
>
>This release is a sustaining release with several important bug fixes in
>it.
>
>The RC is available at:
>http://people.apache.org/~tgraves/hadoop-0.23.7-candidate-0/
>The RC tag in svn is here:
>http://svn.apache.org/viewvc/hadoop/common/tags/release-0.23.7-rc0/
>
>The maven artifacts are available via repository.apache.org.
>
>Please try the release and vote; the vote will run for the usual 7 days.
>
>thanks,
>Tom Graves
>



Re: Heads up - Snapshots feature merge into trunk

2013-04-18 Thread Tsz Wo Sze
Hi Aaron,

Thanks for supporting the snapshot feature.

Currently, allowSnapshot(..) and disallowSnapshot(..) are already in HdfsAdmin. 
 The other operations createSnapshot(..), renameSnapshot(..) and 
deleteSnapshot(..) are actually user operations and they are declared in 
FileSystem.  Users can take snapshots for their own directories once admin has 
allowed snapshots for those directories.  Snapshot is not an HDFS-specific 
operation.  Many other file systems do support it.  No?
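The division of labor described above -- an admin first allows snapshots on a directory, then users may create them -- can be modeled in a few lines. `SnapshotModel` is an in-memory sketch of these semantics under that assumption, not HDFS code.

```java
// In-memory sketch of the semantics described above, not HDFS code:
// allowSnapshot(..) is an admin operation; createSnapshot(..) is a user
// operation that succeeds only on directories the admin has allowed.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SnapshotModel {
    private final Set<String> snapshottable = new HashSet<>();
    private final Map<String, Set<String>> snapshots = new HashMap<>();

    // Admin operation (in HDFS this lives in HdfsAdmin).
    void allowSnapshot(String dir) {
        snapshottable.add(dir);
    }

    // User operation (in HDFS this is declared in FileSystem).
    void createSnapshot(String dir, String name) {
        if (!snapshottable.contains(dir)) {
            throw new IllegalStateException("snapshots not allowed on " + dir);
        }
        snapshots.computeIfAbsent(dir, d -> new HashSet<>()).add(name);
    }

    boolean hasSnapshot(String dir, String name) {
        return snapshots.getOrDefault(dir, new HashSet<>()).contains(name);
    }
}
```

The point of the sketch is only the ordering constraint: the user-facing call fails until the admin-facing call has marked the directory snapshottable.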

Tsz-Wo





 From: Aaron T. Myers 
To: "hdfs-dev@hadoop.apache.org"  
Sent: Wednesday, April 17, 2013 6:45 PM
Subject: Re: Heads up - Snapshots feature merge into trunk
 

I'm very excited to see that this project is nearing completion. I've been
following the development pretty closely and am very much looking forward
to getting this merged to trunk.

One thing that I do think we should address before the merge is moving the
programmatic APIs for working with snapshots. I've brought this up before,
and was told that it would be done in a separate JIRA, but I don't think
that JIRA was ever filed.

As it stands right now, the API for using snapshots is the following:

1. The API to create/delete/rename snapshots are in FileSystem.
2. The API to mark directories as snapshottable or not only exists in
DistributedFileSystem and DFSAdmin, neither of which are intended to be
public APIs.

In my opinion (and I think this was shared by others at the last snapshots
design meetup?) we should move #1 out of the FileSystem class since these
are primarily administrative APIs, and it is unlikely that any other
FileSystem implementation besides HDFS will ever implement these commands.
Also, #2 should really be in some public (not necessarily stable, but
public) class for use by tools which are used to administer HDFS. In my
opinion the most natural place for both of these APIs is in the HdfsAdmin
class, which is a public/evolving interface explicitly for these sorts of
operations.

What are others thoughts on this subject?

Best,
Aaron

--
Aaron T. Myers
Software Engineer, Cloudera


On Sat, Apr 13, 2013 at 10:05 AM, Suresh Srinivas wrote:

> Support for snapshots feature is being worked on in the jira
> https://issues.apache.org/jira/browse/HDFS-2802. This is an important and
> a
> large feature in HDFS. Please see a brief presentation that describes the
> feature at a high level from the Snapshot discussion meetup we had a while
> back -
> https://issues.apache.org/jira/secure/attachment/12552861/Snapshots.pdf.
>
> I am excited to announce that the feature development will soon be
> completed. Please see the jira for the design and the details of the
> subtasks. This is a heads up about the merge vote mail that will soon be
> sent.
>
> Details of development and testing:
> Development has been done in a separate branch -
> https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-2802. The
> design is posted at -
>
> https://issues.apache.org/jira/secure/attachment/12551474/Snapshots20121030.pdf
> .
> The feature development has involved close to 100 subtasks and close to 20K
> lines of code.
>
> A lot of unit tests have been added as a part of the feature. We also have
> been testing this in a cluster of 5 nodes with a long running test that
> mimics a real cluster usage with emphasis on use cases related to
> snapshots.  Please see the test plan
>
> https://issues.apache.org/jira/secure/attachment/12575442/snapshot-testplan.pdf
> for the details.
>
> Next steps, before calling for merge vote, we need to get the following
> done:
> - Add user documentation that describes the feature, and how to use it
> - Complete some of the pending tasks
> - Continue testing the feature and fix any bugs that might come up
> - Update the design document
>
> Thanks to everyone who has participated in design and development of this
> feature. Please review the work and help in testing the feature.
>
> Regards,
> Suresh
>

[jira] [Created] (HDFS-4714) Support logging short messages in Namenode IPC server for configurable list of exception classes.

2013-04-18 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-4714:


 Summary: Support logging short messages in Namenode IPC server for 
configurable list of exception classes.
 Key: HDFS-4714
 URL: https://issues.apache.org/jira/browse/HDFS-4714
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Kihwal Lee


The Namenode can slow down significantly if a rogue client/job issues a massive 
number of requests that will fail, e.g. permission denied, quota overage, etc. 
The major contributing factor in the slowdown is the long namenode log message, 
which includes a full stack trace.

Similar issues involving safe mode and the standby node have been addressed 
previously; we can extend that approach to suppress stack-trace logging for a 
configured list of exception classes.
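The proposal can be sketched as a small formatter: exceptions whose class is in a configured "terse" set are logged as a single line, while everything else keeps its stack trace. `TerseExceptionFormatter` is an illustrative model of the idea, not the actual NameNode IPC server code.

```java
// Illustrative model of the proposal, not NameNode IPC server code:
// exceptions whose class is in a configured "terse" set are formatted as
// one line; all others include the full stack trace.
import java.util.Set;

final class TerseExceptionFormatter {
    private final Set<Class<?>> terseClasses;

    TerseExceptionFormatter(Set<Class<?>> terseClasses) {
        this.terseClasses = terseClasses;
    }

    String format(Throwable t) {
        if (terseClasses.contains(t.getClass())) {
            return t.toString();  // short message only, no stack trace
        }
        StringBuilder sb = new StringBuilder(t.toString());
        for (StackTraceElement frame : t.getStackTrace()) {
            sb.append("\n\tat ").append(frame);
        }
        return sb.toString();
    }
}
```

Under this model, a flood of, say, permission-denied failures costs one log line each instead of a multi-line stack trace each, which is the slowdown the issue describes.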



Re: [VOTE] Release Apache Hadoop 2.0.4-alpha

2013-04-18 Thread Konstantin Boudnik
-0

the release is missing HADOOP-9704, which has a critical effect on downstream
projects, e.g. builds are affected. The issue was first raised back on 4/10/13
(http://is.gd/OGb3GG) and has never even been sneezed upon.

Cos

On Sat, Apr 13, 2013 at 03:26AM, Arun C Murthy wrote:
> Folks,
> 
> I've created a release candidate (RC2) for hadoop-2.0.4-alpha that I would 
> like to release.
> 
> The RC is available at: 
> http://people.apache.org/~acmurthy/hadoop-2.0.4-alpha-rc2/
> The RC tag in svn is here: 
> http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.4-alpha-rc2
> 
> The maven artifacts are available via repository.apache.org.
> 
> Please try the release and vote; the vote will run for the usual 7 days.
> 
> thanks,
> Arun
> 
> 
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
> 
> 




[jira] [Created] (HDFS-4713) Wrong server principal is used for rpc calls to namenode if HA is enabled

2013-04-18 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-4713:


 Summary: Wrong server principal is used for rpc calls to namenode 
if HA is enabled
 Key: HDFS-4713
 URL: https://issues.apache.org/jira/browse/HDFS-4713
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, namenode
Affects Versions: 2.0.4-alpha
Reporter: Kihwal Lee
Priority: Blocker


When various components are connecting to a namenode in an HA-enabled 
environment, the wrong server principal may be picked up.  This results in a 
SASL failure, since the client side uses the wrong service ticket for the 
connection.




Re: [VOTE] Release Apache Hadoop 0.23.7

2013-04-18 Thread Derek Dagit
+1 (non-binding)
checked sigs and checksums
built and ran some simple jobs on single-node
-- 
Derek

On Apr 17, 2013, at 18:27, Sandy Ryza wrote:

> +1 (non-binding)
> Built from source and ran a couple of MR examples on a single node cluster.
> 
> -Sandy
> 
> 
> On Wed, Apr 17, 2013 at 12:03 PM, Siddharth Seth
> wrote:
> 
>> +1 (binding).
>> Verified checksums and signature.
>> Built from source tar, deployed a single node cluster (CapacityScheduler)
>> and tried a couple of simple MR jobs.
>> 
>> - Sid
>> 
>> 
>> On Thu, Apr 11, 2013 at 12:55 PM, Thomas Graves >> wrote:
>> 
>>> I've created a release candidate (RC0) for hadoop-0.23.7 that I would
>> like
>>> to release.
>>> 
>>> This release is a sustaining release with several important bug fixes in
>>> it.
>>> 
>>> The RC is available at:
>>> http://people.apache.org/~tgraves/hadoop-0.23.7-candidate-0/
>>> The RC tag in svn is here:
>>> http://svn.apache.org/viewvc/hadoop/common/tags/release-0.23.7-rc0/
>>> 
>>> The maven artifacts are available via repository.apache.org.
>>> 
>>> Please try the release and vote; the vote will run for the usual 7 days.
>>> 
>>> thanks,
>>> Tom Graves
>>> 
>>> 
>> 



[jira] [Created] (HDFS-4712) New libhdfs method hdfsGetDataNodes

2013-04-18 Thread andrea manzi (JIRA)
andrea manzi created HDFS-4712:
--

 Summary: New libhdfs method hdfsGetDataNodes
 Key: HDFS-4712
 URL: https://issues.apache.org/jira/browse/HDFS-4712
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: libhdfs
Reporter: andrea manzi


We have implemented a possible extension to libhdfs to retrieve information 
about the available datanodes (there was initially a mail about this on the 
hadoop-hdfs-dev mailing list:
http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201204.mbox/%3CCANhO-
s0mvororrxpjnjbql6brkj4c7l+u816xkdc+2r0whj...@mail.gmail.com%3E)

I would like to know how to proceed to create a patch, because on the wiki 
http://wiki.apache.org/hadoop/HowToContribute I can see info about Java patches 
but nothing related to extensions in C.







[jira] [Created] (HDFS-4711) Can not change replication factor of file during moving or deleting.

2013-04-18 Thread Vladimir Barinov (JIRA)
Vladimir Barinov created HDFS-4711:
--

 Summary: Can not change replication factor of file during moving 
or deleting.
 Key: HDFS-4711
 URL: https://issues.apache.org/jira/browse/HDFS-4711
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Vladimir Barinov
Priority: Minor


I don't know if it is a feature or a bug.
According to hdfs dfs -help, we can use the -D key to set specific options for
an action.
When copying or uploading a file to HDFS, we can override the replication
factor with -D dfs.replication=N. That works well.
But it doesn't work for moving or removing (to trash) a file.
Steps to reproduce:
Uploading file
hdfs dfs -put somefile /tmp/somefile
Copying with changing replication:
hdfs dfs -D dfs.replication=1 -mv /tmp/somefile /tmp/somefile2



Build failed in Jenkins: Hadoop-Hdfs-trunk #1376

2013-04-18 Thread Apache Jenkins Server
See 

Changes:

[harsh] HADOOP-9450. HADOOP_USER_CLASSPATH_FIRST is not honored; CLASSPATH is 
PREpended instead of APpended. Contributed by Chris Nauroth and Harsh J. (harsh)

[tucu] MAPREDUCE-5128. mapred-default.xml is missing a bunch of history server 
configs. (sandyr via tucu)

[tucu] YARN-476. ProcfsBasedProcessTree info message confuses users. (sandyr 
via tucu)

[tucu] YARN-518. Fair Scheduler's document link could be added to the hadoop 
2.x main doc page. (sandyr via tucu)

[bikas] MAPREDUCE-5140. MR part of YARN-514 (Zhijie Shen via bikas)

[bikas] YARN-514.Delayed store operations should not result in RM 
unavailability for app submission (Zhijie Shen via bikas)

[suresh] HDFS-4695. TestEditLog leaks open file handles between tests. 
Contributed by Ivan Mitic.

--
[...truncated 14117 lines...]

Hadoop-Hdfs-trunk - Build # 1376 - Still Failing

2013-04-18 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1376/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 14310 lines...]
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)

Running org.apache.hadoop.contrib.bkjournal.TestCurrentInprogress
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.675 sec
Running org.apache.hadoop.contrib.bkjournal.TestBookKeeperConfiguration
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.066 sec
Running org.apache.hadoop.contrib.bkjournal.TestBookKeeperJournalManager
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.489 sec

Results :

Failed tests:   
testStandbyExceptionThrownDuringCheckpoint(org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints):
 SBN should have still been checkpointing.

Tests run: 32, Failures: 1, Errors: 0, Skipped: 0

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  SUCCESS 
[1:29:37.786s]
[INFO] Apache Hadoop HttpFS .. SUCCESS [2:23.040s]
[INFO] Apache Hadoop HDFS BookKeeper Journal . FAILURE [57.900s]
[INFO] Apache Hadoop HDFS Project  SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 1:32:59.664s
[INFO] Finished at: Thu Apr 18 13:07:57 UTC 2013
[INFO] Final Memory: 51M/711M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on 
project hadoop-hdfs-bkjournal: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-hdfs-bkjournal
Build step 'Execute shell' marked build as failure
Archiving artifacts
Updating HADOOP-9450
Updating MAPREDUCE-5128
Updating YARN-518
Updating YARN-476
Updating YARN-514
Updating HDFS-4695
Updating MAPREDUCE-5140
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

Hadoop-Hdfs-0.23-Build - Build # 585 - Still Unstable

2013-04-18 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/585/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 11955 lines...]
[INFO] --- maven-dependency-plugin:2.1:build-classpath (build-classpath) @ 
hadoop-hdfs-project ---
[INFO] No dependencies found.
[INFO] Wrote classpath file '/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/target/classes/mrapp-generated-classpath'.
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-install-plugin:2.3.1:install (default-install) @ 
hadoop-hdfs-project ---
[INFO] Installing /home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/pom.xml to /home/jenkins/.m2/repository/org/apache/hadoop/hadoop-hdfs-project/0.23.8-SNAPSHOT/hadoop-hdfs-project-0.23.8-SNAPSHOT.pom
[INFO] 
[INFO] --- maven-antrun-plugin:1.6:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-dependency-plugin:2.1:build-classpath (build-classpath) @ 
hadoop-hdfs-project ---
[INFO] No dependencies found.
[INFO] Skipped writing classpath file '/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-Build/trunk/hadoop-hdfs-project/target/classes/mrapp-generated-classpath'.  No changes found.
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:2.3.2:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] ** FindBugsMojo execute ***
[INFO] canGenerate is false
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  SUCCESS [4:53.289s]
[INFO] Apache Hadoop HttpFS .. SUCCESS [46.215s]
[INFO] Apache Hadoop HDFS Project  SUCCESS [0.057s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 5:40.190s
[INFO] Finished at: Thu Apr 18 11:39:29 UTC 2013
[INFO] Final Memory: 53M/768M
[INFO] 
+ /home/jenkins/tools/maven/latest/bin/mvn test 
-Dmaven.test.failure.ignore=true -Pclover 
-DcloverLicenseLocation=/home/jenkins/tools/clover/latest/lib/clover.license
Archiving artifacts
Recording test results
Build step 'Publish JUnit test result report' changed build result to UNSTABLE
Publishing Javadoc
Recording fingerprints
Updating YARN-72
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Unstable
Sending email for trigger: Unstable



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  org.apache.hadoop.fs.http.client.TestHttpFSFileSystem.testOperation[5]

Error Message:
null

Stack Trace:
java.lang.AssertionError: 
	at org.junit.Assert.fail(Assert.java:91)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.junit.Assert.assertTrue(Assert.java:54)
	at org.apache.hadoop.fs.http.client.TestHttpFSFileSystem.testDelete(TestHttpFSFileSystem.java:202)
	at org.apache.hadoop.fs.http.client.TestHttpFSFileSystem.operation(TestHttpFSFileSystem.java:420)
	at org.apache.hadoop.fs.http.client.TestHttpFSFileSystem.__CLR3_0_2jycbqc1wh(TestHttpFSFileSystem.java:473)
	at org.apache.hadoop.fs.http.client.TestHttpFSFileSystem.testOperation(TestHttpFSFileSystem.java:471)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.Refle

Jenkins build is still unstable: Hadoop-Hdfs-0.23-Build #585

2013-04-18 Thread Apache Jenkins Server
See 



[jira] [Created] (HDFS-4710) Turning off HDFS short-circuit checksums unexpectedly slows down Hive

2013-04-18 Thread Gopal V (JIRA)
Gopal V created HDFS-4710:
-

 Summary: Turning off HDFS short-circuit checksums unexpectedly slows down Hive
 Key: HDFS-4710
 URL: https://issues.apache.org/jira/browse/HDFS-4710
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.0.4-alpha
 Environment: Centos (EC2) + short-circuit reads on
Reporter: Gopal V
Priority: Minor


When short-circuit reads are on, the HDFS client slows down when checksums are turned off.

With checksums on, the query takes 45.341 seconds; with them turned off, it takes 56.345 seconds. This is slower than the speeds observed when short-circuiting is turned off.

The issue seems to be that FSDataInputStream.readByte() calls are transferred directly to the disk fd when checksums are turned off.

Even though all the columns are integers, the data is read via DataInputStream, whose readInt() issues four single-byte reads:

{code}
public final int readInt() throws IOException {
    int ch1 = in.read();
    int ch2 = in.read();
    int ch3 = in.read();
    int ch4 = in.read();
    if ((ch1 | ch2 | ch3 | ch4) < 0)
        throw new EOFException();
    return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
}
{code}

To confirm, an strace of the Yarn container shows

{code}
26690 read(154, "B", 1) = 1
26690 read(154, "\250", 1)  = 1
26690 read(154, ".", 1) = 1
26690 read(154, "\24", 1)   = 1
{code}

To emulate this without the entirety of Hive code, I have written a simpler 
test app 

https://github.com/t3rmin4t0r/shortcircuit-reader

The jar will read a file in -bs  sized buffers. Running it with 1-byte blocks gives results similar to the Hive test run.
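The cost of unbuffered single-byte reads is easy to reproduce with the plain JDK, independent of Hive or HDFS. Below is a minimal sketch (hypothetical class names `CountingInputStream` and `BufferDemo`; the 4096-byte sizes are chosen only for illustration) that counts how many times the underlying stream is hit when DataInputStream.readInt() runs unbuffered versus behind a BufferedInputStream:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo (not Hive/HDFS code): count hits on the underlying
// stream, standing in for read() syscalls against the disk fd.
class CountingInputStream extends FilterInputStream {
    int calls = 0;
    CountingInputStream(InputStream in) { super(in); }
    @Override public int read() throws IOException {
        calls++;
        return super.read();
    }
    @Override public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

public class BufferDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = new byte[4096]; // 1024 four-byte ints

        // Unbuffered: each readInt() makes four 1-byte reads, and every
        // one of them goes straight to the underlying stream.
        CountingInputStream raw =
            new CountingInputStream(new ByteArrayInputStream(data));
        DataInputStream unbuffered = new DataInputStream(raw);
        for (int i = 0; i < 1024; i++) unbuffered.readInt();
        System.out.println("unbuffered calls: " + raw.calls); // 4096: one per byte

        // Buffered: a single bulk fill serves all the byte-sized reads.
        CountingInputStream raw2 =
            new CountingInputStream(new ByteArrayInputStream(data));
        DataInputStream buffered =
            new DataInputStream(new BufferedInputStream(raw2, 4096));
        for (int i = 0; i < 1024; i++) buffered.readInt();
        System.out.println("buffered calls: " + raw2.calls); // 1 bulk read
    }
}
```

This mirrors the strace output above: without buffering (or the buffering that the checksum path happens to provide), every byte becomes its own read against the fd.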


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira



Created a new Fix Version for 1.3.0

2013-04-18 Thread Harsh J
Just an FYI: since 1.2 has been branched and I couldn't find a 1.3 fix version, I went ahead and created one under the HADOOP and HDFS JIRA projects (it already existed under MAPREDUCE). I also updated HDFS-4622 and HDFS-4581 to reference their correct fix versions.

Thanks,
--
Harsh J