Re: Heads up: branch-2.1-beta

2013-06-18 Thread Arun C Murthy
Ping. Any luck?

On Jun 17, 2013, at 4:06 PM, Roman Shaposhnik  wrote:

> On Sun, Jun 16, 2013 at 5:14 PM, Arun C Murthy  wrote:
>> Roman,
>> 
>> Is there a chance you can run the tests with the full stack built against 
>> branch-2.1-beta and help us know where we are?
> 
> I will try to kick off the full build today. And deploy/test tomorrow.
> It is all pretty automated, but takes a long time. Hope the results
> will still be useful for you guys wrt. 2.1 release.
> 
> Thanks,
> Roman.

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




[jira] [Created] (HADOOP-9652) FileContext#getFileLinkStatus does not fill in the link owner and mode

2013-06-18 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HADOOP-9652:


 Summary: FileContext#getFileLinkStatus does not fill in the link 
owner and mode
 Key: HADOOP-9652
 URL: https://issues.apache.org/jira/browse/HADOOP-9652
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Colin Patrick McCabe


{{FileContext#getFileLinkStatus}} does not actually get the owner and mode of 
the symlink, but instead uses the owner and mode of the symlink target.  If the 
target can't be found, it fills in bogus values (the empty string and 
FsPermission.getDefault) for these.

Symlinks have an owner distinct from who created them, and getFileLinkStatus 
ought to expose this.

In some operating systems, symlinks can have a permission other than 0777.  We 
ought to expose this in RawLocalFileSystem and other places, although we don't 
necessarily have to support this behavior in HDFS.
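
The distinction described above (status of the link itself versus status of its target) can be sketched with the JDK's own java.nio.file API. This is only an analogy, not Hadoop's FileContext code: the class name LinkStatusSketch and its helper methods are hypothetical, and the sketch assumes a POSIX file system that supports symlinks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFileAttributes;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical illustration class; not part of Hadoop.
final class LinkStatusSketch {

    /** lstat-like: describes the symlink itself and never follows it. */
    static PosixFileAttributes linkStatus(Path p) {
        try {
            return Files.readAttributes(p, PosixFileAttributes.class,
                                        LinkOption.NOFOLLOW_LINKS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** stat-like: follows symlinks and describes whatever the path resolves to. */
    static PosixFileAttributes targetStatus(Path p) {
        try {
            return Files.readAttributes(p, PosixFileAttributes.class);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a scratch directory with a 0600 file and a symlink to it; returns the link. */
    static Path makeLinkFixture() {
        try {
            Path dir = Files.createTempDirectory("linkstatus");
            Path target = Files.createFile(dir.resolve("target"));
            Files.setPosixFilePermissions(target,
                    PosixFilePermissions.fromString("rw-------"));
            return Files.createSymbolicLink(dir.resolve("link"), target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path link = makeLinkFixture();
        PosixFileAttributes l = linkStatus(link);
        PosixFileAttributes t = targetStatus(link);
        // The two lines can differ in owner and mode; the values are platform dependent.
        System.out.println("link:   owner=" + l.owner().getName()
                + " mode=" + PosixFilePermissions.toString(l.permissions())
                + " symlink=" + l.isSymbolicLink());
        System.out.println("target: owner=" + t.owner().getName()
                + " mode=" + PosixFilePermissions.toString(t.permissions())
                + " symlink=" + t.isSymbolicLink());
    }
}
```

In these terms, the bug report says getFileLinkStatus currently behaves like targetStatus for the owner and mode fields, where callers would expect linkStatus.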

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Hadoop 2.0 - org.apache.hadoop.fs.Hdfs vs. DistributedFileSystem?

2013-06-18 Thread Eli Collins
Hey Steve,

That's correct, see HADOOP-6223 for the history.  However, per Andrew
I don't think it's realistic to expect people to migrate off
FileSystem for a while (I filed HADOOP-6446 well over three years
ago).

The unfortunate consequence of the earlier decision to have parallel
interfaces, rather than transitioning one over time, is that people
effectively end up implementing multiple backends: one that gets used
by clients of FileSystem, and one for clients of FileContext.
Implementing in only one place significantly limits adoption of the
feature or file system, because it can't be adopted in practice unless
it's available to both old and new clients (for example, this is why
symlinks are getting backported to FileSystem from FileContext).
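
One way to picture the "implement once, serve both client APIs" problem is a small adapter. The sketch below is plain Java with no Hadoop dependency. Every name in it (DualApiSketch, OldStyleFileSystem, NewStyleBackend, DelegatingBackend) is a hypothetical stand-in for the roles FileSystem and AbstractFileSystem play, and the differing mkdir semantics are simplified assumptions for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; none of these types exist in Hadoop.
final class DualApiSketch {

    /** Stand-in for the older all-in-one API (the role FileSystem plays). */
    abstract static class OldStyleFileSystem {
        abstract boolean mkdirs(String path);   // always creates missing parents
        abstract boolean exists(String path);
    }

    /** Stand-in for the newer backend interface (the role AbstractFileSystem plays). */
    interface NewStyleBackend {
        boolean mkdir(String path, boolean createParent);
        boolean exists(String path);
    }

    /** The single real implementation: an in-memory set of directory paths. */
    static final class InMemoryBackend extends OldStyleFileSystem {
        private final Set<String> dirs = new HashSet<>();

        InMemoryBackend() {
            dirs.add("/");
        }

        @Override
        boolean mkdirs(String path) {
            String current = "";
            for (String part : path.split("/")) {
                if (part.isEmpty()) {
                    continue;
                }
                current = current + "/" + part;
                dirs.add(current);              // create each ancestor on the way down
            }
            return true;
        }

        @Override
        boolean exists(String path) {
            return dirs.contains(path);
        }
    }

    /** Adapter: exposes the old-style implementation through the new-style interface. */
    static final class DelegatingBackend implements NewStyleBackend {
        private final OldStyleFileSystem delegate;

        DelegatingBackend(OldStyleFileSystem delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean mkdir(String path, boolean createParent) {
            // The new-style call may refuse to create parents; the old one never does.
            if (!createParent && !delegate.exists(parentOf(path))) {
                return false;
            }
            return delegate.mkdirs(path);
        }

        @Override
        public boolean exists(String path) {
            return delegate.exists(path);
        }
    }

    static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    public static void main(String[] args) {
        InMemoryBackend old = new InMemoryBackend();
        NewStyleBackend fresh = new DelegatingBackend(old);
        System.out.println(fresh.mkdir("/a/b", false)); // parent /a is missing
        System.out.println(old.mkdirs("/a/b"));         // old API creates parents
        System.out.println(fresh.exists("/a/b"));       // both views share one store
    }
}
```

With an adapter like this the storage logic lives in one place, but each semantic mismatch between the two APIs still has to be handled by hand in the adapter.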

Thanks,
Eli

On Tue, Jun 18, 2013 at 11:15 AM, Stephen Watt  wrote:
> Hi Folks
>
> My understanding is that from Hadoop 2.0 onwards the AbstractFileSystem is 
> now the strategic class to extend for writing Hadoop FileSystem plugins. This 
> is a departure from previous versions where one would extend the FileSystem 
> class. This seems to be reinforced by the hadoop-default.xml for Hadoop 2.0 
> in the Apache Wiki 
> (http://hadoop.apache.org/docs/r2.0.2-alpha/hadoop-project-dist/hadoop-common/core-default.xml)
>  which shows fs.AbstractFileSystem.hdfs.impl being set to 
> org.apache.hadoop.fs.Hdfs
>
> Is my assertion correct? Do we have community consensus around this? i.e. 
> Beyond the apache distro, are the commercial distros (Intel, Hortonworks, 
> Cloudera, WanDisco, EMC Pivotal, etc.) using org.apache.hadoop.fs.Hdfs as 
> their filesystem plugin for HDFS? What does one lose by using the 
> DistributedFileSystem class instead of the Hdfs class?
>
> Regards
> Steve Watt

Hadoop 2.0 - org.apache.hadoop.fs.Hdfs vs. DistributedFileSystem?

2013-06-18 Thread Stephen Watt
Hi Folks

My understanding is that from Hadoop 2.0 onwards the AbstractFileSystem is now 
the strategic class to extend for writing Hadoop FileSystem plugins. This is a 
departure from previous versions where one would extend the FileSystem class. 
This seems to be reinforced by the hadoop-default.xml for Hadoop 2.0 in the 
Apache Wiki 
(http://hadoop.apache.org/docs/r2.0.2-alpha/hadoop-project-dist/hadoop-common/core-default.xml)
 which shows fs.AbstractFileSystem.hdfs.impl being set to 
org.apache.hadoop.fs.Hdfs 

Is my assertion correct? Do we have community consensus around this? i.e. 
Beyond the apache distro, are the commercial distros (Intel, Hortonworks, 
Cloudera, WanDisco, EMC Pivotal, etc.) using org.apache.hadoop.fs.Hdfs as their 
filesystem plugin for HDFS? What does one lose by using the 
DistributedFileSystem class instead of the Hdfs class?

Regards
Steve Watt

- Original Message -
From: "Andrew Wang" 
To: common-dev@hadoop.apache.org
Cc: "Milind Bhandarkar" , "shv hadoop" 
, "Steve Loughran" , "Kun Ling" 
, "Roman Shaposhnik" , "Andrew 
Purtell" , cdoug...@apache.org, jayh...@cs.ucsc.edu, 
"Sanjay Radia" 
Sent: Friday, June 14, 2013 1:32:38 PM
Subject: Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop 
FileSystems + Workshop

Hey Steve,

I agree that it's confusing. FileSystem and FileContext are essentially two
parallel sets of interfaces for accessing filesystems in Hadoop.
FileContext splits the interface and shared code with AbstractFileSystem,
while FileSystem is all-in-one. If you're looking for the AFS equivalents
to DistributedFileSystem and LocalFileSystem, see Hdfs and LocalFs.

Realistically, FileSystem isn't going to be deprecated and removed any time
soon. There are lots of 3rd-party FileSystem implementations, and most apps
today use FileSystem (including many HDFS internals, like trash and the
shell).

When I read the wiki page, I figured that the mention of AFS was
essentially a typo, since everyone's been steaming ahead with FileSystem.
Standardizing FileSystem makes total sense to me; I just wanted to confirm
that plan.

Best,
Andrew


On Fri, Jun 14, 2013 at 9:38 AM, Stephen Watt  wrote:

> This is a good point Andrew. The hangout was actually the first time I'd
> heard about the AbstractFileSystem class. I've been doing some further
> analysis on the source in Hadoop 2.0, and when I look at the Hadoop 2.0
> implementations of DistributedFileSystem and LocalFileSystem, they
> extend the FileSystem class and not AbstractFileSystem. I would imagine if
> the plan for Hadoop 2.0 is to build FileSystem implementations using the
> AbstractFileSystem, then those two would use it, so I'm a bit confused.
>
> Perhaps I'm looking in the wrong place? Sanjay (or anyone else), could you
> clarify this for us?
>
> Regards
> Steve Watt
>
> - Original Message -
> From: "Andrew Wang" 
> To: common-dev@hadoop.apache.org
> Cc: mbhandar...@gopivotal.com, "shv hadoop" ,
> ste...@hortonworks.com, erlv5...@gmail.com, shaposh...@gmail.com,
> apurt...@apache.org, cdoug...@apache.org, jayh...@cs.ucsc.edu,
> san...@hortonworks.com
> Sent: Monday, June 10, 2013 5:14:16 PM
> Subject: Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop
> FileSystems + Workshop
>
> Thanks for the summary Steve, very useful.
>
> I'm wondering a bit about the point on testing AbstractFileSystem rather
> than FileSystem. While these are both wrappers for DFSClient, they're
> pretty different in terms of the APIs they expose. Furthermore, AFS is not
> actually a client-facing API; clients interact with an AFS through
> FileContext.
>
> I ask because I did some work trying to unify the symlink tests for both
> FileContext and FileSystem (HADOOP-9370 and HADOOP-9355). Subtle things
> like the default mkdir semantics are different; you can see some of the
> contortions in HADOOP-9370. I ultimately ended up just adhering to the
> FileContext-style behavior, but as a result I'm not really testing some
> parts of FileSystem.
>
> Are we going to end up with two different sets of validation tests? Or just
> choose one API over the other? FileSystem is supposed to eventually be
> deprecated in favor of FileContext (HADOOP-6446, filed in 2009), but actual
> uptake in practice has been slow.
>
> Best,
> Andrew
>
>
> On Mon, Jun 10, 2013 at 1:49 PM, Stephen Watt  wrote:
>
> > For those interested - I posted a recap of this morning's Google Hangout on
> > the Wiki Page at https://wiki.apache.org/hadoop/HCFS/Progress
> >
> > On Jun 5, 2013, at 8:14 PM, Stephen Watt wrote:
> >
> > > Hi Folks
> > >
> > > Per Roman's recommendation I've created a Wiki Page for organizing the
> > > work and managing the logistics -
> > > https://wiki.apache.org/hadoop/HCFS/Progress
> > >
> > > I'd like to propose a Google Hangout at 9am PST on Monday June 10th to
> > > get together and discuss the initiative. Please respond back to me if
> > > you're interested or would like to propose a different time. I'll update

[jira] [Created] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists

2013-06-18 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-9651:
--

 Summary: Filesystems to throw FileAlreadyExistsException in 
createFile(path, overwrite=false) when the file exists
 Key: HADOOP-9651
 URL: https://issues.apache.org/jira/browse/HADOOP-9651
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs
Affects Versions: 2.1.0-beta
Reporter: Steve Loughran
Priority: Minor


While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if you 
try to create a file that exists and you have set {{overwrite=false}}, 
{{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it impossible 
to distinguish a create operation that failed for a fixable reason (the file is 
already there) from one that failed for something more fundamental.
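
To show why the specific exception type matters to callers, here is a minimal sketch using the JDK's java.nio.file API rather than Hadoop's FileSystem. It assumes the JDK's default file system provider, which reports an existing file as java.nio.file.FileAlreadyExistsException when CREATE_NEW is requested. The class CreateNoOverwriteSketch and its Outcome enum are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Hadoop.
final class CreateNoOverwriteSketch {

    enum Outcome { CREATED, ALREADY_EXISTS, FAILED }

    /** Creates the file only if it does not exist yet (the overwrite=false case). */
    static Outcome createNoOverwrite(Path path) {
        try (OutputStream out = Files.newOutputStream(path,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            out.write(0);
            return Outcome.CREATED;
        } catch (FileAlreadyExistsException e) {
            // Fixable: the caller can pick another name or decide to overwrite.
            return Outcome.ALREADY_EXISTS;
        } catch (IOException e) {
            // Anything else (missing parent directory, permissions, disk errors).
            return Outcome.FAILED;
        }
    }

    public static void main(String[] args) {
        Path p = Paths.get(System.getProperty("java.io.tmpdir"),
                "create-sketch-" + System.nanoTime());
        System.out.println(createNoOverwrite(p)); // first attempt
        System.out.println(createNoOverwrite(p)); // second attempt, file now exists
    }
}
```

If the implementation threw a plain IOException for both cases, the two catch branches above would collapse into one and the caller could no longer tell them apart.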



Build failed in Jenkins: Hadoop-Common-trunk #803

2013-06-18 Thread Apache Jenkins Server
See 

Changes:

[vinodkv] YARN-846.  Move pb Impl classes from yarn-api to yarn-common. 
Contributed by Jian He.

[acmurthy] YARN-799. Fix CgroupsLCEResourcesHandler to use /tasks instead of 
/cgroup.procs. Contributed by Chris Riccomini.

[vinodkv] YARN-841. Move Auxiliary service to yarn-api, annotate and document 
it. Contributed by Vinod Kumar Vavilapalli.

[acmurthy] YARN-840. Moved ProtoUtils to yarn.api.records.pb.impl. Contributed 
by Jian He.

[cnauroth] HDFS-4818. Several HDFS tests that attempt to make directories 
unusable do not work correctly on Windows. Contributed by Chris Nauroth.

[vinodkv] YARN-834. Fixed annotations for yarn-client module, reorganized 
packages and clearly differentiated *Async apis. Contributed by Arun C Murthy 
and Zhijie Shen.

[sseth] YARN-805. Fix javadoc and annotations on classes in the yarn-api 
package. Contributed by Jian He.

[cmccabe] HDFS-4626. ClientProtocol#getLinkTarget should throw an exception for 
non-symlink and non-existent paths.  (Andrew Wang via cmccabe)

[vinodkv] YARN-610. ClientToken is no longer set in the environment of the 
Containers. Contributed by Omkar Vinit Joshi.

[cnauroth] YARN-839. TestContainerLaunch.testContainerEnvVariables fails on 
Windows. Contributed by Chuan Liu.

[brandonli] CHANGES.txt change for HADOOP-9515

[brandonli] HADOOP-9515. Add general interface for NFS and Mount. Contributed 
by Brandon Li

[vinodkv] YARN-822. Renamed ApplicationToken to be AMRMToken, and similarly the 
corresponding TokenSelector and SecretManager. Contributed by Omkar Vinit Joshi.

[jing9] HDFS-4875. Add a test for testing snapshot file length. Contributed by 
Arpit Agarwal.

[bikas] Fix hadoop-yarn-project/CHANGES.txt for YARN-759

[acmurthy] HADOOP-9517. Documented various aspects of compatibility for Apache 
Hadoop. Contributed by Karthik Kambatla.

--
[...truncated 51878 lines...]
Adding reference: maven.compile.classpath
Adding reference: maven.runtime.classpath
Adding reference: maven.test.classpath
Adding reference: maven.plugin.classpath
Adding reference: maven.project
Adding reference: maven.project.helper
Adding reference: maven.local.repository
[DEBUG] Initialize Maven Ant Tasks
parsing buildfile 
jar:file:/home/jenkins/.m2/repository/org/apache/maven/plugins/maven-antrun-plugin/1.6/maven-antrun-plugin-1.6.jar!/org/apache/maven/ant/tasks/antlib.xml
 with URI = 
jar:file:/home/jenkins/.m2/repository/org/apache/maven/plugins/maven-antrun-plugin/1.6/maven-antrun-plugin-1.6.jar!/org/apache/maven/ant/tasks/antlib.xml
 from a zip file
parsing buildfile 
jar:file:/home/jenkins/.m2/repository/org/apache/ant/ant/1.8.1/ant-1.8.1.jar!/org/apache/tools/ant/antlib.xml
 with URI = 
jar:file:/home/jenkins/.m2/repository/org/apache/ant/ant/1.8.1/ant-1.8.1.jar!/org/apache/tools/ant/antlib.xml
 from a zip file
Class org.apache.maven.ant.tasks.AttachArtifactTask loaded from parent loader 
(parentFirst)
 +Datatype attachartifact org.apache.maven.ant.tasks.AttachArtifactTask
Class org.apache.maven.ant.tasks.DependencyFilesetsTask loaded from parent 
loader (parentFirst)
 +Datatype dependencyfilesets org.apache.maven.ant.tasks.DependencyFilesetsTask
Setting project property: test.build.dir -> 

Setting project property: test.exclude.pattern -> _
Setting project property: hadoop.assemblies.version -> 3.0.0-SNAPSHOT
Setting project property: test.exclude -> _
Setting project property: distMgmtSnapshotsId -> apache.snapshots.https
Setting project property: project.build.sourceEncoding -> UTF-8
Setting project property: distMgmtSnapshotsUrl -> 
https://repository.apache.org/content/repositories/snapshots
Setting project property: distMgmtStagingUrl -> 
https://repository.apache.org/service/local/staging/deploy/maven2
Setting project property: test.build.data -> 

Setting project property: commons-daemon.version -> 1.0.13
Setting project property: hadoop.common.build.dir -> 

Setting project property: testsThreadCount -> 4
Setting project property: maven.test.redirectTestOutputToFile -> true
Setting project property: jdiff.version -> 1.0.9
Setting project property: distMgmtStagingName -> Apache Release Distribution 
Repository
Setting project property: project.reporting.outputEncoding -> UTF-8
Setting project property: build.platform -> Linux-i386-32
Setting project property: failIfNoTests -> false
Setting project property: distMgmtStagingId -> apache.staging.https
Setting project property: distMgmtSnapshotsName -> Apache Development Snapshot 
Repository
Setting project property: ant.file ->