[jira] [Created] (HADOOP-10025) Replace HttpConfig#getSchemePrefix with implicit scheme in YARN/MR

2013-10-04 Thread Haohui Mai (JIRA)
Haohui Mai created HADOOP-10025:
---

 Summary: Replace HttpConfig#getSchemePrefix with implicit scheme 
in YARN/MR
 Key: HADOOP-10025
 URL: https://issues.apache.org/jira/browse/HADOOP-10025
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Omkar Vinit Joshi
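
The issue has no description; as a hedged sketch of what "implicit scheme" usually means in this context (the class and method names below, other than HttpConfig#getSchemePrefix, are hypothetical and not taken from the patch), the snippet contrasts prefixing links with HttpConfig#getSchemePrefix against emitting scheme-relative URLs that the browser resolves with the page's own protocol:

    // Hedged sketch only: illustrates the "implicit scheme" idea, not the
    // actual YARN/MR change. WebLink and its methods are hypothetical names.
    public class WebLink {

      // Old style: bake the scheme in at build time via
      // HttpConfig#getSchemePrefix, which returns "http://" or "https://"
      // based on a global setting.
      static String buildTrackingUrlExplicit(String schemePrefix, String hostPort) {
        return schemePrefix + hostPort + "/jobhistory";
      }

      // Implicit-scheme style: emit a protocol-relative URL ("//host:port/path");
      // the browser reuses whatever scheme the enclosing page was served over,
      // so the same link works for HTTP-only and HTTPS-only deployments.
      static String buildTrackingUrlImplicit(String hostPort) {
        return "//" + hostPort + "/jobhistory";
      }

      public static void main(String[] args) {
        System.out.println(buildTrackingUrlExplicit("https://", "rm.example.com:8090"));
        System.out.println(buildTrackingUrlImplicit("rm.example.com:8090"));
      }
    }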








[jira] [Created] (HADOOP-10023) Add https support in HDFS

2013-10-04 Thread Suresh Srinivas (JIRA)
Suresh Srinivas created HADOOP-10023:


 Summary: Add https support in HDFS
 Key: HADOOP-10023
 URL: https://issues.apache.org/jira/browse/HADOOP-10023
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.2-alpha
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas


This is the HDFS part of HADOOP-10022. It will serve as the umbrella JIRA for 
all the HTTPS-related cleanup in HDFS.





[jira] [Created] (HADOOP-10022) Add support for per project https support

2013-10-04 Thread Suresh Srinivas (JIRA)
Suresh Srinivas created HADOOP-10022:


 Summary: Add support for per project https support
 Key: HADOOP-10022
 URL: https://issues.apache.org/jira/browse/HADOOP-10022
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Suresh Srinivas


The current configuration hadoop.https.enable turns on HTTPS-only support for 
all the daemons in Hadoop. This is an umbrella JIRA to add per-project HTTPS 
configuration. For more details, see the detailed proposal: 
https://issues.apache.org/jira/browse/HADOOP-8581?focusedCommentId=13784332&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13784332

The current scope of work is described in 
https://issues.apache.org/jira/browse/HADOOP-8581?focusedCommentId=13786567&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13786567
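
As a hedged sketch of the difference between the single global switch and a per-project policy (only hadoop.https.enable comes from this issue; "dfs.http.policy" and the HttpPolicy enum below are illustrative stand-ins for whatever keys the proposal settles on):

    // Hedged sketch: global vs. per-project HTTPS selection.
    import org.apache.hadoop.conf.Configuration;

    public class HttpsPolicyExample {
      // Illustrative policy enum; not necessarily what the proposal defines.
      enum HttpPolicy { HTTP_ONLY, HTTPS_ONLY }

      public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Today: one flag flips every daemon in the cluster to HTTPS.
        boolean globalHttps = conf.getBoolean("hadoop.https.enable", false);

        // Proposed direction: each project reads its own policy key, so HDFS
        // could run HTTPS-only while other projects stay on HTTP.
        HttpPolicy hdfsPolicy = HttpPolicy.valueOf(
            conf.get("dfs.http.policy", HttpPolicy.HTTP_ONLY.name()));

        System.out.println("global=" + globalHttps + ", hdfs=" + hdfsPolicy);
      }
    }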





[jira] [Resolved] (HADOOP-10009) Backport HADOOP-7808 to branch-1

2013-10-04 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HADOOP-10009.


   Resolution: Fixed
Fix Version/s: 1.3.0
 Hadoop Flags: Reviewed

Thanks for the work Haohui! I've committed it to branch-1.

> Backport HADOOP-7808 to branch-1
> 
>
> Key: HADOOP-10009
> URL: https://issues.apache.org/jira/browse/HADOOP-10009
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
> Attachments: HADOOP-10009.000.patch, HADOOP-10009.001.patch
>
>
> In branch-1, SecurityUtil::setTokenService() might throw a 
> NullPointerException, which is fixed in HADOOP-7808.
> The patch should be backported to branch-1.
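
A hedged sketch of the failure mode and the guard (a simplified stand-in, not the actual branch-1 SecurityUtil code or the HADOOP-7808 patch): the service string is built from an InetSocketAddress, and an unresolved address returns a null InetAddress, so dereferencing it throws the NPE.

    // Hedged sketch of the NPE and a guard against it; simplified stand-in,
    // not the real SecurityUtil.
    import java.net.InetSocketAddress;

    public class TokenServiceSketch {

      // Unsafe: addr.getAddress() is null when the hostname could not be
      // resolved, so calling getHostAddress() throws a NullPointerException.
      static String buildServiceUnsafe(InetSocketAddress addr) {
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
      }

      // Guarded: fail with a clear message instead of an NPE.
      static String buildServiceSafe(InetSocketAddress addr) {
        if (addr.getAddress() == null) {
          throw new IllegalArgumentException("Cannot resolve address: " + addr);
        }
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
      }

      public static void main(String[] args) {
        InetSocketAddress unresolved =
            InetSocketAddress.createUnresolved("no-such-host.example", 8020);
        try {
          System.out.println(buildServiceSafe(unresolved));
        } catch (IllegalArgumentException e) {
          System.out.println("clear failure instead of NPE: " + e.getMessage());
        }
      }
    }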





Re: symlink support in Hadoop 2 GA

2013-10-04 Thread Andrew Wang
Colin posted a summary of our phone call yesterday (attendees: myself,
Colin, Daryn, Nathan, Jason, Chris, Suresh, Sanjay) on HADOOP-9984:

https://issues.apache.org/jira/browse/HADOOP-9984?focusedCommentId=13785701&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13785701

Pasted here:


   - We discussed alternatives to HADOOP-9984, but concluded that they weren't
   workable.
   - We agreed that doing the symlink resolution in each FileSystem subclass is
   what we ought to do in HADOOP-9984, in order to keep compatibility with
   out-of-tree filesystems.
   - We agreed to disable symlink resolution in Hadoop 2 GA. We will spend a few
   weeks ironing out all the bugs and enable it in Hadoop 2.3. However, we would
   like to make all backwards-incompatible API changes prior to Hadoop 2 GA.
   - We agreed that HADOOP-9972 (a new symlink-aware API for globStatus) should
   get into Hadoop 2 GA.
   - We discussed the issue of returning resolved paths versus unresolved paths,
   but were unable to come to any conclusion. Everyone agreed that there would be
   serious performance problems if we returned unresolved paths, but some claimed
   that programs would break when encountering resolved paths.


There's also a new umbrella issue, HADOOP-10019, tracking ongoing symlink
changes.

Best,
Andrew


On Thu, Oct 3, 2013 at 2:08 PM, Daryn Sharp  wrote:

> I reluctantly agree that we should disable symlinks in 2.2 until we can
> sort out the compatibility issues.  I'm reluctant in the sense that it's a
> feature users have long wanted, and it's something we'd like to use from an
> administrative view.  However, I don't see all the issues being sorted out
> in the very near future.
>
> I filed some JIRAs today that have led me to believe that the current
> implementation of fs symlinks is irreparably flawed.  Adding optional
> primitives to filesystems to make them symlink capable is OK.  However,
> adding symlink resolution to individual filesystems is fundamentally
> broken.  It doesn't work for stacked filesystems (viewfs, chroots, filters,
> etc.) because the resolution must occur at the highest level, not within an
> individual filesystem itself.  Otherwise the abstraction of the top-level
> filesystem is violated, and all kinds of unexpected behavior, like walking
> out of chroots, becomes possible (see the chroot sketch after this thread).
>
> Daryn
>
> On Oct 3, 2013, at 1:39 PM, sanjay Radia wrote:
>
> > There are a number of issues (some minor, some more than minor).
> > GA is close and we are still in discussion on some of them; while I
> > believe we will close on these very shortly, a code change like this so
> > close to GA is dangerous.
> >
> > I suggest we do the following:
> > 1) Disable symlinks in 2.2 GA - throw an unsupported exception on
> > createSymlink in both FileSystem and FileContext.
> > 2) Deal with isDir() in 2.2 GA in preparation for item 3 coming after GA:
> >   a) Deprecate isDir().
> >   b) Add a new API that returns an enum (see FileContext); a sketch of
> >   this follows the thread.
> > 3) Fix symlinks in a future release, hopefully the very next one after
> > 2.2 GA:
> >   a) Change the stack to use the new API replacing isDir().
> >   b) Fix isDir() to do something smarter (we can detail this later, but
> >   there is a solution that has been discussed). This helps customer
> >   applications that call isDir().
> >   c) Remove isDir() in a future release when customers have had sufficient
> >   time to migrate.
> >
> > sanjay
> >
> > PS. J Rottinghuis expressed a similar sentiment in a previous email in
> this thread:
> >
> >
> >
> > On Sep 18, 2013, at 5:11 PM, J. Rottinghuis wrote:
> >
> >> I like symlink functionality, but in our migration to Hadoop 2.x this
> is a
> >> total distraction. If the APIs stay in 2.2 GA we'll have to choose to:
> >> a) Not uprev until symlink support is figured out up and down the stack,
> >> and we've been able to migrate all our 1.x (equivalent) clusters to 2.x
> >> (equivalent). Or
> >> b) rip out the API altogether. Or
> >> c) change the implementation to throw an UnsupportedOperationException
> >> I'm not sure yet which of these I like least.
> >
> >
>
>
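
As a toy illustration of Daryn's stacked-filesystem point above (hypothetical classes and maps, nothing like Hadoop's actual viewfs/chroot code): if the inner filesystem resolves symlinks itself, an absolute link target is interpreted against the real root, and the chroot wrapper never gets a chance to re-prefix it.

    // Toy sketch, not Hadoop code: why symlink resolution inside an inner
    // filesystem can walk out of a chroot, while resolving at the top level
    // keeps the chroot prefix applied.
    import java.util.HashMap;
    import java.util.Map;

    public class ChrootEscapeSketch {
      // A fake "inner" filesystem: absolute paths mapped to contents or link targets.
      static Map<String, String> files = new HashMap<>();
      static Map<String, String> symlinks = new HashMap<>();

      // Inner-level resolution: link targets are absolute paths on the REAL root.
      static String resolveInside(String path) {
        return symlinks.getOrDefault(path, path);
      }

      // Top-level resolution: the chroot wrapper re-anchors the link target,
      // so the result can never leave the jail.
      static String resolveAtTop(String chroot, String pathInJail) {
        String target = symlinks.getOrDefault(chroot + pathInJail, chroot + pathInJail);
        if (!target.startsWith(chroot)) {
          target = chroot + target;  // re-prefix the target under the chroot
        }
        return target;
      }

      public static void main(String[] args) {
        files.put("/etc/passwd", "real root's passwd");
        files.put("/jail/etc/passwd", "jailed passwd");
        symlinks.put("/jail/link", "/etc/passwd");  // absolute link target

        // Resolving inside the inner fs escapes the jail:
        System.out.println(files.get(resolveInside("/jail/link")));     // real root's passwd
        // Resolving at the top level stays inside it:
        System.out.println(files.get(resolveAtTop("/jail", "/link")));  // jailed passwd
      }
    }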
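
And a hedged sketch of Sanjay's item 2b (the names PathType and getType() are hypothetical, not Hadoop's actual FileStatus/FileContext API): replacing the boolean isDir() with a method that returns a file-type enum gives symlinks a first-class answer instead of forcing them into true/false.

    // Hedged sketch of an enum-returning replacement for isDir(); illustrative
    // names only, not the real API.
    public class FileStatusSketch {
      enum PathType { FILE, DIRECTORY, SYMLINK }

      private final PathType type;

      FileStatusSketch(PathType type) {
        this.type = type;
      }

      /** @deprecated use {@link #getType()}, which can also report symlinks. */
      @Deprecated
      public boolean isDir() {
        return type == PathType.DIRECTORY;
      }

      // Callers switch on the type instead of assuming "not a directory"
      // means "a regular file".
      public PathType getType() {
        return type;
      }

      public static void main(String[] args) {
        FileStatusSketch link = new FileStatusSketch(PathType.SYMLINK);
        System.out.println(link.isDir());    // false, which hides that it is a link
        System.out.println(link.getType());  // SYMLINK
      }
    }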


[jira] [Reopened] (HADOOP-8828) Support distcp from secure to insecure clusters

2013-10-04 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reopened HADOOP-8828:


  Assignee: Haohui Mai

> Support distcp from secure to insecure clusters
> ---
>
> Key: HADOOP-8828
> URL: https://issues.apache.org/jira/browse/HADOOP-8828
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Eli Collins
>Assignee: Haohui Mai
>
> Users currently can't distcp from secure to insecure clusters.
> Relevant background from ATM:
> There's no plumbing to make the HFTP client use AuthenticatedURL when 
> security is enabled. This means that even though you have the servlet filter 
> correctly configured on the server, the client doesn't know how to properly 
> authenticate to that filter.
> The crux of the issue is that security is enabled globally instead of 
> per-filesystem. The trick of using HFTP as the source FS works when the 
> source is insecure, but not when the source is secure.
> Normal cp with two hdfs:// URLs can be made to work. There is indeed logic in 
> o.a.h.ipc.Client to fall back to using simple authentication if your client 
> config has security enabled (hadoop.security.authentication set to 
> "kerberos") and the server responds with a response for simple 
> authentication. Thing is, there are at least 3 bugs with this that I bumped 
> into. All three can be worked around.
> 1) If your client config has security enabled you *must* have a valid 
> Kerberos TGT, even if you're interacting with an insecure cluster. The hadoop 
> client unfortunately tries to read the local ticket cache before it tries to 
> connect to the server, and so doesn't know that it won't need Kerberos 
> credentials.
> 2) Even though the destination NN is insecure, it has to have a Kerberos 
> principal created for it. You don't need a keytab, and you don't need to 
> change any settings on the destination NN. The principal just needs to exist 
> in the principal database. This is again because the hadoop client will, 
> before connecting to the remote NN, try to get a service ticket for the 
> hdfs/f.q.d.n principal for the remote NN. If this fails, it won't even get to 
> the part where it tries to connect to the insecure NN and falls back to 
> simple auth.
> 3) Once you get through problems 1 and 2, you will try to connect to the 
> remote, insecure NN. This will work, but the reported principal name of your 
> user will include a realm that the remote NN doesn't know about. You will 
> either need to change the default_realm setting in /etc/krb5.conf on the 
> insecure NN to be the same as the secure NN, or you will need to add some 
> custom hadoop.security.auth_to_local mappings on the insecure NN so it knows 
> how to translate this long principal name into a short name.
> Even with all these changes, distcp still won't work since the first thing it 
> tries to do when submitting the job is to get a delegation token for all the 
> involved NNs, which won't work since the insecure NN isn't running a DT 
> secret manager. I haven't been able to figure out a way around this, except 
> to make a custom distcp which doesn't necessarily do this.
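
For point 3 above, a hedged example of the kind of hadoop.security.auth_to_local mapping the insecure NN would need (SECURE.EXAMPLE.COM is a made-up realm; in practice the value goes into core-site.xml rather than being set in code):

    // Hedged sketch: an auth_to_local rule on the insecure NameNode that
    // strips the secure cluster's realm so its principals map to local short names.
    import org.apache.hadoop.conf.Configuration;

    public class AuthToLocalSketch {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.auth_to_local",
            "RULE:[1:$1@$0](.*@SECURE\\.EXAMPLE\\.COM)s/@.*//\n"
          + "RULE:[2:$1@$0](.*@SECURE\\.EXAMPLE\\.COM)s/@.*//\n"
          + "DEFAULT");
        // The same value would normally live in core-site.xml on the insecure NN.
        System.out.println(conf.get("hadoop.security.auth_to_local"));
      }
    }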





Hadoop 2.1.x questions?

2013-10-04 Thread Amir Sanjar

From the list below, which version of Hadoop 2.1.x is the most stable one?
Also, which version is closest to the 2.2 GA release?
release-2.1.0-beta/
release-2.1.0-beta-rc0/
release-2.1.0-beta-rc1/
release-2.1.1-beta-rc0/

Best Regards
Amir Sanjar

PowerLinux Open Source Hadoop Architect
IBM Senior Software Engineer