[jira] [Commented] (HADOOP-16355) ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store

2019-06-07 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859063#comment-16859063
 ] 

Andrew Purtell commented on HADOOP-16355:
-

Let's make sure just this part can be backported to branch-2.9 (unless use of 
s3guard is strongly contraindicated on that branch)

> ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store
> --
>
> Key: HADOOP-16355
> URL: https://issues.apache.org/jira/browse/HADOOP-16355
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Mingliang Liu
>Priority: Major
>
> When S3Guard was proposed, there were a couple of valid reasons to choose 
> DynamoDB as its default backend store: 0) seamless integration with the AWS 
> ecosystem, e.g. the client library; 1) it's a managed web service with zero 
> operational cost, high availability, and effectively infinite scalability; 2) 
> it's performant, with single-digit-millisecond latency; 3) it's proven by 
> Netflix's S3mper (no longer actively maintained) and by EMRFS (closed 
> source). As the store is pluggable, it's possible to implement 
> {{MetadataStore}} against other backends without changing semantics, beyond 
> the existing null and in-memory local ones.
> Here we propose {{ZookeeperMetadataStore}} which uses Zookeeper as S3Guard 
> backend store. Its main motivation is to provide a new MetadataStore option 
> which:
>  # can be easily integrated, as Zookeeper is heavily used in the Hadoop 
> community
>  # offers affordable performance, as both the client and the Zookeeper 
> ensemble are usually "local" in a Hadoop cluster (ZK/HBase/Hive etc.)
>  # removes the DynamoDB dependency
> Obviously, not all use cases will prefer this to the default DynamoDB store. 
> For example, ZK might not scale well if there are dozens of S3 buckets, each 
> with millions of objects. Our use case is targeting HBase storing its HFiles 
> on S3 instead of HDFS. A total solution for HBase on S3 must combine HBOSS 
> (see HBASE-22149), to recover atomicity of metadata operations like rename, 
> with S3Guard, for consistent enumeration of and access to object store bucket 
> metadata. We would like to use Zookeeper as the backend store for both.
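To make the proposal concrete, a metadata store keyed on ZooKeeper would map each object's S3 path to a znode path. The sketch below is purely illustrative: the {{/s3guard}} chroot and the path layout are assumptions of this example, not taken from any patch on the issue.

```java
// Hypothetical sketch: how a ZookeeperMetadataStore might key its entries.
// The "/s3guard" root and the bucket/path layout are assumptions for
// illustration only, not the actual design from HADOOP-16355.
public final class S3GuardZkPaths {
    private static final String ROOT = "/s3guard";

    /** Map an s3a URI to a znode path, e.g. s3a://bucket/dir/file -> /s3guard/bucket/dir/file. */
    public static String toZnodePath(String s3aUri) {
        String scheme = "s3a://";
        if (!s3aUri.startsWith(scheme)) {
            throw new IllegalArgumentException("not an s3a URI: " + s3aUri);
        }
        String rest = s3aUri.substring(scheme.length());
        // Trim a trailing slash so a directory and its file entries key consistently.
        if (rest.endsWith("/")) {
            rest = rest.substring(0, rest.length() - 1);
        }
        return ROOT + "/" + rest;
    }
}
```

With a layout like this, per-bucket scalability limits show up directly as znode fan-out under one subtree, which is the scaling concern raised above.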



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15775) [JDK9] Add missing javax.activation-api dependency

2018-09-21 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623975#comment-16623975
 ] 

Andrew Purtell commented on HADOOP-15775:
-

FWIW the 003 patch doesn't address some cases, like 
TestWebHDFSStoragePolicyCommands. You need to specify this dependency in the 
hadoop-hdfs-project POM too.

Cause is:
{noformat}
Caused by: java.lang.ClassNotFoundException: javax.activation.DataSource
at 
java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:582)
at 
java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
{noformat}

This is Java 11+28


> [JDK9] Add missing javax.activation-api dependency
> --
>
> Key: HADOOP-15775
> URL: https://issues.apache.org/jira/browse/HADOOP-15775
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Critical
> Attachments: HADOOP-15775.01.patch, HADOOP-15775.02.patch, 
> HADOOP-15775.03.patch
>
>
> Many unit tests fail due to the missing java.activation module. This failure 
> can be fixed by adding javax.activation-api as a third-party dependency.
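The patch itself is not reproduced in this thread; as a hedged illustration, the kind of POM addition under discussion looks like the following (the version and scope shown are assumptions of this sketch, not values from the patch):

```xml
<!-- Sketch only: add the standalone JavaBeans Activation API so tests that
     need javax.activation.DataSource run on JDK 9+, where the java.activation
     module was deprecated/removed. Version and scope are illustrative. -->
<dependency>
  <groupId>javax.activation</groupId>
  <artifactId>javax.activation-api</artifactId>
  <version>1.2.0</version>
  <scope>test</scope>
</dependency>
```

As noted in the comments, this declaration may be needed in more than one module POM (e.g. under hadoop-hdfs-project) for all affected tests to pass.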






[jira] [Comment Edited] (HADOOP-15775) [JDK9] Add missing javax.activation-api dependency

2018-09-21 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623975#comment-16623975
 ] 

Andrew Purtell edited comment on HADOOP-15775 at 9/21/18 6:02 PM:
--

FWIW the 003 patch doesn't address some cases, like 
TestWebHDFSStoragePolicyCommands. You need to specify this dependency in the 
hadoop-hdfs-project POM too.

Cause for TestWebHDFSStoragePolicyCommands failure (and others) is:
{noformat}
Caused by: java.lang.ClassNotFoundException: javax.activation.DataSource
at 
java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:582)
at 
java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
{noformat}

This is Java 11+28



was (Author: apurtell):
FWIW the 003 patch doesn't address some cases, like 
TestWebHDFSStoragePolicyCommands. You need to specify this dependency in the 
hadoop-hdfs-project POM too.

Cause is:
{noformat}
Caused by: java.lang.ClassNotFoundException: javax.activation.DataSource
at 
java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:582)
at 
java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
{noformat}

This is Java 11+28


> [JDK9] Add missing javax.activation-api dependency
> --
>
> Key: HADOOP-15775
> URL: https://issues.apache.org/jira/browse/HADOOP-15775
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Critical
> Attachments: HADOOP-15775.01.patch, HADOOP-15775.02.patch, 
> HADOOP-15775.03.patch
>
>
> Many unit tests fail due to the missing java.activation module. This failure 
> can be fixed by adding javax.activation-api as a third-party dependency.






[jira] [Commented] (HADOOP-15775) [JDK9] Add "--add-modules java.activation" to java option in unit tests

2018-09-20 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622925#comment-16622925
 ] 

Andrew Purtell commented on HADOOP-15775:
-

I don't think this is going to work for Java 11+. javax.activation is gone. 
This patch fails. 

{noformat}
Error occurred during initialization of boot layer
java.lang.module.FindException: Module java.activation not found
{noformat}

You can add it as a third-party dependency, artifact 
javax.activation:activation, and tests will pass. The shaded client modules 
also need an update to shade in javax/activation/*. Not sure if this is the 
best approach. 

> [JDK9] Add "--add-modules java.activation" to java option in unit tests
> ---
>
> Key: HADOOP-15775
> URL: https://issues.apache.org/jira/browse/HADOOP-15775
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Critical
> Attachments: HADOOP-15775.01.patch
>
>
> Many unit tests fail due to the missing java.activation module. We need to 
> configure the Maven Surefire plugin to add the "--add-modules 
> java.activation" option to the test JVM.






[jira] [Commented] (HADOOP-15566) Remove HTrace support

2018-08-15 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581746#comment-16581746
 ] 

Andrew Purtell commented on HADOOP-15566:
-

What about an HTrace facade for Brave (Zipkin)? 

> Remove HTrace support
> -
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, 
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.






[jira] [Commented] (HADOOP-14805) Upgrade to zstd 1.3.1

2017-09-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163508#comment-16163508
 ] 

Andrew Purtell commented on HADOOP-14805:
-

There might not be a direct dependency, but would it be useful to add a check 
in the CMake file that the zstandard library is version >= 1.3.1, for better 
licensing/compliance assurance? It would help those building custom packages 
which include zstd in the binaries.
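The suggested guard could be sketched roughly as below. This is a hypothetical fragment, not the build change from the issue: variable names like ZSTD_INCLUDE_DIR are assumptions, and only the ZSTD_VERSION_* macros parsed from zstd.h are real.

```cmake
# Hypothetical sketch: parse the zstd version out of zstd.h and fail the
# build below 1.3.1, the first release dual-licensed under BSD and GPLv2.
file(STRINGS "${ZSTD_INCLUDE_DIR}/zstd.h" ZSTD_MAJOR_LINE REGEX "#define ZSTD_VERSION_MAJOR")
file(STRINGS "${ZSTD_INCLUDE_DIR}/zstd.h" ZSTD_MINOR_LINE REGEX "#define ZSTD_VERSION_MINOR")
file(STRINGS "${ZSTD_INCLUDE_DIR}/zstd.h" ZSTD_RELEASE_LINE REGEX "#define ZSTD_VERSION_RELEASE")
string(REGEX MATCH "[0-9]+" ZSTD_MAJOR "${ZSTD_MAJOR_LINE}")
string(REGEX MATCH "[0-9]+" ZSTD_MINOR "${ZSTD_MINOR_LINE}")
string(REGEX MATCH "[0-9]+" ZSTD_RELEASE "${ZSTD_RELEASE_LINE}")
set(ZSTD_VERSION "${ZSTD_MAJOR}.${ZSTD_MINOR}.${ZSTD_RELEASE}")
if(ZSTD_VERSION VERSION_LESS "1.3.1")
    message(FATAL_ERROR "zstd ${ZSTD_VERSION} found; >= 1.3.1 required for the BSD-licensed release")
endif()
```

A softer variant could use message(WARNING ...) instead, since there is no hard functional dependency, only the licensing concern.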

> Upgrade to zstd 1.3.1
> -
>
> Key: HADOOP-14805
> URL: https://issues.apache.org/jira/browse/HADOOP-14805
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Andrew Wang
>
> zstandard 1.3.1 has been dual licensed under GPL and BSD. This clears up the 
> concerns regarding the Facebook-specific PATENTS file. If we upgrade to 
> 1.3.1, we can bundle zstd with binary distributions of Hadoop.






[jira] [Commented] (HADOOP-14558) RPC requests on a secure cluster are 10x slower due to expensive encryption and decryption

2017-06-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058151#comment-16058151
 ] 

Andrew Purtell commented on HADOOP-14558:
-

We did custom AES encryption on HBase RPC. It could be adapted to Hadoop RPC. 
Please see HBASE-16414

> RPC requests on a secure cluster are 10x slower due to expensive encryption 
> and decryption 
> ---
>
> Key: HADOOP-14558
> URL: https://issues.apache.org/jira/browse/HADOOP-14558
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
>Reporter: Mostafa Mokhtar
>Priority: Critical
>  Labels: impala, metadata, rpc
>
> While running performance tests for Impala comparing secure and un-secure 
> clusters, I noticed that metadata loading operations are 10x slower on a 
> cluster with Kerberos+SSL enabled. 
> hadoop.rpc.protection is set to privacy.
> Any recommendations on how this can be mitigated? A 10x slowdown is a big hit 
> for metadata loading. 
> The majority of the slowdown is coming from the two threads below. 
> {code}
> Stack Trace                                             Sample Count  Percentage(%)
> org.apache.hadoop.ipc.Client$Connection.run()                  5,212  46.586
>  org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse()  5,203  46.505
>   java.io.DataInputStream.readInt()                            5,039  45.039
>    java.io.BufferedInputStream.read()                          5,038  45.03
>     java.io.BufferedInputStream.fill()                         5,038  45.03
>      org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(byte[], int, int)  5,036  45.013
>       java.io.FilterInputStream.read(byte[], int, int)         5,036  45.013
>        org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.read(byte[], int, int)  5,036  45.013
>         org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.readNextRpcPacket()  5,035  45.004
>          com.sun.security.sasl.gsskerb.GssKrb5Base.unwrap(byte[], int, int)  4,775  42.68
>           sun.security.jgss.GSSContextImpl.unwrap(byte[], int, int, MessageProp)  4,775  42.68
>            sun.security.jgss.krb5.Krb5Context.unwrap(byte[], int, int, MessageProp)  4,768  42.617
>             sun.security.jgss.krb5.WrapToken.getData()         4,714  42.134
>              sun.security.jgss.krb5.WrapToken.getData(byte[], int)  4,714  42.134
>               sun.security.jgss.krb5.WrapToken.getDataFromBuffer(byte[], int)  4,714  42.134
>                sun.security.jgss.krb5.CipherHelper.decryptData(WrapToken, byte[], int, int, byte[], int)  3,083  27.556
>                 sun.security.jgss.krb5.CipherHelper.des3KdDecrypt(WrapToken, byte[], int, int, byte[], int)  3,078  27.512
>                  sun.security.krb5.internal.crypto.Des3.decryptRaw(byte[], int, byte[], byte[], int, int)  3,076  27.494
>                   sun.security.krb5.internal.crypto.dk.DkCrypto.decryptRaw(byte[], int, byte[], byte[], int, int)  3,076  27.494
> {code}
> And 
> {code}
> Stack Trace                                             Sample Count  Percentage(%)
> java.lang.Thread.run()                                         3,379  30.202
>  java.util.concurrent.ThreadPoolExecutor$Worker.run()          3,379  30.202
>   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)  3,379  30.202
>    java.util.concurrent.FutureTask.run()                       3,367  30.095
>     java.util.concurrent.Executors$RunnableAdapter.call()      3,367  30.095
>      org.apache.hadoop.ipc.Client$Connection$3.run()           3,367  30.095
>       java.io.DataOutputStream.flush()                         3,367  30.095
>        java.io.BufferedOutputStream.flush()                    3,367  30.095
>         java.io.BufferedOutputStream.flushBuffer()             3,367  30.095
>          org.apache.hadoop.security.SaslRpcClient$WrappedOutputStream.write(byte[], int, int)  3,367  30.095
>           com.sun.security.sasl.gsskerb.GssKrb5Base.wrap(byte[], int, int)  3,281  29.326
>            sun.security.jgss.GSSContextImpl.wrap(byte[], int, int, MessageProp)  3,281  29.326
>             sun.security.jgss.krb5.Krb5Context.wrap(byte[], int, int, MessageProp)  3,280  29.317
>              sun.security.jgss.krb5.WrapToken.<init>(Krb5Context

[jira] [Commented] (HADOOP-12910) Add new FileSystem API to support asynchronous method calls

2016-06-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328409#comment-15328409
 ] 

Andrew Purtell commented on HADOOP-12910:
-

bq. Future based api will be of immediate help for Hive. 

Does 'immediate' mean the Hive code using the proposed API has already been 
written? 

bq. Threadpool based approach complicates Hive code base for maintainability 
and is not preferable.

This desire not to use a threadpool argues against a Future based approach. 

> Add new FileSystem API to support asynchronous method calls
> ---
>
> Key: HADOOP-12910
> URL: https://issues.apache.org/jira/browse/HADOOP-12910
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Xiaobing Zhou
> Attachments: HADOOP-12910-HDFS-9924.000.patch, 
> HADOOP-12910-HDFS-9924.001.patch, HADOOP-12910-HDFS-9924.002.patch
>
>
> Add a new API, namely FutureFileSystem (or AsynchronousFileSystem, if it is a 
> better name).  All the APIs in FutureFileSystem are the same as FileSystem 
> except that the return type is wrapped by Future, e.g.
> {code}
>   //FileSystem
>   public boolean rename(Path src, Path dst) throws IOException;
>   //FutureFileSystem
>   public Future<Boolean> rename(Path src, Path dst) throws IOException;
> {code}
> Note that FutureFileSystem does not extend FileSystem.
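The Future-returning shape being proposed can be sketched without Hadoop at all. The class below is illustrative only: it wraps a plain java.nio rename in an executor to show the deferred-result contract, and is not the API from the patches.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch of a Future-returning rename, assuming a plain
// java.nio.file backing store rather than the real Hadoop FileSystem.
public final class FutureRenameSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    /** Same contract as FileSystem#rename, but the boolean result is deferred. */
    public Future<Boolean> rename(Path src, Path dst) {
        return pool.submit(() -> {
            try {
                Files.move(src, dst);
                return true;
            } catch (IOException e) {
                return false; // FileSystem#rename reports failure as false
            }
        });
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

Note how the threadpool is hidden inside the implementation; the downstream objection quoted above is precisely about pushing that pool management onto callers instead.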






[jira] [Commented] (HADOOP-12910) Add new FileSystem API to support asynchronous method calls

2016-06-08 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321113#comment-15321113
 ] 

Andrew Purtell commented on HADOOP-12910:
-

For what it's worth, from someone on several PMCs of downstream projects, what 
you have is substantive technical feedback on the proposal from the downstream 
community most likely to try and adopt the feature as soon as there is a 
plausible user facing API. In fact that community has already supplied you, if 
you like, with a working proof of concept DFS client. In open source, that is 
riches, freely offered effort to a problem you now don't need to solve all on 
your own. Of course there could be technical reasons not to consider it 
further, but I don't see it given any consideration at all. Is that correct? In 
any case, I believe the feedback on API design in this issue is all offered in 
good faith hoping to achieve a user facing API that is consumable for high 
performance async applications as they are actually commonly written in the 
industry today. In the interest of all potential downstream users please 
consider accepting the feedback in that spirit. API design done in a manner 
ignoring or antagonistic to likely users can't produce a more usable outcome 
than a humble and collaborative approach. This has the potential to be a big 
deal for the whole ecosystem. By working with your downstreams you will be more 
likely to lift all boats, including your own.

> Add new FileSystem API to support asynchronous method calls
> ---
>
> Key: HADOOP-12910
> URL: https://issues.apache.org/jira/browse/HADOOP-12910
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Xiaobing Zhou
> Attachments: HADOOP-12910-HDFS-9924.000.patch, 
> HADOOP-12910-HDFS-9924.001.patch, HADOOP-12910-HDFS-9924.002.patch
>
>
> Add a new API, namely FutureFileSystem (or AsynchronousFileSystem, if it is a 
> better name).  All the APIs in FutureFileSystem are the same as FileSystem 
> except that the return type is wrapped by Future, e.g.
> {code}
>   //FileSystem
>   public boolean rename(Path src, Path dst) throws IOException;
>   //FutureFileSystem
>   public Future<Boolean> rename(Path src, Path dst) throws IOException;
> {code}
> Note that FutureFileSystem does not extend FileSystem.






[jira] [Commented] (HADOOP-12956) Inevitable Log4j2 migration via slf4j

2016-03-27 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213694#comment-15213694
 ] 

Andrew Purtell commented on HADOOP-12956:
-

We also evaluated a move to log4j2 for HBase on HBASE-10092, and likewise came 
to the conclusion it is impossible to achieve backwards compatibility for 
operators and tools. A change like this can't go in on a minor release. (At 
least, HBase cannot do that, according to our published version compatibility 
guidelines.) The killer issue is no backwards compatibility for the properties 
based configuration files. Operators, not unreasonably, expect a minor or patch 
version increment should continue to "just work" with existing configuration 
files and tooling, not completely break all logging. 

I see after LOG4J2-952 there is a "PropertiesConfigurationFactory", so _a_ 
properties based configuration file format is again possible, but this isn't 
helpful and probably not desired because you get the limitations of a 
properties file format without v1 compatibility.
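To make the configuration break concrete, here is a hedged sketch of the same console logger in both dialects; a log4j 1.x file like the first is simply not readable under the log4j 2.x properties format, even post-LOG4J2-952:

```properties
# log4j 1.x -- what operators' existing files look like
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c: %m%n

# log4j 2.x properties format -- entirely different keys
appender.console.type=Console
appender.console.name=STDOUT
appender.console.layout.type=PatternLayout
appender.console.layout.pattern=%d{ISO8601} %-5p [%t] %c: %m%n
rootLogger.level=INFO
rootLogger.appenderRef.stdout.ref=STDOUT
```

Every key changes, so existing files and any tooling that edits them break on upgrade, which is the compatibility objection above.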

> Inevitable Log4j2 migration via slf4j
> -
>
> Key: HADOOP-12956
> URL: https://issues.apache.org/jira/browse/HADOOP-12956
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Gopal V
>
> {{5 August 2015 -- The Apache Logging Services™ Project Management Committee 
> (PMC) has announced that the Log4j™ 1.x logging framework has reached its end 
> of life (EOL) and is no longer officially supported.}}
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> A framework-wide log4j2 upgrade has to be synchronized, partly for the 
> improved performance brought about by log4j2.
> https://logging.apache.org/log4j/2.x/manual/async.html#Performance





[jira] [Commented] (HADOOP-12228) Document releasedocmaker

2015-07-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648528#comment-14648528
 ] 

Andrew Purtell commented on HADOOP-12228:
-

+1

> Document releasedocmaker
> 
>
> Key: HADOOP-12228
> URL: https://issues.apache.org/jira/browse/HADOOP-12228
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: yetus
>Affects Versions: HADOOP-12111
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
> Attachments: HADOOP-12228.HADOOP-12111.00.patch
>
>






[jira] [Commented] (HADOOP-11747) Why not re-use the security model offered by SELINUX?

2015-03-25 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380726#comment-14380726
 ] 

Andrew Purtell commented on HADOOP-11747:
-

You could repurpose this as a proposal to implement Flask-style type 
enforcement in Hadoop, with push down to the OS if support is available there 
for it, like SELinux or TrustedBSD. :-) 

> Why not re-use the security model offered by SELINUX?
> -
>
> Key: HADOOP-11747
> URL: https://issues.apache.org/jira/browse/HADOOP-11747
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Madhan Sundararajan Devaki
>Priority: Critical
>
> SELinux was introduced to bring robust security management to the Linux OS.
> In all distributions of Hadoop (Cloudera/Hortonworks/...), one of the 
> pre-installation checklist items is to disable SELinux on all the nodes of 
> the cluster.
> Why not re-use the security model offered by SELinux instead of re-inventing 
> it from scratch through Sentry/Knox/etc.?





[jira] [Moved] (HADOOP-11693) Azure Storage FileSystem rename operations are throttled too aggressively to complete HBase WAL archiving.

2015-03-09 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell moved HBASE-13173 to HADOOP-11693:
-

Key: HADOOP-11693  (was: HBASE-13173)
Project: Hadoop Common  (was: HBase)

> Azure Storage FileSystem rename operations are throttled too aggressively to 
> complete HBase WAL archiving.
> --
>
> Key: HADOOP-11693
> URL: https://issues.apache.org/jira/browse/HADOOP-11693
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Duo Xu
> Attachments: HADOOP-11681.01.patch
>
>
> One of our customers' production HBase clusters was periodically throttled by 
> Azure storage when HBase was archiving old WALs. HMaster aborted the region 
> server and tried to restart it.
> However, since the cluster was still being throttled by Azure storage, the 
> subsequent distributed log splitting also failed. Sometimes the hbase:meta 
> table was on this region server and finally showed offline, which left the 
> whole cluster in a bad state.
> {code}
> 2015-03-01 18:36:45,623 ERROR org.apache.hadoop.hbase.master.HMaster: Region server workernode4.hbaseproddb4001.f5.internal.cloudapp.net,60020,1424845421044 reported a fatal error:
> ABORTING region server workernode4.hbaseproddb4001.f5.internal.cloudapp.net,60020,1424845421044: IOE in log roller
> Cause:
> org.apache.hadoop.fs.azure.AzureException: com.microsoft.windowsazure.storage.StorageException: The server is busy.
>   at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2446)
>   at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2367)
>   at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.rename(NativeAzureFileSystem.java:1960)
>   at org.apache.hadoop.hbase.util.FSUtils.renameAndSetModifyTime(FSUtils.java:1719)
>   at org.apache.hadoop.hbase.regionserver.wal.FSHLog.archiveLogFile(FSHLog.java:798)
>   at org.apache.hadoop.hbase.regionserver.wal.FSHLog.cleanOldLogs(FSHLog.java:656)
>   at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:593)
>   at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:97)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: com.microsoft.windowsazure.storage.StorageException: The server is busy.
>   at com.microsoft.windowsazure.storage.StorageException.translateException(StorageException.java:163)
>   at com.microsoft.windowsazure.storage.core.StorageRequest.materializeException(StorageRequest.java:306)
>   at com.microsoft.windowsazure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:229)
>   at com.microsoft.windowsazure.storage.blob.CloudBlob.startCopyFromBlob(CloudBlob.java:762)
>   at org.apache.hadoop.fs.azurenative.StorageInterfaceImpl$CloudBlobWrapperImpl.startCopyFromBlob(StorageInterfaceImpl.java:350)
>   at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2439)
>   ... 8 more
> 2015-03-01 18:43:29,072 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_META_SERVER_SHUTDOWN
> java.io.IOException: failed log splitting for workernode13.hbaseproddb4001.f5.internal.cloudapp.net,60020,1424845307901, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:71)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.fs.azure.AzureException: com.microsoft.windowsazure.storage.StorageException: The server is busy.
>   at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2446)
>   at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem$FolderRenamePending.execute(NativeAzureFileSystem.java:393)
>   at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.rename(NativeAzureFileSystem.java:1973)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.getLogDirs(MasterFileSystem.java:319)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:406)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:302)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:293)
>   at org.apa

[jira] [Commented] (HADOOP-11423) [Umbrella] Support Java 10 in Hadoop

2014-12-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256657#comment-14256657
 ] 

Andrew Purtell commented on HADOOP-11423:
-

Is this snark?

> [Umbrella] Support Java 10 in Hadoop
> 
>
> Key: HADOOP-11423
> URL: https://issues.apache.org/jira/browse/HADOOP-11423
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: sneaky
>
> Java 10 is coming quickly to various clusters. Making sure Hadoop seamlessly 
> works with Java 10 is important for the Apache community.





[jira] [Commented] (HADOOP-11090) [Umbrella] Support Java 8 in Hadoop

2014-12-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243252#comment-14243252
 ] 

Andrew Purtell commented on HADOOP-11090:
-

bq. I post the patch that can disable doclint when you create hadoop package.

Does it work when you use Java 7 to build?

> [Umbrella] Support Java 8 in Hadoop
> ---
>
> Key: HADOOP-11090
> URL: https://issues.apache.org/jira/browse/HADOOP-11090
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
>
> Java 8 is coming quickly to various clusters. Making sure Hadoop seamlessly 
> works with Java 8 is important for the Apache community.
>   
> This JIRA is to track the issues/experiences encountered during Java 8 
> migration. If you find a potential bug, please create a separate JIRA, either 
> as a sub-task or linked to this JIRA.
> If you find a useful Hadoop or JVM configuration tuning, you can create a 
> JIRA as well, or add a comment here.





[jira] [Commented] (HADOOP-10134) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2014-12-08 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238829#comment-14238829
 ] 

Andrew Purtell commented on HADOOP-10134:
-

Have you tried compiling the result? My guess is that more illegal tags have 
been added via commits over the intervening months. 

> [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments 
> --
>
> Key: HADOOP-10134
> URL: https://issues.apache.org/jira/browse/HADOOP-10134
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: 10134-branch-2.patch, 10134-branch-2.patch, 
> 10134-trunk.patch, 10134-trunk.patch, 10134-trunk.patch, 
> HADOOP-10134.000.patch
>
>
> Javadoc is more strict by default in JDK8 and will error out on malformed or 
> illegal tags found in doc comments. Although tagged as JDK8 all of the 
> required changes are generic Javadoc cleanups.





[jira] [Commented] (HADOOP-11090) [Umbrella] Issues with Java 8 in Hadoop

2014-09-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137516#comment-14137516
 ] 

Andrew Purtell commented on HADOOP-11090:
-

See HBASE-12006, we borrowed this class from core test.

> [Umbrella] Issues with Java 8 in Hadoop
> ---
>
> Key: HADOOP-11090
> URL: https://issues.apache.org/jira/browse/HADOOP-11090
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
>
> Java 8 is coming quickly to various clusters. Making sure Hadoop seamlessly 
> works with Java 8 is important for the Apache community.
>   
> This JIRA is to track the issues/experiences encountered during Java 8 
> migration. If you find a potential bug, please create a separate JIRA, either 
> as a sub-task or linked to this JIRA.
> If you find a useful Hadoop or JVM configuration tuning, you can create a 
> JIRA as well, or add a comment here.





[jira] [Commented] (HADOOP-10786) Patch that fixes UGI#reloginFromKeytab on java 8

2014-09-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137440#comment-14137440
 ] 

Andrew Purtell commented on HADOOP-10786:
-

Shouldn't this be a higher priority than 'Minor'? The end of public updates to 
Java 7 will be April 2015. A silent failure to re-login from keytab after TGT 
expiry dooms any long running process that wants to use secure RPC. Anyone who 
cares about security and about running the best performing supported Java 
runtime will shortly be forced to locally patch their core libraries. 

> Patch that fixes UGI#reloginFromKeytab on java 8
> 
>
> Key: HADOOP-10786
> URL: https://issues.apache.org/jira/browse/HADOOP-10786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Tobi Vollebregt
>Assignee: Tobi Vollebregt
>Priority: Minor
> Attachments: HADOOP-10786.patch
>
>
> Krb5LoginModule changed subtly in java 8: in particular, if useKeyTab and 
> storeKey are specified, then only a KeyTab object is added to the Subject's 
> private credentials, whereas in java <= 7 both a KeyTab and some number of 
> KerberosKey objects were added.
> The UGI constructor checks whether or not a keytab was used to login by 
> looking if there are any KerberosKey objects in the Subject's private 
> credentials. If there are, then isKeyTab is set to true, and otherwise it's 
> set to false.
> Thus, in java 8 isKeyTab is always false given the current UGI 
> implementation, which makes UGI#reloginFromKeytab fail silently.
> Attached patch will check for a KeyTab object on the Subject, instead of a 
> KerberosKey object. This fixes relogins from kerberos keytabs on Oracle java 
> 8, and works on Oracle java 7 as well.
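
The behavioral difference described above can be sketched directly against the JAAS API. This is an illustrative stand-in, not the actual UGI code; the helper names are invented:

```java
import java.io.File;
import javax.security.auth.Subject;
import javax.security.auth.kerberos.KerberosKey;
import javax.security.auth.kerberos.KeyTab;

public class KeytabDetection {
    // Pre-patch heuristic: a keytab login leaves KerberosKey objects in the
    // Subject's private credentials. True on Java <= 7, always false on Java 8.
    static boolean keysPresent(Subject s) {
        return !s.getPrivateCredentials(KerberosKey.class).isEmpty();
    }

    // Patched heuristic: look for the KeyTab object itself, which
    // Krb5LoginModule adds on both Java 7 and Java 8.
    static boolean keytabPresent(Subject s) {
        return !s.getPrivateCredentials(KeyTab.class).isEmpty();
    }

    public static void main(String[] args) {
        // Simulate a Java 8 keytab login: a KeyTab but no KerberosKey objects.
        // KeyTab.getInstance does not read the file, so no real keytab is needed.
        Subject s = new Subject();
        s.getPrivateCredentials().add(KeyTab.getInstance(new File("example.keytab")));
        System.out.println(keysPresent(s));   // false -> old check misfires
        System.out.println(keytabPresent(s)); // true  -> relogin works
    }
}
```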





[jira] [Commented] (HADOOP-11090) [Umbrella] Issues with Java 8 in Hadoop

2014-09-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132405#comment-14132405
 ] 

Andrew Purtell commented on HADOOP-11090:
-

You'll have issues with Javadoc lint during *compile*. It can be disabled by a 
javac flag, unfortunately not recognized by javac < 8, but maybe some Maven 
magic could take care of that. Or, the javadoc itself can be fixed.
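
One shape the "Maven magic" could take, as a sketch: a profile activated only on JDK 8 and later, so that older javadoc never sees the unrecognized flag. This assumes the pre-3.0 maven-javadoc-plugin convention of passing extra flags through the `additionalparam` property:

```xml
<!-- Activates only on JDK 8+, so javadoc from older JDKs never sees -Xdoclint -->
<profile>
  <id>java8-disable-doclint</id>
  <activation>
    <jdk>[1.8,)</jdk>
  </activation>
  <properties>
    <!-- maven-javadoc-plugin passes this through to the javadoc tool -->
    <additionalparam>-Xdoclint:none</additionalparam>
  </properties>
</profile>
```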

> [Umbrella] Issues with Java 8 in Hadoop
> ---
>
> Key: HADOOP-11090
> URL: https://issues.apache.org/jira/browse/HADOOP-11090
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
>
> Java 8 is coming quickly to various clusters. Making sure Hadoop works 
> seamlessly with Java 8 is important for the Apache community.
>   
> This JIRA is to track the issues/experiences encountered during Java 8 
> migration. If you find a potential bug, please create a separate JIRA, either 
> as a sub-task or linked to this JIRA.
> If you find a helpful Hadoop or JVM configuration tuning, you can create a 
> JIRA as well, or add a comment here.





[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-08-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085595#comment-14085595
 ] 

Andrew Purtell commented on HADOOP-9902:


This issue has been open for a year and has 40 watchers (a good proxy for 
general interest), yet the contributor is begging for a review and being shut 
down by a veto. This seems like a good venue to point that out since all the 
information is on hand at a glance. I'm sure [~atm] and [~andrew.wang] can and 
will tap each other on the shoulder if this or that HDFS patch needs to go in, 
but there's no bandwidth for review of third party contribution, decided 
intentionally or through accumulated carelessness. Case in point, this issue. 
Consider spending the time you might argue with me looking at the patch.

> Shell script rewrite
> 
>
> Key: HADOOP-9902
> URL: https://issues.apache.org/jira/browse/HADOOP-9902
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: releasenotes
> Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, 
> HADOOP-9902-12.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
> HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
> HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, 
> HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.





[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-08-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085572#comment-14085572
 ] 

Andrew Purtell commented on HADOOP-9902:


bq. we do review-then-commit here in Hadoop

More like ignore-and-drop but I digress.

> Shell script rewrite
> 
>
> Key: HADOOP-9902
> URL: https://issues.apache.org/jira/browse/HADOOP-9902
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: releasenotes
> Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, 
> HADOOP-9902-12.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
> HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
> HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, 
> HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.





[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068857#comment-14068857
 ] 

Andrew Purtell commented on HADOOP-10641:
-

bq. I want a formal specification of the API, and what we have in the current 
PDF design document is not it. I will also need evidence that the reference ZK 
implementation is consistent with that specification, both by any maths that 
can be provided, and the test cases derived from the specification.

This is a good idea in the abstract, but the notion of applying Amazon's 
process to a volunteer open source project is problematic. In terms of the 
Hadoop contribution process, this is a novel requirement. It is up to the 
Hadoop committership to determine commit criteria of course, but I humbly 
suggest that the intersection of contributors able to mathematically prove the 
correctness of a large code change while simultaneously being able to implement 
production quality systems code is vanishingly small. In this case, the 
contributors might be able to meet the challenge but going forward if 
significant changes to Hadoop will require a team of engineers and 
mathematicians, probably this marks the end of external contributions to the 
project. Also, I looked at HADOOP-9361. The documentation updates there are 
fantastic but I did not find any mathematical proofs of correctness. 

> Introduce Coordination Engine
> -
>
> Key: HADOOP-10641
> URL: https://issues.apache.org/jira/browse/HADOOP-10641
> Project: Hadoop Common
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
> HADOOP-10641.patch, hadoop-coordination.patch
>
>
> Coordination Engine (CE) is a system which allows agreement on a sequence of 
> events in a distributed system. In order to be reliable, the CE should itself 
> be distributed.
> A Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
> zab) and have different implementations, depending on use cases, reliability, 
> availability, and performance requirements.
> The CE should have a common API, so that it could serve as a pluggable component 
> in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
> HBase (HBASE-10909).
> The first implementation is proposed to be based on ZooKeeper.





[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-02 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050526#comment-14050526
 ] 

Andrew Purtell commented on HADOOP-10734:
-

bq. When using Linux, why not just read from /dev/urandom? That will also use 
the RdRand feature when available.

I think this can be done with the default SecureRandom implementation, with 
-Djava.security.egd=file:/dev/./urandom . However, since there is file IO 
involved, it may still be faster to invoke the rdrand instruction directly. 
Docs suggest a direct throughput of ~6Gbps. 
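
A minimal sketch of the egd override with an invented class name. Note the property has to be supplied on the command line, since it is read when the security provider initializes:

```java
import java.security.SecureRandom;

public class EgdSketch {
    // Fill a buffer from the default SecureRandom. Launched as
    //   java -Djava.security.egd=file:/dev/./urandom EgdSketch
    // the seed source becomes the non-blocking /dev/urandom on Linux
    // (the "/./" works around the JDK's special-casing of the plain path).
    static byte[] fill(int n) {
        byte[] buf = new byte[n];
        new SecureRandom().nextBytes(buf);
        return buf;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        fill(1 << 20); // 1 MiB
        System.out.println("1 MiB in " + (System.nanoTime() - start) / 1_000 + " us");
    }
}
```

Comparing the timing with and without the flag (and against a direct rdrand path) is the kind of measurement that would settle the throughput question.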

> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement secure random using JNI to OpenSSL, and the 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from a hardware random number 
> generator. It is a TRNG (True Random Number Generator) with much higher 
> performance than {{java.security.SecureRandom}}. 
> https://wiki.openssl.org/index.php/Random_Numbers
> http://en.wikipedia.org/wiki/RdRand
> https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl





[jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance

2014-07-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049067#comment-14049067
 ] 

Andrew Purtell commented on HADOOP-10768:
-

bq. Even GSSAPI supports using AES, but without AES-NI support by default, so 
the encryption is slow and will become a bottleneck.

Java's GSSAPI uses JCE ciphers for crypto support. Would it be possible to 
simply swap in an accelerated provider like Diceros? 
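
The provider-swap idea amounts to registering the accelerated JCE provider at a higher preference than the defaults, so that JCE lookups, including the ones GSSAPI performs internally, resolve to it first. A sketch, not the actual Hadoop wiring; `DicerosProvider` is named only because the comment above mentions Diceros:

```java
import javax.crypto.Cipher;

public class ProviderSwapSketch {
    // Returns the name of whichever registered provider wins the AES lookup,
    // or null if the transformation is unavailable.
    static String providerName() {
        try {
            return Cipher.getInstance("AES/CTR/NoPadding").getProvider().getName();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // An accelerated provider would be inserted at position 1, e.g.:
        //   java.security.Security.insertProviderAt(new DicerosProvider(), 1);
        // (DicerosProvider is hypothetical here; any Provider subclass works.)
        System.out.println(providerName());
    }
}
```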

On the other hand, whether to wrap payloads using the SASL client or server or 
not is an application decision. One could wrap the initial payloads with 
whatever encryption was negotiated during connection initiation until 
completing additional key exchange and negotiation steps, then switch to an 
alternate means of applying a symmetric cipher to RPC payloads.

bq. On the other hand, RPC messages are small

This is a similar issue to what we had/have with HBase write ahead log 
encryption, because we need to encrypt on a per-entry boundary to avoid data 
loss during recovery, and each entry is small. You might think that small 
payloads mean we won't be able to increase throughput with accelerated crypto, 
and you would be right, but the accelerated crypto still reduces CPU time 
substantially, with a proportional reduction in the latency introduced by 
cryptographic operations. I think for both the HBase WAL and Hadoop RPC, 
latency is a critical consideration.

> Optimize Hadoop RPC encryption performance
> --
>
> Key: HADOOP-10768
> URL: https://issues.apache.org/jira/browse/HADOOP-10768
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 3.0.0
>
>
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to 
> "privacy". It utilizes the SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for 
> secure authentication and data protection. Even {{GSSAPI}} supports using 
> AES, but without AES-NI support by default, so the encryption is slow and 
> will become a bottleneck.
> After discussing with [~atm], [~tucu00] and [~umamaheswararao], we can do the 
> same optimization as in HDFS-6606: use AES-NI for more than *20x* speedup.
> On the other hand, RPC messages are small, but RPC is frequent and there may be 
> lots of RPC calls in one connection; we need to set up a benchmark to see the 
> real improvement and then make a trade-off. 





[jira] [Commented] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL

2014-06-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042802#comment-14042802
 ] 

Andrew Purtell commented on HADOOP-10693:
-

bq. Is it possible to use something like this? 
http://www.literatecode.com/aes256 This implementation is just two files, a .c 
and a .h, and doesn't require any libraries to be installed. 

What is really of value in the OpenSSL crypto implementation are the highly 
optimized (assembler) implementations of the AES algorithm. I took a quick look 
at the provided URL and the author himself says:
{quote}
It is a straightforward and rather naïve byte-oriented portable C 
implementation, where all the lookup tables replaced with on-the-fly 
calculations. Certainly it is slower and more subjective to side-channel 
attacks by nature.
{quote}
If there are concerns about OpenSSL, there is always NSS. Using NSS instead 
won't isolate the project from potential security advisories involving the 
crypto library, you've just chosen a different library with a different 
pedigree, and perhaps not quite the same level of scrutiny (after heartbleed). 

> Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
> --
>
> Key: HADOOP-10693
> URL: https://issues.apache.org/jira/browse/HADOOP-10693
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10693.1.patch, HADOOP-10693.patch
>
>
> In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java 
> JCE provider. 
> To get high performance, the configured JCE provider should utilize native 
> code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it.
>  
> Considering that not all Hadoop users will use a provider like Diceros, or be 
> able to get a signed certificate from Oracle to develop a custom provider, 
> this JIRA will provide an implementation of an AES-CTR CryptoCodec using JNI 
> to OpenSSL directly.





[jira] [Commented] (HADOOP-10539) Provide backward compatibility for ProxyUsers.authorize() call

2014-04-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980318#comment-13980318
 ] 

Andrew Purtell commented on HADOOP-10539:
-

Thanks so much [~cnauroth]

> Provide backward compatibility for ProxyUsers.authorize() call
> --
>
> Key: HADOOP-10539
> URL: https://issues.apache.org/jira/browse/HADOOP-10539
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>Priority: Minor
> Fix For: 3.0.0, 2.5.0
>
> Attachments: HADOOP-10539.patch, HADOOP-10539.patch
>
>
> HADOOP-10499 removed the unused _Configuration_ parameter from the 
> _ProxyUsers.authorize()_ method. 
> This broke a few components, like HBase, which invoked that function, forcing 
> them to rely on reflection.





[jira] [Commented] (HADOOP-10539) Provide backward compatibility for ProxyUsers.authorize() call

2014-04-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980222#comment-13980222
 ] 

Andrew Purtell commented on HADOOP-10539:
-

Reflection will be a perf hit.
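
The reflective workaround a downstream project is forced into looks roughly like this. The classes here are stand-ins, not the real ProxyUsers; the per-call method lookup plus `invoke()` overhead is the performance hit in question, and caching the resolved `Method` softens but does not remove it:

```java
import java.lang.reflect.Method;

// Stand-in for the changed class: the Configuration-taking overload is gone.
class AuthorizerV2 {
    static String invoked;
    public static void authorize(String user, String host) { invoked = "2-arg"; }
}

public class ReflectiveShim {
    // Downstream compatibility shim: try the old 3-arg signature first,
    // then fall back to the new 2-arg one. Each call pays for the lookup
    // and the reflective invoke.
    static void authorizeCompat(Class<?> cls, String user, String host, Object conf) {
        try {
            try {
                Method m = cls.getMethod("authorize",
                        String.class, String.class, Object.class);
                m.invoke(null, user, host, conf);
            } catch (NoSuchMethodException e) {
                Method m = cls.getMethod("authorize", String.class, String.class);
                m.invoke(null, user, host);
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        authorizeCompat(AuthorizerV2.class, "hbase", "rs1.example.com", null);
        System.out.println(AuthorizerV2.invoked); // prints "2-arg"
    }
}
```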

> Provide backward compatibility for ProxyUsers.authorize() call
> --
>
> Key: HADOOP-10539
> URL: https://issues.apache.org/jira/browse/HADOOP-10539
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>Priority: Minor
> Attachments: HADOOP-10539.patch, HADOOP-10539.patch
>
>
> HADOOP-10499 removed the unused _Configuration_ parameter from the 
> _ProxyUsers.authorize()_ method. 
> This broke a few components, like HBase, which invoked that function, forcing 
> them to rely on reflection.





[jira] [Commented] (HADOOP-10539) Provide backward compatibility for ProxyUsers.authorize() call

2014-04-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979935#comment-13979935
 ] 

Andrew Purtell commented on HADOOP-10539:
-

I resolved HBASE-11067 as Won't Fix, as this issue will address the problem 
with a breaking API change in branch-2. As an alternative, if there is a way to 
accomplish what ProxyUsers#authorize does using an interface not marked 
InterfaceAudience.Private, a pointer would be greatly appreciated. 


> Provide backward compatibility for ProxyUsers.authorize() call
> --
>
> Key: HADOOP-10539
> URL: https://issues.apache.org/jira/browse/HADOOP-10539
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>Priority: Minor
> Attachments: HADOOP-10539.patch
>
>
> HADOOP-10499 removed the unused _Configuration_ parameter from the 
> _ProxyUsers.authorize()_ method. 
> This broke a few components, like HBase, which invoked that function, forcing 
> them to rely on reflection.





[jira] [Commented] (HADOOP-10528) A TokenKeyProvider for a Centralized Key Manager Server (BEE: bee-key-manager)

2014-04-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977592#comment-13977592
 ] 

Andrew Purtell commented on HADOOP-10528:
-

bq. While this seems like interesting work, it also duplicates a number of jiras

Certainly in the sense that parts of HADOOP-9331 were peeled off in JIRAs that 
duplicate some of its scope. That seems to have forked work in this area rather 
than foster healthy community collaboration between those with and without the 
commit bit. It's disappointing to see this pattern continuing here with this 
issue dismissed as "duplicate". 

> A TokenKeyProvider for a Centralized Key Manager Server (BEE: bee-key-manager)
> --
>
> Key: HADOOP-10528
> URL: https://issues.apache.org/jira/browse/HADOOP-10528
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Reporter: howie yu
> Attachments: HADOOP-10528.patch
>
>
> This is a key provider based on HADOOP-9331. HADOOP-9331 designed a 
> complete Hadoop crypto codec framework, but the key can only be retrieved 
> from a local Java KeyStore file. For convenience, we designed a Centralized 
> Key Manager Server (BEE: bee-key-manager); users can use this 
> TokenKeyProvider to retrieve keys from the Centralized Key Manager Server. 
> To secure the key exchange, we leverage HTTPS + SPNego/SASL. For the detailed 
> design and usage, please refer to 
> https://github.com/trendmicro/BEE. 
> Moreover, there are many more requests around Hadoop data encryption 
> (such as providing a standalone module, supporting KMIP, etc.); if anyone is 
> interested in those features, please let us know. 
>  
> P.S. Because this patch is based on HADOOP-9331, please apply patches 
> HADOOP-9333 and HADOOP-9332 before using our patch HADOOP-10528.patch.





[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system

2014-04-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975873#comment-13975873
 ] 

Andrew Purtell commented on HADOOP-10150:
-

bq.  there's one more layer to consider: virtualized hadoop clusters.

An interesting paper on this topic is http://eprint.iacr.org/2014/248.pdf, 
which discusses side channel attacks on AES on Xen and VMWare platforms. JCE 
ciphers were not included in the analysis but should be suspect until proven 
otherwise. JRE >= 8 will accelerate AES using AES-NI instructions. Since AES-NI 
performs each full round of AES in a hardware register all known side channel 
attacks are prevented. 

> Hadoop cryptographic file system
> 
>
> Key: HADOOP-10150
> URL: https://issues.apache.org/jira/browse/HADOOP-10150
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system-V2.docx, HADOOP cryptographic file system.pdf, 
> HDFSDataAtRestEncryptionAlternatives.pdf, 
> HDFSDataatRestEncryptionAttackVectors.pdf, 
> HDFSDataatRestEncryptionProposal.pdf, cfs.patch, extended information based 
> on INode feature.patch
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
> on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1. Transparent to, and no modification required for, upper layer 
> applications.
> 2. “Seek” and “PositionedReadable” are supported for the CFS input stream if 
> the wrapped file system supports them.
> 3. Very high performance for encryption and decryption; they will not 
> become a bottleneck.
> 4. Can decorate HDFS and all other file systems in Hadoop, and will not 
> modify the existing structure of the file system, such as the namenode and 
> datanode structure if the wrapped file system is HDFS.
> 5. Admins can configure encryption policies, such as which directory will 
> be encrypted.
> 6. A robust key management framework.
> 7. Support Pread and append operations if the wrapped file system supports 
> them.





[jira] [Commented] (HADOOP-10410) Support ioprio_set in NativeIO

2014-04-19 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974972#comment-13974972
 ] 

Andrew Purtell commented on HADOOP-10410:
-

bq. Doing our own IO scheduling, however, is a much larger project, which to be 
effective may also require other deep changes like switching to O_DIRECT

Sooner or later, a database or database-ish system usually ends up at O_DIRECT.

> Support ioprio_set in NativeIO
> --
>
> Key: HADOOP-10410
> URL: https://issues.apache.org/jira/browse/HADOOP-10410
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: native
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HADOOP-10410.txt
>
>
> It would be better for HBase if the HDFS layer provided a fine-grained 
> IO request priority. Most modern kernels should support the ioprio_set system 
> call now.





[jira] [Commented] (HADOOP-10310) SaslRpcServer should be initialized even when no secret manager present

2014-01-29 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886037#comment-13886037
 ] 

Andrew Purtell commented on HADOOP-10310:
-

Thanks ATM that helps out.

> SaslRpcServer should be initialized even when no secret manager present
> ---
>
> Key: HADOOP-10310
> URL: https://issues.apache.org/jira/browse/HADOOP-10310
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Blocker
> Attachments: HADOOP-10310.patch
>
>
> HADOOP-8783 made a change which caused the SaslRpcServer not to be 
> initialized if there is no secret manager present. This works fine for most 
> Hadoop daemons because they need a secret manager to do their business, but 
> JournalNodes do not. The result of this is that JournalNodes are broken and 
> will not handle RPCs in a Kerberos-enabled environment, since the 
> SaslRpcServer will not be initialized.





[jira] [Commented] (HADOOP-10310) SaslRpcServer should be initialized even when no secret manager present

2014-01-29 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886033#comment-13886033
 ] 

Andrew Purtell commented on HADOOP-10310:
-

Pretty sure HADOOP-8983 wasn't the breaking change

> SaslRpcServer should be initialized even when no secret manager present
> ---
>
> Key: HADOOP-10310
> URL: https://issues.apache.org/jira/browse/HADOOP-10310
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Blocker
> Attachments: HADOOP-10310.patch
>
>
> HADOOP-8983 made a change which caused the SaslRpcServer not to be 
> initialized if there is no secret manager present. This works fine for most 
> Hadoop daemons because they need a secret manager to do their business, but 
> JournalNodes do not. The result of this is that JournalNodes are broken and 
> will not handle RPCs in a Kerberos-enabled environment, since the 
> SaslRpcServer will not be initialized.





[jira] [Commented] (HADOOP-10253) Remove deprecated methods in HttpServer

2014-01-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879141#comment-13879141
 ] 

Andrew Purtell commented on HADOOP-10253:
-

bq. This is exactly the situation that we want to avoid – HttpServer is a 
private API and it should not support any downstream uses.

I went and reviewed the code in question, and I don't understand this 
statement. This is the interface annotation for HttpServer on HEAD of Hadoop 
common branch-2:
{noformat}
@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce", "HBase"})
@InterfaceStability.Evolving
public class HttpServer implements FilterContainer {
{noformat}

> Remove deprecated methods in HttpServer
> ---
>
> Key: HADOOP-10253
> URL: https://issues.apache.org/jira/browse/HADOOP-10253
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Haohui Mai
>
> There are a lot of deprecated methods in {{HttpServer}}. They are not used 
> anymore. They should be removed.





[jira] [Commented] (HADOOP-10253) Remove deprecated methods in HttpServer

2014-01-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879088#comment-13879088
 ] 

Andrew Purtell commented on HADOOP-10253:
-

My read of HBASE-10336 is the change there will be committed to HBase trunk, 
but not to the 0.96 and 0.98 release branches. For those releases, we will need 
to add a disclaimer they will not work with Hadoop 2.X where X is wherever the 
backwards compatibility is broken on the Hadoop 2.x release line. 

> Remove deprecated methods in HttpServer
> ---
>
> Key: HADOOP-10253
> URL: https://issues.apache.org/jira/browse/HADOOP-10253
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Haohui Mai
>
> There are a lot of deprecated methods in {{HttpServer}}. They are not used 
> anymore. They should be removed.





[jira] [Commented] (HADOOP-10253) Remove deprecated methods in HttpServer

2014-01-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879068#comment-13879068
 ] 

Andrew Purtell commented on HADOOP-10253:
-

bq. The methods will be removed in the next release.

If you mean 2.5, then you will have only moved the problem for HBase 0.96.x and 
HBase 0.98.x to 2.5. Surely other downstreams will be affected also.

> Remove deprecated methods in HttpServer
> ---
>
> Key: HADOOP-10253
> URL: https://issues.apache.org/jira/browse/HADOOP-10253
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Haohui Mai
>
> There are a lot of deprecated methods in {{HttpServer}}. They are not used 
> anymore. They should be removed.





[jira] [Commented] (HADOOP-10253) Remove deprecated methods in HttpServer

2014-01-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879057#comment-13879057
 ] 

Andrew Purtell commented on HADOOP-10253:
-

Have you checked if downstream projects are using them?

> Remove deprecated methods in HttpServer
> ---
>
> Key: HADOOP-10253
> URL: https://issues.apache.org/jira/browse/HADOOP-10253
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Haohui Mai
>
> There are a lot of deprecated methods in {{HttpServer}}. They are not used 
> anymore. They should be removed.





[jira] [Commented] (HADOOP-10252) HttpServer can't start if hostname is not specified

2014-01-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879016#comment-13879016
 ] 

Andrew Purtell commented on HADOOP-10252:
-

Based on the discussion on HBASE-10336, it is likely not to go in on any 
released or soon to be released version of HBase. The next release is likely a 
few months out. 

> HttpServer can't start if hostname is not specified
> ---
>
> Key: HADOOP-10252
> URL: https://issues.apache.org/jira/browse/HADOOP-10252
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 2.4.0
>
> Attachments: hadoop-10252.patch
>
>
> HADOOP-8362 added a check to make sure configuration values are not null. 
> By default, we don't specify the hostname for the HttpServer, so we could not 
> start the info server due to:
> {noformat}
> 2014-01-22 08:43:05,969 FATAL [M:0;localhost:48573] master.HMaster(2187): 
> Unhandled exception. Starting shutdown.
> java.lang.IllegalArgumentException: Property value must not be null
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:958)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:940)
>   at 
> org.apache.hadoop.http.HttpServer.initializeWebServer(HttpServer.java:510)
>   at org.apache.hadoop.http.HttpServer.(HttpServer.java:470)
>   at org.apache.hadoop.http.HttpServer.(HttpServer.java:458)
>   at org.apache.hadoop.http.HttpServer.(HttpServer.java:412)
>   at org.apache.hadoop.hbase.util.InfoServer.(InfoServer.java:59)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:584)
>   at java.lang.Thread.run(Thread.java:722){noformat}
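The failure above traces back to the argument check that HADOOP-8362 added to `Configuration.set`. A minimal sketch of that fail-fast behavior (`MiniConf` is a hypothetical stand-in, not the real Hadoop class):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of Configuration.set's null check (HADOOP-8362);
// names and structure are simplified, not the actual Hadoop code.
public class MiniConf {
    private final Map<String, String> props = new HashMap<>();

    public void set(String name, String value) {
        if (name == null) {
            throw new IllegalArgumentException("Property name must not be null");
        }
        if (value == null) {
            // Same message as in the HMaster log above.
            throw new IllegalArgumentException("Property value must not be null");
        }
        props.put(name.trim(), value);
    }

    public static void main(String[] args) {
        MiniConf conf = new MiniConf();
        conf.set("http.bind.host", "0.0.0.0");
        try {
            conf.set("http.bind.host", null); // unspecified hostname -> fails fast
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

So any code path that passes an unset (null) hostname into the server setup now aborts at configuration time rather than later.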





[jira] [Commented] (HADOOP-10206) Port WASB HBase support to Hadoop 2.0

2014-01-06 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863369#comment-13863369
 ] 

Andrew Purtell commented on HADOOP-10206:
-

Several questions. Should this be an HBase JIRA? Is this work being done / 
reviewed on a private JIRA? Is "HWX" Hortonworks? 

If you would like to contribute something to Apache HBase you will need to open 
a public JIRA on the ASF HBASE project JIRA (see 
https://issues.apache.org/jira/browse/HBASE) and post the proposed changes 
there. 

> Port WASB HBase support to Hadoop 2.0
> -
>
> Key: HADOOP-10206
> URL: https://issues.apache.org/jira/browse/HADOOP-10206
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0
> Environment: Windows, Windows Azure
>Reporter: Dexter Bradshaw
>  Labels: patch
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> A series of changes for HBase support on Hadoop 2.0. These changes include 
> support for page blobs, fixes to allow HBase logging to block blobs, and 
> support for the hsync and hflush methods of the Syncable interface.





[jira] [Updated] (HADOOP-10134) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2014-01-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-10134:


Attachment: 10134-trunk.patch
10134-branch-2.patch

Updated patches refreshed to latest trunk and branch-2.

> [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments 
> --
>
> Key: HADOOP-10134
> URL: https://issues.apache.org/jira/browse/HADOOP-10134
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: 10134-branch-2.patch, 10134-branch-2.patch, 
> 10134-trunk.patch, 10134-trunk.patch, 10134-trunk.patch
>
>
> Javadoc is more strict by default in JDK8 and will error out on malformed or 
> illegal tags found in doc comments. Although tagged as JDK8 all of the 
> required changes are generic Javadoc cleanups.
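The kinds of cleanups involved can be illustrated with a small class whose doc comment is written to satisfy JDK 8's stricter checker (this example is illustrative, not taken from the patch):

```java
/**
 * A doclint-clean doc comment: generics are written with {@code List<String>}
 * rather than bare angle brackets, and literal angle brackets in prose are
 * escaped as &lt; and &gt;. JDK 8's javadoc enables doclint by default and
 * treats malformed HTML or bogus tags as hard errors instead of warnings.
 */
public class DoclintDemo {
    /** Marker string used by main; trivial, but documented legally. */
    static String marker() {
        return "javadoc-clean";
    }

    public static void main(String[] args) {
        System.out.println(marker());
    }
}
```

Running `javadoc` on a JDK 8 toolchain over comments that do not follow these rules fails the build, which is why the fixes are needed even though they are generic cleanups.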





[jira] [Updated] (HADOOP-10134) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2013-11-29 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-10134:


Status: Open  (was: Patch Available)

> [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments 
> --
>
> Key: HADOOP-10134
> URL: https://issues.apache.org/jira/browse/HADOOP-10134
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: 10134-branch-2.patch, 10134-trunk.patch, 
> 10134-trunk.patch
>
>
> Javadoc is more strict by default in JDK8 and will error out on malformed or 
> illegal tags found in doc comments. Although tagged as JDK8 all of the 
> required changes are generic Javadoc cleanups.





[jira] [Updated] (HADOOP-10134) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2013-11-29 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-10134:


Status: Patch Available  (was: Open)

> [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments 
> --
>
> Key: HADOOP-10134
> URL: https://issues.apache.org/jira/browse/HADOOP-10134
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: 10134-branch-2.patch, 10134-trunk.patch, 
> 10134-trunk.patch
>
>
> Javadoc is more strict by default in JDK8 and will error out on malformed or 
> illegal tags found in doc comments. Although tagged as JDK8 all of the 
> required changes are generic Javadoc cleanups.





[jira] [Updated] (HADOOP-10134) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2013-11-29 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-10134:


Attachment: 10134-trunk.patch

Reattach and resubmit trunk patch

> [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments 
> --
>
> Key: HADOOP-10134
> URL: https://issues.apache.org/jira/browse/HADOOP-10134
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: 10134-branch-2.patch, 10134-trunk.patch, 
> 10134-trunk.patch
>
>
> Javadoc is more strict by default in JDK8 and will error out on malformed or 
> illegal tags found in doc comments. Although tagged as JDK8 all of the 
> required changes are generic Javadoc cleanups.





[jira] [Created] (HADOOP-10134) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2013-11-27 Thread Andrew Purtell (JIRA)
Andrew Purtell created HADOOP-10134:
---

 Summary: [JDK8] Fix Javadoc errors caused by incorrect or illegal 
tags in doc comments 
 Key: HADOOP-10134
 URL: https://issues.apache.org/jira/browse/HADOOP-10134
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.3.0
Reporter: Andrew Purtell
Priority: Minor
 Attachments: 10134-branch-2.patch, 10134-trunk.patch

Javadoc is more strict by default in JDK8 and will error out on malformed or 
illegal tags found in doc comments. Although tagged as JDK8 all of the required 
changes are generic Javadoc cleanups.





[jira] [Updated] (HADOOP-10134) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2013-11-27 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-10134:


Attachment: 10134-branch-2.patch
10134-trunk.patch

Attaching common changes here, will open HDFS, MAPREDUCE, and YARN issues 
elsewhere.

> [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments 
> --
>
> Key: HADOOP-10134
> URL: https://issues.apache.org/jira/browse/HADOOP-10134
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Andrew Purtell
>Priority: Minor
> Attachments: 10134-branch-2.patch, 10134-trunk.patch
>
>
> Javadoc is more strict by default in JDK8 and will error out on malformed or 
> illegal tags found in doc comments. Although tagged as JDK8 all of the 
> required changes are generic Javadoc cleanups.





[jira] [Commented] (HADOOP-9331) Hadoop crypto codec framework and crypto codec implementations

2013-09-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758681#comment-13758681
 ] 

Andrew Purtell commented on HADOOP-9331:


Would it be possible for a Hadoop committer to comment on the viability of this 
issue and related patches? 

There are HBASE-7544 and HIVE-4227/HIVE-5207 either depending on this framework 
or intent to that effect stated on the respective issues.

In this framework, crypto codec implementations can be implemented and 
optimized in Hadoop core instead of the JRE. This is a likely long term benefit 
because JRE crypto codecs must be signed with a code signing certificate 
obtained under restrictive terms that must be controlled, but Hadoop crypto 
codecs developed for this framework would not have this impediment.

Without a version of Hadoop containing this framework to target, upstream users 
may be forced to seek alternative (and suboptimal, for the reason given above) 
implementation options. Or we could see overlapping or competing frameworks 
that would, in any case, lead to wasted effort and additional rationalization 
work later. See 
https://issues.apache.org/jira/browse/HBASE-7544?focusedCommentId=13710611&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13710611
 for an example.

> Hadoop crypto codec framework and crypto codec implementations
> --
>
> Key: HADOOP-9331
> URL: https://issues.apache.org/jira/browse/HADOOP-9331
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Jerry Chen
> Attachments: Hadoop Crypto Design.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> For use cases that deal with sensitive data, we often need to encrypt data to 
> be stored safely at rest. Hadoop common provides a codec framework for 
> compression algorithms. We start here. However because encryption algorithms 
> require some additional configuration and methods for key management, we 
> introduce a crypto codec framework that builds on the compression codec 
> framework. It cleanly distinguishes crypto algorithms from compression 
> algorithms, but shares common interfaces between them where possible, and 
> also carries extended interfaces where necessary to satisfy those needs. We 
> also introduce a generic Key type, and supporting utility methods and 
> classes, as a necessary abstraction for dealing with both Java crypto keys 
> and PGP keys.
> The task for this feature breaks into two parts:
> 1. The crypto codec framework, based on the compression codec framework, 
> which can be shared by all crypto codec implementations.
> 2. The codec implementations such as AES and others.
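The proposed layering can be sketched as a codec interface plus a generic Key abstraction. All names below are illustrative guesses at the shape of the design, not the actual HADOOP-9331 API, and the XOR "cipher" exists only to exercise the interfaces:

```java
import java.util.Arrays;

// Hypothetical sketch of the HADOOP-9331 layering: a crypto codec interface
// shaped like the compression codec interface, plus a generic Key type.
public class CryptoCodecSketch {

    interface Key {
        byte[] material();
    }

    interface CryptoCodec {
        void setKey(Key key);
        byte[] encrypt(byte[] plain);
        byte[] decrypt(byte[] cipher);
    }

    // Toy XOR "cipher" purely to exercise the interface shape; NOT real crypto.
    static class XorCodec implements CryptoCodec {
        private byte[] k;

        public void setKey(Key key) {
            this.k = key.material();
        }

        public byte[] encrypt(byte[] p) {
            byte[] out = new byte[p.length];
            for (int i = 0; i < p.length; i++) {
                out[i] = (byte) (p[i] ^ k[i % k.length]);
            }
            return out;
        }

        public byte[] decrypt(byte[] c) {
            return encrypt(c); // XOR is symmetric
        }
    }

    // Encrypt then decrypt, and report whether the plaintext survives.
    static boolean roundTrips(String s) {
        CryptoCodec codec = new XorCodec();
        codec.setKey(() -> new byte[] {0x5A, 0x3C, 0x77});
        byte[] back = codec.decrypt(codec.encrypt(s.getBytes()));
        return Arrays.equals(back, s.getBytes());
    }

    public static void main(String[] args) {
        System.out.println(roundTrips("sensitive"));
    }
}
```

A real implementation under this framework would plug an AES codec behind the same interface, with key management handled through the Key abstraction.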

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9671) Improve Hadoop security - Use cases

2013-06-25 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-9671:
---

Description: (was: 
)
   Assignee: (was: Sanjay Radia)
Summary: Improve Hadoop security - Use cases  (was: Improve Hadoop 
security - master jira)

I was on the call. My recollection of the consensus is this is an issue to 
collect use cases, not serve as a "master" or umbrella. That said, since we 
have it now it's convenient enough. 

My understanding is the consensus was also that we are kicking off a community 
based effort to tackle these cross-cutting security concerns in a collaborative 
way, so it doesn't seem appropriate to assign the umbrella.

I've made minor updates here to reflect my understanding of the consensus.

> Improve Hadoop security - Use cases
> ---
>
> Key: HADOOP-9671
> URL: https://issues.apache.org/jira/browse/HADOOP-9671
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Sanjay Radia
>




[jira] [Updated] (HADOOP-9647) Hadoop streaming needs minor update for exception signature change

2013-06-14 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-9647:
---

Status: Patch Available  (was: Open)

> Hadoop streaming needs minor update for exception signature change
> --
>
> Key: HADOOP-9647
> URL: https://issues.apache.org/jira/browse/HADOOP-9647
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Andrew Purtell
> Attachments: 9647.patch
>
>
> Compilation failure:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile 
> (default-compile) on project hadoop-streaming: Compilation failure
> [ERROR] 
> /usr/src/Hadoop/hadoop-trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/PipeMapRed.java:[223,6]
>  exception java.lang.InterruptedException is never thrown in body of 
> corresponding try statement
> {noformat}
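The error is javac's reachability rule: a catch clause for a checked exception is illegal when nothing in the try body can throw it. A small standalone illustration (not the Hadoop source itself):

```java
public class CheckedCatchDemo {
    // javac rejects catching a checked exception the try body cannot throw:
    //   try { int x = 1; } catch (InterruptedException e) { }
    // fails with "exception java.lang.InterruptedException is never thrown
    // in body of corresponding try statement". The catch compiles only when
    // the body can actually throw it:
    static String run() {
        try {
            Thread.sleep(1); // declares `throws InterruptedException`
            return "slept";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore interrupt status
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

So when an upstream signature change means a call in the try body no longer throws the exception, the now-dead catch must be removed for the code to compile again.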



[jira] [Created] (HADOOP-9647) Hadoop streaming needs minor update for exception signature change

2013-06-14 Thread Andrew Purtell (JIRA)
Andrew Purtell created HADOOP-9647:
--

 Summary: Hadoop streaming needs minor update for exception 
signature change
 Key: HADOOP-9647
 URL: https://issues.apache.org/jira/browse/HADOOP-9647
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Purtell
 Attachments: 9647.patch

Compilation failure:

{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) 
on project hadoop-streaming: Compilation failure
[ERROR] 
/usr/src/Hadoop/hadoop-trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/PipeMapRed.java:[223,6]
 exception java.lang.InterruptedException is never thrown in body of 
corresponding try statement
{noformat}




[jira] [Updated] (HADOOP-9647) Hadoop streaming needs minor update for exception signature change

2013-06-14 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-9647:
---

Attachment: 9647.patch

Trivial fix.

> Hadoop streaming needs minor update for exception signature change
> --
>
> Key: HADOOP-9647
> URL: https://issues.apache.org/jira/browse/HADOOP-9647
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Andrew Purtell
> Attachments: 9647.patch
>
>
> Compilation failure:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile 
> (default-compile) on project hadoop-streaming: Compilation failure
> [ERROR] 
> /usr/src/Hadoop/hadoop-trunk/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming/PipeMapRed.java:[223,6]
>  exception java.lang.InterruptedException is never thrown in body of 
> corresponding try statement
> {noformat}



[jira] [Commented] (HADOOP-9392) Token based authentication and Single Sign On

2013-06-08 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678805#comment-13678805
 ] 

Andrew Purtell commented on HADOOP-9392:


bq. Currently, a room has been allocated on the 26th from 1:45 to 3:30 PT. 
Specific location will be available at the Summit and any changes in date or 
time will be announced publicly to the best of our abilities. In order to 
create a manageable agenda for this session, I'd like to schedule some prep 
meetings via meetup.com.

[~kevin.minder] Is there a link to that meetup group?

> Token based authentication and Single Sign On
> -
>
> Key: HADOOP-9392
> URL: https://issues.apache.org/jira/browse/HADOOP-9392
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: token-based-authn-plus-sso.pdf
>
>
> This is an umbrella entry for one of project Rhino’s topics; for details of 
> project Rhino, please refer to 
> https://github.com/intel-hadoop/project-rhino/. The major goal for this entry 
> as described in project Rhino was 
>  
> “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
> at the RPC layer, via SASL. However this does not provide valuable attributes 
> such as group membership, classification level, organizational identity, or 
> support for user defined attributes. Hadoop components must interrogate 
> external resources for discovering these attributes and at scale this is 
> problematic. There is also no consistent delegation model. HDFS has a simple 
> delegation capability, and only Oozie can take limited advantage of it. We 
> will implement a common token based authentication framework to decouple 
> internal user and service authentication from external mechanisms used to 
> support it (like Kerberos)”
>  
> We’d like to start our work from Hadoop-Common and try to provide common 
> facilities by extending the existing authentication framework to support:
> 1.Pluggable token provider interface 
> 2.Pluggable token verification protocol and interface
> 3.Security mechanism to distribute secrets in cluster nodes
> 4.Delegation model of user authentication



[jira] [Commented] (HADOOP-9392) Token based authentication and Single Sign On

2013-06-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674196#comment-13674196
 ] 

Andrew Purtell commented on HADOOP-9392:


When is the meetup?

> Token based authentication and Single Sign On
> -
>
> Key: HADOOP-9392
> URL: https://issues.apache.org/jira/browse/HADOOP-9392
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: token-based-authn-plus-sso.pdf
>
>
> This is an umbrella entry for one of project Rhino’s topics; for details of 
> project Rhino, please refer to 
> https://github.com/intel-hadoop/project-rhino/. The major goal for this entry 
> as described in project Rhino was 
>  
> “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
> at the RPC layer, via SASL. However this does not provide valuable attributes 
> such as group membership, classification level, organizational identity, or 
> support for user defined attributes. Hadoop components must interrogate 
> external resources for discovering these attributes and at scale this is 
> problematic. There is also no consistent delegation model. HDFS has a simple 
> delegation capability, and only Oozie can take limited advantage of it. We 
> will implement a common token based authentication framework to decouple 
> internal user and service authentication from external mechanisms used to 
> support it (like Kerberos)”
>  
> We’d like to start our work from Hadoop-Common and try to provide common 
> facilities by extending the existing authentication framework to support:
> 1.Pluggable token provider interface 
> 2.Pluggable token verification protocol and interface
> 3.Security mechanism to distribute secrets in cluster nodes
> 4.Delegation model of user authentication



[jira] [Commented] (HADOOP-9533) Hadoop SSO/Token Service

2013-05-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646978#comment-13646978
 ] 

Andrew Purtell commented on HADOOP-9533:


[~owen.omalley] If the goal of this issue is to provide a *centralized SSO 
server* for Hadoop, let's rename the JIRA and/or make it a subtask of 
HADOOP-9392 so as a result there are not two issues seemingly proposing the 
same high level goals. As you will note on HADOOP-9392, providing a common 
token based authentication framework to decouple internal user and service 
authentication from external mechanisms used to support it (like Kerberos) is 
already a part of the goal for HADOOP-9392. So, if this issue is only proposing 
a subset of that work, let's make that clear for contributors and the 
community. 

> Hadoop SSO/Token Service
> 
>
> Key: HADOOP-9533
> URL: https://issues.apache.org/jira/browse/HADOOP-9533
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Larry McCay
>
> This is an umbrella Jira filing to oversee a set of proposals for introducing 
> a new master service for Hadoop Single Sign On (HSSO).
> There is an increasing need for pluggable authentication providers that 
> authenticate both users and services as well as validate tokens in order to 
> federate identities authenticated by trusted IDPs. These IDPs may be deployed 
> within the enterprise or third-party IDPs that are external to the enterprise.
> These needs speak to a specific pain point: a narrow integration 
> path into the enterprise identity infrastructure. Kerberos is a fine solution 
> for those that already have it in place or are willing to adopt its use but 
> there remains a class of user that finds this unacceptable and needs to 
> integrate with a wider variety of identity management solutions.
> Another specific pain point is that of rolling and distributing keys. A 
> related and integral part of the HSSO server is a library called the Credential 
> Management Framework (CMF), which will be a common library for easing the 
> management of secrets, keys and credentials.
> Initially, the existing delegation, block access and job tokens will continue 
> to be utilized. There may be some changes required to leverage a PKI based 
> signature facility rather than shared secrets. This is a means to simplify 
> the solution for the pain point of distributing shared secrets.
> This project will primarily centralize the responsibility of authentication 
> and federation into a single service that is trusted across the Hadoop 
> cluster and optionally across multiple clusters. This greatly simplifies a 
> number of things in the Hadoop ecosystem:
> 1.a single token format that is used across all of Hadoop regardless of 
> authentication method
> 2.a single service to have pluggable providers instead of all services
> 3.a single token authority that would be trusted across the cluster/s and 
> through PKI encryption be able to easily issue cryptographically verifiable 
> tokens
> 4.automatic rolling of the token authority’s keys and publishing of the 
> public key for easy access by those parties that need to verify incoming 
> tokens
> 5.use of PKI for signatures eliminates the need for securely sharing and 
> distributing shared secrets
> In addition to serving as the internal Hadoop SSO service this service will 
> be leveraged by the Knox Gateway from the cluster perimeter in order to 
> acquire the Hadoop cluster tokens. The same token mechanism that is used for 
> internal services will be used to represent user identities. Providing for 
> interesting scenarios such as SSO across Hadoop clusters within an enterprise 
> and/or into the cloud.
> The HSSO service will be comprised of three major components and capabilities:
> 1.Federating IDP – authenticates users/services and issues the common 
> Hadoop token
> 2.Federating SP – validates the token of trusted external IDPs and issues 
> the common Hadoop token
> 3.Token Authority – management of the common Hadoop tokens – including: 
> a.Issuance 
> b.Renewal
> c.Revocation
> As this is a meta Jira for tracking this overall effort, the details of the 
> individual efforts will be submitted along with the child Jira filings.
> Hadoop-Common would seem to be the most appropriate home for such a service 
> and its related common facilities. We will also leverage and extend existing 
> common mechanisms as appropriate.



[jira] [Commented] (HADOOP-9533) Hadoop SSO/Token Service

2013-05-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646802#comment-13646802
 ] 

Andrew Purtell commented on HADOOP-9533:


Having a central master service for SSO is a design choice. HADOOP-9392 
proposes a pluggable design exactly because a central master service for SSO is 
not a solution for all environments. This JIRA is a nice clearly defined subset 
of the work for HADOOP-9392, however. Isn't this work appropriately a subtask 
of HADOOP-9392? I think you are describing it as such, please correct me if I 
am mistaken. The title of this JIRA and that of HADOOP-9392 are almost exactly 
the same, and largely the goals for this JIRA are already captured under 
HADOOP-9392 i.e. token based authentication and SSO. We should endeavor to 
resolve the duplication as shared community effort.

> Hadoop SSO/Token Service
> 
>
> Key: HADOOP-9533
> URL: https://issues.apache.org/jira/browse/HADOOP-9533
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Larry McCay
>
> This is an umbrella Jira filing to oversee a set of proposals for introducing 
> a new master service for Hadoop Single Sign On (HSSO).
> There is an increasing need for pluggable authentication providers that 
> authenticate both users and services as well as validate tokens in order to 
> federate identities authenticated by trusted IDPs. These IDPs may be deployed 
> within the enterprise or third-party IDPs that are external to the enterprise.
> These needs speak to a specific pain point: a narrow integration 
> path into the enterprise identity infrastructure. Kerberos is a fine solution 
> for those that already have it in place or are willing to adopt its use but 
> there remains a class of user that finds this unacceptable and needs to 
> integrate with a wider variety of identity management solutions.
> Another specific pain point is that of rolling and distributing keys. A 
> related and integral part of the HSSO server is a library called the Credential 
> Management Framework (CMF), which will be a common library for easing the 
> management of secrets, keys and credentials.
> Initially, the existing delegation, block access and job tokens will continue 
> to be utilized. There may be some changes required to leverage a PKI based 
> signature facility rather than shared secrets. This is a means to simplify 
> the solution for the pain point of distributing shared secrets.
> This project will primarily centralize the responsibility of authentication 
> and federation into a single service that is trusted across the Hadoop 
> cluster and optionally across multiple clusters. This greatly simplifies a 
> number of things in the Hadoop ecosystem:
> 1.a single token format that is used across all of Hadoop regardless of 
> authentication method
> 2.a single service to have pluggable providers instead of all services
> 3.a single token authority that would be trusted across the cluster/s and 
> through PKI encryption be able to easily issue cryptographically verifiable 
> tokens
> 4.automatic rolling of the token authority’s keys and publishing of the 
> public key for easy access by those parties that need to verify incoming 
> tokens
> 5.use of PKI for signatures eliminates the need for securely sharing and 
> distributing shared secrets
> In addition to serving as the internal Hadoop SSO service this service will 
> be leveraged by the Knox Gateway from the cluster perimeter in order to 
> acquire the Hadoop cluster tokens. The same token mechanism that is used for 
> internal services will be used to represent user identities. Providing for 
> interesting scenarios such as SSO across Hadoop clusters within an enterprise 
> and/or into the cloud.
> The HSSO service will be comprised of three major components and capabilities:
> 1.Federating IDP – authenticates users/services and issues the common 
> Hadoop token
> 2.Federating SP – validates the token of trusted external IDPs and issues 
> the common Hadoop token
> 3.Token Authority – management of the common Hadoop tokens – including: 
> a.Issuance 
> b.Renewal
> c.Revocation
> As this is a meta Jira for tracking this overall effort, the details of the 
> individual efforts will be submitted along with the child Jira filings.
> Hadoop-Common would seem to be the most appropriate home for such a service 
> and its related common facilities. We will also leverage and extend existing 
> common mechanisms as appropriate.



[jira] [Commented] (HADOOP-9533) Hadoop SSO/Token Service

2013-05-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646763#comment-13646763
 ] 

Andrew Purtell commented on HADOOP-9533:


Does this not duplicate HADOOP-9392 exactly? Maybe you have not seen that one 
yet? So we will have one SSO coming out of HADOOP-9392 and another coming out of 
Knox via this JIRA? It might be good to have two competing alternatives (or 
more), but I wonder if there is a way to do this together on HADOOP-9392, since 
we are clearly working on exactly the same objective.

> Hadoop SSO/Token Service
> 
>
> Key: HADOOP-9533
> URL: https://issues.apache.org/jira/browse/HADOOP-9533
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Larry McCay
>
> This is an umbrella Jira filing to oversee a set of proposals for introducing 
> a new master service for Hadoop Single Sign On (HSSO).
> There is an increasing need for pluggable authentication providers that 
> authenticate both users and services as well as validate tokens in order to 
> federate identities authenticated by trusted IDPs. These IDPs may be deployed 
> within the enterprise or third-party IDPs that are external to the enterprise.
> These needs speak to a specific pain point: which is a narrow integration 
> path into the enterprise identity infrastructure. Kerberos is a fine solution 
> for those that already have it in place or are willing to adopt its use but 
> there remains a class of user that finds this unacceptable and needs to 
> integrate with a wider variety of identity management solutions.
> Another specific pain point is that of rolling and distributing keys. A 
> related and integral part of the HSSO server is library called the Credential 
> Management Framework (CMF), which will be a common library for easing the 
> management of secrets, keys and credentials.
> Initially, the existing delegation, block access and job tokens will continue 
> to be utilized. There may be some changes required to leverage a PKI based 
> signature facility rather than shared secrets. This is a means to simplify 
> the solution for the pain point of distributing shared secrets.
> This project will primarily centralize the responsibility of authentication 
> and federation into a single service that is trusted across the Hadoop 
> cluster and optionally across multiple clusters. This greatly simplifies a 
> number of things in the Hadoop ecosystem:
> 1.a single token format that is used across all of Hadoop regardless of 
> authentication method
> 2.a single service to have pluggable providers instead of all services
> 3.a single token authority that would be trusted across the cluster/s and 
> through PKI encryption be able to easily issue cryptographically verifiable 
> tokens
> 4.automatic rolling of the token authority’s keys and publishing of the 
> public key for easy access by those parties that need to verify incoming 
> tokens
> 5.use of PKI for signatures eliminates the need for securely sharing and 
> distributing shared secrets
> In addition to serving as the internal Hadoop SSO service this service will 
> be leveraged by the Knox Gateway from the cluster perimeter in order to 
> acquire the Hadoop cluster tokens. The same token mechanism that is used for 
> internal services will be used to represent user identities. Providing for 
> interesting scenarios such as SSO across Hadoop clusters within an enterprise 
> and/or into the cloud.
> The HSSO service will be comprised of three major components and capabilities:
> 1.Federating IDP – authenticates users/services and issues the common 
> Hadoop token
> 2.Federating SP – validates the token of trusted external IDPs and issues 
> the common Hadoop token
> 3.Token Authority – management of the common Hadoop tokens – including: 
> a.Issuance 
> b.Renewal
> c.Revocation
> As this is a meta Jira for tracking this overall effort, the details of the 
> individual efforts will be submitted along with the child Jira filings.
> Hadoop-Common would seem to be the most appropriate home for such a service 
> and its related common facilities. We will also leverage and extend existing 
> common mechanisms as appropriate.
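The PKI-signed token idea in points 3-5 above can be illustrated with the standard `java.security` API: the token authority signs a token with its private key, and any service holding only the published public key can verify it, with no shared secret to distribute. This is an illustrative sketch under assumed names (the token payload and class are hypothetical), not the HSSO implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class TokenSketch {
  public static void main(String[] args) throws Exception {
    // Token authority's key pair; only the public key is published.
    KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
    gen.initialize(2048);
    KeyPair authorityKeys = gen.generateKeyPair();

    // Hypothetical token payload, not the actual Hadoop token format.
    byte[] token = "user=alice,expiry=1700000000".getBytes(StandardCharsets.UTF_8);

    // Token authority: sign with the private key.
    Signature signer = Signature.getInstance("SHA256withRSA");
    signer.initSign(authorityKeys.getPrivate());
    signer.update(token);
    byte[] sig = signer.sign();

    // Any service: verify with the published public key only.
    Signature verifier = Signature.getInstance("SHA256withRSA");
    verifier.initVerify(authorityKeys.getPublic());
    verifier.update(token);
    System.out.println(verifier.verify(sig)); // true for an untampered token
  }
}
```

Because verification needs only the public key, rolling the authority's keys reduces to republishing one key rather than redistributing shared secrets to every service.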



[jira] [Commented] (HADOOP-9454) Support multipart uploads for s3native

2013-04-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622613#comment-13622613
 ] 

Andrew Purtell commented on HADOOP-9454:


+1. Upgrading jets3t is good for several reasons, and this provides a nice 
benefit.

> Support multipart uploads for s3native
> --
>
> Key: HADOOP-9454
> URL: https://issues.apache.org/jira/browse/HADOOP-9454
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Jordan Mendelson
> Attachments: HADOOP-9454-2.patch
>
>
> The s3native filesystem is limited to 5 GB file uploads to S3, however the 
> newest version of jets3t supports multipart uploads to allow storing multi-TB 
> files. While the s3 filesystem lets you bypass this restriction by uploading 
> blocks, it is necessary for us to output our data into Amazon's 
> publicdatasets bucket which is shared with others.
> Amazon has added a similar feature to their distribution of hadoop as has 
> MapR.



[jira] [Commented] (HADOOP-9448) Reimplement things

2013-04-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619216#comment-13619216
 ] 

Andrew Purtell commented on HADOOP-9448:


bq. Can we consider C# and Mono for this reimplementation.

I think it must be Haskell. Then the solution for application problems will 
present at compile time.

> Reimplement things
> --
>
> Key: HADOOP-9448
> URL: https://issues.apache.org/jira/browse/HADOOP-9448
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.4-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
>Priority: Blocker
> Attachments: remove-trunk.patch
>
>
> We've got to the point we need to reimplement things from scratch.



[jira] [Commented] (HADOOP-9448) Reimplement things

2013-04-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619198#comment-13619198
 ] 

Andrew Purtell commented on HADOOP-9448:


Instead of reimplementing from a blank slate, I propose we split each Maven 
module into its own TLP. We can simply svn cp everything from trunk into new 
repositories for each new TLP and let them figure out what to keep and what to 
drop from their respective project.

> Reimplement things
> --
>
> Key: HADOOP-9448
> URL: https://issues.apache.org/jira/browse/HADOOP-9448
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.4-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
>Priority: Blocker
> Attachments: remove-trunk.patch
>
>
> We've got to the point we need to reimplement things from scratch.



[jira] [Commented] (HADOOP-9282) Java 7 support

2013-02-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571565#comment-13571565
 ] 

Andrew Purtell commented on HADOOP-9282:


Testing yes, but aren't many tests failing?

> Java 7 support
> --
>
> Key: HADOOP-9282
> URL: https://issues.apache.org/jira/browse/HADOOP-9282
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Reporter: Kevin Lyda
>
> The Hadoop Java Versions page makes no mention of Java 7.
> http://wiki.apache.org/hadoop/HadoopJavaVersions
> Java 6 is EOL as of this month ( 
> http://www.java.com/en/download/faq/java_6.xml ) and that's after extending 
> the date twice: https://blogs.oracle.com/henrik/entry/java_6_eol_h_h While 
> Oracle has recently released a number of security patches, chances are more 
> security issues will come up and we'll be left running clusters we can't 
> patch if we stay with Java 6.
> Does Hadoop support Java 7 and if so could the docs be changed to indicate 
> that?



[jira] [Commented] (HADOOP-9171) Release resources of unpoolable Decompressors

2013-01-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552347#comment-13552347
 ] 

Andrew Purtell commented on HADOOP-9171:


Calling end() potentially more than once on a zlib compressor or decompressor 
is safe. For Snappy and LZ4 it's a no-op. The dummy decompressor for BZIP2 will 
throw an UnsupportedOperationException. Catch that just in case?
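The defensive release suggested here could look like the following sketch. The `Releasable` interface is a stand-in for the real Hadoop `Decompressor` API rather than a reproduction of it; the point is the try/catch around `end()`.

```java
// Sketch: release a (de)compressor being dropped from a pool, tolerating
// implementations (e.g. the BZip2 dummy decompressor) that throw
// UnsupportedOperationException from end().
interface Releasable {
  void end(); // releases native resources; may be unsupported
}

public class PoolRelease {
  static void safeEnd(Releasable r) {
    try {
      // Safe to call more than once for zlib; a no-op for Snappy and LZ4.
      r.end();
    } catch (UnsupportedOperationException e) {
      // e.g. BZip2's dummy decompressor: nothing to release.
    }
  }

  public static void main(String[] args) {
    Releasable zlibLike = () -> System.out.println("resources released");
    Releasable bzip2Dummy = () -> { throw new UnsupportedOperationException(); };
    safeEnd(zlibLike);   // releases normally
    safeEnd(bzip2Dummy); // exception swallowed, release path continues
    System.out.println("done");
  }
}
```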

> Release resources of unpoolable Decompressors
> -
>
> Key: HADOOP-9171
> URL: https://issues.apache.org/jira/browse/HADOOP-9171
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-4345.txt
>
>
> Found this when looking into HBASE-7435.
> When a Decompressor is returned to the pool in CodecPool.java, we should 
> probably call end() on it to release its resources.



[jira] [Updated] (HADOOP-9203) RPCCallBenchmark should find a random available port

2013-01-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-9203:
---

Status: Patch Available  (was: Open)

> RPCCallBenchmark should find a random available port
> 
>
> Key: HADOOP-9203
> URL: https://issues.apache.org/jira/browse/HADOOP-9203
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc, test
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Andrew Purtell
>Priority: Trivial
> Attachments: HADOOP-9203.patch, HADOOP-9203.patch
>
>
> RPCCallBenchmark insists on port 12345 by default. It should find a random 
> ephemeral range port instead if one isn't specified.
> {noformat}
> testBenchmarkWithProto(org.apache.hadoop.ipc.TestRPCCallBenchmark)  Time 
> elapsed: 5092 sec  <<< ERROR!
> java.net.BindException: Problem binding to [0.0.0.0:12345] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:710)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:361)
>   at org.apache.hadoop.ipc.Server$Listener.(Server.java:459)
>   at org.apache.hadoop.ipc.Server.(Server.java:1877)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:982)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:376)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:351)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:825)
>   at 
> org.apache.hadoop.ipc.RPCCallBenchmark.startServer(RPCCallBenchmark.java:230)
>   at org.apache.hadoop.ipc.RPCCallBenchmark.run(RPCCallBenchmark.java:264)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at 
> org.apache.hadoop.ipc.TestRPCCallBenchmark.testBenchmarkWithProto(TestRPCCallBenchmark.java:43)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
> {noformat}



[jira] [Updated] (HADOOP-9203) RPCCallBenchmark should find a random available port

2013-01-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-9203:
---

Status: Open  (was: Patch Available)

> RPCCallBenchmark should find a random available port
> 
>
> Key: HADOOP-9203
> URL: https://issues.apache.org/jira/browse/HADOOP-9203
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc, test
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Andrew Purtell
>Priority: Trivial
> Attachments: HADOOP-9203.patch, HADOOP-9203.patch
>
>
> RPCCallBenchmark insists on port 12345 by default. It should find a random 
> ephemeral range port instead if one isn't specified.



[jira] [Updated] (HADOOP-9203) RPCCallBenchmark should find a random available port

2013-01-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-9203:
---

Attachment: HADOOP-9203.patch

Suresh, I made the changes you suggested and opened HDFS-4392.

> RPCCallBenchmark should find a random available port
> 
>
> Key: HADOOP-9203
> URL: https://issues.apache.org/jira/browse/HADOOP-9203
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc, test
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Andrew Purtell
>Priority: Trivial
> Attachments: HADOOP-9203.patch, HADOOP-9203.patch
>
>
> RPCCallBenchmark insists on port 12345 by default. It should find a random 
> ephemeral range port instead if one isn't specified.



[jira] [Updated] (HADOOP-9203) RPCCallBenchmark should find a random available port

2013-01-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-9203:
---

Attachment: HADOOP-9203.patch

The attached patch allocates a free ephemeral port if one wasn't specified.

> RPCCallBenchmark should find a random available port
> 
>
> Key: HADOOP-9203
> URL: https://issues.apache.org/jira/browse/HADOOP-9203
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc, test
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Andrew Purtell
>Priority: Trivial
> Attachments: HADOOP-9203.patch
>
>
> RPCCallBenchmark insists on port 12345 by default. It should find a random 
> ephemeral range port instead if one isn't specified.



[jira] [Created] (HADOOP-9203) RPCCallBenchmark should find a random available port

2013-01-12 Thread Andrew Purtell (JIRA)
Andrew Purtell created HADOOP-9203:
--

 Summary: RPCCallBenchmark should find a random available port
 Key: HADOOP-9203
 URL: https://issues.apache.org/jira/browse/HADOOP-9203
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc, test
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Andrew Purtell
Priority: Trivial


RPCCallBenchmark insists on port 12345 by default. It should find a random 
ephemeral range port instead if one isn't specified.

{noformat}
testBenchmarkWithProto(org.apache.hadoop.ipc.TestRPCCallBenchmark)  Time 
elapsed: 5092 sec  <<< ERROR!
java.net.BindException: Problem binding to [0.0.0.0:12345] 
java.net.BindException: Address already in use; For more details see:  
http://wiki.apache.org/hadoop/BindException
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:710)
at org.apache.hadoop.ipc.Server.bind(Server.java:361)
at org.apache.hadoop.ipc.Server$Listener.(Server.java:459)
at org.apache.hadoop.ipc.Server.(Server.java:1877)
at org.apache.hadoop.ipc.RPC$Server.(RPC.java:982)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:376)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:351)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:825)
at 
org.apache.hadoop.ipc.RPCCallBenchmark.startServer(RPCCallBenchmark.java:230)
at org.apache.hadoop.ipc.RPCCallBenchmark.run(RPCCallBenchmark.java:264)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at 
org.apache.hadoop.ipc.TestRPCCallBenchmark.testBenchmarkWithProto(TestRPCCallBenchmark.java:43)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
{noformat}
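The fix described above can be sketched with the standard trick of binding a `ServerSocket` to port 0, which asks the OS for a free ephemeral port. The helper name below is illustrative, not the actual patch.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class FreePort {
  // Bind to port 0 so the OS assigns an unused ephemeral port, then
  // release the socket and return the port number for the server to use.
  static int findFreePort() throws IOException {
    try (ServerSocket s = new ServerSocket(0)) {
      return s.getLocalPort();
    }
  }

  public static void main(String[] args) throws IOException {
    int port = findFreePort();
    // A benchmark would only do this when no port was specified by the user.
    System.out.println(port > 0 && port <= 65535);
  }
}
```

There is a small race between closing the probe socket and the server binding the port, but for a test default it is far more robust than a fixed port like 12345.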




[jira] [Commented] (HADOOP-8607) Replace references to "Dr Who" in codebase with @BigDataBorat

2012-07-19 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418684#comment-13418684
 ] 

Andrew Purtell commented on HADOOP-8607:


bq. Shall I create JIRA release versions 0.15.4, 0.16.5, 0.18.4 and 0.19.2 for 
this patch?

That won't be necessary until Hadoop QA succeeds.

bq. Also: who is still running 0.15 that can verify the fix took? 

I believe this information is classified.

> Replace references to "Dr Who" in codebase with @BigDataBorat
> -
>
> Key: HADOOP-8607
> URL: https://issues.apache.org/jira/browse/HADOOP-8607
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: util
>Affects Versions: 0.15.3, 0.16.4, 0.18.3, 0.19.1, 1.0.3, 2.0.0-alpha
>Reporter: Steve Loughran
>Assignee: Sanjay Radia
>Priority: Minor
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> People complain that having "Dr Who" in the code causes confusion and isn't 
> appropriate in Hadoop now that it has matured.
> I propose that we replace this anonymous user ID with {{@BigDataBorat}}. This 
> will
> # Increase brand awareness of @BigDataBorat and their central role in the Big 
> Data ecosystem.
> # Drive traffic to twitter, and increase their revenue. As contributors to 
> the Hadoop platform, this will fund further Hadoop development.
> Patching the code is straightforward; no easy tests, though we could monitor 
> twitter followers to determine rollout of the patch in the field.





[jira] [Commented] (HADOOP-8607) Replace references to "Dr Who" in codebase with @BigDataBorat

2012-07-19 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418469#comment-13418469
 ] 

Andrew Purtell commented on HADOOP-8607:


We'd like to backport this to all known Hadoop versions in production, which is 
back to 0.15. Shall we attach as separate patches? 

> Replace references to "Dr Who" in codebase with @BigDataBorat
> -
>
> Key: HADOOP-8607
> URL: https://issues.apache.org/jira/browse/HADOOP-8607
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: util
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Steve Loughran
>Assignee: Sanjay Radia
>Priority: Minor
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> People complain that having "Dr Who" in the code causes confusion and isn't 
> appropriate in Hadoop now that it has matured.
> I propose that we replace this anonymous user ID with {{@BigDataBorat}}. This 
> will
> # Increase brand awareness of @BigDataBorat and their central role in the Big 
> Data ecosystem.
> # Drive traffic to twitter, and increase their revenue. As contributors to 
> the Hadoop platform, this will fund further Hadoop development.
> Patching the code is straightforward; no easy tests, though we could monitor 
> twitter followers to determine rollout of the patch in the field.





[jira] [Assigned] (HADOOP-7823) port HADOOP-4012 to branch-1 (splitting support for bzip2)

2012-07-16 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HADOOP-7823:
--

Assignee: (was: Andrew Purtell)

> port HADOOP-4012 to branch-1 (splitting support for bzip2)
> --
>
> Key: HADOOP-7823
> URL: https://issues.apache.org/jira/browse/HADOOP-7823
> Project: Hadoop Common
>  Issue Type: New Feature
>Affects Versions: 0.20.205.0
>Reporter: Tim Broberg
> Attachments: HADOOP-7823-branch-1-v2.patch, 
> HADOOP-7823-branch-1-v3.patch, HADOOP-7823-branch-1-v3.patch, 
> HADOOP-7823-branch-1-v4.patch, HADOOP-7823-branch-1.patch
>
>
> Please see HADOOP-4012 - Providing splitting support for bzip2 compressed 
> files.





[jira] [Updated] (HADOOP-7823) port HADOOP-4012 to branch-1

2012-06-22 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HADOOP-7823:
---

Attachment: HADOOP-7823-branch-1-v4.patch

v4 patch includes HADOOP-6925 fix to BZip2Codec. Modified TestCodec passes 
locally.

> port HADOOP-4012 to branch-1
> 
>
> Key: HADOOP-7823
> URL: https://issues.apache.org/jira/browse/HADOOP-7823
> Project: Hadoop Common
>  Issue Type: New Feature
>Affects Versions: 0.20.205.0
>Reporter: Tim Broberg
>Assignee: Andrew Purtell
> Attachments: HADOOP-7823-branch-1-v2.patch, 
> HADOOP-7823-branch-1-v3.patch, HADOOP-7823-branch-1-v3.patch, 
> HADOOP-7823-branch-1-v4.patch, HADOOP-7823-branch-1.patch
>
>
> Please see HADOOP-4012 - Providing splitting support for bzip2 compressed 
> files.





[jira] [Commented] (HADOOP-8368) Use CMake rather than autotools to build native code

2012-06-06 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1328#comment-1328
 ] 

Andrew Purtell commented on HADOOP-8368:


The last comment on INFRA-4881 indicates the Jenkins slaves for Hadoop are 
controlled internally by Yahoo? If so, it would seem cmake has not been 
installed yet.



> Use CMake rather than autotools to build native code
> 
>
> Key: HADOOP-8368
> URL: https://issues.apache.org/jira/browse/HADOOP-8368
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.0.1-alpha
>
> Attachments: HADOOP-8368-b2.001.patch, HADOOP-8368-b2.001.rm.patch, 
> HADOOP-8368-b2.001.trimmed.patch, HADOOP-8368-b2.002.rm.patch, 
> HADOOP-8368-b2.002.trimmed.patch, HADOOP-8368.001.patch, 
> HADOOP-8368.005.patch, HADOOP-8368.006.patch, HADOOP-8368.007.patch, 
> HADOOP-8368.008.patch, HADOOP-8368.009.patch, HADOOP-8368.010.patch, 
> HADOOP-8368.012.half.patch, HADOOP-8368.012.patch, HADOOP-8368.012.rm.patch, 
> HADOOP-8368.014.trimmed.patch, HADOOP-8368.015.trimmed.patch, 
> HADOOP-8368.016.trimmed.patch, HADOOP-8368.018.trimmed.patch, 
> HADOOP-8368.020.rm.patch, HADOOP-8368.020.trimmed.patch, 
> HADOOP-8368.021.trimmed.patch, HADOOP-8368.023.trimmed.patch, 
> HADOOP-8368.024.trimmed.patch, HADOOP-8368.025.trimmed.patch, 
> HADOOP-8368.026.rm.patch, HADOOP-8368.026.trimmed.patch
>
>
> It would be good to use cmake rather than autotools to build the native 
> (C/C++) code in Hadoop.
> Rationale:
> 1. automake depends on shell scripts, which often have problems running on 
> different operating systems.  It would be extremely difficult, and perhaps 
> impossible, to use autotools under Windows.  Even if it were possible, it 
> might require horrible workarounds like installing cygwin.  Even on Linux 
> variants like Ubuntu 12.04, there are major build issues because /bin/sh is 
> the Dash shell, rather than the Bash shell as it is in other Linux versions.  
> It is currently impossible to build the native code under Ubuntu 12.04 
> because of this problem.
> CMake has robust cross-platform support, including Windows.  It does not use 
> shell scripts.
> 2. automake error messages are very confusing.  For example, "autoreconf: 
> cannot empty /tmp/ar0.4849: Is a directory" or "Can't locate object method 
> "path" via package "Autom4te..." are common error messages.  In order to even 
> start debugging automake problems you need to learn shell, m4, sed, and a 
> bunch of other things.  With CMake, all you have to learn is the syntax of 
> CMakeLists.txt, which is simple.
> CMake can do all the stuff autotools can, such as making sure that required 
> libraries are installed.  There is a Maven plugin for CMake as well.
> 3. Different versions of autotools can have very different behaviors.  For 
> example, the version installed under openSUSE defaults to putting libraries 
> in /usr/local/lib64, whereas the version shipped with Ubuntu 11.04 defaults 
> to installing the same libraries under /usr/local/lib.  (This is why the FUSE 
> build is currently broken when using OpenSUSE.)  This is another source of 
> build failures and complexity.  If things go wrong, you will often get an 
> error message which is incomprehensible to normal humans (see point #2).
> CMake allows you to specify the minimum_required_version of CMake that a 
> particular CMakeLists.txt will accept.  In addition, CMake maintains strict 
> backwards compatibility between different versions.  This prevents build bugs 
> due to version skew.
> 4. autoconf, automake, and libtool are large and rather slow.  This adds to 
> build time.
> For all these reasons, I think we should switch to CMake for compiling native 
> (C/C++) code in Hadoop.
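As an illustration of the points above about minimum_required_version and dependency checks, a minimal CMakeLists.txt for a native library might look like the sketch below (the project, target, and source names are hypothetical, not Hadoop's actual build files):

```cmake
# Declare the minimum CMake version this file was written against;
# newer CMake versions maintain strict backward compatibility with it,
# which prevents the version-skew bugs described above.
cmake_minimum_required(VERSION 2.8)
project(hadoop-native C)

# CMake can verify that required libraries are installed,
# much like an autoconf AC_CHECK_LIB, but portably.
find_package(ZLIB REQUIRED)

include_directories(${ZLIB_INCLUDE_DIRS})

# Build a shared native library from a (hypothetical) source file.
add_library(hadoopnative SHARED src/native_io.c)
target_link_libraries(hadoopnative ${ZLIB_LIBRARIES})
```

Unlike an autotools setup, this single file is the entire build description: no configure.ac, no Makefile.am, no generated shell scripts.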





[jira] [Commented] (HADOOP-7823) port HADOOP-4012 to branch-1

2012-05-28 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284604#comment-13284604
 ] 

Andrew Purtell commented on HADOOP-7823:


bq. The rest of the code looks familiar (where did the NLineInputFormat change 
come from?). 

I also did manual code inspection of 0.23 and followed the JIRA tickets 
referenced by commenters on this issue.

Will put up a v4 shortly that includes HADOOP-6925.

> port HADOOP-4012 to branch-1
> 
>
> Key: HADOOP-7823
> URL: https://issues.apache.org/jira/browse/HADOOP-7823
> Project: Hadoop Common
>  Issue Type: New Feature
>Affects Versions: 0.20.205.0
>Reporter: Tim Broberg
>Assignee: Andrew Purtell
> Attachments: HADOOP-7823-branch-1-v2.patch, 
> HADOOP-7823-branch-1-v3.patch, HADOOP-7823-branch-1-v3.patch, 
> HADOOP-7823-branch-1.patch
>
>
> Please see HADOOP-4012 - Providing splitting support for bzip2 compressed 
> files.





[jira] [Commented] (HADOOP-7550) Need for Integrity Validation of RPC

2011-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087372#comment-13087372
 ] 

Andrew Purtell commented on HADOOP-7550:


bq. Of course, this only addresses SASL RPCs.

Right, and I expect a significant performance impact, as you say.

> Need for Integrity Validation of RPC
> 
>
> Key: HADOOP-7550
> URL: https://issues.apache.org/jira/browse/HADOOP-7550
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Dave Thompson
>Assignee: Dave Thompson
>
> Some recent investigation of network packet corruption has shown a need for 
> hadoop RPC integrity validation beyond the assurances already provided by the 
> 802.3 link-layer CRC and TCP's 16-bit checksum.
> During an unusual occurrence on a 4k node cluster, we've seen as many as 4 
> TCP anomalies per second on a single node, sustained over an hour (14k per 
> hour). A TCP anomaly would be an escaped link-layer packet that resulted in 
> a TCP checksum failure, a TCP packet out of sequence, or a TCP packet size 
> error.
> According to this paper[*]:  http://tinyurl.com/3aue72r
> TCP's 16-bit checksum has an effective detection rate of about 2^10: roughly 
> 1 in 1024 errors may escape detection. In fact, what originally alerted us 
> to this issue was seeing failures due to bit errors in hadoop traffic. 
> Extrapolating from that paper, one might expect 14 escaped packet errors per 
> hour for that single node of a 4k cluster. While the above error rate was 
> unusually high due to an aggregation switch issue, hadoop not having an 
> integrity check on RPC makes it problematic to discover, and limit, any 
> potential data damage from acting on a corrupt RPC message.
> --
> [*] In case this jira outlives that tinyurl, the IEEE paper cited is:  
> "Performance of Checksums and CRCs over Real Data" by Jonathan Stone, Michael 
> Greenwald, Craig Partridge, Jim Hughes.
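To illustrate the kind of application-level integrity check being discussed, here is a sketch (not Hadoop's actual RPC code; the message content is made up) that appends a CRC32 over a message payload and verifies it on receipt. A 32-bit CRC drives the undetected-error rate far below what the 16-bit TCP checksum provides:

```java
import java.util.zip.CRC32;

public class RpcChecksumDemo {

    // Sender side: compute a CRC32 over the serialized RPC payload.
    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Receiver side: recompute and compare against the transmitted value.
    static boolean verify(byte[] payload, long transmitted) {
        return checksum(payload) == transmitted;
    }

    public static void main(String[] args) {
        byte[] msg = "getBlockLocations /user/foo/part-0".getBytes();
        long crc = checksum(msg);

        // Flip a single bit, as a faulty switch might.
        byte[] corrupt = msg.clone();
        corrupt[5] ^= 0x01;

        System.out.println(verify(msg, crc));      // true
        System.out.println(verify(corrupt, crc));  // false: corruption caught
    }
}
```

A CRC32 is guaranteed to catch any single-bit flip, so the corrupted message above is always rejected rather than acted upon.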





[jira] [Commented] (HADOOP-7550) Need for Integrity Validation of RPC

2011-08-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086702#comment-13086702
 ] 

Andrew Purtell commented on HADOOP-7550:


A SASL QoP of "auth-int" or "auth-conf" provides this, right?
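For reference, the QoP is requested through the standard javax.security.sasl property map; a minimal sketch of how a client would ask for integrity protection (the map would then be passed to Sasl.createSaslClient, not shown here):

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

public class SaslQopDemo {
    public static void main(String[] args) {
        // "auth-int" requests authentication plus per-message integrity;
        // "auth-conf" additionally requests confidentiality (encryption).
        Map<String, String> props = new HashMap<String, String>();
        props.put(Sasl.QOP, "auth-int");

        // Sasl.QOP is the standard property key "javax.security.sasl.qop".
        System.out.println(Sasl.QOP + " = " + props.get(Sasl.QOP));
    }
}
```

With auth-int negotiated, each SASL-wrapped RPC message carries an integrity MAC, which is what would detect the kind of corruption described in this issue.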

> Need for Integrity Validation of RPC
> 
>
> Key: HADOOP-7550
> URL: https://issues.apache.org/jira/browse/HADOOP-7550
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Dave Thompson
>Assignee: Dave Thompson
>
