[jira] [Assigned] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-27 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HDFS-16917:


Assignee: Ravindra Dingankar

> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Assignee: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.0, 3.4.0
>
>
> Currently we have the following metrics for datanode reads.
> ||Metric||Description||
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operation|
> We would like to add a new quantile metric calculating the transfer rate for 
> datanode reads.
> This will give us a distribution across a window of the read transfer rate 
> for each datanode.
> Quantiles for transfer rate per host will help in identifying issues like 
> hotspotting of datasets as well as finding repetitive slow datanodes.
>  
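For illustration only, here is a sketch of how such a per-read transfer rate could be computed and fed into a quantile metric, assuming the metrics2 MutableQuantiles API; the metric name and the 60-second window are assumptions, not taken from the patch.
{code:java}
import org.apache.hadoop.metrics2.lib.MetricsRegistry;
import org.apache.hadoop.metrics2.lib.MutableQuantiles;

public class ReadTransferRateSketch {
  private final MetricsRegistry registry = new MetricsRegistry("datanode");
  // Quantiles are computed over a rolling 60 second window (illustrative).
  private final MutableQuantiles readTransferRate =
      registry.newQuantiles("readTransferRate60s",
          "Read transfer rate in bytes per second", "ops", "rate", 60);

  /** Record one block read: bytes sent and the time it took in milliseconds. */
  public void recordRead(long bytesRead, long durationMs) {
    if (durationMs <= 0 || bytesRead < 0) {
      return; // guard against divide-by-zero and nonsense samples
    }
    readTransferRate.add(bytesRead * 1000L / durationMs);
  }
}
{code}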






[jira] [Resolved] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-27 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16917.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Currently we have the following metrics for datanode reads.
> ||Metric||Description||
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operation|
> We would like to add a new quantile metric calculating the transfer rate for 
> datanode reads.
> This will give us a distribution across a window of the read transfer rate 
> for each datanode.
> Quantiles for transfer rate per host will help in identifying issues like 
> hotspotting of datasets as well as finding repetitive slow datanodes.
>  






[jira] [Updated] (HDFS-16890) RBF: Add period state refresh to keep router state near active namenode's

2023-02-27 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-16890:
-
Fix Version/s: (was: 3.3.6)

> RBF: Add period state refresh to keep router state near active namenode's
> -
>
> Key: HDFS-16890
> URL: https://issues.apache.org/jira/browse/HDFS-16890
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> When using the ObserverReadProxyProvider, clients can set 
> *dfs.client.failover.observer.auto-msync-period...* to periodically get the 
> Active namenode's state. When using routers without the 
> ObserverReadProxyProvider, this periodic update is lost.
> In a busy cluster, the Router constantly gets updated with the active 
> namenode's state when
>  # There is a write operation.
>  # There is an operation (read/write) from a new client.
> However, in the scenario when there are no new clients and no write 
> operations, the state kept in the router can lag behind the active's. The 
> router does update its state with responses from the Observer, but the 
> observer may be lagging behind too.
> We should have a periodic refresh in the router to serve a similar role as 
> *dfs.client.failover.observer.auto-msync-period*
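A minimal sketch of the kind of periodic refresh being proposed, using a plain scheduled executor; refreshActiveState is a placeholder for whatever call the router uses to pull the active namenode's latest state, and the period is illustrative.
{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicStateRefresher {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final Runnable refreshActiveState;

  public PeriodicStateRefresher(Runnable refreshActiveState) {
    this.refreshActiveState = refreshActiveState;
  }

  /** Refresh the cached active-namenode state every periodMs milliseconds. */
  public void start(long periodMs) {
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        refreshActiveState.run();  // e.g. an msync-like call to the active NN
      } catch (Exception e) {
        // Keep the previous state on failure; the next run will retry.
      }
    }, periodMs, periodMs, TimeUnit.MILLISECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
{code}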






[jira] [Resolved] (HDFS-16890) RBF: Add period state refresh to keep router state near active namenode's

2023-02-27 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16890.
--
Fix Version/s: 3.4.0
   3.3.6
   Resolution: Fixed

> RBF: Add period state refresh to keep router state near active namenode's
> -
>
> Key: HDFS-16890
> URL: https://issues.apache.org/jira/browse/HDFS-16890
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> When using the ObserverReadProxyProvider, clients can set 
> *dfs.client.failover.observer.auto-msync-period...* to periodically get the 
> Active namenode's state. When using routers without the 
> ObserverReadProxyProvider, this periodic update is lost.
> In a busy cluster, the Router constantly gets updated with the active 
> namenode's state when
>  # There is a write operation.
>  # There is an operation (read/write) from a new client.
> However, in the scenario when there are no new clients and no write 
> operations, the state kept in the router can lag behind the active's. The 
> router does update its state with responses from the Observer, but the 
> observer may be lagging behind too.
> We should have a periodic refresh in the router to serve a similar role as 
> *dfs.client.failover.observer.auto-msync-period*






[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-23 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692793#comment-17692793
 ] 

Owen O'Malley commented on HDFS-16901:
--

My trial backport is here - 
https://github.com/omalley/hadoop/tree/HDFS-16901-3.3

> RBF: Routers should propagate the real user in the UGI via the caller context
> -
>
> Key: HDFS-16901
> URL: https://issues.apache.org/jira/browse/HDFS-16901
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> If the router receives an operation from a proxyUser, it drops the realUser 
> in the UGI and makes the routerUser the realUser for the operation that goes 
> to the namenode.
> In the namenode UGI logs, we'd like the ability to know the original realUser.
> The router should propagate the realUser from the client call as part of the 
> callerContext.
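For illustration, a hedged sketch of the idea (not the committed router change): if the incoming UGI is a proxy user, append the original realUser to the caller context so the namenode audit log can still show it. The "realUser:" key and the comma separator are assumptions.
{code:java}
import java.io.IOException;
import org.apache.hadoop.ipc.CallerContext;
import org.apache.hadoop.security.UserGroupInformation;

public class RealUserContextSketch {
  public static void tagRealUser() throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    UserGroupInformation realUser = ugi.getRealUser();
    if (realUser == null) {
      return;  // not a proxy-user call, nothing to propagate
    }
    CallerContext current = CallerContext.getCurrent();
    String base = (current == null) ? "" : current.getContext() + ",";
    CallerContext tagged = new CallerContext.Builder(
        base + "realUser:" + realUser.getShortUserName()).build();
    CallerContext.setCurrent(tagged);
  }
}
{code}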






[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-23 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692786#comment-17692786
 ] 

Owen O'Malley commented on HDFS-16901:
--

Simba, when I backport this to branch-3.3 I get a test failure. Basically the 
new test has 'oomalley' as the login user, but the log is using testRealUser.

 
{code:java}
2023-02-22 17:03:36,169 [IPC Server handler 5 on default port 49453] INFO  
FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8574)) - allowed=true    
ugi=testProxyUser (auth:PROXY) via testRealUser (auth:SIMPLE)       
ip=/127.0.0.1   cmd=listStatus  src=/   dst=null        perm=null           
proto=rpc       
callerContext=clientIp:172.25.204.192,clientPort:49519,realUser:testRealUser
{code}
What the test is looking for is:
{code:java}
ugi=testProxyUser (auth:PROXY) via oomalley (auth:SIMPLE){code}
The test works correctly on trunk.

> RBF: Routers should propagate the real user in the UGI via the caller context
> -
>
> Key: HDFS-16901
> URL: https://issues.apache.org/jira/browse/HDFS-16901
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> If the router receives an operation from a proxyUser, it drops the realUser 
> in the UGI and makes the routerUser the realUser for the operation that goes 
> to the namenode.
> In the namenode UGI logs, we'd like the ability to know the original realUser.
> The router should propagate the realUser from the client call as part of the 
> callerContext.






[jira] [Resolved] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-22 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16901.
--
Fix Version/s: 3.4.0
   3.3.6
   Resolution: Fixed

Thanks, Simba!

> RBF: Routers should propagate the real user in the UGI via the caller context
> -
>
> Key: HDFS-16901
> URL: https://issues.apache.org/jira/browse/HDFS-16901
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> If the router receives an operation from a proxyUser, it drops the realUser 
> in the UGI and makes the routerUser the realUser for the operation that goes 
> to the namenode.
> In the namenode UGI logs, we'd like the ability to know the original realUser.
> The router should propagate the realUser from the client call as part of the 
> callerContext.






[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324

2023-02-08 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686083#comment-17686083
 ] 

Owen O'Malley commented on HDFS-16853:
--

This PR is a much simpler solution and shouldn't have any race conditions.

https://github.com/apache/hadoop/pull/5371

> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> because HADOOP-18324
> ---
>
> Key: HDFS-16853
> URL: https://issues.apache.org/jira/browse/HDFS-16853
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.5
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Blocker
>  Labels: pull-request-available
>
> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> with the error message "Waiting for cluster to become active". The blocking 
> jstack is as follows:
> {code:java}
> "BP-1618793397-192.168.3.4-1669198559828 heartbeating to 
> localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x
> 7fc1108fa000 nid=0x19303 waiting on condition [0x700017884000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007430a9ec0> (a 
> java.util.concurrent.SynchronousQueue$TransferQueue)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695)
>         at 
> java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877)
>         at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1482)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1429)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
>         at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClient
> SideTranslatorPB.java:168)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915)
>         at java.lang.Thread.run(Thread.java:748)  {code}
> After looking into the code, we found that this bug was introduced by 
> HADOOP-18324: RpcRequestSender exited without cleaning up the 
> rpcRequestQueue, which left BPServiceActor blocked while sending requests.






[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324

2023-02-07 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685487#comment-17685487
 ] 

Owen O'Malley commented on HDFS-16853:
--

The description is wrong. The SynchronousQueue has no storage and thus doesn't 
need to be cleaned up. The problem is that between the check at the top of 
sendRpcRequest and the moment it offers the serialized bytes, the other thread 
was stopped.

Unfortunately, just making sendRpcRequest synchronized, which would fix the race 
condition, wouldn't be OK because we can't hold the lock while we wait for our 
turn in the queue.

The proposed fix doesn't fix the race condition because it releases the lock 
before putting the message in the queue.

Let me look at what we can do.
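To make the race concrete, here is a toy sketch, not the fix in the PR: a producer that calls put() can park forever once the consumer thread has exited, which a timed offer() in a loop that re-checks the consumer's state avoids.
{code:java}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class SynchronousQueueRaceSketch {
  private final SynchronousQueue<byte[]> queue = new SynchronousQueue<>();
  private final AtomicBoolean senderRunning = new AtomicBoolean(true);

  /** Producer side: never block forever if the sender thread has exited. */
  public boolean send(byte[] request) throws InterruptedException {
    while (senderRunning.get()) {
      if (queue.offer(request, 100, TimeUnit.MILLISECONDS)) {
        return true;  // handed off to the sender thread
      }
      // Loop back and re-check senderRunning instead of parking in put().
    }
    return false;  // sender is gone; the caller can fail the RPC cleanly
  }

  /** Called by the sender thread when it shuts down. */
  public void stopSender() {
    senderRunning.set(false);
  }
}
{code}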

 

> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> because HADOOP-18324
> ---
>
> Key: HDFS-16853
> URL: https://issues.apache.org/jira/browse/HDFS-16853
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.5
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Blocker
>  Labels: pull-request-available
>
> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> with the error message "Waiting for cluster to become active". The blocking 
> jstack is as follows:
> {code:java}
> "BP-1618793397-192.168.3.4-1669198559828 heartbeating to 
> localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x
> 7fc1108fa000 nid=0x19303 waiting on condition [0x700017884000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007430a9ec0> (a 
> java.util.concurrent.SynchronousQueue$TransferQueue)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762)
>         at 
> java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695)
>         at 
> java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877)
>         at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1482)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1429)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
>         at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClient
> SideTranslatorPB.java:168)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915)
>         at java.lang.Thread.run(Thread.java:748)  {code}
> After looking into the code, we found that this bug was introduced by 
> HADOOP-18324: RpcRequestSender exited without cleaning up the 
> rpcRequestQueue, which left BPServiceActor blocked while sending requests.






[jira] [Resolved] (HDFS-16895) NamenodeHeartbeatService should use credentials of logged in user

2023-02-07 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16895.
--
Fix Version/s: 3.4.0
   3.3.5
 Assignee: Hector Sandoval Chaverri
   Resolution: Fixed

> NamenodeHeartbeatService should use credentials of logged in user
> -
>
> Key: HDFS-16895
> URL: https://issues.apache.org/jira/browse/HDFS-16895
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: Hector Sandoval Chaverri
>Assignee: Hector Sandoval Chaverri
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> NamenodeHeartbeatService has been found to log errors when querying 
> protected Namenode JMX APIs. We have been able to work around this by running 
> kinit with the DFS_ROUTER_KEYTAB_FILE_KEY and 
> DFS_ROUTER_KERBEROS_PRINCIPAL_KEY on the router.
> While investigating a solution, we found that performing the request inside a 
> UserGroupInformation.getLoginUser().doAs() call does not require running kinit 
> beforehand.
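A minimal sketch of that doAs pattern, assuming the usual UserGroupInformation API; fetchJmx is a placeholder for the actual JMX HTTP call.
{code:java}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;
import java.util.concurrent.Callable;
import org.apache.hadoop.security.UserGroupInformation;

public class LoginUserJmxSketch {
  /** Run the JMX fetch as the login user so the router's keytab credentials are used. */
  public static String fetchAsLoginUser(Callable<String> fetchJmx)
      throws IOException, InterruptedException {
    return UserGroupInformation.getLoginUser()
        .doAs((PrivilegedExceptionAction<String>) fetchJmx::call);
  }
}
{code}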
> The error logged is:
> {noformat}
> 2022-08-16 21:35:00,265 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.FederationUtil: Cannot parse 
> JMX output for Hadoop:service=NameNode,name=FSNamesystem* from server 
> ltx1-yugiohnn03-ha1.grid.linkedin.com:50070
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> Error while authenticating with endpoint: 
> http://ltx1-yugiohnn03-ha1.grid.linkedin.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem*
>   at sun.reflect.GeneratedConstructorAccessor55.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:219)
>   at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:350)
>   at 
> org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:186)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.FederationUtil.getJmx(FederationUtil.java:82)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateJMXParameters(NamenodeHeartbeatService.java:352)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.getNamenodeStatusReport(NamenodeHeartbeatService.java:295)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:218)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.periodicInvoke(NamenodeHeartbeatService.java:172)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.PeriodicService$1.run(PeriodicService.java:178)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:360)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:204)
>   ... 15 more
> Caused by: GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)
>   at 
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>   at 
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
>   at 
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>   at 
> sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
>   at 
> sun.security.jgss.GSSConte

[jira] [Resolved] (HDFS-16886) Fix documentation for StateStoreRecordOperations#get(Class ..., Query ...)

2023-01-11 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16886.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

> Fix documentation for StateStoreRecordOperations#get(Class ..., Query ...)
> --
>
> Key: HDFS-16886
> URL: https://issues.apache.org/jira/browse/HDFS-16886
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> For {*}StateStoreRecordOperations#get(Class ..., Query ...){*}, when multiple 
> records match, the documentation says that a null value should be returned and 
> that an IOException should be thrown; it cannot do both.
> I believe the intended behavior is that an IOException is thrown. This is the 
> implementation in {*}StateStoreBaseImpl{*}.
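For illustration, the intended single-match behavior in plain Java (an illustrative helper, not the StateStoreBaseImpl code): zero matches returns null, more than one match throws.
{code:java}
import java.io.IOException;
import java.util.List;

public final class SingleMatch {
  private SingleMatch() {}

  public static <T> T getSingle(List<T> matches) throws IOException {
    if (matches == null || matches.isEmpty()) {
      return null;                 // no match: return null
    }
    if (matches.size() > 1) {
      throw new IOException("More than one record matches the query");
    }
    return matches.get(0);         // exactly one match
  }
}
{code}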






[jira] [Resolved] (HDFS-16877) Namenode doesn't use alignment context in TestObserverWithRouter

2023-01-06 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16877.
--
Fix Version/s: 3.4.0
 Assignee: Simbarashe Dzinamarira
   Resolution: Fixed

I've committed this. Thanks, Simba!

> Namenode doesn't use alignment context in TestObserverWithRouter
> 
>
> Key: HDFS-16877
> URL: https://issues.apache.org/jira/browse/HDFS-16877
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> We need to set "{*}dfs.namenode.state.context.enabled{*}" to true for the 
> namenode to send its stateId in client responses.
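A minimal sketch of enabling that flag programmatically, using the key string quoted above (whether a test sets it this way or via XML is left open).
{code:java}
import org.apache.hadoop.conf.Configuration;

public class EnableStateContext {
  /** Build a Configuration with the namenode state context enabled. */
  public static Configuration withStateContext() {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.namenode.state.context.enabled", true);
    return conf;
  }
}
{code}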






[jira] [Resolved] (HDFS-16851) RBF: Add a utility to dump the StateStore

2022-11-29 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16851.
--
Fix Version/s: 3.3.6
   3.4.0
   Resolution: Fixed

> RBF: Add a utility to dump the StateStore
> -
>
> Key: HDFS-16851
> URL: https://issues.apache.org/jira/browse/HDFS-16851
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.6, 3.4.0
>
>
> It would be useful to have a utility to dump the StateStore for RBF.






[jira] [Resolved] (HDFS-16847) RBF: StateStore writer should not commit tmp file if there was an error in writing the file.

2022-11-28 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16847.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

I committed this. Thanks, Simba!

> RBF: StateStore writer should not commit tmp file if there was an error in 
> writing the file.
> 
>
> Key: HDFS-16847
> URL: https://issues.apache.org/jira/browse/HDFS-16847
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> The file based implementation of the RBF state store has a commit step that 
> moves a temporary file to a permanent location.
> There is a check to see if the write of the temp file was successful; 
> however, the commit code doesn't check the success flag.
> This is the relevant code: 
> [https://github.com/apache/hadoop/blob/7d39abd799a5f801a9fd07868a193205ab500bfa/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreFileBaseImpl.java#L369]
>  
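An illustrative sketch of the shape of the fix (not the actual StateStoreFileBaseImpl code): only promote the temp file when the write succeeded, and clean up otherwise.
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class CommitIfWritten {
  private CommitIfWritten() {}

  public static boolean commit(boolean writeSucceeded, Path tmp, Path permanent)
      throws IOException {
    if (!writeSucceeded) {
      Files.deleteIfExists(tmp);   // discard the partial record
      return false;
    }
    Files.move(tmp, permanent, StandardCopyOption.ATOMIC_MOVE);
    return true;
  }
}
{code}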






[jira] [Resolved] (HDFS-16845) Add configuration flag to enable observer reads on routers without using ObserverReadProxyProvider

2022-11-28 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16845.
--
Fix Version/s: 3.4.0
   3.3.5
   2.10.3
   Resolution: Fixed

I just committed this. Thanks, Simba!

> Add configuration flag to enable observer reads on routers without using 
> ObserverReadProxyProvider
> --
>
> Key: HDFS-16845
> URL: https://issues.apache.org/jira/browse/HDFS-16845
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5, 2.10.3
>
>
> In order for clients to have routers forward their reads to observers, the 
> clients must use a proxy with an alignment context. This is currently 
> achieved by using the ObserverReadProxyProvider.
> Using ObserverReadProxyProvider keeps client configurations backward 
> compatible.
> However, the ObserverReadProxyProvider forces an msync on initialization 
> which is not required with routers.
> Performing msync calls is more expensive with routers because the router fans 
> out the call to all namespaces, so we'd like to avoid this.






[jira] [Created] (HDFS-16856) [RBF] Refactor router admin command to use HDFS AdminHelper class

2022-11-28 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16856:


 Summary: [RBF] Refactor router admin command to use HDFS 
AdminHelper class
 Key: HDFS-16856
 URL: https://issues.apache.org/jira/browse/HDFS-16856
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, the router admin class is a bit of a mess with a lot of custom 
programming. We should use the infrastructure that was developed in the 
AdminHelper class to standardize the command processing.






[jira] [Created] (HDFS-16851) [RBF] Utility to textually dump the StateStore

2022-11-21 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16851:


 Summary: [RBF] Utility to textually dump the StateStore
 Key: HDFS-16851
 URL: https://issues.apache.org/jira/browse/HDFS-16851
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Reporter: Owen O'Malley
Assignee: Owen O'Malley


It would be useful to have a utility to dump the StateStore for RBF.






[jira] [Resolved] (HDFS-16844) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-18 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16844.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

> [RBF] The routers should be resilient against exceptions from StateStore
> 
>
> Key: HDFS-16844
> URL: https://issues.apache.org/jira/browse/HDFS-16844
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.4
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> Currently, a single exception from the StateStore will cripple a router by 
> clearing the caches before the replacement is loaded. Since the routers have 
> the information in an in-memory cache, it is better to keep running. There is 
> still the timeout that will push the router into safe-mode if it can't load 
> the state store over a longer period of time.
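A sketch of the "keep serving the old cache" behavior described above (illustrative, not the router code): a refresh failure leaves the previous snapshot in place instead of clearing it.
{code:java}
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

public class ResilientCache<T> {
  private final AtomicReference<List<T>> snapshot =
      new AtomicReference<>(Collections.emptyList());
  private final Callable<List<T>> loader;

  public ResilientCache(Callable<List<T>> loader) {
    this.loader = loader;
  }

  /** Try to reload; on any exception keep the previously cached records. */
  public void refresh() {
    try {
      snapshot.set(loader.call());
    } catch (Exception e) {
      // StateStore hiccup: keep the old in-memory records and retry later.
    }
  }

  public List<T> get() {
    return snapshot.get();
  }
}
{code}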






[jira] [Resolved] (HDFS-16843) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-15 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16843.
--
Resolution: Duplicate

> [RBF] The routers should be resilient against exceptions from StateStore
> 
>
> Key: HDFS-16843
> URL: https://issues.apache.org/jira/browse/HDFS-16843
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.4
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> Currently, a single exception from the StateStore will cripple a router by 
> clearing the caches before the replacement is loaded. Since the routers have 
> the information in an in-memory cache, it is better to keep running. There is 
> still the timeout that will push the router into safe-mode if it can't load 
> the state store over a longer period of time.






[jira] [Created] (HDFS-16844) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-15 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16844:


 Summary: [RBF] The routers should be resilient against exceptions 
from StateStore
 Key: HDFS-16844
 URL: https://issues.apache.org/jira/browse/HDFS-16844
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Affects Versions: 3.3.4
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, a single exception from the StateStore will cripple a router by 
clearing the caches before the replacement is loaded. Since the routers have 
the information in an in-memory cache, it is better to keep running. There is 
still the timeout that will push the router into safe-mode if it can't load the 
state store over a longer period of time.






[jira] [Created] (HDFS-16843) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-15 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16843:


 Summary: [RBF] The routers should be resilient against exceptions 
from StateStore
 Key: HDFS-16843
 URL: https://issues.apache.org/jira/browse/HDFS-16843
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Affects Versions: 3.3.4
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, a single exception from the StateStore will cripple a router by 
clearing the caches before the replacement is loaded. Since the routers have 
the information in an in-memory cache, it is better to keep running. There is 
still the timeout that will push the router into safe-mode if it can't load the 
state store over a longer period of time.






[jira] [Resolved] (HDFS-16836) StandbyCheckpointer can still trigger rollback fs image after RU is finalized

2022-11-15 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16836.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

I just committed this. Thanks, Lei!

> StandbyCheckpointer can still trigger rollback fs image after RU is finalized
> -
>
> Key: HDFS-16836
> URL: https://issues.apache.org/jira/browse/HDFS-16836
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Lei Yang
>Assignee: Lei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> StandbyCheckpointer triggers a rollback fsimage checkpoint when a rolling 
> upgrade (RU) is started.
> When the RU is started, a flag (needRollbackImage) is set to true during edit 
> log replay, and it only gets reset to false when doCheckpoint() succeeds.
> Consider the following scenario:
>  # Start RU, needRollbackImage is set to true.
>  # doCheckpoint() failed.
>  # RU is finalized.
>  # namesystem.getFSImage().hasRollbackFSImage() is always false since 
> rollback image cannot be generated once RU is over.
>  # needRollbackImage was never set to false.
>  # The checkpoint threshold (1m txns) and period (1hr) are not honored.
> {code:java}
> StandbyCheckpointer:
> void doWork() {
>  
>   doCheckpoint();
>   // reset needRollbackCheckpoint to false only when we finish a ckpt
>   // for rollback image
>   if (needRollbackCheckpoint
>   && namesystem.getFSImage().hasRollbackFSImage()) {
> namesystem.setCreatedRollbackImages(true);
> namesystem.setNeedRollbackFsImage(false);
>   }
>   lastCheckpointTime = now;
> } {code}
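A toy model of the flag handling in the scenario above, shown only to make the failure mode concrete (this is not the committed fix): the rollback flag also needs to be cleared once the rolling upgrade is finalized, otherwise a single failed checkpoint leaves it stuck.
{code:java}
public class RollbackFlagSketch {
  private boolean needRollbackImage;
  private boolean rollingUpgradeInProgress;

  public void startRollingUpgrade() {
    rollingUpgradeInProgress = true;
    needRollbackImage = true;        // set during edit log replay
  }

  public void finalizeRollingUpgrade() {
    rollingUpgradeInProgress = false;
  }

  public void onCheckpointSuccess() {
    needRollbackImage = false;
  }

  /** Checkpoint trigger check: don't let a stale flag block normal checkpoints. */
  public boolean needRollbackCheckpoint() {
    if (needRollbackImage && !rollingUpgradeInProgress) {
      needRollbackImage = false;     // RU finalized without a rollback image
    }
    return needRollbackImage;
  }
}
{code}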






[jira] [Assigned] (HDFS-16836) StandbyCheckpointer can still trigger rollback fs image after RU is finalized

2022-11-15 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HDFS-16836:


Assignee: Lei Yang

> StandbyCheckpointer can still trigger rollback fs image after RU is finalized
> -
>
> Key: HDFS-16836
> URL: https://issues.apache.org/jira/browse/HDFS-16836
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Lei Yang
>Assignee: Lei Yang
>Priority: Major
>  Labels: pull-request-available
>
> StandbyCheckpointer triggers a rollback fsimage checkpoint when a rolling 
> upgrade (RU) is started.
> When the RU is started, a flag (needRollbackImage) is set to true during edit 
> log replay, and it only gets reset to false when doCheckpoint() succeeds.
> Consider the following scenario:
>  # Start RU, needRollbackImage is set to true.
>  # doCheckpoint() failed.
>  # RU is finalized.
>  # namesystem.getFSImage().hasRollbackFSImage() is always false since 
> rollback image cannot be generated once RU is over.
>  # needRollbackImage was never set to false.
>  # The checkpoint threshold (1m txns) and period (1hr) are not honored.
> {code:java}
> StandbyCheckpointer:
> void doWork() {
>  
>   doCheckpoint();
>   // reset needRollbackCheckpoint to false only when we finish a ckpt
>   // for rollback image
>   if (needRollbackCheckpoint
>   && namesystem.getFSImage().hasRollbackFSImage()) {
> namesystem.setCreatedRollbackImages(true);
> namesystem.setNeedRollbackFsImage(false);
>   }
>   lastCheckpointTime = now;
> } {code}






[jira] [Created] (HDFS-16778) Separate out the logger for which DN is picked by a DFSInputStream

2022-09-19 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16778:


 Summary: Separate out the logger for which DN is picked by a 
DFSInputStream
 Key: HDFS-16778
 URL: https://issues.apache.org/jira/browse/HDFS-16778
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley


Currently, there is no way to know which DN a given stream chose without 
turning on debug for all of DFSClient. I'd like the ability to just get that 
logged.
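A minimal sketch of the idea: give the "which DataNode did this stream pick" message its own logger name so it can be enabled on its own, without DEBUG for all of DFSClient. The logger name below is an assumption.
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ChosenDataNodeLogSketch {
  private static final Logger DN_PICK_LOG =
      LoggerFactory.getLogger("org.apache.hadoop.hdfs.DFSClient.read.datanode");

  /** Log which datanode was chosen for a given file and block. */
  public static void logChoice(String src, long blockId, String datanode) {
    DN_PICK_LOG.debug("Opened {} block {} from datanode {}", src, blockId, datanode);
  }
}
{code}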






[jira] [Resolved] (HDFS-16767) RBF: Support observer node from Router-Based Federation

2022-09-14 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16767.
--
Fix Version/s: 3.4.0
   3.3.9
   Resolution: Fixed

I just committed this. Thanks, Simba!

> RBF: Support observer node from Router-Based Federation 
> 
>
> Key: HDFS-16767
> URL: https://issues.apache.org/jira/browse/HDFS-16767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> Enable routers to direct read calls to observer namenodes.






[jira] [Updated] (HDFS-13522) RBF: Support observer node from Router-Based Federation

2022-09-09 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-13522:
-
Fix Version/s: 3.3.9

> RBF: Support observer node from Router-Based Federation
> ---
>
> Key: HDFS-13522
> URL: https://issues.apache.org/jira/browse/HDFS-13522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, namenode
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
> Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, 
> HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC 
> clogging.png, ShortTerm-Routers+Observer.png, 
> observer_reads_in_rbf_proposal_simbadzina_v1.pdf, 
> observer_reads_in_rbf_proposal_simbadzina_v2.pdf
>
>  Time Spent: 20h 50m
>  Remaining Estimate: 0h
>
> Changes will need to occur to the router to support the new observer node.
> One such change will be to make the router understand the observer state, 
> e.g. {{FederationNamenodeServiceState}}.






[jira] [Updated] (HDFS-13522) RBF: Support observer node from Router-Based Federation

2022-09-09 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-13522:
-
Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

I committed this. Thanks, Simba!

> RBF: Support observer node from Router-Based Federation
> ---
>
> Key: HDFS-13522
> URL: https://issues.apache.org/jira/browse/HDFS-13522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, namenode
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, 
> HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC 
> clogging.png, ShortTerm-Routers+Observer.png, 
> observer_reads_in_rbf_proposal_simbadzina_v1.pdf, 
> observer_reads_in_rbf_proposal_simbadzina_v2.pdf
>
>  Time Spent: 20h 50m
>  Remaining Estimate: 0h
>
> Changes will need to occur to the router to support the new observer node.
> One such change will be to make the router understand the observer state, 
> e.g. {{FederationNamenodeServiceState}}.






[jira] [Commented] (HDFS-13522) RBF: Support observer node from Router-Based Federation

2022-06-22 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557699#comment-17557699
 ] 

Owen O'Malley commented on HDFS-13522:
--

I'm concerned about additional msyncs on every call from RBF. It will radically 
increase the rpc load on the active NN.

I'd propose that we add a new field in the client protocol that tracks the 
state of all of the namespaces that a given client has used. The flow would 
look like:

client -> router: {}             // no state
router -> nn: msync              // get the state
nn -> router: state 1000
router -> client: {ns1: 1000}
client -> router: {ns1: 1000}
router -> observer: state 1000

The client just gives back the state it was given. This gracefully handles 
failovers between routers and avoids additional msyncs.
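A sketch of the proposed client-side bookkeeping (names are illustrative, this is not an actual protocol field): the client stores whatever per-namespace stateIds the router returned and echoes them back on the next call.
{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FederatedStateSketch {
  private final Map<String, Long> namespaceStateIds = new ConcurrentHashMap<>();

  /** Attach to an outgoing request: the state we last saw, per namespace. */
  public Map<String, Long> forRequest() {
    return Collections.unmodifiableMap(new HashMap<>(namespaceStateIds));
  }

  /** Update from a router response, e.g. {ns1: 1000}. */
  public void onResponse(Map<String, Long> routerState) {
    routerState.forEach((ns, stateId) ->
        namespaceStateIds.merge(ns, stateId, Math::max));
  }
}
{code}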

> RBF: Support observer node from Router-Based Federation
> ---
>
> Key: HDFS-13522
> URL: https://issues.apache.org/jira/browse/HDFS-13522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, namenode
>Reporter: Erik Krogen
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, 
> HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC 
> clogging.png, ShortTerm-Routers+Observer.png
>
>  Time Spent: 15h 20m
>  Remaining Estimate: 0h
>
> Changes will need to occur to the router to support the new observer node.
> One such change will be to make the router understand the observer state, 
> e.g. {{FederationNamenodeServiceState}}.






[jira] [Resolved] (HDFS-16518) KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager

2022-03-30 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16518.
--
Fix Version/s: 3.4.0
   2.10.2
   3.3.3
   Resolution: Fixed

I committed this. Thanks, Lei!

> KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager
> -
>
> Key: HDFS-16518
> URL: https://issues.apache.org/jira/browse/HDFS-16518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0
>Reporter: Lei Yang
>Assignee: Lei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.3.3
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> KeyProvider implements the Closeable interface, but some custom implementations 
> of KeyProvider also need an explicit close in KeyProviderCache. An example is 
> using a custom KeyProvider in DFSClient to read encrypted files on HDFS. 
> KeyProvider currently gets closed in KeyProviderCache only when the cache entry 
> expires or is invalidated. In some cases this does not happen, which seems 
> related to the guava cache.
> This patch uses the Hadoop JVM ShutdownHookManager to globally clean up cache 
> entries and thus close each KeyProvider via the cache's removal hook, right 
> after the filesystem instance is closed, in a deterministic way.
> {code:java}
> Class KeyProviderCache
> ...
>  public KeyProviderCache(long expiryMs) {
>   cache = CacheBuilder.newBuilder()
> .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS)
> .removalListener(new RemovalListener() {
>   @Override
>   public void onRemoval(
>   @Nonnull RemovalNotification notification) {
> try {
>   assert notification.getValue() != null;
>   notification.getValue().close();
> } catch (Throwable e) {
>   LOG.error(
>   "Error closing KeyProvider with uri ["
>   + notification.getKey() + "]", e);
> }
>   }
> })
> .build(); 
> }{code}
> We could have added a new KeyProviderCache#close function, had each DFSClient 
> call it at the end of DFSClient#close, and closed the KeyProvider there, but 
> that would expose another problem: it could close a cache that is shared 
> globally among different DFSClient instances.
>  
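A sketch of the approach, assuming Hadoop's ShutdownHookManager API: register one JVM-wide hook that invalidates the cache, which fires the removal listener shown above and closes every cached KeyProvider. The priority value is arbitrary here.
{code:java}
import com.google.common.cache.Cache;
import org.apache.hadoop.util.ShutdownHookManager;

public class KeyProviderCacheShutdownSketch {
  /** Register a single JVM shutdown hook that drains the KeyProvider cache. */
  public static void registerShutdownHook(Cache<?, ?> cache) {
    ShutdownHookManager.get().addShutdownHook(
        cache::invalidateAll,      // triggers onRemoval -> KeyProvider.close()
        10 /* priority, illustrative */);
  }
}
{code}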






[jira] [Assigned] (HDFS-16518) KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager

2022-03-30 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HDFS-16518:


Assignee: Lei Yang

> KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager
> -
>
> Key: HDFS-16518
> URL: https://issues.apache.org/jira/browse/HDFS-16518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0
>Reporter: Lei Yang
>Assignee: Lei Yang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> KeyProvider implements the Closeable interface, but some custom implementations 
> of KeyProvider also need an explicit close in KeyProviderCache. An example is 
> using a custom KeyProvider in DFSClient to read encrypted files on HDFS. 
> KeyProvider currently gets closed in KeyProviderCache only when the cache entry 
> expires or is invalidated. In some cases this does not happen, which seems 
> related to the guava cache.
> This patch uses the Hadoop JVM ShutdownHookManager to globally clean up cache 
> entries and thus close each KeyProvider via the cache's removal hook, right 
> after the filesystem instance is closed, in a deterministic way.
> {code:java}
> Class KeyProviderCache
> ...
>  public KeyProviderCache(long expiryMs) {
>   cache = CacheBuilder.newBuilder()
> .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS)
> .removalListener(new RemovalListener() {
>   @Override
>   public void onRemoval(
>   @Nonnull RemovalNotification notification) {
> try {
>   assert notification.getValue() != null;
>   notification.getValue().close();
> } catch (Throwable e) {
>   LOG.error(
>   "Error closing KeyProvider with uri ["
>   + notification.getKey() + "]", e);
> }
>   }
> })
> .build(); 
> }{code}
> We could have added a new KeyProviderCache#close function, had each DFSClient 
> call it at the end of DFSClient#close, and closed the KeyProvider there, but 
> that would expose another problem: it could close a cache that is shared 
> globally among different DFSClient instances.
>  






[jira] [Commented] (HDFS-16518) Cached KeyProvider in KeyProviderCache should be closed with ShutdownHookManager

2022-03-24 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512012#comment-17512012
 ] 

Owen O'Malley commented on HDFS-16518:
--

I don't understand why this is required. Obviously, at JVM shutdown the cache 
will be discarded. The order of shutdown hooks isn't deterministic, so using 
this isn't a fix against other shutdown hooks using the cache.

Is there some other call to KeyProvider.close() that this should replace?

> Cached KeyProvider in KeyProviderCache should be closed with 
> ShutdownHookManager
> 
>
> Key: HDFS-16518
> URL: https://issues.apache.org/jira/browse/HDFS-16518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0
>Reporter: Lei Yang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We need to make sure the underlying KeyProvider used by multiple DFSClient 
> instances is closed in one shot during JVM shutdown. Within the shutdown hook, 
> we invalidate the cache and make sure they are all closed. The cache has a 
> removalListener hook which is called when a cache entry is invalidated. 
> {code:java}
> Class KeyProviderCache
> ...
>  public KeyProviderCache(long expiryMs) {
>   cache = CacheBuilder.newBuilder()
> .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS)
> .removalListener(new RemovalListener() {
>   @Override
>   public void onRemoval(
>   @Nonnull RemovalNotification notification) {
> try {
>   assert notification.getValue() != null;
>   notification.getValue().close();
> } catch (Throwable e) {
>   LOG.error(
>   "Error closing KeyProvider with uri ["
>   + notification.getKey() + "]", e);
> }
>   }
> })
> .build(); 
> }{code}
>  






[jira] [Assigned] (HDFS-16518) Cached KeyProvider in KeyProviderCache should be closed with ShutdownHookManager

2022-03-24 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HDFS-16518:


Assignee: (was: Lei Xu)

> Cached KeyProvider in KeyProviderCache should be closed with 
> ShutdownHookManager
> 
>
> Key: HDFS-16518
> URL: https://issues.apache.org/jira/browse/HDFS-16518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0
>Reporter: Lei Yang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We need to make sure the underlying KeyProvider used by multiple DFSClient 
> instances is closed in one shot during JVM shutdown. Within the shutdown hook, 
> we invalidate the cache and make sure they are all closed. The cache has a 
> removalListener hook which is called when a cache entry is invalidated. 
> {code:java}
> Class KeyProviderCache
> ...
>  public KeyProviderCache(long expiryMs) {
>   cache = CacheBuilder.newBuilder()
> .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS)
> .removalListener(new RemovalListener() {
>   @Override
>   public void onRemoval(
>   @Nonnull RemovalNotification notification) {
> try {
>   assert notification.getValue() != null;
>   notification.getValue().close();
> } catch (Throwable e) {
>   LOG.error(
>   "Error closing KeyProvider with uri ["
>   + notification.getKey() + "]", e);
> }
>   }
> })
> .build(); 
> }{code}
>  






[jira] [Assigned] (HDFS-16518) Cached KeyProvider in KeyProviderCache should be closed with ShutdownHookManager

2022-03-24 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HDFS-16518:


Assignee: Lei Xu

> Cached KeyProvider in KeyProviderCache should be closed with 
> ShutdownHookManager
> 
>
> Key: HDFS-16518
> URL: https://issues.apache.org/jira/browse/HDFS-16518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0
>Reporter: Lei Yang
>Assignee: Lei Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We need to make sure the underlying KeyProvider used by multiple DFSClient 
> instances is closed in one shot during JVM shutdown. Within the shutdown hook, 
> we invalidate the cache and make sure they are all closed. The cache has a 
> removalListener hook which is called when a cache entry is invalidated. 
> {code:java}
> Class KeyProviderCache
> ...
>  public KeyProviderCache(long expiryMs) {
>   cache = CacheBuilder.newBuilder()
> .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS)
> .removalListener(new RemovalListener() {
>   @Override
>   public void onRemoval(
>   @Nonnull RemovalNotification notification) {
> try {
>   assert notification.getValue() != null;
>   notification.getValue().close();
> } catch (Throwable e) {
>   LOG.error(
>   "Error closing KeyProvider with uri ["
>   + notification.getKey() + "]", e);
> }
>   }
> })
> .build(); 
> }{code}
>  






[jira] [Resolved] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines

2022-03-23 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16517.
--
Fix Version/s: 2.10.2
   Resolution: Fixed

> In 2.10 the distance metric is wrong for non-DN machines
> 
>
> Key: HDFS-16517
> URL: https://issues.apache.org/jira/browse/HDFS-16517
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.2
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In 2.10, the metric for distance between the client and the data node is 
> wrong for machines that aren't running data nodes (ie. 
> getWeightUsingNetworkLocation). The code works correctly in 3.3+. 
> Currently
>  
> ||Client||DataNode||getWeight||getWeightUsingNetworkLocation||
> |/rack1/node1|/rack1/node1|0|0|
> |/rack1/node1|/rack1/node2|2|2|
> |/rack1/node1|/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod1/rack1/node2|2|2|
> |/pod1/rack1/node1|/pod1/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod2/rack2/node2|6|4|
>  
> This bug will destroy data locality on clusters where the clients share racks 
> with DataNodes, but are running on machines that aren't running DataNodes, 
> such as striping federated HDFS clusters across racks.
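
As an aside, the expected getWeight column in the table above corresponds to counting the edges from each node up to the closest common ancestor of the two topology paths. A small illustration of that calculation (names and layout are assumptions, not the Hadoop implementation):

{code:java}
// Distance = edges from each node up to the closest common ancestor.
// Matches the getWeight column above: 0 same host, 2 same rack, 4 off-rack,
// 6 across pods. Per the table, getWeightUsingNetworkLocation returns 2
// instead of 4 for off-rack cases when the client is not a registered DataNode.
static int topologyDistance(String locA, String hostA,
                            String locB, String hostB) {
  String[] a = (locA + "/" + hostA).substring(1).split("/");
  String[] b = (locB + "/" + hostB).substring(1).split("/");
  int common = 0;
  while (common < a.length && common < b.length
      && a[common].equals(b[common])) {
    common++;
  }
  return (a.length - common) + (b.length - common);
}

// topologyDistance("/rack1", "node1", "/rack2", "node2") == 4
// topologyDistance("/pod1/rack1", "node1", "/pod2/rack2", "node2") == 6
{code}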



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines

2022-03-22 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-16517:
-
Description: 
In 2.10, the metric for distance between the client and the data node is wrong 
for machines that aren't running data nodes (ie. 
getWeightUsingNetworkLocation). The code works correctly in 3.3+. 

Currently

 
||Client||DataNode||getWeight||getWeightUsingNetworkLocation||
|/rack1/node1|/rack1/node1|0|0|
|/rack1/node1|/rack1/node2|2|2|
|/rack1/node1|/rack2/node2|4|2|
|/pod1/rack1/node1|/pod1/rack1/node2|2|2|
|/pod1/rack1/node1|/pod1/rack2/node2|4|2|
|/pod1/rack1/node1|/pod2/rack2/node2|6|4|

 

This bug will destroy data locality on clusters where the clients share racks 
with DataNodes, but are running on machines that aren't running DataNodes, such 
as striping federated HDFS clusters across racks.

  was:
In 2.10, the metric for distance between the client and the data node is wrong 
for machines that aren't running data nodes (ie. 
getWeightUsingNetworkLocation). The code works correctly in 3.3+.

Currently

 
||Client||DataNode ||getWeight||getWeightUsingNetworkLocation||
|/rack1/node1|/rack1/node1|0|0|
|/rack1/node1|/rack1/node2|2|2|
|/rack1/node1|/rack2/node2|4|2|
|/pod1/rack1/node1|/pod1/rack1/node2|2|2|
|/pod1/rack1/node1|/pod1/rack2/node2|4|2|
|/pod1/rack1/node1|/pod2/rack2/node2|6|4|

 


> In 2.10 the distance metric is wrong for non-DN machines
> 
>
> Key: HDFS-16517
> URL: https://issues.apache.org/jira/browse/HDFS-16517
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> In 2.10, the metric for distance between the client and the data node is 
> wrong for machines that aren't running data nodes (ie. 
> getWeightUsingNetworkLocation). The code works correctly in 3.3+. 
> Currently
>  
> ||Client||DataNode||getWeight||getWeightUsingNetworkLocation||
> |/rack1/node1|/rack1/node1|0|0|
> |/rack1/node1|/rack1/node2|2|2|
> |/rack1/node1|/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod1/rack1/node2|2|2|
> |/pod1/rack1/node1|/pod1/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod2/rack2/node2|6|4|
>  
> This bug will destroy data locality on clusters where the clients share racks 
> with DataNodes, but are running on machines that aren't running DataNodes, 
> such as striping federated HDFS clusters across racks.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines

2022-03-22 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-16517:
-
External issue URL: https://github.com/apache/hadoop/pull/4091

> In 2.10 the distance metric is wrong for non-DN machines
> 
>
> Key: HDFS-16517
> URL: https://issues.apache.org/jira/browse/HDFS-16517
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> In 2.10, the metric for distance between the client and the data node is 
> wrong for machines that aren't running data nodes (ie. 
> getWeightUsingNetworkLocation). The code works correctly in 3.3+.
> Currently
>  
> ||Client||DataNode ||getWeight||getWeightUsingNetworkLocation||
> |/rack1/node1|/rack1/node1|0|0|
> |/rack1/node1|/rack1/node2|2|2|
> |/rack1/node1|/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod1/rack1/node2|2|2|
> |/pod1/rack1/node1|/pod1/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod2/rack2/node2|6|4|
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines

2022-03-21 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16517:


 Summary: In 2.10 the distance metric is wrong for non-DN machines
 Key: HDFS-16517
 URL: https://issues.apache.org/jira/browse/HDFS-16517
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.10.1
Reporter: Owen O'Malley
Assignee: Owen O'Malley


In 2.10, the metric for distance between the client and the data node is wrong 
for machines that aren't running data nodes (ie. 
getWeightUsingNetworkLocation). The code works correctly in 3.3+.

Currently

 
||Client||DataNode ||getWeight||getWeightUsingNetworkLocation||
|/rack1/node1|/rack1/node1|0|0|
|/rack1/node1|/rack1/node2|2|2|
|/rack1/node1|/rack2/node2|4|2|
|/pod1/rack1/node1|/pod1/rack1/node2|2|2|
|/pod1/rack1/node1|/pod1/rack2/node2|4|2|
|/pod1/rack1/node1|/pod2/rack2/node2|6|4|

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13248) RBF: Namenode need to choose block location for the client

2022-03-21 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-13248:
-
Fix Version/s: 3.4.0
   2.10.2
   3.3.3
 Assignee: Owen O'Malley  (was: Íñigo Goiri)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks for the reviews, Inigo & Ayush!

> RBF: Namenode need to choose block location for the client
> --
>
> Key: HDFS-13248
> URL: https://issues.apache.org/jira/browse/HDFS-13248
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wu Weiwei
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.3.3
>
> Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, 
> HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, 
> HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality 
> Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When executing a put operation via the router, the NameNode will choose block 
> locations for the router, not for the real client. This will affect the file's 
> locality.
> I think on both the NameNode and the Router, we should add a new addBlock method, 
> or add a parameter to the current addBlock method, to pass the real client 
> information.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16495) RBF should prepend the client ip rather than append it.

2022-03-14 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-16495:
-
Fix Version/s: 3.3.3
   (was: 3.2.4)

> RBF should prepend the client ip rather than append it.
> ---
>
> Key: HDFS-16495
> URL: https://issues.apache.org/jira/browse/HDFS-16495
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently the Routers append the client ip to the caller context if and only 
> if it is not already set. This would allow the user to fake their ip by 
> setting the caller context. Much better is to prepend it unconditionally.
> The NN must be able to trust the client ip from the caller context.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16495) RBF should prepend the client ip rather than append it.

2022-03-14 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16495.
--
Fix Version/s: 3.4.0
   3.2.4
   Resolution: Fixed

> RBF should prepend the client ip rather than append it.
> ---
>
> Key: HDFS-16495
> URL: https://issues.apache.org/jira/browse/HDFS-16495
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently the Routers append the client ip to the caller context if and only 
> if it is not already set. This would allow the user to fake their ip by 
> setting the caller context. Much better is to prepend it unconditionally.
> The NN must be able to trust the client ip from the caller context.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16495) RBF should prepend the client ip rather than append it.

2022-03-04 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16495:


 Summary: RBF should prepend the client ip rather than append it.
 Key: HDFS-16495
 URL: https://issues.apache.org/jira/browse/HDFS-16495
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently the Routers append the client ip to the caller context if and only if 
it is not already set. This would allow the user to fake their ip by setting 
the caller context. Much better is to prepend it unconditionally.

The NN must be able to trust the client ip from the caller context.
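
A minimal sketch of the prepend-versus-append idea (illustrative only: the "clientIp:" tag, the comma separator, and the helper name are assumptions, not the actual Router code). The NameNode side can then treat the first clientIp entry as the trusted one.

{code:java}
// Build the caller context the Router forwards to the NameNode. Because the
// Router-observed address is always the first clientIp entry, a client that
// puts its own "clientIp:..." into the context can no longer spoof the value
// the NameNode reads.
static String buildForwardedContext(String clientSuppliedContext,
                                    String observedClientIp) {
  String ipEntry = "clientIp:" + observedClientIp;
  if (clientSuppliedContext == null || clientSuppliedContext.isEmpty()) {
    return ipEntry;
  }
  // Prepend unconditionally instead of appending only when absent.
  return ipEntry + "," + clientSuppliedContext;
}
{code}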



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16253) Add a toString implementation to DFSInputStream

2021-10-04 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16253:


 Summary: Add a toString implementation to DFSInputStream
 Key: HDFS-16253
 URL: https://issues.apache.org/jira/browse/HDFS-16253
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley


It would help debugging if there was a useful toString on DFSInputStream.
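
For illustration, the kind of output that would help; the fields referenced (src, pos, closed) are assumptions about DFSInputStream internals, not the committed change:

{code:java}
// Sketch of a debugging-friendly toString; field names are illustrative.
@Override
public String toString() {
  return "DFSInputStream{src=" + src
      + ", pos=" + pos
      + ", closed=" + closed + "}";
}
{code}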



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14244) refactor the libhdfs++ build system

2019-02-13 Thread Owen O'Malley (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-14244:
-
Summary: refactor the libhdfs++ build system  (was: hdfs++ doesn't add 
necessary libraries to dynamic library link)
Description: The current cmake for libhdfs++ has the source code for the 
dependent libraries. By refactoring we can remove 150kloc of third party code.  
(was: When linking with shared libraries, the libhdfs++ cmake file doesn't link 
correctly.)

> refactor the libhdfs++ build system
> ---
>
> Key: HDFS-14244
> URL: https://issues.apache.org/jira/browse/HDFS-14244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs++, hdfs-client
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> The current cmake for libhdfs++ has the source code for the dependent 
> libraries. By refactoring we can remove 150kloc of third party code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14244) hdfs++ doesn't add necessary libraries to dynamic library link

2019-02-13 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767482#comment-16767482
 ] 

Owen O'Malley commented on HDFS-14244:
--

No, it isn't. With BUILD_SHARED_LIBS=ON the libhdfspp build was broken.

However in digging into this, I think that we need a pretty major refactoring 
of the libhdfspp build system.

In particular:
* Remove the source code from 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party
 .
* Fix support for shared/static libraries.
* Use the packages installed on the system when they are available.
* Add support for rpath on mac os.
* Run the unit tests when building stand alone.
* Use add_ExternalPackage for the projects that we need to build.
* Incorporate the uriparser2 wrapper into libhdfspp, but use uriparser package. 
Most of the linux variants have uriparser.
* Add a cpack definition for libhdfspp so that you can generate a binary 
artifact in the standalone build.
* Support newer versions of asio. (The deadline_timer needs to be replaced with 
the steady_timer.)

These will remove about 150kloc from Hadoop. :)

> hdfs++ doesn't add necessary libraries to dynamic library link
> --
>
> Key: HDFS-14244
> URL: https://issues.apache.org/jira/browse/HDFS-14244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs++, hdfs-client
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> When linking with shared libraries, the libhdfs++ cmake file doesn't link 
> correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14244) hdfs++ doesn't add necessary libraries to dynamic library link

2019-01-30 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-14244:


 Summary: hdfs++ doesn't add necessary libraries to dynamic library 
link
 Key: HDFS-14244
 URL: https://issues.apache.org/jira/browse/HDFS-14244
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley


When linking with shared libraries, the libhdfs++ cmake file doesn't link 
correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14244) hdfs++ doesn't add necessary libraries to dynamic library link

2019-01-30 Thread Owen O'Malley (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-14244:
-
Component/s: hdfs-client
 hdfs++

> hdfs++ doesn't add necessary libraries to dynamic library link
> --
>
> Key: HDFS-14244
> URL: https://issues.apache.org/jira/browse/HDFS-14244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs++, hdfs-client
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> When linking with shared libraries, the libhdfs++ cmake file doesn't link 
> correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13534) libhdfs++: Fix GCC7 build

2018-06-06 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503913#comment-16503913
 ] 

Owen O'Malley commented on HDFS-13534:
--

[~James C] yes please create a new PR for 
https://issues.apache.org/jira/browse/ORC-375 to update the patch (after this 
one goes in, obviously).

> libhdfs++: Fix GCC7 build
> -
>
> Key: HDFS-13534
> URL: https://issues.apache.org/jira/browse/HDFS-13534
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Major
> Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch
>
>
> After merging HDFS-13403 [~pifta] noticed the build broke on some platforms.  
> [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex 
> implicitly included functional.  Without that implicit include the compiler 
> errors on the std::function in ioservice.h.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13534) libhdfs++: Fix GCC7 build

2018-06-06 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503914#comment-16503914
 ] 

Owen O'Malley commented on HDFS-13534:
--

+1 for the patch to go in here in HDFS. :)

> libhdfs++: Fix GCC7 build
> -
>
> Key: HDFS-13534
> URL: https://issues.apache.org/jira/browse/HDFS-13534
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Major
> Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch
>
>
> After merging HDFS-13403 [~pifta] noticed the build broke on some platforms.  
> [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex 
> implicitly included functional.  Without that implicit include the compiler 
> errors on the std::function in ioservice.h.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13534) libhdfs++: Fix GCC7 build

2018-06-06 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503465#comment-16503465
 ] 

Owen O'Malley edited comment on HDFS-13534 at 6/6/18 3:38 PM:
--

To get it working with gcc 7 for ORC's copy, I had to make two changes:
{code:java}
*** lib/common/async_stream.h~    2017-08-30 07:56:51.0 -0700
--- lib/common/async_stream.h    2018-06-05 22:02:35.0 -0700
***
*** 20,25 
--- 20,26 
  #define LIB_COMMON_ASYNC_STREAM_H_

  #include 
+ #include <functional>

  namespace hdfs {
{code}
{code:java}
*** lib/rpc/request.h~  2017-08-30 07:56:51.0 -0700
--- lib/rpc/request.h   2018-06-05 22:33:59.0 -0700
***
*** 22,27 
--- 22,28 
  #include "common/util.h"
  #include "common/new_delete.h"

+ #include <functional>
  #include 

  #include 
{code}
Those don't seem to be in this patch.


was (Author: owen.omalley):
To get it working with gcc 7, I had to make two changes:
{code:java}
*** lib/common/async_stream.h~    2017-08-30 07:56:51.0 -0700
--- lib/common/async_stream.h    2018-06-05 22:02:35.0 -0700
***
*** 20,25 
--- 20,26 
  #define LIB_COMMON_ASYNC_STREAM_H_

  #include 
+ #include <functional>

  namespace hdfs {
{code}

{code:java}
*** lib/rpc/request.h~  2017-08-30 07:56:51.0 -0700
--- lib/rpc/request.h   2018-06-05 22:33:59.0 -0700
***
*** 22,27 
--- 22,28 
  #include "common/util.h"
  #include "common/new_delete.h"

+ #include <functional>
  #include 

  #include 
{code}

Those don't seem to be in this patch. 

> libhdfs++: Fix GCC7 build
> -
>
> Key: HDFS-13534
> URL: https://issues.apache.org/jira/browse/HDFS-13534
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Major
> Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch
>
>
> After merging HDFS-13403 [~pifta] noticed the build broke on some platforms.  
> [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex 
> implicitly included functional.  Without that implicit include the compiler 
> errors on the std::function in ioservice.h.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13534) libhdfs++: Fix GCC7 build

2018-06-06 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503465#comment-16503465
 ] 

Owen O'Malley commented on HDFS-13534:
--

To get it working with gcc 7, I had to make two changes:
{code:java}
*** lib/common/async_stream.h~    2017-08-30 07:56:51.0 -0700
--- lib/common/async_stream.h    2018-06-05 22:02:35.0 -0700
***
*** 20,25 
--- 20,26 
  #define LIB_COMMON_ASYNC_STREAM_H_

  #include 
+ #include <functional>

  namespace hdfs {
{code}

{code:java}
*** lib/rpc/request.h~  2017-08-30 07:56:51.0 -0700
--- lib/rpc/request.h   2018-06-05 22:33:59.0 -0700
***
*** 22,27 
--- 22,28 
  #include "common/util.h"
  #include "common/new_delete.h"

+ #include <functional>
  #include 

  #include 
{code}

Those don't seem to be in this patch. 

> libhdfs++: Fix GCC7 build
> -
>
> Key: HDFS-13534
> URL: https://issues.apache.org/jira/browse/HDFS-13534
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Major
> Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch
>
>
> After merging HDFS-13403 [~pifta] noticed the build broke on some platforms.  
> [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex 
> implicitly included functional.  Without that implicit include the compiler 
> errors on the std::function in ioservice.h.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020

2018-02-02 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350807#comment-16350807
 ] 

Owen O'Malley commented on HDFS-12990:
--

Ok, I'm late on this.

I'm strongly on the side of changing it back. In my view, this should 
absolutely be a blocker bug on any release. We *need* to change the port back 
because it is part of the public API.

As always, you need to view the relative costs:
 # Users who have installed 3.0.0 and will be inconvenienced by the change.
 # All other users of Hadoop.

Clearly bucket number 2 is much much larger than 1.

Let's also be clear that the release manager can put it in or out of the RC as 
they deem fit. There are NO vetoes. It is a straight vote for whether the RC 
should be released. Personally, I'll vote against an RC that doesn't have this 
patch.

> Change default NameNode RPC port back to 8020
> -
>
> Key: HDFS-12990
> URL: https://issues.apache.org/jira/browse/HDFS-12990
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-12990.01.patch
>
>
> In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all 
> default ports to ephemeral ports, which is very appreciated by admin. As part 
> of that change, we also modified the NN RPC port from the famous 8020 to 
> 9820, to be closer to other ports changed there.
> With more integration going on, it appears that all the other ephemeral port 
> changes are fine, but the NN RPC port change is painful for downstream on 
> migrating to Hadoop 3. Some examples include:
> # Hive table locations pointing to hdfs://nn:port/dir
> # Downstream minicluster unit tests that assumed 8020
> # Oozie workflows / downstream scripts that used 8020
> This isn't a problem for HA URLs, since that does not include the port 
> number. But considering the downstream impact, instead of requiring all of 
> them change their stuff, it would be a way better experience to leave the NN 
> port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary 
> upgrade burdens.
> It is of course incompatible, but given 3.0.0 is just out, IMO it is worth 
> switching the port back.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7240) Object store in HDFS

2018-01-26 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341429#comment-16341429
 ] 

Owen O'Malley commented on HDFS-7240:
-

I think that the major contribution of this work is pulling out the block 
management layer and the naming should reflect that.

I'd propose that:
 * Ozone should be the object store
 * The block layer should have a different name such as Hadoop Storage Layer 
(HSL).

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Major
> Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, 
> HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, 
> HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, 
> HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, 
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9525) hadoop utilities need to support provided delegation tokens

2016-01-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116289#comment-15116289
 ] 

Owen O'Malley commented on HDFS-9525:
-

[~daryn] I'm sorry, but I don't see what problem the patch introduced. It lets 
webhdfs use a token even if security is turned off, as long as the token was 
already in the UGI. Where is the problem?

> hadoop utilities need to support provided delegation tokens
> ---
>
> Key: HDFS-9525
> URL: https://issues.apache.org/jira/browse/HDFS-9525
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, 
> HDFS-7984.003.patch, HDFS-7984.004.patch, HDFS-7984.005.patch, 
> HDFS-7984.006.patch, HDFS-7984.007.patch, HDFS-7984.patch, 
> HDFS-9525.008.patch, HDFS-9525.009.patch, HDFS-9525.009.patch, 
> HDFS-9525.branch-2.008.patch, HDFS-9525.branch-2.009.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than webhdfs initialize its own.  
> This would allow for cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908540#comment-14908540
 ] 

Owen O'Malley commented on HDFS-8855:
-

This is ok. +1

I'm a little concerned about the runtime performance of generating the string 
of the identifier on every connection to the datanode, but this should be 
correct.

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.005.patch, HDFS-8855.1.patch, 
> HDFS-8855.2.patch, HDFS-8855.3.patch, HDFS-8855.4.patch, 
> HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to have ~25000 active connections and 
> fails.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804547#comment-14804547
 ] 

Owen O'Malley commented on HDFS-8855:
-

A few points:
* You need to use the Token.getKind(), Token.getIdentifier(), and 
Token.getPassword() as the key for the cache. The patch currently uses 
Token.toString, which uses the identifier, kind, and service. The service is 
set by the client so it shouldn't be part of the match. The password on the 
other hand must be part of the match so that guessing the identifier doesn't 
allow a hacker to impersonate the user.
* The timeout should default to 10 minutes instead of 10 seconds.
* Please fix the checkstyle and findbugs warnings.
* Determine what is wrong with the test case.

Other than that, it looks good.
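
A sketch of the cache-key idea from the first point above: derive the key from the token's kind, identifier, and password, and leave the client-settable service out of it. This is illustrative only, not the actual patch; a real implementation might also avoid keeping the password in a String.

{code:java}
// Uses org.apache.hadoop.security.token.Token and java.util.Base64.
static String tokenCacheKey(Token<?> token) {
  Base64.Encoder b64 = Base64.getEncoder();
  return token.getKind() + ":"
      + b64.encodeToString(token.getIdentifier()) + ":"
      + b64.encodeToString(token.getPassword());
}
{code}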

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, 
> HDFS-8855.4.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to have ~25000 active connections and 
> fails.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803351#comment-14803351
 ] 

Owen O'Malley commented on HDFS-8855:
-

I'm looking at the patch, but you'll need to resolve the checkstyle, findbugs, 
and test case failures.

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, 
> HDFS-8855.4.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to have ~25000 active connections and 
> fails.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9025) fix compilation issues on arch linux

2015-09-04 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-9025:

Status: Patch Available  (was: Open)

> fix compilation issues on arch linux
> 
>
> Key: HDFS-9025
> URL: https://issues.apache.org/jira/browse/HDFS-9025
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HDFS-9025.patch
>
>
> There are several compilation issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9025) fix compilation issues on arch linux

2015-09-04 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-9025:

Attachment: HDFS-9025.patch

fix minor problems.

> fix compilation issues on arch linux
> 
>
> Key: HDFS-9025
> URL: https://issues.apache.org/jira/browse/HDFS-9025
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HDFS-9025.patch
>
>
> There are several compilation issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9025) fix compilation issues on arch linux

2015-09-04 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HDFS-9025:
---

Assignee: Owen O'Malley

> fix compilation issues on arch linux
> 
>
> Key: HDFS-9025
> URL: https://issues.apache.org/jira/browse/HDFS-9025
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> There are several compilation issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9025) fix compilation issues on arch linux

2015-09-04 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-9025:
---

 Summary: fix compilation issues on arch linux
 Key: HDFS-9025
 URL: https://issues.apache.org/jira/browse/HDFS-9025
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Owen O'Malley


There are several compilation issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8736) ability to deny access to different filesystems

2015-07-08 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619894#comment-14619894
 ] 

Owen O'Malley commented on HDFS-8736:
-

I agree with Allen. Preventing access to the LocalFileSystem doesn't help 
anything. The Hadoop security model depends on having unix user ids or more 
recently Linux containers. 

> ability to deny access to different filesystems
> ---
>
> Key: HDFS-8736
> URL: https://issues.apache.org/jira/browse/HDFS-8736
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
>  Labels: security
> Attachments: Patch.pdf
>
>
> In order to run in a secure context, we need the ability to deny access to 
> different filesystems (specifically the local file system) to non-trusted code. 
> This patch adds a new SecurityPermission class (AccessFileSystemPermission) and 
> checks the permission in FileSystem#get before returning a cached file system 
> or creating a new one. Please see the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8707) Implement an async pure c++ HDFS client

2015-07-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-8707:
---

 Summary: Implement an async pure c++ HDFS client
 Key: HDFS-8707
 URL: https://issues.apache.org/jira/browse/HDFS-8707
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Reporter: Owen O'Malley
Assignee: Haohui Mai


As part of working on the C++ ORC reader at ORC-3, we need an HDFS pure C++ 
client that lets us do async io to HDFS. We want to start from the code that 
Haohui's been working on at https://github.com/haohui/libhdfspp .



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3689) Add support for variable length block

2014-08-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107194#comment-14107194
 ] 

Owen O'Malley commented on HDFS-3689:
-

One follow up is that fixing MapReduce to use the actual block boundaries 
rather than dividing up the file into fixed-size splits would not be difficult 
and would make the generated file splits for ORC and other block-compressed 
files much, much better. 

Furthermore, note that we could remove the need for lzo and zlib index files 
for text files by having TextOutputFormat cut the block at a line boundary and 
flush the compression codec. Thus TextInputFormat could divide the file at 
block boundaries and have them align at both a compression chunk boundary and a 
line break. That would be *great*.
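
A sketch of the split generation described in the first paragraph above (illustrative, built on the public FileSystem and FileSplit APIs; not an actual MapReduce patch):

{code:java}
// One split per physical block, taken from the real block boundaries rather
// than fixed-size offsets. With variable-length blocks these boundaries can
// coincide with ORC stripes or compression/line boundaries chosen by the writer.
// Uses org.apache.hadoop.fs.{FileSystem, FileStatus, BlockLocation, Path} and
// org.apache.hadoop.mapreduce.lib.input.FileSplit.
static List<FileSplit> splitsFromBlockBoundaries(FileSystem fs, Path path)
    throws IOException {
  FileStatus status = fs.getFileStatus(path);
  BlockLocation[] blocks =
      fs.getFileBlockLocations(status, 0, status.getLen());
  List<FileSplit> splits = new ArrayList<>();
  for (BlockLocation block : blocks) {
    splits.add(new FileSplit(path, block.getOffset(), block.getLength(),
        block.getHosts()));
  }
  return splits;
}
{code}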

> Add support for variable length block
> -
>
> Key: HDFS-3689
> URL: https://issues.apache.org/jira/browse/HDFS-3689
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, namenode
>Affects Versions: 3.0.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch
>
>
> Currently HDFS supports fixed length blocks. Supporting variable length block 
> will allow new use cases and features to be built on top of HDFS. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-3689) Add support for variable length block

2014-08-22 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107176#comment-14107176
 ] 

Owen O'Malley commented on HDFS-3689:
-

Since this is a discussion of what to put into trunk, incompatible changes 
aren't a blocker. Furthermore, most clients would never see the difference. 
Variable length blocks would dramatically improve the ability of HDFS to 
support better file formats like ORC.

On the other hand, I've had very bad experiences with sparse files on Unix. It 
is all too easy for a user to copy a sparse file and not understand that the 
copy is 10x larger than the original. That would be *bad* and I do not think 
that HDFS should support it at all.

> Add support for variable length block
> -
>
> Key: HDFS-3689
> URL: https://issues.apache.org/jira/browse/HDFS-3689
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, namenode
>Affects Versions: 3.0.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch
>
>
> Currently HDFS supports fixed length blocks. Supporting variable length block 
> will allow new use cases and features to be built on top of HDFS. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-26 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045344#comment-14045344
 ] 

Owen O'Malley commented on HDFS-6134:
-

In the discussion today, we covered lots of ground. Todd proposed that 
Alejandro add a virtual ".raw" directory to the top level of each encryption 
zone. This would allow processes that want access to read or write the data 
within the encryption zone an access path that doesn't require modifying the 
FileSystem API. With that change, I'm -0 to adding encryption in to HDFS. I 
still think that our users would be far better served by adding 
encryption/compression layers above HDFS rather than baking them into HDFS, but 
I'm not going to block the work. By adding the work directly into HDFS, 
Alejandro and the others working on this are signing up for a high level of QA 
at scale before this is committed.

A couple of other points came up:
* symbolic links in conjunction with cryptofs would allow users to use hdfs 
urls to access encrypted hdfs files.
* there must be an hdfs admin command to list the crypto zones to support 
auditing
* There are significant scalability concerns about each tasks requesting 
decryption of each file key. In particular, if a job has 100,000 tasks and each 
opens 1000 files, that is 100 million key requests. The current design is 
unlikely to scale correctly.
* the kms needs its own delegation tokens and hooks so that yarn will renew and 
cancel them.
* there are three levels of key rolling:
** leaving old data alone and writing new data with the new key
** re-writing the data with the new key 
** re-encoding the per file key (personally this seems pointless)

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-26 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044821#comment-14044821
 ] 

Owen O'Malley commented on HDFS-6134:
-

Alejandro, I was just trying to say that I'd met him and was familiar with his 
work history. If it sounded rude or dismissive, that was unintended. I'm sorry.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-26 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044797#comment-14044797
 ] 

Owen O'Malley commented on HDFS-6134:
-

Mike, I remember you from when I interviewed you.

You are talking about collisions between IVs, not key space. By using 
32 bytes of randomness (if someone is worried about crypto attacks there is no 
excuse not to use AES256), there is *NO* possibility of collision even assuming 
an insanely bad practice of using a single key version for a huge number of 
files. I obviously understand and applied the birthday paradox to get the 
numbers.

Note that we *already* have key rolling and the key is already a random string 
of bytes. Adding additional layers of randomness just gives the appearance of 
more security. That may be wonderful in the closed source security world, but 
it is actively harmful in open source. In open source, having a clear 
implementation that is open for inspection is by far the best protection. 

Note that the other issue with not using the keys as intended is that many 
Hadoop users launch jobs that read millions of files. We can't afford to have 
the client fetch a different key for each of those millions of files.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-26 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044428#comment-14044428
 ] 

Owen O'Malley commented on HDFS-6134:
-

Sorry, I messed up my math. Assuming that you have 1 million files per key and 8 
bytes of randomness, you get 2.7e-8, which is close enough to 0. At 16 bytes or 
32 bytes of randomness, doubles underflow when calculating the percentage.


> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-26 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044423#comment-14044423
 ] 

Owen O'Malley commented on HDFS-6134:
-

Alejandro, you don't need and shouldn't implement any of the DEK stuff. AES-CTR 
is more than adequate. Rather than use 16 bytes of randomness and 16 bytes of 
counter, use 32 bytes of randomness and just add the counter to it rather than 
concatenate.

Let's take the extreme case of 1 million files with the same key version. If you 
have 32 bits of randomness, that leads you to a collision chance that is 
basically 100%. With 64 bits of randomness that drops to 2.7e-8, which is close 
enough to 0.
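
For reference, this is the standard birthday approximation, sketched here with n files sharing one key version and b bits of per-file randomness:

{code}
P(\text{collision}) \approx 1 - e^{-n^2/(2 \cdot 2^b)} \approx \frac{n^2}{2^{b+1}}

n = 10^6,\ b = 64:\quad 10^{12}/2^{65} \approx 2.7 \times 10^{-8}  (effectively zero)
n = 10^6,\ b = 32:\quad 10^{12}/2^{33} \gg 1,\ \text{so}\ P \approx 1
{code}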


> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044109#comment-14044109
 ] 

Owen O'Malley commented on HDFS-6134:
-

Any chance for the PA office? Otherwise I'll be dialing in.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043893#comment-14043893
 ] 

Owen O'Malley commented on HDFS-6134:
-

{quote}
Owen, that is NOT transparent.
{quote}

Transparent means that you shouldn't have to change your application code. 
Hacking HDFS to add encryption is transparent for one set of apps, but 
completely breaks others. Changing URLs requires no code changes to any apps.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043836#comment-14043836
 ] 

Owen O'Malley commented on HDFS-6134:
-

Todd, it is *still* transparent encryption if you use cfs:// instead of 
hdfs://. The important piece is that the application doesn't need to change to 
access the decrypted storage. 

My problem is that by refusing to layer the change over the storage layer, this jira 
is making many disruptive and unnecessary changes to the critical 
infrastructure and its API.

NSE is whole-disk encryption and is equivalent to using dm-crypt to encrypt the 
block files. That level of encryption is always very transparent and is already 
available in HDFS without a code change.

Aaron, I can't do a meeting tomorrow afternoon. How about tomorrow morning? Say 
10am-noon?



> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043839#comment-14043839
 ] 

Owen O'Malley commented on HDFS-6134:
-

I'll also point out that I've provided a solution that doesn't change the HDFS 
core and still lets you use your hdfs urls with encryption...

Finally, adding compression to the crypto file system would be a great addition 
and *still* not require any changes to HDFS or its API.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-25 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043766#comment-14043766
 ] 

Owen O'Malley commented on HDFS-6134:
-

{quote}
I don’t see a previous -1 in any of the related JIRAs.
{quote}

I had consistently stated objections and some of them have been addressed, but 
the fundamentals have become clear through this jira. I am always hesitant to 
use a -1 and I certainly don't do so lightly. Through the discussion, my 
opinion has become that transparent encryption in HDFS is a *really* bad idea. 
Let's run through the case:

The one claimed benefit of integrating encryption into HDFS is that the user 
doesn't need to change the URLs that they use. I believe this to be a 
*disadvantage* because it hides the fact that these files are encrypted. That 
said, if that is the desired goal, a better approach is to create a *NEW* filter 
filesystem that does silent encryption and that the user can configure to respond 
to hdfs urls. This imposes *NO* penalty on people who don't want encryption and 
does not require hacks to the FileSystem API.
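
As a rough sketch of that filter-filesystem approach (everything here is 
illustrative; the class is hypothetical and the actual crypto wrapping is elided), 
the layer decorates an underlying FileSystem and does its work in open/create 
without touching the FileSystem API:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

public class CryptoFilterFileSystem extends FilterFileSystem {

  @Override
  public String getScheme() {
    return "cfs";   // or configured to answer for hdfs urls if that is the goal
  }

  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    // Delegate to the wrapped filesystem; a real implementation would wrap the
    // returned stream in a decrypting stream using keys fetched by the client.
    return fs.open(f, bufferSize);
  }
}
{code}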

{quote}
FileSystem will had a new create()/open() signature to support this, if you 
have access to the file but not the key, you can use the new signatures to copy 
files as per the usecase you are mentioning.
{quote}
This will break every backup application. Some of them, such as HAR and DistCp, 
you can hack to handle HDFS as a special case, but this kind of special casing 
always comes back to haunt us as a project. Changing the FileSystem API is a 
really bad idea, and inducing more differences between the various implementations 
will create many more problems than the ones you are trying to avoid.



> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-24 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042452#comment-14042452
 ] 

Owen O'Malley commented on HDFS-6134:
-

As Sanjay proposed, I think it would be great to get together and discuss the 
issues in person. Would a meeting this week work for you Alejandro?

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-24 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042366#comment-14042366
 ] 

Owen O'Malley commented on HDFS-6134:
-

I'm still -1 to adding this to HDFS. Having a layered file system is a much 
cleaner approach. 

Issues:
* The user needs to be able to move, copy, and distribute the directories without 
the key. I should be able to set up a Falcon or Oozie job that copies 
directories where the user doing the copy has *NO* potential access to the key 
material. This is a critical security constraint.
* A critical use case for encryption is when hdfs admins should not have access 
to the contents of some files. Encryption is the only way to implement that 
since the hdfs admins always have file permissions to both the hdfs files and 
the underlying block files.
* We shouldn't change the filesystem API to deal with encryption, because we 
have a solution that doesn't require the change and will be far less confusing 
to users. In particular, we shouldn't add hacks to read/write unencrypted bytes 
to HDFS.
* Each file needs to record the key version and original IV as written up in 
the CFS design document. The IV should be incremented for each block, but must 
start at a random number. As Alejandro pointed out this is required for strong 
security.
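
A minimal sketch of the per-block IV scheme in the last bullet (an illustration, 
not the design document's code): a random base IV is recorded per file, and the IV 
for block i is the base IV plus i, treated as a big-endian counter.

{code}
import java.security.SecureRandom;
import java.util.Arrays;

public class BlockIvExample {
  /** IV for block i = baseIv + i, treating the 16-byte IV as a big-endian counter. */
  static byte[] ivForBlock(byte[] baseIv, long blockIndex) {
    byte[] iv = baseIv.clone();
    long carry = blockIndex;
    for (int i = iv.length - 1; i >= 0 && carry != 0; i--) {
      long sum = (iv[i] & 0xFFL) + (carry & 0xFFL);
      iv[i] = (byte) sum;
      carry = (carry >>> 8) + (sum >>> 8);
    }
    return iv;
  }

  public static void main(String[] args) {
    byte[] baseIv = new byte[16];
    new SecureRandom().nextBytes(baseIv);       // random starting IV, recorded with the file
    System.out.println(Arrays.toString(ivForBlock(baseIv, 0)));  // equals baseIv
    System.out.println(Arrays.toString(ivForBlock(baseIv, 7)));  // baseIv + 7
  }
}
{code}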

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034155#comment-14034155
 ] 

Owen O'Malley commented on HDFS-6134:
-

Alejandro, this is *exactly* equivalent to the delegation token case. If a job is 
opening side files, it needs to make sure it has the right delegation tokens 
and keys. For delegation tokens, we added an extra config option for listing 
the extra file systems. The same solution (or listing the extra key versions) 
would make sense.
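
For reference, a sketch of what that could look like at job-configuration time. 
The {{mapreduce.job.hdfs-servers}} property is the existing option for listing 
extra NameNodes whose delegation tokens are fetched at submission; the key-version 
property next to it is purely hypothetical.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SideFileCredentialsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Existing pattern: extra filesystems whose delegation tokens are collected
    // at job submission.
    conf.set("mapreduce.job.hdfs-servers", "hdfs://nn2:8020,hdfs://nn3:8020");
    // Hypothetical parallel option: extra key versions needed for side files.
    conf.set("mapreduce.job.encryption.key-versions", "projectA-key@3");
    Job job = Job.getInstance(conf, "side-file-example");
    // ... configure input/output formats and submit as usual ...
  }
}
{code}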

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034016#comment-14034016
 ] 

Owen O'Malley commented on HDFS-6134:
-

Alejandro, which use cases don't know their inputs or outputs? Clearly the main 
ones do know their input and output:
* MapReduce
* Hive
* Pig

It is important for the standard cases that we get the encryption keys up front 
instead of letting the horde of containers do it.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033981#comment-14033981
 ] 

Owen O'Malley commented on HDFS-6134:
-

A follow up on that is that of course KMS will need proxy users so that Oozie 
will be able to get keys for the users. (If that is desired.)

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033975#comment-14033975
 ] 

Owen O'Malley commented on HDFS-6134:
-

The right way to do this is to have the YARN job submission get the appropriate 
keys from the KMS, just as it currently gets delegation tokens. Both the delegation 
tokens and the keys should be put into the job's credential object. That way 
you don't have all 100,000 containers hitting the KMS at once. It does mean we 
need a new interface for filesystems that, given a list of paths, ensures the 
keys are in a credential object. FileInputFormat and FileOutputFormat should 
check to see if the FileSystem implements that interface and pass in the job's 
credential object.
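
A rough sketch of what such an interface might look like; the name and signature 
are assumptions for illustration, not an existing Hadoop API:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;

/**
 * Hypothetical mix-in for filesystems whose files need key material at
 * job-submission time. FileInputFormat/FileOutputFormat would test for it and
 * pass the job's Credentials, the same way delegation tokens are collected.
 */
public interface KeyProviderForPaths {
  void addKeysToCredentials(Credentials credentials, Path... paths) throws IOException;
}
{code}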

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-06-11 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028086#comment-14028086
 ] 

Owen O'Malley commented on HDFS-6134:
-

I still have two very strong concerns with this work:

* A critical use case is that distcp (and other backup/disaster recovery tools) 
must be able to accurately copy files without access to the encryption keys. 
There are many cases where the automated backup tools are not permitted to hold 
the encryption keys. Obviously, it also has the benefit of being both safer and 
faster if the data is moved in its original encrypted form.
* The client needs to get the key material directly and not use the NameNode as 
a proxy. This is critical from a security point of view.
** The security (including the audit log) on the key server is much stronger if 
there are no proxies between the user and the key server.
** Protecting against security bugs in HDFS or mistakes in setting permissions is 
a critical use case for requiring encryption.

Doing all of the work on the client (including getting the key) makes the 
entire system much more secure.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6134) Transparent data at rest encryption

2014-05-12 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995588#comment-13995588
 ] 

Owen O'Malley commented on HDFS-6134:
-

What are the use cases this is trying to address? What are the attacks?

Do users or administrators set the encryption?

Can different directories have different keys or is it one key for the entire 
filesystem?

When you rename a directory does it need to be re-encrypted?

How are backups handled? Does it require the encryption key? What is the 
performance impact on distcp when not using native libraries?

For release in the Hadoop 2.x line, you need to preserve both forward and 
backwards wire compatibility. How do you plan to address that?

It seems that the additional datanode and client complexity is prohibitive. 
Making changes to the HDFS write and read pipeline is extremely touchy.

> Transparent data at rest encryption
> ---
>
> Key: HDFS-6134
> URL: https://issues.apache.org/jira/browse/HDFS-6134
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFSDataAtRestEncryption.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the health­care industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Closed] (HDFS-5852) Change the colors on the hdfs UI

2014-01-31 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley closed HDFS-5852.
---

Assignee: (was: stack)

> Change the colors on the hdfs UI
> 
>
> Key: HDFS-5852
> URL: https://issues.apache.org/jira/browse/HDFS-5852
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: stack
>Priority: Blocker
>  Labels: webui
> Fix For: 2.3.0
>
> Attachments: HDFS-5852.best.txt, HDFS-5852v2.txt, 
> HDFS-5852v3-dkgreen.txt, color-rationale.png, compromise_gray.png, 
> dkgreen.png, hdfs-5852.txt, new_hdfsui_colors.png
>
>
> The HDFS UI colors are too close to HWX green.
> Here is a patch that steers clear of vendor colors.
> I made it a blocker thinking this something we'd want to fix before we 
> release apache hadoop 2.3.0.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (HDFS-5852) Change the colors on the hdfs UI

2014-01-31 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887557#comment-13887557
 ] 

Owen O'Malley edited comment on HDFS-5852 at 1/31/14 7:58 AM:
--

Just to show how *inane* this jira is, here are the color measurements of 
Hortonworks Green, Cloudera Blue, and the original color using Mac's digital 
color meter:

|| Attribute || Hortonworks Green || HDFS Color || Cloudera Blue ||
| L | 69.52 | 33.94 | 34.4 |
| A | -44.68 | -22.34 | -17.68 |
| B | 60.91 | 26.88 | -19.98 |

so Hortonworks Green - HDFS Color = 35.58 + 22.34 + 34.03 = 91.95
and Cloudera Blue - HDFS Color = 0.46 + 4.66 + 46.86 = 51.98

Clearly we need to make the color greener to denote additional stability.


was (Author: owen.omalley):
Just to show how *inane* this jira is, here are the color measurements of 
Hortonworks Green, Cloudera Blue, and the original color using Mac's digital 
color meter:

|| Attribute || Hortonworks Green || HDFS Color || Cloudera Blue ||
| L | 69.52 | 33.94 | 34.4 |
| A | 44.68 | -22.34 | -17.68 |
| B | 60.91 | 26.88 | -19.98 |

so Hortonworks Green - HDFS Color = 35.58 + 22.34 + 34.03 = 91.95
and Cloudera Blue - HDFS Color = 0.46 + 4.66 + 46.86 = 51.98

Clearly we need to make the color greener to denote additional stability.

> Change the colors on the hdfs UI
> 
>
> Key: HDFS-5852
> URL: https://issues.apache.org/jira/browse/HDFS-5852
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Blocker
>  Labels: webui
> Fix For: 2.3.0
>
> Attachments: HDFS-5852.best.txt, HDFS-5852v2.txt, 
> HDFS-5852v3-dkgreen.txt, color-rationale.png, compromise_gray.png, 
> dkgreen.png, hdfs-5852.txt, new_hdfsui_colors.png
>
>
> The HDFS UI colors are too close to HWX green.
> Here is a patch that steers clear of vendor colors.
> I made it a blocker thinking this something we'd want to fix before we 
> release apache hadoop 2.3.0.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5852) Change the colors on the hdfs UI

2014-01-30 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887557#comment-13887557
 ] 

Owen O'Malley commented on HDFS-5852:
-

Just to show how *inane* this jira is, here are the color measurements of 
Hortonworks Green, Cloudera Blue, and the original color using Mac's digital 
color meter:

|| Attribute || Hortonworks Green || HDFS Color || Cloudera Blue ||
| L | 69.52 | 33.94 | 34.4 |
| A | 44.68 | -22.34 | -17.68 |
| B | 60.91 | 26.88 | -19.98 |

so Hortonworks Green - HDFS Color = 35.58 + 22.34 + 34.03 = 91.95
and Cloudera Blue - HDFS Color = 0.46 + 4.66 + 46.86 = 51.98

Clearly we need to make the color greener to denote additional stability.

> Change the colors on the hdfs UI
> 
>
> Key: HDFS-5852
> URL: https://issues.apache.org/jira/browse/HDFS-5852
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Blocker
>  Labels: webui
> Fix For: 2.3.0
>
> Attachments: HDFS-5852.best.txt, HDFS-5852v2.txt, 
> HDFS-5852v3-dkgreen.txt, color-rationale.png, compromise_gray.png, 
> dkgreen.png, hdfs-5852.txt, new_hdfsui_colors.png
>
>
> The HDFS UI colors are too close to HWX green.
> Here is a patch that steers clear of vendor colors.
> I made it a blocker thinking this something we'd want to fix before we 
> release apache hadoop 2.3.0.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5143) Hadoop cryptographic file system

2013-12-08 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842559#comment-13842559
 ] 

Owen O'Malley commented on HDFS-5143:
-

We need to break this work down in to smaller units of work. Jiras with a 
tighter focus will provide a more focused discussion and allow us to make 
progress and accomplish our shared goal of enabling Hadoop users to use 
encryption in their applications without changing each individual input and 
output format.

* The key management needs to be much more flexible and I've created 
HADOOP-10141 to work on it.
* The ByteBufferCipher API should be a separate jira, so I've created 
HADOOP-10149.
* Once HADOOP-10149 is resolved, we can work together on a JNI-based 
implementation of it.



> Hadoop cryptographic file system
> 
>
> Key: HDFS-5143
> URL: https://issues.apache.org/jira/browse/HDFS-5143
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
> on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.Transparent to and no modification required for upper layer 
> applications.
> 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if 
> the wrapped file system supports them.
> 3.Very high performance for encryption and decryption, they will not 
> become bottleneck.
> 4.Can decorate HDFS and all other file systems in Hadoop, and will not 
> modify existing structure of file system, such as namenode and datanode 
> structure if the wrapped file system is HDFS.
> 5.Admin can configure encryption policies, such as which directory will 
> be encrypted.
> 6.A robust key management framework.
> 7.Support Pread and append operations if the wrapped file system supports 
> them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5143) Hadoop cryptographic file system

2013-11-13 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821728#comment-13821728
 ] 

Owen O'Malley commented on HDFS-5143:
-

[~hitliuyi] In the design document, the IV was always 0, but in the comments 
you are suggesting putting a random IV in the start of the underlying file. I 
think that the security advantage of having a random IV is relatively small and 
we'd do better without it. It only protects against having multiple files with 
the same key and the same plaintext co-located in the filesystem.

I think that putting it at the front of the file has a couple of disadvantages:
* Any read of the file has to read the beginning 16 bytes of the file.
* Block boundaries are offset from the expectation. This will cause MapReduce 
input splits to straddle blocks in cases that wouldn't otherwise require it.

I think we should always have an IV of 0 or, alternatively, encode it in the 
underlying filesystem's filenames. In particular, we could base64-encode the 
IV and append it to the filename. Adding 16 characters of base64 would give 
us 96 bits of IV, and it would be easy to strip off. It would look 
like:

cfs://hdfs@nn/dir1/dir2/file -> hdfs://nn/dir1/dir2/file_1234567890ABCDEF
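
A small sketch of that filename encoding (illustrative only; java.util.Base64 and 
the '_' handling are assumptions): 12 random bytes give a 96-bit IV, which is 
exactly 16 URL-safe base64 characters with no padding.

{code}
import java.security.SecureRandom;
import java.util.Base64;

public class IvInFileNameExample {
  public static void main(String[] args) {
    byte[] iv = new byte[12];                                    // 96-bit IV
    new SecureRandom().nextBytes(iv);
    String suffix = Base64.getUrlEncoder().withoutPadding().encodeToString(iv);
    String logical  = "cfs://hdfs@nn/dir1/dir2/file";
    String physical = "hdfs://nn/dir1/dir2/file_" + suffix;      // 16 extra characters
    System.out.println(logical + " -> " + physical);
    // Listing would strip the suffix: name.substring(0, name.lastIndexOf('_'))
  }
}
{code}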

> Hadoop cryptographic file system
> 
>
> Key: HDFS-5143
> URL: https://issues.apache.org/jira/browse/HDFS-5143
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
> on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.Transparent to and no modification required for upper layer 
> applications.
> 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if 
> the wrapped file system supports them.
> 3.Very high performance for encryption and decryption, they will not 
> become bottleneck.
> 4.Can decorate HDFS and all other file systems in Hadoop, and will not 
> modify existing structure of file system, such as namenode and datanode 
> structure if the wrapped file system is HDFS.
> 5.Admin can configure encryption policies, such as which directory will 
> be encrypted.
> 6.A robust key management framework.
> 7.Support Pread and append operations if the wrapped file system supports 
> them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5143) Hadoop cryptographic file system

2013-11-13 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821679#comment-13821679
 ] 

Owen O'Malley commented on HDFS-5143:
-

[~avik_...@yahoo.com] I'm not misquoting you. You were very clear that you 
weren't planning on working on this in the immediate future and that instead 
you wanted to change all of the file formats.

> Hadoop cryptographic file system
> 
>
> Key: HDFS-5143
> URL: https://issues.apache.org/jira/browse/HDFS-5143
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
> on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.Transparent to and no modification required for upper layer 
> applications.
> 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if 
> the wrapped file system supports them.
> 3.Very high performance for encryption and decryption, they will not 
> become bottleneck.
> 4.Can decorate HDFS and all other file systems in Hadoop, and will not 
> modify existing structure of file system, such as namenode and datanode 
> structure if the wrapped file system is HDFS.
> 5.Admin can configure encryption policies, such as which directory will 
> be encrypted.
> 6.A robust key management framework.
> 7.Support Pread and append operations if the wrapped file system supports 
> them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5143) Hadoop cryptographic file system

2013-11-13 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-5143:


Status: Open  (was: Patch Available)

It should only be marked Patch Available when Yi thinks it is ready to commit.

> Hadoop cryptographic file system
> 
>
> Key: HDFS-5143
> URL: https://issues.apache.org/jira/browse/HDFS-5143
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
> on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.Transparent to and no modification required for upper layer 
> applications.
> 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if 
> the wrapped file system supports them.
> 3.Very high performance for encryption and decryption, they will not 
> become bottleneck.
> 4.Can decorate HDFS and all other file systems in Hadoop, and will not 
> modify existing structure of file system, such as namenode and datanode 
> structure if the wrapped file system is HDFS.
> 5.Admin can configure encryption policies, such as which directory will 
> be encrypted.
> 6.A robust key management framework.
> 7.Support Pread and append operations if the wrapped file system supports 
> them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5143) Hadoop cryptographic file system

2013-11-13 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HDFS-5143:


Assignee: Yi Liu  (was: Owen O'Malley)

It wasn't assigned and no one seemed to be working on this. Talking to Avik at 
Strata, he said no one was going to be working on this for 9 months. I'm glad 
to see that Yi has posted a patch.

> Hadoop cryptographic file system
> 
>
> Key: HDFS-5143
> URL: https://issues.apache.org/jira/browse/HDFS-5143
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
> on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.Transparent to and no modification required for upper layer 
> applications.
> 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if 
> the wrapped file system supports them.
> 3.Very high performance for encryption and decryption, they will not 
> become bottleneck.
> 4.Can decorate HDFS and all other file systems in Hadoop, and will not 
> modify existing structure of file system, such as namenode and datanode 
> structure if the wrapped file system is HDFS.
> 5.Admin can configure encryption policies, such as which directory will 
> be encrypted.
> 6.A robust key management framework.
> 7.Support Pread and append operations if the wrapped file system supports 
> them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (HDFS-5143) Hadoop cryptographic file system

2013-11-12 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HDFS-5143:
---

Assignee: Owen O'Malley

> Hadoop cryptographic file system
> 
>
> Key: HDFS-5143
> URL: https://issues.apache.org/jira/browse/HDFS-5143
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Owen O'Malley
>  Labels: rhino
> Fix For: 3.0.0
>
> Attachments: HADOOP cryptographic file system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
> on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.Transparent to and no modification required for upper layer 
> applications.
> 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if 
> the wrapped file system supports them.
> 3.Very high performance for encryption and decryption, they will not 
> become bottleneck.
> 4.Can decorate HDFS and all other file systems in Hadoop, and will not 
> modify existing structure of file system, such as namenode and datanode 
> structure if the wrapped file system is HDFS.
> 5.Admin can configure encryption policies, such as which directory will 
> be encrypted.
> 6.A robust key management framework.
> 7.Support Pread and append operations if the wrapped file system supports 
> them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (HDFS-3699) HftpFileSystem should try both KSSL and SPNEGO when authentication is required

2013-10-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-3699.
-

Resolution: Won't Fix

Using KSSL is strongly deprecated and should be avoided in secure clusters.

> HftpFileSystem should try both KSSL and SPNEGO when authentication is required
> --
>
> Key: HDFS-3699
> URL: https://issues.apache.org/jira/browse/HDFS-3699
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: eric baldeschwieler
>
> See discussion in HDFS-2617 (Replaced Kerberized SSL for image transfer and 
> fsck with SPNEGO-based solution).
> To handle the transition from Hadoop1.0 systems running KSSL authentication 
> to Hadoop systems running SPNEGO, it would be good to fix the client in both 
> 1 and 2 to try SPNEGO and then fall back to try KSSL.  
> This will allow organizations that are running a lot of Hadoop 1.0 to 
> gradually transition over, without needing to convert all clusters at the 
> same time.  They would first need to update their 1.0 HFTP clients (and 
> 2.0/0.23 if they are already running those) and then they could copy data 
> between clusters without needing to move all clusters to SPNEGO in a big bang.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Closed] (HDFS-3699) HftpFileSystem should try both KSSL and SPNEGO when authentication is required

2013-10-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley closed HDFS-3699.
---


> HftpFileSystem should try both KSSL and SPNEGO when authentication is required
> --
>
> Key: HDFS-3699
> URL: https://issues.apache.org/jira/browse/HDFS-3699
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: eric baldeschwieler
>
> See discussion in HDFS-2617 (Replaced Kerberized SSL for image transfer and 
> fsck with SPNEGO-based solution).
> To handle the transition from Hadoop1.0 systems running KSSL authentication 
> to Hadoop systems running SPNEGO, it would be good to fix the client in both 
> 1 and 2 to try SPNEGO and then fall back to try KSSL.  
> This will allow organizations that are running a lot of Hadoop 1.0 to 
> gradually transition over, without needing to convert all clusters at the 
> same time.  They would first need to update their 1.0 HFTP clients (and 
> 2.0/0.23 if they are already running those) and then they could copy data 
> between clusters without needing to move all clusters to SPNEGO in a big bang.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Closed] (HDFS-3983) Hftp should support both SPNEGO and KSSL

2013-10-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley closed HDFS-3983.
---

Assignee: (was: Eli Collins)

> Hftp should support both SPNEGO and KSSL
> 
>
> Key: HDFS-3983
> URL: https://issues.apache.org/jira/browse/HDFS-3983
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Blocker
> Attachments: hdfs-3983.txt, hdfs-3983.txt
>
>
> Hftp currently doesn't work against a secure cluster unless you configure 
> {{dfs.https.port}} to be the http port, otherwise the client can't fetch 
> tokens:
> {noformat}
> $ hadoop fs -ls hftp://c1225.hal.cloudera.com:50070/
> 12/09/26 18:02:00 INFO fs.FileSystem: Couldn't get a delegation token from 
> http://c1225.hal.cloudera.com:50470 using http.
> ls: Security enabled but user not authenticated by filter
> {noformat}
> This is due to Hftp still using the https port. Post HDFS-2617 it should use 
> the regular http port. Hsftp should still use the secure port, however now 
> that we have HADOOP-8581 it's worth considering removing Hsftp entirely. I'll 
> start a separate thread about that.  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (HDFS-3983) Hftp should support both SPNEGO and KSSL

2013-10-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-3983.
-

  Resolution: Won't Fix
Target Version/s:   (was: )

KSSL is deprecated and should never be used for secure deployments.

> Hftp should support both SPNEGO and KSSL
> 
>
> Key: HDFS-3983
> URL: https://issues.apache.org/jira/browse/HDFS-3983
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Blocker
> Attachments: hdfs-3983.txt, hdfs-3983.txt
>
>
> Hftp currently doesn't work against a secure cluster unless you configure 
> {{dfs.https.port}} to be the http port, otherwise the client can't fetch 
> tokens:
> {noformat}
> $ hadoop fs -ls hftp://c1225.hal.cloudera.com:50070/
> 12/09/26 18:02:00 INFO fs.FileSystem: Couldn't get a delegation token from 
> http://c1225.hal.cloudera.com:50470 using http.
> ls: Security enabled but user not authenticated by filter
> {noformat}
> This is due to Hftp still using the https port. Post HDFS-2617 it should use 
> the regular http port. Hsftp should still use the secure port, however now 
> that we have HADOOP-8581 it's worth considering removing Hsftp entirely. I'll 
> start a separate thread about that.  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive

2013-09-12 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766009#comment-13766009
 ] 

Owen O'Malley commented on HDFS-5191:
-

+1 for EnumSet

> revisit zero-copy API in FSDataInputStream to make it more intuitive
> 
>
> Key: HDFS-5191
> URL: https://issues.apache.org/jira/browse/HDFS-5191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client, libhdfs
>Affects Versions: HDFS-4949
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>
> As per the discussion on HDFS-4953, we should revisit the zero-copy API to 
> make it more intuitive for new users.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

2013-09-11 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764428#comment-13764428
 ] 

Owen O'Malley commented on HDFS-4953:
-

{quote}
You can't know ahead of time whether your call to mmap will succeed. As I said, 
mmap can fail, for dozens of reasons. And of course blocks move over time. 
There is a fundamental "time of check, time of use" (TOCTOU) race condition in 
this kind of API.
{quote}

Ok, I guess I'm fine with the exception assuming the user passed in a null 
factory. It will be expensive in terms of time, but it won't affect the vast 
majority of users.

{quote}
is it necessary to read 200 MB at a time to decode the ORC file format?
{quote}

Actually, yes. The set of rows that are written together is large (typically 
~200MB) so that reading them is efficient. For a 100 column table, that means 
that you have all of the values for column 1 in the first ~2MB, followed by all 
of the values for column 2 in the next 2MB, etc. To read the first row, you 
need all 100 of the 2MB sections.

Obviously mmapping this is much more efficient, because the pages of the file 
can be brought in as needed.

{quote}
There is already a method named FSDataInputStream#read(ByteBuffer buf) in 
FSDataInputStream. If we create a new method named 
FSDataInputStream#readByteBuffer, I would expect there to be some confusion 
between the two. That's why I proposed FSDataInputStream#readZero for the new 
name. Does that make sense?
{quote}

I see your point, but readZero, which sounds like it just fills zeros into a 
byte buffer, doesn't convey the right meaning. The fundamental action that the 
user is taking is in fact read. I'd propose that we overload it with the other 
read and comment it saying that this read supports zero copy while the other 
doesn't.  How does this look?

{code}
/**
 * Read a byte buffer from the stream using zero copy if possible. Typically
 * the read will return maxLength bytes, but it may return fewer at the end of
 * the file system block or the end of the file.
 * @param factory a factory that creates ByteBuffers for the read if the
 *   region of the file can't be mmapped.
 * @param maxLength the maximum number of bytes that will be returned
 * @return a ByteBuffer with between 1 and maxLength bytes from the file. The
 *   buffer should be released to the stream when the user is done with it.
 */
public ByteBuffer read(ByteBufferFactory factory, int maxLength) throws IOException;
{code}

{quote}
I'd like to get some other prospective zero-copy API users to comment on 
whether they like the wrapper object or the DFSInputStream#releaseByteBuffer 
approach better...
{quote}

Uh, that is exactly what is happening. I'm a user who is trying to use this 
interface for a very typical use case of quickly reading bytes that may or may 
not be on the local machine. I also
care a lot about APIs and have been working on Hadoop for 7.75 years.

{quote}
If, instead of returning a ByteBuffer from the readByteBuffer call, we returned 
a ZeroBuffer object wrapping the ByteBuffer, we could simply call 
ZeroBuffer#close()
{quote}

Users don't want to make interfaces for reading from some Hadoop type named 
ZeroBuffer. The user wants a ByteBuffer because it is a standard Java type. To 
make this concrete and crystal clear, I have to make Hive and ORC work with 
both Hadoop 1.x and Hadoop 2.x. Therefore, if you use a non-standard type I 
need to wrap it in a shim. That sucks, especially if it is in the inner loop, 
which this absolutely would be. I *need* a ByteBuffer because I can make a shim 
that always returns a ByteBuffer, which works regardless of which version of 
Hadoop the user is using.

> enable HDFS local reads via mmap
> 
>
> Key: HDFS-4953
> URL: https://issues.apache.org/jira/browse/HDFS-4953
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: HDFS-4949
>
> Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
> HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
> HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums were disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that checksum has already been 
> verified.

[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

2013-09-10 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763526#comment-13763526
 ] 

Owen O'Malley commented on HDFS-4953:
-

{quote}
This seems like a fairly arbitrary requirement
{quote}

Actually, it is a standard best practice. Building exception objects is 
*expensive* and handling exceptions is error-prone.

{quote}
Unfortunately, zero copy being available is not a binary thing. It might be 
available sometimes, but not other times.
{quote}

I clearly stated in the previous comment it was about whether the zero copy was 
available AT THE CURRENT POINT IN THE STREAM.

{quote}
This is the point of the fallback buffer-- it will be used when a block 
boundary, non-local block, etc. prevents the request being fulfilled with an 
mmap. We've made this as efficient as it can be made.
{quote}

No, you haven't made it efficient at all. If the code is going to do a buffer 
copy and use the fallback buffer if it crosses a block, the performance will be 
dreadful. With my interface, you get no extra allocation, and no buffer copies.

The two reasonable strategies are:
* short read to cut it to a record boundary
* return multiple buffers

You can't make it easier than that, because the whole point of this API is to 
avoid buffer copies. You can't violate the goal of the interface because you 
don't like those choices.

{quote}
Well, what are the alternatives? We can't subclass java.nio.ByteBuffer, since 
all of its constructors are package-private.
{quote}

Agreed that you can't extend ByteBuffer. But other than close, you don't really 
need anything.

{quote}
There has to be a close method on whatever we return
{quote}

That is NOT a requirement. Especially when it conflicts with fundamental 
performance. As I wrote, adding a return method on the stream is fine.

{quote}
I'm sort of getting the impression that you plan on having super-huge buffers, 
which scares me.
{quote}

For ORC, I need to read typically 200 MB at a time. Obviously, if I don't need 
the whole range, I won't get it, but getting it in large sets of bytes is much 
much more efficient than lots of little reads.

{quote}
The factory API also gives me the impression that you plan on allocating a new 
buffer for each read, which would also be problematic.
{quote}

No, that is not the case. The point of the API is to avoid allocating the 
buffers if they aren't needed. The current API requires a buffer whether it is 
needed or not. Obviously the application will need a cache of the buffers to 
reuse, but the factory lets them write efficient code.

{quote}
If we have a "Factory" object, it needs to have not only a "get" method, but 
also a "put" method, where ByteBuffers are placed back when we're done with 
them. At that point, it becomes more like a cache. This might be a reasonable 
API, but I wonder if the additional complexity is worth it.
{quote}

Of course the implementations of the factory will have a release method. The 
question was just whether the FSDataInputStream needed to access the release 
method. If we add releaseByteBuffer, then we'd also need the release method on 
the factory.

Based on this, I'd propose:

{code}
/**
 * Is the current location of the stream available via zero copy?
 */
public boolean isZeroCopyAvailable();

/**
 * Read from the current location at least 1 and up to maxLength bytes. In most
 * situations, the returned buffer will contain maxLength bytes unless either:
 *  * the read crosses a block boundary and zero copy is being used
 *  * the stream has fewer than maxLength bytes left
 * The returned buffer will either be one that was created by the factory or a
 * MappedByteBuffer.
 */
public ByteBuffer readByteBuffer(ByteBufferFactory factory, int maxLength) throws IOException;

/**
 * Release a buffer that was returned from readByteBuffer. If the buffer was
 * created by the factory, it will be returned to the factory.
 */
public void releaseByteBuffer(ByteBufferFactory factory, ByteBuffer buffer);

/**
 * Allow the application to manage how ByteBuffers are created for fallback
 * buffers. Only buffers created by the factory will be released to it.
 */
public interface ByteBufferFactory {
  ByteBuffer createBuffer(int capacity);
  void releaseBuffer(ByteBuffer buffer);
}
{code}

This will allow applications to:
* determine whether zero copy is available for the next read
* the user can use the same read interface for all filesystems and files, using 
zero copy if available
* no extra buffer copies
* no bytebuffers are allocated if they are not needed
* applications have to deal with short reads, but only get a single byte buffer
* allow applications to create buffer managers that reuse buffers
* allow applications to control whether direct or byte[] byte buffers are used

The example code would look like:
{code}
FSDataInputStream in = fs.open(path);
in.seek(100*1024*1024);
List<ByteBuffer> buffers = new ArrayList<ByteBuffer>();
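// Hypothetical continuation, not part of the original comment: it only uses the
// readByteBuffer/releaseByteBuffer API proposed above, with MyByteBufferFactory
// standing in for an application-supplied factory.
ByteBufferFactory factory = new MyByteBufferFactory();
long remaining = 200L * 1024 * 1024;                 // e.g. one ORC stripe
while (remaining > 0) {
  ByteBuffer buf = in.readByteBuffer(factory, (int) remaining);
  remaining -= buf.remaining();                      // short reads are allowed
  buffers.add(buf);
}
// ... process the column streams out of 'buffers' ...
for (ByteBuffer buf : buffers) {
  in.releaseByteBuffer(factory, buf);                // mmapped or factory-created
}
{code}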

[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

2013-09-10 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763167#comment-13763167
 ] 

Owen O'Malley commented on HDFS-4953:
-

Thanks, Colin, for giving more details of the design.

Your new API is much better, but a few issues remain:
* If an application needs to determine whether zero copy is available, it 
should be able to do so without catching exceptions.
* What happens if the user reads across a block boundary? Most applications 
don't care about block boundaries and shouldn't have to add special code to cut 
their requests to block boundaries. That will impose inefficiencies.
* The cost of a second level of indirection (app -> ZeroCopy -> ByteBuffer) in 
the inner loop of the client seems prohibitive.
* Requiring pre-allocation of a fallback buffer that hopefully is never needed 
is really problematic. I'd propose that we flip this around to a factory.
* You either need to support short reads or return multiple bytebuffers. I 
don't see a way to avoid both unless applications are forced to never read 
across block boundaries. That would be much worse than either of the other 
options. I'd prefer to have multiple ByteBuffers returned, but if you hate that 
worse than short reads, I can handle that.
* It isn't clear to me how you plan to release mmapped buffers, since Java 
doesn't provide an API to do that. If you have a mechanism to do that, we need 
a releaseByteBuffer(ByteBuffer buffer) to release it.

I'd propose that we add the following to FSDataInputStream:
{code}
/**
 * Is the current location of the stream available via zero copy?
 */
public boolean isZeroCopyAvailable();

/**
 * Read from the current location at least 1 and up to maxLength bytes. In most
 * situations, the returned buffer will contain maxLength bytes unless either:
 *  * the read crosses a block boundary and zero copy is being used
 *  * the stream has fewer than maxLength bytes left
 * The returned buffer will either be one that was created by the factory or a
 * MappedByteBuffer.
 */
public ByteBuffer readByteBuffer(ByteBufferFactory factory, int maxLength)
    throws IOException;

/**
 * Allow application to manage how ByteBuffers are created for fallback buffers.
 */
public interface ByteBufferFactory {
  ByteBuffer createBuffer(int capacity);
}
{code}
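
A minimal usage sketch of this proposal, assuming a trivial heap-backed factory;
the amount read, the request size, and the process() helper are illustrative
assumptions rather than part of the proposal:
{code}
// Fallback factory: only used when a zero-copy (mmap) buffer is not available.
ByteBufferFactory factory = new ByteBufferFactory() {
  public ByteBuffer createBuffer(int capacity) {
    return ByteBuffer.allocate(capacity);
  }
};

FSDataInputStream in = fs.open(path);
// The availability check is a plain boolean, so no exception handling is
// needed to probe for the zero-copy path.
boolean zeroCopy = in.isZeroCopyAvailable();
long remaining = 64L * 1024 * 1024;          // illustrative amount to read
while (remaining > 0) {
  // readByteBuffer may return fewer than maxLength bytes (a short read),
  // for example at a block boundary, so the loop trusts buffer.remaining().
  ByteBuffer buffer = in.readByteBuffer(factory, (int) Math.min(remaining, 8 * 1024 * 1024));
  remaining -= buffer.remaining();
  process(buffer);                           // application-defined processing
}
{code}
Because this version of the proposal has no releaseByteBuffer, the sketch simply
drops each buffer after processing; a pooling factory would need the release
path added in the revision that introduces releaseByteBuffer.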

 

> enable HDFS local reads via mmap
> 
>
> Key: HDFS-4953
> URL: https://issues.apache.org/jira/browse/HDFS-4953
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: HDFS-4949
>
> Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
> HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
> HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums were disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that checksum has already been 
> verified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

2013-09-06 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760682#comment-13760682
 ] 

Owen O'Malley commented on HDFS-4953:
-

Colin, please read my suggestion and my analysis of the difference before 
commenting.

The simplified API absolutely provides a means of releasing the ByteBuffer, and
yet it is 2 lines long instead of 20. Furthermore, I didn't even realize that I
was supposed to close the zero-copy cursor, since it only came in via Closeable.

My complaint stands. The API as it currently exists in this branch is very
error-prone and difficult to explain. Using it is difficult and requires complex
handling, including exception handlers, to work with arbitrary file systems.

> enable HDFS local reads via mmap
> 
>
> Key: HDFS-4953
> URL: https://issues.apache.org/jira/browse/HDFS-4953
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: HDFS-4949
>
> Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
> HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
> HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums were disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that checksum has already been 
> verified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

