[jira] [Assigned] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-16917: Assignee: Ravindra Dingankar > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Assignee: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > Fix For: 3.3.0, 3.4.0 > > > Currently we have the following metrics for datanode reads. > |BytesRead|Total number of bytes read from DataNode| > |BlocksRead|Total number of blocks read from DataNode| > |TotalReadTime|Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16917. -- Fix Version/s: 3.4.0 Resolution: Fixed > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
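To make the proposed metric concrete, here is a minimal sketch built on the metrics2 MutableQuantiles API. The class name, metric name, and 60-second window are illustrative assumptions, not what the committed patch uses.

{code:java}
import org.apache.hadoop.metrics2.lib.MetricsRegistry;
import org.apache.hadoop.metrics2.lib.MutableQuantiles;

/** Sketch only: publish per-read transfer rate quantiles over a rolling window. */
public class ReadTransferRateSketch {
  private final MetricsRegistry registry = new MetricsRegistry("DataNodeReadSketch");
  // 60-second rolling window; the real patch may use different names and windows.
  private final MutableQuantiles readTransferRate = registry.newQuantiles(
      "readTransferRate60s", "Read transfer rate", "ops", "bytesPerSecond", 60);

  /** Record one sample after a block read completes. */
  public void addSample(long bytesRead, long durationMs) {
    if (durationMs > 0) {
      readTransferRate.add(bytesRead * 1000L / durationMs); // bytes per second
    }
  }
}
{code}

MutableQuantiles publishes its default quantile set over the rolling window, which is what makes per-host hotspot and slow-node detection possible.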
[jira] [Updated] (HDFS-16890) RBF: Add periodic state refresh to keep router state near active namenode's
[ https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-16890: - Fix Version/s: (was: 3.3.6) > RBF: Add period state refresh to keep router state near active namenode's > - > > Key: HDFS-16890 > URL: https://issues.apache.org/jira/browse/HDFS-16890 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > When using the ObserverReadProxyProvider, clients can set > *dfs.client.failover.observer.auto-msync-period...* to periodically get the > Active namenode's state. When using routers without the > ObserverReadProxyProvider, this periodic update is lost. > In a busy cluster, the Router constantly gets updated with the active > namenode's state when > # There is a write operation. > # There is an operation (read/write) from a new clients. > However, in the scenario when there are no new clients and no write > operations, the state kept in the router can lag behind the active's. The > router does update its state with responses from the Observer, but the > observer may be lagging behind too. > We should have a periodic refresh in the router to serve a similar role as > *dfs.client.failover.observer.auto-msync-period* -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16890) RBF: Add periodic state refresh to keep router state near active namenode's
[ https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16890. -- Fix Version/s: 3.4.0 3.3.6 Resolution: Fixed > RBF: Add period state refresh to keep router state near active namenode's > - > > Key: HDFS-16890 > URL: https://issues.apache.org/jira/browse/HDFS-16890 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.6 > > > When using the ObserverReadProxyProvider, clients can set > *dfs.client.failover.observer.auto-msync-period...* to periodically get the > Active namenode's state. When using routers without the > ObserverReadProxyProvider, this periodic update is lost. > In a busy cluster, the Router constantly gets updated with the active > namenode's state when > # There is a write operation. > # There is an operation (read/write) from a new clients. > However, in the scenario when there are no new clients and no write > operations, the state kept in the router can lag behind the active's. The > router does update its state with responses from the Observer, but the > observer may be lagging behind too. > We should have a periodic refresh in the router to serve a similar role as > *dfs.client.failover.observer.auto-msync-period* -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
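The ask above amounts to a scheduled task on the router; a minimal sketch, where the refresh body and period are placeholders rather than the actual implementation:

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Sketch only: periodically refresh the router's view of the active namenode state. */
public class PeriodicStateRefresher {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  /**
   * refreshTask stands in for "ask the active namenode for its latest stateId
   * (an msync-equivalent) and update the router's alignment context".
   */
  public void start(Runnable refreshTask, long periodMs) {
    scheduler.scheduleWithFixedDelay(refreshTask, periodMs, periodMs, TimeUnit.MILLISECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
{code}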
[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context
[ https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692793#comment-17692793 ] Owen O'Malley commented on HDFS-16901: -- My trial backport is here - https://github.com/omalley/hadoop/tree/HDFS-16901-3.3 > RBF: Routers should propagate the real user in the UGI via the caller context > - > > Key: HDFS-16901 > URL: https://issues.apache.org/jira/browse/HDFS-16901 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.6 > > > If the router receives an operation from a proxyUser, it drops the realUser > in the UGI and makes the routerUser the realUser for the operation that goes > to the namenode. > In the namenode UGI logs, we'd like the ability to know the original realUser. > The router should propagate the realUser from the client call as part of the > callerContext. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context
[ https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692786#comment-17692786 ] Owen O'Malley commented on HDFS-16901: -- Simba, when I backport this to branch-3.3 I get a test failure. Basically the new test has 'oomalley' as the login user, but the log is using testRealUser. {code:java} 2023-02-22 17:03:36,169 [IPC Server handler 5 on default port 49453] INFO FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8574)) - allowed=true ugi=testProxyUser (auth:PROXY) via testRealUser (auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null proto=rpc callerContext=clientIp:172.25.204.192,clientPort:49519,realUser:testRealUser {code} What the test is looking for is: {code:java} ugi=testProxyUser (auth:PROXY) via oomalley (auth:SIMPLE){code} The test works correctly on trunk. > RBF: Routers should propagate the real user in the UGI via the caller context > - > > Key: HDFS-16901 > URL: https://issues.apache.org/jira/browse/HDFS-16901 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.6 > > > If the router receives an operation from a proxyUser, it drops the realUser > in the UGI and makes the routerUser the realUser for the operation that goes > to the namenode. > In the namenode UGI logs, we'd like the ability to know the original realUser. > The router should propagate the realUser from the client call as part of the > callerContext. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context
[ https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16901. -- Fix Version/s: 3.4.0 3.3.6 Resolution: Fixed Thanks, Simba! > RBF: Routers should propagate the real user in the UGI via the caller context > - > > Key: HDFS-16901 > URL: https://issues.apache.org/jira/browse/HDFS-16901 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.6 > > > If the router receives an operation from a proxyUser, it drops the realUser > in the UGI and makes the routerUser the realUser for the operation that goes > to the namenode. > In the namenode UGI logs, we'd like the ability to know the original realUser. > The router should propagate the realUser from the client call as part of the > callerContext. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
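The mechanism is simply to append a realUser field to the RPC CallerContext before the router forwards the call. A sketch follows; the helper class and its wiring are hypothetical, while the CallerContext and UserGroupInformation calls are existing APIs.

{code:java}
import org.apache.hadoop.ipc.CallerContext;
import org.apache.hadoop.security.UserGroupInformation;

/** Sketch only: tag the caller context with the original realUser of a proxy-user call. */
public final class RealUserContext {
  private RealUserContext() {}

  public static void tagRealUser(UserGroupInformation callerUgi) {
    UserGroupInformation realUser = callerUgi.getRealUser();
    if (realUser == null) {
      return; // not a proxy-user call, nothing to propagate
    }
    CallerContext current = CallerContext.getCurrent();
    String prefix = (current == null) ? "" : current.getContext() + ",";
    CallerContext.setCurrent(
        new CallerContext.Builder(prefix + "realUser:" + realUser.getShortUserName()).build());
  }
}
{code}

This is the realUser field visible in the callerContext of the audit line quoted in the comments above.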
[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324
[ https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686083#comment-17686083 ] Owen O'Malley commented on HDFS-16853: -- This PR is a much simpler solution and shouldn't have any race conditions. https://github.com/apache/hadoop/pull/5371 > The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed > because HADOOP-18324 > --- > > Key: HDFS-16853 > URL: https://issues.apache.org/jira/browse/HDFS-16853 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.5 >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Blocker > Labels: pull-request-available > > The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed > with error message: Waiting for cluster to become active. And the blocking > jstack as bellows: > {code:java} > "BP-1618793397-192.168.3.4-1669198559828 heartbeating to > localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x > 7fc1108fa000 nid=0x19303 waiting on condition [0x700017884000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0007430a9ec0> (a > java.util.concurrent.SynchronousQueue$TransferQueue) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762) > at > java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695) > at > java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877) > at > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186) > at org.apache.hadoop.ipc.Client.call(Client.java:1482) > at org.apache.hadoop.ipc.Client.call(Client.java:1429) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139) > at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClient > SideTranslatorPB.java:168) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915) > at java.lang.Thread.run(Thread.java:748) {code} > After looking into the code and found that this bug is imported by > HADOOP-18324. Because RpcRequestSender exited without cleaning up the > rpcRequestQueue, then caused BPServiceActor was blocked in sending request. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324
[ https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685487#comment-17685487 ] Owen O'Malley commented on HDFS-16853: -- The description is wrong. The SychronousQueue has no storage and thus doesn't need to be cleaned up. The problem is that between the check at the top of sendRpcRequest and when it offers the serialized bytes the other thread was stopped. Unfortunately, just making sendRpcRequest synchronous, which would fix the race condition, wouldn't be ok because we can't hold the lock while we wait for our turn in the queue. The proposed fix doesn't fix the race condition because it releases the lock before putting the message in the queue. Let me look at what we can do. > The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed > because HADOOP-18324 > --- > > Key: HDFS-16853 > URL: https://issues.apache.org/jira/browse/HDFS-16853 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.5 >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Blocker > Labels: pull-request-available > > The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed > with error message: Waiting for cluster to become active. And the blocking > jstack as bellows: > {code:java} > "BP-1618793397-192.168.3.4-1669198559828 heartbeating to > localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x > 7fc1108fa000 nid=0x19303 waiting on condition [0x700017884000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0007430a9ec0> (a > java.util.concurrent.SynchronousQueue$TransferQueue) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762) > at > java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695) > at > java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877) > at > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186) > at org.apache.hadoop.ipc.Client.call(Client.java:1482) > at org.apache.hadoop.ipc.Client.call(Client.java:1429) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139) > at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClient > SideTranslatorPB.java:168) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915) > at java.lang.Thread.run(Thread.java:748) {code} > After looking into the code and found that this bug is imported by > HADOOP-18324. Because RpcRequestSender exited without cleaning up the > rpcRequestQueue, then caused BPServiceActor was blocked in sending request. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
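To make the race concrete, here is a toy illustration of the failure mode (not the Hadoop IPC code): the check and the put are not atomic, and a SynchronousQueue has no capacity, so a producer that passes the check just as the consumer exits parks forever, which is the parked heartbeat thread in the jstack above.

{code:java}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy reproduction of a check-then-put race on a SynchronousQueue. */
public class CheckThenPutRace {
  private final SynchronousQueue<byte[]> queue = new SynchronousQueue<>();
  private final AtomicBoolean senderRunning = new AtomicBoolean(true);

  void send(byte[] rpcRequest) throws InterruptedException {
    if (!senderRunning.get()) {
      throw new IllegalStateException("connection closed");
    }
    // The consumer thread may exit right here, after the check above...
    queue.put(rpcRequest); // ...and then this blocks forever: nobody will ever take().
  }

  void stopConsumer() {
    senderRunning.set(false); // does not wake a producer already parked in put()
  }
}
{code}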
[jira] [Resolved] (HDFS-16895) NamenodeHeartbeatService should use credentials of logged in user
[ https://issues.apache.org/jira/browse/HDFS-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16895. -- Fix Version/s: 3.4.0 3.3.5 Assignee: Hector Sandoval Chaverri Resolution: Fixed > NamenodeHeartbeatService should use credentials of logged in user > - > > Key: HDFS-16895 > URL: https://issues.apache.org/jira/browse/HDFS-16895 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Reporter: Hector Sandoval Chaverri >Assignee: Hector Sandoval Chaverri >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > NamenodeHeartbeatService has been found to log the errors when querying > protected Namenode JMX APIs. We have been able to work around this by running > kinit with the DFS_ROUTER_KEYTAB_FILE_KEY and > DFS_ROUTER_KERBEROS_PRINCIPAL_KEY on the router. > While investigating a solution, we found that doing the request as part of a > UserGroupInformation.getLoginUser.doAs() call doesn't require to kinit before. > The error logged is: > {noformat} > 2022-08-16 21:35:00,265 ERROR > org.apache.hadoop.hdfs.server.federation.router.FederationUtil: Cannot parse > JMX output for Hadoop:service=NameNode,name=FSNamesystem* from server > ltx1-yugiohnn03-ha1.grid.linkedin.com:50070 > org.apache.hadoop.security.authentication.client.AuthenticationException: > Error while authenticating with endpoint: > http://ltx1-yugiohnn03-ha1.grid.linkedin.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem* > at sun.reflect.GeneratedConstructorAccessor55.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:219) > at > org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:350) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:186) > at > org.apache.hadoop.hdfs.server.federation.router.FederationUtil.getJmx(FederationUtil.java:82) > at > org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateJMXParameters(NamenodeHeartbeatService.java:352) > at > org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.getNamenodeStatusReport(NamenodeHeartbeatService.java:295) > at > org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:218) > at > org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.periodicInvoke(NamenodeHeartbeatService.java:172) > at > org.apache.hadoop.hdfs.server.federation.router.PeriodicService$1.run(PeriodicService.java:178) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at 
java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.hadoop.security.authentication.client.AuthenticationException: > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:360) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:204) > ... 15 more > Caused by: GSSException: No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt) > at > sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) > at > sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) > at > sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) > at > sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) > at > sun.security.jgss.GSSConte
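The workaround described in the report boils down to wrapping the JMX fetch in a doAs of the login user. A minimal sketch, where fetchJmx stands in for the real authenticated HTTP call:

{code:java}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

/** Sketch only: run the protected JMX query with the login user's Kerberos credentials. */
public class LoginUserJmxFetcher {

  public String fetchJmxAsLoginUser(final String jmxUrl)
      throws IOException, InterruptedException {
    return UserGroupInformation.getLoginUser().doAs(
        (PrivilegedExceptionAction<String>) () -> fetchJmx(jmxUrl));
  }

  private String fetchJmx(String jmxUrl) {
    // Placeholder: the real code opens the URL via URLConnectionFactory and parses the JSON.
    return "";
  }
}
{code}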
[jira] [Resolved] (HDFS-16886) Fix documentation for StateStoreRecordOperations#get(Class ..., Query ...)
[ https://issues.apache.org/jira/browse/HDFS-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16886. -- Fix Version/s: 3.4.0 3.3.5 Resolution: Fixed > Fix documentation for StateStoreRecordOperations#get(Class ..., Query ...) > -- > > Key: HDFS-16886 > URL: https://issues.apache.org/jira/browse/HDFS-16886 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > For {*}StateStoreRecordOperations#get(Class ..., Query ...){*}, when multiple > records match, the documentation says a null value should be returned and an > IOException should be thrown. Both can't happen. > I believe the intended behavior is that an IOException is thrown. This is the > implementation in {*}StateStoreBaseImpl{*}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
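The corrected contract reads roughly as follows; the wording and signature here are illustrative rather than the committed javadoc:

{code:java}
/**
 * Get a single record matching the query.
 *
 * @return the matching record, or null if no record matches
 * @throws IOException if the query matches more than one record
 */
<T extends BaseRecord> T get(Class<T> clazz, Query<T> query) throws IOException;
{code}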
[jira] [Resolved] (HDFS-16877) Namenode doesn't use alignment context in TestObserverWithRouter
[ https://issues.apache.org/jira/browse/HDFS-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16877. -- Fix Version/s: 3.4.0 Assignee: Simbarashe Dzinamarira Resolution: Fixed I've committed this. Thanks, Simba! > Namenode doesn't use alignment context in TestObserverWithRouter > > > Key: HDFS-16877 > URL: https://issues.apache.org/jira/browse/HDFS-16877 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, rbf >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > We need to set "{*}dfs.namenode.state.context.enabled{*}" to true for the > namenode to send it's stateId in client responses. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
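For reference, the fix amounts to enabling a single flag in the configuration handed to the test cluster's namenodes; a sketch:

{code:java}
import org.apache.hadoop.conf.Configuration;

/** Sketch only: namenodes attach their stateId to responses only when this flag is on. */
public final class EnableStateContext {
  private EnableStateContext() {}

  public static Configuration namenodeConf() {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.namenode.state.context.enabled", true);
    return conf; // pass this to the mini cluster used by TestObserverWithRouter
  }
}
{code}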
[jira] [Resolved] (HDFS-16851) RBF: Add a utility to dump the StateStore
[ https://issues.apache.org/jira/browse/HDFS-16851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16851. -- Fix Version/s: 3.3.6 3.4.0 Resolution: Fixed > RBF: Add a utility to dump the StateStore > - > > Key: HDFS-16851 > URL: https://issues.apache.org/jira/browse/HDFS-16851 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 3.3.6, 3.4.0 > > > It would be useful to have a utility to dump the StateStore for RBF. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16847) RBF: StateStore writer should not commit tmp file if there was an error in writing the file.
[ https://issues.apache.org/jira/browse/HDFS-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16847. -- Fix Version/s: 3.4.0 3.3.5 Resolution: Fixed I committed this. Thanks, Simba! > RBF: StateStore writer should not commit tmp fail if there was an error in > writing the file. > > > Key: HDFS-16847 > URL: https://issues.apache.org/jira/browse/HDFS-16847 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, rbf >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > The file based implementation of the RBF state store has a commit step that > moves a temporary file to a permanent location. > There is a check to see if the write of the temp file was successfully, > however, the code to commit doesn't check the success flag. > This is the relevant code: > [https://github.com/apache/hadoop/blob/7d39abd799a5f801a9fd07868a193205ab500bfa/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreFileBaseImpl.java#L369] > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
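The intended behavior is simple to state in code: promote the temporary file only when the write succeeded. A sketch with illustrative names, not the patch itself:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch only: never rename a temp file into place if writing it failed. */
public class CommitIfWritten {

  public boolean commit(FileSystem fs, Path tmp, Path destination, boolean writeSucceeded)
      throws IOException {
    if (!writeSucceeded) {
      fs.delete(tmp, false);            // drop the partial record
      return false;
    }
    return fs.rename(tmp, destination); // commit only a fully written file
  }
}
{code}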
[jira] [Resolved] (HDFS-16845) Add configuration flag to enable observer reads on routers without using ObserverReadProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16845. -- Fix Version/s: 3.4.0 3.3.5 2.10.3 Resolution: Fixed I just committed this. Thanks, Simba! > Add configuration flag to enable observer reads on routers without using > ObserverReadProxyProvider > -- > > Key: HDFS-16845 > URL: https://issues.apache.org/jira/browse/HDFS-16845 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5, 2.10.3 > > > In order for clients to have routers forward their reads to observers, the > clients must use a proxy with an alignment context. This is currently > achieved by using the ObserverReadProxyProvider. > Using ObserverReadProxyProvider allows backward compatible for client > configurations. > However, the ObserverReadProxyProvider forces an msync on initialization > which is not required with routers. > Performing msync calls is more expensive with routers because the router fans > out the cal to all namespaces, so we'd like to avoid this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16856) [RBF] Refactor router admin command to use HDFS AdminHelper class
Owen O'Malley created HDFS-16856: Summary: [RBF] Refactor router admin command to use HDFS AdminHelper class Key: HDFS-16856 URL: https://issues.apache.org/jira/browse/HDFS-16856 Project: Hadoop HDFS Issue Type: Improvement Components: rbf Reporter: Owen O'Malley Assignee: Owen O'Malley Currently, the router admin class is a bit of a mess with a lot of custom programming. We should use the infrastructure that was developed in the AdminHelper class to standardize the command processing. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16851) [RBF] Utility to textually dump the StateStore
Owen O'Malley created HDFS-16851: Summary: [RBF] Utility to textually dump the StateStore Key: HDFS-16851 URL: https://issues.apache.org/jira/browse/HDFS-16851 Project: Hadoop HDFS Issue Type: Improvement Components: rbf Reporter: Owen O'Malley Assignee: Owen O'Malley It would be useful to have a utility to dump the StateStore for RBF. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16844) [RBF] The routers should be resilient against exceptions from StateStore
[ https://issues.apache.org/jira/browse/HDFS-16844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16844. -- Fix Version/s: 3.4.0 3.3.5 Resolution: Fixed > [RBF] The routers should be resiliant against exceptions from StateStore > > > Key: HDFS-16844 > URL: https://issues.apache.org/jira/browse/HDFS-16844 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.3.4 >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > Currently, a single exception from the StateStore will cripple a router by > clearing the caches before the replacement is loaded. Since the routers have > the information in an in-memory cache, it is better to keep running. There is > still the timeout that will push the router into safe-mode if it can't load > the state store over a longer period of time. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16843) [RBF] The routers should be resilient against exceptions from StateStore
[ https://issues.apache.org/jira/browse/HDFS-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16843. -- Resolution: Duplicate > [RBF] The routers should be resiliant against exceptions from StateStore > > > Key: HDFS-16843 > URL: https://issues.apache.org/jira/browse/HDFS-16843 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.3.4 >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > Currently, a single exception from the StateStore will cripple a router by > clearing the caches before the replacement is loaded. Since the routers have > the information in an in-memory cache, it is better to keep running. There is > still the timeout that will push the router into safe-mode if it can't load > the state store over a longer period of time. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16844) [RBF] The routers should be resilient against exceptions from StateStore
Owen O'Malley created HDFS-16844: Summary: [RBF] The routers should be resiliant against exceptions from StateStore Key: HDFS-16844 URL: https://issues.apache.org/jira/browse/HDFS-16844 Project: Hadoop HDFS Issue Type: Improvement Components: rbf Affects Versions: 3.3.4 Reporter: Owen O'Malley Assignee: Owen O'Malley Currently, a single exception from the StateStore will cripple a router by clearing the caches before the replacement is loaded. Since the routers have the information in an in-memory cache, it is better to keep running. There is still the timeout that will push the router into safe-mode if it can't load the state store over a longer period of time. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16843) [RBF] The routers should be resilient against exceptions from StateStore
Owen O'Malley created HDFS-16843: Summary: [RBF] The routers should be resiliant against exceptions from StateStore Key: HDFS-16843 URL: https://issues.apache.org/jira/browse/HDFS-16843 Project: Hadoop HDFS Issue Type: Improvement Components: rbf Affects Versions: 3.3.4 Reporter: Owen O'Malley Assignee: Owen O'Malley Currently, a single exception from the StateStore will cripple a router by clearing the caches before the replacement is loaded. Since the routers have the information in an in-memory cache, it is better to keep running. There is still the timeout that will push the router into safe-mode if it can't load the state store over a longer period of time. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
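The requested behavior is essentially "keep the last good cache when a refresh fails". An illustrative sketch with toy types, not the router code:

{code:java}
import java.io.IOException;
import java.util.Collections;
import java.util.List;

/** Sketch only: a cache that survives StateStore read failures by keeping stale entries. */
public class ResilientCache<T> {
  private volatile List<T> records = Collections.emptyList();

  /** Placeholder for whatever actually reads records from the StateStore. */
  public interface StateStoreReader<T> {
    List<T> loadAll() throws IOException;
  }

  public void refresh(StateStoreReader<T> store) {
    try {
      records = store.loadAll();   // swap in the new snapshot only on success
    } catch (IOException e) {
      // Keep the previous cache; a separate staleness timeout can still
      // push the router into safe mode if refreshes keep failing.
    }
  }

  public List<T> get() {
    return records;
  }
}
{code}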
[jira] [Resolved] (HDFS-16836) StandbyCheckpointer can still trigger rollback fs image after RU is finalized
[ https://issues.apache.org/jira/browse/HDFS-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16836. -- Fix Version/s: 3.4.0 3.3.5 Resolution: Fixed I just committed this. Thanks, Lei! > StandbyCheckpointer can still trigger rollback fs image after RU is finalized > - > > Key: HDFS-16836 > URL: https://issues.apache.org/jira/browse/HDFS-16836 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Lei Yang >Assignee: Lei Yang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > StandbyCheckpointer trigger rollback fsimage when RU is started. > When ru is started, a flag (needRollbackImage) was set to true during edit > log replay. > And it only gets reset to false when doCheckpoint() succeeded. > Think about following scenario: > # Start RU, needRollbackImage is set to true. > # doCheckpoint() failed. > # RU is finalized. > # namesystem.getFSImage().hasRollbackFSImage() is always false since > rollback image cannot be generated once RU is over. > # needRollbackImage was never set to false. > # Checkpoints threshold(1m txns) and period(1hr) are not honored. > {code:java} > StandbyCheckpointer: > void doWork() { > > doCheckpoint(); > // reset needRollbackCheckpoint to false only when we finish a ckpt > // for rollback image > if (needRollbackCheckpoint > && namesystem.getFSImage().hasRollbackFSImage()) { > namesystem.setCreatedRollbackImages(true); > namesystem.setNeedRollbackFsImage(false); > } > lastCheckpointTime = now; > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16836) StandbyCheckpointer can still trigger rollback fs image after RU is finalized
[ https://issues.apache.org/jira/browse/HDFS-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-16836: Assignee: Lei Yang > StandbyCheckpointer can still trigger rollback fs image after RU is finalized > - > > Key: HDFS-16836 > URL: https://issues.apache.org/jira/browse/HDFS-16836 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Lei Yang >Assignee: Lei Yang >Priority: Major > Labels: pull-request-available > > StandbyCheckpointer trigger rollback fsimage when RU is started. > When ru is started, a flag (needRollbackImage) was set to true during edit > log replay. > And it only gets reset to false when doCheckpoint() succeeded. > Think about following scenario: > # Start RU, needRollbackImage is set to true. > # doCheckpoint() failed. > # RU is finalized. > # namesystem.getFSImage().hasRollbackFSImage() is always false since > rollback image cannot be generated once RU is over. > # needRollbackImage was never set to false. > # Checkpoints threshold(1m txns) and period(1hr) are not honored. > {code:java} > StandbyCheckpointer: > void doWork() { > > doCheckpoint(); > // reset needRollbackCheckpoint to false only when we finish a ckpt > // for rollback image > if (needRollbackCheckpoint > && namesystem.getFSImage().hasRollbackFSImage()) { > namesystem.setCreatedRollbackImages(true); > namesystem.setNeedRollbackFsImage(false); > } > lastCheckpointTime = now; > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
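One way to picture the fix is a guard that stops waiting for a rollback image once the rolling upgrade is no longer in progress, so the normal checkpoint threshold and period apply again. This is a sketch only; see the merged change for what was actually committed.

{code:java}
// Sketch only, not the committed change: inside StandbyCheckpointer#doWork()
if (needRollbackCheckpoint) {
  if (namesystem.getFSImage().hasRollbackFSImage()) {
    namesystem.setCreatedRollbackImages(true);
    namesystem.setNeedRollbackFsImage(false);
  } else if (!namesystem.isRollingUpgrade()) {
    // The RU was finalized without a rollback image ever being produced;
    // clear the flag so regular checkpointing resumes.
    namesystem.setNeedRollbackFsImage(false);
  }
}
{code}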
[jira] [Created] (HDFS-16778) Separate out the logger for which DN is picked by a DFSInputStream
Owen O'Malley created HDFS-16778: Summary: Separate out the logger for which DN is picked by a DFSInputStream Key: HDFS-16778 URL: https://issues.apache.org/jira/browse/HDFS-16778 Project: Hadoop HDFS Issue Type: Improvement Reporter: Owen O'Malley Currently, there is no way to know which DN a given stream chose without turning on debug for all of DFSClient. I'd like the ability to just get that logged. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16767) RBF: Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-16767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16767. -- Fix Version/s: 3.4.0 3.3.9 Resolution: Fixed I just committed this. Thanks, Simba! > RBF: Support observer node from Router-Based Federation > > > Key: HDFS-16767 > URL: https://issues.apache.org/jira/browse/HDFS-16767 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > Enable routers to direct read calls to observer namenodes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13522) RBF: Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-13522: - Fix Version/s: 3.3.9 > RBF: Support observer node from Router-Based Federation > --- > > Key: HDFS-13522 > URL: https://issues.apache.org/jira/browse/HDFS-13522 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation, namenode >Reporter: Erik Krogen >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, > HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC > clogging.png, ShortTerm-Routers+Observer.png, > observer_reads_in_rbf_proposal_simbadzina_v1.pdf, > observer_reads_in_rbf_proposal_simbadzina_v2.pdf > > Time Spent: 20h 50m > Remaining Estimate: 0h > > Changes will need to occur to the router to support the new observer node. > One such change will be to make the router understand the observer state, > e.g. {{FederationNamenodeServiceState}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13522) RBF: Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-13522: - Fix Version/s: 3.4.0 Resolution: Fixed Status: Resolved (was: Patch Available) I committed this. Thanks, Simba! > RBF: Support observer node from Router-Based Federation > --- > > Key: HDFS-13522 > URL: https://issues.apache.org/jira/browse/HDFS-13522 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation, namenode >Reporter: Erik Krogen >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, > HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC > clogging.png, ShortTerm-Routers+Observer.png, > observer_reads_in_rbf_proposal_simbadzina_v1.pdf, > observer_reads_in_rbf_proposal_simbadzina_v2.pdf > > Time Spent: 20h 50m > Remaining Estimate: 0h > > Changes will need to occur to the router to support the new observer node. > One such change will be to make the router understand the observer state, > e.g. {{FederationNamenodeServiceState}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13522) RBF: Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557699#comment-17557699 ] Owen O'Malley commented on HDFS-13522: -- I'm concerned about additional msyncs on every call from RBF. It will radically increase the rpc load on the active NN. I'd propose that we add a new field in the client protocol that tracks the state of all of the namespaces that a given client has used. The flow would look like: client -> router: {} // no state router -> nn: msync // get the state nn -> router: state 1000 router -> client: \{ns1: 1000} client -> router: \{ns1: 1000} router -> observer: state 1000 The client just gives back the state it was given. This gracefully handles fail overs between routers and avoids additional msyncs. > RBF: Support observer node from Router-Based Federation > --- > > Key: HDFS-13522 > URL: https://issues.apache.org/jira/browse/HDFS-13522 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation, namenode >Reporter: Erik Krogen >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, > HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC > clogging.png, ShortTerm-Routers+Observer.png > > Time Spent: 15h 20m > Remaining Estimate: 0h > > Changes will need to occur to the router to support the new observer node. > One such change will be to make the router understand the observer state, > e.g. {{FederationNamenodeServiceState}}. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
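A sketch of the client-side bookkeeping this proposal implies; the types and method names are invented here for illustration (see HDFS-16767 for what was eventually committed). The client remembers the newest stateId it has seen per namespace and echoes that map on every call, so any router can route reads without an extra msync.

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: track the last stateId seen per namespace and echo it back on each call. */
public class FederatedStateTracker {
  private final ConcurrentHashMap<String, Long> lastSeen = new ConcurrentHashMap<>();

  /** Called when a response carries {nameservice -> stateId}, e.g. {ns1: 1000}. */
  public void update(String nameservice, long stateId) {
    lastSeen.merge(nameservice, stateId, Math::max); // never move backwards
  }

  /** Attached to the next request so any router can pick a caught-up observer. */
  public Map<String, Long> snapshot() {
    return Collections.unmodifiableMap(new HashMap<>(lastSeen));
  }
}
{code}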
[jira] [Resolved] (HDFS-16518) KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager
[ https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16518. -- Fix Version/s: 3.4.0 2.10.2 3.3.3 Resolution: Fixed I committed this. Thanks, Lei! > KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager > - > > Key: HDFS-16518 > URL: https://issues.apache.org/jira/browse/HDFS-16518 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.10.0 >Reporter: Lei Yang >Assignee: Lei Yang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 2.10.2, 3.3.3 > > Time Spent: 2h > Remaining Estimate: 0h > > KeyProvider implements Closable interface but some custom implementation of > KeyProvider also needs explicit close in KeyProviderCache. An example is to > use custom KeyProvider in DFSClient to integrate read encrypted file on HDFS. > KeyProvider currently gets closed in KeyProviderCache only when cache entry > is expired or invalidated. In some cases, this is not happening. This seems > related to guava cache. > This patch is to use hadoop JVM shutdownhookManager to globally cleanup cache > entries and thus close KeyProvider using cache hook right after filesystem > instance gets closed in a deterministic way. > {code:java} > Class KeyProviderCache > ... > public KeyProviderCache(long expiryMs) { > cache = CacheBuilder.newBuilder() > .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS) > .removalListener(new RemovalListener() { > @Override > public void onRemoval( > @Nonnull RemovalNotification notification) { > try { > assert notification.getValue() != null; > notification.getValue().close(); > } catch (Throwable e) { > LOG.error( > "Error closing KeyProvider with uri [" > + notification.getKey() + "]", e); > } > } > }) > .build(); > }{code} > We could have made a new function KeyProviderCache#close, have each DFSClient > call this function and close KeyProvider at the end of each DFSClient#close > call but it will expose another problem to potentially close global cache > among different DFSClient instances. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
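The core of the change is small: invalidate the cache from a Hadoop shutdown hook so the Guava removal listener runs and closes every cached KeyProvider exactly once. A sketch of the registration; the helper class and priority value are illustrative.

{code:java}
import com.google.common.cache.Cache;
import org.apache.hadoop.util.ShutdownHookManager;

/** Sketch only: close all cached KeyProviders deterministically at JVM shutdown. */
public final class KeyProviderCacheShutdown {
  private KeyProviderCacheShutdown() {}

  public static void register(final Cache<?, ?> cache, int priority) {
    ShutdownHookManager.get().addShutdownHook(
        cache::invalidateAll,  // fires onRemoval(), which calls KeyProvider#close()
        priority);
  }
}
{code}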
[jira] [Assigned] (HDFS-16518) KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager
[ https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-16518: Assignee: Lei Yang > KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager > - > > Key: HDFS-16518 > URL: https://issues.apache.org/jira/browse/HDFS-16518 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.10.0 >Reporter: Lei Yang >Assignee: Lei Yang >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > KeyProvider implements Closable interface but some custom implementation of > KeyProvider also needs explicit close in KeyProviderCache. An example is to > use custom KeyProvider in DFSClient to integrate read encrypted file on HDFS. > KeyProvider currently gets closed in KeyProviderCache only when cache entry > is expired or invalidated. In some cases, this is not happening. This seems > related to guava cache. > This patch is to use hadoop JVM shutdownhookManager to globally cleanup cache > entries and thus close KeyProvider using cache hook right after filesystem > instance gets closed in a deterministic way. > {code:java} > Class KeyProviderCache > ... > public KeyProviderCache(long expiryMs) { > cache = CacheBuilder.newBuilder() > .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS) > .removalListener(new RemovalListener() { > @Override > public void onRemoval( > @Nonnull RemovalNotification notification) { > try { > assert notification.getValue() != null; > notification.getValue().close(); > } catch (Throwable e) { > LOG.error( > "Error closing KeyProvider with uri [" > + notification.getKey() + "]", e); > } > } > }) > .build(); > }{code} > We could have made a new function KeyProviderCache#close, have each DFSClient > call this function and close KeyProvider at the end of each DFSClient#close > call but it will expose another problem to potentially close global cache > among different DFSClient instances. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16518) Cached KeyProvider in KeyProviderCache should be closed with ShutdownHookManager
[ https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512012#comment-17512012 ] Owen O'Malley commented on HDFS-16518: -- I don't understand why this is required. Obviously at jvm shutdown the cache will be discarded. The order of shutdown hooks isn't deterministic, so using this isn't a fix against other shutdown hooks using the cache. Is there some other call to KeyProvider.close() that this should replace? > Cached KeyProvider in KeyProviderCache should be closed with > ShutdownHookManager > > > Key: HDFS-16518 > URL: https://issues.apache.org/jira/browse/HDFS-16518 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.10.0 >Reporter: Lei Yang >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > We need to make sure the underlying KeyProvider used by multiple DFSClient > instances is closed at one shot during jvm shutdown. Within the shutdownhook, > we invalidate the cache and make sure they are all closed. The cache has a > removeListener hook which is called when cache entry is invalidated. > {code:java} > Class KeyProviderCache > ... > public KeyProviderCache(long expiryMs) { > cache = CacheBuilder.newBuilder() > .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS) > .removalListener(new RemovalListener() { > @Override > public void onRemoval( > @Nonnull RemovalNotification notification) { > try { > assert notification.getValue() != null; > notification.getValue().close(); > } catch (Throwable e) { > LOG.error( > "Error closing KeyProvider with uri [" > + notification.getKey() + "]", e); > } > } > }) > .build(); > }{code} > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16518) Cached KeyProvider in KeyProviderCache should be closed with ShutdownHookManager
[ https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-16518: Assignee: (was: Lei Xu) > Cached KeyProvider in KeyProviderCache should be closed with > ShutdownHookManager > > > Key: HDFS-16518 > URL: https://issues.apache.org/jira/browse/HDFS-16518 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.10.0 >Reporter: Lei Yang >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > We need to make sure the underlying KeyProvider used by multiple DFSClient > instances is closed at one shot during jvm shutdown. Within the shutdownhook, > we invalidate the cache and make sure they are all closed. The cache has a > removeListener hook which is called when cache entry is invalidated. > {code:java} > Class KeyProviderCache > ... > public KeyProviderCache(long expiryMs) { > cache = CacheBuilder.newBuilder() > .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS) > .removalListener(new RemovalListener() { > @Override > public void onRemoval( > @Nonnull RemovalNotification notification) { > try { > assert notification.getValue() != null; > notification.getValue().close(); > } catch (Throwable e) { > LOG.error( > "Error closing KeyProvider with uri [" > + notification.getKey() + "]", e); > } > } > }) > .build(); > }{code} > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16518) Cached KeyProvider in KeyProviderCache should be closed with ShutdownHookManager
[ https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-16518: Assignee: Lei Xu > Cached KeyProvider in KeyProviderCache should be closed with > ShutdownHookManager > > > Key: HDFS-16518 > URL: https://issues.apache.org/jira/browse/HDFS-16518 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.10.0 >Reporter: Lei Yang >Assignee: Lei Xu >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > We need to make sure the underlying KeyProvider used by multiple DFSClient > instances is closed at one shot during jvm shutdown. Within the shutdownhook, > we invalidate the cache and make sure they are all closed. The cache has a > removeListener hook which is called when cache entry is invalidated. > {code:java} > Class KeyProviderCache > ... > public KeyProviderCache(long expiryMs) { > cache = CacheBuilder.newBuilder() > .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS) > .removalListener(new RemovalListener() { > @Override > public void onRemoval( > @Nonnull RemovalNotification notification) { > try { > assert notification.getValue() != null; > notification.getValue().close(); > } catch (Throwable e) { > LOG.error( > "Error closing KeyProvider with uri [" > + notification.getKey() + "]", e); > } > } > }) > .build(); > }{code} > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines
[ https://issues.apache.org/jira/browse/HDFS-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16517. -- Fix Version/s: 2.10.2 Resolution: Fixed > In 2.10 the distance metric is wrong for non-DN machines > > > Key: HDFS-16517 > URL: https://issues.apache.org/jira/browse/HDFS-16517 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.10.1 >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 2.10.2 > > Time Spent: 1.5h > Remaining Estimate: 0h > > In 2.10, the metric for distance between the client and the data node is > wrong for machines that aren't running data nodes (ie. > getWeightUsingNetworkLocation). The code works correctly in 3.3+. > Currently > > ||Client||DataNode||getWeight||getWeightUsingNetworkLocation|| > |/rack1/node1|/rack1/node1|0|0| > |/rack1/node1|/rack1/node2|2|2| > |/rack1/node1|/rack2/node2|4|2| > |/pod1/rack1/node1|/pod1/rack1/node2|2|2| > |/pod1/rack1/node1|/pod1/rack2/node2|4|2| > |/pod1/rack1/node1|/pod2/rack2/node2|6|4| > > This bug will destroy data locality on clusters where the clients share racks > with DataNodes, but are running on machines that aren't running DataNodes, > such as striping federated HDFS clusters across racks. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines
[ https://issues.apache.org/jira/browse/HDFS-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-16517: - Description: In 2.10, the metric for distance between the client and the data node is wrong for machines that aren't running data nodes (ie. getWeightUsingNetworkLocation). The code works correctly in 3.3+. Currently ||Client||DataNode||getWeight||getWeightUsingNetworkLocation|| |/rack1/node1|/rack1/node1|0|0| |/rack1/node1|/rack1/node2|2|2| |/rack1/node1|/rack2/node2|4|2| |/pod1/rack1/node1|/pod1/rack1/node2|2|2| |/pod1/rack1/node1|/pod1/rack2/node2|4|2| |/pod1/rack1/node1|/pod2/rack2/node2|6|4| This bug will destroy data locality on clusters where the clients share racks with DataNodes, but are running on machines that aren't running DataNodes, such as striping federated HDFS clusters across racks. was: In 2.10, the metric for distance between the client and the data node is wrong for machines that aren't running data nodes (ie. getWeightUsingNetworkLocation). The code works correctly in 3.3+. Currently ||Client||DataNode ||getWeight||getWeightUsingNetworkLocation|| |/rack1/node1|/rack1/node1|0|0| |/rack1/node1|/rack1/node2|2|2| |/rack1/node1|/rack2/node2|4|2| |/pod1/rack1/node1|/pod1/rack1/node2|2|2| |/pod1/rack1/node1|/pod1/rack2/node2|4|2| |/pod1/rack1/node1|/pod2/rack2/node2|6|4| > In 2.10 the distance metric is wrong for non-DN machines > > > Key: HDFS-16517 > URL: https://issues.apache.org/jira/browse/HDFS-16517 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.10.1 >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > In 2.10, the metric for distance between the client and the data node is > wrong for machines that aren't running data nodes (ie. > getWeightUsingNetworkLocation). The code works correctly in 3.3+. > Currently > > ||Client||DataNode||getWeight||getWeightUsingNetworkLocation|| > |/rack1/node1|/rack1/node1|0|0| > |/rack1/node1|/rack1/node2|2|2| > |/rack1/node1|/rack2/node2|4|2| > |/pod1/rack1/node1|/pod1/rack1/node2|2|2| > |/pod1/rack1/node1|/pod1/rack2/node2|4|2| > |/pod1/rack1/node1|/pod2/rack2/node2|6|4| > > This bug will destroy data locality on clusters where the clients share racks > with DataNodes, but are running on machines that aren't running DataNodes, > such as striping federated HDFS clusters across racks. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines
[ https://issues.apache.org/jira/browse/HDFS-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-16517: - External issue URL: https://github.com/apache/hadoop/pull/4091 > In 2.10 the distance metric is wrong for non-DN machines > > > Key: HDFS-16517 > URL: https://issues.apache.org/jira/browse/HDFS-16517 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.10.1 >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > In 2.10, the metric for distance between the client and the data node is > wrong for machines that aren't running data nodes (ie. > getWeightUsingNetworkLocation). The code works correctly in 3.3+. > Currently > > ||Client||DataNode ||getWeight||getWeightUsingNetworkLocation|| > |/rack1/node1|/rack1/node1|0|0| > |/rack1/node1|/rack1/node2|2|2| > |/rack1/node1|/rack2/node2|4|2| > |/pod1/rack1/node1|/pod1/rack1/node2|2|2| > |/pod1/rack1/node1|/pod1/rack2/node2|4|2| > |/pod1/rack1/node1|/pod2/rack2/node2|6|4| > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines
Owen O'Malley created HDFS-16517: Summary: In 2.10 the distance metric is wrong for non-DN machines Key: HDFS-16517 URL: https://issues.apache.org/jira/browse/HDFS-16517 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.10.1 Reporter: Owen O'Malley Assignee: Owen O'Malley In 2.10, the metric for distance between the client and the data node is wrong for machines that aren't running data nodes (ie. getWeightUsingNetworkLocation). The code works correctly in 3.3+. Currently ||Client||DataNode ||getWeight||getWeightUsingNetworkLocation|| |/rack1/node1|/rack1/node1|0|0| |/rack1/node1|/rack1/node2|2|2| |/rack1/node1|/rack2/node2|4|2| |/pod1/rack1/node1|/pod1/rack1/node2|2|2| |/pod1/rack1/node1|/pod1/rack2/node2|4|2| |/pod1/rack1/node1|/pod2/rack2/node2|6|4| -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
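The expected values in the getWeight column above follow from one rule: count the hops from each endpoint up to the deepest common ancestor of the two topology paths. The table shows getWeightUsingNetworkLocation, the fallback used when the client machine is not itself a DataNode, coming in one level short for the off-rack and off-pod cases. The sketch below is a standalone illustration of the intended rule, assuming the usual slash-separated topology paths; it is not the actual NetworkTopology code.

{code:java}
/**
 * Minimal, standalone sketch of the distance rule behind the expected
 * "getWeight" column (not the actual NetworkTopology code): the weight of a
 * (client, datanode) pair is the number of hops from each endpoint up to
 * their deepest common ancestor in the topology tree.
 */
public class TopologyWeightSketch {

  static int weight(String clientPath, String dataNodePath) {
    if (clientPath.equals(dataNodePath)) {
      return 0;                              // same machine
    }
    // Paths look like "/pod1/rack1/node1"; drop the leading "/" and split.
    String[] a = clientPath.substring(1).split("/");
    String[] b = dataNodePath.substring(1).split("/");
    int common = 0;
    // Count shared ancestors (pods, racks); the leaf itself never counts.
    while (common < Math.min(a.length, b.length) - 1
        && a[common].equals(b[common])) {
      common++;
    }
    // Hops from each endpoint up to the common ancestor.
    return (a.length - common) + (b.length - common);
  }

  public static void main(String[] args) {
    System.out.println(weight("/rack1/node1", "/rack1/node2"));            // 2
    System.out.println(weight("/rack1/node1", "/rack2/node2"));            // 4
    System.out.println(weight("/pod1/rack1/node1", "/pod1/rack2/node2"));  // 4
    System.out.println(weight("/pod1/rack1/node1", "/pod2/rack2/node2"));  // 6
  }
}
{code}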
[jira] [Updated] (HDFS-13248) RBF: Namenode need to choose block location for the client
[ https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-13248: - Fix Version/s: 3.4.0 2.10.2 3.3.3 Assignee: Owen O'Malley (was: Íñigo Goiri) Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the reviews, Inigo & Ayush! > RBF: Namenode need to choose block location for the client > -- > > Key: HDFS-13248 > URL: https://issues.apache.org/jira/browse/HDFS-13248 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Wu Weiwei >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 2.10.2, 3.3.3 > > Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, > HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, > HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality > Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg > > Time Spent: 5h 10m > Remaining Estimate: 0h > > When execute a put operation via router, the NameNode will choose block > location for the router, not for the real client. This will affect the file's > locality. > I think on both NameNode and Router, we should add a new addBlock method, or > add a parameter for the current addBlock method, to pass the real client > information. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16495) RBF should prepend the client ip rather than append it.
[ https://issues.apache.org/jira/browse/HDFS-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-16495: - Fix Version/s: 3.3.3 (was: 3.2.4) > RBF should prepend the client ip rather than append it. > --- > > Key: HDFS-16495 > URL: https://issues.apache.org/jira/browse/HDFS-16495 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.3 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently the Routers append the client ip to the caller context if and only > if it is not already set. This would allow the user to fake their ip by > setting the caller context. Much better is to prepend it unconditionally. > The NN must be able to trust the client ip from the caller context. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16495) RBF should prepend the client ip rather than append it.
[ https://issues.apache.org/jira/browse/HDFS-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16495. -- Fix Version/s: 3.4.0 3.2.4 Resolution: Fixed > RBF should prepend the client ip rather than append it. > --- > > Key: HDFS-16495 > URL: https://issues.apache.org/jira/browse/HDFS-16495 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.2.4 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently the Routers append the client ip to the caller context if and only > if it is not already set. This would allow the user to fake their ip by > setting the caller context. Much better is to prepend it unconditionally. > The NN must be able to trust the client ip from the caller context. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16495) RBF should prepend the client ip rather than append it.
Owen O'Malley created HDFS-16495: Summary: RBF should prepend the client ip rather than append it. Key: HDFS-16495 URL: https://issues.apache.org/jira/browse/HDFS-16495 Project: Hadoop HDFS Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Currently the Routers append the client ip to the caller context if and only if it is not already set. This would allow the user to fake their ip by setting the caller context. Much better is to prepend it unconditionally. The NN must be able to trust the client ip from the caller context. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
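A rough sketch of the prepend idea follows. It is illustrative only and not the Router code from the pull request: the CallerContext calls are standard Hadoop IPC API, but the "clientIp" tag and the comma separator are assumptions. Because the value the Router observed is always written first, a client-supplied caller context can no longer pose as the trusted ip.

{code:java}
import org.apache.hadoop.ipc.CallerContext;

/**
 * Sketch of "prepend, don't append" (not the Router implementation).
 * The "clientIp" tag and "," separator are illustrative; the point is that
 * the address observed by the Router always comes first in the context.
 */
final class PrependClientIpSketch {

  static void tagClientIp(String remoteAddress) {
    CallerContext current = CallerContext.getCurrent();
    String original = (current == null) ? "" : current.getContext();
    // Unconditionally prepend the address the Router saw on the wire.
    String combined = "clientIp:" + remoteAddress
        + (original == null || original.isEmpty() ? "" : "," + original);
    CallerContext.setCurrent(new CallerContext.Builder(combined).build());
  }
}
{code}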
[jira] [Created] (HDFS-16253) Add a toString implementation to DFSInputStream
Owen O'Malley created HDFS-16253: Summary: Add a toString implementation to DFSInputStream Key: HDFS-16253 URL: https://issues.apache.org/jira/browse/HDFS-16253 Project: Hadoop HDFS Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley It would help debugging if there was a useful toString on DFSInputStream. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14244) refactor the libhdfs++ build system
[ https://issues.apache.org/jira/browse/HDFS-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-14244: - Summary: refactor the libhdfs++ build system (was: hdfs++ doesn't add necessary libraries to dynamic library link) Description: The current cmake for libhdfs++ has the source code for the dependent libraries. By refactoring we can remove 150kloc of third party code. (was: When linking with shared libraries, the libhdfs++ cmake file doesn't link correctly.) > refactor the libhdfs++ build system > --- > > Key: HDFS-14244 > URL: https://issues.apache.org/jira/browse/HDFS-14244 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs++, hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > The current cmake for libhdfs++ has the source code for the dependent > libraries. By refactoring we can remove 150kloc of third party code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14244) hdfs++ doesn't add necessary libraries to dynamic library link
[ https://issues.apache.org/jira/browse/HDFS-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767482#comment-16767482 ] Owen O'Malley commented on HDFS-14244: -- No, it isn't. With BUILD_SHARED_LIBS=ON the libhdfspp build was broken. However in digging into this, I think that we need a pretty major refactoring of the libhdfspp build system. In particular: * Remove the source code from hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/third_party . * Fix support for shared/static libraries. * Use the packages installed on the system when they are available. * Add support for rpath on mac os. * Run the unit tests when building stand alone. * Use add_ExternalPackage for the projects that we need to build. * Incorporate the uriparser2 wrapper into libhdfspp, but use uriparser package. Most of the linux variants have uriparser. * Add a cpack definition for libhdfspp so that you can generate a binary artifact in the standalone build. * Support newer versions of asio. (The deadline_timer needs to be replaced with the steady_timer.) These will remove about 150kloc from Hadoop. :) > hdfs++ doesn't add necessary libraries to dynamic library link > -- > > Key: HDFS-14244 > URL: https://issues.apache.org/jira/browse/HDFS-14244 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs++, hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > When linking with shared libraries, the libhdfs++ cmake file doesn't link > correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14244) hdfs++ doesn't add necessary libraries to dynamic library link
Owen O'Malley created HDFS-14244: Summary: hdfs++ doesn't add necessary libraries to dynamic library link Key: HDFS-14244 URL: https://issues.apache.org/jira/browse/HDFS-14244 Project: Hadoop HDFS Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley When linking with shared libraries, the libhdfs++ cmake file doesn't link correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14244) hdfs++ doesn't add necessary libraries to dynamic library link
[ https://issues.apache.org/jira/browse/HDFS-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-14244: - Component/s: hdfs-client hdfs++ > hdfs++ doesn't add necessary libraries to dynamic library link > -- > > Key: HDFS-14244 > URL: https://issues.apache.org/jira/browse/HDFS-14244 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs++, hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > > When linking with shared libraries, the libhdfs++ cmake file doesn't link > correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13534) libhdfs++: Fix GCC7 build
[ https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503913#comment-16503913 ] Owen O'Malley commented on HDFS-13534: -- [~James C] yes please create a new PR for https://issues.apache.org/jira/browse/ORC-375 to update the patch (after this one goes in, obviously). > libhdfs++: Fix GCC7 build > - > > Key: HDFS-13534 > URL: https://issues.apache.org/jira/browse/HDFS-13534 > Project: Hadoop HDFS > Issue Type: Task >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Major > Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch > > > After merging HDFS-13403 [~pifta] noticed the build broke on some platforms. > [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex > implicitly included functional. Without that implicit include the compiler > errors on the std::function in ioservice.h. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13534) libhdfs++: Fix GCC7 build
[ https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503914#comment-16503914 ] Owen O'Malley commented on HDFS-13534: -- +1 for the patch to go in here in HDFS. :) > libhdfs++: Fix GCC7 build > - > > Key: HDFS-13534 > URL: https://issues.apache.org/jira/browse/HDFS-13534 > Project: Hadoop HDFS > Issue Type: Task >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Major > Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch > > > After merging HDFS-13403 [~pifta] noticed the build broke on some platforms. > [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex > implicitly included functional. Without that implicit include the compiler > errors on the std::function in ioservice.h. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13534) libhdfs++: Fix GCC7 build
[ https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503465#comment-16503465 ] Owen O'Malley edited comment on HDFS-13534 at 6/6/18 3:38 PM: -- To get it working with gcc 7 for ORC's copy, I had to make two changes: {code:java} *** lib/common/async_stream.h~ 2017-08-30 07:56:51.0 -0700 --- lib/common/async_stream.h 2018-06-05 22:02:35.0 -0700 *** *** 20,25 --- 20,26 #define LIB_COMMON_ASYNC_STREAM_H_ #include + #include namespace hdfs { {code} {code:java} *** lib/rpc/request.h~ 2017-08-30 07:56:51.0 -0700 --- lib/rpc/request.h 2018-06-05 22:33:59.0 -0700 *** *** 22,27 --- 22,28 #include "common/util.h" #include "common/new_delete.h" + #include #include #include {code} Those don't seem to be in this patch. was (Author: owen.omalley): To get it working with gcc 7, I had to make two changes: {code:java} *** lib/common/async_stream.h~ 2017-08-30 07:56:51.0 -0700 --- lib/common/async_stream.h 2018-06-05 22:02:35.0 -0700 *** *** 20,25 --- 20,26 #define LIB_COMMON_ASYNC_STREAM_H_ #include + #include namespace hdfs { {code} {code:java} *** lib/rpc/request.h~ 2017-08-30 07:56:51.0 -0700 --- lib/rpc/request.h 2018-06-05 22:33:59.0 -0700 *** *** 22,27 --- 22,28 #include "common/util.h" #include "common/new_delete.h" + #include #include #include {code} Those don't seem to be in this patch. > libhdfs++: Fix GCC7 build > - > > Key: HDFS-13534 > URL: https://issues.apache.org/jira/browse/HDFS-13534 > Project: Hadoop HDFS > Issue Type: Task >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Major > Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch > > > After merging HDFS-13403 [~pifta] noticed the build broke on some platforms. > [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex > implicitly included functional. Without that implicit include the compiler > errors on the std::function in ioservice.h. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13534) libhdfs++: Fix GCC7 build
[ https://issues.apache.org/jira/browse/HDFS-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503465#comment-16503465 ] Owen O'Malley commented on HDFS-13534: -- To get it working with gcc 7, I had to make two changes: {code:java} *** lib/common/async_stream.h~ 2017-08-30 07:56:51.0 -0700 --- lib/common/async_stream.h 2018-06-05 22:02:35.0 -0700 *** *** 20,25 --- 20,26 #define LIB_COMMON_ASYNC_STREAM_H_ #include + #include namespace hdfs { {code} {code:java} *** lib/rpc/request.h~ 2017-08-30 07:56:51.0 -0700 --- lib/rpc/request.h 2018-06-05 22:33:59.0 -0700 *** *** 22,27 --- 22,28 #include "common/util.h" #include "common/new_delete.h" + #include #include #include {code} Those don't seem to be in this patch. > libhdfs++: Fix GCC7 build > - > > Key: HDFS-13534 > URL: https://issues.apache.org/jira/browse/HDFS-13534 > Project: Hadoop HDFS > Issue Type: Task >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Major > Attachments: HDFS-13534.000.patch, HDFS-13534.001.patch > > > After merging HDFS-13403 [~pifta] noticed the build broke on some platforms. > [~bibinchundatt] pointed out that prior to gcc 7 mutex, future, and regex > implicitly included functional. Without that implicit include the compiler > errors on the std::function in ioservice.h. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12990) Change default NameNode RPC port back to 8020
[ https://issues.apache.org/jira/browse/HDFS-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350807#comment-16350807 ] Owen O'Malley commented on HDFS-12990: -- Ok, I'm late on this. I'm strongly on the side of changing it back. In my view, this should absolutely be a blocker bug on any release. We *need* to change the port back because it is part of the public API. As always, you need to view the relative costs: # Users who have installed 3.0.0 and will be inconvenienced by the change. # All other users of Hadoop. Clearly bucket number 2 is much much larger than 1. Let's also be clear that the release manager can put it in or out of the RC as they deem fit. There are NO vetoes. It is a straight vote for whether the RC should be released. Personally, I'll vote against an RC that doesn't have this patch. > Change default NameNode RPC port back to 8020 > - > > Key: HDFS-12990 > URL: https://issues.apache.org/jira/browse/HDFS-12990 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 3.0.0 >Reporter: Xiao Chen >Assignee: Xiao Chen >Priority: Critical > Attachments: HDFS-12990.01.patch > > > In HDFS-9427 (HDFS should not default to ephemeral ports), we changed all > default ports to ephemeral ports, which is very appreciated by admin. As part > of that change, we also modified the NN RPC port from the famous 8020 to > 9820, to be closer to other ports changed there. > With more integration going on, it appears that all the other ephemeral port > changes are fine, but the NN RPC port change is painful for downstream on > migrating to Hadoop 3. Some examples include: > # Hive table locations pointing to hdfs://nn:port/dir > # Downstream minicluster unit tests that assumed 8020 > # Oozie workflows / downstream scripts that used 8020 > This isn't a problem for HA URLs, since that does not include the port > number. But considering the downstream impact, instead of requiring all of > them change their stuff, it would be a way better experience to leave the NN > port unchanged. This will benefit Hadoop 3 adoption and ease unnecessary > upgrade burdens. > It is of course incompatible, but giving 3.0.0 is just out, IMO it worths to > switch the port back. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341429#comment-16341429 ] Owen O'Malley commented on HDFS-7240: - I think that the major contribution of this work is pulling out the block management layer and the naming should reflect that. I'd propose that: * Ozone should be the object store * The block layer should have a different name such as Hadoop Storage Layer (HSL). > Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey >Priority: Major > Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, > HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, > HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, > HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, > Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf > > > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9525) hadoop utilities need to support provided delegation tokens
[ https://issues.apache.org/jira/browse/HDFS-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116289#comment-15116289 ] Owen O'Malley commented on HDFS-9525: - [~daryn] I'm sorry, but I don't see what problem the patch introduced. It lets your webhdfs have a token even if your security is turned off as long as it was already in the UGI. Where is the problem? > hadoop utilities need to support provided delegation tokens > --- > > Key: HDFS-9525 > URL: https://issues.apache.org/jira/browse/HDFS-9525 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: HeeSoo Kim >Priority: Blocker > Fix For: 3.0.0 > > Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, > HDFS-7984.003.patch, HDFS-7984.004.patch, HDFS-7984.005.patch, > HDFS-7984.006.patch, HDFS-7984.007.patch, HDFS-7984.patch, > HDFS-9525.008.patch, HDFS-9525.009.patch, HDFS-9525.009.patch, > HDFS-9525.branch-2.008.patch, HDFS-9525.branch-2.009.patch > > > When using the webhdfs:// filesystem (especially from distcp), we need the > ability to inject a delegation token rather than webhdfs initialize its own. > This would allow for cross-authentication-zone file system accesses. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908540#comment-14908540 ] Owen O'Malley commented on HDFS-8855: - This is ok. +1 I'm a little concerned about the runtime performance of generating the string of the identifier on every connection to the datanode, but this should be correct. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.005.patch, HDFS-8855.1.patch, > HDFS-8855.2.patch, HDFS-8855.3.patch, HDFS-8855.4.patch, > HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804547#comment-14804547 ] Owen O'Malley commented on HDFS-8855: - A few points: * You need to use the Token.getKind(), Token.getIdentifier(), and Token.getPassword() as the key for the cache. The patch currently uses Token.toString, which uses the identifier, kind, and service. The service is set by the client so it shouldn't be part of the match. The password on the other hand must be part of the match so that guessing the identifier doesn't allow a hacker to impersonate the user. * The timeout should default to 10 minutes instead of 10 seconds. * Please fix the checkstyle and findbugs warnings. * Determine what is wrong with the test case. Other than that, it looks good. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
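An illustrative sketch of the cache-key suggestion in the review above, not the committed HDFS-8855 patch: the key is derived from the token's kind, identifier, and password (the client-settable service is ignored), byte arrays are compared by value rather than via toString, and idle entries expire after the suggested 10 minutes. The Guava cache and the placeholder value type are assumptions.

{code:java}
import java.util.Arrays;
import java.util.Objects;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;

/**
 * Cache key built from the token's kind, identifier, and password, so that
 * the client-settable service field is ignored and guessing an identifier
 * without its password matches nothing. Byte arrays are compared by value.
 */
final class TokenCacheKey {
  private final Text kind;
  private final byte[] identifier;
  private final byte[] password;

  TokenCacheKey(Token<?> token) {
    this.kind = token.getKind();
    this.identifier = token.getIdentifier();
    this.password = token.getPassword();
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof TokenCacheKey)) {
      return false;
    }
    TokenCacheKey other = (TokenCacheKey) o;
    return kind.equals(other.kind)
        && Arrays.equals(identifier, other.identifier)
        && Arrays.equals(password, other.password);
  }

  @Override
  public int hashCode() {
    return Objects.hash(kind, Arrays.hashCode(identifier),
        Arrays.hashCode(password));
  }
}

/** Idle entries expire after 10 minutes rather than 10 seconds, per the review. */
class ConnectionCacheSketch {
  // The value type is a placeholder for whatever connection object is cached.
  private final Cache<TokenCacheKey, Object> cache = CacheBuilder.newBuilder()
      .expireAfterAccess(10, TimeUnit.MINUTES)
      .build();
}
{code}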
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803351#comment-14803351 ] Owen O'Malley commented on HDFS-8855: - I'm looking at the patch, but you'll need to resolve the checkstyle, findbugs, and test case failures. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9025) fix compilation issues on arch linux
[ https://issues.apache.org/jira/browse/HDFS-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-9025: Status: Patch Available (was: Open) > fix compilation issues on arch linux > > > Key: HDFS-9025 > URL: https://issues.apache.org/jira/browse/HDFS-9025 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HDFS-9025.patch > > > There are several compilation issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9025) fix compilation issues on arch linux
[ https://issues.apache.org/jira/browse/HDFS-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-9025: Attachment: HDFS-9025.patch fix minor problems. > fix compilation issues on arch linux > > > Key: HDFS-9025 > URL: https://issues.apache.org/jira/browse/HDFS-9025 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HDFS-9025.patch > > > There are several compilation issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9025) fix compilation issues on arch linux
[ https://issues.apache.org/jira/browse/HDFS-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-9025: --- Assignee: Owen O'Malley > fix compilation issues on arch linux > > > Key: HDFS-9025 > URL: https://issues.apache.org/jira/browse/HDFS-9025 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Owen O'Malley >Assignee: Owen O'Malley > > There are several compilation issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9025) fix compilation issues on arch linux
Owen O'Malley created HDFS-9025: --- Summary: fix compilation issues on arch linux Key: HDFS-9025 URL: https://issues.apache.org/jira/browse/HDFS-9025 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Owen O'Malley There are several compilation issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8736) ability to deny access to different filesystems
[ https://issues.apache.org/jira/browse/HDFS-8736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619894#comment-14619894 ] Owen O'Malley commented on HDFS-8736: - I agree with Allen. Preventing access to the LocalFileSystem doesn't help anything. The Hadoop security model depends on having unix user ids or more recently Linux containers. > ability to deny access to different filesystems > --- > > Key: HDFS-8736 > URL: https://issues.apache.org/jira/browse/HDFS-8736 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.5.0 >Reporter: Purvesh Patel >Priority: Minor > Labels: security > Attachments: Patch.pdf > > > In order to run in a secure context, ability to deny access to different > filesystems(specifically the local file system) to non-trusted code this > patch adds a new SecurityPermission class(AccessFileSystemPermission) and > checks the permission in FileSystem#get before returning a cached file system > or creating a new one. Please see attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8707) Implement an async pure c++ HDFS client
Owen O'Malley created HDFS-8707: --- Summary: Implement an async pure c++ HDFS client Key: HDFS-8707 URL: https://issues.apache.org/jira/browse/HDFS-8707 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Reporter: Owen O'Malley Assignee: Haohui Mai As part of working on the C++ ORC reader at ORC-3, we need an HDFS pure C++ client that lets us do async io to HDFS. We want to start from the code that Haohui's been working on at https://github.com/haohui/libhdfspp . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107194#comment-14107194 ] Owen O'Malley commented on HDFS-3689: - One follow up is that fixing MapReduce to use the actual block boundaries rather than dividing up the file in fixed size splits would not be difficult and would make the generated file splits for ORC and other block compressed files much much better. Furthermore, note that we could remove the need for lzo and zlib index files for text files by having TextOutputFormat cut the block at a line boundary and flush the compression codec. Thus TextInputFormat could divide the file at block boundaries and have them align at both a compression chunk boundary and a line break. That would be *great*. > Add support for variable length block > - > > Key: HDFS-3689 > URL: https://issues.apache.org/jira/browse/HDFS-3689 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs-client, namenode >Affects Versions: 3.0.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch > > > Currently HDFS supports fixed length blocks. Supporting variable length block > will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
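A sketch of the split-generation idea above, under the assumption that an InputFormat simply emits one split per reported block: with variable-length blocks, each split then lines up exactly with a writer-chosen boundary (for example an ORC stripe) instead of a fixed-size slice of the file. The FileSystem and BlockLocation calls are standard; the class and method names are hypothetical.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

/** Sketch only, not an actual InputFormat change: one split per block. */
class BlockAlignedSplitsSketch {
  static List<FileSplit> splitsFromBlocks(FileSystem fs, FileStatus file)
      throws IOException {
    List<FileSplit> splits = new ArrayList<>();
    BlockLocation[] blocks =
        fs.getFileBlockLocations(file, 0, file.getLen());
    for (BlockLocation block : blocks) {
      // Each split covers exactly one block, wherever the writer ended it.
      splits.add(new FileSplit(file.getPath(), block.getOffset(),
          block.getLength(), block.getHosts()));
    }
    return splits;
  }
}
{code}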
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107176#comment-14107176 ] Owen O'Malley commented on HDFS-3689: - Since this is a discussion of what to put into trunk, incompatible changes aren't a blocker. Furthermore, most clients would never see the difference. Variable length blocks would dramatically improve the ability of HDFS to support better file formats like ORC. On the other hand, I've had very bad experiences with sparse files on Unix. It is all too easy for a user to copy a sparse file and not understand that the copy is 10x larger than the original. That would be *bad* and I do not think that HDFS should support it at all. > Add support for variable length block > - > > Key: HDFS-3689 > URL: https://issues.apache.org/jira/browse/HDFS-3689 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs-client, namenode >Affects Versions: 3.0.0 >Reporter: Suresh Srinivas >Assignee: Suresh Srinivas > Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch > > > Currently HDFS supports fixed length blocks. Supporting variable length block > will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045344#comment-14045344 ] Owen O'Malley commented on HDFS-6134: - In the discussion today, we covered lots of ground. Todd proposed that Alejandro add a virtual ".raw" directory to the top level of each encryption zone. This would allow processes that want access to read or write the data within the encryption zone an access path that doesn't require modifying the FileSystem API. With that change, I'm -0 to adding encryption in to HDFS. I still think that our users would be far better served by adding encryption/compression layers above HDFS rather than baking them into HDFS, but I'm not going to block the work. By adding the work directly into HDFS, Alejandro and the others working on this are signing up for a high level of QA at scale before this is committed. A couple of other points came up: * symbolic links in conjunction with cryptofs would allow users to use hdfs urls to access encrypted hdfs files. * there must be an hdfs admin command to list the crypto zones to support auditing * There are significant scalability concerns about each tasks requesting decryption of each file key. In particular, if a job has 100,000 tasks and each opens 1000 files, that is 100 million key requests. The current design is unlikely to scale correctly. * the kms needs its own delegation tokens and hooks so that yarn will renew and cancel them. * there are three levels of key rolling: ** leaving old data alone and writing new data with the new key ** re-writing the data with the new key ** re-encoding the per file key (personally this seems pointless) > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044821#comment-14044821 ] Owen O'Malley commented on HDFS-6134: - Alejandro, I was just trying to say that I'd met him and was familiar with his work history. If it sounded rude or dismissive, that was unintended. I'm sorry. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044797#comment-14044797 ] Owen O'Malley commented on HDFS-6134: - Mike, I remember you from when I interviewed you. You are talking about collisions between IVs, not key space. By using 32 bytes of randomness (if someone is worried about crypto attacks there is no excuse not to use AES256), there is *NO* possibility of collision even assuming an insanely bad practice of using a single key version for a huge number of files. I obviously understand and applied the birthday paradox to get the numbers. Note that we *already* have key rolling and the key is already a random string of bytes. Adding additional layers of randomness just gives the appearance of more security. That may be wonderful in the closed source security world, but it is actively harmful in open source. In open source, having a clear implementation that is open for inspection is by far the best protection. Note that the other issue with not using the keys as intended is that many Hadoop users launch jobs that read millions of files. We can't afford to have the client fetch a different key for each of those millions of files. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044428#comment-14044428 ] Owen O'Malley commented on HDFS-6134: - Sorry, I messed up my math. Assuming that you have 1million files per key and 8 bytes of randomness, you get 2.7e-8, which is close enough to 0. At 16 bytes or 32 bytes of randomness, doubles underflow when calculating the percentage. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044423#comment-14044423 ] Owen O'Malley commented on HDFS-6134: - Alejandro, you don't need and shouldn't implement any of the DEK stuff. AES-CTR is more than adequate. Rather than use 16 bytes of randomness and 16 bytes of counter, use 32 bytes of randomness and just add the counter to it rather than concatenate. Let's take the extreme case of 1million files with the same key version. If you have 32 bits of randomness, that leads you to a collision chance that is basically 100%. With 64 bits of randomness that drops to 2.7e-8, which is close enough to 0. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
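The collision figures traded in these HDFS-6134 comments follow from the standard birthday approximation. A worked check, assuming n = 10^6 files sharing one key version, each file drawing its IV prefix independently at random from N possible values:

{code}
P(\text{collision}) \approx 1 - e^{-n(n-1)/(2N)} \approx \frac{n^2}{2N}

n = 10^6,\ N = 2^{64}\ (\text{8 random bytes}):\quad \frac{10^{12}}{2 \cdot 2^{64}} \approx 2.7 \times 10^{-8}

n = 10^6,\ N = 2^{32}\ (\text{4 random bytes}):\quad \frac{10^{12}}{2 \cdot 2^{32}} \approx 116,\ \text{so}\ 1 - e^{-116} \approx 1
{code}

The first line reproduces the 2.7e-8 figure for 64 bits of randomness; the second is the "basically 100%" case for 32 bits.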
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044109#comment-14044109 ] Owen O'Malley commented on HDFS-6134: - Any chance for the PA office? Otherwise I'll be dialing in. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043893#comment-14043893 ] Owen O'Malley commented on HDFS-6134: - {quote} Owen, that is NOT transparent. {quote} Transparent means that you shouldn't have to change your application code. Hacking HDFS to add encryption is transparent for one set of apps, but completely breaks others. Changing URLs requires no code changes to any apps. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043836#comment-14043836 ] Owen O'Malley commented on HDFS-6134: - Todd, it is *still* transparent encryption if you use cfs:// instead of hdfs://. The important piece is that the application doesn't need to change to access the decrypted storage. My problem is that, by refusing to layer the change over the storage layer, this jira is making many disruptive and unnecessary changes to the critical infrastructure and its API. NSE is whole disk encryption and is equivalent to using dm-crypt to encrypt the block files. That level of encryption is always very transparent and is already available in HDFS without a code change. Aaron, I can't do a meeting tomorrow afternoon. How about tomorrow morning? Say 10am-noon? > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043839#comment-14043839 ] Owen O'Malley commented on HDFS-6134: - I'll also point out that I've provided a solution that doesn't change the HDFS core and still lets you use your hdfs urls with encryption... Finally, adding compression to the crypto file system would be a great addition and *still* not require any changes to HDFS or its API. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043766#comment-14043766 ] Owen O'Malley commented on HDFS-6134: - {quote} I don’t see a previous -1 in any of the related JIRAs. {quote} I had consistently stated objections and some of them have been addressed, but the fundamentals have become clear through this jira. I am always hesitant to use a -1 and I certainly don't do so lightly. Through the discussion, my opinion is transparent encryption in HDFS is a *really* bad idea. Let's run through the case: The one claimed benefit of integrating encryption into HDFS is that the user doesn't need to change the URLs that they use. I believe this to be a *disadvantage* because it hides the fact that these files are encrypted. That said, a better approach if that is the desired goal is to create a *NEW* filter filesystem that the user can configure to respond to hdfs urls that does silent encryption. This imposes *NO* penalty on people who don't want encryption and does not require hacks to the FileSystem API. {quote} FileSystem will had a new create()/open() signature to support this, if you have access to the file but not the key, you can use the new signatures to copy files as per the usecase you are mentioning. {quote} This will break every backup application. Some of them, such as HAR and DistCp you can hack to handle HDFS as a special case, but this kind of special casing always comes back to haunt us as a project. Changing FileSystem API is a really bad idea and inducing more differences between the various implementations will create many more problems than you are trying to avoid. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
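A skeleton of the filter-filesystem alternative sketched in the comment above; this is not a working crypto layer, only an illustration of how a FilterFileSystem registered for the hdfs scheme via fs.hdfs.impl keeps application URLs unchanged while leaving room to wrap the returned streams. The class name is hypothetical, and a real implementation would also need to override create() and the other stream-producing methods.

{code:java}
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

/**
 * Skeleton only: answers for hdfs:// URLs when configured as fs.hdfs.impl,
 * delegates everything to DistributedFileSystem, and marks where a crypto
 * layer would wrap the streams. Not a real encryption implementation.
 */
public class LayeredHdfsSketch extends FilterFileSystem {
  public LayeredHdfsSketch() {
    super(new DistributedFileSystem());
  }

  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    FSDataInputStream raw = fs.open(f, bufferSize);
    // A crypto filesystem would wrap 'raw' in a decrypting stream here.
    return raw;
  }

  /** Clients opt in purely through configuration; application URLs stay the same. */
  static FileSystem forUri(URI uri, Configuration conf) throws IOException {
    conf.set("fs.hdfs.impl", LayeredHdfsSketch.class.getName());
    return FileSystem.get(uri, conf);
  }
}
{code}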
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042452#comment-14042452 ] Owen O'Malley commented on HDFS-6134: - As Sanjay proposed, I think it would be great to get together and discuss the issues in person. Would a meeting this week work for you Alejandro? > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042366#comment-14042366 ] Owen O'Malley commented on HDFS-6134: - I'm still -1 to adding this to HDFS. Having a layered file system is a much cleaner approach. Issues: * The user needs to be able to move, copy, and distribute the directories without the key. I should be able to set up a Falcon or Oozie job that copies directories where the user doing the copy has *NO* potential access to the key material. This is a critical security constraint. * A critical use case for encryption is when hdfs admins should not have access to the contents of some files. Encryption is the only way to implement that since the hdfs admins always have file permissions to both the hdfs files and the underlying block files. * We shouldn't change the filesystem API to deal with encryption, because we have a solution that doesn't require the change and will be far less confusing to users. In particular, we shouldn't add hacks to read/write unencrypted bytes to HDFS. * Each file needs to record the key version and original IV as written up in the CFS design document. The IV should be incremented for each block, but must start at a random number. As Alejandro pointed out, this is required for strong security. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, > HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
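The last bullet in the comment above (a per-file key version and base IV, with the IV incremented for each block) can be sketched roughly as below. The class and method names are illustrative only, not the CFS design's actual code, and AES with a 16-byte IV is assumed.
{code}
import java.math.BigInteger;
import java.security.SecureRandom;

// Sketch of the per-block IV idea: a random 128-bit base IV is recorded with the
// file, and the IV for block N is (base + N) computed as a 128-bit counter that
// wraps around. Names here are illustrative, not part of any committed design.
final class BlockIvSketch {
  static byte[] randomBaseIv() {
    byte[] iv = new byte[16];            // AES block size
    new SecureRandom().nextBytes(iv);
    return iv;
  }

  static byte[] ivForBlock(byte[] baseIv, long blockIndex) {
    BigInteger counter = new BigInteger(1, baseIv).add(BigInteger.valueOf(blockIndex));
    byte[] raw = counter.toByteArray();
    byte[] iv = new byte[16];
    // Right-align into 16 bytes, dropping any carry beyond 128 bits (wrap-around).
    int copy = Math.min(raw.length, 16);
    System.arraycopy(raw, raw.length - copy, iv, 16 - copy, copy);
    return iv;
  }
}
{code}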
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034155#comment-14034155 ] Owen O'Malley commented on HDFS-6134: - Alejandro, this is *exactly* equivalent to the delegation token case. If a job is opening side files, it needs to make sure it has the right delegation tokens and keys. For delegation tokens, we added an extra config option for listing the extra file systems. The same solution (or listing the extra key versions) would make sense. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataAtRestEncryption.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
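A rough sketch of what that could look like on the configuration side. The first property is, as I understand it, the existing option used to list extra filesystems whose delegation tokens a job needs; the key-version property is purely hypothetical and only illustrates the analogous idea from the comment above.
{code}
import org.apache.hadoop.conf.Configuration;

class SideFileCredentialsSketch {
  static Configuration sketch() {
    Configuration conf = new Configuration();
    // Existing pattern: extra filesystems whose delegation tokens the job needs.
    conf.set("mapreduce.job.hdfs-servers", "hdfs://nn2:8020,hdfs://nn3:8020");
    // Hypothetical analogue for encryption keys, as suggested in the comment above.
    conf.set("mapreduce.job.encryption.key-versions", "projectA-key@3,projectB-key@7");
    return conf;
  }
}
{code}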
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034016#comment-14034016 ] Owen O'Malley commented on HDFS-6134: - Alejandro, which use cases don't know their inputs or outputs? Clearly the main ones do know their input and output: * MapReduce * Hive * Pig It is important for the standard cases that we get the encryption keys up front instead of letting the horde of containers do it. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataAtRestEncryption.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033981#comment-14033981 ] Owen O'Malley commented on HDFS-6134: - A follow up on that is that of course KMS will need proxy users so that Oozie will be able to get keys for the users. (If that is desired.) > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataAtRestEncryption.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033975#comment-14033975 ] Owen O'Malley commented on HDFS-6134: - The right way to do this is to have the YARN job submission get the appropriate keys from the KMS like it currently gets delegation tokens. Both the delegation tokens and the keys should be put into the job's credential object. That way you don't have all 100,000 containers hitting the KMS at once. It does mean we need a new interface for filesystems that, given a list of paths, ensures the keys are in a credential object. FileInputFormat and FileOutputFormat should check to see if the FileSystem implements that interface and pass in the job's credential object. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataAtRestEncryption.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
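A minimal sketch of the interface proposed in the comment above, under assumed names; none of this is an existing Hadoop API. The submission-side check mirrors how delegation tokens are gathered today.
{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;

// Hypothetical interface: a FileSystem that needs encryption keys implements it,
// and job submission calls it once so the key material travels in the job's
// Credentials instead of every container fetching it from the key server.
interface KeyProvidingFileSystem {
  void addKeysForPaths(Credentials credentials, Path... paths) throws IOException;
}

final class SubmissionSideSketch {
  // Sketch of the check FileInputFormat/FileOutputFormat could do at submit time.
  static void gatherKeys(FileSystem fs, Credentials jobCredentials, Path... paths)
      throws IOException {
    if (fs instanceof KeyProvidingFileSystem) {
      ((KeyProvidingFileSystem) fs).addKeysForPaths(jobCredentials, paths);
    }
  }
}
{code}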
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028086#comment-14028086 ] Owen O'Malley commented on HDFS-6134: - I still have two very strong concerns with this work: * A critical use case is that distcp (and other backup/disaster recovery tools) must be able to accurately copy files without access to the encryption keys. There are many cases when the automated backup tools are not permitted access to the encryption keys. Obviously, it also has the benefit of being both safer and faster if the data is moved in its original encrypted form. * The client needs to get the key material directly and not use the NameNode as a proxy. This is critical from a security point of view. ** The security (including the audit log) on the key server is much stronger if there are no proxies between the user and the key server. ** Security bugs in HDFS or mistakes in setting permissions are a critical use case for requiring encryption. Doing all of the work on the client (including getting the key) makes the entire system much more secure. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataAtRestEncryption.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995588#comment-13995588 ] Owen O'Malley commented on HDFS-6134: - What are the use cases this is trying to address? What are the attacks? Do users or administrators set the encryption? Can different directories have different keys or is it one key for the entire filesystem? When you rename a directory does it need to be re-encrypted? How are backups handled? Does it require the encryption key? What is the performance impact on distcp when not using native libraries? For release in the Hadoop 2.x line, you need to preserve both forward and backwards wire compatibility. How do you plan to address that? It seems that the additional datanode and client complexity is prohibitive. Making changes to the HDFS write and read pipeline is extremely touchy. > Transparent data at rest encryption > --- > > Key: HDFS-6134 > URL: https://issues.apache.org/jira/browse/HDFS-6134 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 2.3.0 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: HDFSDataAtRestEncryption.pdf > > > Because of privacy and security regulations, for many industries, sensitive > data at rest must be in encrypted form. For example: the healthcare industry > (HIPAA regulations), the card payment industry (PCI DSS regulations) or the > US government (FISMA regulations). > This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can > be used transparently by any application accessing HDFS via Hadoop Filesystem > Java API, Hadoop libhdfs C library, or WebHDFS REST API. > The resulting implementation should be able to be used in compliance with > different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Closed] (HDFS-5852) Change the colors on the hdfs UI
[ https://issues.apache.org/jira/browse/HDFS-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley closed HDFS-5852. --- Assignee: (was: stack) > Change the colors on the hdfs UI > > > Key: HDFS-5852 > URL: https://issues.apache.org/jira/browse/HDFS-5852 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: stack >Priority: Blocker > Labels: webui > Fix For: 2.3.0 > > Attachments: HDFS-5852.best.txt, HDFS-5852v2.txt, > HDFS-5852v3-dkgreen.txt, color-rationale.png, compromise_gray.png, > dkgreen.png, hdfs-5852.txt, new_hdfsui_colors.png > > > The HDFS UI colors are too close to HWX green. > Here is a patch that steers clear of vendor colors. > I made it a blocker thinking this something we'd want to fix before we > release apache hadoop 2.3.0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (HDFS-5852) Change the colors on the hdfs UI
[ https://issues.apache.org/jira/browse/HDFS-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887557#comment-13887557 ] Owen O'Malley edited comment on HDFS-5852 at 1/31/14 7:58 AM: -- Just to show how *inane* this jira is, here are the color measurements of Hortonworks green, Cloudera Blue, and the original color using Mac's digital color meter: || Attribute || Hortonworks Green || HDFS Color || Cloudera Blue || | L | 69.52 | 33.94 | 34.4 | | A | -44.68 | -22.34 | -17.68 | | B | 60.91 | 26.88 | -19.98 | so Hortonworks Green - HDFS Color = 35.58 + 22.34 + 34.03 = 91.95 and Cloudera Blue - HDFS Color = 0.46 + 4.66 + 46.86 = 51.98 Clearly we need to make the color greener to denote additional stability. was (Author: owen.omalley): Just to show how *inane* this jira is, here are the color measurements of Hortonworks green, Cloudera Blue, and the original color using Mac's digital color meter: || Attribute || Hortonworks Green || HDFS Color || Cloudera Blue || | L | 69.52 | 33.94 | 34.4 | | A | 44.68 | -22.34 | -17.68 | | B | 60.91 | 26.88 | -19.98 | so Hortonworks Green - HDFS Color = 35.58 + 22.34 + 34.03 = 91.95 and Cloudera Blue - HDFS Color = 0.46 + 4.66 + 46.86 = 51.98 Clearly we need to make the color greener to denote additional stability. > Change the colors on the hdfs UI > > > Key: HDFS-5852 > URL: https://issues.apache.org/jira/browse/HDFS-5852 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Blocker > Labels: webui > Fix For: 2.3.0 > > Attachments: HDFS-5852.best.txt, HDFS-5852v2.txt, > HDFS-5852v3-dkgreen.txt, color-rationale.png, compromise_gray.png, > dkgreen.png, hdfs-5852.txt, new_hdfsui_colors.png > > > The HDFS UI colors are too close to HWX green. > Here is a patch that steers clear of vendor colors. > I made it a blocker thinking this something we'd want to fix before we > release apache hadoop 2.3.0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5852) Change the colors on the hdfs UI
[ https://issues.apache.org/jira/browse/HDFS-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887557#comment-13887557 ] Owen O'Malley commented on HDFS-5852: - Just to show how *inane* this jira is, here are the color measurements of Hortonworks green, Cloudera Blue, and the original color using Mac's digital color meter: || Attribute || Hortonworks Green || HDFS Color || Cloudera Blue || | L | 69.52 | 33.94 | 34.4 | | A | 44.68 | -22.34 | -17.68 | | B | 60.91 | 26.88 | -19.98 | so Hortonworks Green - HDFS Color = 35.58 + 22.34 + 34.03 = 91.95 and Cloudera Blue - HDFS Color = 0.46 + 4.66 + 46.86 = 51.98 Clearly we need to make the color greener to denote additional stability. > Change the colors on the hdfs UI > > > Key: HDFS-5852 > URL: https://issues.apache.org/jira/browse/HDFS-5852 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Blocker > Labels: webui > Fix For: 2.3.0 > > Attachments: HDFS-5852.best.txt, HDFS-5852v2.txt, > HDFS-5852v3-dkgreen.txt, color-rationale.png, compromise_gray.png, > dkgreen.png, hdfs-5852.txt, new_hdfsui_colors.png > > > The HDFS UI colors are too close to HWX green. > Here is a patch that steers clear of vendor colors. > I made it a blocker thinking this something we'd want to fix before we > release apache hadoop 2.3.0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5143) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842559#comment-13842559 ] Owen O'Malley commented on HDFS-5143: - We need to break this work down into smaller units of work. Jiras with a tighter focus will provide a more focused discussion and allow us to make progress and accomplish our shared goal of enabling Hadoop users to use encryption in their applications without changing each individual input and output format. * The key management needs to be much more flexible and I've created HADOOP-10141 to work on it. * The ByteBufferCipher API should be a separate jira, so I've created HADOOP-10149. * Once HADOOP-10149 is resolved, we can work together on a JNI-based implementation of it. > Hadoop cryptographic file system > > > Key: HDFS-5143 > URL: https://issues.apache.org/jira/browse/HDFS-5143 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: rhino > Fix For: 3.0.0 > > Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file > system.pdf > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. > 4.Can decorate HDFS and all other file systems in Hadoop, and will not > modify existing structure of file system, such as namenode and datanode > structure if the wrapped file system is HDFS. > 5.Admin can configure encryption policies, such as which directory will > be encrypted. > 6.A robust key management framework. > 7.Support Pread and append operations if the wrapped file system supports > them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5143) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821728#comment-13821728 ] Owen O'Malley commented on HDFS-5143: - [~hitliuyi] In the design document, the IV was always 0, but in the comments you are suggesting putting a random IV at the start of the underlying file. I think that the security advantage of having a random IV is relatively small and we'd do better without it. It only protects against having multiple files with the same key and the same plain text co-located in the file. I think that putting it at the front of the file has a couple of disadvantages: * Any read of the file has to read the beginning 16 bytes of the file. * Block boundaries are offset from the expectation. This will cause MapReduce input splits to straddle blocks in cases that wouldn't otherwise require it. I think we should always have an IV of 0 or alternatively encode it in the underlying filesystem's filenames. In particular, we could base64-encode the IV and append it onto the filename. If we add 16 characters of base64 that would give us 96 bits of IV and it would be easy to strip off. It would look like: cfs://hdfs@nn/dir1/dir2/file -> hdfs://nn/dir1/dir2/file_1234567890ABCDEF > Hadoop cryptographic file system > > > Key: HDFS-5143 > URL: https://issues.apache.org/jira/browse/HDFS-5143 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: rhino > Fix For: 3.0.0 > > Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file > system.pdf > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. > 4.Can decorate HDFS and all other file systems in Hadoop, and will not > modify existing structure of file system, such as namenode and datanode > structure if the wrapped file system is HDFS. > 5.Admin can configure encryption policies, such as which directory will > be encrypted. > 6.A robust key management framework. > 7.Support Pread and append operations if the wrapped file system supports > them. -- This message was sent by Atlassian JIRA (v6.1#6144)
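To make the filename-suffix idea from the comment above concrete, here is a rough sketch. The underscore separator and the helper names are assumptions for illustration only; 12 random bytes give 96 bits of IV, which URL-safe base64 encodes into exactly 16 characters with no padding.
{code}
import java.security.SecureRandom;
import java.util.Base64;

// Sketch of encoding the per-file IV into the underlying file name, as floated
// above. URL-safe base64 is assumed so the suffix never contains '/' or '+'.
final class IvInFileName {
  private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
  private static final Base64.Decoder DEC = Base64.getUrlDecoder();

  // file -> file_XXXXXXXXXXXXXXXX (16 base64 characters = 96 bits of IV)
  static String encryptedName(String plainName) {
    byte[] iv = new byte[12];
    new SecureRandom().nextBytes(iv);
    return plainName + "_" + ENC.encodeToString(iv);
  }

  static byte[] ivFromName(String storedName) {
    return DEC.decode(storedName.substring(storedName.lastIndexOf('_') + 1));
  }

  static String plainName(String storedName) {
    return storedName.substring(0, storedName.lastIndexOf('_'));
  }
}
{code}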
[jira] [Commented] (HDFS-5143) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821679#comment-13821679 ] Owen O'Malley commented on HDFS-5143: - [~avik_...@yahoo.com] I'm not misquoting you. You were very clear that you weren't planning on working on this in the immediate future and that instead you wanted to change all of the file formats. > Hadoop cryptographic file system > > > Key: HDFS-5143 > URL: https://issues.apache.org/jira/browse/HDFS-5143 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: rhino > Fix For: 3.0.0 > > Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file > system.pdf > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. > 4.Can decorate HDFS and all other file systems in Hadoop, and will not > modify existing structure of file system, such as namenode and datanode > structure if the wrapped file system is HDFS. > 5.Admin can configure encryption policies, such as which directory will > be encrypted. > 6.A robust key management framework. > 7.Support Pread and append operations if the wrapped file system supports > them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5143) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-5143: Status: Open (was: Patch Available) It should only be marked Patch Available when Yi thinks it is ready to commit. > Hadoop cryptographic file system > > > Key: HDFS-5143 > URL: https://issues.apache.org/jira/browse/HDFS-5143 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: rhino > Fix For: 3.0.0 > > Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file > system.pdf > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. > 4.Can decorate HDFS and all other file systems in Hadoop, and will not > modify existing structure of file system, such as namenode and datanode > structure if the wrapped file system is HDFS. > 5.Admin can configure encryption policies, such as which directory will > be encrypted. > 6.A robust key management framework. > 7.Support Pread and append operations if the wrapped file system supports > them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5143) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-5143: Assignee: Yi Liu (was: Owen O'Malley) It wasn't assigned and no one seemed to be working on this. Talking to Avik at Strata, he said no one was going to be working on this for 9 months. I'm glad to see that Yi has posted a patch. > Hadoop cryptographic file system > > > Key: HDFS-5143 > URL: https://issues.apache.org/jira/browse/HDFS-5143 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: rhino > Fix For: 3.0.0 > > Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file > system.pdf > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. > 4.Can decorate HDFS and all other file systems in Hadoop, and will not > modify existing structure of file system, such as namenode and datanode > structure if the wrapped file system is HDFS. > 5.Admin can configure encryption policies, such as which directory will > be encrypted. > 6.A robust key management framework. > 7.Support Pread and append operations if the wrapped file system supports > them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HDFS-5143) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-5143: --- Assignee: Owen O'Malley > Hadoop cryptographic file system > > > Key: HDFS-5143 > URL: https://issues.apache.org/jira/browse/HDFS-5143 > Project: Hadoop HDFS > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Owen O'Malley > Labels: rhino > Fix For: 3.0.0 > > Attachments: HADOOP cryptographic file system.pdf > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. > 4.Can decorate HDFS and all other file systems in Hadoop, and will not > modify existing structure of file system, such as namenode and datanode > structure if the wrapped file system is HDFS. > 5.Admin can configure encryption policies, such as which directory will > be encrypted. > 6.A robust key management framework. > 7.Support Pread and append operations if the wrapped file system supports > them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HDFS-3699) HftpFileSystem should try both KSSL and SPNEGO when authentication is required
[ https://issues.apache.org/jira/browse/HDFS-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-3699. - Resolution: Won't Fix Using KSSL is strongly deprecated and should be avoided in secure clusters. > HftpFileSystem should try both KSSL and SPNEGO when authentication is required > -- > > Key: HDFS-3699 > URL: https://issues.apache.org/jira/browse/HDFS-3699 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: eric baldeschwieler > > See discussion in HDFS-2617 (Replaced Kerberized SSL for image transfer and > fsck with SPNEGO-based solution). > To handle the transition from Hadoop1.0 systems running KSSL authentication > to Hadoop systems running SPNEGO, it would be good to fix the client in both > 1 and 2 to try SPNEGO and then fall back to try KSSL. > This will allow organizations that are running a lot of Hadoop 1.0 to > gradually transition over, without needing to convert all clusters at the > same time. They would first need to update their 1.0 HFTP clients (and > 2.0/0.23 if they are already running those) and then they could copy data > between clusters without needing to move all clusters to SPNEGO in a big bang. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Closed] (HDFS-3699) HftpFileSystem should try both KSSL and SPNEGO when authentication is required
[ https://issues.apache.org/jira/browse/HDFS-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley closed HDFS-3699. --- > HftpFileSystem should try both KSSL and SPNEGO when authentication is required > -- > > Key: HDFS-3699 > URL: https://issues.apache.org/jira/browse/HDFS-3699 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: eric baldeschwieler > > See discussion in HDFS-2617 (Replaced Kerberized SSL for image transfer and > fsck with SPNEGO-based solution). > To handle the transition from Hadoop1.0 systems running KSSL authentication > to Hadoop systems running SPNEGO, it would be good to fix the client in both > 1 and 2 to try SPNEGO and then fall back to try KSSL. > This will allow organizations that are running a lot of Hadoop 1.0 to > gradually transition over, without needing to convert all clusters at the > same time. They would first need to update their 1.0 HFTP clients (and > 2.0/0.23 if they are already running those) and then they could copy data > between clusters without needing to move all clusters to SPNEGO in a big bang. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Closed] (HDFS-3983) Hftp should support both SPNEGO and KSSL
[ https://issues.apache.org/jira/browse/HDFS-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley closed HDFS-3983. --- Assignee: (was: Eli Collins) > Hftp should support both SPNEGO and KSSL > > > Key: HDFS-3983 > URL: https://issues.apache.org/jira/browse/HDFS-3983 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.0-alpha >Reporter: Eli Collins >Priority: Blocker > Attachments: hdfs-3983.txt, hdfs-3983.txt > > > Hftp currently doesn't work against a secure cluster unless you configure > {{dfs.https.port}} to be the http port, otherwise the client can't fetch > tokens: > {noformat} > $ hadoop fs -ls hftp://c1225.hal.cloudera.com:50070/ > 12/09/26 18:02:00 INFO fs.FileSystem: Couldn't get a delegation token from > http://c1225.hal.cloudera.com:50470 using http. > ls: Security enabled but user not authenticated by filter > {noformat} > This is due to Hftp still using the https port. Post HDFS-2617 it should use > the regular http port. Hsftp should still use the secure port, however now > that we have HADOOP-8581 it's worth considering removing Hsftp entirely. I'll > start a separate thread about that. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HDFS-3983) Hftp should support both SPNEGO and KSSL
[ https://issues.apache.org/jira/browse/HDFS-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-3983. - Resolution: Won't Fix Target Version/s: (was: ) KSSL is deprecated and should never be used for secure deployments. > Hftp should support both SPNEGO and KSSL > > > Key: HDFS-3983 > URL: https://issues.apache.org/jira/browse/HDFS-3983 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.0-alpha >Reporter: Eli Collins >Assignee: Eli Collins >Priority: Blocker > Attachments: hdfs-3983.txt, hdfs-3983.txt > > > Hftp currently doesn't work against a secure cluster unless you configure > {{dfs.https.port}} to be the http port, otherwise the client can't fetch > tokens: > {noformat} > $ hadoop fs -ls hftp://c1225.hal.cloudera.com:50070/ > 12/09/26 18:02:00 INFO fs.FileSystem: Couldn't get a delegation token from > http://c1225.hal.cloudera.com:50470 using http. > ls: Security enabled but user not authenticated by filter > {noformat} > This is due to Hftp still using the https port. Post HDFS-2617 it should use > the regular http port. Hsftp should still use the secure port, however now > that we have HADOOP-8581 it's worth considering removing Hsftp entirely. I'll > start a separate thread about that. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5191) revisit zero-copy API in FSDataInputStream to make it more intuitive
[ https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766009#comment-13766009 ] Owen O'Malley commented on HDFS-5191: - +1 for EnumSet > revisit zero-copy API in FSDataInputStream to make it more intuitive > > > Key: HDFS-5191 > URL: https://issues.apache.org/jira/browse/HDFS-5191 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, libhdfs >Affects Versions: HDFS-4949 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > > As per the discussion on HDFS-4953, we should revisit the zero-copy API to > make it more intuitive for new users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap
[ https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764428#comment-13764428 ] Owen O'Malley commented on HDFS-4953: - {quote} You can't know ahead of time whether your call to mmap will succeed. As I said, mmap can fail, for dozens of reasons. And of course blocks move over time. There is a fundamental "time of check, time of use" (TOCTOU) race condition in this kind of API. {quote} Ok, I guess I'm fine with the exception assuming the user passed in a null factory. It will be expensive in terms of time, but it won't affect the vast majority of users. {quote} is it necessary to read 200 MB at a time to decode the ORC file format? {quote} Actually, yes. The set of rows that are written together is large (typically ~200MB) so that reading them is efficient. For a 100 column table, that means that you have all of the values for column 1 in the first ~2MB, followed by all of the values for column 2 in the next 2MB, etc. To read the first row, you need all 100 of the 2MB sections. Obviously mmapping this is much more efficient, because the pages of the file can be brought in as needed. {quote} There is already a method named FSDataInputStream#read(ByteBuffer buf) in FSDataInputStream. If we create a new method named FSDataInputStream#readByteBuffer, I would expect there to be some confusion between the two. That's why I proposed FSDataInputStream#readZero for the new name. Does that make sense? {quote} I see your point, but readZero, which sounds like it just fills zeros into a byte buffer, doesn't convey the right meaning. The fundamental action that the user is taking is in fact read. I'd propose that we overload it with the other read and comment it saying that this read supports zero copy while the other doesn't. How does this look? {code} /** * Read a byte buffer from the stream using zero copy if possible. Typically the read will return * maxLength bytes, but it may return fewer at the end of the file system block or the end of the * file. * @param factory a factory that creates ByteBuffers for the read if the region of the file can't be * mmapped. * @param maxLength the maximum number of bytes that will be returned * @return a ByteBuffer with between 1 and maxLength bytes from the file. The buffer should be released * to the stream when the user is done with it. */ public ByteBuffer read(ByteBufferFactory factory, int maxLength) throws IOException; {code} {quote} I'd like to get some other prospective zero-copy API users to comment on whether they like the wrapper object or the DFSInputStream#releaseByteBuffer approach better... {quote} Uh, that is exactly what is happening. I'm a user who is trying to use this interface for a very typical use case of quickly reading bytes that may or may not be on the local machine. I also care a lot about APIs and have been working on Hadoop for 7.75 years. {quote} If, instead of returning a ByteBuffer from the readByteBuffer call, we returned a ZeroBuffer object wrapping the ByteBuffer, we could simply call ZeroBuffer#close() {quote} Users don't want to make interfaces for reading from some Hadoop type named ZeroBuffer. The user wants a ByteBuffer because it is a standard Java type. To make this concrete and crystal clear, I have to make Hive and ORC work with both Hadoop 1.x and Hadoop 2.x. Therefore, if you use a non-standard type I need to wrap it in a shim. That sucks. Especially, if it is in the inner loop, which this absolutely would be. 
I *need* a ByteBuffer because I can make a shim that always returns a ByteBuffer that works regardless of which version of Hadoop the user is using. > enable HDFS local reads via mmap > > > Key: HDFS-4953 > URL: https://issues.apache.org/jira/browse/HDFS-4953 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 2.3.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: HDFS-4949 > > Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, > HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, > HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch > > > Currently, the short-circuit local read pathway allows HDFS clients to access > files directly without going through the DataNode. However, all of these > reads involve a copy at the operating system level, since they rely on the > read() / pread() / etc family of kernel interfaces. > We would like to enable HDFS to read local files via mmap. This would enable > truly zero-copy reads. > In the initial implementation, zero-copy reads will only be performed when > checksums were disabled. Later, we can use the DataNode's cache awareness to > only perform zero-copy reads when we know that checksum has already been > verified. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
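A sketch of how a caller might use the overloaded read(factory, maxLength) proposed in the comment above. Neither that overload nor ByteBufferFactory exists in released Hadoop, so the code below only illustrates how an application would absorb short reads at block boundaries.
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FSDataInputStream;

// Hypothetical caller of the proposed zero-copy read overload. Each call may
// return fewer than maxLength bytes near a block boundary, so the caller loops
// and collects buffers until the requested range is covered.
class ZeroCopyReadSketch {
  static List<ByteBuffer> readRange(FSDataInputStream in, ByteBufferFactory factory,
                                    long offset, long length) throws IOException {
    List<ByteBuffer> buffers = new ArrayList<ByteBuffer>();
    in.seek(offset);
    long remaining = length;
    while (remaining > 0) {
      ByteBuffer buf = in.read(factory, (int) Math.min(remaining, 8 * 1024 * 1024));
      remaining -= buf.remaining();
      buffers.add(buf);
    }
    // Buffers would be handed back to the stream/factory once consumed, per the
    // release semantics discussed in the comments.
    return buffers;
  }
}
{code}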
[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap
[ https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763526#comment-13763526 ] Owen O'Malley commented on HDFS-4953: - {quote} This seems like a fairly arbitrary requirement {quote} Actually, it is a standard best practice. Building exception objects is *expensive* and handling exceptions is error-prone. {quote} Unfortunately, zero copy being available is not a binary thing. It might be available sometimes, but not other times. {quote} I clearly stated in the previous comment it was about whether the zero copy was available AT THE CURRENT POINT IN THE STREAM. {quote} This is the point of the fallback buffer-- it will be used when a block boundary, non-local block, etc. prevents the request being fulfilled with an mmap. We've made this as efficient as it can be made. {quote} No, you haven't made it efficient at all. If the code is going to do a buffer copy and use the fallback buffer if it crosses a block, the performance will be dreadful. With my interface, you get no extra allocation, and no buffer copies. The two reasonable strategies are: * short read to cut it to a record boundary * return multiple buffers You can't make it easier than that, because the whole point of this API is to avoid buffer copies. You can't violate the goal of the interface because you don't like those choices. {quote} Well, what are the alternatives? We can't subclass java.nio.ByteBuffer, since all of its constructors are package-private. {quote} Agreed that you can't extend ByteBuffer. But other than close, you don't really need anything. {quote} There has to be a close method on whatever we return {quote} That is NOT a requirement. Especially when it conflicts with fundamental performance. As I wrote, adding a return method on the stream is fine. {quote} I'm sort of getting the impression that you plan on having super-huge buffers, which scares me. {quote} For ORC, I need to read typically 200 MB at a time. Obviously, if I don't need the whole range, I won't get it, but getting it in large sets of bytes is much much more efficient than lots of little reads. {quote} The factory API also gives me the impression that you plan on allocating a new buffer for each read, which would also be problematic. {quote} No, that is not the case. The point of the API is to avoid allocating the buffers if they aren't needed. The current API requires a buffer whether it is needed or not. Obviously the application will need a cache of the buffers to reuse, but the factory lets them write efficient code. {quote} If we have a "Factory" object, it needs to have not only a "get" method, but also a "put" method, where ByteBuffers are placed back when we're done with them. At that point, it becomes more like a cache. This might be a reasonable API, but I wonder if the additional complexity is worth it. {quote} Of course the implementations of the factory will have a release method. The question was just whether the FSDataInputStream needed to access the release method. If we add the releaseByteBuffer then we'd need the release to the factory. Based on this, I'd propose: {code} /** * Is the current location of the stream available via zero copy? */ public boolean isZeroCopyAvailable(); /** * Read from the current location at least 1 and up to maxLength bytes. 
In most situations, the returned * buffer will contain maxLength bytes unless either: * * the read crosses a block boundary and zero copy is being used * * the stream has fewer than maxLength bytes left * The returned buffer will either be one that was created by the factory or a MappedByteBuffer. */ public ByteBuffer readByteBuffer(ByteBufferFactory factory, int maxLength) throws IOException; /** * Release a buffer that was returned from readByteBuffer. If the method was created by the factory * it will be returned to the factory. */ public void releaseByteBuffer(ByteBufferFactory factory, ByteBuffer buffer); /** * Allow application to manage how ByteBuffers are created for fallback buffers. Only buffers created by * the factory will be released to it. */ public interface ByteBufferFactory { ByteBuffer createBuffer(int capacity); void releaseBuffer(ByteBuffer buffer); } {code} This will allow applications to: * determine whether zero copy is available for the next read * the user can use the same read interface for all filesystems and files, using zero copy if available * no extra buffer copies * no bytebuffers are allocated if they are not needed * applications have to deal with short reads, but only get a single byte buffer * allow applications to create buffer managers that reuse buffers * allow applications to control whether direct or byte[] byte buffers are used The example code would look like: {code} FSDataInputStream in = fs.open(path); in.seek(100*1024*1024); List buffers = new ArrayL
[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap
[ https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763167#comment-13763167 ] Owen O'Malley commented on HDFS-4953: - Thanks, Colin, for giving more details of the design. Your new API is much better, but a few issues remain: * If an application needs to determine whether zero copy is available, it should be able to do so without catching exceptions. * What happens if the user reads across a block boundary? Most applications don't care about block boundaries and shouldn't have to add special code to cut their requests to block boundaries. That will impose inefficiencies. * The cost of a second level of indirection (app -> ZeroCopy -> ByteBuffer) in the inner loop of the client seems prohibitive. * Requiring pre-allocation of a fallback buffer that hopefully is never needed is really problematic. I'd propose that we flip this around to a factory. * You either need to support short reads or return multiple bytebuffers. I don't see a way to avoid both unless applications are forced to never read across block boundaries. That would be much worse than either of the other options. I'd prefer to have multiple ByteBuffers returned, but if you hate that worse than short reads, I can handle that. * It isn't clear to me how you plan to release mmapped buffers, since Java doesn't provide an API to do that. If you have a mechanism to do that, we need a releaseByteBuffer(ByteBuffer buffer) to release it. I'd propose that we add the following to FSDataInputStream: {code} /** * Is the current location of the stream available via zero copy? */ public boolean isZeroCopyAvailable(); /** * Read from the current location at least 1 and up to maxLength bytes. In most situations, the returned * buffer will contain maxLength bytes unless either: * * the read crosses a block boundary and zero copy is being used * * the stream has fewer than maxLength bytes left * The returned buffer will either be one that was created by the factory or a MappedByteBuffer. */ public ByteBuffer readByteBuffer(ByteBufferFactory factory, int maxLength) throws IOException; /** * Allow application to manage how ByteBuffers are created for fallback buffers. */ public interface ByteBufferFactory { ByteBuffer createBuffer(int capacity); } {code} > enable HDFS local reads via mmap > > > Key: HDFS-4953 > URL: https://issues.apache.org/jira/browse/HDFS-4953 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 2.3.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: HDFS-4949 > > Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, > HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, > HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch > > > Currently, the short-circuit local read pathway allows HDFS clients to access > files directly without going through the DataNode. However, all of these > reads involve a copy at the operating system level, since they rely on the > read() / pread() / etc family of kernel interfaces. > We would like to enable HDFS to read local files via mmap. This would enable > truly zero-copy reads. > In the initial implementation, zero-copy reads will only be performed when > checksums were disabled. Later, we can use the DataNode's cache awareness to > only perform zero-copy reads when we know that checksum has already been > verified. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
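One way the ByteBufferFactory proposed in these comments could be backed by a reusable pool, so fallback buffers are only allocated when zero copy is unavailable and are recycled afterwards. The interface is restated from the comments above (including the releaseBuffer method from the later proposal); the pooling implementation itself is an illustrative assumption, not part of any patch.
{code}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Iterator;

// Interface as proposed in the comments above (not an existing Hadoop API).
interface ByteBufferFactory {
  ByteBuffer createBuffer(int capacity);
  void releaseBuffer(ByteBuffer buffer);
}

// Illustrative pooling implementation: reuse a released buffer when its
// capacity is big enough, otherwise allocate a new direct buffer.
class PooledByteBufferFactory implements ByteBufferFactory {
  private final ArrayDeque<ByteBuffer> pool = new ArrayDeque<ByteBuffer>();

  @Override
  public synchronized ByteBuffer createBuffer(int capacity) {
    Iterator<ByteBuffer> it = pool.iterator();
    while (it.hasNext()) {
      ByteBuffer candidate = it.next();
      if (candidate.capacity() >= capacity) {
        it.remove();
        candidate.clear();
        return candidate;
      }
    }
    return ByteBuffer.allocateDirect(capacity);
  }

  @Override
  public synchronized void releaseBuffer(ByteBuffer buffer) {
    pool.addLast(buffer);
  }
}
{code}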
[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap
[ https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760682#comment-13760682 ] Owen O'Malley commented on HDFS-4953: - Colin, please read my suggestion and my analysis of the difference before commenting. The simplified API absolutely provides a means to releasing the ByteBuffer and yet it is 2 lines long instead of 20. Furthermore, I didn't even realize that I was supposed to close the zero copy cursor, since it just came in from closable. My complaint stands. The API as currently in this branch is very error-prone and difficult to explain. Using it is difficult and requires complex handling including exception handlers to handle arbitrary file systems. > enable HDFS local reads via mmap > > > Key: HDFS-4953 > URL: https://issues.apache.org/jira/browse/HDFS-4953 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 2.3.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: HDFS-4949 > > Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, > HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, > HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch > > > Currently, the short-circuit local read pathway allows HDFS clients to access > files directly without going through the DataNode. However, all of these > reads involve a copy at the operating system level, since they rely on the > read() / pread() / etc family of kernel interfaces. > We would like to enable HDFS to read local files via mmap. This would enable > truly zero-copy reads. > In the initial implementation, zero-copy reads will only be performed when > checksums were disabled. Later, we can use the DataNode's cache awareness to > only perform zero-copy reads when we know that checksum has already been > verified. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira