[jira] [Resolved] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-27 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16917.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Currently we have the following metrics for datanode reads.
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operation|
> We would like to add a new quantile metric calculating the transfer rate for 
> datanode reads.
> This will give us a distribution across a window of the read transfer rate 
> for each datanode.
> Quantiles for transfer rate per host will help in identifying issues like 
> hotspotting of datasets as well as finding repetitive slow datanodes.
>  
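> A minimal illustrative sketch (not from the issue; the actual patch presumably uses 
> Hadoop's metrics2 quantile support): it only shows how a per-read transfer-rate 
> sample could be derived and fed into a naive window-based quantile. All names are 
> hypothetical.
> {code:java}
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.List;
>
> public class TransferRateSketch {
>   // Rate samples (MB/s) for the current window; real metrics use a streaming estimator.
>   private final List<Long> samples = new ArrayList<>();
>
>   // Convert a completed read into a transfer-rate sample, guarding against zero duration.
>   public void recordRead(long bytesRead, long durationMs) {
>     if (bytesRead <= 0 || durationMs <= 0) {
>       return;
>     }
>     long megabytesPerSecond = (bytesRead * 1000L) / (durationMs * 1024L * 1024L);
>     samples.add(megabytesPerSecond);
>   }
>
>   // Naive quantile over the current window, e.g. quantile(0.99) for p99.
>   public long quantile(double q) {
>     if (samples.isEmpty()) {
>       return 0;
>     }
>     List<Long> sorted = new ArrayList<>(samples);
>     Collections.sort(sorted);
>     int index = (int) Math.min(sorted.size() - 1, Math.floor(q * sorted.size()));
>     return sorted.get(index);
>   }
> }
> {code}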






[jira] [Resolved] (HDFS-16890) RBF: Add periodic state refresh to keep router state near active namenode's

2023-02-27 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16890.
--
Fix Version/s: 3.4.0
   3.3.6
   Resolution: Fixed

> RBF: Add periodic state refresh to keep router state near active namenode's
> -
>
> Key: HDFS-16890
> URL: https://issues.apache.org/jira/browse/HDFS-16890
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> When using the ObserverReadProxyProvider, clients can set 
> *dfs.client.failover.observer.auto-msync-period...* to periodically get the 
> Active namenode's state. When using routers without the 
> ObserverReadProxyProvider, this periodic update is lost.
> In a busy cluster, the Router constantly gets updated with the active 
> namenode's state when
>  # There is a write operation.
>  # There is an operation (read/write) from a new client.
> However, in the scenario when there are no new clients and no write 
> operations, the state kept in the router can lag behind the active's. The 
> router does update its state with responses from the Observer, but the 
> observer may be lagging behind too.
> We should have a periodic refresh in the router to serve a similar role as 
> *dfs.client.failover.observer.auto-msync-period*
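>
> A minimal sketch (not part of the issue) of what such a periodic refresh could look 
> like, using a plain ScheduledExecutorService; the refresh action itself is a 
> hypothetical placeholder for whatever call pulls the active namenode's latest state 
> into the router.
> {code:java}
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
>
> public class PeriodicStateRefreshSketch {
>   private final ScheduledExecutorService scheduler =
>       Executors.newSingleThreadScheduledExecutor();
>
>   // refreshAction stands in for the router-side equivalent of an auto-msync:
>   // fetching the active namenode's current stateId.
>   public void start(Runnable refreshAction, long periodMs) {
>     scheduler.scheduleWithFixedDelay(refreshAction, periodMs, periodMs, TimeUnit.MILLISECONDS);
>   }
>
>   public void stop() {
>     scheduler.shutdownNow();
>   }
> }
> {code}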






[jira] [Resolved] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-22 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16901.
--
Fix Version/s: 3.4.0
   3.3.6
   Resolution: Fixed

Thanks, Simba!

> RBF: Routers should propagate the real user in the UGI via the caller context
> -
>
> Key: HDFS-16901
> URL: https://issues.apache.org/jira/browse/HDFS-16901
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> If the router receives an operation from a proxyUser, it drops the realUser 
> in the UGI and makes the routerUser the realUser for the operation that goes 
> to the namenode.
> In the namenode UGI logs, we'd like the ability to know the original realUser.
> The router should propagate the realUser from the client call as part of the 
> callerContext.
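>
> A minimal sketch (hypothetical, not the actual patch) of carrying the real user in a 
> caller-context string; the "realUser=" tag and helper are illustrative only.
> {code:java}
> public final class CallerContextSketch {
>   private CallerContextSketch() {}
>
>   // Append a realUser tag so the namenode audit log can recover the original
>   // real user behind a proxy-user call.
>   public static String withRealUser(String existingContext, String realUser) {
>     String tag = "realUser=" + realUser;
>     if (existingContext == null || existingContext.isEmpty()) {
>       return tag;
>     }
>     return existingContext + "," + tag;
>   }
> }
> {code}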






[jira] [Resolved] (HDFS-16895) NamenodeHeartbeatService should use credentials of logged in user

2023-02-07 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16895.
--
Fix Version/s: 3.4.0
   3.3.5
 Assignee: Hector Sandoval Chaverri
   Resolution: Fixed

> NamenodeHeartbeatService should use credentials of logged in user
> -
>
> Key: HDFS-16895
> URL: https://issues.apache.org/jira/browse/HDFS-16895
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: Hector Sandoval Chaverri
>Assignee: Hector Sandoval Chaverri
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> NamenodeHeartbeatService has been found to log the errors when querying 
> protected Namenode JMX APIs. We have been able to work around this by running 
> kinit with the DFS_ROUTER_KEYTAB_FILE_KEY and 
> DFS_ROUTER_KERBEROS_PRINCIPAL_KEY on the router.
> While investigating a solution, we found that making the request inside a 
> UserGroupInformation.getLoginUser().doAs() call does not require a prior kinit.
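>
> A minimal sketch of that approach (not the actual patch); fetchJmx() is a 
> hypothetical placeholder for the existing HTTP request against the namenode's /jmx 
> endpoint.
> {code:java}
> import java.io.IOException;
> import java.security.PrivilegedExceptionAction;
> import org.apache.hadoop.security.UserGroupInformation;
>
> public class JmxAsLoginUserSketch {
>   public String queryJmx(final String jmxUrl) throws IOException, InterruptedException {
>     // Running the request as the login user lets it use the service's keytab
>     // credentials instead of depending on an external kinit.
>     UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
>     return loginUser.doAs(new PrivilegedExceptionAction<String>() {
>       @Override
>       public String run() throws IOException {
>         return fetchJmx(jmxUrl);
>       }
>     });
>   }
>
>   private String fetchJmx(String jmxUrl) throws IOException {
>     throw new UnsupportedOperationException("placeholder for the real JMX call");
>   }
> }
> {code}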
> The error logged is:
> {noformat}
> 2022-08-16 21:35:00,265 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.FederationUtil: Cannot parse 
> JMX output for Hadoop:service=NameNode,name=FSNamesystem* from server 
> ltx1-yugiohnn03-ha1.grid.linkedin.com:50070
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> Error while authenticating with endpoint: 
> http://ltx1-yugiohnn03-ha1.grid.linkedin.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem*
>   at sun.reflect.GeneratedConstructorAccessor55.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:219)
>   at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:350)
>   at 
> org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:186)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.FederationUtil.getJmx(FederationUtil.java:82)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateJMXParameters(NamenodeHeartbeatService.java:352)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.getNamenodeStatusReport(NamenodeHeartbeatService.java:295)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:218)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.periodicInvoke(NamenodeHeartbeatService.java:172)
>   at 
> org.apache.hadoop.hdfs.server.federation.router.PeriodicService$1.run(PeriodicService.java:178)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:360)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:204)
>   ... 15 more
> Caused by: GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)
>   at 
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>   at 
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
>   at 
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>   at 
> sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
>   at 
> 

[jira] [Resolved] (HDFS-16886) Fix documentation for StateStoreRecordOperations#get(Class ..., Query ...)

2023-01-11 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16886.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

> Fix documentation for StateStoreRecordOperations#get(Class ..., Query ...)
> --
>
> Key: HDFS-16886
> URL: https://issues.apache.org/jira/browse/HDFS-16886
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> For {*}StateStoreRecordOperations#get(Class ..., Query ...){*}, when multiple 
> records match, the documentation says both that a null value should be returned 
> and that an IOException should be thrown. Both cannot happen at the same time.
> I believe the intended behavior is that an IOException is thrown. This is the 
> implementation in {*}StateStoreBaseImpl{*}.
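>
> A small sketch of the intended contract (illustrative only; names are hypothetical, 
> this is not the StateStoreBaseImpl code):
> {code:java}
> import java.io.IOException;
> import java.util.List;
>
> public class SingleRecordQuerySketch {
>   // No match -> null; exactly one match -> that record; multiple matches -> IOException.
>   public static <T> T getSingle(List<T> matches) throws IOException {
>     if (matches == null || matches.isEmpty()) {
>       return null;
>     }
>     if (matches.size() > 1) {
>       throw new IOException("Found more than one record matching the query");
>     }
>     return matches.get(0);
>   }
> }
> {code}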






[jira] [Resolved] (HDFS-16877) Namenode doesn't use alignment context in TestObserverWithRouter

2023-01-06 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16877.
--
Fix Version/s: 3.4.0
 Assignee: Simbarashe Dzinamarira
   Resolution: Fixed

I've committed this. Thanks, Simba!

> Namenode doesn't use alignment context in TestObserverWithRouter
> 
>
> Key: HDFS-16877
> URL: https://issues.apache.org/jira/browse/HDFS-16877
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> We need to set "{*}dfs.namenode.state.context.enabled{*}" to true for the 
> namenode to send its stateId in client responses.
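>
> For illustration, a minimal sketch of enabling the flag in a test configuration 
> (assumes only the standard Configuration API):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class AlignmentContextConfSketch {
>   public static Configuration withStateContext() {
>     Configuration conf = new Configuration();
>     // Without this, the namenode never attaches its stateId to responses, so
>     // observer-read tests end up exercising nothing.
>     conf.setBoolean("dfs.namenode.state.context.enabled", true);
>     return conf;
>   }
> }
> {code}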






[jira] [Resolved] (HDFS-16851) RBF: Add a utility to dump the StateStore

2022-11-29 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16851.
--
Fix Version/s: 3.3.6
   3.4.0
   Resolution: Fixed

> RBF: Add a utility to dump the StateStore
> -
>
> Key: HDFS-16851
> URL: https://issues.apache.org/jira/browse/HDFS-16851
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.6, 3.4.0
>
>
> It would be useful to have a utility to dump the StateStore for RBF.






[jira] [Resolved] (HDFS-16847) RBF: StateStore writer should not commit tmp file if there was an error in writing the file.

2022-11-28 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16847.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

I committed this. Thanks, Simba!

> RBF: StateStore writer should not commit tmp file if there was an error in 
> writing the file.
> 
>
> Key: HDFS-16847
> URL: https://issues.apache.org/jira/browse/HDFS-16847
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> The file based implementation of the RBF state store has a commit step that 
> moves a temporary file to a permanent location.
> There is a check to see whether the write of the temp file was successful; 
> however, the commit code doesn't check the success flag.
> This is the relevant code: 
> [https://github.com/apache/hadoop/blob/7d39abd799a5f801a9fd07868a193205ab500bfa/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreFileBaseImpl.java#L369]
>  
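> A minimal sketch of the intended behavior (hypothetical interfaces, not the 
> StateStoreFileBaseImpl code): the commit step should only run when the temporary 
> write reported success.
> {code:java}
> import java.io.IOException;
>
> public class CommitTmpFileSketch {
>   interface RecordWriter {
>     boolean writeTempFile() throws IOException;
>     boolean commitTempFile() throws IOException;
>     void deleteTempFile() throws IOException;
>   }
>
>   public static boolean writeAndCommit(RecordWriter writer) throws IOException {
>     boolean success = writer.writeTempFile();
>     if (!success) {
>       writer.deleteTempFile();       // clean up instead of committing a bad file
>       return false;
>     }
>     return writer.commitTempFile();  // rename tmp -> permanent only on success
>   }
> }
> {code}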






[jira] [Resolved] (HDFS-16845) Add configuration flag to enable observer reads on routers without using ObserverReadProxyProvider

2022-11-28 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16845.
--
Fix Version/s: 3.4.0
   3.3.5
   2.10.3
   Resolution: Fixed

I just committed this. Thanks, Simba!

> Add configuration flag to enable observer reads on routers without using 
> ObserverReadProxyProvider
> --
>
> Key: HDFS-16845
> URL: https://issues.apache.org/jira/browse/HDFS-16845
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5, 2.10.3
>
>
> In order for clients to have routers forward their reads to observers, the 
> clients must use a proxy with an alignment context. This is currently 
> achieved by using the ObserverReadProxyProvider.
> Using ObserverReadProxyProvider keeps client configurations backward 
> compatible.
> However, the ObserverReadProxyProvider forces an msync on initialization 
> which is not required with routers.
> Performing msync calls is more expensive with routers because the router fans 
> out the call to all namespaces, so we'd like to avoid this.






[jira] [Created] (HDFS-16856) [RBF] Refactor router admin command to use HDFS AdminHelper class

2022-11-28 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16856:


 Summary: [RBF] Refactor router admin command to use HDFS 
AdminHelper class
 Key: HDFS-16856
 URL: https://issues.apache.org/jira/browse/HDFS-16856
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, the router admin class is a bit of a mess with a lot of custom 
programming. We should use the infrastructure that was developed in the 
AdminHelper class to standardize the command processing.






[jira] [Created] (HDFS-16851) [RBF] Utility to textually dump the StateStore

2022-11-21 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16851:


 Summary: [RBF] Utility to textually dump the StateStore
 Key: HDFS-16851
 URL: https://issues.apache.org/jira/browse/HDFS-16851
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Reporter: Owen O'Malley
Assignee: Owen O'Malley


It would be useful to have a utility to dump the StateStore for RBF.






[jira] [Resolved] (HDFS-16844) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-18 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16844.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

> [RBF] The routers should be resilient against exceptions from StateStore
> 
>
> Key: HDFS-16844
> URL: https://issues.apache.org/jira/browse/HDFS-16844
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.4
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> Currently, a single exception from the StateStore will cripple a router by 
> clearing the caches before the replacement is loaded. Since the routers have 
> the information in an in-memory cache, it is better to keep running. There is 
> still the timeout that will push the router into safe-mode if it can't load 
> the state store over a longer period of time.
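>
> A minimal sketch of the resilience idea (hypothetical, not the actual patch): on a 
> failed refresh the previously cached records are kept instead of being cleared.
> {code:java}
> import java.util.Collections;
> import java.util.List;
> import java.util.concurrent.Callable;
>
> public class ResilientCacheSketch<T> {
>   private volatile List<T> cached = Collections.emptyList();
>
>   // loader stands in for the StateStore read.
>   public List<T> refresh(Callable<List<T>> loader) {
>     try {
>       cached = loader.call();
>     } catch (Exception e) {
>       // Keep serving the stale-but-usable cache; the existing safe-mode timeout
>       // still covers a store that stays unreachable for too long.
>     }
>     return cached;
>   }
> }
> {code}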






[jira] [Resolved] (HDFS-16843) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-15 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16843.
--
Resolution: Duplicate

> [RBF] The routers should be resilient against exceptions from StateStore
> 
>
> Key: HDFS-16843
> URL: https://issues.apache.org/jira/browse/HDFS-16843
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.4
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>
> Currently, a single exception from the StateStore will cripple a router by 
> clearing the caches before the replacement is loaded. Since the routers have 
> the information in an in-memory cache, it is better to keep running. There is 
> still the timeout that will push the router into safe-mode if it can't load 
> the state store over a longer period of time.






[jira] [Created] (HDFS-16844) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-15 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16844:


 Summary: [RBF] The routers should be resilient against exceptions 
from StateStore
 Key: HDFS-16844
 URL: https://issues.apache.org/jira/browse/HDFS-16844
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Affects Versions: 3.3.4
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, a single exception from the StateStore will cripple a router by 
clearing the caches before the replacement is loaded. Since the routers have 
the information in an in-memory cache, it is better to keep running. There is 
still the timeout that will push the router into safe-mode if it can't load the 
state store over a longer period of time.






[jira] [Created] (HDFS-16843) [RBF] The routers should be resilient against exceptions from StateStore

2022-11-15 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16843:


 Summary: [RBF] The routers should be resilient against exceptions 
from StateStore
 Key: HDFS-16843
 URL: https://issues.apache.org/jira/browse/HDFS-16843
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Affects Versions: 3.3.4
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, a single exception from the StateStore will cripple a router by 
clearing the caches before the replacement is loaded. Since the routers have 
the information in an in-memory cache, it is better to keep running. There is 
still the timeout that will push the router into safe-mode if it can't load the 
state store over a longer period of time.






[jira] [Resolved] (HDFS-16836) StandbyCheckpointer can still trigger rollback fs image after RU is finalized

2022-11-15 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16836.
--
Fix Version/s: 3.4.0
   3.3.5
   Resolution: Fixed

I just committed this. Thanks, Lei!

> StandbyCheckpointer can still trigger rollback fs image after RU is finalized
> -
>
> Key: HDFS-16836
> URL: https://issues.apache.org/jira/browse/HDFS-16836
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Lei Yang
>Assignee: Lei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> StandbyCheckpointer triggers a rollback fsimage when a rolling upgrade (RU) is started.
> When RU is started, a flag (needRollbackImage) is set to true during edit 
> log replay, and it only gets reset to false when doCheckpoint() succeeds.
> Consider the following scenario:
>  # Start RU, needRollbackImage is set to true.
>  # doCheckpoint() failed.
>  # RU is finalized.
>  # namesystem.getFSImage().hasRollbackFSImage() is always false since 
> rollback image cannot be generated once RU is over.
>  # needRollbackImage was never set to false.
>  # Checkpoint threshold (1m txns) and period (1hr) are not honored.
> {code:java}
> StandbyCheckpointer:
> void doWork() {
>  
>   doCheckpoint();
>   // reset needRollbackCheckpoint to false only when we finish a ckpt
>   // for rollback image
>   if (needRollbackCheckpoint
>   && namesystem.getFSImage().hasRollbackFSImage()) {
> namesystem.setCreatedRollbackImages(true);
> namesystem.setNeedRollbackFsImage(false);
>   }
>   lastCheckpointTime = now;
> } {code}






[jira] [Created] (HDFS-16778) Separate out the logger for which DN is picked by a DFSInputStream

2022-09-19 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16778:


 Summary: Separate out the logger for which DN is picked by a 
DFSInputStream
 Key: HDFS-16778
 URL: https://issues.apache.org/jira/browse/HDFS-16778
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley


Currently, there is no way to know which DN a given stream chose without 
turning on debug for all of DFSClient. I'd like the ability to just get that 
logged.
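
A minimal sketch of the idea (hypothetical logger name, not the actual patch): a 
dedicated logger lets operators enable just this message without turning on DEBUG 
for the whole DFSClient.
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DatanodeChoiceLogSketch {
  // Hypothetical dedicated logger name that can be targeted in log4j configuration.
  private static final Logger DN_CHOICE_LOG =
      LoggerFactory.getLogger("org.apache.hadoop.hdfs.DFSClient.read.datanode");

  public static void logChosenDatanode(String blockId, String datanode) {
    DN_CHOICE_LOG.debug("Block {} will be read from datanode {}", blockId, datanode);
  }
}
{code}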






[jira] [Resolved] (HDFS-16767) RBF: Support observer node from Router-Based Federation

2022-09-14 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16767.
--
Fix Version/s: 3.4.0
   3.3.9
   Resolution: Fixed

I just committed this. Thanks, Simba!

> RBF: Support observer node from Router-Based Federation 
> 
>
> Key: HDFS-16767
> URL: https://issues.apache.org/jira/browse/HDFS-16767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> Enable routers to direct read calls to observer namenodes.






[jira] [Resolved] (HDFS-16518) KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager

2022-03-30 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16518.
--
Fix Version/s: 3.4.0
   2.10.2
   3.3.3
   Resolution: Fixed

I committed this. Thanks, Lei!

> KeyProviderCache close cached KeyProvider with Hadoop ShutdownHookManager
> -
>
> Key: HDFS-16518
> URL: https://issues.apache.org/jira/browse/HDFS-16518
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0
>Reporter: Lei Yang
>Assignee: Lei Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.3.3
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> KeyProvider implements the Closeable interface, but some custom implementations of 
> KeyProvider also need an explicit close in KeyProviderCache. An example is using a 
> custom KeyProvider in DFSClient to read encrypted files on HDFS. 
> KeyProvider currently gets closed in KeyProviderCache only when a cache entry 
> is expired or invalidated. In some cases this does not happen, which seems 
> related to the Guava cache.
> This patch uses Hadoop's JVM ShutdownHookManager to globally clean up cache 
> entries and thus close each KeyProvider via the cache's removal hook right after the 
> filesystem instance gets closed, in a deterministic way.
> {code:java}
> Class KeyProviderCache
> ...
>  public KeyProviderCache(long expiryMs) {
>   cache = CacheBuilder.newBuilder()
> .expireAfterAccess(expiryMs, TimeUnit.MILLISECONDS)
> .removalListener(new RemovalListener<URI, KeyProvider>() {
>   @Override
>   public void onRemoval(
>   @Nonnull RemovalNotification<URI, KeyProvider> notification) {
> try {
>   assert notification.getValue() != null;
>   notification.getValue().close();
> } catch (Throwable e) {
>   LOG.error(
>   "Error closing KeyProvider with uri ["
>   + notification.getKey() + "]", e);
> }
>   }
> })
> .build(); 
> }{code}
> We could have added a new KeyProviderCache#close method and had each DFSClient 
> call it to close the KeyProvider at the end of each DFSClient#close call, but 
> that would expose another problem: it could close a global cache shared among 
> different DFSClient instances.
>  
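> A minimal sketch of the shutdown-hook idea (illustrative only; the priority value 
> and wiring are assumptions, not the actual patch):
> {code:java}
> import com.google.common.cache.Cache;
> import org.apache.hadoop.util.ShutdownHookManager;
>
> public class KeyProviderCacheShutdownSketch {
>   // Invalidating the cache on JVM shutdown fires the removal listener above,
>   // which in turn closes each cached KeyProvider.
>   public static void registerShutdownHook(final Cache<?, ?> cache, int priority) {
>     ShutdownHookManager.get().addShutdownHook(new Runnable() {
>       @Override
>       public void run() {
>         cache.invalidateAll();
>       }
>     }, priority);
>   }
> }
> {code}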






[jira] [Resolved] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines

2022-03-23 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16517.
--
Fix Version/s: 2.10.2
   Resolution: Fixed

> In 2.10 the distance metric is wrong for non-DN machines
> 
>
> Key: HDFS-16517
> URL: https://issues.apache.org/jira/browse/HDFS-16517
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.2
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In 2.10, the metric for distance between the client and the data node is 
> wrong for machines that aren't running data nodes (i.e. 
> getWeightUsingNetworkLocation). The code works correctly in 3.3+. 
> Currently
>  
> ||Client||DataNode||getWeight||getWeightUsingNetworkLocation||
> |/rack1/node1|/rack1/node1|0|0|
> |/rack1/node1|/rack1/node2|2|2|
> |/rack1/node1|/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod1/rack1/node2|2|2|
> |/pod1/rack1/node1|/pod1/rack2/node2|4|2|
> |/pod1/rack1/node1|/pod2/rack2/node2|6|4|
>  
> This bug will destroy data locality on clusters where the clients share racks 
> with DataNodes, but are running on machines that aren't running DataNodes, 
> such as striping federated HDFS clusters across racks.
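>
> For reference, a small standalone sketch (not Hadoop code) of the distance that the 
> getWeight column reflects: count the hops from each network location up to their 
> deepest common ancestor.
> {code:java}
> public final class TopologyDistanceSketch {
>   private TopologyDistanceSketch() {}
>
>   // "/rack1/node1" vs "/rack2/node2" -> no common prefix below "/" -> 2 + 2 = 4.
>   public static int distance(String locationA, String locationB) {
>     String[] a = locationA.substring(1).split("/");
>     String[] b = locationB.substring(1).split("/");
>     int common = 0;
>     while (common < a.length && common < b.length && a[common].equals(b[common])) {
>       common++;
>     }
>     return (a.length - common) + (b.length - common);
>   }
> }
> {code}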






[jira] [Created] (HDFS-16517) In 2.10 the distance metric is wrong for non-DN machines

2022-03-21 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16517:


 Summary: In 2.10 the distance metric is wrong for non-DN machines
 Key: HDFS-16517
 URL: https://issues.apache.org/jira/browse/HDFS-16517
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.10.1
Reporter: Owen O'Malley
Assignee: Owen O'Malley


In 2.10, the metric for distance between the client and the data node is wrong 
for machines that aren't running data nodes (i.e. 
getWeightUsingNetworkLocation). The code works correctly in 3.3+.

Currently

 
||Client||DataNode ||getWeight||getWeightUsingNetworkLocation||
|/rack1/node1|/rack1/node1|0|0|
|/rack1/node1|/rack1/node2|2|2|
|/rack1/node1|/rack2/node2|4|2|
|/pod1/rack1/node1|/pod1/rack1/node2|2|2|
|/pod1/rack1/node1|/pod1/rack2/node2|4|2|
|/pod1/rack1/node1|/pod2/rack2/node2|6|4|

 






[jira] [Resolved] (HDFS-16495) RBF should prepend the client ip rather than append it.

2022-03-14 Thread Owen O'Malley (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-16495.
--
Fix Version/s: 3.4.0
   3.2.4
   Resolution: Fixed

> RBF should prepend the client ip rather than append it.
> ---
>
> Key: HDFS-16495
> URL: https://issues.apache.org/jira/browse/HDFS-16495
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently the Routers append the client ip to the caller context if and only 
> if it is not already set. This would allow the user to fake their ip by 
> setting the caller context themselves. It is much better to prepend it unconditionally.
> The NN must be able to trust the client ip from the caller context.
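>
> A minimal sketch of the prepend-vs-append idea (hypothetical field format, not the 
> actual patch):
> {code:java}
> public final class PrependClientIpSketch {
>   private PrependClientIpSketch() {}
>
>   // Prepending unconditionally means a client-supplied "clientIp:" entry can never
>   // masquerade as the router-verified address: the namenode trusts the first entry.
>   public static String prependClientIp(String clientIp, String existingContext) {
>     String entry = "clientIp:" + clientIp;
>     if (existingContext == null || existingContext.isEmpty()) {
>       return entry;
>     }
>     return entry + "," + existingContext;
>   }
> }
> {code}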






[jira] [Created] (HDFS-16495) RBF should prepend the client ip rather than append it.

2022-03-04 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16495:


 Summary: RBF should prepend the client ip rather than append it.
 Key: HDFS-16495
 URL: https://issues.apache.org/jira/browse/HDFS-16495
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently the Routers append the client ip to the caller context if and only if 
it is not already set. This would allow the user to fake their ip by setting 
the caller context themselves. It is much better to prepend it unconditionally.

The NN must be able to trust the client ip from the caller context.






[jira] [Created] (HDFS-16253) Add a toString implementation to DFSInputStream

2021-10-04 Thread Owen O'Malley (Jira)
Owen O'Malley created HDFS-16253:


 Summary: Add a toString implementation to DFSInputStream
 Key: HDFS-16253
 URL: https://issues.apache.org/jira/browse/HDFS-16253
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley


It would help debugging if there was a useful toString on DFSInputStream.






[jira] [Created] (HDFS-14244) hdfs++ doesn't add necessary libraries to dynamic library link

2019-01-30 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-14244:


 Summary: hdfs++ doesn't add necessary libraries to dynamic library 
link
 Key: HDFS-14244
 URL: https://issues.apache.org/jira/browse/HDFS-14244
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley


When linking with shared libraries, the libhdfs++ cmake file doesn't link 
correctly.






[jira] [Created] (HDFS-9025) fix compilation issues on arch linux

2015-09-04 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-9025:
---

 Summary: fix compilation issues on arch linux
 Key: HDFS-9025
 URL: https://issues.apache.org/jira/browse/HDFS-9025
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Owen O'Malley


There are several compilation issues.





[jira] [Created] (HDFS-8707) Implement an async pure c++ HDFS client

2015-07-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-8707:
---

 Summary: Implement an async pure c++ HDFS client
 Key: HDFS-8707
 URL: https://issues.apache.org/jira/browse/HDFS-8707
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Reporter: Owen O'Malley
Assignee: Haohui Mai


As part of working on the C++ ORC reader at ORC-3, we need an HDFS pure C++ 
client that lets us do async io to HDFS. We want to start from the code that 
Haohui's been working on at https://github.com/haohui/libhdfspp .





[jira] [Resolved] (HDFS-3983) Hftp should support both SPNEGO and KSSL

2013-10-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-3983.
-

  Resolution: Won't Fix
Target Version/s:   (was: )

KSSL is deprecated and should never be used for secure deployments.

 Hftp should support both SPNEGO and KSSL
 

 Key: HDFS-3983
 URL: https://issues.apache.org/jira/browse/HDFS-3983
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Blocker
 Attachments: hdfs-3983.txt, hdfs-3983.txt


 Hftp currently doesn't work against a secure cluster unless you configure 
 {{dfs.https.port}} to be the http port, otherwise the client can't fetch 
 tokens:
 {noformat}
 $ hadoop fs -ls hftp://c1225.hal.cloudera.com:50070/
 12/09/26 18:02:00 INFO fs.FileSystem: Couldn't get a delegation token from 
 http://c1225.hal.cloudera.com:50470 using http.
 ls: Security enabled but user not authenticated by filter
 {noformat}
 This is due to Hftp still using the https port. Post HDFS-2617 it should use 
 the regular http port. Hsftp should still use the secure port, however now 
 that we have HADOOP-8581 it's worth considering removing Hsftp entirely. I'll 
 start a separate thread about that.  





[jira] [Resolved] (HDFS-3699) HftpFileSystem should try both KSSL and SPNEGO when authentication is required

2013-10-01 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-3699.
-

Resolution: Won't Fix

Using KSSL is strongly deprecated and should be avoided in secure clusters.

 HftpFileSystem should try both KSSL and SPNEGO when authentication is required
 --

 Key: HDFS-3699
 URL: https://issues.apache.org/jira/browse/HDFS-3699
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: eric baldeschwieler

 See discussion in HDFS-2617 (Replaced Kerberized SSL for image transfer and 
 fsck with SPNEGO-based solution).
 To handle the transition from Hadoop1.0 systems running KSSL authentication 
 to Hadoop systems running SPNEGO, it would be good to fix the client in both 
 1 and 2 to try SPNEGO and then fall back to try KSSL.  
 This will allow organizations that are running a lot of Hadoop 1.0 to 
 gradually transition over, without needing to convert all clusters at the 
 same time.  They would first need to update their 1.0 HFTP clients (and 
 2.0/0.23 if they are already running those) and then they could copy data 
 between clusters without needing to move all clusters to SPNEGO in a big bang.





[jira] [Resolved] (HDFS-4009) WebHdfsFileSystem and HftpFileSystem don't need delegation tokens

2012-10-09 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-4009.
-

Resolution: Won't Fix

This is a feature, not a bug.

In particular, if your kerberos ticket has 1 hour left, the application will 
fail without a token. In tools that copy large amounts of data using the http 
filesystem, this happens relatively often.

 WebHdfsFileSystem and HftpFileSystem don't need delegation tokens
 -

 Key: HDFS-4009
 URL: https://issues.apache.org/jira/browse/HDFS-4009
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Tom White
Assignee: Karthik Kambatla
 Attachments: hadoop-8852.patch, hadoop-8852.patch, 
 hadoop-8852-v1.patch


 Parent JIRA to track the work of removing delegation tokens from these 
 filesystems. 
 This JIRA has evolved from the initial issue of these filesystems not 
 stopping the DelegationTokenRenewer thread they were creating.
 After further investigation, Daryn pointed out - If you can get a token, you 
 don't need a token! Hence, these filesystems shouldn't use delegation tokens.
 Evolution of the JIRA is listed below:
 Update 2:
 DelegationTokenRenewer is not required. The filesystems that are using it 
 already have Krb tickets and do not need tokens. Remove 
 DelegationTokenRenewer and all the related logic from WebHdfs and Hftp 
 filesystems.
 Update1:
 DelegationTokenRenewer should be Singleton - the instance and renewer threads 
 should be created/started lazily. The filesystems using the renewer shouldn't 
 need to explicity start/stop the renewer, and only register/de-register for 
 token renewal.
 Initial issue:
 HftpFileSystem and WebHdfsFileSystem should stop the DelegationTokenRenewer 
 thread when they are closed. 



[jira] [Resolved] (HDFS-4010) Remove unused TokenRenewer implementation from WebHdfsFileSystem and HftpFileSystem

2012-10-09 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-4010.
-

Resolution: Won't Fix

They are used via the java ServiceLoader interface.

 Remove unused TokenRenewer implementation from WebHdfsFileSystem and 
 HftpFileSystem
 ---

 Key: HDFS-4010
 URL: https://issues.apache.org/jira/browse/HDFS-4010
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: HDFS-4010.patch


 WebHdfsFileSystem and HftpFileSystem implement TokenRenewer without using 
 anywhere.
 As we are in the process of migrating them to not use tokens, this code 
 should be removed.



[jira] [Resolved] (HDFS-4011) WebHdfsFileSystem shouldn't implicitly fetch delegation tokens

2012-10-09 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-4011.
-

Resolution: Invalid

This is a feature.

 WebHdfsFileSystem shouldn't implicitly fetch delegation tokens
 --

 Key: HDFS-4011
 URL: https://issues.apache.org/jira/browse/HDFS-4011
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla





[jira] [Resolved] (HDFS-4012) HftpFileSystem shouldn't implicity fetch delegation tokens

2012-10-09 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-4012.
-

Resolution: Not A Problem

This is not a problem.

 HftpFileSystem shouldn't implicity fetch delegation tokens
 --

 Key: HDFS-4012
 URL: https://issues.apache.org/jira/browse/HDFS-4012
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Karthik Kambatla





[jira] [Created] (HDFS-3993) The KSSL class should not limit the ssl ciphers

2012-10-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3993:
---

 Summary: The KSSL class should not limit the ssl ciphers
 Key: HDFS-3993
 URL: https://issues.apache.org/jira/browse/HDFS-3993
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley


The KSSL class' static block currently limits the ssl ciphers to a single 
value. It should use a much more permissive list.



[jira] [Created] (HDFS-3749) Disable check for jsvc on windows

2012-08-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3749:
---

 Summary: Disable check for jsvc on windows
 Key: HDFS-3749
 URL: https://issues.apache.org/jira/browse/HDFS-3749
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Jsvc doesn't make sense on windows and thus we should not require the datanode 
to start up under it on that platform.





[jira] [Resolved] (HDFS-2386) with security enabled fsck calls lead to handshake_failure and hftp fails throwing the same exception in the logs

2012-06-25 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-2386.
-

Resolution: Invalid

Fixed via HDFS-2617.

 with security enabled fsck calls lead to handshake_failure and hftp fails 
 throwing the same exception in the logs
 -

 Key: HDFS-2386
 URL: https://issues.apache.org/jira/browse/HDFS-2386
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.205.0
Reporter: Arpit Gupta







[jira] [Created] (HDFS-3466) The SPNEGO filter for the NameNode should come out of the web keytab file

2012-05-25 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3466:
---

 Summary: The SPNEGO filter for the NameNode should come out of the 
web keytab file
 Key: HDFS-3466
 URL: https://issues.apache.org/jira/browse/HDFS-3466
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node, security
Affects Versions: 2.0.0-alpha, 1.1.0
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently, the spnego filter uses the DFS_NAMENODE_KEYTAB_FILE_KEY to find the 
keytab. It should use the DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY to do it.





[jira] [Created] (HDFS-3461) HFTP should use the same port & protocol for getting the delegation token

2012-05-23 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3461:
---

 Summary: HFTP should use the same port & protocol for getting the 
delegation token
 Key: HDFS-3461
 URL: https://issues.apache.org/jira/browse/HDFS-3461
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 1.1.0


Currently, hftp uses http to the Namenode's https port, which doesn't work.





[jira] [Created] (HDFS-3374) hdfs' TestDelegationToken fails intermittently with a race condition

2012-05-04 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3374:
---

 Summary: hdfs' TestDelegationToken fails intermittently with a 
race condition
 Key: HDFS-3374
 URL: https://issues.apache.org/jira/browse/HDFS-3374
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Reporter: Owen O'Malley
Assignee: Owen O'Malley


The testcase is failing because the MiniDFSCluster is shut down before the 
secret manager can change the key, which calls System.exit with no edit streams 
available.

{code}

[junit] 2012-05-04 15:03:51,521 WARN  common.Storage 
(FSImage.java:updateRemovedDirs(224)) - Removing storage dir 
/home/horton/src/hadoop/build/test/data/dfs/name1
[junit] 2012-05-04 15:03:51,522 FATAL namenode.FSNamesystem 
(FSEditLog.java:fatalExit(388)) - No edit streams are accessible
[junit] java.lang.Exception: No edit streams are accessible
[junit] at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:388)
[junit] at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.exitIfNoStreams(FSEditLog.java:407)
[junit] at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.removeEditsAndStorageDir(FSEditLog.java:432)
[junit] at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.removeEditsStreamsAndStorageDirs(FSEditLog.java:468)
[junit] at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:1028)
[junit] at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logUpdateMasterKey(FSNamesystem.java:5641)
[junit] at 
org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logUpdateMasterKey(DelegationTokenSecretManager.java:286)
[junit] at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:150)
[junit] at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey(AbstractDelegationTokenSecretManager.java:174)
[junit] at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:385)
[junit] at java.lang.Thread.run(Thread.java:662)
[junit] Running org.apache.hadoop.hdfs.security.TestDelegationToken
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hdfs.security.TestDelegationToken FAILED 
(crashed)
{code}





[jira] [Created] (HDFS-3348) After HDFS-2617 can't use 0.0.0.0 in dfs.http.address

2012-05-02 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3348:
---

 Summary: After HDFS-2617 can't use 0.0.0.0 in dfs.http.address
 Key: HDFS-3348
 URL: https://issues.apache.org/jira/browse/HDFS-3348
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley


After HDFS-2617, if you use 0.0.0.0 in dfs.http.address, the _HOST resolution 
for SPNEGO will create a principal like nn/0.0@example.com.





[jira] [Created] (HDFS-3345) Primary and secondary Principals must be the same

2012-05-01 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3345:
---

 Summary: Primary and secondary Principals must be the same
 Key: HDFS-3345
 URL: https://issues.apache.org/jira/browse/HDFS-3345
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley


The NameNode and SecondaryNameNode have two different configuration knobs 
(dfs.namenode.kerberos.principal and 
dfs.secondary.namenode.kerberos.principal), but the secondary namenode fails 
authorization unless it is the same user.





[jira] [Created] (HDFS-3316) The tar ball doesn't include jsvc any more

2012-04-24 Thread Owen O'Malley (JIRA)
Owen O'Malley created HDFS-3316:
---

 Summary: The tar ball doesn't include jsvc any more
 Key: HDFS-3316
 URL: https://issues.apache.org/jira/browse/HDFS-3316
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 1.0.3


The current release tarballs on the 1.0 branch don't include jsvc by default.





[jira] [Resolved] (HDFS-2328) hftp throws NPE if security is not enabled on remote cluster

2011-09-14 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-2328.
-

Resolution: Fixed

I committed this to 20-s and 205. I'll commit it to trunk as part of 
MAPREDUCE-2764.

 hftp throws NPE if security is not enabled on remote cluster
 

 Key: HDFS-2328
 URL: https://issues.apache.org/jira/browse/HDFS-2328
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.205.0
Reporter: Daryn Sharp
Assignee: Owen O'Malley
Priority: Critical
 Fix For: 0.20.205.0

 Attachments: h-2328.patch


 If hftp cannot locate either an hdfs or hftp token in the ugi, it will call 
 {{getDelegationToken}} to acquire one from the remote nn.  This method may 
 return a null {{Token}} if security is disabled(*) on the remote nn.  Hftp 
 will internally call its {{setDelegationToken}}, which will throw an NPE when 
 the token is {{null}}.
 (*) Actually, if any problem happens while acquiring the token it assumes 
 security is disabled!  However, it's a pre-existing issue beyond the scope of 
 the token renewal changes.





[jira] [Resolved] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail

2011-06-10 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-1952.
-

Resolution: Fixed

Resolving this, since it was committed to trunk.

 FSEditLog.open() appears to succeed even if all EDITS directories fail
 --

 Key: HDFS-1952
 URL: https://issues.apache.org/jira/browse/HDFS-1952
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0, 0.23.0
Reporter: Matt Foley
Assignee: Andrew Wang
  Labels: newbie
 Attachments: hdfs-1952-0.22.patch, hdfs-1952.patch, hdfs-1952.patch, 
 hdfs-1952.patch


 FSEditLog.open() appears to succeed even if all of the individual 
 directories failed to allow creation of an EditLogOutputStream.  The problem 
 and solution are essentially similar to that of HDFS-1505.



[jira] [Resolved] (HDFS-1666) TestAuthorizationFilter is failing

2011-06-10 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-1666.
-

   Resolution: Fixed
Fix Version/s: 0.22.0
 Assignee: Todd Lipcon

This was committed to trunk.

 TestAuthorizationFilter is failing
 --

 Key: HDFS-1666
 URL: https://issues.apache.org/jira/browse/HDFS-1666
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/hdfsproxy
Affects Versions: 0.22.0, 0.23.0
Reporter: Konstantin Boudnik
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.22.0

 Attachments: hdfs-1666-disable-tests.txt


 two test cases were failing for a number of builds (see attached logs)



[jira] [Created] (HDFS-1809) Lack of proper DefaultMetricsSystem initialization breaks some tests

2011-04-05 Thread Owen O'Malley (JIRA)
Lack of proper DefaultMetricsSystem initialization breaks some tests


 Key: HDFS-1809
 URL: https://issues.apache.org/jira/browse/HDFS-1809
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Suresh Srinivas
 Fix For: 0.20.203.0


Following tests are failing:
TestHDFSServerPorts
TestNNLeaseRecovery
TestSaveNamespace



[jira] [Created] (HDFS-1810) Remove duplicate jar entries from common

2011-04-05 Thread Owen O'Malley (JIRA)
Remove duplicate jar entries from common


 Key: HDFS-1810
 URL: https://issues.apache.org/jira/browse/HDFS-1810
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Luke Lu


Remove the jars that we get from common from our direct dependency list.



[jira] [Created] (HDFS-1811) Create scripts to decommission datanodes

2011-04-05 Thread Owen O'Malley (JIRA)
Create scripts to decommission datanodes


 Key: HDFS-1811
 URL: https://issues.apache.org/jira/browse/HDFS-1811
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Erik Steffl


Create scripts to decommission datanodes:

  - distribute exclude file
- input is location of exclude file
- location on namenodes: hdfs getconf -excludeFile
- list of namenodes: hdfs getconf -namenodes
- scp the exclude file to all namenodes

  - refresh namenodes
- list of namenodes: hdfs getconf -namenodes
- refresh namenodes: hdfs dfsadmin -refreshNodes

Two scripts are needed because each of them might require different permissions.
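
As a rough illustration of the workflow above, a hedged Java sketch that 
drives the two steps by shelling out to the commands listed; the real 
deliverable is a pair of shell scripts, and the scp destination format and 
whitespace-splitting of the namenode list are assumptions.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class DecommissionHelper {
  // Runs a command and returns the first line of its output, e.g. "hdfs getconf -namenodes".
  static String run(String... cmd) throws IOException, InterruptedException {
    Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
    try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
      String line = r.readLine();
      p.waitFor();
      return line == null ? "" : line.trim();
    }
  }

  public static void main(String[] args) throws Exception {
    String localExcludes = args[0];  // input: location of the exclude file
    String excludePath = run("hdfs", "getconf", "-excludeFile");  // location on namenodes
    String namenodes   = run("hdfs", "getconf", "-namenodes");    // list of namenodes

    // Step 1: distribute the exclude file to every namenode host.
    for (String host : namenodes.split("\\s+")) {
      run("scp", localExcludes, host + ":" + excludePath);
    }

    // Step 2: ask the namenode(s) to re-read the exclude file.
    run("hdfs", "dfsadmin", "-refreshNodes");
  }
}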

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Created: (HDFS-1729) Improve metrics for measuring NN startup costs.

2011-03-07 Thread Owen O'Malley (JIRA)
Improve metrics for measuring NN startup costs.
---

 Key: HDFS-1729
 URL: https://issues.apache.org/jira/browse/HDFS-1729
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Owen O'Malley
Assignee: Matt Foley
 Fix For: 0.20.100


Current logging and metrics are insufficient to diagnose latency problems in 
cluster startup.  Add:
1. better logs in both Datanode and Namenode for Initial Block Report 
   processing, to help distinguish between block report processing problems 
   and RPC/queuing problems;
2. new logs to measure cost of scanning all blocks for over/under/invalid 
   replicas, which occurs in Namenode just before exiting safe mode;
3. new logs to measure cost of processing the under/invalid replica queues 
   (created by the above mentioned scan), which occurs just after exiting 
   safe mode, and is said to take 100% of CPU.
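
As a hedged illustration of items 2 and 3, a minimal phase-timing helper of 
the kind the new logs would need; the class and phase names are illustrative, 
not the actual NameNode code.

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class StartupPhaseTimer {
  private static final Log LOG = LogFactory.getLog(StartupPhaseTimer.class);

  // Wraps one startup phase (e.g. replica-queue processing) and logs its cost.
  static void timePhase(String phaseName, Runnable phase) {
    long start = System.currentTimeMillis();
    phase.run();
    LOG.info(phaseName + " took " + (System.currentTimeMillis() - start) + " ms");
  }
}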

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Resolved: (HDFS-98) creating a file in hdfs should not automatically create the parent directories

2010-09-12 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-98?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-98.
---

Resolution: Won't Fix

Leaving the old FileSystem behavior and fixing it in FileContext is right. 
Closing this.

 creating a file in hdfs should not automatically create the parent directories
 --

 Key: HDFS-98
 URL: https://issues.apache.org/jira/browse/HDFS-98
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Pi Song
 Attachments: hadoop-2759-complete1.patch, HADOOP-2759_1.patch, 
 hadoop_tmp.patch


 I think it would be better if HDFS didn't automatically create directories 
 for the user. In particular, in cleanup code, it would be nice if deleting a 
 directory couldn't be undone by mistake when a process that hasn't been 
 killed yet creates a new file inside it.
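
 A minimal sketch of the guard that cleanup code currently has to write by 
 hand against the Hadoop FileSystem API; the helper name is made up. The 
 check is inherently racy, which is why the eventual fix (per the resolution 
 comment above) lives in FileContext rather than in callers.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class SafeCreate {
  static FSDataOutputStream createNoMkdirs(FileSystem fs, Path file) throws IOException {
    Path parent = file.getParent();
    if (parent != null && !fs.exists(parent)) {
      // FileSystem.create() would silently recreate the parent; fail instead.
      throw new IOException("Parent directory does not exist: " + parent);
    }
    return fs.create(file);
  }
}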

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1187) Modify fetchdt to allow renewing and canceling token

2010-06-03 Thread Owen O'Malley (JIRA)
Modify fetchdt to allow renewing and canceling token


 Key: HDFS-1187
 URL: https://issues.apache.org/jira/browse/HDFS-1187
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: fetchdt.patch

I would like to extend fetchdt to allow renewing and canceling tokens.
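
A hedged sketch of what renew and cancel could look like against a token 
storage file, assuming the Credentials and Token.renew(Configuration) / 
Token.cancel(Configuration) APIs that later Hadoop releases provide; the 
fetchdt command-line flags themselves are not shown.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;

class TokenMaintenance {
  static void renewAll(String tokenFile, Configuration conf) throws Exception {
    Credentials creds = Credentials.readTokenStorageFile(new Path(tokenFile), conf);
    for (Token<?> token : creds.getAllTokens()) {
      long expiry = token.renew(conf);   // push the expiration time forward
      System.out.println("Renewed " + token.getKind() + " until " + expiry);
    }
  }

  static void cancelAll(String tokenFile, Configuration conf) throws Exception {
    Credentials creds = Credentials.readTokenStorageFile(new Path(tokenFile), conf);
    for (Token<?> token : creds.getAllTokens()) {
      token.cancel(conf);                // invalidate the token server-side
    }
  }
}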

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1178) The NameNode servlets should not use RPC to connect to the NameNode

2010-05-27 Thread Owen O'Malley (JIRA)
The NameNode servlets should not use RPC to connect to the NameNode
---

 Key: HDFS-1178
 URL: https://issues.apache.org/jira/browse/HDFS-1178
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently some of the NameNode servlets use RPC to connect from the NameNode to 
itself. They should do it more directly with the NameNode object.
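
A hedged sketch of the direct approach: the servlet pulls the in-process 
NameNode object out of the servlet context rather than opening an RPC 
connection back to itself. The "name.node" attribute key and the servlet 
class are illustrative assumptions.

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class DirectNameNodeServlet extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    // Assumes the NameNode registers itself with the embedded HTTP server at startup.
    Object nn = getServletContext().getAttribute("name.node");
    resp.getWriter().println("Serving directly from " + nn.getClass().getSimpleName());
  }
}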

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-986) Push HADOOP-6551 into HDFS

2010-02-18 Thread Owen O'Malley (JIRA)
Push HADOOP-6551 into HDFS
--

 Key: HDFS-986
 URL: https://issues.apache.org/jira/browse/HDFS-986
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Owen O'Malley


We need to throw exceptions with readable error messages instead of returning false on errors.
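
A small hedged sketch of the requested style change, using a made-up rename 
helper: the caller gets an exception with a readable reason instead of a 
bare false.

import java.io.File;
import java.io.IOException;

class Renamer {
  // Old style: the caller only learns "false" and has to guess why.
  boolean renameQuietly(File src, File dst) {
    return src.renameTo(dst);
  }

  // Requested style: failure carries a human-readable reason.
  void rename(File src, File dst) throws IOException {
    if (!src.renameTo(dst)) {
      throw new IOException("Failed to rename " + src + " to " + dst
          + (src.exists() ? "" : ": source does not exist"));
    }
  }
}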

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-787) Make the versions of libraries consistent

2009-11-25 Thread Owen O'Malley (JIRA)
Make the versions of libraries consistent
-

 Key: HDFS-787
 URL: https://issues.apache.org/jira/browse/HDFS-787
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.21.0, 0.22.0


This JIRA covers the HDFS side of making the library versions consistent.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-641) Move all of the benchmarks and tests that depend on mapreduce to mapreduce

2009-11-13 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-641.


Resolution: Fixed

I just committed this.

 Move all of the benchmarks and tests that depend on mapreduce to mapreduce
 --

 Key: HDFS-641
 URL: https://issues.apache.org/jira/browse/HDFS-641
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.20.2
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.21.0


 Currently, we have a bad cycle where, to build HDFS, you need to test 
 MapReduce and iterate once. This is broken.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-638) The build.xml references jars that don't exist

2009-09-22 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-638.


Resolution: Fixed

Suresh fixed this without a jira.

 The build.xml references jars that don't exist
 

 Key: HDFS-638
 URL: https://issues.apache.org/jira/browse/HDFS-638
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.22.0


 Currently the build is broken on trunk. The jar files need to be updated to 
 the current ones, along with the new version.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-641) Move all of the benchmarks and tests that depend on mapreduce to mapreduce

2009-09-22 Thread Owen O'Malley (JIRA)
Move all of the benchmarks and tests that depend on mapreduce to mapreduce
--

 Key: HDFS-641
 URL: https://issues.apache.org/jira/browse/HDFS-641
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.20.2
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.21.0


Currently, we have a bad cycle where, to build HDFS, you need to test 
MapReduce and iterate once. This is broken.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-638) The build.xml references jars that don't exist

2009-09-21 Thread Owen O'Malley (JIRA)
The build.xml references jars that don't exist


 Key: HDFS-638
 URL: https://issues.apache.org/jira/browse/HDFS-638
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.22.0


Currently the build is broken on trunk. The jar files need to be updated to the 
current ones, along with the new version.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.