[jira] [Updated] (HDFS-7340) rollingUpgrade prepare command does not have retry cache support
[ https://issues.apache.org/jira/browse/HDFS-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Gupta updated HDFS-7340:
------------------------------

    Assignee: Jing Zhao

> rollingUpgrade prepare command does not have retry cache support
> ----------------------------------------------------------------
>
>                 Key: HDFS-7340
>                 URL: https://issues.apache.org/jira/browse/HDFS-7340
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.6.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>
> I was running this on an HA cluster with dfs.client.test.drop.namenode.response.number
> set to 1, so the first request goes through but the response is dropped. That triggers
> a second request, which then fails saying a request is already in progress. We should
> add retry cache support for this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7340) rollingUpgrade prepare command does not have retry cache support
Arpit Gupta created HDFS-7340:
------------------------------

             Summary: rollingUpgrade prepare command does not have retry cache support
                 Key: HDFS-7340
                 URL: https://issues.apache.org/jira/browse/HDFS-7340
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 2.6.0
            Reporter: Arpit Gupta

I was running this on an HA cluster with dfs.client.test.drop.namenode.response.number set to 1, so the first request goes through but the response is dropped. That triggers a second request, which then fails saying a request is already in progress. We should add retry cache support for this.
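The retry-cache idea requested above can be sketched generically. This is an illustrative model only, not the actual NameNode `RetryCache` implementation: the class, method, and key names below are made up. The point is that a non-idempotent operation remembers its outcome keyed by the call's identity, so a retry caused by a dropped response replays the cached outcome instead of failing with "a request is already in progress".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a retry cache (illustrative names, not Hadoop code):
// cache the outcome of a non-idempotent call keyed by client/call id, so a
// retried request after a dropped response gets the original result back.
public class RetryCacheSketch {
    private final Map<String, String> completed = new HashMap<>();

    public String prepareRollingUpgrade(String callId) {
        String cached = completed.get(callId);
        if (cached != null) {
            return cached;                       // retried call: replay outcome
        }
        String result = "upgrade-prepared";      // perform the real work once
        completed.put(callId, result);
        return result;
    }

    public static void main(String[] args) {
        RetryCacheSketch cache = new RetryCacheSketch();
        String first = cache.prepareRollingUpgrade("client-1:call-42");
        // Simulate the client retrying because the first response was dropped.
        String retry = cache.prepareRollingUpgrade("client-1:call-42");
        System.out.println(first.equals(retry));
    }
}
```

With such a cache in place, the scenario in the report (response dropped, client retries) would succeed transparently instead of surfacing an "already in progress" error.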
[jira] [Created] (HDFS-7305) NPE seen in webhdfs FS while running SLive
Arpit Gupta created HDFS-7305:
------------------------------

             Summary: NPE seen in webhdfs FS while running SLive
                 Key: HDFS-7305
                 URL: https://issues.apache.org/jira/browse/HDFS-7305
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.6.0
            Reporter: Arpit Gupta
            Priority: Critical

{code}
2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task status: "Failed at running due to java.lang.NullPointerException
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154)
	at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163)
	at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80)
	at org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63)
	at org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122)
	at org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
" truncated to max limit (512 characters)
{code}
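An NPE thrown from a response-validation method usually means a server response was dereferenced without a null check, so the caller sees a bare `NullPointerException` instead of a meaningful `IOException`. The guard below is a generic sketch of the fix pattern, with illustrative names; it is not the actual WebHdfsFileSystem patch.

```java
import java.io.IOException;
import java.util.Map;

// Hypothetical sketch: validate a (possibly absent) JSON response defensively,
// converting "no response" into a descriptive IOException rather than letting
// a later dereference blow up with NullPointerException.
public class ResponseGuard {
    static void validateResponse(Map<String, Object> json) throws IOException {
        if (json == null) {
            // Without this guard, json.get(...) below would throw NPE,
            // as in the SLive stack trace above.
            throw new IOException("Unexpected empty response from server");
        }
        Object remoteException = json.get("RemoteException");
        if (remoteException != null) {
            throw new IOException("Server returned: " + remoteException);
        }
    }

    public static void main(String[] args) throws IOException {
        validateResponse(Map.of("ok", Boolean.TRUE));  // passes silently
    }
}
```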
[jira] [Created] (HDFS-6715) webhdfs won't fail over when it gets java.io.IOException: Namenode is in startup mode
Arpit Gupta created HDFS-6715:
------------------------------

             Summary: webhdfs won't fail over when it gets java.io.IOException: Namenode is in startup mode
                 Key: HDFS-6715
                 URL: https://issues.apache.org/jira/browse/HDFS-6715
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 2.2.0
            Reporter: Arpit Gupta

Noticed in our HA testing: when we run an MR job with the webhdfs file system, we sometimes run into

{code}
2014-04-17 05:08:06,346 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1397710493213_0001_r_08_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
2014-04-17 05:08:10,205 ERROR [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Could not commit job
java.io.IOException: Namenode is in startup mode
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
{code}
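The failover decision the report asks for can be sketched as a predicate over the exception: "Namenode is in startup mode" is a transient condition, so the client should try the other NameNode rather than fail the job. This is an illustrative sketch, not the actual WebHDFS retry policy; the class and method names are made up.

```java
import java.io.IOException;

// Hypothetical sketch of a failover predicate: treat "startup mode" as a
// retriable condition that should trigger a try against the other NN,
// instead of surfacing the IOException to the job committer.
public class FailoverDecision {
    static boolean shouldFailOver(IOException e) {
        String msg = e.getMessage();
        // Startup/standby states are transient; a healthy peer may exist.
        return msg != null && msg.contains("Namenode is in startup mode");
    }

    public static void main(String[] args) {
        System.out.println(shouldFailOver(new IOException("Namenode is in startup mode")));
        System.out.println(shouldFailOver(new IOException("Permission denied")));
    }
}
```

Matching on exception message strings is fragile in general; a real fix would key off a typed exception, but the sketch captures the missing behavior described above.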
[jira] [Commented] (HDFS-6354) NN startup does not fail when it fails to login with the spnego principal
[ https://issues.apache.org/jira/browse/HDFS-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006562#comment-14006562 ]

Arpit Gupta commented on HDFS-6354:
-----------------------------------

[~daryn] I was testing this manually and cannot recall which JDK version was in use at the time. This was tested on 2.4.

> NN startup does not fail when it fails to login with the spnego principal
> -------------------------------------------------------------------------
>
>                 Key: HDFS-6354
>                 URL: https://issues.apache.org/jira/browse/HDFS-6354
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Arpit Gupta
>
> I have noticed cases where NN startup does not report any issues even though the login
> fails because the keytab is wrong, the principal does not exist, etc. This can be
> misleading and lead to authentication failures when a client tries to authenticate
> with the spnego principal.
[jira] [Commented] (HDFS-6312) WebHdfs HA failover is broken on secure clusters
[ https://issues.apache.org/jira/browse/HDFS-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985788#comment-13985788 ]

Arpit Gupta commented on HDFS-6312:
-----------------------------------

Ah, I see. I was just curious whether we could add more tests to reach this issue :).

> WebHdfs HA failover is broken on secure clusters
> ------------------------------------------------
>
>                 Key: HDFS-6312
>                 URL: https://issues.apache.org/jira/browse/HDFS-6312
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> When webhdfs does a failover, it blanks out the delegation token. This will cause
> subsequent operations against the other NN to acquire a new token. Tasks cannot
> acquire a token (no kerberos credentials), so jobs will fail.
[jira] [Commented] (HDFS-6312) WebHdfs HA failover is broken on secure clusters
[ https://issues.apache.org/jira/browse/HDFS-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985657#comment-13985657 ]

Arpit Gupta commented on HDFS-6312:
-----------------------------------

[~daryn] in our testing with webhdfs + HA on a secure cluster we hit HADOOP-10519. I am curious what kind of job you ran that actually started running tasks.

> WebHdfs HA failover is broken on secure clusters
> ------------------------------------------------
>
>                 Key: HDFS-6312
>                 URL: https://issues.apache.org/jira/browse/HDFS-6312
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Daryn Sharp
>            Priority: Blocker
>
> When webhdfs does a failover, it blanks out the delegation token. This will cause
> subsequent operations against the other NN to acquire a new token. Tasks cannot
> acquire a token (no kerberos credentials), so jobs will fail.
[jira] [Commented] (HDFS-6245) datanode fails to start with a bad disk even when failed volumes is set
[ https://issues.apache.org/jira/browse/HDFS-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968983#comment-13968983 ]

Arpit Gupta commented on HDFS-6245:
-----------------------------------

Here is the stack trace:

{code}
2014-04-14 22:17:23,688 INFO datanode.DataNode (SignalLogger.java:register(91)) - registered UNIX signal handlers for [TERM, HUP, INT]
2014-04-14 22:17:23,750 WARN common.Util (Util.java:stringAsURI(56)) - Path /grid/0/hdp/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
2014-04-14 22:17:23,751 WARN common.Util (Util.java:stringAsURI(56)) - Path /grid/1/hdp/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
2014-04-14 22:17:23,751 WARN common.Util (Util.java:stringAsURI(56)) - Path /grid/2/hdp/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
2014-04-14 22:17:23,751 WARN common.Util (Util.java:stringAsURI(56)) - Path /grid/3/hdp/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
2014-04-14 22:17:23,751 WARN common.Util (Util.java:stringAsURI(56)) - Path /grid/4/hdp/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
2014-04-14 22:17:23,752 WARN common.Util (Util.java:stringAsURI(56)) - Path /grid/5/hdp/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
2014-04-14 22:17:23,769 FATAL datanode.DataNode (DataNode.java:secureMain(1995)) - Exception in secureMain
java.lang.IllegalArgumentException: Failed to parse conf property dfs.datanode.data.dir: /grid/5/hdp/hdfs/data
	at org.apache.hadoop.hdfs.server.datanode.DataNode.getStorageLocations(DataNode.java:1786)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1768)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1812)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1988)
	at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
Caused by: java.io.IOException: Input/output error
	at java.io.UnixFileSystem.canonicalize0(Native Method)
	at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:157)
	at java.io.File.getCanonicalPath(File.java:559)
	at java.io.File.getCanonicalFile(File.java:583)
	at org.apache.hadoop.hdfs.server.common.Util.fileAsURI(Util.java:73)
	at org.apache.hadoop.hdfs.server.common.Util.stringAsURI(Util.java:58)
	at org.apache.hadoop.hdfs.server.datanode.StorageLocation.parse(StorageLocation.java:94)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.getStorageLocations(DataNode.java:1784)
	... 9 more
2014-04-14 22:17:23,772 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2014-04-14 22:17:23,774 INFO datanode.DataNode (StringUtils.java:run(640)) - SHUTDOWN_MSG:
{code}

> datanode fails to start with a bad disk even when failed volumes is set
> -----------------------------------------------------------------------
>
>                 Key: HDFS-6245
>                 URL: https://issues.apache.org/jira/browse/HDFS-6245
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Arpit Agarwal
>
> DataNode startup failed even though failed volumes was set. Had to remove the
> bad disk from the config to get it to boot.
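The stack trace shows the parse/canonicalize error escaping before any volume-failure tolerance (the dfs.datanode.failed.volumes.tolerated setting) can be consulted. The desired behavior can be sketched as follows; the class and method names are illustrative, not the real DataNode code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the desired startup behavior: collect per-volume
// failures instead of aborting on the first bad disk, and only fail startup
// when the failure count exceeds the tolerated threshold.
public class VolumeStartupSketch {
    static List<String> usableVolumes(List<String> dataDirs, int toleratedFailures)
            throws IOException {
        List<String> usable = new ArrayList<>();
        int failed = 0;
        for (String dir : dataDirs) {
            try {
                usable.add(canonicalize(dir));   // may throw on a bad disk
            } catch (IOException e) {
                failed++;                        // record, don't abort yet
            }
        }
        if (failed > toleratedFailures) {
            throw new IOException(failed + " volumes failed, tolerated: " + toleratedFailures);
        }
        return usable;
    }

    // Stand-in for File.getCanonicalFile(), which threw "Input/output error"
    // in the trace above; here any dir containing "bad" simulates a bad disk.
    static String canonicalize(String dir) throws IOException {
        if (dir.contains("bad")) {
            throw new IOException("Input/output error: " + dir);
        }
        return dir;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(usableVolumes(List.of("/grid/0/data", "/grid/bad/data", "/grid/2/data"), 1));
    }
}
```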
[jira] [Created] (HDFS-6245) datanode fails to start with a bad disk even when failed volumes is set
Arpit Gupta created HDFS-6245:
------------------------------

             Summary: datanode fails to start with a bad disk even when failed volumes is set
                 Key: HDFS-6245
                 URL: https://issues.apache.org/jira/browse/HDFS-6245
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.4.0
            Reporter: Arpit Gupta
            Assignee: Arpit Agarwal

DataNode startup failed even though failed volumes was set. Had to remove the bad disk from the config to get it to boot.
[jira] [Updated] (HDFS-6207) ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
[ https://issues.apache.org/jira/browse/HDFS-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Gupta updated HDFS-6207:
------------------------------

    Assignee: Jing Zhao

> ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-6207
>                 URL: https://issues.apache.org/jira/browse/HDFS-6207
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>
> While running a hive job on an HA cluster, saw ConcurrentModificationException
> in AbstractDelegationTokenSelector.selectToken()
[jira] [Commented] (HDFS-6207) ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
[ https://issues.apache.org/jira/browse/HDFS-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963550#comment-13963550 ]

Arpit Gupta commented on HDFS-6207:
-----------------------------------

{code}
Caused by: java.util.ConcurrentModificationException
	at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
	at java.util.HashMap$ValueIterator.next(HashMap.java:922)
	at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1067)
	at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSelector.selectToken(AbstractDelegationTokenSelector.java:53)
	at org.apache.hadoop.hdfs.HAUtil.cloneDelegationTokenForLogicalUri(HAUtil.java:260)
	at org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.
{code}

> ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-6207
>                 URL: https://issues.apache.org/jira/browse/HDFS-6207
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>
> While running a hive job on an HA cluster, saw ConcurrentModificationException
> in AbstractDelegationTokenSelector.selectToken()
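The failure mode in the stack trace can be reproduced in isolation. The frames show a `HashMap` value iterator wrapped in `Collections.unmodifiableCollection` — but an unmodifiable *view* only blocks writes through the view; it does not protect the iteration against another code path mutating the backing map, which trips HashMap's fail-fast iterator:

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Minimal reproduction: iterating an unmodifiable view of a HashMap's values
// while the backing map is structurally modified throws
// ConcurrentModificationException, exactly as in the selectToken() trace.
public class CmeDemo {
    public static boolean triggersCme() {
        Map<String, String> tokens = new HashMap<>();
        tokens.put("token-1", "a");
        tokens.put("token-2", "b");
        try {
            for (String t : Collections.unmodifiableCollection(tokens.values())) {
                tokens.put("token-3", "c");   // mutation of the backing map mid-iteration
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(triggersCme());
    }
}
```

The usual fixes are to iterate over a defensive copy of the collection or to synchronize the readers and writers on the same lock.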
[jira] [Created] (HDFS-6207) ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
Arpit Gupta created HDFS-6207:
------------------------------

             Summary: ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
                 Key: HDFS-6207
                 URL: https://issues.apache.org/jira/browse/HDFS-6207
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 2.4.0
            Reporter: Arpit Gupta

While running a hive job on an HA cluster, saw ConcurrentModificationException in AbstractDelegationTokenSelector.selectToken()
[jira] [Commented] (HDFS-6127) sLive with webhdfs fails on secure HA cluster with "does not contain a valid host:port authority" error
[ https://issues.apache.org/jira/browse/HDFS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940837#comment-13940837 ]

Arpit Gupta commented on HDFS-6127:
-----------------------------------

Here is the console output:

{code}
/usr/lib/hadoop/bin/hadoop org.apache.hadoop.fs.slive.SliveTest -rename 14,uniform -packetSize 65536 -baseDir webhdfs://ha-2-secure/user/user/ha-slive -seed 12345678 -sleep 100,1000 -duration 600 -append 14,uniform -blockSize 16777216,33554432 -create 16,uniform -mkdir 14,uniform -maps 12 -ls 14,uniform -writeSize 1,134217728 -files 1024 -ops 1 -read 14,uniform -replication 1,3 -appendSize 1,134217728 -reduces 6 -resFile /grid/0/tmp/hwqe/artifacts/ha-slive-2-namenode2-1395127484.out -readSize 1,4294967295 -dirSize 16 -delete 14,uniform
INFO|Initial wait for Service namenode2: 60
14/03/18 07:24:44 INFO slive.SliveTest: Running with option list -rename 14,uniform -packetSize 65536 -baseDir webhdfs://ha-2-secure/user/user/ha-slive -seed 12345678 -sleep 100,1000 -duration 600 -append 14,uniform -blockSize 16777216,33554432 -create 16,uniform -mkdir 14,uniform -maps 12 -ls 14,uniform -writeSize 1,134217728 -files 1024 -ops 1 -read 14,uniform -replication 1,3 -appendSize 1,134217728 -reduces 6 -resFile /grid/0/tmp/hwqe/artifacts/ha-slive-2-namenode2-1395127484.out -readSize 1,4294967295 -dirSize 16 -delete 14,uniform
14/03/18 07:24:44 INFO slive.SliveTest: Options are:
14/03/18 07:24:44 INFO slive.ConfigExtractor: Base directory = webhdfs://ha-2-secure/user/user/ha-slive/slive
14/03/18 07:24:44 INFO slive.ConfigExtractor: Data directory = webhdfs://ha-2-secure/user/user/ha-slive/slive/data
14/03/18 07:24:44 INFO slive.ConfigExtractor: Output directory = webhdfs://ha-2-secure/user/user/ha-slive/slive/output
14/03/18 07:24:44 INFO slive.ConfigExtractor: Result file = /grid/0/tmp/hwqe/artifacts/ha-slive-2-namenode2-1395127484.out
14/03/18 07:24:44 INFO slive.ConfigExtractor: Grid queue = default
14/03/18 07:24:44 INFO slive.ConfigExtractor: Should exit on first error = false
14/03/18 07:24:44 INFO slive.ConfigExtractor: Duration = 60 milliseconds
14/03/18 07:24:44 INFO slive.ConfigExtractor: Map amount = 12
14/03/18 07:24:44 INFO slive.ConfigExtractor: Reducer amount = 6
14/03/18 07:24:44 INFO slive.ConfigExtractor: Operation amount = 1
14/03/18 07:24:44 INFO slive.ConfigExtractor: Total file limit = 1024
14/03/18 07:24:44 INFO slive.ConfigExtractor: Total dir file limit = 16
14/03/18 07:24:44 INFO slive.ConfigExtractor: Read size = 1,4294967295 bytes
14/03/18 07:24:44 INFO slive.ConfigExtractor: Write size = 1,134217728 bytes
14/03/18 07:24:44 INFO slive.ConfigExtractor: Append size = 1,134217728 bytes
14/03/18 07:24:44 INFO slive.ConfigExtractor: Block size = 16777216,33554432 bytes
14/03/18 07:24:44 INFO slive.ConfigExtractor: Random seed = 12345678
14/03/18 07:24:44 INFO slive.ConfigExtractor: Sleep range = 100,1000 milliseconds
14/03/18 07:24:44 INFO slive.ConfigExtractor: Replication amount = 1,3
14/03/18 07:24:44 INFO slive.ConfigExtractor: Operations are:
14/03/18 07:24:44 INFO slive.ConfigExtractor: READ
14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM
14/03/18 07:24:44 INFO slive.ConfigExtractor: 14%
14/03/18 07:24:44 INFO slive.ConfigExtractor: APPEND
14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM
14/03/18 07:24:44 INFO slive.ConfigExtractor: 14%
14/03/18 07:24:44 INFO slive.ConfigExtractor: MKDIR
14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM
14/03/18 07:24:44 INFO slive.ConfigExtractor: 14%
14/03/18 07:24:44 INFO slive.ConfigExtractor: LS
14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM
14/03/18 07:24:44 INFO slive.ConfigExtractor: 14%
14/03/18 07:24:44 INFO slive.ConfigExtractor: DELETE
14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM
14/03/18 07:24:44 INFO slive.ConfigExtractor: 14%
14/03/18 07:24:44 INFO slive.ConfigExtractor: RENAME
14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM
14/03/18 07:24:44 INFO slive.ConfigExtractor: 14%
14/03/18 07:24:44 INFO slive.ConfigExtractor: CREATE
14/03/18 07:24:44 INFO slive.ConfigExtractor: UNIFORM
14/03/18 07:24:44 INFO slive.ConfigExtractor: 16%
14/03/18 07:24:44 INFO slive.SliveTest: Running job:
14/03/18 07:24:44 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses
14/03/18 07:24:45 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses
14/03/18 07:24:45 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses
14/03/18 07:24:48 WARN token.Token: Cannot find class for token kind WEBHDFS delegation
14/03/18 07:24:48 INFO security.TokenCache: Got dt for webhdfs://ha-2-secure; Kind: WEBHDFS delegation, Service: ha-hdfs:ha-2-secure, Ident: 00 06 68 72 74 5f 71 61
{code}
[jira] [Created] (HDFS-6127) sLive with webhdfs fails on secure HA cluster with "does not contain a valid host:port authority" error
Arpit Gupta created HDFS-6127:
------------------------------

             Summary: sLive with webhdfs fails on secure HA cluster with "does not contain a valid host:port authority" error
                 Key: HDFS-6127
                 URL: https://issues.apache.org/jira/browse/HDFS-6127
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 2.4.0
            Reporter: Arpit Gupta
            Assignee: Haohui Mai
[jira] [Created] (HDFS-6100) webhdfs filesystem does not failover in HA mode
Arpit Gupta created HDFS-6100:
------------------------------

             Summary: webhdfs filesystem does not failover in HA mode
                 Key: HDFS-6100
                 URL: https://issues.apache.org/jira/browse/HDFS-6100
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 2.4.0
            Reporter: Arpit Gupta
            Assignee: Haohui Mai

While running SLive with a webhdfs file system, reducers fail as they keep trying to write to the standby namenode.
[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Gupta updated HDFS-6089:
------------------------------

    Description:
The following scenario was tested:
* Determine the active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active.

What was noticed is that sometimes the call to get the service state of nn2 got a socket timeout exception.

  was:
The following scenario was tested:
* Determine Active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active.

What was noticed that some times the call to get the service state of nn2 got a socket time out connection.

> Standby NN while transitioning to active throws a connection refused error
> when the prior active NN process is suspended
> --------------------------------------------------------------------------
>
>                 Key: HDFS-6089
>                 URL: https://issues.apache.org/jira/browse/HDFS-6089
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>
> The following scenario was tested:
> * Determine the active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active.
>
> What was noticed is that sometimes the call to get the service state of nn2 got
> a socket timeout exception.
[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930954#comment-13930954 ]

Arpit Gupta commented on HDFS-6089:
-----------------------------------

Here is the console log

{code}
sudo su - -c "/usr/bin/hdfs haadmin -getServiceState nn1" hdfs
active
exit code = 0
sudo su - -c "/usr/bin/hdfs haadmin -getServiceState nn2" hdfs
standby
exit code = 0
ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null hostname "sudo su - -c \"cat /grid/0/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid | xargs kill -19\" hdfs"
sudo su - -c "/usr/bin/hdfs haadmin -getServiceState nn1" hdfs
Operation failed: Call From host1/ip to host1:8020 failed on socket timeout exception: java.net.SocketTimeoutException: 2 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=host1/ip:35192 remote=host1/ip:8020]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
exit code = 255
sudo su - -c "/usr/bin/hdfs haadmin -getServiceState nn2" hdfs
Operation failed: Call From host2/ip to host2:8020 failed on socket timeout exception: java.net.SocketTimeoutException: 2 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=host2/ip:37640 remote=host2/68.142.247.217:8020]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
exit code = 255
{code}

> Standby NN while transitioning to active throws a connection refused error
> when the prior active NN process is suspended
> --------------------------------------------------------------------------
>
>                 Key: HDFS-6089
>                 URL: https://issues.apache.org/jira/browse/HDFS-6089
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>
> The following scenario was tested:
> * Determine Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active.
>
> What was noticed that some times the call to get the service state of nn2 got
> a socket time out connection.
[jira] [Created] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
Arpit Gupta created HDFS-6089:
------------------------------

             Summary: Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended
                 Key: HDFS-6089
                 URL: https://issues.apache.org/jira/browse/HDFS-6089
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 2.4.0
            Reporter: Arpit Gupta
            Assignee: Jing Zhao

The following scenario was tested:
* Determine the active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to active.

What was noticed is that sometimes the call to get the service state of nn2 got a socket timeout exception.
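The timeouts in this scenario follow from how kill -19 (SIGSTOP) behaves at the TCP level: the kernel still completes the handshake on behalf of a stopped process, so the client connects successfully and then hangs waiting for a reply until its read timeout fires — a socket timeout, not "connection refused". This can be demonstrated self-contained (the listener never serves the connection, standing in for a suspended NN):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Demonstration: connecting to a listening-but-unresponsive peer succeeds at
// the TCP level, and the failure only surfaces later as a read timeout.
public class SuspendedPeerDemo {
    public static boolean timesOut() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {       // listens, never replies
            try (Socket client = new Socket("localhost", server.getLocalPort())) {
                client.setSoTimeout(200);                       // short read timeout (ms)
                try {
                    client.getInputStream().read();             // blocks: peer is "suspended"
                } catch (SocketTimeoutException e) {
                    return true;                                // timeout, not refusal
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(timesOut());
    }
}
```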
[jira] [Commented] (HDFS-6077) running slive with webhdfs on secure HA cluster fails with unknown host exception
[ https://issues.apache.org/jira/browse/HDFS-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924433#comment-13924433 ]

Arpit Gupta commented on HDFS-6077:
-----------------------------------

{code}
RUNNING: /usr/lib/hadoop/bin/hadoop org.apache.hadoop.fs.slive.SliveTest -rename 14,uniform -packetSize 65536 -baseDir webhdfs://ha-2-secure:50070/user/hrt_qa/ha-slive -seed 12345678 -sleep 100,1000 -duration 600 -append 14,uniform -blockSize 16777216,33554432 -create 16,uniform -mkdir 14,uniform -maps 15 -ls 14,uniform -writeSize 1,134217728 -files 1024 -ops 1 -read 14,uniform -replication 1,3 -appendSize 1,134217728 -reduces 8 -resFile /grid/0/tmp/hwqe/artifacts/ha-slive-6-namenode2-1394091404.out -readSize 1,4294967295 -dirSize 16 -delete 14,uniform
INFO|Initial wait for Service namenode2: 60
14/03/06 07:36:44 INFO slive.SliveTest: Running with option list -rename 14,uniform -packetSize 65536 -baseDir webhdfs://ha-2-secure:50070/user/hrt_qa/ha-slive -seed 12345678 -sleep 100,1000 -duration 600 -append 14,uniform -blockSize 16777216,33554432 -create 16,uniform -mkdir 14,uniform -maps 15 -ls 14,uniform -writeSize 1,134217728 -files 1024 -ops 1 -read 14,uniform -replication 1,3 -appendSize 1,134217728 -reduces 8 -resFile /grid/0/tmp/hwqe/artifacts/ha-slive-6-namenode2-1394091404.out -readSize 1,4294967295 -dirSize 16 -delete 14,uniform
14/03/06 07:36:44 INFO slive.SliveTest: Options are:
14/03/06 07:36:44 INFO slive.ConfigExtractor: Base directory = webhdfs://ha-2-secure:50070/user/hrt_qa/ha-slive/slive
14/03/06 07:36:44 INFO slive.ConfigExtractor: Data directory = webhdfs://ha-2-secure:50070/user/hrt_qa/ha-slive/slive/data
14/03/06 07:36:44 INFO slive.ConfigExtractor: Output directory = webhdfs://ha-2-secure:50070/user/hrt_qa/ha-slive/slive/output
14/03/06 07:36:44 INFO slive.ConfigExtractor: Result file = /grid/0/tmp/hwqe/artifacts/ha-slive-6-namenode2-1394091404.out
14/03/06 07:36:44 INFO slive.ConfigExtractor: Grid queue = default
14/03/06 07:36:44 INFO slive.ConfigExtractor: Should exit on first error = false
14/03/06 07:36:44 INFO slive.ConfigExtractor: Duration = 60 milliseconds
14/03/06 07:36:44 INFO slive.ConfigExtractor: Map amount = 15
14/03/06 07:36:44 INFO slive.ConfigExtractor: Reducer amount = 8
14/03/06 07:36:44 INFO slive.ConfigExtractor: Operation amount = 1
14/03/06 07:36:44 INFO slive.ConfigExtractor: Total file limit = 1024
14/03/06 07:36:44 INFO slive.ConfigExtractor: Total dir file limit = 16
14/03/06 07:36:44 INFO slive.ConfigExtractor: Read size = 1,4294967295 bytes
14/03/06 07:36:44 INFO slive.ConfigExtractor: Write size = 1,134217728 bytes
14/03/06 07:36:44 INFO slive.ConfigExtractor: Append size = 1,134217728 bytes
14/03/06 07:36:44 INFO slive.ConfigExtractor: Block size = 16777216,33554432 bytes
14/03/06 07:36:44 INFO slive.ConfigExtractor: Random seed = 12345678
14/03/06 07:36:44 INFO slive.ConfigExtractor: Sleep range = 100,1000 milliseconds
14/03/06 07:36:44 INFO slive.ConfigExtractor: Replication amount = 1,3
14/03/06 07:36:44 INFO slive.ConfigExtractor: Operations are:
14/03/06 07:36:44 INFO slive.ConfigExtractor: LS
14/03/06 07:36:44 INFO slive.ConfigExtractor: UNIFORM
14/03/06 07:36:44 INFO slive.ConfigExtractor: 14%
14/03/06 07:36:44 INFO slive.ConfigExtractor: READ
14/03/06 07:36:44 INFO slive.ConfigExtractor: UNIFORM
14/03/06 07:36:44 INFO slive.ConfigExtractor: 14%
14/03/06 07:36:44 INFO slive.ConfigExtractor: APPEND
14/03/06 07:36:44 INFO slive.ConfigExtractor: UNIFORM
14/03/06 07:36:44 INFO slive.ConfigExtractor: 14%
14/03/06 07:36:44 INFO slive.ConfigExtractor: CREATE
14/03/06 07:36:44 INFO slive.ConfigExtractor: UNIFORM
14/03/06 07:36:44 INFO slive.ConfigExtractor: 16%
14/03/06 07:36:44 INFO slive.ConfigExtractor: RENAME
14/03/06 07:36:44 INFO slive.ConfigExtractor: UNIFORM
14/03/06 07:36:44 INFO slive.ConfigExtractor: 14%
14/03/06 07:36:44 INFO slive.ConfigExtractor: DELETE
14/03/06 07:36:44 INFO slive.ConfigExtractor: UNIFORM
14/03/06 07:36:44 INFO slive.ConfigExtractor: 14%
14/03/06 07:36:44 INFO slive.ConfigExtractor: MKDIR
14/03/06 07:36:44 INFO slive.ConfigExtractor: UNIFORM
14/03/06 07:36:44 INFO slive.ConfigExtractor: 14%
14/03/06 07:36:44 INFO slive.SliveTest: Running job:
14/03/06 07:36:44 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses
14/03/06 07:36:45 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses
14/03/06 07:36:45 WARN hdfs.DFSClient: dfs.client.test.drop.namenode.response.number is set to 1, this hacked client will proactively drop responses
14/03/06 07:36:45 ERROR slive.SliveTest: Unable to run job due to error: java.lang.IllegalArgumentException: java.net.UnknownHostException: ha-2-secure
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.ja
{code}
[jira] [Created] (HDFS-6077) running slive with webhdfs on secure HA cluster fails with unknown host exception
Arpit Gupta created HDFS-6077:
------------------------------

             Summary: running slive with webhdfs on secure HA cluster fails with unknown host exception
                 Key: HDFS-6077
                 URL: https://issues.apache.org/jira/browse/HDFS-6077
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.3.0
            Reporter: Arpit Gupta
            Assignee: Jing Zhao
[jira] [Commented] (HDFS-5399) Revisit SafeModeException and corresponding retry policies
[ https://issues.apache.org/jira/browse/HDFS-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891074#comment-13891074 ] Arpit Gupta commented on HDFS-5399: --- [~atm] bq. Am I correct in assuming that the test you were running did not manually cause the NN to enter or leave safemode? Yes that is correct. > Revisit SafeModeException and corresponding retry policies > -- > > Key: HDFS-5399 > URL: https://issues.apache.org/jira/browse/HDFS-5399 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > > Currently for NN SafeMode, we have the following corresponding retry policies: > # In non-HA setup, for certain API call ("create"), the client will retry if > the NN is in SafeMode. Specifically, the client side's RPC adopts > MultipleLinearRandomRetry policy for a wrapped SafeModeException when retry > is enabled. > # In HA setup, the client will retry if the NN is Active and in SafeMode. > Specifically, the SafeModeException is wrapped as a RetriableException in the > server side. Client side's RPC uses FailoverOnNetworkExceptionRetry policy > which recognizes RetriableException (see HDFS-5291). > There are several possible issues in the current implementation: > # The NN SafeMode can be a "Manual" SafeMode (i.e., started by administrator > through CLI), and the clients may not want to retry on this type of SafeMode. > # Client may want to retry on other API calls in non-HA setup. > # We should have a single generic strategy to address the mapping between > SafeMode and retry policy for both HA and non-HA setup. A possible > straightforward solution is to always wrap the SafeModeException in the > RetriableException to indicate that the clients should retry. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
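The "always wrap the SafeModeException in the RetriableException" proposal at the end of the description can be illustrated with a small standalone sketch. The classes below are simplified stand-ins for the Hadoop ones, not the actual HDFS implementation: automatic safe mode is wrapped so retry policies recognize it, while manual (admin-initiated) safe mode surfaces directly.

```java
// Simplified stand-in for the Hadoop exception (hypothetical fields).
class SafeModeException extends Exception {
    final boolean manual; // true if safe mode was started by an admin via CLI
    SafeModeException(String msg, boolean manual) {
        super(msg);
        this.manual = manual;
    }
}

// Simplified stand-in for org.apache.hadoop.ipc.RetriableException.
class RetriableException extends Exception {
    RetriableException(Throwable cause) {
        super(cause);
    }
}

public class SafeModeRetryDemo {
    // The proposed server-side mapping: wrap only when retrying can help.
    public static Exception toClientException(SafeModeException e) {
        return e.manual ? e : new RetriableException(e);
    }

    public static void main(String[] args) {
        Exception auto = toClientException(
            new SafeModeException("NameNode is in startup safe mode", false));
        Exception manual = toClientException(
            new SafeModeException("NameNode is in manual safe mode", true));
        System.out.println(auto.getClass().getSimpleName());   // RetriableException
        System.out.println(manual.getClass().getSimpleName()); // SafeModeException
    }
}
```

With this mapping, a retry policy would only need to recognize RetriableException, giving the single generic strategy the description asks for in both HA and non-HA setups.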
[jira] [Commented] (HDFS-5399) Revisit SafeModeException and corresponding retry policies
[ https://issues.apache.org/jira/browse/HDFS-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887320#comment-13887320 ] Arpit Gupta commented on HDFS-5399: --- bq. Can you comment on how frequently/quickly the active NN is killed and restarted in this test? The tests were killing the active namenode every 5 minutes. bq. I'm guessing you meant "NOT a flaw in the test" here? Or do I misunderstand your point? Yes, you are correct, I meant "not" :). bq. I'm specifically curious about whether or not the standby NN was given enough time to get out of startup safemode before a failover to it was attempted. I wanted to make sure I understand this scenario. To me this would happen if the current standby namenode (nn2) was active before and was recently (a few seconds ago) killed and restarted, causing it to be in safemode, and then the active (nn1) was killed at the same time, causing the client to go to nn2 while it is still in safemode. Did I understand it right? I don't believe we hit this scenario, as we restarted the active NN every 5 minutes. However, I can see the need for client retries to make sure that even in the above scenario the DFSClient is able to retry and wait for the NN to come out of safemode. > Revisit SafeModeException and corresponding retry policies > -- > > Key: HDFS-5399 > URL: https://issues.apache.org/jira/browse/HDFS-5399 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > > Currently for NN SafeMode, we have the following corresponding retry policies: > # In non-HA setup, for certain API call ("create"), the client will retry if > the NN is in SafeMode. Specifically, the client side's RPC adopts > MultipleLinearRandomRetry policy for a wrapped SafeModeException when retry > is enabled. > # In HA setup, the client will retry if the NN is Active and in SafeMode. > Specifically, the SafeModeException is wrapped as a RetriableException in the > server side. 
Client side's RPC uses FailoverOnNetworkExceptionRetry policy > which recognizes RetriableException (see HDFS-5291). > There are several possible issues in the current implementation: > # The NN SafeMode can be a "Manual" SafeMode (i.e., started by administrator > through CLI), and the clients may not want to retry on this type of SafeMode. > # Client may want to retry on other API calls in non-HA setup. > # We should have a single generic strategy to address the mapping between > SafeMode and retry policy for both HA and non-HA setup. A possible > straightforward solution is to always wrap the SafeModeException in the > RetriableException to indicate that the clients should retry. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5532) Enable the webhdfs by default to support new HDFS web UI
[ https://issues.apache.org/jira/browse/HDFS-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886842#comment-13886842 ] Arpit Gupta commented on HDFS-5532: --- A property already exists for this: dfs.web.authentication.kerberos.principal. And there is another property that defines where your keytab is. I am not sure we should set defaults for these, as we don't do so for any other principal and keytab properties. We should probably update any documentation we have regarding secure setup. > Enable the webhdfs by default to support new HDFS web UI > > > Key: HDFS-5532 > URL: https://issues.apache.org/jira/browse/HDFS-5532 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 3.0.0 >Reporter: Vinay >Assignee: Vinay > Fix For: 2.3.0 > > Attachments: HDFS-5532.patch, HDFS-5532.patch > > > Recently in HDFS-5444, the new HDFS web UI was made the default, > but this needs webhdfs to be enabled. > WebHDFS is disabled by default. Let's enable it by default to support the new > really cool web UI. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5399) Revisit SafeModeException and corresponding retry policies
[ https://issues.apache.org/jira/browse/HDFS-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886208#comment-13886208 ] Arpit Gupta commented on HDFS-5399: --- bq. The test you observed this issue in didn't run long enough for the standby NN to leave startup safemode on its own before the failover was attempted. The NN will delay processing block reports for block IDs it doesn't recognize (because they're created in edits that the NN hasn't read yet) and then only on transition to active do we fully catch up by reading all the edits, and then re-process the delayed block reports, triggering the NN to leave startup safemode. It's not the test that directly fails. We see exceptions in the RM when it's trying to talk to HDFS, or in the RS when it's trying to talk to HDFS, which causes the actual MR job etc. to fail. So it's not something that the test can control. For example, we are running an MR job and periodically killing the active NN, and the job eventually fails because the tasks that want to talk to HDFS fail, or the RM runs into this exception, causing the application to fail. Hence I would argue that it's a flaw in the test :). > Revisit SafeModeException and corresponding retry policies > -- > > Key: HDFS-5399 > URL: https://issues.apache.org/jira/browse/HDFS-5399 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > > Currently for NN SafeMode, we have the following corresponding retry policies: > # In non-HA setup, for certain API call ("create"), the client will retry if > the NN is in SafeMode. Specifically, the client side's RPC adopts > MultipleLinearRandomRetry policy for a wrapped SafeModeException when retry > is enabled. > # In HA setup, the client will retry if the NN is Active and in SafeMode. > Specifically, the SafeModeException is wrapped as a RetriableException in the > server side. 
Client side's RPC uses FailoverOnNetworkExceptionRetry policy > which recognizes RetriableException (see HDFS-5291). > There are several possible issues in the current implementation: > # The NN SafeMode can be a "Manual" SafeMode (i.e., started by administrator > through CLI), and the clients may not want to retry on this type of SafeMode. > # Client may want to retry on other API calls in non-HA setup. > # We should have a single generic strategy to address the mapping between > SafeMode and retry policy for both HA and non-HA setup. A possible > straightforward solution is to always wrap the SafeModeException in the > RetriableException to indicate that the clients should retry. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5399) Revisit SafeModeException and corresponding retry policies
[ https://issues.apache.org/jira/browse/HDFS-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885916#comment-13885916 ] Arpit Gupta commented on HDFS-5399: --- We had run into this issue while testing HA. You can see in HDFS-5291 that the standby NN after transitioning to active went into safemode. We saw issues where Resource Manager and Region Servers would crash/complain because of this. We ran into this frequently before HDFS-5291 was fixed. > Revisit SafeModeException and corresponding retry policies > -- > > Key: HDFS-5399 > URL: https://issues.apache.org/jira/browse/HDFS-5399 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > > Currently for NN SafeMode, we have the following corresponding retry policies: > # In non-HA setup, for certain API call ("create"), the client will retry if > the NN is in SafeMode. Specifically, the client side's RPC adopts > MultipleLinearRandomRetry policy for a wrapped SafeModeException when retry > is enabled. > # In HA setup, the client will retry if the NN is Active and in SafeMode. > Specifically, the SafeModeException is wrapped as a RetriableException in the > server side. Client side's RPC uses FailoverOnNetworkExceptionRetry policy > which recognizes RetriableException (see HDFS-5291). > There are several possible issues in the current implementation: > # The NN SafeMode can be a "Manual" SafeMode (i.e., started by administrator > through CLI), and the clients may not want to retry on this type of SafeMode. > # Client may want to retry on other API calls in non-HA setup. > # We should have a single generic strategy to address the mapping between > SafeMode and retry policy for both HA and non-HA setup. A possible > straightforward solution is to always wrap the SafeModeException in the > RetriableException to indicate that the clients should retry. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5653) Log namenode hostname in various exceptions being thrown in a HA setup
[ https://issues.apache.org/jira/browse/HDFS-5653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-5653: -- Priority: Minor (was: Major) > Log namenode hostname in various exceptions being thrown in a HA setup > -- > > Key: HDFS-5653 > URL: https://issues.apache.org/jira/browse/HDFS-5653 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha >Affects Versions: 2.2.0 >Reporter: Arpit Gupta >Priority: Minor > > In an HA setup, any time we see an exception such as safemode or namenode in > standby etc., we don't know which namenode it came from. The user has to go to > the logs of the namenodes and determine which one was active and/or standby > around the same time. > I think it would help with debugging if any such exceptions could include the > namenode hostname so the user could know exactly which namenode served the > request. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HDFS-5653) Log namenode hostname in various exceptions being thrown in a HA setup
Arpit Gupta created HDFS-5653: - Summary: Log namenode hostname in various exceptions being thrown in a HA setup Key: HDFS-5653 URL: https://issues.apache.org/jira/browse/HDFS-5653 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.2.0 Reporter: Arpit Gupta In an HA setup, any time we see an exception such as safemode or namenode in standby etc., we don't know which namenode it came from. The user has to go to the logs of the namenodes and determine which one was active and/or standby around the same time. I think it would help with debugging if any such exceptions could include the namenode hostname so the user could know exactly which namenode served the request. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
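The improvement requested above can be sketched minimally: prefix the client-visible exception message with the hostname of the namenode that served the request. The class below is hypothetical (HDFS would presumably do this in its RPC/exception plumbing rather than in a dedicated exception type):

```java
// Hypothetical helper: tag an exception message with the serving NN's
// hostname so an HA client can tell which namenode (active or standby)
// raised it, without digging through both namenodes' logs.
public class NamenodeTaggedException extends Exception {
    private final String namenodeHost;

    public NamenodeTaggedException(String namenodeHost, String message) {
        super("NameNode " + namenodeHost + ": " + message);
        this.namenodeHost = namenodeHost;
    }

    public String getNamenodeHost() {
        return namenodeHost;
    }

    public static void main(String[] args) {
        NamenodeTaggedException e = new NamenodeTaggedException(
            "nn2.example.com",
            "Operation category WRITE is not supported in state standby");
        // The hostname now appears directly in the client-visible message.
        System.out.println(e.getMessage());
    }
}
```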
[jira] [Commented] (HDFS-5382) Implement the UI of browsing filesystems in HTML 5 page
[ https://issues.apache.org/jira/browse/HDFS-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798221#comment-13798221 ] Arpit Gupta commented on HDFS-5382: --- Will this also handle when webhdfs is not configured and when security is on? > Implement the UI of browsing filesystems in HTML 5 page > --- > > Key: HDFS-5382 > URL: https://issues.apache.org/jira/browse/HDFS-5382 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-5382.000.patch > > > The UI of browsing filesystems can be implemented as an HTML 5 application. > The UI can pull the data from WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5322) HDFS delegation token not found in cache errors seen on secure HA clusters
[ https://issues.apache.org/jira/browse/HDFS-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793061#comment-13793061 ] Arpit Gupta commented on HDFS-5322: --- A bunch of secure HA tests were run last night with this change, and we did not see test failures because of this. +1 > HDFS delegation token not found in cache errors seen on secure HA clusters > -- > > Key: HDFS-5322 > URL: https://issues.apache.org/jira/browse/HDFS-5322 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.1-beta >Reporter: Arpit Gupta >Assignee: Jing Zhao > Attachments: HDFS-5322.000.patch, HDFS-5322.000.patch, > HDFS-5322.001.patch, HDFS-5322.002.patch, HDFS-5322.003.patch, > HDFS-5322.004.patch, HDFS-5322.005.patch, HDFS-5322.006.patch > > > While running HA tests we have seen issues where we see HDFS delegation token > not found in cache errors causing running jobs to fail. > {code} > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157) > |2013-10-06 20:14:51,193 INFO [main] mapreduce.Job: Task Id : > attempt_1381090351344_0001_m_07_0, Status : FAILED > Error: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): > token (HDFS_DELEGATION_TOKEN token 11 for hrt_qa) can't be found in cache > at org.apache.hadoop.ipc.Client.call(Client.java:1347) > at org.apache.hadoop.ipc.Client.call(Client.java:1300) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source) > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5335) Hive query failed with possible race in dfs output stream
Arpit Gupta created HDFS-5335: - Summary: Hive query failed with possible race in dfs output stream Key: HDFS-5335 URL: https://issues.apache.org/jira/browse/HDFS-5335 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Arpit Gupta Assignee: Haohui Mai Here is the stack trace from the client {code} java.nio.channels.ClosedChannelException at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1317) at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:1810) at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:1789) at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1877) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:104) at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54) at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289) at org.apache.hadoop.mapreduce.JobSubmitter.copyRemoteFiles(JobSubmitter.java:139) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:212) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) 
at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Job Submission failed with exception 'java.nio.channels.ClosedChannelException(null)' FAILED: Execution Error, return code 
1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5322) HDFS delegation token not found in cache errors seen on secure HA clusters
Arpit Gupta created HDFS-5322: - Summary: HDFS delegation token not found in cache errors seen on secure HA clusters Key: HDFS-5322 URL: https://issues.apache.org/jira/browse/HDFS-5322 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.1.1-beta Reporter: Arpit Gupta Assignee: Jing Zhao While running HA tests we have seen issues where we see HDFS delegation token not found in cache errors causing running jobs to fail. {code} at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157) |2013-10-06 20:14:51,193 INFO [main] mapreduce.Job: Task Id : attempt_1381090351344_0001_m_07_0, Status : FAILED Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 11 for hrt_qa) can't be found in cache at org.apache.hadoop.ipc.Client.call(Client.java:1347) at org.apache.hadoop.ipc.Client.call(Client.java:1300) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5291) Standby namenode after transition to active goes into safemode
[ https://issues.apache.org/jira/browse/HDFS-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784647#comment-13784647 ] Arpit Gupta commented on HDFS-5291: --- This is seen in our nightlies where we see other services being impacted by namenode being in safemode. In our tests we are killing the active namenode every 5 minutes, and sometimes we see that after the transition from standby to active the namenode goes into safemode. > Standby namenode after transition to active goes into safemode > -- > > Key: HDFS-5291 > URL: https://issues.apache.org/jira/browse/HDFS-5291 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.1-beta >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Critical > Attachments: nn.log > > > Some log snippets > standby state to active transition > {code} > 2013-10-02 00:13:49,482 INFO ipc.Server (Server.java:run(2068)) - IPC Server > handler 69 on 8020, call > org.apache.hadoop.hdfs.protocol.ClientProtocol.renewLease from IP:33911 > Call#1483 Retry#1: error: org.apache.hadoop.ipc.StandbyException: Operation > category WRITE is not supported in state standby > 2013-10-02 00:13:49,689 INFO ipc.Server (Server.java:saslProcess(1342)) - > Auth successful for nn/hostn...@example.com (auth:SIMPLE) > 2013-10-02 00:13:49,696 INFO authorize.ServiceAuthorizationManager > (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful > for nn/hostn...@example.com (auth:KERBEROS) for protocol=interface > org.apache.hadoop.ha.HAServiceProtocol > 2013-10-02 00:13:49,700 INFO namenode.FSNamesystem > (FSNamesystem.java:stopStandbyServices(1013)) - Stopping services started for > standby state > 2013-10-02 00:13:49,701 WARN ha.EditLogTailer > (EditLogTailer.java:doWork(336)) - Edit log tailer interrupted > java.lang.InterruptedException: sleep interrupted > at java.lang.Thread.sleep(Native Method) > at > 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:334) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:356) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1463) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:454) > at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTail > 2013-10-02 00:13:49,704 INFO namenode.FSNamesystem > (FSNamesystem.java:startActiveServices(885)) - Starting services required for > active state > 2013-10-02 00:13:49,719 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnfinalizedSegments(419)) - Starting > recovery process for unclosed journal segments... > 2013-10-02 00:13:49,755 INFO ipc.Server (Server.java:saslProcess(1342)) - > Auth successful for hbase/hostn...@example.com (auth:SIMPLE) > 2013-10-02 00:13:49,761 INFO authorize.ServiceAuthorizationManager > (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful > for hbase/hostn...@example.com (auth:KERBEROS) for protocol=interface > org.apache.hadoop.hdfs.protocol.ClientProtocol > 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnfinalizedSegments(421)) - Successfully > started new epoch 85 > 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnclosedSegment(249)) - Beginning recovery > of unclosed segment starting at txid 887112 > 2013-10-02 00:13:49,874 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnclosedSegment(258)) - Recovery prepare > phase complete. 
Responses: > IP:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true > } lastWriterEpoch: 84 lastCommittedTxId: 887530 > 172.18.145.97:8485: segmentState { startTxId: 887112 endTxId: 887531 > isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530 > 2013-10-02 00:13:49,875 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recover > {code} > And then we get into safemode > {code} > Construction[IP:1019|RBW]]} size 0 > 2013-10-02 00:13:50,277 INFO BlockStateChange > (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap > updated: IP:1019 is added to blk_IP157{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[IP:1019|RBW], > ReplicaUnderConstruction[172.18.145.96:10
[jira] [Updated] (HDFS-5291) Standby namenode after transition to active goes into safemode
[ https://issues.apache.org/jira/browse/HDFS-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-5291: -- Attachment: nn.log > Standby namenode after transition to active goes into safemode > -- > > Key: HDFS-5291 > URL: https://issues.apache.org/jira/browse/HDFS-5291 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.1-beta >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Critical > Attachments: nn.log > > > Some log snippets > standby state to active transition > {code} > 2013-10-02 00:13:49,482 INFO ipc.Server (Server.java:run(2068)) - IPC Server > handler 69 on 8020, call > org.apache.hadoop.hdfs.protocol.ClientProtocol.renewLease from IP:33911 > Call#1483 Retry#1: error: org.apache.hadoop.ipc.StandbyException: Operation > category WRITE is not supported in state standby > 2013-10-02 00:13:49,689 INFO ipc.Server (Server.java:saslProcess(1342)) - > Auth successful for nn/hostn...@example.com (auth:SIMPLE) > 2013-10-02 00:13:49,696 INFO authorize.ServiceAuthorizationManager > (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful > for nn/hostn...@example.com (auth:KERBEROS) for protocol=interface > org.apache.hadoop.ha.HAServiceProtocol > 2013-10-02 00:13:49,700 INFO namenode.FSNamesystem > (FSNamesystem.java:stopStandbyServices(1013)) - Stopping services started for > standby state > 2013-10-02 00:13:49,701 WARN ha.EditLogTailer > (EditLogTailer.java:doWork(336)) - Edit log tailer interrupted > java.lang.InterruptedException: sleep interrupted > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:334) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296) > at 
java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:356) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1463) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:454) > at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTail > 2013-10-02 00:13:49,704 INFO namenode.FSNamesystem > (FSNamesystem.java:startActiveServices(885)) - Starting services required for > active state > 2013-10-02 00:13:49,719 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnfinalizedSegments(419)) - Starting > recovery process for unclosed journal segments... > 2013-10-02 00:13:49,755 INFO ipc.Server (Server.java:saslProcess(1342)) - > Auth successful for hbase/hostn...@example.com (auth:SIMPLE) > 2013-10-02 00:13:49,761 INFO authorize.ServiceAuthorizationManager > (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful > for hbase/hostn...@example.com (auth:KERBEROS) for protocol=interface > org.apache.hadoop.hdfs.protocol.ClientProtocol > 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnfinalizedSegments(421)) - Successfully > started new epoch 85 > 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnclosedSegment(249)) - Beginning recovery > of unclosed segment starting at txid 887112 > 2013-10-02 00:13:49,874 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recoverUnclosedSegment(258)) - Recovery prepare > phase complete. 
Responses: > IP:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true > } lastWriterEpoch: 84 lastCommittedTxId: 887530 > 172.18.145.97:8485: segmentState { startTxId: 887112 endTxId: 887531 > isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530 > 2013-10-02 00:13:49,875 INFO client.QuorumJournalManager > (QuorumJournalManager.java:recover > {code} > And then we get into safemode > {code} > Construction[IP:1019|RBW]]} size 0 > 2013-10-02 00:13:50,277 INFO BlockStateChange > (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap > updated: IP:1019 is added to blk_IP157{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[IP:1019|RBW], > ReplicaUnderConstruction[172.18.145.96:1019|RBW], ReplicaUnde > rConstruction[IP:1019|RBW]]} size 0 > 2013-10-02 00:13:50,279 INFO hdfs.StateChange > (FSNamesystem.java:reportStatus(4703)) - STATE* Safe mode ON. > The reported blocks 1071 needs additional 5 blocks to reach the threshold > 1. of total blocks 1075. > Safe mode will be turne
[jira] [Created] (HDFS-5291) Standby namenode after transition to active goes into safemode
Arpit Gupta created HDFS-5291: - Summary: Standby namenode after transition to active goes into safemode Key: HDFS-5291 URL: https://issues.apache.org/jira/browse/HDFS-5291 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.1.1-beta Reporter: Arpit Gupta Assignee: Jing Zhao Priority: Critical Attachments: nn.log Some log snippets standby state to active transition {code} 2013-10-02 00:13:49,482 INFO ipc.Server (Server.java:run(2068)) - IPC Server handler 69 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.renewLease from IP:33911 Call#1483 Retry#1: error: org.apache.hadoop.ipc.StandbyException: Operation category WRITE is not supported in state standby 2013-10-02 00:13:49,689 INFO ipc.Server (Server.java:saslProcess(1342)) - Auth successful for nn/hostn...@example.com (auth:SIMPLE) 2013-10-02 00:13:49,696 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful for nn/hostn...@example.com (auth:KERBEROS) for protocol=interface org.apache.hadoop.ha.HAServiceProtocol 2013-10-02 00:13:49,700 INFO namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(1013)) - Stopping services started for standby state 2013-10-02 00:13:49,701 WARN ha.EditLogTailer (EditLogTailer.java:doWork(336)) - Edit log tailer interrupted java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:334) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:356) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1463) at 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:454) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTail 2013-10-02 00:13:49,704 INFO namenode.FSNamesystem (FSNamesystem.java:startActiveServices(885)) - Starting services required for active state 2013-10-02 00:13:49,719 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(419)) - Starting recovery process for unclosed journal segments... 2013-10-02 00:13:49,755 INFO ipc.Server (Server.java:saslProcess(1342)) - Auth successful for hbase/hostn...@example.com (auth:SIMPLE) 2013-10-02 00:13:49,761 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful for hbase/hostn...@example.com (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(421)) - Successfully started new epoch 85 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(249)) - Beginning recovery of unclosed segment starting at txid 887112 2013-10-02 00:13:49,874 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(258)) - Recovery prepare phase complete. 
Responses: IP:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530 172.18.145.97:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530 2013-10-02 00:13:49,875 INFO client.QuorumJournalManager (QuorumJournalManager.java:recover {code} And then we get into safemode {code} Construction[IP:1019|RBW]]} size 0 2013-10-02 00:13:50,277 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: IP:1019 is added to blk_IP157{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[IP:1019|RBW], ReplicaUnderConstruction[172.18.145.96:1019|RBW], ReplicaUnde rConstruction[IP:1019|RBW]]} size 0 2013-10-02 00:13:50,279 INFO hdfs.StateChange (FSNamesystem.java:reportStatus(4703)) - STATE* Safe mode ON. The reported blocks 1071 needs additional 5 blocks to reach the threshold 1. of total blocks 1075. Safe mode will be turned off automatically 2013-10-02 00:13:50,279 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: IP:1019 is added to blk_IP158{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.18.145.99:1019|RBW], ReplicaUnderConstruction[172.18.145.97:1019|RBW], R
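The "needs additional N blocks" line in the log above comes from the safe-mode threshold check. A rough sketch of that condition follows; this is a hypothetical simplification (the real check lives in FSNamesystem's SafeModeInfo and also considers the live-datanode minimum), with `threshold` standing in for dfs.namenode.safemode.threshold-pct:

```python
import math

def safemode_blocks_needed(reported, total, threshold=1.0):
    # The namenode stays in safe mode until the number of reported
    # blocks reaches ceil(total * threshold); the log's "needs
    # additional N blocks" is the remaining shortfall.
    needed = int(math.ceil(total * threshold))
    return max(0, needed - reported)
```

With threshold 1.0, every block must be reported before safe mode can be left automatically, which is why a newly transitioned active NN sits in safe mode until block reports catch up.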
[jira] [Commented] (HDFS-5221) hftp: does not work with HA NN configuration
[ https://issues.apache.org/jira/browse/HDFS-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770140#comment-13770140 ] Arpit Gupta commented on HDFS-5221: --- This might be a dup of HDFS-5123 > hftp: does not work with HA NN configuration > > > Key: HDFS-5221 > URL: https://issues.apache.org/jira/browse/HDFS-5221 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, hdfs-client >Affects Versions: 2.0.5-alpha >Reporter: Joep Rottinghuis >Priority: Blocker > > When copying data between clusters of significant different version (say from > Hadoop 1.x equivalent to Hadoop 2.x) we have to use hftp. > When HA is configured, you have to point to a single (active) NN. > Now, when the active NN becomes standby, the the hftp: addresses will fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5176) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762428#comment-13762428 ] Arpit Gupta commented on HDFS-5176: --- Is this a dup of HDFS-5122? > WebHDFS should support logical service names in URIs > > > Key: HDFS-5176 > URL: https://issues.apache.org/jira/browse/HDFS-5176 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0-alpha >Reporter: Eli Collins > > Having WebHDFS support logical URIs would allow users to eg distcp from one > system to another (using a webhdfs source) w/o having to first figure out the > hostname for the active NameNode on the source. Eventually we can make > WebHdfsFileSystem fully support HA (eg failover) but this would be a useful > intermediate point. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5147) Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup
[ https://issues.apache.org/jira/browse/HDFS-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756728#comment-13756728 ] Arpit Gupta commented on HDFS-5147: --- Ah thanks Konstantin. Yes by default we should always go to the active NN. > Certain dfsadmin commands such as safemode do not interact with the active > namenode in ha setup > --- > > Key: HDFS-5147 > URL: https://issues.apache.org/jira/browse/HDFS-5147 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta >Assignee: Jing Zhao > > There are certain commands in dfsadmin return the status of the first > namenode specified in the configs rather than interacting with the active > namenode > For example. Issue > hdfs dfsadmin -safemode get > and it will return the status of the first namenode in the configs rather > than the active namenode. > I think all dfsadmin commands should determine which is the active namenode > do the operation on it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5147) Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup
[ https://issues.apache.org/jira/browse/HDFS-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754933#comment-13754933 ] Arpit Gupta commented on HDFS-5147: --- Also we should add an optional argument to take in the namenode hostname or dfs.ha.namenodes.${dfs.nameservices} value so the user can do admin operation on any namenode. > Certain dfsadmin commands such as safemode do not interact with the active > namenode in ha setup > --- > > Key: HDFS-5147 > URL: https://issues.apache.org/jira/browse/HDFS-5147 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta > > There are certain commands in dfsadmin return the status of the first > namenode specified in the configs rather than interacting with the active > namenode > For example. Issue > hdfs dfsadmin -safemode get > and it will return the status of the first namenode in the configs rather > than the active namenode. > I think all dfsadmin commands should determine which is the active namenode > do the operation on it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5147) Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup
[ https://issues.apache.org/jira/browse/HDFS-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754930#comment-13754930 ] Arpit Gupta commented on HDFS-5147: --- Currently dfsadmin has the following commands {code} Note: Administrative commands can only be run as the HDFS superuser. [-report] [-safemode enter | leave | get | wait] [-allowSnapshot ] [-disallowSnapshot ] [-saveNamespace] [-rollEdits] [-restoreFailedStorage true|false|check] [-refreshNodes] [-finalizeUpgrade] [-metasave filename] [-refreshServiceAcl] [-refreshUserToGroupsMappings] [-refreshSuperUserGroupsConfiguration] [-printTopology] [-refreshNamenodes datanodehost:port] [-deleteBlockPool datanode-host:port blockpoolId [force]] [-setQuota ...] [-clrQuota ...] [-setSpaceQuota ...] [-clrSpaceQuota ...] [-setBalancerBandwidth ] [-fetchImage ] {code} > Certain dfsadmin commands such as safemode do not interact with the active > namenode in ha setup > --- > > Key: HDFS-5147 > URL: https://issues.apache.org/jira/browse/HDFS-5147 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta > > There are certain commands in dfsadmin return the status of the first > namenode specified in the configs rather than interacting with the active > namenode > For example. Issue > hdfs dfsadmin -safemode get > and it will return the status of the first namenode in the configs rather > than the active namenode. > I think all dfsadmin commands should determine which is the active namenode > do the operation on it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5147) Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup
Arpit Gupta created HDFS-5147: - Summary: Certain dfsadmin commands such as safemode do not interact with the active namenode in ha setup Key: HDFS-5147 URL: https://issues.apache.org/jira/browse/HDFS-5147 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.1.0-beta Reporter: Arpit Gupta There are certain dfsadmin commands that return the status of the first namenode specified in the configs rather than interacting with the active namenode. For example, issue hdfs dfsadmin -safemode get and it will return the status of the first namenode in the configs rather than the active namenode. I think all dfsadmin commands should determine which namenode is active and do the operation on it.
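The fix direction the report suggests can be sketched as follows. This is a hypothetical outline: `get_ha_state` stands in for querying each configured namenode's HA state (as `hdfs haadmin -getServiceState` does), and a dfsadmin command would then target the namenode it returns:

```python
def pick_active(namenode_ids, get_ha_state):
    # Iterate the namenodes listed in dfs.ha.namenodes.<nameservice>
    # and return the first one whose HA state is "active"; None if no
    # namenode is currently active (e.g. mid-failover).
    for nn in namenode_ids:
        if get_ha_state(nn) == "active":
            return nn
    return None
```

A command like -safemode get would run this selection first instead of always talking to the first namenode in the configuration.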
[jira] [Commented] (HDFS-5140) Too many safemode monitor threads being created in the standby namenode
[ https://issues.apache.org/jira/browse/HDFS-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752639#comment-13752639 ] Arpit Gupta commented on HDFS-5140: --- Here is the stack trace from the standby namenode {code} 2013-08-28 08:58:45,519 INFO hdfs.StateChange (FSNamesystem.java:reportStatus(4677)) - STATE* Safe mode extension entered. The reported blocks 833 has reached the threshold 1. of total blocks 833. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 29 seconds. 2013-08-28 08:58:45,524 ERROR namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(203)) - Encountered exception on operation CloseOp [length=0, inodeId=0, path=/user/hrt_qa/ha-loadgenerator/100-threads/dir3/dir2/dir5/dir4/dir2/dir1/hostname63, replication=3, mtime=1377680236411, atime=1377680236320, blockSize=134217728, blocks=[blk_1073940431_205511], permissions=hrt_qa:hrt_qa:rw-r--r--, clientName=, clientMachine=, opCode=OP_CLOSE, txid=1141116] java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:640) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.checkMode(FSNamesystem.java:4521) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.incrementSafeBlockCount(FSNamesystem.java:4568) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$1900(FSNamesystem.java:4275) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.incrementSafeBlockCount(FSNamesystem.java:4854) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:596) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:608) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:621) at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:696) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:372) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:198) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:111) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:733) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292) 2013-08-28 08:58:45,597 FATAL ha.EditLogTailer (EditLogTailer.java:doWork(328)) - Unknown error encountered while tailing edits. Shutting down standby NN. 
java.io.IOException: Failed to apply edit log operation CloseOp [length=0, inodeId=0, path=/user/hrt_qa/ha-loadgenerator/100-threads/dir3/dir2/dir5/dir4/dir2/dir1/hostname63, replication=3, mtime=1377680236411, atime=1377680236320, blockSize=134217728, blocks=[blk_1073940431_205511], permissions=hrt_qa:hrt_qa:rw-r--r--, clientName=, clientMachine=, opCode=OP_CLOSE, txid=1141116]: error unable to create new native thread at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:204) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:111) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:733) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292) 2013-08-28 08:58:45,636 INFO util.ExitUtil (ExitUtil.java:terminate(1
[jira] [Updated] (HDFS-5140) Too many safemode monitor threads being created in the standby namenode causing it to fail with out of memory error
[ https://issues.apache.org/jira/browse/HDFS-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-5140: -- Summary: Too many safemode monitor threads being created in the standby namenode causing it to fail with out of memory error (was: Too many safemode monitor threads being created in the standby namenode) > Too many safemode monitor threads being created in the standby namenode > causing it to fail with out of memory error > --- > > Key: HDFS-5140 > URL: https://issues.apache.org/jira/browse/HDFS-5140 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Blocker > > While running namenode load generator with 100 threads for 10 mins namenode > was being failed over ever 2 mins. > The standby namenode shut itself down as it ran out of memory and was not > able to create another thread. > When we searched for 'Safe mode extension entered' in the standby log it was > present 55000+ times -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5140) Too many safemode monitor threads being created in the standby namenode
Arpit Gupta created HDFS-5140: - Summary: Too many safemode monitor threads being created in the standby namenode Key: HDFS-5140 URL: https://issues.apache.org/jira/browse/HDFS-5140 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.1.0-beta Reporter: Arpit Gupta Assignee: Jing Zhao Priority: Blocker While running the namenode load generator with 100 threads for 10 mins, the namenode was being failed over every 2 mins. The standby namenode shut itself down as it ran out of memory and was not able to create another thread. When we searched for 'Safe mode extension entered' in the standby log it was present 55000+ times.
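The 55000+ "Safe mode extension entered" lines and the "unable to create new native thread" OOM point at the monitor thread being spawned on every checkMode() call while tailing edits. A minimal sketch of the likely fix shape (hypothetical, not the actual HDFS patch) is to guard the thread start with a flag:

```python
class SafeModeMonitorGuard:
    """Start the safe-mode monitor at most once per safe-mode entry."""

    def __init__(self):
        self.monitor_started = False
        self.threads_spawned = 0

    def check_mode(self):
        # Called on every incrementSafeBlockCount(); without the guard,
        # each call would launch another monitor thread.
        if not self.monitor_started:
            self.monitor_started = True
            self.threads_spawned += 1  # real code would start the thread here
```

Under a load-generator workload that triggers checkMode() tens of thousands of times, the guard caps thread creation at one.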
[jira] [Updated] (HDFS-5132) Deadlock in namenode while running load generator with 15 threads
[ https://issues.apache.org/jira/browse/HDFS-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-5132: -- Attachment: jstack.log Attaching the output of jstack > Deadlock in namenode while running load generator with 15 threads > - > > Key: HDFS-5132 > URL: https://issues.apache.org/jira/browse/HDFS-5132 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta > Attachments: jstack.log > > > While running nn load generator with 15 threads for 20 mins the standby > namenode deadlocked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5132) Deadlock in namenode while running load generator with 15 threads
Arpit Gupta created HDFS-5132: - Summary: Deadlock in namenode while running load generator with 15 threads Key: HDFS-5132 URL: https://issues.apache.org/jira/browse/HDFS-5132 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.0-beta Reporter: Arpit Gupta While running nn load generator with 15 threads for 20 mins the standby namenode deadlocked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5122) webhdfs paths on an ha cluster still require the use of the active nn address rather than using the nameservice
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746948#comment-13746948 ] Arpit Gupta commented on HDFS-5122: --- This is similar to HDFS-5123 > webhdfs paths on an ha cluster still require the use of the active nn address > rather than using the nameservice > --- > > Key: HDFS-5122 > URL: https://issues.apache.org/jira/browse/HDFS-5122 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta > > For example if the dfs.nameservices is set to arpit > {code} > hdfs dfs -ls webhdfs://arpit:50070/tmp > or > hdfs dfs -ls webhdfs://arpit/tmp > {code} > does not work > You have to provide the exact active namenode hostname. On an HA cluster > using dfs client one should not need to provide the active nn hostname -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5123) hftp paths on an ha cluster still require the use of the active nn address rather than using the nameservice
[ https://issues.apache.org/jira/browse/HDFS-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746949#comment-13746949 ] Arpit Gupta commented on HDFS-5123: --- This is similar to HDFS-5122 > hftp paths on an ha cluster still require the use of the active nn address > rather than using the nameservice > > > Key: HDFS-5123 > URL: https://issues.apache.org/jira/browse/HDFS-5123 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta > > For example if the dfs.nameservices is set to arpit > {code} > hdfs dfs -ls hftp://arpit:50070/tmp > or > hdfs dfs -ls hftp://arpit/tmp > {code} > does not work > You have to provide the exact active namenode hostname. On an HA cluster > using dfs client one should not need to provide the active nn hostname -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5123) hftp paths on an ha cluster still require the use of the active nn address rather than using the nameservice
Arpit Gupta created HDFS-5123: - Summary: hftp paths on an ha cluster still require the use of the active nn address rather than using the nameservice Key: HDFS-5123 URL: https://issues.apache.org/jira/browse/HDFS-5123 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.1.0-beta Reporter: Arpit Gupta For example if the dfs.nameservices is set to arpit {code} hdfs dfs -ls hftp://arpit:50070/tmp or hdfs dfs -ls hftp://arpit/tmp {code} does not work You have to provide the exact active namenode hostname. On an HA cluster using dfs client one should not need to provide the active nn hostname -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5122) webhdfs paths on an ha cluster still require the use of the active nn address rather than using the nameservice
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-5122: -- Description: For example if the dfs.nameservices is set to arpit {code} hdfs dfs -ls webhdfs://arpit:50070/tmp or hdfs dfs -ls webhdfs://arpit/tmp {code} does not work You have to provide the exact active namenode hostname. On an HA cluster using dfs client one should not need to provide the active nn hostname was: For example if the dfs.nameservices is set to arpit {code} hdfs dfs -ls webhdfs://arpit:50070/tmp or hdfs dfs -ls webhdfs://arpit/tmp {code} does not work You have to provide the exact active namenode hostname > webhdfs paths on an ha cluster still require the use of the active nn address > rather than using the nameservice > --- > > Key: HDFS-5122 > URL: https://issues.apache.org/jira/browse/HDFS-5122 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.1.0-beta >Reporter: Arpit Gupta > > For example if the dfs.nameservices is set to arpit > {code} > hdfs dfs -ls webhdfs://arpit:50070/tmp > or > hdfs dfs -ls webhdfs://arpit/tmp > {code} > does not work > You have to provide the exact active namenode hostname. On an HA cluster > using dfs client one should not need to provide the active nn hostname -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5122) webhdfs paths on an ha cluster still require the use of the active nn address rather than using the nameservice
Arpit Gupta created HDFS-5122: - Summary: webhdfs paths on an ha cluster still require the use of the active nn address rather than using the nameservice Key: HDFS-5122 URL: https://issues.apache.org/jira/browse/HDFS-5122 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.1.0-beta Reporter: Arpit Gupta For example if the dfs.nameservices is set to arpit {code} hdfs dfs -ls webhdfs://arpit:50070/tmp or hdfs dfs -ls webhdfs://arpit/tmp {code} does not work You have to provide the exact active namenode hostname -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
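Resolving a logical nameservice like webhdfs://arpit/tmp would mean expanding the standard HA configuration keys into candidate namenode HTTP addresses and then probing for the active one. A sketch of that expansion, using the real key names (dfs.ha.namenodes.<nameservice> and dfs.namenode.http-address.<nameservice>.<nnid>) over a plain dict standing in for the Hadoop Configuration object:

```python
def resolve_http_addresses(conf, nameservice):
    # Expand a logical nameservice into the HTTP addresses of its
    # configured namenodes; a client could then try each address (or
    # probe HA state) instead of requiring the active hostname up front.
    nn_ids = conf["dfs.ha.namenodes.%s" % nameservice].split(",")
    return [conf["dfs.namenode.http-address.%s.%s" % (nameservice, nn)]
            for nn in nn_ids]
```

This is only the address-expansion half; full HA support in WebHdfsFileSystem would also need failover/retry on top of it.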
[jira] [Created] (HDFS-4594) WebHDFS open sets Content-Length header to what is specified by length parameter rather than how much data is actually returned.
Arpit Gupta created HDFS-4594: - Summary: WebHDFS open sets Content-Length header to what is specified by length parameter rather than how much data is actually returned. Key: HDFS-4594 URL: https://issues.apache.org/jira/browse/HDFS-4594 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.3-alpha Reporter: Arpit Gupta This was noticed on 2.0.3-alpha. Let's say we have a file of length x. We make a webhdfs open call specifying length=x+1. The response of the call redirected to the datanode sets the Content-Length header to value x+1 rather than x. This causes an error when the client tries to read the data. For the test I was using HttpResponse.getEntity().getContent(). This failed with the message "Premature end of Content-Length delimited message body (expected: 71898; received: 71897)". This was not seen in hadoop 1 as we did not set the Content-Length header.
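The underlying arithmetic of the bug is simple: the Content-Length header must reflect the bytes actually returned, so a requested length that runs past end-of-file has to be clamped to what remains. A sketch of the correct computation (names are illustrative, not the HDFS code):

```python
def content_length(file_size, offset=0, length=None):
    # Bytes actually available from `offset` to EOF.
    remaining = max(0, file_size - offset)
    if length is None:
        return remaining
    # Clamp the caller's requested length to what the file can supply;
    # echoing the raw request (the reported bug) overstates the body size.
    return min(length, remaining)
```

With the numbers from the error message, a request for 71898 bytes against a 71897-byte file should yield a Content-Length of 71897, not 71898.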
[jira] [Commented] (HDFS-4565) use DFSUtil.getSpnegoKeytabKey() to get the spnego keytab key in secondary namenode and namenode http server
[ https://issues.apache.org/jira/browse/HDFS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596224#comment-13596224 ] Arpit Gupta commented on HDFS-4565: --- No tests added as method being used already has tests. Test failure is unrelated to this patch. > use DFSUtil.getSpnegoKeytabKey() to get the spnego keytab key in secondary > namenode and namenode http server > > > Key: HDFS-4565 > URL: https://issues.apache.org/jira/browse/HDFS-4565 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta >Priority: Minor > Attachments: HDFS-4565.patch > > > use the method introduced by HDFS-4540 to the spengo keytab key. Better as we > have unit test coverage for the new method. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4565) use DFSUtil.getSpnegoKeytabKey() to get the spnego keytab key in secondary namenode and namenode http server
[ https://issues.apache.org/jira/browse/HDFS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4565: -- Summary: use DFSUtil.getSpnegoKeytabKey() to get the spnego keytab key in secondary namenode and namenode http server (was: use DFSUtil.getSpnegoKeytabKey to get the key in secondary namenode and namenode http server) > use DFSUtil.getSpnegoKeytabKey() to get the spnego keytab key in secondary > namenode and namenode http server > > > Key: HDFS-4565 > URL: https://issues.apache.org/jira/browse/HDFS-4565 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta >Priority: Minor > Attachments: HDFS-4565.patch > > > use the method introduced by HDFS-4540 to the spengo keytab key. Better as we > have unit test coverage for the new method. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4565) use DFSUtil.getSpnegoKeytabKey to get the key in secondary namenode and namenode http server
[ https://issues.apache.org/jira/browse/HDFS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4565: -- Status: Patch Available (was: Open) > use DFSUtil.getSpnegoKeytabKey to get the key in secondary namenode and > namenode http server > > > Key: HDFS-4565 > URL: https://issues.apache.org/jira/browse/HDFS-4565 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta >Priority: Minor > Attachments: HDFS-4565.patch > > > use the method introduced by HDFS-4540 to the spengo keytab key. Better as we > have unit test coverage for the new method. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4565) use DFSUtil.getSpnegoKeytabKey to get the key in secondary namenode and namenode http server
[ https://issues.apache.org/jira/browse/HDFS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4565: -- Attachment: HDFS-4565.patch > use DFSUtil.getSpnegoKeytabKey to get the key in secondary namenode and > namenode http server > > > Key: HDFS-4565 > URL: https://issues.apache.org/jira/browse/HDFS-4565 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta >Priority: Minor > Attachments: HDFS-4565.patch > > > use the method introduced by HDFS-4540 to the spengo keytab key. Better as we > have unit test coverage for the new method. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4565) use DFSUtil.getSpnegoKeytabKey to get the key in secondary namenode and namenode http server
Arpit Gupta created HDFS-4565: - Summary: use DFSUtil.getSpnegoKeytabKey to get the key in secondary namenode and namenode http server Key: HDFS-4565 URL: https://issues.apache.org/jira/browse/HDFS-4565 Project: Hadoop HDFS Issue Type: Improvement Components: security Affects Versions: 2.0.3-alpha Reporter: Arpit Gupta Assignee: Arpit Gupta Priority: Minor Use the method introduced by HDFS-4540 to get the spnego keytab key. Better as we have unit test coverage for the new method.
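The intent of the HDFS-4540 helper, as described in these issues, is a fallback: prefer the web-authentication keytab key when it is configured, otherwise use the caller's default key. A sketch of that selection logic (a plain-Python approximation of DFSUtil.getSpnegoKeytabKey's described behavior, not its exact signature):

```python
WEB_AUTH_KEYTAB_KEY = "dfs.web.authentication.kerberos.keytab"

def get_spnego_keytab_key(conf, default_key):
    # If the SPNEGO-specific keytab is configured, HTTP servers should
    # use it; otherwise fall back to the service keytab key supplied by
    # the caller (e.g. dfs.namenode.keytab.file).
    if conf.get(WEB_AUTH_KEYTAB_KEY):
        return WEB_AUTH_KEYTAB_KEY
    return default_key
```

Centralizing the fallback in one tested method is the "better as we have unit test coverage" point: each HTTP server (namenode, secondary namenode) reuses it instead of reimplementing the check.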
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Attachment: HDFS-4540.patch Minor updates, add a check and test for null > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch, HDFS-4540.patch, HDFS-4540.patch, > HDFS-4540.patch, HDFS-4540.patch, HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Attachment: HDFS-4540.patch added timeout to the test. > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch, HDFS-4540.patch, HDFS-4540.patch, > HDFS-4540.patch, HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Attachment: HDFS-4540.patch generate the patch from the correct branch duh! > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch, HDFS-4540.patch, HDFS-4540.patch, > HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Attachment: HDFS-4540.patch Attached a patch where the code is moved to DFSUtil and added a test. Also removed unused imports. > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch, HDFS-4540.patch, HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4541) set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default
[ https://issues.apache.org/jira/browse/HDFS-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590926#comment-13590926 ] Arpit Gupta commented on HDFS-4541: --- No tests added as this is a change to shell scripts. Manually verified that the secure datanode logs are being written to the appropriate directory. Test failure is unrelated. > set hadoop.log.dir and hadoop.id.str when starting secure datanode so it > writes the logs to the correct dir by default > -- > > Key: HDFS-4541 > URL: https://issues.apache.org/jira/browse/HDFS-4541 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4541.patch, HDFS-4541.patch > > > currently in hadoop-config.sh we set the following > {code} > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR" > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING" > {code} > however when this file is sourced we dont know whether we are starting a > secure data node. > In the hdfs script when we determine whether we are starting secure data node > or not we should also update HADOOP_OPTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4541) set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default
[ https://issues.apache.org/jira/browse/HDFS-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4541: -- Status: Patch Available (was: Open) > set hadoop.log.dir and hadoop.id.str when starting secure datanode so it > writes the logs to the correct dir by default > -- > > Key: HDFS-4541 > URL: https://issues.apache.org/jira/browse/HDFS-4541 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4541.patch, HDFS-4541.patch > > > currently in hadoop-config.sh we set the following > {code} > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR" > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING" > {code} > however when this file is sourced we dont know whether we are starting a > secure data node. > In the hdfs script when we determine whether we are starting secure data node > or not we should also update HADOOP_OPTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4541) set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default
[ https://issues.apache.org/jira/browse/HDFS-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590819#comment-13590819 ] Arpit Gupta commented on HDFS-4541: --- @Chris Actually it will appear 3 times after this change for secure datanode. It appears twice for any hdfs service right now. hadoop-daemon.sh -> hadoop (sources hadoop-config.sh) -> hdfs (sources hdfs-config.sh which sources hadoop-config.sh) and then we set it again. I think the problem of duplicates in OPTS is an issue that should be solved in a different jira. > set hadoop.log.dir and hadoop.id.str when starting secure datanode so it > writes the logs to the correct dir by default > -- > > Key: HDFS-4541 > URL: https://issues.apache.org/jira/browse/HDFS-4541 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4541.patch, HDFS-4541.patch > > > currently in hadoop-config.sh we set the following > {code} > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR" > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING" > {code} > however when this file is sourced we don't know whether we are starting a > secure data node. > In the hdfs script when we determine whether we are starting secure data node > or not we should also update HADOOP_OPTS
[jira] [Updated] (HDFS-4541) set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default
[ https://issues.apache.org/jira/browse/HDFS-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4541: -- Attachment: HDFS-4541.patch regenerated the patch --no-prefix > set hadoop.log.dir and hadoop.id.str when starting secure datanode so it > writes the logs to the correct dir by default > -- > > Key: HDFS-4541 > URL: https://issues.apache.org/jira/browse/HDFS-4541 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4541.patch, HDFS-4541.patch > > > currently in hadoop-config.sh we set the following > {code} > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR" > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING" > {code} > however when this file is sourced we dont know whether we are starting a > secure data node. > In the hdfs script when we determine whether we are starting secure data node > or not we should also update HADOOP_OPTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Attachment: HDFS-4540.patch updated the patch with the check. > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch, HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590763#comment-13590763 ] Arpit Gupta commented on HDFS-4540: --- Good point Suresh. Let me update the patch with more checks. > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4541) set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default
[ https://issues.apache.org/jira/browse/HDFS-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590735#comment-13590735 ] Arpit Gupta commented on HDFS-4541: --- Attached a patch; with it applied, secure datanode logs will be written to the correct dir by default. Without it, the datanode tries to write its logs to the your_log_dir/root dir. > set hadoop.log.dir and hadoop.id.str when starting secure datanode so it > writes the logs to the correct dir by default > -- > > Key: HDFS-4541 > URL: https://issues.apache.org/jira/browse/HDFS-4541 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4541.patch > > > currently in hadoop-config.sh we set the following > {code} > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR" > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING" > {code} > however when this file is sourced we don't know whether we are starting a > secure data node. > In the hdfs script when we determine whether we are starting secure data node > or not we should also update HADOOP_OPTS
[jira] [Updated] (HDFS-4541) set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default
[ https://issues.apache.org/jira/browse/HDFS-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4541: -- Attachment: HDFS-4541.patch > set hadoop.log.dir and hadoop.id.str when starting secure datanode so it > writes the logs to the correct dir by default > -- > > Key: HDFS-4541 > URL: https://issues.apache.org/jira/browse/HDFS-4541 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4541.patch > > > currently in hadoop-config.sh we set the following > {code} > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR" > HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING" > {code} > however when this file is sourced we dont know whether we are starting a > secure data node. > In the hdfs script when we determine whether we are starting secure data node > or not we should also update HADOOP_OPTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4541) set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default
Arpit Gupta created HDFS-4541: - Summary: set hadoop.log.dir and hadoop.id.str when starting secure datanode so it writes the logs to the correct dir by default Key: HDFS-4541 URL: https://issues.apache.org/jira/browse/HDFS-4541 Project: Hadoop HDFS Issue Type: Bug Components: datanode, security Affects Versions: 2.0.3-alpha Reporter: Arpit Gupta Assignee: Arpit Gupta Currently in hadoop-config.sh we set the following:
{code}
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
{code}
However, when this file is sourced we don't know whether we are starting a secure data node. In the hdfs script, when we determine whether or not we are starting a secure data node, we should also update HADOOP_OPTS.
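A sketch of the change this description proposes: once the hdfs launcher script has determined it is starting a secure datanode, re-append the two properties to HADOOP_OPTS. This is illustrative only, not the actual patch; the `starting_secure_dn` flag name, the directories, and the ident string below are made up, while `HADOOP_OPTS` and the two `-D` property names come from the issue.

```shell
# Hypothetical excerpt modeled on the hdfs launcher script.
HADOOP_OPTS=""
HADOOP_LOG_DIR="/var/log/hadoop"
HADOOP_IDENT_STRING="hdfs"
starting_secure_dn="true"   # set wherever the script detects a secure datanode

if [ "$starting_secure_dn" = "true" ]; then
  # Re-apply the log dir and ident string here, because hadoop-config.sh was
  # sourced before we knew a secure datanode was being started; without this
  # the secure datanode logs land under a default such as your_log_dir/root.
  HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
  HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
fi
echo "$HADOOP_OPTS"
```

Appending rather than resetting keeps whatever options hadoop-config.sh already placed in HADOOP_OPTS, at the cost of the duplicate -D entries discussed in the comments (the last occurrence of a system property wins for the JVM).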
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Target Version/s: 2.0.4-beta > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Fix Version/s: (was: 2.0.4-beta) > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590240#comment-13590240 ] Arpit Gupta commented on HDFS-4540: --- No new tests added as this is a security related change. Confirmed namenode starts up with this change when using a different keytab for spnego principal. > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Fix For: 2.0.4-beta > > Attachments: HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) namenode http server should use the web authentication keytab for spnego principal
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Summary: namenode http server should use the web authentication keytab for spnego principal (was: Spnego principal should be looked up in the web authentication kerberos keytab before the namenode's keytab) > namenode http server should use the web authentication keytab for spnego > principal > -- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Fix For: 2.0.4-beta > > Attachments: HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) Spnego principal should be looked up in the web authentication kerberos keytab before the namenode's keytab
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Attachment: HDFS-4540.patch Patch that uses the correct config if available > Spnego principal should be looked up in the web authentication kerberos > keytab before the namenode's keytab > --- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Fix For: 2.0.4-beta > > Attachments: HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4540) Spnego principal should be looked up in the web authentication kerberos keytab before the namenode's keytab
[ https://issues.apache.org/jira/browse/HDFS-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4540: -- Status: Patch Available (was: Open) > Spnego principal should be looked up in the web authentication kerberos > keytab before the namenode's keytab > --- > > Key: HDFS-4540 > URL: https://issues.apache.org/jira/browse/HDFS-4540 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 2.0.3-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Fix For: 2.0.4-beta > > Attachments: HDFS-4540.patch > > > This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego > should look for dfs.web.authentication.kerberos.keytab before using > dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4540) Spnego principal should be looked up in the web authentication kerberos keytab before the namenode's keytab
Arpit Gupta created HDFS-4540: - Summary: Spnego principal should be looked up in the web authentication kerberos keytab before the namenode's keytab Key: HDFS-4540 URL: https://issues.apache.org/jira/browse/HDFS-4540 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 2.0.3-alpha Reporter: Arpit Gupta Assignee: Arpit Gupta Fix For: 2.0.4-beta This is similar to HDFS-4105. in the NameNodeHttpServer.start() initSpnego should look for dfs.web.authentication.kerberos.keytab before using dfs.namenode.keytab.file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3727) When using SPNEGO, NN should not try to log in using KSSL principal
[ https://issues.apache.org/jira/browse/HDFS-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3727: -- Affects Version/s: 1.1.1 > When using SPNEGO, NN should not try to log in using KSSL principal > --- > > Key: HDFS-3727 > URL: https://issues.apache.org/jira/browse/HDFS-3727 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.1.0, 1.1.1, 1.2.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 1.2.0 > > Attachments: HDFS-3727.patch > > > When performing a checkpoint with security enabled, the NN will attempt to > relogin from its keytab before making an HTTP request back to the 2NN to > fetch the newly-merged image. However, it always attempts to log in using the > KSSL principal, even if SPNEGO is configured to be used. > This issue was discovered by Stephen Chu. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3727) When using SPNEGO, NN should not try to log in using KSSL principal
[ https://issues.apache.org/jira/browse/HDFS-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3727: -- Affects Version/s: 1.1.0 > When using SPNEGO, NN should not try to log in using KSSL principal > --- > > Key: HDFS-3727 > URL: https://issues.apache.org/jira/browse/HDFS-3727 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.1.0, 1.1.1, 1.2.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 1.2.0 > > Attachments: HDFS-3727.patch > > > When performing a checkpoint with security enabled, the NN will attempt to > relogin from its keytab before making an HTTP request back to the 2NN to > fetch the newly-merged image. However, it always attempts to log in using the > KSSL principal, even if SPNEGO is configured to be used. > This issue was discovered by Stephen Chu. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HDFS-3727) When using SPNEGO, NN should not try to log in using KSSL principal
[ https://issues.apache.org/jira/browse/HDFS-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta reopened HDFS-3727: --- Can we commit this to branch 1.1 so that the next release can pull it in? Also, a couple of unused imports got left in the class after this patch. > When using SPNEGO, NN should not try to log in using KSSL principal > --- > > Key: HDFS-3727 > URL: https://issues.apache.org/jira/browse/HDFS-3727 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 1.1.0, 1.1.1, 1.2.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 1.2.0 > > Attachments: HDFS-3727.patch > > > When performing a checkpoint with security enabled, the NN will attempt to > relogin from its keytab before making an HTTP request back to the 2NN to > fetch the newly-merged image. However, it always attempts to log in using the > KSSL principal, even if SPNEGO is configured to be used. > This issue was discovered by Stephen Chu.
[jira] [Commented] (HDFS-4219) Port slive to branch-1
[ https://issues.apache.org/jira/browse/HDFS-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501711#comment-13501711 ] Arpit Gupta commented on HDFS-4219: --- Here is the output from test-patch:
{code}
[exec]
[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 78 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings.
[exec]
[exec] ==
[exec] ==
[exec] Finished build.
[exec] ==
[exec] ==
[exec]
{code}
The Findbugs warnings are not related to this patch. > Port slive to branch-1 > -- > > Key: HDFS-4219 > URL: https://issues.apache.org/jira/browse/HDFS-4219 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4219.branch-1.patch > > > Originally it was committed in HDFS-708 and MAPREDUCE-1804
[jira] [Commented] (HDFS-4219) Port slive to branch-1
[ https://issues.apache.org/jira/browse/HDFS-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501671#comment-13501671 ] Arpit Gupta commented on HDFS-4219: --- I will update the jira with the results of test patch when done. > Port slive to branch-1 > -- > > Key: HDFS-4219 > URL: https://issues.apache.org/jira/browse/HDFS-4219 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4219.branch-1.patch > > > Originally it was committed in HDFS-708 and MAPREDUCE-1804 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4219) Port slive to branch-1
[ https://issues.apache.org/jira/browse/HDFS-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501670#comment-13501670 ] Arpit Gupta commented on HDFS-4219: --- It was a straightforward port, taking the code from trunk (hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/slive) to branch-1. Had to change SliveMapper.java from:
{code}
if (conf.get(MRJobConfig.TASK_ATTEMPT_ID) != null) {
  this.taskId = TaskAttemptID.forName(conf.get(MRJobConfig.TASK_ATTEMPT_ID))
      .getTaskID().getId();
} else {
  // So that branch-1/0.20 can run this same code as well
  this.taskId = TaskAttemptID.forName(conf.get("mapred.task.id"))
      .getTaskID().getId();
}
{code}
removing the if/else block and just making it:
{code}
this.taskId = TaskAttemptID.forName(conf.get("mapred.task.id"))
    .getTaskID().getId();
{code}
since MRJobConfig is not available in branch-1. > Port slive to branch-1 > -- > > Key: HDFS-4219 > URL: https://issues.apache.org/jira/browse/HDFS-4219 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4219.branch-1.patch > > > Originally it was committed in HDFS-708 and MAPREDUCE-1804
[jira] [Updated] (HDFS-4219) Port slive to branch-1
[ https://issues.apache.org/jira/browse/HDFS-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4219: -- Description: Originally it was committed in HDFS-708 and MAPREDUCE-1804 > Port slive to branch-1 > -- > > Key: HDFS-4219 > URL: https://issues.apache.org/jira/browse/HDFS-4219 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4219.branch-1.patch > > > Originally it was committed in HDFS-708 and MAPREDUCE-1804 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4219) Port slive to branch-1
[ https://issues.apache.org/jira/browse/HDFS-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4219: -- Attachment: HDFS-4219.branch-1.patch > Port slive to branch-1 > -- > > Key: HDFS-4219 > URL: https://issues.apache.org/jira/browse/HDFS-4219 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4219.branch-1.patch > > > Originally it was committed in HDFS-708 and MAPREDUCE-1804 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4219) Port slive to branch-1
Arpit Gupta created HDFS-4219: - Summary: Port slive to branch-1 Key: HDFS-4219 URL: https://issues.apache.org/jira/browse/HDFS-4219 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 1.1.0 Reporter: Arpit Gupta
[jira] [Assigned] (HDFS-4219) Port slive to branch-1
[ https://issues.apache.org/jira/browse/HDFS-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta reassigned HDFS-4219: - Assignee: Arpit Gupta > Port slive to branch-1 > -- > > Key: HDFS-4219 > URL: https://issues.apache.org/jira/browse/HDFS-4219 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Arpit Gupta >Assignee: Arpit Gupta >
[jira] [Commented] (HDFS-4105) the SPNEGO user for secondary namenode should use the web keytab
[ https://issues.apache.org/jira/browse/HDFS-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487237#comment-13487237 ] Arpit Gupta commented on HDFS-4105: --- Patched a secure Hadoop 1.1.0 deployment with the patch and now the secondary namenode is able to log in. Question: if the HTTP principal fails to log in, should we not stop the secondary namenode server? I think we should, as the image calls would fail if the HTTP principal was not available. Let me know and I can log a different JIRA for it. > the SPNEGO user for secondary namenode should use the web keytab > > > Key: HDFS-4105 > URL: https://issues.apache.org/jira/browse/HDFS-4105 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 1.1.0, 2.0.2-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4105.branch-1.patch, HDFS-4105.patch > > > This is similar to HDFS-3466 where we made sure the namenode checks for the > web keytab before it uses the namenode keytab. > The same needs to be done for secondary namenode as well. > {code} > String httpKeytab = > conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY); > if (httpKeytab != null && !httpKeytab.isEmpty()) { > params.put("kerberos.keytab", httpKeytab); > } > {code}
[jira] [Commented] (HDFS-4108) In a secure cluster, in the HDFS WEBUI , clicking on a datanode in the node list , gives an error
[ https://issues.apache.org/jira/browse/HDFS-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484224#comment-13484224 ] Arpit Gupta commented on HDFS-4108: --- What user would they get the delegation token for? Should they not be using SPNEGO and making the client provide Kerberos credentials? > In a secure cluster, in the HDFS WEBUI , clicking on a datanode in the node > list , gives an error > - > > Key: HDFS-4108 > URL: https://issues.apache.org/jira/browse/HDFS-4108 > Project: Hadoop HDFS > Issue Type: Bug > Components: security, webhdfs >Affects Versions: 1.1.0 >Reporter: Benoy Antony >Assignee: Benoy Antony >Priority: Minor > Attachments: HDFS-4108-1-1.patch > > > This issue happens in secure cluster. > To reproduce : > Go to the NameNode WEB UI. (dfshealth.jsp) > Click to bring up the list of LiveNodes (dfsnodelist.jsp) > Click on a datanode to bring up the filesystem web page ( > browsedirectory.jsp) > The page containing the directory listing does not come up.
[jira] [Commented] (HDFS-4105) the SPNEGO user for secondary namenode should use the web keytab
[ https://issues.apache.org/jira/browse/HDFS-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482755#comment-13482755 ] Arpit Gupta commented on HDFS-4105: --- No tests are added as the changes are related to a secure setup. Here is the test-patch output for branch-1:
{code}
[exec] BUILD SUCCESSFUL
[exec] Total time: 5 minutes 0 seconds
[exec]
[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] -1 findbugs. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings.
{code}
Findbugs warnings are not related to this patch. > the SPNEGO user for secondary namenode should use the web keytab > > > Key: HDFS-4105 > URL: https://issues.apache.org/jira/browse/HDFS-4105 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 1.1.0, 2.0.2-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4105.branch-1.patch, HDFS-4105.patch > > > This is similar to HDFS-3466 where we made sure the namenode checks for the > web keytab before it uses the namenode keytab. > The same needs to be done for secondary namenode as well. > {code} > String httpKeytab = > conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY); > if (httpKeytab != null && !httpKeytab.isEmpty()) { > params.put("kerberos.keytab", httpKeytab); > } > {code}
[jira] [Updated] (HDFS-4105) the SPNEGO user for secondary namenode should use the web keytab
[ https://issues.apache.org/jira/browse/HDFS-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4105: -- Attachment: HDFS-4105.patch patch for trunk. > the SPNEGO user for secondary namenode should use the web keytab > > > Key: HDFS-4105 > URL: https://issues.apache.org/jira/browse/HDFS-4105 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 1.1.0, 2.0.2-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4105.branch-1.patch, HDFS-4105.patch > > > This is similar to HDFS-3466 where we made sure the namenode checks for the > web keytab before it uses the namenode keytab. > The same needs to be done for secondary namenode as well. > {code} > String httpKeytab = > conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY); > if (httpKeytab != null && !httpKeytab.isEmpty()) { > params.put("kerberos.keytab", httpKeytab); > } > {code}
[jira] [Updated] (HDFS-4105) the SPNEGO user for secondary namenode should use the web keytab
[ https://issues.apache.org/jira/browse/HDFS-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4105: -- Status: Patch Available (was: Open) > the SPNEGO user for secondary namenode should use the web keytab > > > Key: HDFS-4105 > URL: https://issues.apache.org/jira/browse/HDFS-4105 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.2-alpha, 1.1.0 >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4105.branch-1.patch, HDFS-4105.patch > > > This is similar to HDFS-3466 where we made sure the namenode checks for the > web keytab before it uses the namenode keytab. > The same needs to be done for secondary namenode as well. > {code} > String httpKeytab = > conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY); > if (httpKeytab != null && !httpKeytab.isEmpty()) { > params.put("kerberos.keytab", httpKeytab); > } > {code}
[jira] [Updated] (HDFS-4105) the SPNEGO user for secondary namenode should use the web keytab
[ https://issues.apache.org/jira/browse/HDFS-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-4105: -- Attachment: HDFS-4105.branch-1.patch patch for branch-1 > the SPNEGO user for secondary namenode should use the web keytab > > > Key: HDFS-4105 > URL: https://issues.apache.org/jira/browse/HDFS-4105 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 1.1.0, 2.0.2-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > Attachments: HDFS-4105.branch-1.patch > > > This is similar to HDFS-3466 where we made sure the namenode checks for the > web keytab before it uses the namenode keytab. > The same needs to be done for secondary namenode as well. > {code} > String httpKeytab = > conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY); > if (httpKeytab != null && !httpKeytab.isEmpty()) { > params.put("kerberos.keytab", httpKeytab); > } > {code}
[jira] [Created] (HDFS-4105) the SPNEGO user for secondary namenode should use the web keytab
Arpit Gupta created HDFS-4105: - Summary: the SPNEGO user for secondary namenode should use the web keytab Key: HDFS-4105 URL: https://issues.apache.org/jira/browse/HDFS-4105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha, 1.1.0 Reporter: Arpit Gupta Assignee: Arpit Gupta This is similar to HDFS-3466 where we made sure the namenode checks for the web keytab before it uses the namenode keytab. The same needs to be done for secondary namenode as well.
{code}
String httpKeytab =
    conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY);
if (httpKeytab != null && !httpKeytab.isEmpty()) {
  params.put("kerberos.keytab", httpKeytab);
}
{code}
[jira] [Commented] (HDFS-4084) provide CLI support for allow and disallow snapshot on a directory
[ https://issues.apache.org/jira/browse/HDFS-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480256#comment-13480256 ] Arpit Gupta commented on HDFS-4084: --- @Brandon Can we make the new commands case-insensitive? We can log a different JIRA to make the existing commands case-insensitive as well. > provide CLI support for allow and disallow snapshot on a directory > -- > > Key: HDFS-4084 > URL: https://issues.apache.org/jira/browse/HDFS-4084 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs client, name-node, tools >Affects Versions: HDFS-2802 >Reporter: Brandon Li >Assignee: Brandon Li > Attachments: HDFS-4084.patch > > > To provide CLI support to allow snapshot, disallow snapshot on a directory.
[jira] [Commented] (HDFS-4063) Unable to change JAVA_HOME directory in hadoop-setup-conf.sh script.
[ https://issues.apache.org/jira/browse/HDFS-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477149#comment-13477149 ] Arpit Gupta commented on HDFS-4063: --- These scripts were written to help with setup for the RPMs being generated. Given the discussion on HADOOP-8925, which talks about removing packaging from Hadoop, does it make sense to wait for resolution and close this as won't fix after that? > Unable to change JAVA_HOME directory in hadoop-setup-conf.sh script. > > > Key: HDFS-4063 > URL: https://issues.apache.org/jira/browse/HDFS-4063 > Project: Hadoop HDFS > Issue Type: Bug > Components: scripts, tools >Affects Versions: 1.0.3, 1.1.0, 2.0.2-alpha > Environment: Fedora 17 3.3.4-5.fc17.x86_64t, java version > "1.7.0_06-icedtea", Rackspace Cloud (NextGen) >Reporter: Haoquan Wang >Priority: Minor > Labels: patch > Original Estimate: 1h > Remaining Estimate: 1h > > The JAVA_HOME directory remains unchanged no matter what you enter when you > run hadoop-setup-conf.sh to generate hadoop configurations. Please see below > example: > * > [root@hadoop-slave ~]# /sbin/hadoop-setup-conf.sh > Setup Hadoop Configuration > Where would you like to put config directory? (/etc/hadoop) > Where would you like to put log directory? (/var/log/hadoop) > Where would you like to put pid directory? (/var/run/hadoop) > What is the host of the namenode? (hadoop-slave) > Where would you like to put namenode data directory? > (/var/lib/hadoop/hdfs/namenode) > Where would you like to put datanode data directory? > (/var/lib/hadoop/hdfs/datanode) > What is the host of the jobtracker? (hadoop-slave) > Where would you like to put jobtracker/tasktracker data directory? > (/var/lib/hadoop/mapred) > Where is JAVA_HOME directory? (/usr/java/default) *+/usr/lib/jvm/jre+* > Would you like to create directories/copy conf files to localhost? 
(Y/n) > Review your choices: > Config directory: /etc/hadoop > Log directory : /var/log/hadoop > PID directory : /var/run/hadoop > Namenode host : hadoop-slave > Namenode directory : /var/lib/hadoop/hdfs/namenode > Datanode directory : /var/lib/hadoop/hdfs/datanode > Jobtracker host : hadoop-slave > Mapreduce directory : /var/lib/hadoop/mapred > Task scheduler : org.apache.hadoop.mapred.JobQueueTaskScheduler > JAVA_HOME directory : *+/usr/java/default+* > Create dirs/copy conf files : y > Proceed with generate configuration? (y/N) n > User aborted setup, exiting... > * > Resolution: > Amend line 509 in file /sbin/hadoop-setup-conf.sh > from: > JAVA_HOME=${USER_USER_JAVA_HOME:-$JAVA_HOME} > to: > JAVA_HOME=${USER_JAVA_HOME:-$JAVA_HOME} > will resolve this issue.
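The proposed fix hinges on shell default expansion: `${VAR:-fallback}` yields the fallback whenever `VAR` is unset or empty, and the misspelled `USER_USER_JAVA_HOME` is never set, so the user's answer at the prompt is silently dropped. A minimal sketch of the before/after behavior (the values are illustrative, mirroring the example session above):

```shell
#!/bin/sh
# Illustrative values mirroring the hadoop-setup-conf.sh prompt session.
JAVA_HOME=/usr/java/default      # the script's built-in default
USER_JAVA_HOME=/usr/lib/jvm/jre  # what the user typed at the prompt

# Buggy line 509: expands the misspelled (and therefore always unset)
# USER_USER_JAVA_HOME, so the :- fallback is taken unconditionally.
BUGGY=${USER_USER_JAVA_HOME:-$JAVA_HOME}

# Fixed line: expands the variable the prompt actually populates.
FIXED=${USER_JAVA_HOME:-$JAVA_HOME}

echo "buggy: $BUGGY"   # buggy: /usr/java/default
echo "fixed: $FIXED"   # fixed: /usr/lib/jvm/jre
```

With the one-word fix, an empty answer at the prompt still falls back to `/usr/java/default`, which is exactly what the `:-` operator is for.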
[jira] [Resolved] (HDFS-3977) Incompatible change between hadoop-1 and hadoop-2 when the dfs.hosts and dfs.hosts.exclude files are not present
[ https://issues.apache.org/jira/browse/HDFS-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta resolved HDFS-3977. --- Resolution: Invalid Thanks Todd. Resolving it as invalid. > Incompatible change between hadoop-1 and hadoop-2 when the dfs.hosts and > dfs.hosts.exclude files are not present > > > Key: HDFS-3977 > URL: https://issues.apache.org/jira/browse/HDFS-3977 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.2-alpha >Reporter: Arpit Gupta >Assignee: Arpit Gupta > > While testing hadoop-1 and hadoop-2 the following was noticed > if the files in the properties dfs.hosts and dfs.hosts.exclude do not exist > in hadoop-1 namenode format and start went through successfully. > in hadoop-2 we get a file not found exception and both the format and the > namenode start commands fail. > We should be logging a warning in the case when the file is not found so that > we are compatible with hadoop-1
[jira] [Updated] (HDFS-3978) Document backward incompatible changes between hadoop-1.x and 2.x
[ https://issues.apache.org/jira/browse/HDFS-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3978: -- Description: We should create a new site document to explicitly list down the known incompatible changes between hadoop 1.x and 2.x This will make it easier for users to determine these differences was: The following incompatible changes were noticed between branch-1 and branch-2 caused by HADOOP-8551 1. mkdir would create parent directories in branch-1 if they did not exist. In branch-2 users have to explicitly send mkdir -p 2. Create a multi level dir in branch 1 something like mkdir /test/1 /test would get permissions 755 and /test/1 would get the permissions based on your umask settings however if you run the command in branch-2 mkdir -p /test/1 both /test and /test/1 will get the permissions based on your umask. These are significant changes that we should document. > Document backward incompatible changes between hadoop-1.x and 2.x > - > > Key: HDFS-3978 > URL: https://issues.apache.org/jira/browse/HDFS-3978 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Arpit Gupta > > We should create a new site document to explicitly list down the known > incompatible changes between hadoop 1.x and 2.x > This will make it easier for users to determine these differences
[jira] [Updated] (HDFS-3978) Document backward incompatible changes between hadoop-1.x and 2.x
[ https://issues.apache.org/jira/browse/HDFS-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3978: -- Description: We should create a new site document to explicitly list down the known incompatible changes between hadoop 1.x and 2.x I believe this will make it easier for users to determine all the changes one needs to make when moving from 1.x to 2.x was: We should create a new site document to explicitly list down the known incompatible changes between hadoop 1.x and 2.x This will make it easier for users to determine these differences > Document backward incompatible changes between hadoop-1.x and 2.x > - > > Key: HDFS-3978 > URL: https://issues.apache.org/jira/browse/HDFS-3978 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Arpit Gupta > > We should create a new site document to explicitly list down the known > incompatible changes between hadoop 1.x and 2.x > I believe this will make it easier for users to determine all the changes one > needs to make when moving from 1.x to 2.x
[jira] [Commented] (HDFS-3978) Document backward incompatible changes between hadoop-1.x and 2.x
[ https://issues.apache.org/jira/browse/HDFS-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463917#comment-13463917 ] Arpit Gupta commented on HDFS-3978: --- The following incompatible changes were noticed between branch-1 and branch-2, caused by HADOOP-8551:
1. mkdir would create parent directories in branch-1 if they did not exist. In branch-2 users have to explicitly pass mkdir -p.
2. Creating a multi-level dir in branch-1, something like
mkdir /test/1
/test would get permissions 755 and /test/1 would get the permissions based on your umask settings. However, if you run the command in branch-2,
mkdir -p /test/1
both /test and /test/1 will get the permissions based on your umask.
These are significant changes that we should document. > Document backward incompatible changes between hadoop-1.x and 2.x > - > > Key: HDFS-3978 > URL: https://issues.apache.org/jira/browse/HDFS-3978 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Arpit Gupta > > The following incompatible changes were noticed between branch-1 and branch-2 > caused by HADOOP-8551 > 1. mkdir would create parent directories in branch-1 if they did not exist. > In branch-2 users have to explicitly send mkdir -p > 2. Create a multi level dir in branch 1 something like > mkdir /test/1 > /test would get permissions 755 and /test/1 would get the permissions based > on your umask settings > however if you run the command in branch-2 > mkdir -p /test/1 > both /test and /test/1 will get the permissions based on your umask. > These are significant changes that we should document.
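The permission difference can be made concrete with the usual umask arithmetic (a sketch, not HDFS code; the helper function is illustrative): a directory created without an explicit mode ends up with 0777 with the umask bits cleared, which is why /test came out as 755 in branch-1 under the common default umask of 022, while in branch-2 every level created by mkdir -p gets this umask-derived mode.

```shell
#!/bin/sh
# Illustrative helper: compute the mode a directory receives for a given
# umask, i.e. 0777 with the umask bits cleared. With the default umask of
# 022 this yields 755 (the mode /test got in branch-1); with a stricter
# umask such as 077, branch-2's behavior would tighten /test to 700 too.
umask_to_dir_mode() {
  # $1: umask in octal, e.g. 022
  printf '%03o\n' $(( 0777 & ~0$1 ))
}

umask_to_dir_mode 022   # 755
umask_to_dir_mode 077   # 700
```

The practical consequence documented above follows directly: under branch-1 only the leaf directory's mode depended on the caller's umask, while under branch-2 every intermediate directory created by mkdir -p does.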
[jira] [Updated] (HDFS-3978) Document backward incompatible changes between hadoop-1.x and 2.x
[ https://issues.apache.org/jira/browse/HDFS-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3978: -- Summary: Document backward incompatible changes between hadoop-1.x and 2.x (was: Document backward incompatible changes introduced by HADOOP-8551) > Document backward incompatible changes between hadoop-1.x and 2.x > - > > Key: HDFS-3978 > URL: https://issues.apache.org/jira/browse/HDFS-3978 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Arpit Gupta > > The following incompatible changes were noticed between branch-1 and branch-2 > caused by HADOOP-8551 > 1. mkdir would create parent directories in branch-1 if they did not exist. > In branch-2 users have to explicitly send mkdir -p > 2. Create a multi level dir in branch 1 something like > mkdir /test/1 > /test would get permissions 755 and /test/1 would get the permissions based > on your umask settings > however if you run the command in branch-2 > mkdir -p /test/1 > both /test and /test/1 will get the permissions based on your umask. > These are significant changes that we should document.