[jira] [Commented] (HDFS-1447) Make getGenerationStampFromFile() more efficient, so it doesn't reprocess full directory listing for every block

2015-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335597#comment-14335597
 ] 

Hadoop QA commented on HDFS-1447:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12499004/HDFS-1447.patch
  against trunk revision 9a37247.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9661//console

This message is automatically generated.

 Make getGenerationStampFromFile() more efficient, so it doesn't reprocess 
 full directory listing for every block
 

 Key: HDFS-1447
 URL: https://issues.apache.org/jira/browse/HDFS-1447
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 0.20.2
Reporter: Matt Foley
Assignee: Matt Foley
 Attachments: HDFS-1447.patch, Test_HDFS_1447_NotForCommitt.java.patch


 Make getGenerationStampFromFile() more efficient. Currently this routine is 
 called by addToReplicasMap() for every blockfile in the directory tree, and 
 it walks each file's containing directory on every call. There is a simple 
 refactoring that should make it more efficient.
 This work item is one of four sub-tasks for HDFS-1443, Improve Datanode 
 startup time.
 The fix will probably be folded into sibling task HDFS-1446, which is already 
 refactoring the method that calls getGenerationStampFromFile().
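
For illustration only, here is a rough sketch of the kind of refactoring being described; the names and the exact meta-file parsing are assumptions, not the attached patch. The caller lists each directory once and passes that listing to getGenerationStampFromFile(), so the per-block call scans an in-memory array instead of re-reading the directory.

{code}
import java.io.File;

class GenStampSketch {
  static final long GRANDFATHER_GENERATION_STAMP = 0;

  // The caller lists each directory a single time...
  static void addToReplicasMapSketch(File dir) {
    File[] files = dir.listFiles();   // one directory read per directory
    if (files == null) {
      return;
    }
    for (File f : files) {
      String name = f.getName();
      if (name.startsWith("blk_") && !name.endsWith(".meta")) {
        long genStamp = getGenerationStampFromFile(files, f);  // reuse listing
        // ... add the replica with genStamp to the replicas map ...
      }
    }
  }

  // ...so this no longer walks the containing directory on every call.
  static long getGenerationStampFromFile(File[] listdir, File blockFile) {
    String prefix = blockFile.getName() + "_";
    for (File f : listdir) {
      String name = f.getName();
      if (name.startsWith(prefix) && name.endsWith(".meta")) {
        // meta files are named blk_<id>_<generation stamp>.meta
        return Long.parseLong(
            name.substring(prefix.length(), name.length() - ".meta".length()));
      }
    }
    return GRANDFATHER_GENERATION_STAMP;  // no matching meta file found
  }
}
{code}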



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7763) fix zkfc hung issue due to not catching exception in a corner case

2015-02-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335658#comment-14335658
 ] 

Andrew Wang commented on HDFS-7763:
---

This looks good to me, though one little nit is we could do {{System.exit}} in 
a {{finally}}.

+1, I'll commit shortly.

 fix zkfc hung issue due to not catching exception in a corner case
 --

 Key: HDFS-7763
 URL: https://issues.apache.org/jira/browse/HDFS-7763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.6.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-7763-001.txt, HDFS-7763-002.txt, jstack.4936


 In our production cluster, both ZKFC processes hung after a ZooKeeper 
 network outage. The ZKFC log said:
 {code}
 2015-02-07,17:40:11,875 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 3334ms for sessionid 
 0x4a61bacdd9dfb2, closing socket connection and attempting reconnect
 2015-02-07,17:40:11,977 FATAL org.apache.hadoop.ha.ActiveStandbyElector: 
 Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further 
 znode monitoring connection errors.
 2015-02-07,17:40:12,425 INFO org.apache.zookeeper.ZooKeeper: Session: 
 0x4a61bacdd9dfb2 closed
 2015-02-07,17:40:12,425 FATAL org.apache.hadoop.ha.ZKFailoverController: 
 Fatal error occurred:Received stat error from Zookeeper. code:CONNECTIONLOSS. 
 Not retrying further znode monitoring connection errors.
 2015-02-07,17:40:12,425 INFO org.apache.hadoop.ipc.Server: Stopping server on 
 11300
 2015-02-07,17:40:12,425 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 INFO org.apache.zookeeper.ClientCnxn: EventThread 
 shut down
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
 Yielding from election
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC 
 Server Responder
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.HealthMonitor: Stopping 
 HealthMonitor thread
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC 
 Server listener on 11300
 {code}
 The thread dump has also been uploaded as an attachment.
 From the dump we can see that, because of unknown non-daemon threads 
 (pool-*-thread-*), the process did not exit, even though the critical threads, 
 such as the health monitor and RPC threads, had already been stopped. As a 
 result our watchdog (supervisord) did not observe that the ZKFC process was 
 down or abnormal, so the subsequent namenode failover could not happen as 
 expected.
 There are two possible fixes here: 1) figure out where the unnamed threads, 
 like pool-7-thread-1, come from and either close them or set their daemon 
 property (I searched but have found nothing so far); 2) catch the exception 
 from ZKFailoverController.run() so execution can continue to System.exit. 
 The attached patch implements option 2).
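
A minimal sketch of option 2), assuming a simplified main method (the real DFSZKFailoverController code is more involved): catching Throwable around the run call, and keeping {{System.exit}} in a {{finally}} as suggested in the review comment above, guarantees the JVM terminates even when stray non-daemon threads are still alive.

{code}
public class ZkfcExitSketch {
  public static void main(String[] args) {
    int rc = 1;
    try {
      rc = runFailoverController(args);   // stand-in for ZKFailoverController.run()
    } catch (Throwable t) {
      // Without this catch, an unexpected runtime exception can leave
      // non-daemon threads (pool-*-thread-*) alive and the process hung.
      t.printStackTrace();
    } finally {
      System.exit(rc);                    // always terminates the JVM
    }
  }

  // Hypothetical placeholder for the real election / health-monitor loop.
  private static int runFailoverController(String[] args) {
    return 0;
  }
}
{code}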



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7837) Allocate and persist striped blocks in FSNamesystem

2015-02-24 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-7837:
---

 Summary: Allocate and persist striped blocks in FSNamesystem
 Key: HDFS-7837
 URL: https://issues.apache.org/jira/browse/HDFS-7837
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao


Try to finish the remaining work from HDFS-7339 (except the 
ClientProtocol/DFSClient part):
# Allow FSNamesystem#getAdditionalBlock to create striped blocks and persist 
striped blocks to editlog
# Update FSImage for max allocated striped block ID
# Update the block commit/complete logic in BlockManager



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7836) BlockManager Scalability Improvements

2015-02-24 Thread Charles Lamb (JIRA)
Charles Lamb created HDFS-7836:
--

 Summary: BlockManager Scalability Improvements
 Key: HDFS-7836
 URL: https://issues.apache.org/jira/browse/HDFS-7836
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Charles Lamb
Assignee: Charles Lamb


Improvements to BlockManager scalability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1732) Enhance NNThroughputBenchmark to observe scale-dependent changes in IBR processing

2015-02-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1732:
---
Fix Version/s: (was: 0.24.0)

 Enhance NNThroughputBenchmark to observe scale-dependent changes in IBR 
 processing
 --

 Key: HDFS-1732
 URL: https://issues.apache.org/jira/browse/HDFS-1732
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 0.22.0
Reporter: Matt Foley
Assignee: Matt Foley
 Attachments: IBRBenchmark_apachetrunk_v8.patch, 
 IBRBenchmark_apachetrunk_v9.patch


 Rework NNThroughputBenchmark to provide more detailed info about Initial 
 Block Report (IBR) processing time, as a function of number of nodes, number 
 of unique blocks, and total number of replicas.  
 Allow both direct local communication and remote and local RPC, so we can see 
 how much impact RPC overhead has on IBR processing time.
 Also plug some holes in performance-specific logging of Block Report 
 processing, so that consistent and complete data are logged from both 
 Namenode and Datanode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1447) Make getGenerationStampFromFile() more efficient, so it doesn't reprocess full directory listing for every block

2015-02-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1447:
---
Status: Open  (was: Patch Available)

 Make getGenerationStampFromFile() more efficient, so it doesn't reprocess 
 full directory listing for every block
 

 Key: HDFS-1447
 URL: https://issues.apache.org/jira/browse/HDFS-1447
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: 0.20.2
Reporter: Matt Foley
Assignee: Matt Foley
 Attachments: HDFS-1447.patch, Test_HDFS_1447_NotForCommitt.java.patch


 Make getGenerationStampFromFile() more efficient. Currently this routine is 
 called by addToReplicasMap() for every blockfile in the directory tree, and 
 it walks each file's containing directory on every call. There is a simple 
 refactoring that should make it more efficient.
 This work item is one of four sub-tasks for HDFS-1443, Improve Datanode 
 startup time.
 The fix will probably be folded into sibling task HDFS-1446, which is already 
 refactoring the method that calls getGenerationStampFromFile().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1732) Enhance NNThroughputBenchmark to observe scale-dependent changes in IBR processing

2015-02-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1732:
---
Status: Open  (was: Patch Available)

Cancelling patch as it no longer applies to trunk.

 Enhance NNThroughputBenchmark to observe scale-dependent changes in IBR 
 processing
 --

 Key: HDFS-1732
 URL: https://issues.apache.org/jira/browse/HDFS-1732
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 0.22.0
Reporter: Matt Foley
Assignee: Matt Foley
 Attachments: IBRBenchmark_apachetrunk_v8.patch, 
 IBRBenchmark_apachetrunk_v9.patch


 Rework NNThroughputBenchmark to provide more detailed info about Initial 
 Block Report (IBR) processing time, as a function of number of nodes, number 
 of unique blocks, and total number of replicas.  
 Allow both direct local communication and remote and local RPC, so we can see 
 how much impact RPC overhead has on IBR processing time.
 Also plug some holes in performance-specific logging of Block Report 
 processing, so that consistent and complete data are logged from both 
 Namenode and Datanode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7840) cdd

2015-02-24 Thread ashu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ashu updated HDFS-7840:
---
Description: 

cdc

  was:
We are trying to set the following properties in Hue LDAP section of our 
environment so that all 
usernames are forced to lowercase and so authentication ignores case. 
This will avoid new user’s home folders being in UPPERCASE, causing access 
errors with HDFS which parses to lowercase.

Configuration to Set/Append:

[[ldap]]
ignore_username_case=true
force_username_lowercase=true

Problem:

Cannot identify proper configuration file to edit (tried runtime Hue.ini and 
safety-valve files). 
Have edited the following files and restarted Hue service, and the runtime 
Hue.ini still 
does not show changes made. 
(8378 is the current process as of this email). 
Cloudera Manager also does not expose these parameters, but does offer a field 
for safety-valve entries.

1.   
/var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue_safety_valve.ini
a.   currently contains above LDAP section
2.   /var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue.ini
a.   does not show safety valve changes; editing the file directly does not 
work since changes are lost at the next service restart.
3.   
/opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/etc/hue/conf.empty/hue.ini
a.   File is empty


 Please provide the correct files to edit and which services need to be 
restarted for the change to take effect.


Summary: cdd  (was: errors with HDFS which parses to lowercase.)

 cdd
 ---

 Key: HDFS-7840
 URL: https://issues.apache.org/jira/browse/HDFS-7840
 Project: Hadoop HDFS
  Issue Type: Bug
 Environment: Hadoop
Reporter: ashu

 cdc



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7840) cdd

2015-02-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HDFS-7840.

Resolution: Invalid

 cdd
 ---

 Key: HDFS-7840
 URL: https://issues.apache.org/jira/browse/HDFS-7840
 Project: Hadoop HDFS
  Issue Type: Bug
 Environment: Hadoop
Reporter: ashu

 cdc



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7840) errors with HDFS which parses to lowercase.

2015-02-24 Thread ashu (JIRA)
ashu created HDFS-7840:
--

 Summary: errors with HDFS which parses to lowercase.
 Key: HDFS-7840
 URL: https://issues.apache.org/jira/browse/HDFS-7840
 Project: Hadoop HDFS
  Issue Type: Bug
 Environment: Hadoop
Reporter: ashu


We are trying to set the following properties in Hue LDAP section of our 
environment so that all 
usernames are forced to lowercase and so authentication ignores case. 
This will avoid new user’s home folders being in UPPERCASE, causing access 
errors with HDFS which parses to lowercase.

Configuration to Set/Append:

[[ldap]]
ignore_username_case=true
force_username_lowercase=true

Problem:

Cannot identify proper configuration file to edit (tried runtime Hue.ini and 
safety-valve files). 
Have edited the following files and restarted Hue service, and the runtime 
Hue.ini still 
does not show changes made. 
(8378 is the current process as of this email). 
Cloudera Manager also does not expose these parameters, but does offer a field 
for safety-valve entries.

1.   
/var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue_safety_valve.ini
a.   currently contains above LDAP section
2.   /var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue.ini
a.   does not show safety valve changes; editing the file directly does not 
work since changes are lost at the next service restart.
3.   
/opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/etc/hue/conf.empty/hue.ini
a.   File is empty


 Please provide the correct files to edit and which services need to be 
restarted for the change to take effect.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7841) access errors with HDFS which parses to lowercase.

2015-02-24 Thread ankush (JIRA)
ankush created HDFS-7841:


 Summary: access errors with HDFS which parses to lowercase.
 Key: HDFS-7841
 URL: https://issues.apache.org/jira/browse/HDFS-7841
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ankush


We are trying to set the following properties in Hue LDAP section of our  
environment so that all usernames are forced to lowercase and so authentication 
ignores case. This will avoid new user’s home folders being in UPPERCASE, 
causing access errors with HDFS which parses to lowercase.

[[ldap]]
ignore_username_case=true
force_username_lowercase=true

Problem:

Cannot identify proper configuration file to edit (tried runtime Hue.ini and 
safety-valve files). Have edited the following files and restarted Hue service, 
and the runtime Hue.ini still does not show changes made. (8378 is the current 
process as of this email). Cloudera Manager also does not expose these 
parameters, but does offer a field for safety-valve entries.

1.   
/var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue_safety_valve.ini
a.   currently contains above LDAP section
2.   /var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue.ini
a.   does not show safety valve changes; editing the file directly does not 
work since changes are lost at the next service restart.
3.   
/opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/etc/hue/conf.empty/hue.ini
a.   File is empty

Support Needed:

 Please provide the correct files to edit and which services need to be 
restarted for the change to take effect.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7538) removedDst should be checked against null in the finally block of FSDirRenameOp#unprotectedRenameTo()

2015-02-24 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335972#comment-14335972
 ] 

Binglin Chang commented on HDFS-7538:
-

Hi [~tedyu], the patch is out of date and I think the bug no longer exists; 
should this be resolved?

 removedDst should be checked against null in the finally block of 
 FSDirRenameOp#unprotectedRenameTo()
 -

 Key: HDFS-7538
 URL: https://issues.apache.org/jira/browse/HDFS-7538
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hdfs-7538-001.patch


 {code}
 if (removedDst != null) {
   undoRemoveDst = false;
   ...
   if (undoRemoveDst) {
     // Rename failed - restore dst
     if (dstParent.isDirectory() && dstParent.asDirectory().isWithSnapshot()) {
       dstParent.asDirectory().undoRename4DstParent(removedDst,
 {code}
 If the first if check doesn't pass, removedDst would be null while 
 undoRemoveDst may be true. This combination would lead to a 
 NullPointerException in the finally block.
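
For illustration, a hedged sketch of a guard along the lines described (mirroring the fragment above; not necessarily the attached patch, and the remaining arguments are left elided as in the original snippet):

{code}
// Adding a null check to the condition keeps the finally block from
// dereferencing a null removedDst when the rename fails before it is assigned.
if (undoRemoveDst && removedDst != null) {
  // Rename failed - restore dst
  if (dstParent.isDirectory() && dstParent.asDirectory().isWithSnapshot()) {
    dstParent.asDirectory().undoRename4DstParent(removedDst,
        ... /* remaining arguments unchanged from the snippet above */);
  }
}
{code}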



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7841) fff

2015-02-24 Thread ankush (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ankush updated HDFS-7841:
-
Description: hv  (was: We are trying to set the following properties in Hue 
LDAP section of our  environment so that all usernames are forced to lowercase 
and so authentication ignores case. This will avoid new user’s home folders 
being in UPPERCASE, causing access errors with HDFS which parses to lowercase.

[[ldap]]
ignore_username_case=true
force_username_lowercase=true

Problem:

Cannot identify proper configuration file to edit (tried runtime Hue.ini and 
safety-valve files). Have edited the following files and restarted Hue service, 
and the runtime Hue.ini still does not show changes made. (8378 is the current 
process as of this email). Cloudera Manager also does not expose these 
parameters, but does offer a field for safety-valve entries.

1.   
/var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue_safety_valve.ini
a.   currently contains above LDAP section
2.   /var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue.ini
a.   does not show safety valve changes; editing the file directly does not 
work since changes are lost at the next service restart.
3.   
/opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/etc/hue/conf.empty/hue.ini
a.   File is empty

Support Needed:

 Please provide the correct files to edit and which services need to be 
restarted for the change to take effect.
)
Summary: fff  (was: access errors with HDFS which parses to lowercase.)

 fff
 ---

 Key: HDFS-7841
 URL: https://issues.apache.org/jira/browse/HDFS-7841
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ankush

 hv



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7784) load fsimage in parallel

2015-02-24 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335929#comment-14335929
 ] 

Walter Su commented on HDFS-7784:
-

I used VisualVM to profile the loading process and found that the bottleneck 
is deserialization taking too much CPU time, not disk I/O. The test 
(test-20150213.pdf) uses three 7200rpm hard disks in RAID0. I tried 
single-threaded startups with and without clearing the buffer cache, and the 
difference is very small.

 load fsimage in parallel
 

 Key: HDFS-7784
 URL: https://issues.apache.org/jira/browse/HDFS-7784
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor
 Attachments: HDFS-7784.001.patch, test-20150213.pdf


 When a single Namenode holds a huge number of files without using federation, 
 startup/restart is slow, and fsimage loading takes most of the time. fsimage 
 loading can be separated into two parts: deserialization and object 
 construction (mostly map insertion). Deserialization takes most of the CPU 
 time, so we can do deserialization in parallel and add to the hashmap 
 serially. This should significantly reduce NN start time.
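
A minimal sketch of the idea (not the HDFS-7784 patch; the class and method names are illustrative and the real loader parses protobuf sections of the fsimage): deserialize entries on a thread pool, then insert into the map from a single thread, since the CPU cost is in deserialization while map insertion stays serial.

{code}
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLoadSketch {
  // Deserialize in parallel, insert into the map serially.
  static Map<Long, byte[]> load(List<byte[]> serializedInodes) throws Exception {
    ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    try {
      List<Future<Map.Entry<Long, byte[]>>> futures = new ArrayList<>();
      for (byte[] bytes : serializedInodes) {
        futures.add(pool.submit(() -> deserialize(bytes)));  // CPU-bound, parallel
      }
      Map<Long, byte[]> inodeMap = new HashMap<>();
      for (Future<Map.Entry<Long, byte[]>> f : futures) {
        Map.Entry<Long, byte[]> e = f.get();
        inodeMap.put(e.getKey(), e.getValue());              // serial insertion
      }
      return inodeMap;
    } finally {
      pool.shutdown();
    }
  }

  // Stand-in for decoding one serialized inode entry.
  static Map.Entry<Long, byte[]> deserialize(byte[] bytes) {
    long fakeId = bytes.length;  // placeholder id
    return new AbstractMap.SimpleEntry<>(fakeId, bytes);
  }
}
{code}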



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7841) access errors with HDFS which parses to lowercase.

2015-02-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HDFS-7841.

Resolution: Invalid

Sorry, but none of this has anything to do with Apache Hadoop.  Please contact 
Cloudera for support.

 access errors with HDFS which parses to lowercase.
 --

 Key: HDFS-7841
 URL: https://issues.apache.org/jira/browse/HDFS-7841
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ankush

 We are trying to set the following properties in Hue LDAP section of our  
 environment so that all usernames are forced to lowercase and so 
 authentication ignores case. This will avoid new user’s home folders being in 
 UPPERCASE, causing access errors with HDFS which parses to lowercase.
 [[ldap]]
 ignore_username_case=true
 force_username_lowercase=true
 Problem:
 Cannot identify proper configuration file to edit (tried runtime Hue.ini and 
 safety-valve files). Have edited the following files and restarted Hue 
 service, and the runtime Hue.ini still does not show changes made. (8378 is 
 the current process as of this email). Cloudera Manager also does not expose 
 these parameters, but does offer a field for safety-valve entries.
 1.   
 /var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue_safety_valve.ini
 a.   currently contains above LDAP section
 2.   /var/run/cloudera-scm-agent/process/8378-hue-HUE_SERVER/hue.ini
 a.   does not show safety valve changes; editing the file directly does not 
 work since changes are lost at the next service restart.
 3.   
 /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/etc/hue/conf.empty/hue.ini
 a.   File is empty
 Support Needed:
  Please provide the correct files to edit and which services need to be 
 restarted for the change to take effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7538) removedDst should be checked against null in the finally block of FSDirRenameOp#unprotectedRenameTo()

2015-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-7538:
-
Resolution: Not a Problem
Status: Resolved  (was: Patch Available)

 removedDst should be checked against null in the finally block of 
 FSDirRenameOp#unprotectedRenameTo()
 -

 Key: HDFS-7538
 URL: https://issues.apache.org/jira/browse/HDFS-7538
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hdfs-7538-001.patch


 {code}
 if (removedDst != null) {
   undoRemoveDst = false;
   ...
   if (undoRemoveDst) {
     // Rename failed - restore dst
     if (dstParent.isDirectory() && dstParent.asDirectory().isWithSnapshot()) {
       dstParent.asDirectory().undoRename4DstParent(removedDst,
 {code}
 If the first if check doesn't pass, removedDst would be null while 
 undoRemoveDst may be true. This combination would lead to a 
 NullPointerException in the finally block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.

2015-02-24 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-7722:

Attachment: HDFS-7722.001.patch

Hi, [~cmccabe]. Thanks for reviewing. 

I updated the patch based on your input. 

Now {{checkDirs()}} shares the same logic with {{DataNode#refreshVolumes()}}, 
because we'd like to remove everything about the failed volumes, i.e., the 
{{blockInfos}} and {{FsVolumeImpls}} in {{FsDataset}} and the storage dirs in 
{{DataStorage}}. The existing {{checkDirs()}} logic only removes the 
{{blockInfo}} and {{FsVolumeImpl}} in {{FsDataset}}, so {{checkDirs()}} now 
returns the failed volumes all the way back to {{DataNode}}.

For that reason, I chose to let {{checkDirs()}} return {{Set<File>}} instead 
of {{Set<FsVolumeImpl>}}/{{FsVolumeRef}}, since these volumes will be consumed 
in {{DataNode}}. I think {{FsVolumeRef}} should only be used when there is I/O 
on the volume.

Would you mind taking another look?

bq.  Please remember that this scans all files on a volume, which is an 
expensive operation.

{{FsVolumeList#checkDirs}} only checks access permissions on all 
subdirectories and does not read files. I agree that it can still be 
problematic; I will file a follow-up JIRA to throttle it.
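
For readers following along, a minimal sketch of the shape being discussed, under simplified, assumed names ({{volumeRoots}} and {{hasUsableDirs}} are illustrative, not the real {{FsVolumeList}}/{{FsDatasetImpl}} API): {{checkDirs()}} performs access checks only and hands the failed volume roots back as a {{Set<File>}} for {{DataNode}} to act on.

{code}
import java.io.File;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CheckDirsSketch {
  // Returns the roots of volumes whose directories failed the access check,
  // so the caller (DataNode) can remove block infos, the FsVolumeImpl, and
  // the corresponding DataStorage/BlockPoolSliceStorage entries.
  static Set<File> checkDirs(List<File> volumeRoots) {
    Set<File> failedVolumes = new HashSet<>();
    for (File root : volumeRoots) {
      if (!hasUsableDirs(root)) {
        failedVolumes.add(root);
      }
    }
    return failedVolumes;
  }

  // Permission/access checks only; no file contents are read, so the cost is
  // in walking many subdirectories rather than in per-file I/O.
  private static boolean hasUsableDirs(File dir) {
    if (!dir.isDirectory() || !dir.canRead() || !dir.canWrite() || !dir.canExecute()) {
      return false;
    }
    File[] children = dir.listFiles(File::isDirectory);
    if (children == null) {
      return false;
    }
    for (File child : children) {
      if (!hasUsableDirs(child)) {
        return false;
      }
    }
    return true;
  }
}
{code}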


 DataNode#checkDiskError should also remove Storage when error is found.
 ---

 Key: HDFS-7722
 URL: https://issues.apache.org/jira/browse/HDFS-7722
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
 Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch


 When {{DataNode#checkDiskError}} finds disk errors, it removes all block 
 metadata from {{FsDatasetImpl}}. However, it does not remove the 
 corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. 
 As a result, we cannot directly run {{reconfig}} to hot-swap the failed 
 disks without changing the configuration file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.

2015-02-24 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-7722:

Status: Patch Available  (was: Open)

 DataNode#checkDiskError should also remove Storage when error is found.
 ---

 Key: HDFS-7722
 URL: https://issues.apache.org/jira/browse/HDFS-7722
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
 Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch


 When {{DataNode#checkDiskError}} finds disk errors, it removes all block 
 metadata from {{FsDatasetImpl}}. However, it does not remove the 
 corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. 
 As a result, we cannot directly run {{reconfig}} to hot-swap the failed 
 disks without changing the configuration file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7763) fix zkfc hung issue due to not catching exception in a corner case

2015-02-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-7763:
--
   Resolution: Fixed
Fix Version/s: 2.7.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks for the nice find and fix 
[~xieliang007]!

 fix zkfc hung issue due to not catching exception in a corner case
 --

 Key: HDFS-7763
 URL: https://issues.apache.org/jira/browse/HDFS-7763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.6.0
Reporter: Liang Xie
Assignee: Liang Xie
 Fix For: 2.7.0

 Attachments: HDFS-7763-001.txt, HDFS-7763-002.txt, jstack.4936


 In our production cluster, both ZKFC processes hung after a ZooKeeper 
 network outage. The ZKFC log said:
 {code}
 2015-02-07,17:40:11,875 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 3334ms for sessionid 
 0x4a61bacdd9dfb2, closing socket connection and attempting reconnect
 2015-02-07,17:40:11,977 FATAL org.apache.hadoop.ha.ActiveStandbyElector: 
 Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further 
 znode monitoring connection errors.
 2015-02-07,17:40:12,425 INFO org.apache.zookeeper.ZooKeeper: Session: 
 0x4a61bacdd9dfb2 closed
 2015-02-07,17:40:12,425 FATAL org.apache.hadoop.ha.ZKFailoverController: 
 Fatal error occurred:Received stat error from Zookeeper. code:CONNECTIONLOSS. 
 Not retrying further znode monitoring connection errors.
 2015-02-07,17:40:12,425 INFO org.apache.hadoop.ipc.Server: Stopping server on 
 11300
 2015-02-07,17:40:12,425 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
 2015-02-07,17:40:12,426 INFO org.apache.zookeeper.ClientCnxn: EventThread 
 shut down
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
 Yielding from election
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC 
 Server Responder
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.HealthMonitor: Stopping 
 HealthMonitor thread
 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC 
 Server listener on 11300
 {code}
 The thread dump has also been uploaded as an attachment.
 From the dump we can see that, because of unknown non-daemon threads 
 (pool-*-thread-*), the process did not exit, even though the critical threads, 
 such as the health monitor and RPC threads, had already been stopped. As a 
 result our watchdog (supervisord) did not observe that the ZKFC process was 
 down or abnormal, so the subsequent namenode failover could not happen as 
 expected.
 There are two possible fixes here: 1) figure out where the unnamed threads, 
 like pool-7-thread-1, come from and either close them or set their daemon 
 property (I searched but have found nothing so far); 2) catch the exception 
 from ZKFailoverController.run() so execution can continue to System.exit. 
 The attached patch implements option 2).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7836) BlockManager Scalability Improvements

2015-02-24 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7836:
---
Attachment: BlockManagerScalabilityImprovementsDesign.pdf

 BlockManager Scalability Improvements
 -

 Key: HDFS-7836
 URL: https://issues.apache.org/jira/browse/HDFS-7836
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: BlockManagerScalabilityImprovementsDesign.pdf


 Improvements to BlockManager scalability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

