Re: Kerberised JobHistory Server not starting: User jhs trying to create the /mr-history/done directory

2017-07-20 Thread Kevin Buckley
On 21 July 2017 at 04:04, Erik Krogen  wrote:
> Hi Kevin,
>
> Since you are using the "jhs" keytab with principal "jhs/_h...@realm.tld",
> the JHS is authenticating itself as the jhs user (which is the actual
> important part, rather than the user the process is running as). If you want
> it to be the "mapred" user, you should change the keytab/principal you use
> (mapreduce.jobhistory.{principal,keytab}).

I'll certainly give that a go, Erik; however, the way I read the

>> The hadoop-2.8.0  docs SecureMode page also suggests that one would need to
>> play around with the
>>
>> hadoop.security.auth_to_local

bits suggested to me that if you set things up such that

===
$ hadoop org.apache.hadoop.security.HadoopKerberosName
jhs/co246a-9.ecs.vuw.ac...@ecs.vuw.ac.nz
17/07/20 17:42:50 INFO util.KerberosName: Non-simple name
mapred/co246a-9.ecs.vuw.ac...@ecs.vuw.ac.nz after auth_to_local rule
RULE:[2:$1/$2@$0](jhs/.*)s/jhs/mapred/
Name: jhs/co246a-9.ecs.vuw.ac...@ecs.vuw.ac.nz to
mapred/co246a-9.ecs.vuw.ac...@ecs.vuw.ac.nz
===

(or even used a rule that just mapped the principal to a simple "mapred",
because I tried that too!) told you it was remapping the user, then it
would remap all instances of that user within the Hadoop instance.
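For reference, the sort of core-site.xml stanza I'm describing looks like
this (a sketch only; the rule is the one from the output above, with the
usual DEFAULT fallback):

===
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1/$2@$0](jhs/.*)s/jhs/mapred/
    DEFAULT
  </value>
</property>
===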

Let's see.
Cheers again for the feedback.




Re: Access to yarn application log files for different user or group

2017-07-20 Thread Manfred Halper(Arge)
I just realized that this problem is solved in version 2.9 and in version
3.0 alpha 1 through 4. There you can set the following parameter in
yarn-site.xml: yarn.nodemanager.default-container-executor.log-dirs.permissions.
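A minimal yarn-site.xml sketch of that setting (the value here is my
assumption of a suitable octal mode; adjust to taste):

<property>
  <name>yarn.nodemanager.default-container-executor.log-dirs.permissions</name>
  <!-- assumed value: give the group read+execute on the log directories -->
  <value>750</value>
</property>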

I guess I will have to solve my problem with a bash script until version
2.9 is released, hopefully in November. Something along the lines of the
sketch below.
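A minimal sketch of such a workaround, assuming the NodeManager log root
(yarn.nodemanager.log-dirs) is /var/log/hadoop-yarn and that it runs, e.g.
via cron, as a user allowed to chmod there:

#!/usr/bin/env bash
# Sketch: re-open YARN container log directories that were created with
# umask 067. LOG_ROOT is an assumption; point it at
# <yarn.nodemanager.log-dirs>/userlogs on your cluster.
LOG_ROOT=/var/log/hadoop-yarn/userlogs

# Grant group read+execute on the application_*/container_* directories so
# group members can traverse them and read the *.log files inside.
find "$LOG_ROOT" -mindepth 1 -maxdepth 2 -type d -exec chmod g+rx {} +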



On 19.07.2017 at 12:07, had...@x5h.eu wrote:
> Hello,
>
> I am currently trying to access the application log files from yarn
> automatically. Yarn produces log files in the folder
> /userlogs/applicationid/containerid/*.log. The problem I have is that
> the generated application and container directories are created with
> a specific umask (067) that I can't seem to change. If I set an ACL, the
> log files get the correct permissions but the directories stay the same.
> It seems that the process that creates the application log directories
> applies a mask::--x entry and therefore overrides my ACL.
>
> I want to automatically comb through the log directories but with these
> default permissions set by hadoop I can't.
>
> What do I have to change so that the application log directories are
> readable by a specific group or user on that system?
>
> Cheers,
>
> Manfred.



Re: Kerberised JobHistory Server not starting: User jhs trying to create the /mr-history/done directory

2017-07-20 Thread Erik Krogen
Hi Kevin,

Since you are using the "jhs" keytab with principal "jhs/_h...@realm.tld",
the JHS is authenticating itself as the jhs user (which is the actual
important part, rather than the user the process is running as). If you
want it to be the "mapred" user, you should change the keytab/principal you
use (mapreduce.jobhistory.{principal,keytab}).
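A minimal mapred-site.xml sketch of what I mean (the property names are the
ones in the 2.8 SecureMode docs; the keytab path and realm here are
placeholders for your own):

<property>
  <name>mapreduce.jobhistory.keytab</name>
  <!-- placeholder path: a keytab holding the mapred principal -->
  <value>/etc/security/keytab/mapred.service.keytab</value>
</property>
<property>
  <name>mapreduce.jobhistory.principal</name>
  <value>mapred/_HOST@REALM.TLD</value>
</property>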

HTH,
Erik

On Wed, Jul 19, 2017 at 11:34 PM, Kevin Buckley <
kevin.buckley.ecs.vuw.ac...@gmail.com> wrote:

> My Hadoop 2.8.0's
>
> /mr-history/done
>
> directory is owned by the mapred user, who is in the hadoop group,
> and the directory has the permissions
>
> "/mr-history":mapred:hadoop:drwxrwx---
>
> If I run the Hadoop instance without any Kerberos config, and
> fire up the JobHistory server as the mapred user, everything
> works.
>
> If I flip over to a Kerberised environment, the NameNode and DataNodes,
> running as the 'hdfs' user, and the Resource and Node Managers, running
> as the 'yarn' user, all start up OK and their respective web UIs can be
> used.
>
>
> When I try to start up the JobHistory server, however, with
>
> /bin/su mapred -c
> '/local/Hadoop/hadoop-2.8.0/sbin/mr-jobhistory-daemon.sh --config
> /local/Hadoop/hadoop-2.8.0/etc/hadoop/ start historyserver'
>
> I get a message in the logs telling me that, rather than the mapred
> user doing things, a user 'jhs' is trying to do things, viz:
>
> 2017-07-20 18:15:09,667 INFO
> org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: registered UNIX
> signal handlers for [TERM, HUP, INT]
> 2017-07-20 18:15:10,062 INFO
> org.apache.hadoop.security.UserGroupInformation: Login successful for
> user jhs/co246a-9.ecs.vuw.ac...@ecs.vuw.ac.nz using keytab file
> /local/Hadoop/krb/jhs.service.keytab
> 2017-07-20 18:15:10,107 INFO
> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
> hadoop-metrics2.properties
> 2017-07-20 18:15:10,142 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric
> snapshot period at 10 second(s).
> 2017-07-20 18:15:10,142 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: JobHistoryServer
> metrics system started
> 2017-07-20 18:15:10,145 INFO
> org.apache.hadoop.mapreduce.v2.hs.JobHistory: JobHistory Init
> 2017-07-20 18:15:10,411 INFO
> org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default
> file system [hdfs://co246a-a.ecs.vuw.ac.nz:9000]
> 2017-07-20 18:15:10,518 INFO
> org.apache.hadoop.service.AbstractService: Service
> org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager failed in state
> INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
> Error creating done directory:
> [hdfs://co246a-a.ecs.vuw.ac.nz:9000/mr-history/done]
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating
> done directory: [hdfs://co246a-a.ecs.vuw.ac.nz:9000/mr-history/done]
> at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.tryCreatingHistoryDirs(HistoryFileManager.java:639)
> at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.createHistoryDirs(HistoryFileManager.java:585)
> at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:550)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.mapreduce.v2.hs.JobHistory.serviceInit(JobHistory.java:95)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceInit(JobHistoryServer.java:151)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:231)
> at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:241)
> Caused by: org.apache.hadoop.security.AccessControlException:
> Permission denied: user=jhs, access=EXECUTE,
> inode="/mr-history":mapred:hadoop:drwxrwx---
>
>
> But where has the jhs user come from?
>
> It doesn't appear to be set anywhere in any of the config files.
>
> According to the hadoop-2.8.0  docs SecureMode page,
>
> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/SecureMode.html
>
> =
> MapReduce JobHistory Server
>
> The MapReduce JobHistory Server keytab file, on that host, should look
> like the following:
>
> $ klist -e -k -t /etc/security/keytab/jhs.service.keytab
> Keytab name: FILE:/etc/security/keytab/jhs.service.keytab
> KVNO Timestamp Principal
>4 07/18/11 21:08:09 jhs/full.qualified.domain.n...@realm.tld
> (AES-256 CTS mode with 96-bit SHA-1 HMAC)
>4 07/18/11 21:08:09 jhs/full.qualified.domain.n...@realm.tld
> (AES-128 CTS mode with 96-bit SHA-1 HMAC)
>4 07/18/11 21:08:09 jhs/full.qualified.domain.n...@realm.tld
> 

RE: Namenode not able to come out of SAFEMODE

2017-07-20 Thread omprakash
Hi,

Thanks for the quick reply.

That was exactly the problem. The datanodes were throwing an error about
the maximum IPC message length not being large enough, but we were focusing
on the NameNodes only.

I changed the "ipc.maximum.data.length" property in the core-site.xml file
(I found this solution for the above error after searching the internet)
and restarted the namenodes and datanodes. After an hour all the blocks
were loaded successfully.
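The stanza in core-site.xml looks roughly like this (the value shown here
is illustrative; the default is 67108864, i.e. 64 MB):

<property>
  <name>ipc.maximum.data.length</name>
  <!-- illustrative: double the 64 MB default so large block reports fit -->
  <value>134217728</value>
</property>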

 

Thanks for the help again.

Regards
Om

From: surendra lilhore [mailto:surendra.lilh...@huawei.com] 
Sent: 20 July 2017 12:12
To: omprakash ; user@hadoop.apache.org
Subject: RE: Namenode not able to come out of SAFEMODE

 

Hi Omprakash,

The reported blocks 0 needs additional 6132675 blocks to reach the threshold
0.9990 of total blocks 6138814. The number of live datanodes 0 has reached
the minimum number 0. 

 

---> From this message it looks like the NameNode has loaded the block
info into memory, but the blocks reported by the DataNodes are 0. Maybe the
datanodes are not running; if they are running, please check why they have
not registered with the namenode. You can check the datanode logs for more
details.
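(For example, something like the following, run on the active namenode,
shows the live datanode count and the safe mode state:)

$ hdfs dfsadmin -report | head
$ hdfs dfsadmin -safemode get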

 

 

Regards,
Surendra

From: omprakash [ompraka...@cdac.in]
Sent: Thursday, July 20, 2017 12:12 AM
To:   user@hadoop.apache.org
Subject: Namenode not able to come out of SAFEMODE

Hi all,

I have a setup of a 3-node Hadoop cluster (Hadoop-2.8.0). I have deployed 2
namenodes that are configured in HA mode using QJM. 2 datanodes are
configured on the same machines where the namenodes are installed. The 3rd
node is used for quorum purposes only.

 

Setup

Node1 -> nn1, dn1, jn1, zkfc1, zkServer1

Node2 -> nn2, dn2, jn2, zkfc2, zkServer2

Node3 -> jn3,  zkServer3

 

I stopped the cluster for some reason (power-recycled the servers) and
since then I have not been able to start the cluster successfully. After
examining the logs I found that the namenodes are in safe mode and neither
of them is able to load the blocks into memory. Below is the namenode
status from the namenode UI.

 

Safe mode is ON. The reported blocks 0 needs additional 6132675 blocks to
reach the threshold 0.9990 of total blocks 6138814. The number of live
datanodes 0 has reached the minimum number 0. Safe mode will be turned off
automatically once the thresholds have been reached.

61,56,984 files and directories, 61,38,814 blocks = 1,22,95,798 total
filesystem object(s).

Heap Memory used 5.6 GB of 7.12 GB Heap Memory. Max Heap Memory is 13.33 GB.

Non Heap Memory used 45.19 MB of 49.75 MB Committed Non Heap Memory. Max
Non Heap Memory is 130 MB.

 

I have tried increasing HADOOP_HEAPSIZE and increasing the heap size in
HADOOP_NAMENODE_OPTS, but with no success.
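(Those were set along roughly these lines in hadoop-env.sh; the values
here are illustrative only:)

# hadoop-env.sh (illustrative values only)
export HADOOP_HEAPSIZE=8192   # daemon heap, in MB, on the 2.x line
export HADOOP_NAMENODE_OPTS="-Xmx12g ${HADOOP_NAMENODE_OPTS}"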

Need help.

Regards
Omprakash Paliwal
HPC-Medical and Bioinformatics Applications Group
Centre for Development of Advanced Computing (C-DAC)
Pune University campus,
PUNE-411007
Maharashtra, India
email:ompraka...@cdac.in
Contact : +91-20-25704231

 



---
[ C-DAC is on Social-Media too. Kindly follow us at:
Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ]

This e-mail is for the sole use of the intended recipient(s) and may
contain confidential and privileged information. If you are not the
intended recipient, please contact the sender by reply e-mail and destroy
all copies and the original message. Any unauthorized review, use,
disclosure, dissemination, forwarding, printing or copying of this email
is strictly prohibited and appropriate legal action will be taken.
---



RE: Namenode not able to come out of SAFEMODE

2017-07-20 Thread Brahma Reddy Battula

Did dn1 and dn2 restart successfully? Can you check the DN logs?
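(Something like the following, assuming the default log file naming under
$HADOOP_LOG_DIR, would show the datanodes' registration attempts and any
errors:)

$ grep -iE 'registered|exception|error' "$HADOOP_LOG_DIR"/hadoop-*-datanode-*.log | tail -n 20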


--Brahma Reddy Battula

From: omprakash [mailto:ompraka...@cdac.in]
Sent: 20 July 2017 12:12
To: user@hadoop.apache.org
Subject: Namenode not able to come out of SAFEMODE

Hi all,

I have a setup of a 3-node Hadoop cluster (Hadoop-2.8.0). I have deployed 2
namenodes that are configured in HA mode using QJM. 2 datanodes are configured
on the same machines where the namenodes are installed. The 3rd node is used
for quorum purposes only.

Setup
Node1 -> nn1, dn1, jn1, zkfc1, zkServer1
Node2 -> nn2, dn2, jn2, zkfc2, zkServer2
Node3 -> jn3,  zkServer3

I stopped the cluster for some reason (power-recycled the servers) and since
then I have not been able to start the cluster successfully. After examining
the logs I found that the namenodes are in safe mode and neither of them is
able to load the blocks into memory. Below is the namenode status from the
namenode UI.


Safe mode is ON. The reported blocks 0 needs additional 6132675 blocks to reach 
the threshold 0.9990 of total blocks 6138814. The number of live datanodes 0 
has reached the minimum number 0. Safe mode will be turned off automatically 
once the thresholds have been reached.

61,56,984 files and directories, 61,38,814 blocks = 1,22,95,798 total 
filesystem object(s).

Heap Memory used 5.6 GB of 7.12 GB Heap Memory. Max Heap Memory is 13.33 GB.

Non Heap Memory used 45.19 MB of 49.75 MB Committed Non Heap Memory. Max Non
Heap Memory is 130 MB.

I have tried increasing HADOOP_HEAPSIZE and increasing the heap size in
HADOOP_NAMENODE_OPTS, but with no success.
Need help.


Regards
Omprakash Paliwal
HPC-Medical and Bioinformatics Applications Group
Centre for Development of Advanced Computing (C-DAC)
Pune University campus,
PUNE-411007
Maharashtra, India
email:ompraka...@cdac.in
Contact : +91-20-25704231

