Can't list files in a federation of HDFS

2015-03-03 Thread xeonmailinglist

Hi,

I have configured an HDFS federation across 2 hosts (hadoop-coc-1 and 
hadoop-coc-2). In the configuration, I have set a namespace on each 
host, and a single data node (see image). The service is running 
properly. You can check the output of the |jps| command in [1].


The strange part is that, when I list the files, they do not appear on 
hadoop-coc-2. You can check the output in [2]. Why does this happen?
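
In case it helps to reproduce, this is a minimal sketch of how I would address 
each namenode explicitly (the 9000 port is only an assumption here; substitute 
whatever each namenode's fs.defaultFS actually uses):

|# List the namespace served by each namenode explicitly
hdfs dfs -ls hdfs://hadoop-coc-1:9000/
hdfs dfs -ls hdfs://hadoop-coc-2:9000/
|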


[image: hdfsfederation]

[1]: jps output

|xubuntu@hadoop-coc-1:~/Programs/hadoop$ jps
21538 NameNode
21773 DataNode

xubuntu@hadoop-coc-2:~/Programs/hadoop$ jps
2342 NameNode
|

[2]: hdfs dfs -ls / output

|xubuntu@hadoop-coc-1:~/Programs/hadoop$ hdfs dfs -ls /
Java HotSpot(TM) Client VM warning: You have loaded library 
/home/xubuntu/Programs/hadoop-2.6.0/lib/native/libhadoop.so which might have 
disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c ', 
or link it with '-z noexecstack'.
15/03/03 05:09:04 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - xubuntu supergroup  0 2015-03-03 04:47 /input1

xubuntu@hadoop-coc-2:~/Programs/hadoop$ hdfs dfs -ls /
Java HotSpot(TM) Client VM warning: You have loaded library 
/home/xubuntu/Programs/hadoop-2.6.0/lib/native/libhadoop.so which might have 
disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c ', 
or link it with '-z noexecstack'.
15/03/03 05:09:07 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
|




Re: The Activities of Apache Hadoop Community

2015-03-03 Thread Akira AJISAKA

Hi all,

One year after the previous post, we collected and analyzed
JIRA tickets again to investigate the activities of the Apache Hadoop
community in 2014.

http://ajisakaa.blogspot.com/2015/02/the-activities-of-apache-hadoop.html

As we expected in the previous post, the activity of the
Apache Hadoop community continued to expand in 2014.
We hope the same will be true in 2015.

Thanks,
Akira

On 2/13/14 11:20, Akira AJISAKA wrote:

Hi all,

We collected and analyzed JIRA tickets to investigate
the activities of the Apache Hadoop community in 2013.

http://ajisakaa.blogspot.com/2014/02/the-activities-of-apache-hadoop.html

We counted the number of organizations, the lines
of code, and the number of issues. As a result, we
confirmed that all of them are increasing and that the
Hadoop community is getting more active.
We appreciate the continuous contributions of developers
and we hope the activity will keep expanding in 2014.

Thanks,
Akira





RE: how to check hdfs

2015-03-03 Thread Somnath Pandeya
Is your HDFS daemon running on the cluster?

From: Vikas Parashar [mailto:para.vi...@gmail.com]
Sent: Tuesday, March 03, 2015 10:33 AM
To: user@hadoop.apache.org
Subject: Re: how to check hdfs

Hi,

Kindly install the hadoop-hdfs RPM on your machine.

Rg:
Vicky

On Mon, Mar 2, 2015 at 11:19 PM, Shengdi Jin <jinshen...@gmail.com> wrote:
Hi all,
I have just started to learn Hadoop and I have a naive question.
I used
hdfs dfs -ls /home/cluster
to check the content inside.
But I get the error
ls: No FileSystem for scheme: hdfs
My configuration file core-site.xml is like

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

hdfs-site.xml is like

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:/home/cluster/mydata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:/home/cluster/mydata/hdfs/datanode</value>
  </property>
</configuration>
Is there anything wrong?
Thanks a lot.



Re: how to check hdfs

2015-03-03 Thread 杨浩
I don't think it is necessary to have the daemon running on that client to run
the command, and hdfs is not a Hadoop daemon.

2015-03-03 20:57 GMT+08:00 Somnath Pandeya :

>  Is your HDFS daemon running on the cluster?


Re: How to find bottlenecks of the cluster ?

2015-03-03 Thread 杨浩
I think a benchmark will help, since it can show the
execution speed of I/O-bound and CPU-bound jobs.
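
For example, the benchmarks that ship with Hadoop can be launched along these
lines (only a sketch; jar names and paths depend on your installation):

# HDFS I/O throughput (TestDFSIO lives in the jobclient tests jar)
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
    TestDFSIO -write -nrFiles 10 -fileSize 1000

# CPU- and shuffle-heavy job: generate data, then sort it
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    teragen 10000000 /bench/teragen
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    terasort /bench/teragen /bench/terasort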

2015-03-02 19:01 GMT+08:00 Adrien Mogenet 
:

> This does not make much sense as asked; you have to tell us under which conditions you want
> to find a bottleneck.
>
> Regardless of the workload, we mostly use OpenTSDB to check CPU time (iowait
> / user / sys / idle), disk usage (await, I/Os in progress, ...) and memory
> (NUMA allocations, buffers, cache, dirty pages, ...).
>
> On 2 March 2015 at 08:20, Krish Donald  wrote:
>
>> Basically we have 4 points to consider: CPU, memory, I/O and network.
>>
>> So how do we see which one is causing the bottleneck?
>> What parameters should we consider, etc.?
>>
>> On Sun, Mar 1, 2015 at 10:57 PM, Nishanth S 
>> wrote:
>>
>>> This is a vast topic. Can you tell us what components are in your
>>> data pipeline, how data flows into the system, and how it is
>>> processed? There are several built-in tests, like TestDFSIO and TeraSort, that
>>> you can run.
>>>
>>> -Nishan
>>>
>>> On Sun, Mar 1, 2015 at 9:45 PM, Krish Donald 
>>> wrote:
>>>
 Hi,

 I wanted to understand: how should we find out the bottlenecks of the
 cluster?

 Thanks
 Krish

>>>
>>>
>>
>
>
> --
>
> *Adrien Mogenet*
> Head of Backend/Infrastructure
> adrien.moge...@contentsquare.com
> (+33)6.59.16.64.22
> http://www.contentsquare.com
> 4, avenue Franklin D. Roosevelt - 75008 Paris
>


DataNode: long GC times (page faults?)

2015-03-03 Thread Dmitry Simonov
Hello!

We are using an HA Hadoop (v2.5.1) cluster with 3 DataNodes on Windows Server
2008 R2 with 24 GB RAM.
Sometimes we observe long GC times, e.g.:

1029537.175: [GC (Allocation Failure) 1029537.176: [ParNew:
5459023K->309990K(5662336K), 32.6609790 secs]
10328788K->5583905K(11953792K), 32.6611705 secs] [Times: user=0.92
sys=0.06, real=32.65 secs]

Parameters:
set GC_OPTS=-XX:+UseNUMA -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled -XX:ConcGCThreads=4 -XX:ParallelGCThreads=15
-Xmx12G -Xms12G -Xmn6G

java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b18)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

Could you please give some advice on how to troubleshoot this issue?

Best regards, Dmitrii Simonov.


Re: how to check hdfs

2015-03-03 Thread Shengdi Jin
I used the command
./hdfs dfs -ls hdfs://master:9000/
and it works. So I think hdfs://master:9000/ should be the HDFS.

I have another question: if I run
./hdfs dfs -mkdir hdfs://master:9000/directory
where is /directory stored?
On the DataNode, on the NameNode, or in the local filesystem of master?

On Tue, Mar 3, 2015 at 8:06 AM, 杨浩  wrote:

> I don't think it is necessary to have the daemon running on that client to run
> the command, and hdfs is not a Hadoop daemon.


Re: how to check hdfs

2015-03-03 Thread Vikas Parashar
Hello,

  hdfs dfs -ls /home/cluster
to check the content inside.
But I get error
ls: No FileSystem for scheme: hdfs --> that means you don't have the
hadoop-hdfs RPM installed on your client machine.


To answer your question about
./hdfs dfs -mkdir hdfs://master:9000/directory


That directory will be under / in your HDFS. All data is stored on the
datanodes, but the namenode holds the metadata. For more
details, you should read the HDFS design document:
http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
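
If you want to see that split for yourself, something like the following shows
which datanodes hold the blocks under a path (just a sketch; run it as a user
with access to that path):

# Show files, their blocks, and the datanodes holding each replica
hdfs fsck /directory -files -blocks -locations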

On Tue, Mar 3, 2015 at 10:46 PM, Shengdi Jin  wrote:

> I used the command
> ./hdfs dfs -ls hdfs://master:9000/
> and it works. So I think hdfs://master:9000/ should be the HDFS.
>
> I have another question: if I run
> ./hdfs dfs -mkdir hdfs://master:9000/directory
> where is /directory stored?
> On the DataNode, on the NameNode, or in the local filesystem of master?
>


configure a backup namenode

2015-03-03 Thread Shengdi Jin
Hi all,

I have a small cluster with one namenode (namenode1) and one datanode.

I want to configure another namenode (namenode2) to replace namenode1 by only
replicating the files in namenode1's namenode directory to namenode2 and
changing namenode2's IP to namenode1's.

I tried this, and the replacement namenode2 can work with the datanode: I can
start/stop HDFS, create directories, delete directories, and run a new
MapReduce job.

But when I want to check the directories and files created through namenode1,
nothing is found.

So I suspect that the block-mapping information is not included in
namenode1's namenode directory.

Am I right? Does anyone know how the namenode manages the block-mapping
information? Please give me some ideas.
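
In case it is useful, this is how I think the contents of the copied namenode
directory could be inspected (just a sketch; the exact fsimage filename depends
on the last checkpoint):

# Dump the persisted namespace to XML with the offline image viewer
hdfs oiv -p XML -i /path/to/namenode/current/fsimage_0000000000000000123 -o /tmp/fsimage.xml

# Rough count of files and directories recorded in the image
grep -c '<inode>' /tmp/fsimage.xml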

If I am wrong, please correct me. Thanks a lot.

Shengdi


Re: how to check hdfs

2015-03-03 Thread Shengdi Jin
Thanks Vikas.

I ran ./hdfs dfs -ls /home/cluster on the machine running the namenode.
Do I need to configure a client machine?

I suspect that the local path /home/cluster is not configured
as part of HDFS.
In core-site.xml, I set the HDFS URI to hdfs://master:9000.
So I think that's why the command ./hdfs dfs -ls hdfs://master:9000/ can
work.

Please correct me if I am wrong.

On Tue, Mar 3, 2015 at 1:59 PM, Vikas Parashar  wrote:

> Hello,
>
>   hdfs dfs -ls /home/cluster
> to check the content inside.
> But I get error
> ls: No FileSystem for scheme: hdfs --> that means you don't have the
> hadoop-hdfs RPM installed on your client machine.
>
>
> To answer your question about
> ./hdfs dfs -mkdir hdfs://master:9000/directory
>
>
> That directory will be under / in your HDFS. All data is stored on the
> datanodes, but the namenode holds the metadata. For more
> details, you should read the HDFS design document:
> http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
>


Re: AW: AW: Hadoop 2.6.0 - No DataNode to stop

2015-03-03 Thread Ulul

Hi

As a general rule, you should never run an application daemon as root, 
since any vulnerability can allow a malicious intruder to get full 
control of the system.

The documentation does not advise starting Hadoop as root:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html#Hadoop_Startup
shows Hadoop being started as a regular user.

I'm puzzled by the fact that root would be barred from accessing a file. 
The only case I can think of would be an nfs mount with root squashing.


And there should be no need to use the 777 bypass as long as you're 
using the same user to start and stop your daemons.
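
To see what hadoop-daemon.sh is looking at, you can also inspect the pid files 
directly, e.g. (a sketch; the exact file names depend on the user the daemon 
was started as):

# The stop path looks for a pid file under HADOOP_PID_DIR
ls -l /var/run/cluster/hadoop
cat /var/run/cluster/hadoop/hadoop-hdfs-datanode.pid

# Check that the recorded pid really belongs to a running DataNode
ps -fp $(cat /var/run/cluster/hadoop/hadoop-hdfs-datanode.pid)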


Ulul

On 03/03/2015 00:14, Daniel Klinger wrote:


Hi,

thanks for your help. The HADOOP_PID_DIR variable points to 
/var/run/cluster/hadoop (which has hdfs:hadoop as its owner). 3 PID files 
are created there (datanode, namenode and secure_dn). It looks like the 
PID was written but there was a read problem.


I did chmod -R 777 on the folder and now the DataNodes are stopped 
correctly. It only works when I run the start and stop commands 
as the hdfs user. If I try to start and stop as root (like it is documented 
in the documentation) I still get the "no datanode to stop" error.


Is it important to start the DN as root? The only thing I noticed 
is that the secure_dn PID file is not created when I start the DataNode 
as the hdfs user. Is this a problem?


Greets

DK

*From:* Ulul [mailto:had...@ulul.org]
*Sent:* Monday, 2 March 2015 21:50
*To:* user@hadoop.apache.org
*Subject:* Re: AW: Hadoop 2.6.0 - No DataNode to stop

Hi
The hadoop-daemon.sh script prints "no $command to stop" if it 
doesn't find the pid file.
You should echo the $pid variable and see if you have a correct pid 
file there.

Ulul

On 02/03/2015 13:53, Daniel Klinger wrote:

Thanks for your help, but unfortunately this didn't do the job.
Here's the shell script I've written to start my cluster (the
scripts on the other node only contain the command to start the
DataNode and the command to start the NodeManager on the
other node, with the right user (hdfs / yarn)):

#!/bin/bash

# Start HDFS

# Start Namenode

su - hdfs -c "$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config
$HADOOP_CONF_DIR --script hdfs start namenode"

wait

# Start all Datanodes

export HADOOP_SECURE_DN_USER=hdfs

su - hdfs -c "$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config
$HADOOP_CONF_DIR --script hdfs start datanode"

wait

ssh root@hadoop-data.klinger.local 'bash startDatanode.sh'

wait

# Start Resourcemanager

su - yarn -c "$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config
$HADOOP_CONF_DIR start resourcemanager"

wait

# Start Nodemanager on all Nodes

su - yarn -c "$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config
$HADOOP_CONF_DIR start nodemanager"

wait

ssh root@hadoop-data.klinger.local 'bash startNodemanager.sh'

wait

# Start Proxyserver

#su - yarn -c "$HADOOP_YARN_HOME/bin/yarn start proxyserver
--config $HADOOP_CONF_DIR"

#wait

# Start Historyserver

su - mapred -c "$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start
historyserver --config $HADOOP_CONF_DIR"

wait

This script generates the following output:

starting namenode, logging to
/var/log/cluster/hadoop/hadoop-hdfs-namenode-hadoop.klinger.local.out

starting datanode, logging to
/var/log/cluster/hadoop/hadoop-hdfs-datanode-hadoop.klinger.local.out

starting datanode, logging to
/var/log/cluster/hadoop/hadoop-hdfs-datanode-hadoop-data.klinger.local.out

starting resourcemanager, logging to
/var/log/cluster/yarn/yarn-yarn-resourcemanager-hadoop.klinger.local.out

starting nodemanager, logging to
/var/log/cluster/yarn/yarn-yarn-nodemanager-hadoop.klinger.local.out

starting nodemanager, logging to
/var/log/cluster/yarn/yarn-yarn-nodemanager-hadoop-data.klinger.local.out

starting historyserver, logging to
/var/log/cluster/mapred/mapred-mapred-historyserver-hadoop.klinger.local.out

Following is my stop script and its output:

#!/bin/bash

# Stop HDFS

# Stop Namenode

su - hdfs -c "$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config
$HADOOP_CONF_DIR --script hdfs stop namenode"

# Stop all Datanodes

su - hdfs -c "$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config
$HADOOP_CONF_DIR --script hdfs stop datanode"

ssh root@hadoop-data.klinger.local 'bash stopDatanode.sh'

# Stop Resourcemanager

su - yarn -c "$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config
$HADOOP_C

Re: Permission Denied

2015-03-03 Thread David Patterson
Thanks to all who helped. I've now got my configuration running. All of
the configuration files that once referenced "localhost" now reference my
hostname (AccumuloTN). The big thing that I had not seen in *any* of the
various blogs and instructions was to create a linux directory with the
same name as my accumulo user (that is, the linux user name) and change its
ownership to accumulo:superuser.

So now, not only is it working (from within another linux userid on that
machine), but I've also run code on my Windows machine that connects to my
cloud machine and is able to fetch data.

Dave Patterson
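
For anyone who finds this thread later, the HDFS-side setup discussed below
came down to something like this (a sketch; the user, group and modes are
assumptions and should match your own security requirements):

# Create the Accumulo root directory and hand it to the accumulo user
hadoop fs -mkdir /accumulo
hadoop fs -chown accumulo:supergroup /accumulo
hadoop fs -chmod 700 /accumulo

# Give the accumulo user an HDFS home directory (needed if the trash feature is enabled)
hadoop fs -mkdir -p /user/accumulo
hadoop fs -chown accumulo:supergroup /user/accumulo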

On Mon, Mar 2, 2015 at 11:47 AM, Sean Busbey  wrote:

> Splitting into three unix users is a good idea. Generally, none of the
> linux users should need access to any of the local resources owned by the
> others. (that is, the user running the accumulo processes shouldn't be able
> to interfere with the backing files used by the HDFS processes).
>
> By default, the linux user that drives a particular process will be
> resolved to a Hadoop user by the NameNode process. Presuming your Accumulo
> services are running under the linux user "accumulo", you should ensure
> that user exists on the linux node that runs the NameNode.
>
> The main issue with running init as the hadoop user is that by default
> it's likely going to write the accumulo directories as owned by the user
> that created them. Presuming you are using Accumulo because you have
> security requirements, the common practice is to make sure only the user
> that runs Accumulo processes can write to /accumulo and that only that user
> can read /accumulo/tables and /accumulo/wal. This ensures that other users
> with access to the HDFS cluster won't be able to bypass the cell-level
> access controls provided by Accumulo.
>
> While you are setting up HDFS directories, you should also create a home
> directory for the user that runs Accumulo processes. If your HDFS instance
> is set to use the trash feature (either in server configs or the client
> configs made available to Accumulo), then by default Accumulo will attempt
> to use it. Without a home directory, this will result in failures.
> Alternatively, you can ensure Accumulo doesn't rely on the trash feature by
> setting gc.trash.ignore in your accumulo-site.xml.
>
> One other note:
>
> > I edited the accumulo-site.xml so it now has
> >   <property>
> >     <name>instance.volumes</name>
> >     <value>hdfs://localhost:9000/accumulo</value>
> >     <description>comma separated list of URIs for volumes. example:
> > hdfs://localhost:9000/accumulo</description>
> >   </property>
>
> You will save yourself headache later if you stick with fully qualified
> domain names for all HDFS, ZooKeeper, and Accumulo connections.
>
> --
> Sean
>
> On Mon, Mar 2, 2015 at 8:13 AM, David Patterson  wrote:
>
>> David,
>>
>> Thanks for the information. I've issued those two commands in my hadoop
>> shell and still get the same error when I try to initialize accumulo in
>> *its* shell:
>>
>> 2015-03-02 13:30:41,175 [init.Initialize] FATAL: Failed to initialize
>> filesystem
>>org.apache.hadoop.security.AccessControlException: Permission denied:
>> user=accumulo, access=WRITE, inode="/accumulo":
>>accumulo.supergroup:supergroup:drwxr-xr-x
>>
>> My comment that I had 3 users was meant in a linux sense, not in a hadoop
>> sense. So (to borrow terminology from RDF or XML) is there something I have
>> to do in my hadoop setup (running under linux:hadoop) or my accumulo setup
>> (running under linux:accumulo) so that the accumulo I/O gets processed as
>> coming from someone in the hadoop:supergroup?
>>
>>
>> I tried running the accumulo init from the linux:hadoop user and it
>> worked. I'm not sure if any permissions/etc were hosed by doing it there.
>> I'll see.
>>
>> Thanks for you help.
>>
>> (By the way, is it wrong or a bad idea to split the work into three
>> linux:users, or should it all be done in one linux:user space?)
>>
>> Dave Patterson
>>
>> On Sun, Mar 1, 2015 at 8:35 PM, dlmarion  wrote:
>>
>>> hadoop fs -mkdir /accumulo
>>> hadoop fs -chown accumulo:supergroup /accumulo
>>>
>>>
>>>
>>>  Original message 
>>> From: David Patterson 
>>> Date:03/01/2015 7:04 PM (GMT-05:00)
>>> To: user@hadoop.apache.org
>>> Cc:
>>> Subject: Re: Permission Denied
>>>
>>> David,
>>>
>>> Thanks for the reply.
>>>
>>> Taking the questions in the opposite order, my accumulo-site.xml does
>>> not have volumes specified.
>>>
>>> I edited the accumulo-site.xml so it now has
>>>   <property>
>>>     <name>instance.volumes</name>
>>>     <value>hdfs://localhost:9000/accumulo</value>
>>>     <description>comma separated list of URIs for volumes. example:
>>> hdfs://localhost:9000/accumulo</description>
>>>   </property>
>>>
>>> and got the same error.
>>>
>>> How can I precreate /accumulo ?
>>>
>>> Dave Patterson
>>>
>>> On Sun, Mar 1, 2015 at 3:50 PM, david marion 
>>> wrote:
>>>
  It looks like / is owned by hadoop.supergroup and the perms are 755.
 You could precreate /accumulo and chown it appropriately, or set the perms
 for / to 775. Init is trying to create /accum

Re: how to check hdfs

2015-03-03 Thread Vikas Parashar
Hi Jin,

Please check your hdfs-site.xml, in which we specify what the HDFS
paths will be on your local machine.

Below are the parameters that will help you understand:

dfs.namenode.name.dir
dfs.datanode.data.dir
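
You can double-check the values your installation actually resolves for these
with, for example (just a sketch):

# Print the effective storage directories as seen by the Hadoop configuration
hdfs getconf -confKey dfs.namenode.name.dir
hdfs getconf -confKey dfs.datanode.data.dir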

Rg:
Vicky


On Wed, Mar 4, 2015 at 1:34 AM, Shengdi Jin  wrote:

> Thanks Vikas.
>
> I ran ./hdfs dfs -ls /home/cluster on the machine running the namenode.
> Do I need to configure a client machine?
>
> I suspect that the local path /home/cluster is not configured
> as part of HDFS.
> In core-site.xml, I set the HDFS URI to hdfs://master:9000.
> So I think that's why the command ./hdfs dfs -ls hdfs://master:9000/ can
> work.
>
> Please correct me if I am wrong.