Re: HDFS Append Problem

2015-03-05 Thread Suresh Srinivas
Please take this up on the CDH mailing list.



From: Molnár Bálint 
Sent: Thursday, March 05, 2015 4:53 AM
To: user@hadoop.apache.org
Subject: HDFS Append Problem

Hi Everyone!

I'm experiencing an annoying problem.

My Scenario is:

I want to store lots of small files (1-2 MB max) in MapFiles. These files will
arrive periodically throughout the day, so I cannot use the "factory" writer because
it would create a lot of small MapFiles. (I want to store these files in
HDFS immediately.)

I'm trying to write code to append to MapFiles. I use the
org.apache.hadoop.fs.FileSystem append() method, which resolves to the
org.apache.hadoop.hdfs.DistributedFileSystem append() method, to do the job.
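
A minimal sketch of that call path, for illustration only (the path and payload are
hypothetical, it assumes a 2.x client with append enabled, and it ignores the
MapFile data/index framing):

import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);  // resolves to DistributedFileSystem for an hdfs:// default FS
    Path data = new Path("/data/mapfiles/part-00000/data");  // hypothetical path

    byte[] record = "key\tvalue\n".getBytes(StandardCharsets.UTF_8);  // hypothetical payload

    // FileSystem.append() returns a stream positioned at the current end of the file.
    FSDataOutputStream out = fs.append(data);
    try {
      out.write(record);
      out.hflush();  // make the appended bytes visible to readers
    } finally {
      out.close();
    }
  }
}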

My code works well: the stock MapFile Reader can retrieve the files. My
problem appears in the upload phase. When I try to upload a set (1 GB) of small
files, the free space of HDFS decreases fast. The program has only uploaded
400 MB, but according to Cloudera Manager more than 5 GB is used.
The interesting part is that when I terminate the upload and wait 1-2
minutes, HDFS usage goes back to the expected size (500 MB), and none of my files are
lost. If I don't terminate the upload, HDFS runs out of free space and the
program gets errors.
I'm using the Cloudera QuickStart VM 5.3 for testing, and the HDFS replication
factor is 1.


Any ideas how to solve this issue?


Thanks


Re: Error while executing command on CDH5

2015-03-04 Thread Suresh Srinivas
Can you please use the CDH mailing list for this question?


From: SP 
Sent: Wednesday, March 04, 2015 11:00 AM
To: user@hadoop.apache.org
Subject: Error while executing command on CDH5




Hello All,

Why am I getting this error every time I execute a command? It was working fine
with CDH4. When I upgraded to CDH5, this message started showing
up.

Does anyone have a resolution for this error?

sudo -u hdfs hadoop fs -ls /
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
Found 1 items
drwxrwxrwt   - hdfs hadoop  0 2015-03-04 10:30 /tmp


Thanks
SP



Re: DFS Used V/S Non DFS Used

2014-10-10 Thread Suresh Srinivas
Here is the information from -
https://issues.apache.org/jira/browse/HADOOP-4430?focusedCommentId=12640259&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12640259
Here are the definitions of the data reported on the Web UI:
Configured Capacity: disk space corresponding to all the data directories, minus
the reserved space defined by dfs.datanode.du.reserved
DFS Used: space used by DFS
Non DFS Used: 0 if the temporary files do not exceed the reserved space;
otherwise, this is the amount by which temporary files exceed the reserved
space and encroach into the DFS configured space
DFS Remaining: (Configured Capacity - DFS Used - Non DFS Used)
DFS Used %: (DFS Used / Configured Capacity) * 100
DFS Remaining %: (DFS Remaining / Configured Capacity) * 100
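
A worked example with hypothetical numbers: a datanode with two 1 TB data disks and
dfs.datanode.du.reserved = 100 GB per volume reports Configured Capacity =
2000 GB - 200 GB = 1800 GB. If HDFS blocks occupy 600 GB, then DFS Used = 600 GB. If
non-HDFS files occupy 350 GB, they exceed the 200 GB reserved by 150 GB, so
Non DFS Used = 150 GB. DFS Remaining = 1800 - 600 - 150 = 1050 GB, DFS Used % =
600 / 1800 * 100 ~ 33%, and DFS Remaining % = 1050 / 1800 * 100 ~ 58%.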

On Fri, Oct 10, 2014 at 2:21 PM, Manoj Samel 
wrote:

> Hi,
>
> It is not clear to me how this computation is done.
>
> For the sake of discussion, say the machine with the datanode has two disks, /disk1
> and /disk2, and each of these disks has a directory for datanode use and a
> directory for non-datanode usage.
>
> /disk1/datanode
> /disk1/non-datanode
> /disk2/datanode
> /disk2/non-datanode
>
> dfs.datanode.data.dir is set to "/disk1/datanode,/disk2/datanode".
>
> With this, what do DFS Used and Non DFS Used indicate? Do they indicate
> SUM(/disk*/datanode) and SUM(/disk*/non-datanode), respectively?
>
> Thanks,
>
>
>


-- 
http://hortonworks.com/download/



Re: Significance of PID files

2014-07-07 Thread Suresh Srinivas
When a daemon process is started, its process ID is captured in a pid file. The
file is used for the following purposes:
- During daemon startup, the existence of the pid file is used to determine
whether the process is already running.
- When a daemon is stopped, the Hadoop scripts send a TERM signal to the
process ID captured in the pid file for a graceful shutdown. After a timeout, if
the process still exists, "kill -9" is sent for a forced shutdown.

For more details, see the relevant code in
http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh




On Fri, Jul 4, 2014 at 10:00 AM, Vijaya Narayana Reddy Bhoomi Reddy <
vijay.bhoomire...@gmail.com> wrote:

> Hi,
>
> Can anyone please explain the significance of the pid files in Hadoop i.e.
> purpose and usage etc?
>
> Thanks & Regards
> Vijay
>



-- 
http://hortonworks.com/download/



Re: hadoop 2.2.0 HA: standby namenode generate a long list of loading edits

2014-06-11 Thread Suresh Srinivas
On Wed, Jun 11, 2014 at 8:27 PM, Henry Hung  wrote:

>  @Suresh,
>
>
>
> Q1: But can this kind of behavior cause a problem during a failover
> event? I'm afraid that the standby namenode will take a long time to become active.
>

Can you please explain how you arrived at this?

>
>
> Q2: Is there a way to purge the loading-edits records? Should I restart
> the standby namenode?
>
>
>

Other than showing a long list of loaded edits, there is nothing to be
concerned about here. I agree that this is confusing, and we could change this
to print only the last set of loaded edits instead of the entire list.


>  Best regards,
>
> Henry
>
>
>
> *From:* Suresh Srinivas [mailto:sur...@hortonworks.com]
> *Sent:* Thursday, June 12, 2014 11:23 AM
> *To:* hdfs-u...@hadoop.apache.org
> *Subject:* Re: hadoop 2.2.0 HA: standby namenode generate a long list of
> loading edits
>
>
>
> Henry,
>
>
>
> I suspect this is what is happening. On the active namenode, once the existing
> set of edit logs is loaded during startup, it becomes active and from then on
> it has no need to load any more edits. It only generates edits. On the
> other hand, the standby namenode not only loads the edits during startup, it
> also continuously loads the edits being generated by the active. Hence the
> difference.
>
>
>
> Regards,
>
> Suresh
>
>
>
> On Wed, Jun 11, 2014 at 7:49 PM, Henry Hung  wrote:
>
>  Hi All,
>
>
>
> I'm using QJM with 2 namenodes. On the active namenode, the main page's
> loading edits panel shows only 10 records, but on the standby namenode the
> loading edits panel shows a lot more records; I never counted them, but I think it
> has more than 100 records.
>
> Is this a problem?
>
>
>
> Here I provide some of the data:
>
>
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1080&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1361&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1830&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000140&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10001638&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10002099&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10002359&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000332&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000421&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10005210&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10005529&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000577&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10005831&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&

Re: hadoop 2.2.0 HA: standby namenode generate a long list of loading edits

2014-06-11 Thread Suresh Srinivas
Henry,

I suspect this is what is happening. On the active namenode, once the existing
set of edit logs is loaded during startup, it becomes active and from then on
it has no need to load any more edits. It only generates edits. On the
other hand, the standby namenode not only loads the edits during startup, it
also continuously loads the edits being generated by the active. Hence the
difference.

Regards,
Suresh


On Wed, Jun 11, 2014 at 7:49 PM, Henry Hung  wrote:

>  Hi All,
>
>
>
> I'm using QJM with 2 namenodes. On the active namenode, the main page's
> loading edits panel shows only 10 records, but on the standby namenode the
> loading edits panel shows a lot more records; I never counted them, but I think it
> has more than 100 records.
>
> Is this a problem?
>
>
>
> Here I provide some of the data:
>
>
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1080&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1361&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1830&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000140&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10001638&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10002099&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10002359&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000332&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000421&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10005210&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10005529&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=1000577&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10005831&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10005951&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10006089&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10006154&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10006291&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
> http://fchdgw1.ctfab.com:8480/getJournal?jid=hadoop_prod&segmentTxId=10006482&storageInfo=-47%3A1313059004%3A1395811267413%3ACID-9e1b67b3-8190-4652-b34a-210212a50a9e
> (0/0)
>
> 100.00%
>
> 0sec
>
>
>
>
>
> Best regards,
>
> Henry
>
> --
> The privileged confidential information contained in this email is
> intended for use only by the addressees as indicated by the original sender
> of this email. If you are not the addressee indicated in this email or are
> not responsible for delivery of the email to such a person, please kindly
> reply to the sender indicating this fact and delete all copies of it from
> your computer and network server immediately. Your cooperation is highly
> appreciated. It is advised that any unauthorized use of confidential

Re: how can i monitor Decommission progress?

2014-06-05 Thread Suresh Srinivas
The namenode web UI provides that information. On the main web UI, click the link
associated with decommissioned nodes.

Sent from phone

> On Jun 5, 2014, at 10:36 AM, Raj K Singh  wrote:
> 
> use
> 
> $hadoop dfsadmin -report
> 
> 
> Raj K Singh
> http://in.linkedin.com/in/rajkrrsingh
> http://www.rajkrrsingh.blogspot.com
> Mobile  Tel: +91 (0)9899821370
> 
> 
>> On Sat, May 31, 2014 at 11:26 AM, ch huang  wrote:
>> Hi mailing list,
>> I decommissioned three nodes from my cluster, but the question is:
>> how can I see the decommission progress? I can only see the admin state in the
>> web UI.
> 



Re: listing a 530k files directory

2014-05-30 Thread Suresh Srinivas
Listing such a directory should not be a big problem. Can you cut and paste the
command output?

Which release are you using?

Sent from phone

> On May 30, 2014, at 5:49 AM, Guido Serra  wrote:
> 
> already tried, didn't work (24 cores at 100% and a lot of memory, still ... "GC
> overhead limit exceeded")
> 
> thanks anyhow
> 
>> On 05/30/2014 02:43 PM, bharath vissapragada wrote:
>> Hi Guido,
>> 
>> You can set client side heap in HADOOP_OPTS variable before running the ls 
>> command.
>> 
>> export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
>> 
>> - Bharath
>> 
>> 
>>> On Fri, May 30, 2014 at 5:22 PM, Guido Serra  wrote:
>>> Hi,
>>> do you have an idea on how to look at the content of a 530k-files HDFS 
>>> folder?
>>> (yes, I know it is a bad idea to have such setup, but that’s the status and 
>>> I’d like to debug it)
>>> and the only tool that doesn’t go out of memory is "hdfs dfs -count folder/"
>>> 
>>> -ls goes out of memory, -count with the folder/* goes out of memory …
>>> I’d like to see at least the first 10 file names and their sizes, and maybe open one
>>> 
>>> thanks,
>>> G.
>> 
> 
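
If raising the client heap as suggested above still hits the GC limit, an
iterator-based listing avoids materialising the whole 530k-entry listing in the
client at once. A rough sketch, assuming a 2.x client (the path is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class ListFirstTen {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // On HDFS, listFiles() returns an iterator that fetches the directory listing
    // from the namenode in batches, so the client never holds all 530k entries at once.
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path("/big/folder"), false);
    for (int i = 0; i < 10 && it.hasNext(); i++) {
      LocatedFileStatus status = it.next();
      System.out.println(status.getPath().getName() + "\t" + status.getLen());
    }
  }
}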



Re: any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread Suresh Srinivas
Another alternative is to write block-sized chunks into multiple HDFS files
concurrently, followed by a concat of all of them into a single file.
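
A rough sketch of that pattern, assuming the 2.x FileSystem API (names are
hypothetical; concat() is implemented by DistributedFileSystem, and the chunks
generally need to meet its block-size and same-directory restrictions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConcatChunks {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    // Chunks written concurrently by separate writers, each roughly one block in
    // size (hypothetical names; the first chunk serves as the concat target).
    Path target = new Path("/data/out/chunk-000");
    Path[] rest = {
        new Path("/data/out/chunk-001"),
        new Path("/data/out/chunk-002"),
        new Path("/data/out/chunk-003")
    };

    // concat() stitches the source files onto the end of the target and removes
    // them; it is a metadata-only operation on the namenode, no block data is copied.
    fs.concat(target, rest);
  }
}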

Sent from phone

> On Feb 20, 2014, at 8:15 PM, Chen Wang  wrote:
> 
> Ch,
> you may consider using Flume, as it already has a sink that can write to
> HDFS. What I did is to set up a Flume agent listening on an Avro source, and then
> sink to HDFS. Then in my application, I just send my data to the Avro socket.
> Chen
> 
> 
>> On Thu, Feb 20, 2014 at 5:07 PM, ch huang  wrote:
>> Hi mailing list,
>> is there any optimization for a large number of writes into HDFS at the same time?
>> Thanks
> 



Re: HDFS Federation address performance issue

2014-01-28 Thread Suresh Srinivas
Response inline...


On Tue, Jan 28, 2014 at 10:04 AM, Anfernee Xu  wrote:

> Hi,
>
> Based on
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/Federation.html#Key_Benefits,
> the overall performance can be improved by federation, but I'm not sure
> federation addresses my use case; could someone elaborate?
>
> My use case is that I have one single NN and several DNs, and I have a bunch of
> concurrent MR jobs which will create new files (plain files and
> sub-directories) under the same parent directory. The questions are:
>
> 1) Will these concurrent writes (new files, plain files, and sub-directories
> under the same parent directory) run sequentially because of the write-once
> control governed by the single NN?
>

The namenode commits multiple requests in a batch. Within the namenode itself, the
lock for write operations makes them sequential. But this is a short-duration
lock, so from the perspective of multiple clients the creation of files
appears simultaneous.

If you are talking about a single client with a single thread, then it
would be sequential.

Hope that makes sense.

>
> I need this answer to estimate the necessity of moving to HDFS federation.
>
> Thanks
>
> --
> --Anfernee
>



-- 
http://hortonworks.com/download/



Re: HDFS federation configuration

2014-01-23 Thread Suresh Srinivas
Have you looked at -
http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/Federation.html
?



On Thu, Jan 23, 2014 at 9:35 AM, AnilKumar B  wrote:

> Hi,
>
> We tried setting up HDFS namenode federation with 2 namenodes. I
> am facing a few issues.
>
> Can any one help me in understanding below points?
>
> 1) How can we configure different namespaces for different namenodes? Where
> exactly do we need to configure this?
>
See the documentation. If it is not clear, please open a jira.


>
> 2) After formatting each NN with one cluster ID, do we need to set this
> cluster ID in hdfs-site.xml?
>
There is no need to set the cluster id in hdfs-site.xml


>
> 3) I am getting an exception saying the data dir is already locked by one of the
> NNs, but when I don't specify data.dir, the exception does not show up. So what
> could be the issue?
>

Are you running the two namenode processes on the same machine?

>
> Thanks & Regards,
> B Anil Kumar.
>



-- 
http://hortonworks.com/download/



Re: What is the difference between Hdfs and DistributedFileSystem?

2014-01-13 Thread Suresh Srinivas
Hadoop has two sets of APIs:
- FileSystem.java - this is the older API and more widely used. This
defines both the application API and the API that the concrete file systems
must implement (such as DistributedFileSystem, LocalFileSystem,
ChecksumFileSystem etc.)
- FileContext.java and AbstractFileSystem.java - these are new APIs. The
goal was to divide the Hadoop API into two - application API in
FileContext.java and file system implementation interface
AbstractFileSystem.java. More details -
https://issues.apache.org/jira/browse/HADOOP-4952.

Hdfs.java is the newer implementation of AbstractFileSystem.java.
DistributedFileSystem.java implements the older API FileSystem.java.
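
For illustration, a small sketch showing the same call through both
application-facing APIs (hedged; the path is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TwoApis {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path p = new Path("/tmp/example.txt");  // hypothetical path

    // Older API: FileSystem is both the application API and the interface that
    // concrete file systems (DistributedFileSystem, LocalFileSystem, ...) implement.
    FileSystem fs = FileSystem.get(conf);
    FileStatus viaFileSystem = fs.getFileStatus(p);

    // Newer API: applications use FileContext, which delegates to an
    // AbstractFileSystem implementation (Hdfs for HDFS, LocalFs, ...).
    FileContext fc = FileContext.getFileContext(conf);
    FileStatus viaFileContext = fc.getFileStatus(p);

    System.out.println(viaFileSystem.getLen() + " == " + viaFileContext.getLen());
  }
}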




On Mon, Jan 13, 2014 at 3:31 AM, 梁李印  wrote:

> What is the difference between Hdfs.java and DistributedFileSystem.java in
> Hadoop2?
>
>
>
> Best Regards,
>
> Liyin Liang
>
>
>
> Tel: 78233
>
> Email: liyin.lian...@alibaba-inc.com
>
>
>



-- 
http://hortonworks.com/download/



Re: compatibility between new client and old server

2013-12-18 Thread Suresh Srinivas
2.x is a new major release. 1.x and 2.x are not compatible.

In 1.x, the RPC wire protocol used java serialization. In 2.x, the RPC wire
protocol uses protobuf. A client must be compiled against 2.x and should
use appropriate jars from 2.x to work with 2.x.


On Wed, Dec 18, 2013 at 10:45 AM, Ken Been  wrote:

>  I am trying to make a 2.2.0 Java client work with a 1.1.2 server.  The
> error I am currently getting is below.  I’d like to know if my problem is
> because I have configured something wrong or because the versions are
> simply not compatible for what I want to do.  Thanks in advance for any
> help.
>
>
>
> Ken
>
>
>
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>
> at org.apache.hadoop.ipc.Client.call(Client.java:1351)
>
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>
> at com.sun.proxy.$Proxy18.getFileInfo(Unknown Source)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> at java.lang.reflect.Method.invoke(Method.java:601)
>
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>
> at com.sun.proxy.$Proxy18.getFileInfo(Unknown Source)
>
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
>
> at
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>
>at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>
> at my code...
>
> Caused by: java.io.EOFException
>
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
>
> at
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:995)
>
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891)
>



-- 
http://hortonworks.com/download/



Re: HDP 2.0 GA?

2013-11-05 Thread Suresh Srinivas
Please send questions related to a vendor-specific distro to the vendor's
mailing list. In this case: http://hortonworks.com/community/forums/.


On Tue, Nov 5, 2013 at 10:49 AM, Jim Falgout  wrote:

>  HDP 2.0.6 is the GA version that matches Apache Hadoop 2.2.
>
>
>  --
> *From:* John Lilley 
> *Sent:* Tuesday, November 05, 2013 12:34 PM
> *To:* user@hadoop.apache.org
> *Subject:* HDP 2.0 GA?
>
>
> I noticed that HDP 2.0 is available for download here:
>
> http://hortonworks.com/products/hdp-2/?b=1#install
>
> Is this the final “GA” version that tracks Apache Hadoop 2.2?
>
> Sorry I am just a little confused by the different numbering schemes.
>
> *Thanks*
>
> *John*
>
>
>



-- 
http://hortonworks.com/download/



Re: HDFS / Federated HDFS - Doubts

2013-10-16 Thread Suresh Srinivas
On Wed, Oct 16, 2013 at 9:22 AM, Steve Edison  wrote:

> I have couple of questions about HDFS federation:
>
> Can I state different block store directories for each namespace on a
> datanode ?
>

No. The main idea of federation was not to physically partition the storage
across namespaces, but to use all the available storage across the
namespaces, to ensure better utilization.


> Can I have some datanodes dedicated to a particular namespace only ?
>

As I said earlier, all the datanodes are shared across namespaces. If you
want to dedicate datanodes to a particular namespace, you might as well
create two separate clusters with different sets of datanodes and
separate namespaces.


>
> This seems quite interesting. Way to go !
>
>
> On Tue, Oct 1, 2013 at 9:52 PM, Krishna Kumaar Natarajan  > wrote:
>
>> Hi All,
>>
>> While trying to understand federated HDFS in detail I had few doubts and
>> listing them down for your help.
>>
>>    1. In case of *HDFS (without HDFS federation)*, is the metadata (the
>>    data about the blocks belonging to the files in HDFS) maintained in the
>>    main memory of the namenode, or is it stored on permanent storage on the
>>    namenode and brought into main memory on demand? [Krishna]
>>    Based on my understanding, I assume the entire metadata is in main memory,
>>    which is an issue by itself. Please correct me if my understanding is wrong.
>>    2. In case of *federated HDFS*, is the metadata (the data about the
>>    blocks belonging to files in a particular namespace) maintained in the
>>    main memory of the namenode, or is it stored on the permanent storage of the
>>    namenode and brought into main memory on demand?
>>3. Are the metadata information stored in separate cluster
>>nodes(block management layer separation) as discussed in Appendix B of 
>> this
>>document ?
>>
>> https://issues.apache.org/jira/secure/attachment/12453067/high-level-design.pdf
>>4. I would like to know if the following proposals are already
>>implemented in federated HDFS. (
>>
>> http://www.slideshare.net/hortonworks/hdfs-futures-namenode-federation-for-improved-efficiency-and-scalability
>> slide-17)
>>- Separation of namespace and block management layers (same as qn.3)
>>   - Partial namespace in memory for further scalability
>>   - Move partial namespace from one namenode to another
>>
>> Thanks,
>> Krishna
>>
>
>


-- 
http://hortonworks.com/download/



Re: HDFS federation Configuration

2013-09-23 Thread Suresh Srinivas
>
> I'm not able to follow the page completely.
> Can you please help me with some clear step-by-step instructions or a bit more
> detail on the configuration side?
>

Have you set up a non-federated cluster before? If you have, the page should
be easy to follow. If you have not set up a non-federated cluster before, I
suggest doing so before looking at this document.

I think the document already has step-by-step instructions.

> I
>



Re: Name node High Availability in Cloudera 4.1.1

2013-09-19 Thread Suresh Srinivas
Please do not cross-post these emails to hdfs-user. The relevant email list
is only cdh-user.


On Thu, Sep 19, 2013 at 1:44 AM, Pavan Kumar Polineni <
smartsunny...@gmail.com> wrote:

> Hi all,
>
> Is *NameNode high availability & JobTracker high availability* there in
> Cloudera 4.1.1?
>
> If not, then what properties need to be changed in Cloudera 4.1.1 to
> make the cluster highly available?
>
> please help on this.. Thanks in Advance
>
> --
>  Pavan Kumar Polineni
>



-- 
http://hortonworks.com/download/

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: HDFS federation Configuration

2013-09-19 Thread Suresh Srinivas
Have you looked at -
http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-project-dist/hadoop-hdfs/Federation.html

Let me know if the document is not clear or needs improvements.

Regards,
Suresh



On Thu, Sep 19, 2013 at 12:01 PM, Manickam P  wrote:

>  Guys,
>
> I need some tutorials on configuring federation. Can you please suggest some?
>
>
>
>
> Thanks,
> Manickam P
>



-- 
http://hortonworks.com/download/



Re: Cloudera Vs Hortonworks Vs MapR

2013-09-13 Thread Suresh Srinivas
Shahab,

I agree with your arguments. Really well put. The only thing I would add is that
we do not want sales/marketing folks getting involved in these kinds of
threads, polluting them with sales pitches and unsubstantiated claims, and making
them a forum for marketing pitches. This can also have community repercussions,
as you have rightly pointed out.

Wearing my own Hadoop PMC hat: we do put out Apache releases regularly. Bigtop
also provides excellent stack packaging. In this forum my wish is
to see discussions around that rather than around vendors. There are already many
outside forums for this.

Regards,
Suresh


On Fri, Sep 13, 2013 at 10:48 AM, Shahab Yunus wrote:

> I think, in my opinion, it is a wrong idea because:
>
> 1- Many of the participants here are employees for these very companies
> that are under discussion. This puts these respective employees in very
> difficult position. It is very hard to come with a correct response.
> Comments can be misconstrued easily.
> 2- Also, when we talk about vendor distributions of the software, it is
> not longer purely about open source. Now companies with the related
> corporate legal baggage also gets in the mix.
> 3- The discussion would be not only about the positive things about each vendor
> but in fact also the negatives, and the latter type of discussion can get
> unpleasant very easily.
> 4- Somebody mentioned that, this is a very lightly moderated platform and
> thus this discussion should be allowed. I think this is one of the reasons
> that it should not be because, people can say things casually, without much
> thought, or without taking care of the context or the possible
> interpretations and get in trouble.
> 5- The risk here is not only that serious repercussions can occur (which
> very well can) but the greater risk is that it can cause misunderstanding
> between individuals, industries and companies.
> 6- People here a lot of the time reply quickly just to resolve or help with the
> 'technical' issue. Now they will have to take care with how they frame the
> response. Re: 4
>
> I know some will feel that I have created a highly exaggerated scenario
> above, but what I am trying to say is that, it is a slippery slope. If we
> allow this then this can go anywhere.
>
> By the way, I do not work for any of these vendors.
>
> More importantly, I am not saying that this discussion should not be had,
> I am just saying that this is a wrong forum.
>
> Just my 2 cents (or,...this was rather a dollar.)
>
> Regards,
> Shahab
>
>
> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann wrote:
>
>> Errr, what's wrong with discussing these types of issues on list?
>>
>> Nothing public here, and as long as it's kept to facts, this should
>> not be a problem and Apache is a fine place to have such discussions.
>>
>> My 2c.
>>
>>
>>
>>
>>
>> -Original Message-
>> From: Xuri Nagarin 
>> Reply-To: "user@hadoop.apache.org" 
>> Date: Thursday, September 12, 2013 4:39 PM
>> To: "user@hadoop.apache.org" 
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I understand it can be contentious issue especially given that a lot of
>> >contributors to this list work for one or the other vendor or have some
>> >stake in any kind of evaluation. But, I see no reason why users should
>> >not be able to compare notes
>> > and share experiences. Over time, genuine pain points or issues or
>> >claims will bubble up and should only help the community. Sure, there
>> >will be a few flame wars but this already isn't a very tightly moderated
>> >list.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> > wrote:
>> >
>> >Raj,
>> >
>> >
>> >As others noted, this is not a great place for this discussion.  I'd
>> >suggest contacting the vendors you are interested in as I'm sure we'd all
>> >be happy to provide you more details.
>> >
>> >
>> >I don't know about the others, but for MapR, just send an email to
>> >sa...@mapr.com  and I'm sure someone will get
>> back
>> >to you with more information.
>> >
>> >
>> >Best Regards,
>> >Aaron Eng
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj 
>> wrote:
>> >
>> >
>> >Hi,
>> >
>> >We are trying to evaluate different implementations of Hadoop for our big
>> >data enterprise project.
>> >
>> >Can the forum members advise on what are the advantages and disadvantages
>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >
>> >Thanks in advance.
>> >
>> >Regards,
>> >Raj
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>


-- 
http://hortonworks.com/download/


Re: Cloudera Vs Hortonworks Vs MapR

2013-09-12 Thread Suresh Srinivas
Raj,

You can also use Apache Hadoop releases. Bigtop does a fine job of
putting together a consumable Hadoop stack.

As regards vendor solutions, this is not the right forum. There are
other forums for this. Please refrain from this type of discussion on the
Apache forum.

Regards,
Suresh


On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj  wrote:

> Hi,
>
> We are trying to evaluate different implementations of Hadoop for our big
> data enterprise project.
>
> Can the forum members advise on what are the advantages and disadvantages
> of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>
> Thanks in advance.
>
> Regards,
> Raj




-- 
http://hortonworks.com/download/



Re: Symbolic Link in Hadoop 1.0.4

2013-09-05 Thread Suresh Srinivas
The FileContext API and symlink functionality are not available in 1.0. They are
only available in the 0.23 and 2.x releases.
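
For reference, on a 0.23/2.x client the call looks roughly like this (a sketch
with hypothetical paths; note that symlink creation may be disabled by default in
some 2.x releases, so check the release notes for the version you use):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

public class MakeSymlink {
  public static void main(String[] args) throws Exception {
    FileContext fc = FileContext.getFileContext(new Configuration());
    Path target = new Path("/data/current/dataset-2013-09-05");  // hypothetical
    Path link   = new Path("/data/latest");                      // hypothetical
    // createSymlink(target, link, createParent): creates /data/latest pointing at target.
    fc.createSymlink(target, link, false);
  }
}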


On Thu, Sep 5, 2013 at 8:06 AM, Gobilliard, Olivier <
olivier.gobilli...@cartesian.com> wrote:

>  Hi,
>
>
>
> I am using Hadoop 1.0.4 and need to create a symbolic link in HDFS.
>
> This feature has been added in Hadoop 0.21.0 (
> https://issues.apache.org/jira/browse/HDFS-245) in the new FileContext
> API (
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileContext.html
> ).
>
> However, I cannot find the FileContext API in the 1.0.4 release (
> http://archive.apache.org/dist/hadoop/core/hadoop-1.0.4/). I cannot find
> it in any of the 1.X releases actually.
>
>
>
> Has this functionality been moved to another Class?
>
>
>
> Many thanks,
>
> Olivier
>
>
> __
> This email and any attachments are confidential. If you have received this
> email in error please notify the sender immediately
> by replying to this email and then delete from your computer without
> copying or distributing in any other way.
>
> Cartesian Limited - Registered in England and Wales with number 3230513
> Registered office: Descartes House, 8 Gate Street, London, WC2A 3HP
> www.cartesian.com
>



-- 
http://hortonworks.com/download/



Re: Documentation for Hadoop's RPC mechanism

2013-08-20 Thread Suresh Srinivas
Create a JIRA to post it into the Hadoop documentation. I can help you with the
review and commit.

Sent from phone

On Aug 20, 2013, at 10:40 AM, Elazar Leibovich  wrote:

> Hi,
> 
> I've written some documentation for Hadoop's RPC mechanism internals:
> 
> http://hadoop.quora.com/Hadoop-RPC-mechanism
> 
> I'll be very happy if the community can review it. You should be able to edit 
> it directly, or just send your comments to the list.
> 
> Except, I'm looking for a good place to put it. Where does it fit? Would it 
> fit Hadoop's Wiki? Hadoop's Source?
> 
> Thanks



Re: Maven Cloudera Configuration problem

2013-08-13 Thread Suresh Srinivas
Folks, can you please take this thread to the CDH-related mailing list?


On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox  wrote:

> That link got my hopes up. But Cloudera Manager  (what I'm running; on
> CDH4) does not offer an "Export Client Config" option. What am I missing?
>
> On Aug 13, 2013, at 4:04 PM, Shahab Yunus  wrote:
>
> You should not use LocalJobRunner. Make sure that the mapred.job.tracker
> property does not point to 'local' and instead points to your JobTracker host and
> port.
>
> *But before that* as Sandy said, your client machine (from where you will
> be kicking of your jobs and apps) should be using config files which will
> have your cluster's configuration. This is the alternative that you should
> follow if you don't want to bundle the configs for your cluster in the
> application itself (either in java code or separate copies of relevant
> properties set of config files.) This was something which I was suggesting
> early on to just to get you started using your cluster instead of local
> mode.
>
> By the way have you seen the following link? It gives you step by step
> information about how to generate config files from your cluster specific
> to your cluster and then how to place them and use the from any machine you
> want to designate as your client. Running your jobs form one of the
> datanodes without proper config would not work.
>
> https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration
>
> Regards,
> Shahab
>
>
> On Tue, Aug 13, 2013 at 1:07 PM, Pavan Sudheendra  >wrote:
>
> Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
> job on one datanode..
>
> What changes should i make so that my application would take advantage
> of the cluster as a whole?
>
> On Tue, Aug 13, 2013 at 10:33 PM,   wrote:
>
> Nothing in your pom.xml should affect the configurations your job runs
> with.
>
> Are you running your job from a node on the cluster? When you say
> localhost configurations, do you mean it's using the LocalJobRunner?
>
> -sandy
> (iphnoe tpying)
>
> On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra 
>
> wrote:
>
>
> When i actually run the job on the multi node cluster, logs shows it
> uses localhost configurations which i don't want..
>
> I just have a pom.xml which lists all the dependencies like standard
> hadoop, standard hbase, standard zookeeper etc., Should i remove these
> dependencies?
>
> I want the cluster settings to apply in my map-reduce application..
> So, this is where i'm stuck at..
>
> On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra 
>
> wrote:
>
> Hi Shabab and Sandy,
> The thing is we have a 6 node cloudera cluster running.. For
> development purposes, i was building a map-reduce application on a
> single node apache distribution hadoop with maven..
>
> To be frank, i don't know how to deploy this application on a multi
> node cloudera cluster. I am fairly well versed with Multi Node Apache
> Hadoop Distribution.. So, how can i go forward?
>
> Thanks for all the help :)
>
> On Tue, Aug 13, 2013 at 9:22 PM,   wrote:
>
> Hi Pavan,
>
> Configuration properties generally aren't included in the jar itself
> unless you explicitly set them in your java code. Rather they're picked up
> from the mapred-site.xml file located in the Hadoop configuration directory
> on the host you're running your job from.
>
>
> Is there an issue you're coming up against when trying to run your
> job on a cluster?
>
> -Sandy
> (iphnoe tpying)
>
> On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra 
>
> wrote:
>
>
> Hi,
> I'm currently using maven to build the jars necessary for my
> map-reduce program to run and it works for a single node cluster..
>
> For a multi node cluster, how do i specify my map-reduce program to
> ingest the cluster settings instead of localhost settings?
> I don't know how to specify this using maven to build my jar.
>
> I'm using the cdh distribution by the way..
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
>
>
> --
> Regards-
> Pavan
>
>
> Dr. Brad J. CoxCell: 703-594-1883 Blog: http://bradjcox.blogspot.com
> http://virtualschool.edu
>
>
>
>
>


-- 
http://hortonworks.com/download/



Re:

2013-07-12 Thread Suresh Srinivas
Please use the CDH mailing list. This is the Apache Hadoop mailing list.

Sent from phone

On Jul 12, 2013, at 7:51 PM, Anit Alexander  wrote:

> Hello,
> 
> I am encountering a problem in a CDH4 environment.
> I can successfully run the MapReduce job in the Hadoop cluster. But when I
> migrated the same MapReduce job to my CDH4 environment, it produces an error
> stating that it cannot read the next block (each block is 64 MB). Why is that
> so?
> 
> Hadoop environment: hadoop 1.0.3
> java version 1.6
> 
> chd4 environment: CDH4.2.0
> java version 1.6
> 
> Regards,
> Anit Alexander


Re: Cloudera links and Document

2013-07-11 Thread Suresh Srinivas
Sathish, this mailing list is for Apache Hadoop related questions. Please post
questions related to other distributions to the appropriate vendor's mailing
list.



On Thu, Jul 11, 2013 at 6:28 AM, Sathish Kumar  wrote:

> Hi All,
>
> Can anyone point me to a link or document that explains the below?
>
> How does Cloudera Manager work and how does it handle the clusters (agent and
> master server)?
> How does the Cloudera Manager process flow work?
> Where can I locate the Cloudera configuration files, with a brief explanation?
>
>
> Regards
> Sathish
>
>


-- 
http://hortonworks.com/download/


Re: data loss after cluster wide power loss

2013-07-03 Thread Suresh Srinivas
On Wed, Jul 3, 2013 at 8:12 AM, Colin McCabe  wrote:

> On Mon, Jul 1, 2013 at 8:48 PM, Suresh Srinivas 
> wrote:
> > Dave,
> >
> > Thanks for the detailed email. Sorry I did not read all the details you
> had
> > sent earlier completely (on my phone). As you said, this is not related
> to
> > data loss related to HBase log and hsync. I think you are right; the
> rename
> > operation itself might not have hit the disk. I think we should either
> > ensure metadata operation is synced on the datanode or handle it being
> > reported as blockBeingWritten. Let me spend sometime to debug this issue.
>
> In theory, ext3 is journaled, so all metadata operations should be
> durable in the case of a power outage.  It is only data operations
> that should be possible to lose.  It is the same for ext4.  (Assuming
> you are not using nonstandard mount options.)
>

The ext3 journal may not hit the disk right away. From what I read, if you do not
specifically call sync, even the metadata operations do not hit the disk
immediately.

See - https://www.kernel.org/doc/Documentation/filesystems/ext3.txt

commit=nrsec(*) Ext3 can be told to sync all its data and metadata
every 'nrsec' seconds. The default value is 5 seconds.
This means that if you lose your power, you will lose
as much as the latest 5 seconds of work (your
filesystem will not be damaged though, thanks to the
journaling).  This default value (or any low value)
will hurt performance, but it's good for data-safety.
Setting it to 0 will have the same effect as leaving
it at the default (5 seconds).
Setting it to very large values will improve performance.


Re: HDFS file section rewrite

2013-07-02 Thread Suresh Srinivas
HDFS only supports regular writes and append. Random write is not
supported. I do not know of any feature/jira that is underway to support
this feature.


On Tue, Jul 2, 2013 at 9:01 AM, John Lilley wrote:

>  I’m sure this has been asked a zillion times, so please just point me to
> the JIRA comments: is there a feature underway to allow for re-writing of
> HDFS file sections?
>
> Thanks
>
> John
>
>



-- 
http://hortonworks.com/download/


Re: data loss after cluster wide power loss

2013-07-01 Thread Suresh Srinivas
Dave,

Thanks for the detailed email. Sorry, I did not read all the details you had
sent earlier completely (on my phone). As you said, this is not related to
data loss involving the HBase log and hsync. I think you are right; the rename
operation itself might not have hit the disk. I think we should either
ensure the metadata operation is synced on the datanode or handle it being
reported as blockBeingWritten. Let me spend some time debugging this issue.

One surprising thing is that all the replicas were reported as
blockBeingWritten.

Regards,
Suresh


On Mon, Jul 1, 2013 at 6:03 PM, Dave Latham  wrote:

> (Removing hbase list and adding hdfs-dev list as this is pretty internal
> stuff).
>
> Reading through the code a bit:
>
> FSDataOutputStream.close calls
> DFSOutputStream.close calls
> DFSOutputStream.closeInternal
>  - sets currentPacket.lastPacketInBlock = true
>  - then calls
> DFSOutputStream.flushInternal
>  - enqueues current packet
>  - waits for ack
>
> BlockReceiver.run
>  - if (lastPacketInBlock && !receiver.finalized) calls
> FSDataset.finalizeBlock calls
> FSDataset.finalizeBlockInternal calls
> FSVolume.addBlock calls
> FSDir.addBlock calls
> FSDir.addBlock
>  - renames block from "blocksBeingWritten" tmp dir to "current" dest dir
>
> This looks to me as I would expect a synchronous chain from a DFS client
> to moving the file from blocksBeingWritten to the current dir so that once
> the file is closed that it the block files would be in the proper directory
> - even if the contents of the file are still in the OS buffer rather than
> synced to disk.  It's only after this moving of blocks that
> NameNode.complete file is called.  There are several conditions and loops
> in there that I'm not certain this chain is fully reliable in all cases
> without a greater understanding of the code.
>
> Could it be the case that the rename operation itself is not synced and
> that ext3 lost the fact that the block files were moved?
> Or is there a bug in the close file logic that for some reason the block
> files are not always moved into place when a file is closed?
>
> Thanks for your patience,
> Dave
>
>
> On Mon, Jul 1, 2013 at 3:35 PM, Dave Latham  wrote:
>
>> Thanks for the response, Suresh.
>>
>> I'm not sure that I understand the details properly.  From my reading of
>> HDFS-744 the hsync API would allow a client to make sure that at any point
>> in time it's writes so far hit the disk.  For example, for HBase it could
>> apply a fsync after adding some edits to its WAL to ensure those edits are
>> fully durable for a file which is still open.
>>
>> However, in this case the dfs file was closed and even renamed.  Is it
>> the case that even after a dfs file is closed and renamed that the data
>> blocks would still not be synced and would still be stored by the datanode
>> in "blocksBeingWritten" rather than in "current"?  If that is case, would
>> it be better for the NameNode not to reject replicas that are in
>> blocksBeingWritten, especially if it doesn't have any other replicas
>> available?
>>
>> Dave
>>
>>
>> On Mon, Jul 1, 2013 at 3:16 PM, Suresh Srinivas 
>> wrote:
>>
>>> Yes this is a known issue.
>>>
>>> The HDFS part of this was addressed in
>>> https://issues.apache.org/jira/browse/HDFS-744 for 2.0.2-alpha and is
>>> not
>>> available in 1.x  release. I think HBase does not use this API yet.
>>>
>>>
>>> On Mon, Jul 1, 2013 at 3:00 PM, Dave Latham  wrote:
>>>
>>> > We're running HBase over HDFS 1.0.2 on about 1000 nodes.  On Saturday
>>> the
>>> > data center we were in had a total power failure and the cluster went
>>> down
>>> > hard.  When we brought it back up, HDFS reported 4 files as CORRUPT.
>>>  We
>>> > recovered the data in question from our secondary datacenter, but I'm
>>> > trying to understand what happened and whether this is a bug in HDFS
>>> that
>>> > should be fixed.
>>> >
>>> > From what I can tell the file was created and closed by the dfs client
>>> > (hbase).  Then HBase renamed it into a new directory and deleted some
>>> other
>>> > files containing the same data.  Then the cluster lost power.  After
>>> the
>>> > cluster was restarted, the datanodes reported into the namenode but the
>>> > blocks for this file appeared as "blocks being written" - the namenode
>>> > rejected them and the datanodes deleted the blocks.  At th

Re: data loss after cluster wide power loss

2013-07-01 Thread Suresh Srinivas
Yes, this is a known issue.

The HDFS part of this was addressed in
https://issues.apache.org/jira/browse/HDFS-744 for 2.0.2-alpha and is not
available in the 1.x release. I think HBase does not use this API yet.
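
(For reference, a minimal sketch of what using that API looks like against a
2.0.2-alpha or later cluster; the path and the records written are made up,
and hsync() is not available to 1.x clients.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HsyncSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("/tmp/wal-demo"));
        out.writeBytes("edit-1\n");
        out.hsync();   // ask each datanode in the pipeline to fsync this data to disk
        out.writeBytes("edit-2\n");
        out.hsync();
        out.close();
        fs.close();
    }
}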


On Mon, Jul 1, 2013 at 3:00 PM, Dave Latham  wrote:

> We're running HBase over HDFS 1.0.2 on about 1000 nodes.  On Saturday the
> data center we were in had a total power failure and the cluster went down
> hard.  When we brought it back up, HDFS reported 4 files as CORRUPT.  We
> recovered the data in question from our secondary datacenter, but I'm
> trying to understand what happened and whether this is a bug in HDFS that
> should be fixed.
>
> From what I can tell the file was created and closed by the dfs client
> (hbase).  Then HBase renamed it into a new directory and deleted some other
> files containing the same data.  Then the cluster lost power.  After the
> cluster was restarted, the datanodes reported into the namenode but the
> blocks for this file appeared as "blocks being written" - the namenode
> rejected them and the datanodes deleted the blocks.  At this point there
> were no replicas for the blocks and the files were marked CORRUPT.  The
> underlying file systems are ext3.  Some questions that I would love get
> answers for if anyone with deeper understanding of HDFS can chime in:
>
>  - Is this a known scenario where data loss is expected?  (I found
> HDFS-1539 but that seems different)
>  - When are blocks moved from blocksBeingWritten to current?  Does that
> happen before a file close operation is acknowledged to a hdfs client?
>  - Could it be that the DataNodes actually moved the blocks to current but
> after the restart ext3 rewound state somehow (forgive my ignorance of
> underlying file system behavior)?
>  - Is there any other explanation for how this can happen?
>
> Here is a sequence of selected relevant log lines from the RS (HBase
> Region Server) NN (NameNode) and DN (DataNode - 1 example of 3 in
> question).  It includes everything that mentions the block in question in
> the NameNode and one DataNode log.  Please let me know if this more
> information that would be helpful.
>
> RS 2013-06-29 11:16:06,812 DEBUG org.apache.hadoop.hbase.util.FSUtils:
> Creating
> file=hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
> with permission=rwxrwxrwx
> NN 2013-06-29 11:16:06,830 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.allocateBlock:
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c.
> blk_1395839728632046111_357084589
> DN 2013-06-29 11:16:06,832 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_1395839728632046111_357084589 src: /10.0.5.237:14327 dest: /
> 10.0.5.237:50010
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.1:50010 is added to
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.24:50010 is added to
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.0.5.237:50010 is added to
> blk_1395839728632046111_357084589 size 25418340
> DN 2013-06-29 11:16:11,385 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Received block
> blk_1395839728632046111_357084589 of size 25418340 from /10.0.5.237:14327
> DN 2013-06-29 11:16:11,385 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for
> block blk_1395839728632046111_357084589 terminating
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange:
> Removing lease on  file
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
> from client DFSClient_hb_rs_hs745,60020,1372470111932
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: DIR*
> NameSystem.completeFile: file
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
> is closed by DFSClient_hb_rs_hs745,60020,1372470111932
> RS 2013-06-29 11:16:11,393 INFO
> org.apache.hadoop.hbase.regionserver.Store: Renaming compacted file at
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
> to
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/n/6e0cc30af6e64e56ba5a539fdf159c4c
> RS 2013-06-29 11:16:11,505 INFO
> org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 7
> file(s) in n of
> users-6,\x12\xBDp\xA3,1359426311784.b5b0820cde759ae68e333b2f4015bb7e. into
> 6e0cc30af6e64e56ba5a539fdf159c4c, size=24.2m; total size for store is 24.2m
>
> ---  CRASH, RESTART -
>
> NN 2013-06-29 12:01:19,743 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: addStoredBlock request r

Re: HDFS upgrade under HA

2013-06-21 Thread Suresh Srinivas
I have a hard time understanding your question.



On Fri, Jun 21, 2013 at 12:16 AM, Azuryy Yu  wrote:

> Hi,
>
> The layout version is -43 until 2.0.4-alpha, but HDFS-4908 changed layout
> version to -45.
>
> so if My test cluster is running hadoop-2.0.4-alpha(-43), which is
> upgraded from hadoop-1.0.4, then I want to upgrade using trunk(-45), how to
> do?
>

The layout version can change by more than 1 when you upgrade to a new
release (it decreases, given that it is a negative number). So
2.0.4-alpha (-43) can be upgraded to the trunk (-45) version.


>
> It cannot upgrade under HA, so I can use hadoop-1.0.4 core-site and
> hdfs-site, then "start-dfs.sh -upgrade", but standby NN cannot upgrade,
>

What is preventing you from upgrading? Please add details on what you are
doing and what issue you are seeing.

Regards,
Suresh


Re: Please explain FSNamesystemState TotalLoad

2013-06-07 Thread Suresh Srinivas
On Fri, Jun 7, 2013 at 9:10 AM, Nick Niemeyer wrote:

>  Regarding TotalLoad, what would be normal operating tolerances per node
> for this metric?  When should one become concerned?  Thanks again to
> everyone participating in this community.  :)
>
>
Why do you want to be concerned? :) I have not seen many issues related to
high TotalLoad.

This is mainly useful for understanding how many concurrent jobs/file
accesses are happening and how busy the datanodes are. It is useful when you
are debugging issues where the cluster slows down due to overload, or when
correlating a slowdown with a run of big jobs. Knowing what it represents,
you will find many other uses as well.


>   From: Suresh Srinivas 
> Reply-To: "user@hadoop.apache.org" 
> Date: Thursday, June 6, 2013 4:14 PM
> To: "hdfs-u...@hadoop.apache.org" 
> Subject: Re: Please explain FSNamesystemState TotalLoad
>
>   It is the total number of transceivers (readers and writers) reported
> by all the datanodes. Datanode reports this count in periodic heartbeat to
> the namenode.
>
>
> On Thu, Jun 6, 2013 at 1:48 PM, Nick Niemeyer wrote:
>
>>   Can someone please explain what TotalLoad represents below?  Thanks
>> for your response in advance!
>>
>>  Version: hadoop-0.20-namenode-0.20.2+923.197-1
>>
>>  Example pulled from the output of via the name node:
>>   # curl -i http://localhost:50070/jmx
>>
>>  {
>> "name" : "hadoop:service=NameNode,name=FSNamesystemState",
>> "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
>> "CapacityTotal" : #,
>> "CapacityUsed" : #,
>> "CapacityRemaining" : #,
>>* "TotalLoad" : #,*
>> "BlocksTotal" : #,
>> "FilesTotal" : #,
>> "PendingReplicationBlocks" : 0,
>> "UnderReplicatedBlocks" : 0,
>> "ScheduledReplicationBlocks" : 0,
>> "FSState" : "Operational"
>>   }
>>
>>
>>  Thanks,
>> Nick
>>
>
>
>
>  --
> http://hortonworks.com/download/
>



-- 
http://hortonworks.com/download/


Re: Please explain FSNamesystemState TotalLoad

2013-06-06 Thread Suresh Srinivas
It is the total number of transceivers (readers and writers) reported by
all the datanodes. Each datanode reports this count in its periodic
heartbeat to the namenode.
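
(A minimal sketch of reading it programmatically from the same /jmx servlet
shown in the quoted mail below; the localhost:50070 address, the qry= filter
and the crude line matching are illustrative assumptions - a real JSON
parser would be cleaner, and older servlets may need the unfiltered /jmx
output.)

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class TotalLoadProbe {
    public static void main(String[] args) throws Exception {
        // Fetch the FSNamesystemState bean from the namenode's JMX JSON servlet.
        URL url = new URL("http://localhost:50070/jmx"
            + "?qry=Hadoop:service=NameNode,name=FSNamesystemState");
        BufferedReader in =
            new BufferedReader(new InputStreamReader(url.openStream()));
        String line;
        while ((line = in.readLine()) != null) {
            if (line.contains("\"TotalLoad\"")) {
                System.out.println(line.trim());   // e.g.  "TotalLoad" : 42,
            }
        }
        in.close();
    }
}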


On Thu, Jun 6, 2013 at 1:48 PM, Nick Niemeyer wrote:

>   Can someone please explain what TotalLoad represents below?  Thanks for
> your response in advance!
>
>  Version: hadoop-0.20-namenode-0.20.2+923.197-1
>
>  Example pulled from the output of via the name node:
>   # curl -i http://localhost:50070/jmx
>
>  {
> "name" : "hadoop:service=NameNode,name=FSNamesystemState",
> "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
> "CapacityTotal" : #,
> "CapacityUsed" : #,
> "CapacityRemaining" : #,
>* "TotalLoad" : #,*
> "BlocksTotal" : #,
> "FilesTotal" : #,
> "PendingReplicationBlocks" : 0,
> "UnderReplicatedBlocks" : 0,
> "ScheduledReplicationBlocks" : 0,
> "FSState" : "Operational"
>   }
>
>
>  Thanks,
> Nick
>



-- 
http://hortonworks.com/download/


Re: Management API

2013-06-06 Thread Suresh Srinivas
The namenode exposes all of this information over JMX. The same is also
exposed over HTTP, which you could use for your use case.

See http://<namenode-host>:<http-port>/jmx to list all the properties
exposed. Typically the HTTP port is 50070. You will find the information
you are looking for at:
http://<namenode-host>:<http-port>/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
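
(If you would rather stay inside the Java client API than fetch the JMX JSON
over HTTP, a hedged alternative that avoids both DFSClient and screen
scraping is DistributedFileSystem#getDataNodeStats(). A rough sketch - the
namenode address is an assumption, and on some setups the underlying
datanode report may require HDFS superuser privileges:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ListDatanodes {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumption: point the client at the remote namenode.
        conf.set("fs.default.name", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);
        if (fs instanceof DistributedFileSystem) {
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            for (DatanodeInfo node : dfs.getDataNodeStats()) {
                // One block per datanode: hostname plus the full text report.
                System.out.println(node.getHostName());
                System.out.println(node.getDatanodeReport());
            }
        }
        fs.close();
    }
}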


On Thu, Jun 6, 2013 at 9:17 AM, Brian Mason  wrote:

> Mostly looking for a list of data nodes.  I am making  web page to display
> some info about data nodes and I need a list.  I can screen scrape the JSP
> GUI, I was just hoping for something more elegant.
>
>
>
> On Thu, Jun 6, 2013 at 8:40 AM, John Lilley wrote:
>
>>  What resources are you trying to access?  
>>
>> Do you want to monitor the system status?
>>
>> Do you want to read/write HDFS as a client?  
>>
>> Do you want to run your application on the Hadoop cluster?  
>>
>> John
>>
>>
>> *From:* Brian Mason [mailto:br...@gabey.com]
>> *Sent:* Thursday, June 06, 2013 6:52 AM
>> *To:* user@hadoop.apache.org
>> *Subject:* Management API
>>
>>
>> I am looking for a way to access a list of Nodes, Compute, Data etc ..
>>  My application is not running on the name node.  It is remote.  The 2.0
>> Yarn API look like they may be useful, but I am not on 2.0 and cannot move
>> to 2,0 anytime soon.
>>
>>
>> DFSClient.java looks useful, but its not in the API docs so I am not sure
>> how to use it or even if I should.
>>
>> Any pointers would be helpful.  
>>
>>
>> Thanks,
>>
>
>


-- 
http://hortonworks.com/download/


Re: How to test the performance of NN?

2013-06-05 Thread Suresh Srinivas
What do you mean by it is not telling you anything about performance? Also,
I do not understand the part about "only about potential failures." Can you
add more details?

nnbench is the best microbenchmark for an NN performance test.


On Wed, Jun 5, 2013 at 3:17 PM, Mark Kerzner wrote:

> Hi,
>
> I am trying to create a more efficient namenode, and for that I need to
> the standard distribution, and then compare it to my version.
>
> Which benchmark should I run? I am doing nnbench, but it is not telling me
> anything about performance, only about potential failures.
>
> Thank you.
> Sincerely,
> Mark
>



-- 
http://hortonworks.com/download/


Re: cloudera4.2 source code ant

2013-05-17 Thread Suresh Srinivas
Folks, this is the Apache Hadoop mailing list. For vendor distro-related
questions, please use the appropriate vendor mailing list.

Sent from a mobile device

On May 17, 2013, at 2:06 AM, Kun Ling  wrote:

> Hi dylan,
> 
>  I have not build CDH source code using ant, However I have met a similar 
> dependencies resolve filed problem.
> 
> Acccording to my experience,   this is much like a package network 
> download issue. 
> 
> You may try to remove the .ivy2  and .m2   directories in your home 
> directory, and  run "ant clean; ant" to try again.
> 
> 
>Hope it is helpful to you.
> 
> 
> yours,
> Kun Ling 
> 
> 
> On Fri, May 17, 2013 at 4:42 PM, dylan  wrote:
>> hello, 
>> 
>>  there is a problem i can't resolved, i want to remote connect the 
>> hadoop ( cloudera cdh4.2.0 ) via eclipse plugin.There’s have no 
>> hadoop-eclipse-pluge.jar ,so i download the  hadoop of cdh4.2.0  tarbal and 
>> when i complie, the error is below:
>> 
>>  
>> 
>> ivy-resolve-common:
>> 
>> [ivy:resolve] :: resolving dependencies :: 
>> org.apache.hadoop#eclipse-plugin;working@master
>> 
>> [ivy:resolve]confs: [common]
>> 
>> [ivy:resolve]found commons-logging#commons-logging;1.1.1 in maven2
>> 
>> [ivy:resolve] :: resolution report :: resolve 5475ms :: artifacts dl 2ms
>> 
>>-
>> 
>>|  |modules||   artifacts   |
>> 
>>|   conf   | number| search|dwnlded|evicted|| number|dwnlded|
>> 
>>-
>> 
>>|  common  |   2   |   0   |   0   |   0   ||   1   |   0   |
>> 
>>-
>> 
>> [ivy:resolve] 
>> 
>> [ivy:resolve] :: problems summary ::
>> 
>> [ivy:resolve]  WARNINGS
>> 
>> [ivy:resolve]   ::
>> 
>> [ivy:resolve]   ::  UNRESOLVED DEPENDENCIES ::
>> 
>> [ivy:resolve]   ::
>> 
>> [ivy:resolve]   :: log4j#log4j;1.2.16: several problems occurred 
>> while resolving dependency: log4j#log4j;1.2.16 {common=[master]}:
>> 
>> [ivy:resolve]reactor-repo: unable to get resource for 
>> log4j#log4j;1.2.16: res=${reactor.repo}/log4j/log4j/1.2.16/log4j-1.2.16.pom: 
>> java.net.MalformedURLException: no protocol: 
>> ${reactor.repo}/log4j/log4j/1.2.16/log4j-1.2.16.pom
>> 
>> [ivy:resolve]reactor-repo: unable to get resource for 
>> log4j#log4j;1.2.16: res=${reactor.repo}/log4j/log4j/1.2.16/log4j-1.2.16.jar: 
>> java.net.MalformedURLException: no protocol: 
>> ${reactor.repo}/log4j/log4j/1.2.16/log4j-1.2.16.jar
>> 
>> [ivy:resolve]   ::
>> 
>> [ivy:resolve] 
>> 
>> [ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
>> 
>>  
>> 
>> BUILD FAILED
>> 
>> /home/paramiao/hadoop-2.0.0-mr1-cdh4.2.0/src/contrib/build-contrib.xml:440: 
>> impossible to resolve dependencies:
>> 
>>resolve failed - see output for details
>> 
>>  
>> 
>> so could someone tell me where i am wrong and how could make it success? 
>> 
>>  
>> 
>> best regards!
>> 
> 
> 
> 
> -- 
> http://www.lingcc.com


Re: CDH4 installation along with MRv1 from tarball

2013-03-20 Thread Suresh Srinivas
Can you guys please take this thread to the CDH mailing list?

Sent from phone

On Mar 20, 2013, at 2:48 PM, rohit sarewar  wrote:

> Hi Jens
> 
> These are not complete version of Hadoop.
> 1) hadoop-0.20-mapreduce-0.20.2+1341 (has only MRv1)
> 2) hadoop-2.0.0+922 (has HDFS+ Yarn)
> 
> I request you to read the comments in this link 
> https://issues.cloudera.org/browse/DISTRO-447
> 
> 
> 
> 
> 
> On Tue, Mar 19, 2013 at 1:17 PM, Jens Scheidtmann 
>  wrote:
>> Rohit,
>> 
>> What are you trying to achieve with two different complete versions of 
>> hadoop?
>> 
>> Thanks,
>> 
>> Jens
>> 
>> 
>> 
>> 2013/3/18 rohit sarewar 
>>> Need some guidance on CDH4 installation from tarballs
>>> 
>>> I have downloaded two files from " 
>>> https://ccp.cloudera.com/display/SUPPORT/CDH4+Downloadable+Tarballs "
>>> 
>>> 1) hadoop-0.20-mapreduce-0.20.2+1341 (has only MRv1)
>>> 2) hadoop-2.0.0+922 (has HDFS+ Yarn)
> 


Re: Regarding: Merging two hadoop clusters

2013-03-13 Thread Suresh Srinivas
> I have two different hadoop clusters in production. One cluster is used as
> backing for HBase and the other for other things. Both hadoop clusters are
> using the same version 1.0 and I want to merge them and make them one. I
> know, one possible solution is to copy the data across, but the data is
> really huge on these clusters and it will hard for me to compromise with
> huge downtime.
> Is there any optimal way to merge two hadoop clusters.
>

This is not a supported feature. Hence this activity would require
understanding low-level Hadoop details and quite a bit of hacking, and is
not straightforward. Copying the data across clusters is the simplest
solution.


Re: Hadoop cluster hangs on big hive job

2013-03-11 Thread Suresh Srinivas
I have seen one such problem related to big hive jobs that open a lot of
files. See HDFS-4496 for more details. Snippet from the description:
The following issue was observed in a cluster that was running a Hive job
and was writing to 100,000 temporary files (each task is writing to 1000s
of files). When this job is killed, a large number of files are left open
for write. Eventually when the lease for open files expires, lease recovery
is started for all these files in a very short duration of time. This
causes a large number of commitBlockSynchronization where logSync is
performed with the FSNamesystem lock held. This overloads the namenode
resulting in slowdown.

Could this be the cause? Can you check the namenode log to see if there is
lease recovery activity? If not, can you send some information about what is
happening in the namenode logs at the time of the slowdown?



On Mon, Mar 11, 2013 at 1:32 PM, Daning Wang  wrote:

> [hive@mr3-033 ~]$ hadoop version
> Hadoop 1.0.4
> Subversion
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> 1393290
> Compiled by hortonfo on Wed Oct  3 05:13:58 UTC 2012
>
>
> On Sun, Mar 10, 2013 at 8:16 AM, Suresh Srinivas 
> wrote:
>
>> What is the version of hadoop?
>>
>> Sent from phone
>>
>> On Mar 7, 2013, at 11:53 AM, Daning Wang  wrote:
>>
>> We have hive query processing zipped csv files. the query was scanning
>> for 10 days(partitioned by date). data for each day around 130G. The
>> problem is not consistent since if you run it again, it might go through.
>> but the problem has never happened on the smaller jobs(like processing only
>> one days data).
>>
>> We don't have space issue.
>>
>> I have attached log file when problem happening. it is stuck like
>> following(just search "19706 of 49964")
>>
>> 2013-03-05 15:13:51,587 INFO org.apache.hadoop.mapred.TaskTracker:
>> attempt_201302270947_0010_r_19_0 0.131468% reduce > copy (19706 of
>> 49964 at 0.00 MB/s) >
>> 2013-03-05 15:13:51,811 INFO org.apache.hadoop.mapred.TaskTracker:
>> attempt_201302270947_0010_r_39_0 0.131468% reduce > copy (19706 of
>> 49964 at 0.00 MB/s) >
>> 2013-03-05 15:13:52,551 INFO org.apache.hadoop.mapred.TaskTracker:
>> attempt_201302270947_0010_r_32_0 0.131468% reduce > copy (19706 of
>> 49964 at 0.00 MB/s) >
>> 2013-03-05 15:13:52,760 INFO org.apache.hadoop.mapred.TaskTracker:
>> attempt_201302270947_0010_r_00_0 0.131468% reduce > copy (19706 of
>> 49964 at 0.00 MB/s) >
>> 2013-03-05 15:13:52,946 INFO org.apache.hadoop.mapred.TaskTracker:
>> attempt_201302270947_0010_r_24_0 0.131468% reduce > copy (19706 of
>> 49964 at 0.00 MB/s) >
>> 2013-03-05 15:13:54,742 INFO org.apache.hadoop.mapred.TaskTracker:
>> attempt_201302270947_0010_r_08_0 0.131468% reduce > copy (19706 of
>> 49964 at 0.00 MB/s) >
>>
>> Thanks,
>>
>> Daning
>>
>>
>> On Thu, Mar 7, 2013 at 12:21 AM, Håvard Wahl Kongsgård <
>> haavard.kongsga...@gmail.com> wrote:
>>
>>> hadoop logs?
>>> On 6. mars 2013 21:04, "Daning Wang"  wrote:
>>>
>>>> We have 5 nodes cluster(Hadoop 1.0.4), It hung a couple of times while
>>>> running big jobs. Basically all the nodes are dead, from that
>>>> trasktracker's log looks it went into some kinds of loop forever.
>>>>
>>>> All the log entries like this when problem happened.
>>>>
>>>> Any idea how to debug the issue?
>>>>
>>>> Thanks in advance.
>>>>
>>>>
>>>> 2013-03-05 15:13:19,526 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> attempt_201302270947_0010_r_12_0 0.131468% reduce > copy (19706 of
>>>> 49964 at 0.00 MB/s) >
>>>> 2013-03-05 15:13:19,552 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> attempt_201302270947_0010_r_28_0 0.131468% reduce > copy (19706 of
>>>> 49964 at 0.00 MB/s) >
>>>> 2013-03-05 15:13:20,858 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> attempt_201302270947_0010_r_36_0 0.131468% reduce > copy (19706 of
>>>> 49964 at 0.00 MB/s) >
>>>> 2013-03-05 15:13:21,141 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> attempt_201302270947_0010_r_16_0 0.131468% reduce > copy (19706 of
>>>> 49964 at 0.00 MB/s) >
>>>> 2013-03-05 15:13:21,486 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> attempt_201302270947_0010_r_19_0 0.131468% reduce > copy (19706 of
>>>> 49964 at 0.00 MB/s) 

Re: Hadoop cluster hangs on big hive job

2013-03-10 Thread Suresh Srinivas
What is the version of hadoop?

Sent from phone

On Mar 7, 2013, at 11:53 AM, Daning Wang  wrote:

> We have hive query processing zipped csv files. the query was scanning for 10 
> days(partitioned by date). data for each day around 130G. The problem is not 
> consistent since if you run it again, it might go through. but the problem 
> has never happened on the smaller jobs(like processing only one days data).
> 
> We don't have space issue.
> 
> I have attached log file when problem happening. it is stuck like 
> following(just search "19706 of 49964")
> 
> 2013-03-05 15:13:51,587 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201302270947_0010_r_19_0 0.131468% reduce > copy (19706 of 49964 
> at 0.00 MB/s) >
> 2013-03-05 15:13:51,811 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201302270947_0010_r_39_0 0.131468% reduce > copy (19706 of 49964 
> at 0.00 MB/s) >
> 2013-03-05 15:13:52,551 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201302270947_0010_r_32_0 0.131468% reduce > copy (19706 of 49964 
> at 0.00 MB/s) >
> 2013-03-05 15:13:52,760 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201302270947_0010_r_00_0 0.131468% reduce > copy (19706 of 49964 
> at 0.00 MB/s) >
> 2013-03-05 15:13:52,946 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201302270947_0010_r_24_0 0.131468% reduce > copy (19706 of 49964 
> at 0.00 MB/s) >
> 2013-03-05 15:13:54,742 INFO org.apache.hadoop.mapred.TaskTracker: 
> attempt_201302270947_0010_r_08_0 0.131468% reduce > copy (19706 of 49964 
> at 0.00 MB/s) >
> 
> Thanks,
> 
> Daning
> 
> 
> On Thu, Mar 7, 2013 at 12:21 AM, Håvard Wahl Kongsgård 
>  wrote:
>> hadoop logs?
>> 
>> On 6. mars 2013 21:04, "Daning Wang"  wrote:
>>> We have 5 nodes cluster(Hadoop 1.0.4), It hung a couple of times while 
>>> running big jobs. Basically all the nodes are dead, from that 
>>> trasktracker's log looks it went into some kinds of loop forever.
>>> 
>>> All the log entries like this when problem happened.
>>> 
>>> Any idea how to debug the issue?
>>> 
>>> Thanks in advance.
>>> 
>>> 
>>> 2013-03-05 15:13:19,526 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_12_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:19,552 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_28_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:20,858 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_36_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:21,141 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_16_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:21,486 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_19_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:21,692 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_39_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:22,448 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_32_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:22,643 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_00_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:22,840 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_24_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:24,628 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_08_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:24,723 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_39_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:25,336 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_04_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:25,539 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_43_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:25,545 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_12_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:25,569 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_28_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:25,855 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_24_0 0.131468% reduce > copy (19706 of 
>>> 49964 at 0.00 MB/s) > 
>>> 2013-03-05 15:13:26,876 INFO org.apache.hadoop.mapred.TaskTracker: 
>>> attempt_201302270947_0010_r_

Re: [jira] [Commented] (HDFS-4533) start-dfs.sh ignored additional parameters besides -upgrade

2013-03-08 Thread Suresh Srinivas
Please follow up on the Jenkins failures. It looks like the patch was
generated from the wrong directory.


On Thu, Feb 28, 2013 at 1:34 AM, Azuryy Yu  wrote:

> Who can review this JIRA(https://issues.apache.org/jira/browse/HDFS-4533),
> which is very simple.
>
>
> -- Forwarded message --
> From: Hadoop QA (JIRA) 
> Date: Wed, Feb 27, 2013 at 4:53 PM
> Subject: [jira] [Commented] (HDFS-4533) start-dfs.sh ignored additional
> parameters besides -upgrade
> To: azury...@gmail.com
>
>
>
> [
> https://issues.apache.org/jira/browse/HDFS-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588130#comment-13588130]
>
> Hadoop QA commented on HDFS-4533:
> -
>
> {color:red}-1 overall{color}.  Here are the results of testing the latest
> attachment
>   http://issues.apache.org/jira/secure/attachment/12571164/HDFS-4533.patch
>   against trunk revision .
>
> {color:red}-1 patch{color}.  The patch command could not apply the
> patch.
>
> Console output:
> https://builds.apache.org/job/PreCommit-HDFS-Build/4008//console
>
> This message is automatically generated.
>
> > start-dfs.sh ignored additional parameters besides -upgrade
> > ---
> >
> > Key: HDFS-4533
> > URL: https://issues.apache.org/jira/browse/HDFS-4533
> > Project: Hadoop HDFS
> >  Issue Type: Bug
> >  Components: datanode, namenode
> >Affects Versions: 2.0.3-alpha
> >Reporter: Fengdong Yu
> >  Labels: patch
> > Fix For: 2.0.4-beta
> >
> > Attachments: HDFS-4533.patch
> >
> >
> > start-dfs.sh only takes -upgrade option and ignored others.
> > So If run the following command, it will ignore the clusterId option.
> > start-dfs.sh -upgrade -clusterId 1234
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>


-- 
http://hortonworks.com/download/


Re: How to setup Cloudera Hadoop to run everything on a localhost?

2013-03-05 Thread Suresh Srinivas
Folks, another gentle reminder: please use the Cloudera lists.


On Tue, Mar 5, 2013 at 2:56 PM, anton ashanin wrote:

> Do you run all Hadoop servers on a single host that gets IP by DHCP?
> What do you have in /etc/hosts?
>
> Thanks!
>
>
> On Wed, Mar 6, 2013 at 1:25 AM, yibing Shi <
> yibing@effectivemeasure.com> wrote:
>
>> Hi Anton,
>>
>> Cloudera manager needs fully qualified domain name. Run "hostname -f" to
>> check whether you have FQDN or not.
>>
>> I am not familiar with Ubuntu, but on my CentOS, I just put the FQDN into
>> /etc/sysconfig/network, which then looks like the following:
>> NETWORKING=yes
>> HOSTNAME=myhost.my.domain
>> GATEWAY=10.2.2.254
>>
>>
>> <http://demo.effectivemeasure.com/signatures/au/YibingShi.vcf>
>>
>>
>>
>> On Wed, Mar 6, 2013 at 8:14 AM, anton ashanin wrote:
>>
>>> I am at a loss. I have set an IP address that my node got by DHCP:
>>>  127.0.0.1   localhost
>>> 192.168.1.6node
>>>
>>> This has not helped. Cloudera Manager finds this host all right, but
>>> still can not get a "heartbeat" from it next.
>>> Maybe the problem is that at the moment of these experiments I have
>>> three laptops with addresses assigned by DHCP all running at once?
>>>
>>> To make Hadoop work I am ready now to switch Ubuntu for CentOS or should
>>> I try something else?
>>> Please let me know on what Linux version you have managed to run Hadoop
>>> on a local host only?
>>>
>>>
>>> On Tue, Mar 5, 2013 at 10:54 PM, Jean-Marc Spaggiari <
>>> jean-m...@spaggiari.org> wrote:
>>>
>>>> Hi Anton,
>>>>
>>>> Here is what my host is looking like:
>>>> 127.0.0.1   localhost
>>>> 192.168.1.2myserver
>>>>
>>>>
>>>> JM
>>>>
>>>> 2013/3/5 anton ashanin :
>>>> > Morgan,
>>>> > Just did exactly as you suggested, my /etc/hosts:
>>>> > 127.0.1.1 node.domain.local node
>>>> >
>>>> > Wiped out, annihilated my previous installation completely and
>>>> reinstalled
>>>> > everything from scratch.
>>>> > The same problem with CLOUDERA MANAGER (FREE EDITION):
>>>> > "Installation failed.  Failed to receive heartbeat from agent"
>>>> > 
>>>> >
>>>> > I will try now the the  bright idea from Jean, looks promising to me
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Mar 5, 2013 at 10:10 PM, Morgan Reece 
>>>> wrote:
>>>> >>
>>>> >> Don't use 'localhost' as your host name.  For example, if you wanted
>>>> to
>>>> >> use the name 'node'; add another line to your hosts file like:
>>>> >>
>>>> >> 127.0.1.1 node.domain.local node
>>>> >>
>>>> >> Then change all the host references in your configuration files to
>>>> 'node'
>>>> >> -- also, don't forget to change the master/slave files as well.
>>>> >>
>>>> >> Now, if you decide to use an external address it would need to be
>>>> static.
>>>> >> This is easy to do, just follow this guide
>>>> >> http://www.howtoforge.com/linux-basics-set-a-static-ip-on-ubuntu
>>>> >> and replace '127.0.1.1' with whatever external address you decide on.
>>>> >>
>>>> >>
>>>> >> On Tue, Mar 5, 2013 at 12:59 PM, Suresh Srinivas <
>>>> sur...@hortonworks.com>
>>>> >> wrote:
>>>> >>>
>>>> >>> Can you please take this Cloudera mailing list?
>>>> >>>
>>>> >>>
>>>> >>> On Tue, Mar 5, 2013 at 10:33 AM, anton ashanin <
>>>> anton.asha...@gmail.com>
>>>> >>> wrote:
>>>> >>>>
>>>> >>>> I am trying to run all Hadoop servers on a single Ubuntu
>>>> localhost. All
>>>> >>>> ports are open and my /etc/hosts file is
>>>> >>>>
>>>> >>>> 127.0.0.1   frigate frigate.domain.locallocalhost
>>>> >>>> # The following lines are desirable for IPv6 capable hosts
>>>> >>>> ::1 ip6-localhost ip6-loopback
>>>> >>>> fe00::0 ip6-localnet
>>>> >>>> ff00::0 ip6-mcastprefix
>>>> >>>> ff02::1 ip6-allnodes
>>>> >>>> ff02::2 ip6-allrouters
>>>> >>>>
>>>> >>>> When trying to install cluster Cloudera manager fails with the
>>>> following
>>>> >>>> messages:
>>>> >>>>
>>>> >>>> "Installation failed. Failed to receive heartbeat from agent".
>>>> >>>>
>>>> >>>> I run my Ubuntu-12.04 host from home connected by WiFi/dialup
>>>> modem to
>>>> >>>> my provider. What configuration is missing?
>>>> >>>>
>>>> >>>> Thanks!
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> http://hortonworks.com/download/
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>


-- 
http://hortonworks.com/download/


Re: How to setup Cloudera Hadoop to run everything on a localhost?

2013-03-05 Thread Suresh Srinivas
Can you please take this to the Cloudera mailing list?


On Tue, Mar 5, 2013 at 10:33 AM, anton ashanin wrote:

> I am trying to run all Hadoop servers on a single Ubuntu localhost. All
> ports are open and my /etc/hosts file is
>
> 127.0.0.1   frigate frigate.domain.locallocalhost
> # The following lines are desirable for IPv6 capable hosts
> ::1 ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
>
> When trying to install cluster Cloudera manager fails with the following
> messages:
>
> "Installation failed. Failed to receive heartbeat from agent".
>
> I run my Ubuntu-12.04 host from home connected by WiFi/dialup modem to my
> provider. What configuration is missing?
>
> Thanks!
>



-- 
http://hortonworks.com/download/


Re: QJM HA and ClusterID

2013-02-26 Thread Suresh Srinivas
It looks like start-dfs.sh has a bug: it only takes the -upgrade option and
ignores -clusterId.

Consider running the command directly (which is what start-dfs.sh ends up
invoking):
bin/hdfs namenode -upgrade -clusterId <cluster-id>

Please file a bug, if you can, for the start-dfs.sh issue of ignoring
additional parameters.


On Tue, Feb 26, 2013 at 4:50 PM, Azuryy Yu  wrote:

> Anybody here? Thanks!
>
>
> On Tue, Feb 26, 2013 at 9:57 AM, Azuryy Yu  wrote:
>
>> Hi all,
>> I've stay on this question several days. I want upgrade my cluster from
>> hadoop-1.0.3 to hadoop-2.0.3-alpha, I've configured QJM successfully.
>>
>> How to customize clusterID by myself. It generated a random clusterID now.
>>
>> It doesn't work when I run:
>>
>> start-dfs.sh -upgrade -clusterId 12345-test
>>
>> Thanks!
>>
>>
>


-- 
http://hortonworks.com/download/


No standup today - I, Nicholas and Brandon are out

2013-02-12 Thread Suresh Srinivas
-- 
http://hortonworks.com/download/


Re: "Hive Metastore DB Issue ( Cloudera CDH4.1.2 MRv1 with hive-0.9.0-cdh4.1.2)"

2013-02-07 Thread Suresh Srinivas
Please only use the CDH mailing list and do not copy this to hdfs-user.


On Thu, Feb 7, 2013 at 7:20 AM, samir das mohapatra  wrote:

> Any Suggestion...
>
>
> On Thu, Feb 7, 2013 at 4:17 PM, samir das mohapatra <
> samir.help...@gmail.com> wrote:
>
>> Hi All,
>>   I could not see the hive meta  store DB under Mysql  database Under
>> mysql user hadoop.
>>
>> Example:
>>
>> $>  mysql –u root -p
>>  $> Add hadoop user (CREATE USER ‘hadoop'@'localhost' IDENTIFIED BY ‘
>> hadoop';)
>>  $>GRANT ALL ON *.* TO ‘hadoop'@‘% IDENTIFIED BY ‘hadoop’
>>  $> Example (GRANT ALL PRIVILEGES ON *.* TO 'hadoop'@'localhost'
>> IDENTIFIED BY 'hadoop' WITH GRANT OPTION;)
>>
>> Below is the configuration I am following:
>> <configuration>
>>
>> <property>
>> <name>javax.jdo.option.ConnectionURL</name>
>> <value>jdbc:mysql://localhost:3306/hadoop?createDatabaseIfNotExist=true</value>
>> </property>
>>
>> <property>
>> <name>javax.jdo.option.ConnectionDriverName</name>
>> <value>com.mysql.jdbc.Driver</value>
>> </property>
>>
>> <property>
>>   <name>javax.jdo.option.ConnectionUserName</name>
>>   <value>hadoop</value>
>> </property>
>>
>> <property>
>>   <name>javax.jdo.option.ConnectionPassword</name>
>>   <value>hadoop</value>
>> </property>
>> </configuration>
>>
>>
>>  Note: Previously i was using cdh3 it was perfectly creating under mysql
>> metastore DB but when i changed cdh3 to cdh4.1.2 with hive as above subject
>> line , It is not creating.
>>
>>
>> Any suggestiong..
>>
>> Regrads,
>> samir.
>>
>
>


-- 
http://hortonworks.com/download/


Re: Advice on post mortem of data loss (v 1.0.3)

2013-02-05 Thread Suresh Srinivas
Sorry to hear you are having issues. A few questions and comments inline.


On Fri, Feb 1, 2013 at 8:40 AM, Peter Sheridan <
psheri...@millennialmedia.com> wrote:

>  Yesterday, I bounced my DFS cluster.  We realized that "ulimit –u" was,
> in extreme cases, preventing the name node from creating threads.  This had
> only started occurring within the last day or so.  When I brought the name
> node back up, it had essentially been rolled back by one week, and I lost
> all changes which had been made since then.
>
>  There are a few other factors to consider.
>
>1. I had 3 locations for dfs.name.dir — one local and two NFS.  (I
>originally thought this was 2 local and one NFS when I set it up.)  On
>1/24, the day which we effectively rolled back to, the second NFS mount
>started showing as FAILED on dfshealth.jsp.  We had seen this before
>without issue, so I didn't consider it critical.
>
> What do you mean by "rolled back to"?
My understanding so far is that you have three dirs: l1, nfs1 and nfs2 (l
for local disk and nfs for NFS). nfs2 was shown as failed.

>
>1. When I brought the name node back up, because of discovering the
>above, I had changed dfs.name.dir to 2 local drives and one NFS, excluding
>the one which had failed.
>
> When you brought the namenode back up, with the changed configuration you
have l1, l2 and nfs1. Given you have not seen any failures, l1 and nfs1
have the latest edits so far. Correct? How did you add l2? Can you describe
this procedure in detail?


> Reviewing the name node log from the day with the NFS outage, I see:
>

When you say NFS outage here, this is the failure corresponding to nfs2
from above. Is that correct?


>
>  2013-01-24 16:33:11,794 ERROR
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Unable to sync edit
> log.
> java.io.IOException: Input/output error
> at sun.nio.ch.FileChannelImpl.force0(Native Method)
> at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:348)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog$EditLogFileOutputStream.flushAndSync(FSEditLog.java:215)
> at
> org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:89)
> at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:1015)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1666)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:718)
> at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
> 2013-01-24 16:33:11,794 WARN org.apache.hadoop.hdfs.server.common.Storage:
> Removing storage dir /rdisks/xx
>
>
>  Unfortunately, since I wasn't expecting anything terrible to happen, I
> didn't look too closely at the file system while the name node was down.
>  When I brought it up, the time stamp on the previous checkpoint directory
> in the dfs.name.dir was right around the above error message.  The current
> directory basically had an fsimage and an empty edits log with the current
> time stamps.
>

Which storage directory are you talking about here?

>
>  So: what happened?  Should this failure have led to my data loss?  I
> would have thought the local directory would be fine in this scenario.  Did
> I have any other options for data recovery?
>

I am not sure how you concluded that you lost a week's data and that the
namenode rolled back by one week. Please share the namenode logs
corresponding to the restart.

This is how it should have worked.
- When nfs2 was removed, on both l1 and nfs1 a timestamp is recorded,
corresponding to removal of a storage directory.
- If there is any checkpointing that happened, it would have also
incremented the timestamp.
- When the namenode starts up, it chooses l1 and nfs1 because the recorded
timestamp is the latest on these directories and loads fsimage and edits
from those directories. Namenode also performs checkpoint and writes new
consolidated image on l1, l2 and nfs1 and creates empty editlog on l1, l2
and nfs1.

If you provide more details on how l2 was added, we may be able to
understand what happened.

Regards,
Suresh


-- 
http://hortonworks.com/download/


Re: Application of Cloudera Hadoop for Dataset analysis

2013-02-05 Thread Suresh Srinivas
Please take this thread to the CDH mailing list.


On Tue, Feb 5, 2013 at 2:43 AM, Sharath Chandra Guntuku <
sharathchandr...@gmail.com> wrote:

> Hi,
>
> I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I
> would like to get the following clarifications regarding cloudera hadoop
> distribution. I am using a CDH4 Demo VM for now.
>
> 1. After I upload the files into the file browser, if I have to link
> two-three datasets using a key in those files, what should I do? Do I have
> to run a query over them?
>
> 2. My objective is that I have some data collected over a few years and
> now, I would like to link all of them, as in a database using keys and then
> run queries over them to find out particular patterns. Later I would like
> to implement some Machine learning algorithms on them for predictive
> analysis. Will this be possible on the demo VM?
>
> I am totally new to this. Can I get some help on this? I would be very
> grateful for the same.
>
>
> --
> Thanks and Regards,
> *Sharath Chandra Guntuku*
> Undergraduate Student (Final Year)
> *Computer Science Department*
> *Email*: f2009...@hyderabad.bits-pilani.ac.in
>
> *BITS-Pilani*, Hyderabad Campus
> Jawahar Nagar, Shameerpet, RR Dist,
> Hyderabad - 500078, Andhra Pradesh
>



-- 
http://hortonworks.com/download/


Re: Using distcp with Hadoop HA

2013-01-29 Thread Suresh Srinivas
Currently, as you have pointed out, client-side configuration-based
failover is used in an HA setup. The configuration must define namenode
addresses for the nameservices of both clusters. Are the datanodes
belonging to the two clusters running on the same set of nodes? Can you
share the configuration you are using, so we can diagnose the problem?
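
(For reference, a hedged sketch of the client-side settings involved, here
set programmatically so the sketch is self-contained; the nameservice IDs,
hosts and ports are assumptions, and in practice these belong in the
hdfs-site.xml that distcp picks up rather than in code. clusterA's own
addresses are assumed to already be present in the loaded configuration.)

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoteHaClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.nameservices", "clusterA,clusterB");
        // Remote cluster B: both namenodes plus the client failover proxy provider.
        conf.set("dfs.ha.namenodes.clusterB", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.clusterB.nn1", "nnB1.example.com:8020");
        conf.set("dfs.namenode.rpc-address.clusterB.nn2", "nnB2.example.com:8020");
        conf.set("dfs.client.failover.proxy.provider.clusterB",
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        // The client now resolves hdfs://clusterB/ and fails over between nn1 and nn2.
        FileSystem remote = FileSystem.get(URI.create("hdfs://clusterB/"), conf);
        System.out.println(remote.getFileStatus(new Path("/")));
        remote.close();
    }
}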

- I am trying to do a distcp from cluster A to cluster B. Since no
> operations are supported on the standby namenode, I need to specify either
> the active namenode while using distcp or use the failover proxy provider
> (dfs.client.failover.proxy.provider.clusterA) where I can specify the two
> namenodes for cluster B and the failover code inside HDFS will figure it
> out.
>


> - If I use the failover proxy provider, some of my datanodes on cluster A
> would connect to the namenode on cluster B and vice versa. I am assuming
> that is because I have configured both nameservices in my hdfs-site.xml for
> distcp to work.. I have configured dfs.nameservice.id to be the right one
> but the datanodes do not seem to respect that.
>
> What is the best way to use distcp with Hadoop HA configuration without
> having the datanodes to connect to the remote namenode? Thanks
>
> Regards,
> Dhaval
>



-- 
http://hortonworks.com/download/


Re: ClientProtocol Version mismatch. (client = 69, server = 1)

2013-01-29 Thread Suresh Srinivas
Please take this up on the CDH mailing list. Most likely you are using a
client that is not from the 2.0 release of Hadoop.


On Tue, Jan 29, 2013 at 12:33 PM, Kim Chew  wrote:

> I am using CDH4 (2.0.0-mr1-cdh4.1.2) vm running on my mbp.
>
> I was trying to invoke a remote method in the ClientProtocol via RPC,
> however I am getting this exception.
>
> 2013-01-29 11:20:45,810 ERROR
> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:training (auth:SIMPLE)
> cause:org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol
> org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch.
> (client = 69, server = 1)
> 2013-01-29 11:20:45,810 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 6 on 8020, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from
> 192.168.140.1:50597: error: org.apache.hadoop.ipc.RPC$VersionMismatch:
> Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version
> mismatch. (client = 69, server = 1)
> org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol
> org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch.
> (client = 69, server = 1)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.getProtocolImpl(ProtobufRpcEngine.java:400)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:435)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
>
> I could understand if the Server's ClientProtocol has version number
> "60" or something else, but how could it has a version number of "1"?
>
> Thanks.
>
> Kim
>



-- 
http://hortonworks.com/download/


Re: Cohesion of Hadoop team?

2013-01-18 Thread Suresh Srinivas
On Fri, Jan 18, 2013 at 6:48 AM, Glen Mazza  wrote:

>  Hi, looking at the derivation of the 0.23.x & 2.0.x branches on one hand,
> and the 1.x branches on the other, as described here:
>
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201301.mbox/%3CCD0CAB8B.1098F%25evans%40yahoo-inc.com%3E
>
> One gets the impression the Hadoop committers are split into two teams,
> with one team working on 0.23.x/2.0.2 and another team working on 1.x,
> running the risk of increasingly diverging products eventually competing
> with each other.  Is that the case?
>

I am not sure how you came to this conclusion. The way I see it, all the
folks are working on trunk. A subset of this work from trunk is pushed to
older releases such as 1.x or 0.23.x. In Apache Hadoop, features always go
to trunk first before going to any older release such as 1.x or 0.23.x.
That means trunk is a superset of all the features.

Is there expected to be a Hadoop 3.0 where the results of the two lines of
> development will merge or is it increasingly likely the subteams will
> continue their separate routes?
>

2.0.3-alpha, the latest release based off of trunk and now in the final
stage of completion, should have all the features that the other releases
have. Let me know if there are any exceptions to this that you know of.


>
> Thanks,
> Glen
>
> --
> Glen Mazza
> Talend Community Coders - coders.talend.com
> blog: www.jroller.com/gmazza
>
>


-- 
http://hortonworks.com/download/


Re: how to start hadoop 1.0.4 backup node?

2012-12-28 Thread Suresh Srinivas
This is a documentation bug. The Backup node is not available in the 1.x
release; it is available in the 0.23 and 2.x releases. Please create a bug
to point the 1.x documents to the right set of docs.

Sent from a mobile device

On Dec 28, 2012, at 7:13 PM, 周梦想  wrote:

> http://hadoop.apache.org/docs/r1.0.4/hdfs_user_guide.html#Backup+Node
> 
> the document write:
> The Backup node is configured in the same manner as the Checkpoint node. It 
> is started with bin/hdfs namenode -checkpoint
> 
> but hadoop 1.0.4 there is no hdfs file:
> [zhouhh@Hadoop48 hadoop-1.0.4]$ ls bin
> hadoophadoop-daemons.sh  start-all.sh   
> start-jobhistoryserver.sh  stop-balancer.sh  stop-mapred.sh
> hadoop-config.sh  rccstart-balancer.sh  start-mapred.sh   
>  stop-dfs.sh   task-controller
> hadoop-daemon.sh  slaves.sh  start-dfs.sh   stop-all.sh   
>  stop-jobhistoryserver.sh
> 
> 
> [zhouhh@Hadoop48 hadoop-1.0.4]$ find . -name hdfs
> ./webapps/hdfs
> ./src/webapps/hdfs
> ./src/test/org/apache/hadoop/hdfs
> ./src/test/system/aop/org/apache/hadoop/hdfs
> ./src/test/system/java/org/apache/hadoop/hdfs
> ./src/hdfs
> ./src/hdfs/org/apache/hadoop/hdfs
> 
> 
> thanks!
> Andy


Re: patch pre-commit help in branch-1

2012-12-13 Thread Suresh Srinivas
Currently, pre-commit jobs are run only for trunk patches. For branch-1,
along with the patch, the contributor posts the test-patch results and the
unit test results.


On Thu, Dec 13, 2012 at 7:33 AM, pengwenwu2008 wrote:

> Hi all,
>
> I am relatively new to Hadoop and want to do pre-commit in branch-1 before
> check patch into community,
> however, there is no pre-commit job in community jenkins.
> Could anyone have any good suggestion or community jenkins can help?
>
> Thanks in advance!!
>
> Regards,
> Wenwu,Peng
>
>
>


-- 
http://hortonworks.com/download/


Re: What is the difference between the branch-1 and branch-1-win

2012-12-12 Thread Suresh Srinivas
branch-1 is the branch where the latest 1.x work is in progress. This is
the branch from which future 1.x releases (such as 1.2.0) will come.
branch-1-win was branched off of branch-1 around the 1.0.3 release
timeframe. It will eventually be merged into a future 1.x release when it
is ready.


On Wed, Dec 12, 2012 at 10:22 PM, pengwenwu2008 wrote:

> Hi all,
>
> Could you help me What is the difference between the branch-1 and
> branch-1-win ?
>
> Regards,
> Wenwu,Peng
>
>
>


-- 
http://hortonworks.com/download/


Re: Is there an additional overhead when storing data in HDFS?

2012-11-20 Thread Suresh Srinivas
The namenode will have a trivial amount of data stored in its
journal/fsimage.

On Tue, Nov 20, 2012 at 11:21 PM, WangRamon  wrote:

> Thanks, besides the checksum data is there anything else? Data in name
> node?
>
> --
> Date: Tue, 20 Nov 2012 23:14:06 -0800
> Subject: Re: Is there an additional overhead when storing data in HDFS?
> From: sur...@hortonworks.com
> To: user@hadoop.apache.org
>
>
> HDFS uses 4GB for the file + checksum data.
>
> Default is for every 512 bytes of data, 4 bytes of checksum are stored. In
> this case additional 32MB data.
>
> On Tue, Nov 20, 2012 at 11:00 PM, WangRamon wrote:
>
> Hi All
>
> I'm wondering if there is an additional overhead when storing some data
> into HDFS? For example, I have a 2GB file, the replicate factor of HDSF is
> 2, when the file is uploaded to HDFS, should HDFS use 4GB to store it or
> more then 4GB to store it? If it takes more than 4GB space, why?
>
> Thanks
> Ramon
>
>
>
>
> --
> http://hortonworks.com/download/
>
>


-- 
http://hortonworks.com/download/


Re: Is there an additional overhead when storing data in HDFS?

2012-11-20 Thread Suresh Srinivas
HDFS uses 4GB for the file, plus checksum data.

By default, for every 512 bytes of data, 4 bytes of checksum are stored. In
this case that is an additional ~32MB of data.
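
As a rough worked example with those defaults: each 2GB replica is about
4.2 million 512-byte chunks, each adding 4 bytes of checksum, i.e. roughly
16MB of checksum per replica, or about 32MB across the two replicas on top
of the 4GB of block data.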

On Tue, Nov 20, 2012 at 11:00 PM, WangRamon  wrote:

> Hi All
>
> I'm wondering if there is an additional overhead when storing some data
> into HDFS? For example, I have a 2GB file, the replicate factor of HDSF is
> 2, when the file is uploaded to HDFS, should HDFS use 4GB to store it or
> more then 4GB to store it? If it takes more than 4GB space, why?
>
> Thanks
> Ramon
>



-- 
http://hortonworks.com/download/


Re: High Availability - second namenode (master2) issue: Incompatible namespaceIDs

2012-11-16 Thread Suresh Srinivas
Vinay, if the Hadoop docs are not clear in this regard, can you please
create a jira to add these details?

On Fri, Nov 16, 2012 at 12:31 AM, Vinayakumar B wrote:

> Hi,
>
> If you are moving from non-HA (single master) to HA, then follow the
> steps below.
>
> 1. Configure the other namenode's configuration in the running namenode
> and in all datanode configurations, and configure the logical
> fs.defaultFS.
>
> 2. Configure the shared-storage related configuration.
>
> 3. Stop the running NameNode and all datanodes.
>
> 4. Execute 'hdfs namenode -initializeSharedEdits' from the existing
> namenode installation, to transfer the edits to shared storage.
>
> 5. Format zkfc using 'hdfs zkfc -formatZK' and start zkfc using
> 'hadoop-daemon.sh start zkfc'.
>
> 6. Restart the namenode from the existing installation. If all
> configurations are fine, the NameNode should start successfully as
> STANDBY, and zkfc will make it ACTIVE.
>
> 7. Install the NameNode on another machine (master2) with the same
> configuration, except 'dfs.ha.namenode.id'.
>
> 8. Instead of formatting, copy the name dir contents from the first
> namenode (master1) to master2's name dir. For this you have 2 options:
>
> a. Execute 'hdfs namenode -bootstrapStandby' from the master2
> installation.
>
> b. Using 'scp', copy the entire contents of the name dir from master1
> to master2's name dir.
>
> 9. Start the zkfc for the second namenode (no need to do a zkfc format
> now). Also start the namenode (master2).
>
> Regards,
>
> Vinay-
>
> *From:* Uma Maheswara Rao G [mailto:mahesw...@huawei.com]
> *Sent:* Friday, November 16, 2012 1:26 PM
> *To:* user@hadoop.apache.org
> *Subject:* RE: High Availability - second namenode (master2) issue:
> Incompatible namespaceIDs
>
>
> If you format namenode, you need to cleanup storage directories of
> DataNode as well if that is having some data already. DN also will have
> namespace ID saved and compared with NN namespaceID. if you format NN, then
> namespaceID will be changed and DN may have still older namespaceID. So,
> just cleaning the data in DN would be fine.
>
>  
>
> Regards,
>
> Uma
> --
>
> *From:* hadoop hive [hadooph...@gmail.com]
> *Sent:* Friday, November 16, 2012 1:15 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: High Availability - second namenode (master2) issue:
> Incompatible namespaceIDs
>
> Seems like you haven't formatted your cluster (if it was newly created). 
>
> On Fri, Nov 16, 2012 at 9:58 AM, a...@hsk.hk  wrote:
>
> Hi, 
>
>
> Please help!
>
>
> I have installed a Hadoop Cluster with a single master (master1) and have
> HBase running on the HDFS.  Now I am setting up the second master
>  (master2) in order to form HA.  When I used JPS to check the cluster, I
> found :
>
>
> 2782 Jps
>
> 2126 NameNode
>
> 2720 SecondaryNameNode
>
> i.e. The datanode on this server could not be started
>
>
> In the log file, found: 
>
> 2012-11-16 10:28:44,851 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
> Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data: namenode namespaceID
> = 1356148070; datanode namespaceID = 1151604993
>
>
> One of the possible solutions to fix this issue is to:  stop the cluster,
> reformat the NameNode, restart the cluster.
>
> QUESTION: As I already have HBASE running on the cluster, if I reformat
> the NameNode, do I need to reinstall the entire HBASE? I don't mind to have
> all data lost as I don't have many data in HBASE and HDFS, however I don't
> want to re-install HBASE again.
>
>
> On the other hand, I have tried another solution: stop the DataNode, edit
> the namespaceID in current/VERSION (i.e. set namespaceID=1151604993),
> restart the datanode, it doesn't work:
>
> Warning: $HADOOP_HOME is deprecated.
>
> starting master2, logging to
> /usr/local/hadoop-1.0.4/libexec/../logs/hadoop-hduser-master2-master2.out
>
> Exception in thread "main" java.lang.NoClassDefFoundError: master2
>
> Caused by: java.lang.ClassNotFoundException: master2
>
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>
> Could not find the main class: master2.  Program will exit.
>
> QUESTION: Any other solutions?
>
>
> Thanks
>

Re: could only be replicated to 0 nodes, instead of 1

2012-09-04 Thread Suresh Srinivas
Keith,

Assuming that you were seeing the problem when you captured the namenode
web UI info, it is not related to what I suspected. This might be a good
question for the CDH forums, given that this is not an Apache release.

Regards,
Suresh

On Tue, Sep 4, 2012 at 10:20 AM, Keith Wiley  wrote:

> On Sep 4, 2012, at 10:05 , Suresh Srinivas wrote:
>
> > When these errors are thrown, please send the namenode web UI
> information. It has storage related information in the cluster summary.
> That will help debug.
>
> Sure thing.  Thanks.  Here's what I currently see.  It looks like the
> problem isn't the datanode, but rather the namenode.  Would you agree with
> that assessment?
>
> NameNode 'localhost:9000'
>
> Started: Tue Sep 04 10:06:52 PDT 2012
> Version: 0.20.2-cdh3u3, 03b655719d13929bd68bb2c2f9cee615b389cea9
> Compiled: Thu Jan 26 11:55:16 PST 2012 by root from Unknown
> Upgrades: There are no upgrades in progress.
>
> Browse the filesystem
> Namenode Logs
> Cluster Summary
>
> Safe mode is ON. Resources are low on NN. Safe mode must be turned off
> manually.
> 1639 files and directories, 585 blocks = 2224 total. Heap Size is 39.55 MB
> / 888.94 MB (4%)
> Configured Capacity  :   49.21 GB
> DFS Used :   9.9 MB
> Non DFS Used :   2.68 GB
> DFS Remaining:   46.53 GB
> DFS Used%:   0.02 %
> DFS Remaining%   :   94.54 %
> Live Nodes   :   1
> Dead Nodes   :   0
> Decommissioning Nodes:   0
> Number of Under-Replicated Blocks:   5
>
> NameNode Storage:
>
> Storage Directory   TypeState
> /var/lib/hadoop-0.20/cache/hadoop/dfs/name  IMAGE_AND_EDITS Active
>
> Cloudera's Distribution including Apache Hadoop, 2012.
>
>
> 
> Keith Wiley kwi...@keithwiley.com keithwiley.com
> music.keithwiley.com
>
> "And what if we picked the wrong religion?  Every week, we're just making
> God
> madder and madder!"
>--  Homer Simpson
>
> 
>
>


-- 
http://hortonworks.com/download/


Re: could only be replicated to 0 nodes, instead of 1

2012-09-04 Thread Suresh Srinivas
- A DataNode typically keeps up to 5 blocks' worth of space (HDFS block
size) free.
- Disk space is also used by MapReduce jobs to store temporary shuffle
spills. This is what "dfs.datanode.du.reserved" is used to configure. The
configuration goes in hdfs-site.xml; if you have not configured it, the
reserved space is 0. Besides MapReduce, other files might also take up the
disk space.
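
A hedged example of that setting: the value is in bytes (the 10 GB figure below
is purely illustrative) and goes into hdfs-site.xml on each DataNode, which then
needs a restart to pick it up:

  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>10737418240</value>
    <!-- reserve roughly 10 GB per volume for non-DFS use -->
  </property>
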

When these errors are thrown, please send the namenode web UI information.
It has storage related information in the cluster summary. That will help
debug.
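
For what it's worth, a similar storage summary can also be captured from the
command line, which may be easier to paste than the web UI (a minimal sketch,
run from a node with the Hadoop client configured):

  hadoop dfsadmin -report   # prints cluster and per-DataNode capacity,
                            # DFS used and remaining space
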


On Tue, Sep 4, 2012 at 9:41 AM, Keith Wiley  wrote:

> I've been running up against the good old fashioned "replicated to 0
> nodes" gremlin quite a bit recently.  My system (a set of processes
> interacting with hadoop, and of course hadoop itself) runs for a while (a
> day or so) and then I get plagued with these errors.  This is a very simple
> system, a single node running pseudo-distributed.  Obviously, the
> replication factor is implicitly 1 and the datanode is the same machine as
> the namenode.  None of the typical culprits seem to explain the situation
> and I'm not sure what to do.  I'm also not sure how I'm getting around it
> so far.  I fiddle desperately for a few hours and things start running
> again, but that's not really a solution...I've tried stopping and
> restarting hdfs, but that doesn't seem to improve things.
>
> So, to go through the common suspects one by one, as quoted on
> http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo:
>
> • No DataNode instances being up and running. Action: look at the servers,
> see if the processes are running.
>
> I can interact with hdfs through the command line (doing directory
> listings for example).  Furthermore, I can see that the relevant java
> processes are all running (NameNode, SecondaryNameNode, DataNode,
> JobTracker, TaskTracker).
>
> • The DataNode instances cannot talk to the server, through networking or
> Hadoop configuration problems. Action: look at the logs of one of the
> DataNodes.
>
> Obviously irrelevant in a single-node scenario.  Anyway, like I said, I
> can perform basic hdfs listings, I just can't upload new data.
>
> • Your DataNode instances have no hard disk space in their configured data
> directories. Action: look at the dfs.data.dir list in the node
> configurations, verify that at least one of the directories exists, and is
> writeable by the user running the Hadoop processes. Then look at the logs.
>
> There's plenty of space, at least 50GB.
>
> • Your DataNode instances have run out of space. Look at the disk capacity
> via the Namenode web pages. Delete old files. Compress under-used files.
> Buy more disks for existing servers (if there is room), upgrade the
> existing servers to bigger drives, or add some more servers.
>
> Nope, 50GBs free, I'm only uploading a few KB at a time, maybe a few MB.
>
> • The reserved space for a DN (as set in dfs.datanode.du.reserved) is
> greater than the remaining free space, so the DN thinks it has no free space
>
> I grepped all the files in the conf directory and couldn't find this
> parameter so I don't really know anything about it.  At any rate, it seems
> rather esoteric, I doubt it is related to my problem.  Any thoughts on this?
>
> • You may also get this message due to permissions, eg if JT can not
> create jobtracker.info on startup.
>
> Meh, like I said, the system basically works...and then stops working.
>  The only explanation that would really make sense in that context is
> running out of space...which isn't happening. If this were a permission
> error, or a configuration error, or anything weird like that, then the
> whole system would never get up and running in the first place.
>
> Why would a properly running hadoop system start exhibiting this error
> without running out of disk space?  THAT's the real question on the table
> here.
>
> Any ideas?
>
>
> 
> Keith Wiley kwi...@keithwiley.com keithwiley.com
> music.keithwiley.com
>
> "Yet mark his perfect self-contentment, and hence learn his lesson, that
> to be
> self-contented is to be vile and ignorant, and that to aspire is better
> than to
> be blindly and impotently happy."
>--  Edwin A. Abbott, Flatland
>
> 
>
>


-- 
http://hortonworks.com/download/


Re: CDH4 eclipse(juno) yarn plugin

2012-08-13 Thread Suresh Srinivas
Can you please post these messages to the CDH mailing lists?

Regards,
Suresh

On Mon, Aug 13, 2012 at 5:25 AM, anand sharma wrote:

>
> Hi, I just want to know the best way to configure the Eclipse (Juno) plugin
> for CDH4 YARN; right now it throws an error which is being discussed here...
>
>
> http://stackoverflow.com/questions/11166125/build-a-hadoop-ecplise-library-from-cdh4-jar-files
>
> and if anyone can provide all the built jars, that would be great.
>
>
>
>


-- 
http://hortonworks.com/download/