org.apache.hadoop.ipc.StandbyException occurs at half past every hour in the standby NN

2014-01-24 Thread Francis . Hu
Hello all, I installed 2 NNs and 3 DNs in my hadoop-2.2.0 cluster and implemented HDFS HA with QJM. Currently, looking at the log of the standby NN, it throws the exception below at a regular one-hour interval: 2014-01-24 03:30:01,245 ERROR org.apache.hadoop.security.UserGroupInformation:

Re: org.apache.hadoop.ipc.StandbyException occurs at half past every hour in the standby NN

2014-01-24 Thread Harsh J
Hi Francis, This is a non-worry, but you're basically hitting https://issues.apache.org/jira/browse/HDFS-3447. A temporary workaround could be to disable the UGI logger at the logging configuration level. On Fri, Jan 24, 2014 at 2:46 PM, Francis.Hu francis...@reachjunction.com wrote: hello, All
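A minimal sketch of that workaround, assuming the standby NameNode uses a standard log4j.properties (the level shown is illustrative, not from the thread):

    # Raise the threshold of the UserGroupInformation logger so the periodic
    # StandbyException ERROR lines are no longer written to the standby NN log.
    log4j.logger.org.apache.hadoop.security.UserGroupInformation=FATAL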

Re: hdfs fsck -locations

2014-01-24 Thread Mark Kerzner
Here is an example hdfs fsck /user/mark/data/word_count.csv Connecting to namenode via http://mark-7:50070 FSCK started by mark (auth:SIMPLE) from /192.168.1.232 for path /user/mark/data/word_count.csv at Fri Jan 24 07:45:24 CST 2014 .Status: HEALTHY Total size: 7217 B Total dirs: 0 Total

RE: HDFS buffer sizes

2014-01-24 Thread John Lilley
Ah, I see... it is a constant in CommonConfigurationKeysPublic.java: public static final int IO_FILE_BUFFER_SIZE_DEFAULT = 4096; Are there benefits to increasing this for large reads or writes? john From: Arpit Agarwal [mailto:aagar...@hortonworks.com] Sent: Thursday, January 23, 2014 3:31 PM To:
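For context, that default backs the io.file.buffer.size property, so it can be raised per cluster in core-site.xml without recompiling; whether a larger buffer helps big sequential reads or writes is workload-dependent, and the 128 KB value below is only an illustration:

    <!-- core-site.xml: raise the stream buffer from the 4096-byte default -->
    <property>
      <name>io.file.buffer.size</name>
      <value>131072</value>
    </property>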

Fw: Hadoop 2 Namenode HA not working properly

2014-01-24 Thread Bruno Andrade
Begin forwarded message: Date: Tue, 21 Jan 2014 09:35:23 + From: Bruno Andrade b...@eurotux.com To: user@hadoop.apache.org Subject: Re: Hadoop 2 Namenode HA not working properly Hey, this is my hdfs-site.xml - http://pastebin.com/qpELkwH8 and this is my core-site.xml: configuration

No space left on device during merge.

2014-01-24 Thread Tim Potter
Hi, I'm getting the below error while trying to sort a lot of data with Hadoop. I strongly suspect the node the merge is on is running out of local disk space. Assuming this is the case, is there any way to get around this limitation considering I can't increase the local disk space

Fwd: HDFS data transfer is faster than SCP based transfer?

2014-01-24 Thread rab ra
Hi Can anyone please answer my query? -Rab -- Forwarded message -- From: rab ra rab...@gmail.com Date: 24 Jan 2014 10:55 Subject: HDFS data transfer is faster than SCP based transfer? To: user@hadoop.apache.org Hello I have a use case that requires transfer of input files from

Re: HDFS buffer sizes

2014-01-24 Thread Arpit Agarwal
I don't think that value is used either, except in the legacy block reader, which is turned off by default. On Fri, Jan 24, 2014 at 6:34 AM, John Lilley john.lil...@redpoint.net wrote: Ah, I see… it is a constant CommonConfigurationKeysPublic.java: public static final int

What is the fix for this error ?

2014-01-24 Thread Kokkula, Sada
-bash-4.1$ /usr/jdk64/jdk1.6.0_31/bin/javac -Xlint -classpath /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core-2.2.0.2.0.6.0-76.jar:/usr/lib/hadoop/hadoop-common-2.2.0.2.0.6.0-76.jar:./hadoop-annotations-2.0.0-cdh4.0.1.jar WordCount.java WordCount.java:62: warning: [deprecation]

Re: hdfs fsck -locations

2014-01-24 Thread Harsh J
Sorry, but what was the question? I also do not see a locations option flag. On Jan 24, 2014 7:17 PM, Mark Kerzner mark.kerz...@shmsoft.com wrote: Here is an example hdfs fsck /user/mark/data/word_count.csv Connecting to namenode via http://mark-7:50070 FSCK started by mark (auth:SIMPLE)

Re: hdfs fsck -locations

2014-01-24 Thread Mark Kerzner
Sorry, I did not copy the full command: hdfs fsck /user/mark/data/word_count.csv -locations Connecting to namenode via http://mark-7:50070 FSCK started by mark (auth:SIMPLE) from /192.168.1.232 for path /user/mark/data/word_count.csv at Fri Jan 24 11:15:17 CST 2014 .Status: HEALTHY Total size: 7217

RE: hdfs fsck -locations

2014-01-24 Thread Nascimento, Rodrigo
I'm not seeing the locations flag yet. Rod Nascimento Systems Engineer @ Brazil People don't buy WHAT you do. They buy WHY you do it. From: Mark Kerzner [mailto:mark.kerz...@shmsoft.com] Sent: Friday, January 24, 2014 3:16 PM To: Hadoop User Subject: Re: hdfs fsck -locations Sorry, did not copy

RE: hdfs fsck -locations

2014-01-24 Thread Nascimento, Rodrigo
Hi Mark, It is a sample from my sandbox. Your question is about the part that is in RED in the output below, right? [root@sandbox ~]# hdfs fsck /user/ambari-qa/passwd -locations Connecting to namenode via http://sandbox.hortonworks.com:50070 FSCK started by root (auth:SIMPLE) from

Memory problems with BytesWritable and huge binary files

2014-01-24 Thread Adam Retter
Hi there, We have several diverse large datasets to process (one set may be as much as 27 TB); however, all of the files in these datasets are binary files. We need to be able to pass each binary file to several tools running in the Map Reduce framework. We already have a working pipeline of

Re: Ambari upgrade 1.4.1 to 1.4.2

2014-01-24 Thread Vinod Kumar Vavilapalli
+user@ambari -user@hadoop Please post Ambari-related questions to the Ambari user mailing list. Thanks +Vinod Hortonworks Inc. http://hortonworks.com/ On Fri, Jan 24, 2014 at 9:15 AM, Kokkula, Sada sadanandam.kokk...@bnymellon.com wrote: Ambari-Server upgrade from 1.4.1 to 1.4.2 wipes out

Re: HDFS data transfer is faster than SCP based transfer?

2014-01-24 Thread Vinod Kumar Vavilapalli
Is it a single file? Lots of files? How big are the files? Is the copy on a single node or are you running some kind of a MapReduce program? +Vinod Hortonworks Inc. http://hortonworks.com/ On Fri, Jan 24, 2014 at 7:21 AM, rab ra rab...@gmail.com wrote: Hi Can anyone please answer my query?

Re: HDFS federation configuration

2014-01-24 Thread AnilKumar B
Thanks Suresh. I followed the link; it's clear now. But client-side configuration is not covered in the doc. Thanks Regards, B Anil Kumar. On Thu, Jan 23, 2014 at 11:44 PM, Suresh Srinivas sur...@hortonworks.com wrote: Have you looked at -
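For what it's worth, the client-side piece of a federated setup is typically a ViewFs mount table in the client's core-site.xml; the sketch below is only an illustration, and the cluster name, mount points, and NameNode addresses are placeholders, not values from the thread:

    <!-- client core-site.xml: expose both namespaces under one viewfs:// root -->
    <property>
      <name>fs.defaultFS</name>
      <value>viewfs://clusterX</value>
    </property>
    <property>
      <name>fs.viewfs.mounttable.clusterX.link./data</name>
      <value>hdfs://nn1.example.com:8020/data</value>
    </property>
    <property>
      <name>fs.viewfs.mounttable.clusterX.link./logs</name>
      <value>hdfs://nn2.example.com:8020/logs</value>
    </property>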

Re: Memory problems with BytesWritable and huge binary files

2014-01-24 Thread Vinod Kumar Vavilapalli
Is your data in any given file a bunch of key-value pairs? If that isn't the case, I'm wondering how writing a single large key-value into a sequence file helps. It won't. Maybe you can give an example of your input data? If indeed they are a bunch of smaller-sized key-value pairs, you can write

Re: No space left on device during merge.

2014-01-24 Thread Vinod Kumar Vavilapalli
That's a lot of data to process for a single reducer. You should try increasing the number of reducers to achieve more parallelism and also try modifying your logic to avoid significant skew in the reducers. Unfortunately this means rethinking your app, but that's the only way around it. It
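As a concrete illustration of the first suggestion (the reducer count of 50 is an arbitrary example, not a recommendation):

    // Driver fragment: spread the sort/merge work across more reduce tasks.
    Job job = Job.getInstance(new Configuration(), "big-sort");
    job.setNumReduceTasks(50);   // tune to the cluster and data volume

    // Or at submit time, without recompiling, if the driver uses ToolRunner:
    //   hadoop jar myjob.jar MyDriver -D mapreduce.job.reduces=50 <in> <out>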

Re: hdfs fsck -locations

2014-01-24 Thread Mark Kerzner
hdfs fsck /user/mark/data/word_count.csv *-locations* On Fri, Jan 24, 2014 at 11:34 AM, Nascimento, Rodrigo rodrigo.nascime...@netapp.com wrote: I’m not seeing locations flag yet. *Rod Nascimento* *Systems Engineer @ Brazil* *People **don’t** buy **WHAT** you do. They buy **WHY**

Re: Memory problems with BytesWritable and huge binary files

2014-01-24 Thread Adam Retter
Is your data in any given file a bunch of key-value pairs? No. The content of each file itself is the value we are interested in, and I guess its filename is the key. If that isn't the case, I'm wondering how writing a single large key-value into a sequence file helps. It won't. Maybe

Re: hdfs fsck -locations

2014-01-24 Thread Mark Kerzner
Can you send me your output? hadoop version Hadoop 2.0.0-cdh4.5.0 Subversion git://ubuntu64-12-04-mk1/var/lib/jenkins/workspace/generic-package-ubuntu64-12-04/CDH4.5.0-Packaging-Hadoop-2013-11-20_14-31-53/hadoop-2.0.0+1518-1.cdh4.5.0.p0.24~precise/src/hadoop-common-project/hadoop-common -r

Re: hdfs fsck -locations

2014-01-24 Thread Mark Kerzner
Yes, Rodrigo, that's what I was looking for. So in my install I somehow don't have it at all. I was asked by my students, so now I have the answer. Mark On Fri, Jan 24, 2014 at 4:00 PM, Nascimento, Rodrigo rodrigo.nascime...@netapp.com wrote: Mark, there we go ;-) Rodrigo Nascimento Systems

Re: Memory problems with BytesWritable and huge binary files

2014-01-24 Thread Vinod Kumar Vavilapalli
Okay. Assuming you don't need a whole file (video) in memory for your processing, you can simply write an InputFormat/RecordReader implementation that streams through any given file and processes it. +Vinod On Jan 24, 2014, at 12:44 PM, Adam Retter adam.ret...@googlemail.com wrote: Is your
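A rough sketch of that idea, assuming the file can be consumed as fixed-size chunks rather than as one giant record; the class name, chunk size, and key/value types are illustrative only, and it would be paired with a FileInputFormat whose isSplitable() returns false:

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    // Streams one large binary file as a sequence of 64 MB records, so no single
    // key/value ever has to hold the whole file in memory.
    public class ChunkRecordReader extends RecordReader<LongWritable, BytesWritable> {
      private static final int CHUNK_SIZE = 64 * 1024 * 1024; // per-record size (assumption)
      private FSDataInputStream in;
      private long length;
      private long pos;
      private final LongWritable key = new LongWritable();     // byte offset of the chunk
      private final BytesWritable value = new BytesWritable(); // the chunk itself

      @Override
      public void initialize(InputSplit split, TaskAttemptContext ctx) throws IOException {
        FileSplit fileSplit = (FileSplit) split;                // whole file: isSplitable() == false
        Path path = fileSplit.getPath();
        FileSystem fs = path.getFileSystem(ctx.getConfiguration());
        length = fileSplit.getLength();
        in = fs.open(path);
      }

      @Override
      public boolean nextKeyValue() throws IOException {
        if (pos >= length) return false;
        int toRead = (int) Math.min(CHUNK_SIZE, length - pos);
        byte[] buf = new byte[toRead];
        IOUtils.readFully(in, buf, 0, toRead);
        key.set(pos);
        value.set(buf, 0, toRead);
        pos += toRead;
        return true;
      }

      @Override public LongWritable getCurrentKey() { return key; }
      @Override public BytesWritable getCurrentValue() { return value; }
      @Override public float getProgress() { return length == 0 ? 1.0f : (float) pos / length; }
      @Override public void close() throws IOException { IOUtils.closeStream(in); }
    }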

Re: Memory problems with BytesWritable and huge binary files

2014-01-24 Thread Adam Retter
So I am not sure I follow you, as we already have a custom InputFormat and RecordReader and that does not seem to help. The reason it does not seem to help is that it needs to return the data as a Writable so that the Writable can then be used in the following map operation. The map operation

HIVE versus SQL DB

2014-01-24 Thread Felipe Gutierrez
Hi, I am in a project that has three databases with flat files. Our plan is to normalize these DBs into one. We will need to follow the data warehouse concept (ETL - Extract, Transform, Load). We are thinking of using Hadoop for the Transform step, because we need to relate data from the three

Re: hdfs fsck -locations

2014-01-24 Thread Harsh J
The right syntax is to use -files -blocks -locations, so it drills down all the way. You are not missing a feature - this has existed for as long as I've known HDFS. In Rodrigo's output, he's seeing a BlockPool ID, which is not equivalent to a location, but just carries an IP in it for
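Putting that together for the file from earlier in the thread, the invocation would presumably look like:

    hdfs fsck /user/mark/data/word_count.csv -files -blocks -locations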

Spoofing Ganglia Metrics

2014-01-24 Thread Calvin Jia
Is there a way to configure hdfs/hbase/mapreduce to spoof the ganglia metrics being sent? This is because the machines are behind a NAT and the monitoring box is outside, so all the metrics are recognized as coming from the same machine. Thanks!

Re: Fw: Hadoop 2 Namenode HA not working properly

2014-01-24 Thread Juan Carlos
Hi Bruno, ha.zookeeper.quorum is a core-site property and you have it in hdfs-site; maybe that's your problem. 2014/1/24 Bruno Andrade b...@eurotux.com Begin forwarded message: Date: Tue, 21 Jan 2014 09:35:23 + From: Bruno Andrade b...@eurotux.com To: user@hadoop.apache.org
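In other words, the property would move into core-site.xml along the lines of the sketch below (the ZooKeeper hostnames are placeholders, not taken from Bruno's configs):

    <!-- core-site.xml: ZooKeeper ensemble used for automatic NameNode failover -->
    <property>
      <name>ha.zookeeper.quorum</name>
      <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
    </property>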

Re: Datanode Shutting down automatically

2014-01-24 Thread Harsh J
You reformatted your NameNode at some point, but likely failed to also clear out the DN data directories, which would not auto-wipe themselves. Clear the contents of /app/hadoop/tmp/dfs/data at the DN and it should start up the next time you invoke it. P.s. Please do not email
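A sketch of those steps on the affected DataNode, assuming /app/hadoop/tmp/dfs/data is really where dfs.data.dir / dfs.datanode.data.dir points (adjust the path to your own setting):

    # stop the DataNode, wipe its now-incompatible block storage, then restart it
    hadoop-daemon.sh stop datanode
    rm -rf /app/hadoop/tmp/dfs/data/*
    hadoop-daemon.sh start datanode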

Re: Datanode Shutting down automatically

2014-01-24 Thread Shekhar Sharma
The incompatible namespace ID error is because you might have formatted the NameNode, but the DataNode's folder still has the old ID. What are the values of the following properties: dfs.data.dir, dfs.name.dir, hadoop.tmp.dir? The value of these properties is a directory on the local file system. Solution