Set the namenode public IP address in Amazon EC2?

2013-03-29 Thread Pedro Sá da Costa
Hi, I'm trying to configure the NameNode with a public IP in Amazon EC2. The service always picks up the host's private IP, not the public one. How can I set the NameNode's public IP address? Here are my configuration files: $ cat hdfs-site.xml <?xml version="1.0"?> <?xml-stylesheet type="text/xsl"
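A common workaround on EC2 (a sketch, not from this thread) is to point Hadoop at the instance's EC2 public DNS name rather than an IP: inside EC2 that hostname resolves to the private IP, and outside it resolves to the public IP. The property name is the standard Hadoop 1.x setting; the hostname below is a placeholder you must replace.

```xml
<!-- core-site.xml sketch: replace the value with your instance's
     EC2 public DNS name (the ec2-...amazonaws.com hostname shown on
     the AWS console). Inside EC2 it resolves to the private IP,
     outside to the public IP, so both sides can reach the NameNode. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8020</value>
  </property>
</configuration>
```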

Re: DFSOutputStream.sync() method latency time

2013-03-29 Thread lei liu
The sync method includes the code below:

// Flush only if we haven't already flushed till this offset.
if (lastFlushOffset != bytesCurBlock) {
  assert bytesCurBlock > lastFlushOffset;
  // record the valid offset of this flush
  lastFlushOffset = bytesCurBlock;

error copying file from local to hadoop fs

2013-03-29 Thread Ravi Chandran
Hi, I am trying to copy a local text file into the Hadoop fs using the -copyFromLocal option, but I am getting an error: 13/03/29 03:02:54 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8020. Already tried 0 time(s); maxRetries=45 13/03/29 03:03:15 INFO ipc.Client: Retrying connect to server:

Re: error copying file from local to hadoop fs

2013-03-29 Thread Jagat Singh
Your cluster is not running properly. Can you run jps and see whether all services are running (JT, NN, DN, etc.)? On Fri, Mar 29, 2013 at 6:07 PM, Ravi Chandran ravichandran...@gmail.com wrote: hi, I am trying to copy a local text file into hadoop fs using -copyFromLocal option, but i am getting

Re: error copying file from local to hadoop fs

2013-03-29 Thread Ravi Chandran
Thanks for replying. I did jps; it doesn't show any of the daemon services. Also, I just got an error message saying: Cannot create file /user/training/inputs/basic.txt._COPYING_. Name node is in safe mode. It looks like the JT and DN are not responding to the NN. But this is a standalone setup, I don't

Re: error copying file from local to hadoop fs

2013-03-29 Thread Anand Aravindan
Hello, in Safe Mode, -copyFromLocal will not work. Please read: http://hadoop.apache.org/docs/stable/hdfs_user_guide.html#Safemode Please wait a bit for HDFS to exit Safe Mode. If it takes a significantly long time and HDFS is still in Safe Mode, something could be wrong
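Safe Mode status can be checked and controlled from the command line; a sketch using the standard Hadoop 1.x dfsadmin commands (output format varies by version):

```shell
# Ask whether the NameNode is currently in safe mode
hadoop dfsadmin -safemode get

# Block until the NameNode leaves safe mode on its own
hadoop dfsadmin -safemode wait

# Force it out of safe mode (use with care, and only once you
# understand why it is stuck there)
hadoop dfsadmin -safemode leave
```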

[no subject]

2013-03-29 Thread Mohit Vadhera
Hi, I am getting the error below while mounting fuse_dfs: a shared library error when running mount -a. Can anybody tell me how to fix this, please? # cat /etc/fstab | grep hadoop hadoop-fuse-dfs#dfs://localhost:8020 /mnt/san1/hadoop_mount fuse allow_other,usetrash,rw 2 0 # mount

Re: error copying file from local to hadoop fs

2013-03-29 Thread Ravi Chandran
But in standalone mode, shouldn't safe mode be faster? I mean, because everything is running locally. Still, the daemons are not visible in jps. How can I restart them individually? Also, the service status returned this info: Hadoop namenode is running [ OK ]

Re: error copying file from local to hadoop fs

2013-03-29 Thread Anand Aravindan
It would appear that your HDFS setup is not functioning properly. Please try shutting down HDFS (stop-all.sh), waiting for a bit (5 minutes), and restarting HDFS (start-all.sh). If that does not work, you might have to reformat the NameNode (after shutting down HDFS again). Similar solution
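The restart sequence described above can be sketched as follows, using the Hadoop 1.x control scripts (script locations depend on your installation):

```shell
stop-all.sh     # stop all Hadoop daemons
sleep 300       # wait a bit (5 minutes)
start-all.sh    # start them again
jps             # verify NameNode, DataNode, JobTracker, TaskTracker all appear

# Last resort only -- reformatting DESTROYS all data stored in HDFS:
# stop-all.sh && hadoop namenode -format && start-all.sh
```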

Fwd:

2013-03-29 Thread Mohit Vadhera
I have now linked the shared library, but I get the error below when running mount -a: # mount -a INFO /data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hadoop-2.0.0-cdh4.1.3/src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_options.c:164 Adding FUSE arg

RE: Which hadoop installation should I use on ubuntu server?

2013-03-29 Thread David Parks
Hmm, seems intriguing. I'm still not totally clear on Bigtop here. It seems like they're creating and maintaining what is basically an installer for Hadoop? I tried following their docs for Ubuntu, but I just get a 404 error on the first step, so it makes me wonder how reliable that project is.

Re: Which hadoop installation should I use on ubuntu server?

2013-03-29 Thread Håvard Wahl Kongsgård
I recommend Cloudera's CDH4 on Ubuntu 12.04 LTS. On Thu, Mar 28, 2013 at 7:07 AM, David Parks davidpark...@yahoo.com wrote: I'm moving off AWS MapReduce to our own cluster; I'm installing Hadoop on Ubuntu Server 12.10. I see a .deb installer and installed that, but it seems like

Re: Which hadoop installation should I use on ubuntu server?

2013-03-29 Thread Bruno Mahé
On 03/29/2013 01:09 AM, David Parks wrote: Hmm, seems intriguing. I'm still not totally clear on Bigtop here. It seems like they're creating and maintaining what is basically an installer for Hadoop? I tried following their docs for Ubuntu, but just get a 404 error on the first step, so it makes me wonder

Million docs and word count scenario

2013-03-29 Thread pathurun
If there are 1 million docs in an enterprise and we need to perform a word count computation on all of them, what is the first step? Is it to extract the text of all the docs into a single file and then put it into HDFS, or to put each one separately into HDFS? Thanks Sent from BlackBerry®

Re: Million docs and word count scenario

2013-03-29 Thread Ted Dunning
Putting each document into a separate file is not likely to be a great thing to do. On the other hand, putting them all into one file may not be what you want either. It is probably best to find a middle ground and create files each with many documents and each a few gigabytes in size. On Fri,

Re: Hadoop distcp from CDH4 to Amazon S3 - Improve Throughput

2013-03-29 Thread Himanish Kushary
Thanks Dave. I had already tried using the s3distcp jar, but got stuck on the error below, which made me think that this is something specific to the Amazon Hadoop distribution. Exception in thread Thread-28 java.lang.NoClassDefFoundError:

Re: Hadoop distcp from CDH4 to Amazon S3 - Improve Throughput

2013-03-29 Thread Himanish Kushary
Yes, you are right, CDH4 is the 2.x line, but I even checked the javadocs for the 1.0.4 branch (could not find the 1.0.3 APIs, so I used http://hadoop.apache.org/docs/r1.0.4/api/index.html) and did not find the ProgressableResettableBufferedFileInputStream class. Not sure how it is present in the

How to use HPROF for rhadoop jobs ?

2013-03-29 Thread rohit sarewar
Hi all, I can use HPROF in Java map reduce jobs:

Configuration conf = getConf();
conf.setBoolean("mapred.task.profile", true);
conf.set("mapred.task.profile.params", "-agentlib:hprof=cpu=samples," +
    "heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s");
conf.set("mapred.task.profile.maps", "0-2");

FileSystem Error

2013-03-29 Thread Cyril Bogus
Hi, I am running a small Java program that writes a small input data set to the Hadoop FileSystem, runs Mahout Canopy and KMeans clustering, and then outputs the content of the data. In my hadoop.properties I have included the core-site.xml definition for the Java program to connect to my

Re: Understanding Sys.output from mapper partitioner

2013-03-29 Thread Jens Scheidtmann
Hello Sai, the interesting bit is how your job is configured. Depending on how you defined the input to the MR job (e.g., as a text file), you might get this result. Unfortunately, you didn't include that source code... Best regards, Jens

Re: Bloom Filter analogy in SQL

2013-03-29 Thread Sai Sai
Can someone give a simple analogy for a Bloom filter in SQL? I am trying to understand it and always get confused. Thanks

Re: Bloom Filter analogy in SQL

2013-03-29 Thread Ted Yu
From http://msdn.microsoft.com/en-us/library/cc278097(v=sql.100).aspx : The new technology employed is based on bitmap filters, also known as *Bloom filters *(see *Bloom filter, *Wikipedia 2007, http://en.wikipedia.org/wiki/Bloom_filter) ... HBase uses bloom filters extensively. I can give
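To make the SQL analogy concrete: in an optimized hash join, the build side inserts its join keys into a Bloom filter, and the probe side asks "might this key be present?" to discard rows that cannot possibly match before the expensive join work. A minimal Bloom filter can be sketched in a few lines; the hash construction and sizing below are illustrative, not taken from any particular implementation.

```python
import hashlib

class BloomFilter:
    """A minimal Bloom filter: k hash positions over an m-bit array.

    might_contain() can return false positives but never false negatives,
    which is exactly why it works as a join pre-filter: a "no" answer
    safely discards a probe-side row.
    """
    def __init__(self, m_bits=1024, k_hashes=3):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8 + 1)

    def _positions(self, item):
        # Derive k positions by salting one hash function with the index.
        for i in range(self.k):
            h = hashlib.md5(("%d:%s" % (i, item)).encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))
```

In the SQL picture, the build table's join keys go through add(), and each probe-table row is checked with might_contain() before being handed to the join operator.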

Re: list of linux commands for hadoop

2013-03-29 Thread Sai Sai
Just wondering if there is a list of Linux commands, or any article, needed for learning Hadoop. Thanks

Re: Understanding Sys.output from mapper partitioner

2013-03-29 Thread Sai Sai
Hi Jens, here is the code for the driver, if this is what you are referring to as missing; please let me know if you need any additional info. Your help is appreciated. public class SecondarySortDriver { /** * @param args * @throws Exception */ public static void main(String[] args) throws

Re: error copying file from local to hadoop fs

2013-03-29 Thread Jens Scheidtmann
Dear Ravi, 2013/3/29 Ravi Chandran ravichandran...@gmail.com: But in standalone mode, shouldn't safe mode be faster? I mean, because everything is running locally. Still, the daemons are not visible in jps. How can I restart them individually? What do you mean by standalone mode? Standalone

Running test on hadoop cluster

2013-03-29 Thread Shah, Rahul1
Hi all, I have my Hadoop cluster set up. I am using the Intel distribution of Hadoop. I was planning to run some tests like TeraSort on the cluster, just to check whether all the nodes are working properly. As I am new to Hadoop, I am not sure where to start. Any kind of

Re: Running test on hadoop cluster

2013-03-29 Thread Mohammad Tariq
Hello Rahul, you might find these links useful: http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/ And the official page:
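The TeraSort benchmark referenced above is driven from the Hadoop examples jar; a sketch of the usual three-step run (the jar name and data size below are assumptions that vary by distribution):

```shell
# Generate 10 GB of input: 100,000,000 rows of 100 bytes each
hadoop jar hadoop-examples.jar teragen 100000000 /benchmarks/terasort-input

# Sort it -- this is the step that exercises the whole cluster
hadoop jar hadoop-examples.jar terasort /benchmarks/terasort-input /benchmarks/terasort-output

# Validate that the output is globally sorted
hadoop jar hadoop-examples.jar teravalidate /benchmarks/terasort-output /benchmarks/terasort-report
```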

Re: FileSystem Error

2013-03-29 Thread Azuryy Yu
Use hadoop jar instead of java -jar; the hadoop script sets a proper classpath for you. On Mar 29, 2013 11:55 PM, Cyril Bogus cyrilbo...@gmail.com wrote: Hi, I am running a small java program that basically write a small input data to the Hadoop FileSystem, run a Mahout Canopy and Kmeans