Re: Running JobTracker on a separate machine

2011-03-21 Thread Harsh J
start-all.sh will attempt to launch a JobTracker on the same machine it was executed on (which would fail, since mapred.job.tracker would point to an address that cannot be bound on that machine). The right way is to launch the daemons individually (or perhaps to write a custom or modified script that does what you need). On Tue

Problem trying to append file

2011-03-21 Thread Aaron Baff
I'm trying out HDFS append with r0.21.0 in a quick, simple application before using it in my production code. It creates a new file, writes out 50K records, closes the file, then opens the file in append mode, writes another 100K records, then closes the file. Everything is fine up until it go

lots of LZO errors in mapper and name of file split

2011-03-21 Thread Shi Yu
Hi, My Hadoop distribution is 0.20.2. I had many errors when compressing output with LZO (see stack trace at the end). I disabled the compression of mapper output. The native code seems to have been loaded correctly, but during the mapper stage, lots of errors popped up. The program didn't break and

Re: TextInputFormat and Gzip encoding - wordcount displaying binary data

2011-03-21 Thread Saptarshi Guha
True, my naming is Hmm, now i know. thanks On Mon, Mar 21, 2011 at 4:01 PM, Niels Basjes wrote: > Hi, > > 2011/3/21 Saptarshi Guha : >> It's frustrating to be dealing with these simple problems (and I know >> the fault is mine, i'm missing something). >> I'm running word count (from 0.20-2) on a

Re: TextInputFormat and Gzip encoding - wordcount displaying binary data

2011-03-21 Thread Niels Basjes
Hi, 2011/3/21 Saptarshi Guha : > It's frustrating to be dealing with these simple problems (and I know > the fault is mine, i'm missing something). > I'm running word count (from 0.20-2) on a gzip file (very small), the > output has binary characters. > When I run the same on the ungzipped file, t
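A note on the symptom described in this thread: Hadoop's TextInputFormat selects a decompression codec from the file name extension, so a gzip file that is not named *.gz is read as raw bytes. The Python sketch below only imitates that behaviour outside Hadoop. The function names (open_text_by_extension, word_count, demo) and the file names are made up for illustration and are not part of any Hadoop API.

```python
import gzip
import os
import tempfile


def open_text_by_extension(path):
    """Open path as text, decompressing only when the name ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    # No recognised extension: the compressed bytes are read as if they were text.
    return open(path, "r", encoding="utf-8", errors="replace")


def word_count(path):
    """Count whitespace-separated tokens in the file at path."""
    counts = {}
    with open_text_by_extension(path) as handle:
        for line in handle:
            for word in line.split():
                counts[word] = counts.get(word, 0) + 1
    return counts


def demo():
    """Write the same gzip bytes under two names and count words in each."""
    workdir = tempfile.mkdtemp()
    payload = gzip.compress(b"hello hadoop hello\n")
    good = os.path.join(workdir, "input.txt.gz")  # extension reveals the codec
    bad = os.path.join(workdir, "input.txt")      # same bytes, misleading name
    for name in (good, bad):
        with open(name, "wb") as out:
            out.write(payload)
    return word_count(good), word_count(bad)
```

Calling demo() returns the correct counts for the .gz name and garbage tokens for the other, which matches the "binary characters" seen in the word count output.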

TextInputFormat and Gzip encoding - wordcount displaying binary data

2011-03-21 Thread Saptarshi Guha
Hello, It's frustrating to be dealing with these simple problems (and I know the fault is mine, i'm missing something). I'm running word count (from 0.20-2) on a gzip file (very small), the output has binary characters. When I run the same on the ungzipped file, the output is correct ascii. I'm u

Re: Chukwa?

2011-03-21 Thread Eric Yang
Chukwa is good for general purpose log aggregation, and it has specific knowledge for analyzing Hadoop logs to monitor and report on Hadoop performance. The log aggregation component is a complete system in Chukwa 0.4 for streaming logs into Hadoop. Trunk also provides a completed reference implemen

Running JobTracker on a separate machine

2011-03-21 Thread modemide
Hello all, I was wondering if someone could help me with this issue. I was trying to configure the JobTracker to run on a separate machine from the NameNode (per the documentation's recommendation). When I try to configure it and run ./start-all.sh , the NameNode tries to start the JobTracker on its

Re: Chukwa?

2011-03-21 Thread Mark
Is Chukwa primarily used for analytics or log aggregation? I thought it was the latter, but more and more it seems like the former. On 3/21/11 8:27 AM, Eric Yang wrote: Chukwa is waiting on an official release of Hadoop and HBase that work together. In Chukwa trunk, Chukwa is using HBase a

Re: File formats in Hadoop

2011-03-21 Thread Doug Cutting
On 03/19/2011 09:01 AM, Weishung Chung wrote: > I am browsing through the hadoop.io package and was wondering what other > file formats are available in hadoop other than SequenceFile and TFile? > Is all data written through hadoop including those from hbase saved in the > above formats? It seems l

Re: Chukwa?

2011-03-21 Thread Eric Yang
Chukwa is waiting on an official release of Hadoop and HBase that work together. In Chukwa trunk, Chukwa is using HBase as data storage, and using pig+hbase for data analytics. Unfortunately, Hadoop security release branch and Hadoop trunk are both broken for HBase. Hence, Chukwa is in hibernat

Re: porting hadoop in Android

2011-03-21 Thread Harsh J
On Mon, Mar 21, 2011 at 5:49 PM, Irsan Rajamin wrote: > Hi guys, I want to run Hadoop on Android (Dalvik virtual machine) but > I have a problem with a library or something else. Does anybody know how to > solve this problem? > As far as I can tell, Hadoop was never implemented with the Dalvik

Re: can't find lzo headers when ant compile hadoop package

2011-03-21 Thread Shi Yu
Problem solved. Two paths should be set:
  export C_INCLUDE_PATH=/path_of_lzo_output/include
  export LIBRARY_PATH=/path_of_lzo_output/lib
and shared libraries should be enabled when configuring the lzo compile:
  ./configure -enable-shared -prefix=/path_of_lzo_output/
Shi On 3/19/2011 1:16 PM, Shi Yu wrote: Trying to

Re: Sync-marker in uncompressed sequenceFile

2011-03-21 Thread Harsh J
Hello, On Mon, Mar 21, 2011 at 8:10 PM, Weishung Chung wrote: > Hello my fellow Hadoop users/developers, > > I'm reading the SequenceFile source code, and there is a checkAndWriteSync() > method that writes a sync marker every so many bytes. I was wondering what's > the use of the sync marker. I
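To make the question in this thread concrete: a sync marker lets a reader that starts at an arbitrary byte offset (for example, the start of an input split) find the next record boundary. The Python sketch below uses a toy length-prefixed format, not the real SequenceFile layout. The names write_records and read_split, the marker interval, and the split-ownership rule are assumptions chosen for illustration.

```python
import struct

# Length value reserved to announce a sync marker (toy choice for this sketch).
SYNC_ESCAPE = 0xFFFFFFFF


def write_records(records, sync, every=2):
    """Serialize records as length-prefixed payloads with periodic sync markers."""
    out = bytearray()
    for index, payload in enumerate(records):
        if index % every == 0:
            # Escape value followed by the marker bytes themselves.
            out += struct.pack(">I", SYNC_ESCAPE) + sync
        out += struct.pack(">I", len(payload)) + payload
    return bytes(out)


def read_split(data, sync, start, end):
    """Return the records that follow sync markers starting in [start, end)."""
    needle = struct.pack(">I", SYNC_ESCAPE) + sync
    # Scan forward from an arbitrary offset to the first marker.
    pos = data.find(needle, start)
    records = []
    if pos == -1 or pos >= end:
        return records
    while pos < len(data):
        (length,) = struct.unpack_from(">I", data, pos)
        if length == SYNC_ESCAPE:
            if pos >= end:
                break  # the next split owns everything after this marker
            pos += 4 + len(sync)
            continue
        pos += 4
        records.append(data[pos:pos + length])
        pos += length
    return records
```

Two readers given adjacent byte ranges recover every record exactly once without coordinating, which is why the marker matters even in an uncompressed file.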

Re: File formats in Hadoop

2011-03-21 Thread Weishung Chung
I found this interesting article about sequence files and am sharing it here: http://www.cloudera.com/blog/2011/01/hadoop-io-sequence-map-set-array-bloommap-files/ On Sun, Mar 20, 2011 at 6:04 AM, Niels Basjes wrote: > And then there is the matter of how you put the data in the file. I've > heard that so

Sync-marker in uncompressed sequenceFile

2011-03-21 Thread Weishung Chung
Hello my fellow Hadoop users/developers, I'm reading the SequenceFile source code, and there is a checkAndWriteSync() method that writes a sync marker every so many bytes. I was wondering what's the use of the sync marker. I know one can use it to designate the end of a header, but it's also used

porting hadoop in Android

2011-03-21 Thread Irsan Rajamin
Hi guys, I want to run Hadoop on Android (Dalvik virtual machine) but I have a problem with a library or something else. Does anybody know how to solve this problem?

Re: Installing Hadoop on Debian Squeeze

2011-03-21 Thread Steve Loughran
On 21/03/11 09:00, Dieter Plaetinck wrote: On Thu, 17 Mar 2011 19:33:02 +0100 Thomas Koch wrote: Currently my advice is to use the Debian packages from Cloudera. That's the problem, it appears there are none. Like I said in my earlier mail, Debian is not in Cloudera's list of supported distr

Re: decommissioning node woes

2011-03-21 Thread Steve Loughran
On 19/03/11 16:00, Ted Dunning wrote: Unfortunately this doesn't help much because it is hard to get the ports to balance the load. On Fri, Mar 18, 2011 at 8:30 PM, Michael Segel wrote: With a 1GBe port, you could go 100Mbs for the bandwidth limit. If you bond your ports, you could go higher.

Java programmatic authentication of Hadoop Kerberos

2011-03-21 Thread Sari1983
Hi, Kerberos has been configured for our Hadoop file system. I wish to do the authentication through a Java program. I'm able to perform the authentication using a normal Java application. But if the Java program performs any HDFS operations, it succeeds in reading the keytab file, but showing

Re: Installing Hadoop on Debian Squeeze

2011-03-21 Thread Dieter Plaetinck
On Thu, 17 Mar 2011 19:33:02 +0100 Thomas Koch wrote: > Currently my advice is to use the Debian packages from Cloudera. That's the problem, it appears there are none. Like I said in my earlier mail, Debian is not in Cloudera's list of supported distros, and they do not have a repository for Deb