Re: Cannot run program autoreconf

2012-09-25 Thread Harsh J
Hi, it means you do not have the autoconf package installed on your OS, or it is not on your PATH. On Tue, Sep 25, 2012 at 7:54 PM, Pedro Sá da Costa psdc1...@gmail.com wrote: I'm trying to compile the mapreduce, but I get the error: create-native-configure: BUILD FAILED

RE: libhdfs install dep

2012-09-25 Thread Leo Leung
Rodrigo, assuming you are asking about Hadoop 1.x: you are missing the hadoop-*libhdfs* RPM. Build it, or get it from the vendor you got your Hadoop from. -Original Message- From: Pastrana, Rodrigo (RIS-BCT) [mailto:rodrigo.pastr...@lexisnexis.com] Sent: Monday, September 24, 2012

Re: Passing Command-line Parameters to the Job Submit Command

2012-09-25 Thread Hemanth Yamijala
By Java environment variables, do you mean the ones passed as -Dkey=value? That's one way of passing them. I suppose another way is to have a client-side site configuration (like mapred-site.xml) that is on the classpath of the client app. Thanks, Hemanth. On Tue, Sep 25, 2012 at 12:20 AM, Varad
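
A minimal sketch of the -Dkey=value approach, assuming a ToolRunner-based driver so that GenericOptionsParser folds the -D arguments into the job Configuration (class and property names below are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Driver that lets GenericOptionsParser fold -Dkey=value arguments
    // into the job Configuration before submission.
    public class MyJobDriver extends Configured implements Tool {

      @Override
      public int run(String[] args) throws Exception {
        Configuration conf = getConf();            // already contains the -D overrides
        String myParam = conf.get("my.custom.param", "default-value");
        System.out.println("my.custom.param = " + myParam);

        Job job = new Job(conf, "my-job");
        job.setJarByClass(MyJobDriver.class);
        // ... set mapper/reducer, input and output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        // Invoked as: hadoop jar myjob.jar MyJobDriver -Dmy.custom.param=foo <in> <out>
        System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
      }
    }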

Re: Passing Command-line Parameters to the Job Submit Command

2012-09-25 Thread Bertrand Dechoux
Building on Hemanth's answer: in the end your variables should be in the job.xml (the second file, along with the jar, needed to run a job). Building this job.xml can be done in various ways, but it does inherit from your local configuration, and you can change it using the Java API, but in the end it is only
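
A short sketch of the same idea from the Java side: whatever is set on the client Configuration before submission is what ends up serialized into job.xml (the property names here are made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class JobXmlExample {
      public static void main(String[] args) throws Exception {
        // Starts from the local client configuration (core-site.xml, mapred-site.xml, ...).
        Configuration conf = new Configuration();

        // Programmatic overrides; these are written into the submitted job.xml.
        conf.set("my.custom.param", "foo");
        conf.setInt("my.custom.threshold", 42);

        Job job = new Job(conf, "job-xml-example");
        // Mappers and reducers read the values back via
        // context.getConfiguration().get("my.custom.param").
      }
    }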

Re: Passing Command-line Parameters to the Job Submit Command

2012-09-25 Thread Mohit Anchlia
You could always write your own properties file and read it as a resource. On Tue, Sep 25, 2012 at 12:10 AM, Hemanth Yamijala yhema...@gmail.com wrote: By java environment variables, do you mean the ones passed as -Dkey=value ? That's one way of passing them. I suppose another way is to have a
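
A minimal sketch of that approach, loading a properties file bundled on the classpath (the file name is hypothetical):

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Properties;

    public class JobProperties {
      // Loads job-params.properties from the classpath, e.g. packaged inside the job jar.
      public static Properties load() throws IOException {
        Properties props = new Properties();
        InputStream in = JobProperties.class.getClassLoader()
            .getResourceAsStream("job-params.properties");
        if (in == null) {
          throw new IOException("job-params.properties not found on classpath");
        }
        try {
          props.load(in);
        } finally {
          in.close();
        }
        return props;
      }

      public static void main(String[] args) throws IOException {
        Properties props = load();
        System.out.println(props.getProperty("my.custom.param", "default-value"));
      }
    }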

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

2012-09-25 Thread Chen He
Hi Sudha, good question. First of all, you need to specify your Hadoop environment clearly (pseudo-distributed or a real cluster). Secondly, you need to clearly understand how Hadoop loads a job's jar file onto all worker nodes: it only copies the jar file to the worker nodes. It does not contain the
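
Since only the job jar is shipped automatically, one common way to get extra jars and native libraries (for example the JCuda .so files) onto the worker nodes is the DistributedCache; a hedged sketch, with illustrative HDFS paths and file names:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;

    public class GpuJobDriver {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "jcuda-job");
        job.setJarByClass(GpuJobDriver.class);

        // The files must already be uploaded to these (illustrative) HDFS paths.
        // Add the JCuda jar to every task's classpath.
        DistributedCache.addFileToClassPath(
            new Path("/libs/jcuda-0.4.2.jar"), job.getConfiguration());
        // Ship the native library and symlink it into the task working directory.
        DistributedCache.createSymlink(job.getConfiguration());
        DistributedCache.addCacheFile(
            new URI("/libs/libJCudaDriver-linux-x86_64.so#libJCudaDriver.so"),
            job.getConfiguration());
        // Tasks can then load the library from the symlink in their working directory.
      }
    }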

Re: libhdfs install dep

2012-09-25 Thread Brian Bockelman
Hi Rodrigo, The hadoop RPMs are a bit deficient compared to those you would find from your Linux distribution. For example, look at the Apache RPM you used: [bbockelm@rcf-bockelman ~]$ rpm -qp http://mirrors.sonic.net/apache/hadoop/common/hadoop-1.0.3/hadoop-1.0.3-1.x86_64.rpm --provides

Re: Python + hdfs written thrift sequence files: lots of moving parts!

2012-09-25 Thread Harsh J
Hi Jay, This may be off-topic for you, but I feel it's related: use Avro DataFiles. There's Python support already available, as well as for several other languages. On Tue, Sep 25, 2012 at 10:57 PM, Jay Vyas jayunit...@gmail.com wrote: Hi guys! I'm trying to read some hadoop outputted thrift files
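
To make the Avro suggestion concrete, a minimal Java sketch of writing an Avro data file (the schema and field names are made up); the schema travels in the file header, so the Python avro package can read the same file back without extra plumbing:

    import java.io.File;
    import java.io.IOException;
    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;

    public class AvroWriteExample {
      public static void main(String[] args) throws IOException {
        // Illustrative record schema; it is embedded in the data file header.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"message\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 1L);
        record.put("message", "hello from java");

        DataFileWriter<GenericRecord> writer =
            new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>(schema));
        writer.create(schema, new File("events.avro"));
        writer.append(record);
        writer.close();
      }
    }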

Re: Python + hdfs written thrift sequence files: lots of moving parts!

2012-09-25 Thread Jay Vyas
Thanks, Harsh. In any case, I'm really curious about how SequenceFile headers are formatted, as the documentation in the SequenceFile javadocs seems very generic. To make my questions more concrete: 1) I notice that the FileSplit class has a getStart() function. It is
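
For reference, much of what the header records can be read back through the SequenceFile.Reader API (Hadoop 1.x shown; the input path is illustrative), which is often easier than decoding the header bytes by hand:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.util.ReflectionUtils;

    public class SeqFileInspect {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/data/part-00000");   // illustrative path

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
          // These values come straight out of the file header.
          System.out.println("key class   : " + reader.getKeyClassName());
          System.out.println("value class : " + reader.getValueClassName());
          System.out.println("compressed  : " + reader.isCompressed());

          Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
          Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
          while (reader.next(key, value)) {
            System.out.println(key + "\t" + value);
          }
        } finally {
          reader.close();
        }
      }
    }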

Re: libhdfs install dep

2012-09-25 Thread Harsh J
I'd recommend using the packages for Apache Hadoop from Apache Bigtop (https://cwiki.apache.org/confluence/display/BIGTOP). The ones upstream (here) aren't maintained as much these days. On Tue, Sep 25, 2012 at 6:27 PM, Pastrana, Rodrigo (RIS-BCT) rodrigo.pastr...@lexisnexis.com wrote: Leo, yes

FW: [Hadoop-SecureMode-Webconsole]Mismatch in documentation and Code

2012-09-25 Thread Brahma Reddy Battula
From: Brahma Reddy Battula Sent: Monday, September 24, 2012 11:12 PM To: u...@hadoop.apache.org Subject: [Hadoop-SecureMode-Webconsole]Mismatch in documentation and Code Dear All, I started the cluster in secure mode by going through the following document.

Re: Join-package combiner number of input and output records the same

2012-09-25 Thread Sigurd Spieckermann
I think I have tracked down the problem to the point that each split only contains one big key-value pair and a combiner is connected to a map task. Please correct me if I'm wrong, but I assume each map task takes one split and the combiner operates only on the key-value pairs within one split.

Re: Join-package combiner number of input and output records the same

2012-09-25 Thread Björn-Elmar Macek
Hi, I had this problem once too. Did you properly override the reduce method with the @Override annotation? Does your reduce method use OutputCollector or Context for gathering outputs? If you are using the current version, it has to be Context. The thing is: if you do NOT override the standard
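
For reference, a hedged sketch of an overridden reduce method with the current (org.apache.hadoop.mapreduce) API; the key/value types are just an example. If the signature does not match the one declared in Reducer, the @Override annotation flags it at compile time instead of letting the framework silently fall back to the default pass-through implementation:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

      @Override
      protected void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
          sum += value.get();
        }
        context.write(key, new IntWritable(sum));
      }
    }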

Re: Join-package combiner number of input and output records the same

2012-09-25 Thread Björn-Elmar Macek
Oops, sorry. You are using standard implementations? I don't know what's happening then. Sorry. But the fact that your input size equals your output size in a join process reminded me too much of my own problems. Sorry for the confusion I may have caused. Best, On 25.09.2012 at 15:32,

Cannot run program autoreconf

2012-09-25 Thread Pedro Sá da Costa
I'm trying to compile the mapreduce, but I get the error: create-native-configure: BUILD FAILED /home/xeon/Projects/hadoop-1.0.3/build.xml:618: Execute failed: java.io.IOException: Cannot run program autoreconf (in directory /home/xeon/Projects/hadoop-1.0.3/src/native): java.io.IOException:

Re: Join-package combiner number of input and output records the same

2012-09-25 Thread Sigurd Spieckermann
I'm not doing a conventional join, but in my case one split/file consists of only one key-value pair. I'm not using default mapper/reducer implementations. I'm guessing the problem is that a combiner is only applied to the output of a map task which is an instance of the mapper class, but one

Re: Help on a Simple program

2012-09-25 Thread Bejoy Ks
Hi, If you don't want either the key or the value in the output, just make the corresponding data type NullWritable. Since you just need to filter out a few records/items from your logs, the reduce phase is not mandatory; a mapper alone would suffice for your needs. From your mapper just output the records
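
A minimal sketch of that map-only filter, with NullWritable as the output key and the reduce phase disabled; the match condition and paths are placeholders:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class LogFilter {

      public static class FilterMapper
          extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
          // Placeholder condition: keep only the lines containing "ERROR".
          if (line.toString().contains("ERROR")) {
            context.write(NullWritable.get(), line);
          }
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "log-filter");
        job.setJarByClass(LogFilter.class);
        job.setMapperClass(FilterMapper.class);
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        job.setNumReduceTasks(0);               // map-only: no reduce phase
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }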

Re: Detect when file is not being written by another process

2012-09-25 Thread Bejoy Ks
Hi Peter, AFAIK Oozie has a mechanism to achieve this. You can trigger your jobs as soon as the files are written to a certain HDFS directory. On Tue, Sep 25, 2012 at 10:23 PM, Peter Sheridan psheri...@millennialmedia.com wrote: These are log files being deposited by other processes, which

Re: Detect when file is not being written by another process

2012-09-25 Thread Andy Isaacson
On Tue, Sep 25, 2012 at 9:28 AM, Peter Sheridan psheri...@millennialmedia.com wrote: We're using Hadoop 1.0.3. We need to pick up a set of large (4+GB) files when they've finished being written to HDFS by a different process. The common way to solve this problem is to modify the writing
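
One common convention on the writing side (an assumption here, not necessarily what this thread settled on) is to write each file under a temporary name and rename it into place only when it is complete, so the consumer never sees a half-written file; a hedged Java sketch with illustrative paths:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class AtomicHdfsWriter {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Write under a name that downstream jobs ignore (FileInputFormat skips
        // paths starting with "_" or "." by default)...
        Path tmp = new Path("/incoming/_events.log.tmp");
        Path done = new Path("/incoming/events.log");
        FSDataOutputStream out = fs.create(tmp);
        try {
          out.writeBytes("example record\n");
        } finally {
          out.close();
        }
        // ...then rename it into place once the write has fully completed.
        fs.rename(tmp, done);
      }
    }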