date:20110901

Re: tutorial on Hadoop/Hbase utility classes

2011-09-01 Thread Arun C Murthy

Thanks for putting this up, it's very useful. I'd encourage you to contribute this as a documentation patch so that you help everyone who comes to hadoop.apache.org, plus you can be a part of the project and a contributor. I can help with the mechanics - here is a link to help you get started:

Re: Binary content

2011-09-01 Thread Dieter Plaetinck

On Wed, 31 Aug 2011 08:44:42 -0700 Mohit Anchlia mohitanch...@gmail.com wrote: Does map-reduce work well with binary contents in the file? This binary content is basically some CAD files and map reduce program needs to read these files using some proprietary tool, extract values and do some

Timer jobs

2011-09-01 Thread Per Steffensen

Hi, I use hadoop for a MapReduce job in my system. I would like to have the job run every 5th minute. Is there any distributed timer job facility in hadoop? Of course I could set up a timer in an external timer framework (CRON or something like that) that invokes the MapReduce job. But CRON is

Re: Timer jobs

2011-09-01 Thread Ronen Itkin

Hi, try using Oozie for job coordination and workflows. On Thu, Sep 1, 2011 at 12:30 PM, Per Steffensen st...@designware.dk wrote: Hi, I use hadoop for a MapReduce job in my system. I would like to have the job run every 5th minute. Is there any distributed timer job facility in hadoop? Of

Re: Hadoop with Netapp

2011-09-01 Thread Steve Loughran

On 25/08/11 08:20, Sagar Shukla wrote: Hi Hakan, Please find my comments inline in blue : -Original Message- From: Hakan (c)lter [mailto:hakanil...@gmail.com] Sent: Thursday, August 25, 2011 12:28 PM To: common-user@hadoop.apache.org Subject: Hadoop with Netapp Hi

Re: Turn off all Hadoop logs?

2011-09-01 Thread Steve Loughran

On 29/08/11 20:31, Frank Astier wrote: Is it possible to turn off all the Hadoop logs simultaneously? In my unit tests, I don’t want to see the myriad “INFO” logs spewed out by various Hadoop components. I’m using: ((Log4JLogger) DataNode.LOG).getLogger().setLevel(Level.OFF);

Re: Timer jobs

2011-09-01 Thread Per Steffensen

Hi, thanks a lot for pointing me to Oozie. I have looked a little bit into Oozie and it seems like the component triggering jobs is called Coordinator Application. But I really see nowhere that this Coordinator Application doesn't just run on a single machine, and that it will therefore not

Re: Timer jobs

2011-09-01 Thread Ronen Itkin

If I get you right, you are asking about installing Oozie as a distributed and/or HA cluster? In that case I am not familiar with an out-of-the-box solution from Oozie. But I think you can make up a solution of your own, for example: installing Oozie on two servers on the same partition which will be

I got the problem from Map output lost

2011-09-01 Thread Tu Tu

Since this week, my Hadoop has been hitting this problem, with the following information: Lost task tracker: tracker_rsync.host01:localhost/127.0.0.1:40759 Map output lost, rescheduling: getMapOutput(attempt_201108021855_6734_m_97_1,2002) failed : org.apache.hadoop.util.DiskChecker$DiskErrorException: Could

Problem with Python + Hadoop: how to link .so outside Python?

2011-09-01 Thread Xiong Deng

Hi, I have successfully installed scipy on my Python 2.7 on my local Linux, and I want to pack my Python2.7 (with scipy) onto Hadoop and run my Python MapReduce scripts, like this: ${HADOOP_HOME}/bin/hadoop streaming \ -input ${input} \ -output ${output} \

Re: Timer jobs

2011-09-01 Thread Alejandro Abdelnur

[moving common-user@ to BCC] Oozie is not HA yet. But it would be relatively easy to make it. It was designed with that in mind, we even did a prototype. Oozie consists of 2 services, a SQL database to store the Oozie jobs state and a servlet container where Oozie app proper runs. The solution

Re: Timer jobs

2011-09-01 Thread Per Steffensen

Thanks for your response. See comments below. Regards, Per Steffensen Alejandro Abdelnur wrote: [moving common-user@ to BCC] Oozie is not HA yet. But it would be relatively easy to make it. It was designed with that in mind, we even did a prototype. Ok, so if it isn't HA out-of-the-box I

Re: Creating a hive table for a custom log

2011-09-01 Thread Brock Noland

Hi, On Thu, Sep 1, 2011 at 9:08 AM, Raimon Bosch raimon.bo...@gmail.com wrote: Hi, I'm trying to create a table similar to apache_log but I'm trying to avoid writing my own map-reduce task because I don't want to have my HDFS files twice. So if you're working with log lines like this:

Re: Timer jobs

2011-09-01 Thread Tharindu Mathew

On Thu, Sep 1, 2011 at 7:58 PM, Per Steffensen st...@designware.dk wrote: Thanks for your response. See comments below. Regards, Per Steffensen Alejandro Abdelnur wrote: [moving common-user@ to BCC] Oozie is not HA yet. But it would be relatively easy to make it. It was designed with

Re: Timer jobs

2011-09-01 Thread Per Steffensen

Well, I am not sure I get you right, but anyway, basically I want a timer framework that triggers my jobs. And the triggering of the jobs needs to work even if one or two particular machines go down. So the timer triggering mechanism has to live in the cluster, so to speak. What I don't

Re: Binary content

2011-09-01 Thread Mohit Anchlia

On Thu, Sep 1, 2011 at 1:25 AM, Dieter Plaetinck dieter.plaeti...@intec.ugent.be wrote: On Wed, 31 Aug 2011 08:44:42 -0700 Mohit Anchlia mohitanch...@gmail.com wrote: Does map-reduce work well with binary contents in the file? This binary content is basically some CAD files and map reduce

Re: Timer jobs

2011-09-01 Thread Tharindu Mathew

In Hadoop, if the client that triggers the job fails, is there a way to recover and have another client submit the job? On Thu, Sep 1, 2011 at 8:44 PM, Per Steffensen st...@designware.dk wrote: Well, I am not sure I get you right, but anyway, basically I want a timer framework that triggers my

Re: Binary content

2011-09-01 Thread Owen O'Malley

On Thu, Sep 1, 2011 at 8:37 AM, Mohit Anchlia mohitanch...@gmail.com wrote: Thanks! Is there a specific tutorial I can focus on to see how it could be done? Take the word count example and change its output format to be SequenceFileOutputFormat.

Re: Timer jobs

2011-09-01 Thread Vitalii Tymchyshyn

01.09.11 18:14, Per Steffensen wrote: Well, I am not sure I get you right, but anyway, basically I want a timer framework that triggers my jobs. And the triggering of the jobs needs to work even if one or two particular machines go down. So the timer triggering mechanism has to live

cross product of 2 data sets

2011-09-01 Thread Marc Sturlese

Hey there, I would like to do the cross product of two data sets, any of them feeds in memory. I've seen pig has the cross operation. Can someone please explain to me how it implements it? -- View this message in context:

Re: cross product of 2 data sets

2011-09-01 Thread Alan Gates

http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html search on cross matches Alan. On Sep 1, 2011, at 11:44 AM, Marc Sturlese wrote: Hey there, I would like to do the cross product of two data sets, any of them feeds in memory. I've seen pig has the cross operation. Can

Re: Timer jobs

2011-09-01 Thread Per Steffensen

Vitalii Tymchyshyn wrote: 01.09.11 18:14, Per Steffensen wrote: Well, I am not sure I get you right, but anyway, basically I want a timer framework that triggers my jobs. And the triggering of the jobs needs to work even if one or two particular machines go down. So the timer

MultipleOutputs - Create multiple files during output

2011-09-01 Thread modemide

Hi all, I was wondering if anyone was familiar with this class. I want to create multiple output files during my reduce. My input files will consist of name1action1date1 name1action2date2 name1action3date3 name2action1date1 name2action2date2 name2action3date3 My goal is to create files with

Namenode not starting

2011-09-01 Thread abhishek sharma

Hi all, I am trying to install Hadoop (release 0.20.203) on a machine with CentOS. When I try to start HDFS, I get the following error. machine-name: Unrecognized option: -jvm machine-name: Could not create the Java virtual machine. Any idea what might be the problem? Thanks, Abhishek

Re: Namenode not starting

2011-09-01 Thread abhishek sharma

Hi Hailong, I have installed JDK and set JAVA_HOME correctly (as far as I know). Output of java -version is: java version 1.6.0_04 Java(TM) SE Runtime Environment (build 1.6.0_04-b12) Java HotSpot(TM) Server VM (build 10.0-b19, mixed mode) I also have another version installed 1.6.0_27 but get

Re: Namenode not starting

2011-09-01 Thread abhishek sharma

Actually, I found the reason. I am running HDFS as root and there is a bug that has recently been fixed. https://issues.apache.org/jira/browse/HDFS-1943 Thanks, Abhishek On Thu, Sep 1, 2011 at 6:25 PM, Ravi Prakash ravihad...@gmail.com wrote: Hi Abhishek, Try reading through the shell

Re: TestDFSIO failure

2011-09-01 Thread Ken Krugler

Hi Matt, On Jun 20, 2011, at 1:46pm, GOEKE, MATTHEW (AG/1000) wrote: Has anyone else run into issues using output compression (in our case lzo) on TestDFSIO and it failing to be able to read the metrics file? I just assumed that it would use the correct decompression codec after it finishes

Re: MultipleOutputs - Create multiple files during output

2011-09-01 Thread Stan Rosenberg

Hi Tim, You could create a custom HashPartitioner so that all key, value pairs denoting the actions of the same user end up in the same reducer; then you need only one output file per reducer. Btw, how large are the output files? Make sure you don't end up creating a lot of small files, i.e.,
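
The partitioning idea in the reply above can be sketched in plain Java, without any Hadoop dependency. This is only an illustration under assumptions: the class name UserPartitionSketch and the helper userOf are hypothetical, and the records are assumed to be tab-separated name, action, date fields, which the original post does not confirm. The arithmetic in partitionFor is the same as in Hadoop's default HashPartitioner; in a real job it would sit inside the getPartition method of a Partitioner subclass that hashes only the user name.

```java
public class UserPartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner: clear the sign bit
    // so the value is non-negative, then take it modulo the reducer count.
    public static int partitionFor(String userName, int numReducers) {
        return (userName.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Hypothetical helper: assumes records look like "name<TAB>action<TAB>date".
    public static String userOf(String record) {
        return record.split("\t", 2)[0];
    }

    public static void main(String[] args) {
        // Made-up sample records in the assumed tab-separated layout.
        String[] records = {
            "name1\taction1\tdate1",
            "name1\taction2\tdate2",
            "name2\taction1\tdate1",
            "name2\taction3\tdate3"
        };
        int numReducers = 4; // illustrative value only

        // Every record of the same user maps to the same reducer index.
        for (String record : records) {
            String user = userOf(record);
            System.out.println(user + " -> reducer " + partitionFor(user, numReducers));
        }
    }
}
```

Because only the user name is hashed, all actions of one user reach the same reducer regardless of the action or date fields. Several users can still share a reducer, so one file per user would need extra handling on the reduce side.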

28 matches