RE: Good resource to learn .20 API?

2012-02-08 Thread MONTMORY Alain
Hello, I had the same question at the beginning of 2010 when we started using Hadoop... and we started with the new API because the old one was marked deprecated... And we spent a lot of time on the new API because not all features had been ported to it (MultipleInputFormat for example...). Foll

RE: How to count records in FileInputFormat (MapFile, SequenceFile ?)

2011-09-09 Thread MONTMORY Alain
Does anyone have any advice? Or am I not in the right place? Or maybe my question is stupid... Thank you, Alain [@@THALES GROUP RESTRICTED@@] From: MONTMORY Alain [mailto:alain.montm...@thalesgroup.com] Sent: Thursday, 8 September 2011 10:58 To: mapreduce-user@hadoop.apache.org Subject: How to

How to count records in FileInputFormat (MapFile, SequenceFile ?)

2011-09-08 Thread MONTMORY Alain
Hi everybody, In my application, processing the whole dataset (which we call a CycleWorkflow) may take several weeks, and we must split the CycleWorkflow into multiple DayWorkflows. The current system uses a traditional RDBMS approach and uses SQL OFFSET LIMIT to sp

RE: From a newbie: Questions and will MapReduce fit our needs

2011-08-26 Thread MONTMORY Alain
Hi, I am going to try to respond to your message inline. I am not a Hadoop expert, but we face the same kind of problem (dealing with files that are external to HDFS) in our project, and we use Hadoop. -Original Message- From: Per Steffensen [

RE: MR 0.20.2 job chaining

2011-07-26 Thread MONTMORY Alain
Hello, You can also use the Cascading API (http://www.cascading.org/), which greatly simplifies job chaining. At Thales we tried both the native MR and Cascading approaches, and we obtained very good results (productivity and performance) using Cascading... Regards -Messa

RE: easiest way to install hadoop

2011-02-23 Thread MONTMORY Alain
Hi, From my point of view this is not a trivial question... The latest "stable release" is 0.20.2 (embedded in Cloudera CDH3), not 0.21... If you started with Hadoop recently (end of 2010 for me), you faced a deprecated "old API", so you started with the new API... But in 0.20.2 not all the n

RE: Could we use different output Format for the Mapper and Combiner?

2011-02-16 Thread MONTMORY Alain
Hi, I think you could use different types for the mapper and combiner; they are not linked together. But suppose: mapper <KeyTypeA, ValuetypeB>, reducer <KeyTypeC, ValuetypeD>. In your mapper you have to emit: public void map(KeyTypeA, ValuetypeB) { context.write(KeyTypeC

exception related to logging (0.21.0)

2011-01-18 Thread MONTMORY Alain
Hi everybody, When running map/reduce jobs on 0.21.0 I get this exception: java.lang.NullPointerException at org.apache.hadoop.mapred.TaskLogAppender.flush(TaskLogAppender.java:69) at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:222) at org.apache.hadoop.mapred.Child$4.run(Chil

RE: how to write custom object using M/R

2011-01-14 Thread MONTMORY Alain
Hi, I think you have to add: job.setOutputFormatClass(SequenceFileOutputFormat.class); to make it work. Hope this helps, Alain From: Joan [mailto:joan.monp...@gmail.com] Sent: Friday, 14 January 2011 13:58 To: mapreduce-user Subject: how to write cu