Re: Could we use different output Format for the Mapper and Combiner?

2011-02-16 Thread Stanley Xu
Hi Alain, I thought Harsh is correct, for I found the same in the old mails: http://search-hadoop.com/m/eSd3VxxvkC1/combiner+input+output+format&subj=What+s+a+valid+combiner+ And per my test, the combiner probably could not have a different output type from the mapper, at least until v 0.20.3. Tha

Re: use DistributedCache to add many files to class path

2011-02-16 Thread Alejandro Abdelnur
Lei Liu, You have a cut&paste error: the second addition should use 'tairJarPath', but it is using 'jeJarPath'. Hope this helps. Alejandro On Thu, Feb 17, 2011 at 11:50 AM, lei liu wrote: > I use DistributedCache to add two files to the class path, example code below > : > String jeJarPa

use DistributedCache to add many files to class path

2011-02-16 Thread lei liu
I use DistributedCache to add two files to the class path, example code below: String jeJarPath = "/group/aladdin/lib/je-4.1.7.jar"; DistributedCache.addFileToClassPath(new Path(jeJarPath), conf); String tairJarPath = "/group/aladdin/lib/tair-aladdin-2.3.1.jar"; Distrib
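For reference, a minimal sketch of the intended setup with the second addition corrected to use 'tairJarPath', as Alejandro points out (the class and method names around it are assumptions; the jar paths are the ones from the mail and are expected to already be on hdfs):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;

    public class ClasspathSetup {
      public static void addJars(Configuration conf) throws Exception {
        // First jar on the task class path
        String jeJarPath = "/group/aladdin/lib/je-4.1.7.jar";
        DistributedCache.addFileToClassPath(new Path(jeJarPath), conf);

        // Second jar -- note it uses tairJarPath, not jeJarPath
        String tairJarPath = "/group/aladdin/lib/tair-aladdin-2.3.1.jar";
        DistributedCache.addFileToClassPath(new Path(tairJarPath), conf);
      }
    }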

How to configure multiple hadoop instances?

2011-02-16 Thread Juwei Shi
Hi, Does anyone have experience configuring multiple hadoop instances on the same cluster? I changed the port numbers, temp directory, and local storage directory, but the running instances still conflict. The second instance cannot come up correctly. Thanks, -- - Juwei
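For what it's worth, a sketch of the kind of per-instance overrides that usually have to differ between two instances (the host name, ports, and paths below are placeholders, not values from this thread):

    <!-- core-site.xml / hdfs-site.xml / mapred-site.xml for instance 2 -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://master:9100</value>   <!-- instance 1 uses a different port -->
    </property>
    <property>
      <name>mapred.job.tracker</name>
      <value>master:9101</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/data/hadoop2/tmp</value>
    </property>
    <property>
      <name>dfs.name.dir</name>
      <value>/data/hadoop2/dfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/data/hadoop2/dfs/data</value>
    </property>
    <property>
      <name>mapred.local.dir</name>
      <value>/data/hadoop2/mapred/local</value>
    </property>

If both instances run daemons on the same machines, also check dfs.http.address, dfs.datanode.address, dfs.datanode.ipc.address, dfs.datanode.http.address, mapred.job.tracker.http.address and mapred.task.tracker.http.address, which all default to fixed ports.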

Re: hadoop fs -put vs writing text files to hadoop as sequence files

2011-02-16 Thread Chase Bradford
We use sequence files for storing text data, and you definitely notice the cost of compressing client side while streaming to hdfs. If I remember correctly, it took about 10x as long. That drove us to using writer threads that fed off a single input stream a few thousand lines at a time, and wrote to a
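As a point of reference, a minimal single-threaded sketch of that client-side pattern: read text lines and append them to a block-compressed SequenceFile on hdfs. The output path, the DefaultCodec choice and the LongWritable/Text key-value layout are assumptions:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.DefaultCodec;

    public class TextToSequenceFile {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        DefaultCodec codec = new DefaultCodec();
        codec.setConf(conf);

        // Block compression batches many records per compressed block
        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, new Path("/data/out.seq"),
            LongWritable.class, Text.class,
            SequenceFile.CompressionType.BLOCK, codec);

        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        LongWritable key = new LongWritable();
        Text value = new Text();
        String line;
        long lineNo = 0;
        while ((line = in.readLine()) != null) {
          key.set(lineNo++);   // key: line number
          value.set(line);     // value: the raw text line
          writer.append(key, value);
        }
        writer.close();
      }
    }

The compression is what eats client CPU here; Chase's multi-threaded variant presumably runs several such writers in parallel over chunks of the same input stream.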

seattle hadoop announce: meeting February 17th 2011 @ 7:15 pm, HBase, Google NGram, Python

2011-02-16 Thread sean jensen-grey
Hello Fellow Mappers and Reducers, We are meeting at 7:15 pm on February 17th at the University Heights Community Center 5031 University Way NE Seattle WA 98105 Room -- BASEMENT! The meetings are informal and highly conversational. If you have questions about Hadoop and map reduce this is a gre

how/where to set metadata for a sequence file ?

2011-02-16 Thread Mapred Learn
Hi, I have text file data that I want to upload to hdfs as a sequence file. So, where can I define the metadata for this file so that users accessing it as a sequence file can understand and read it? -thks JJ
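One option is the SequenceFile header itself: SequenceFile.Metadata is a set of Text key/value pairs stored in the file header and readable by anyone who opens the file. A minimal sketch (the metadata keys and values below are invented for illustration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class SeqFileWithMetadata {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/data/with-meta.seq");

        // Describe the record layout in the file header
        SequenceFile.Metadata meta = new SequenceFile.Metadata();
        meta.set(new Text("schema"), new Text("userId\tname\ttimestamp"));
        meta.set(new Text("source"), new Text("upload-2011-02-16"));

        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, path, LongWritable.class, Text.class,
            SequenceFile.CompressionType.NONE, null, null, meta);
        writer.append(new LongWritable(1), new Text("example record"));
        writer.close();

        // Consumers read it back from the header
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        Text schema = reader.getMetadata().get(new Text("schema"));
        System.out.println("schema = " + schema);
        reader.close();
      }
    }

An alternative (not shown) is to keep a separate schema file alongside the data, which avoids tying the description to the SequenceFile format.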

hadoop fs -put vs writing text files to hadoop as sequence files

2011-02-16 Thread Mapred Learn
Hi, I have to upload some terabytes of data consisting of text files. What would be a good option to do so: i) using hadoop fs -put to copy the text files directly to hdfs, or ii) copying the text files as sequence files to hdfs? What would be the extra time in case (ii) as opposed to (i)? Thanks, Jimmy

Re: Is it possible to determine the source of a value in the Mapper?

2011-02-16 Thread Harold Lim
Hi Ben, You can do something like this: ((FileSplit) context.getInputSplit()).getPath() -Harold --- On Wed, 2/16/11, Benjamin Hiller wrote: From: Benjamin Hiller Subject: Is it possible to determine the source of a value in the Mapper? To: mapreduce-user@hadoop.apache.org Date: Wednesday,

Re: Is it possible to determine the source of a value in the Mapper?

2011-02-16 Thread Benjamin Hiller
Thank you, that works fine. =) - Original Message - From: Alex Kozlov To: mapreduce-user@hadoop.apache.org Cc: Benjamin Hiller Sent: Wednesday, February 16, 2011 10:37 PM Subject: Re: Is it possible to determine the source of a value in the Mapper? There is a way to get

Re: Is it possible to determine the source of a value in the Mapper?

2011-02-16 Thread Alex Kozlov
There is a way to get the file name in the new mapreduce API: fileName = ((FileSplit) context.getInputSplit()).getPath().toString(); You usually do it in the setup() method. On Wed, Feb 16, 2011 at 1:32 PM, Benjamin Hiller < benjamin.hil...@urz.uni-heidelberg.de> wrote: > Hi, > > is it possibl
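Putting that together, a minimal mapper sketch using the new API (the class name, generic types, and the way the file name is used afterwards are assumptions):

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    public class SourceAwareMapper
        extends Mapper<LongWritable, Text, Text, Text> {

      private String fileName;

      @Override
      protected void setup(Context context) {
        // The input split tells us which file this map task is reading
        fileName = ((FileSplit) context.getInputSplit()).getPath().toString();
      }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Tag each record with its source, e.g. by a marker in the file name
        if (fileName.contains("sourceA")) {
          context.write(new Text("A"), value);
        } else {
          context.write(new Text("B"), value);
        }
      }
    }

One caveat: the cast assumes the job uses a FileInputFormat-derived input format, so getInputSplit() really returns a FileSplit.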

How read compressed files?

2011-02-16 Thread Pedro Costa
Hi, 1 - I'm trying to read parts of a compressed file to generate message digests, but I can't fetch the right parts. I searched for an example that reads compressed files, but I couldn't find one. As I have 3 partitions in my example, below are the indexes of the file: raw bytes: 54632 / offset: 0 / par
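For the general case of reading a compressed file from hdfs, a minimal sketch that picks the codec from the file name and streams through it (the path is a placeholder; this is not specific to the map-output partition layout in the question). Note that non-splittable codecs such as gzip can only be decompressed from the start of the stream, so seeking to a raw-byte offset of one partition inside the compressed data will not produce the right bytes.

    import java.io.InputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;

    public class ReadCompressed {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path(args[0]);   // e.g. /data/part-00000.gz

        // Pick the codec that matches the file extension (.gz, .deflate, ...)
        CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(file);

        InputStream in = (codec == null)
            ? fs.open(file)                            // plain file
            : codec.createInputStream(fs.open(file));  // decompress while reading
        IOUtils.copyBytes(in, System.out, 4096, true);
      }
    }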

Is it possible to determine the source of a value in the Mapper?

2011-02-16 Thread Benjamin Hiller
Hi, is it possible to determine the source (the filename for example) of a key-value pair in the mapper? What I need to do is to differentiate between two different sources, although the records of each source are of the same kind (so I can't differentiate between the sources by looking at the

Re: Why jobtracker.jsp can safely call a non-thread-safe method of JT?

2011-02-16 Thread Mahadev Konar
These are all getter APIs which, if you look through the implementations, call some synchronized APIs. It's just that they don't need to be synchronized across all the completed/retired/running jobs for the web UI. thanks mahadev On Wed, Feb 16, 2011 at 12:47 AM, Min Zhou wrote: > Anyone can help me w

RE: Could we use different output Format for the Mapper and Combiner?

2011-02-16 Thread MONTMORY Alain
Hi, I think you could use different types for the mapper and combiner; they are not linked together. But suppose: mapper <KeyTypeA, ValueTypeB>, reducer <KeyTypeC, ValueTypeD>. In your mapper you have to emit: public void map(KeyTypeA, ValueTypeB) { context.write(KeyTypeC

RE: unsubscribe

2011-02-16 Thread Fitzpatrick, Kenneth
Thanks! From: simon [mailto:randyh...@gmail.com] Sent: Wednesday, February 16, 2011 7:16 AM To: mapreduce-user@hadoop.apache.org Subject: Re: unsubscribe Hi, Kenneth~ this will work to unsubscribe the mailing list please send something to the following address mapreduce-user-unsubscr...

Re: unsubscribe

2011-02-16 Thread simon
Hi, Kenneth~ This will work to unsubscribe from the mailing list: please send something to the following address mapreduce-user-unsubscr...@hadoop.apache.org and you will get it Best Regards, Simon Hsu 2011/2/16 Fitzpatrick, Kenneth > unsubscribe >

unsubscribe

2011-02-16 Thread Fitzpatrick, Kenneth
unsubscribe

Re: Could we use different output Format for the Mapper and Combiner?

2011-02-16 Thread Harsh J
The combiner must "have the same input and output key types and the same input and output value types" (as per the docs for setting one). The combined outputs are treated as typical map outputs after processing, so that the reducer still applies to them properly. For this to work, your combiner can'
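A minimal sketch of what that constraint looks like in code, assuming a word-count-style job with Text/IntWritable map outputs (the class and type names are illustrative):

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Reducer;

    public class CombinerTypesExample {

      // A combiner is just a Reducer whose input types AND output types
      // are both the map output types (Text, IntWritable here).
      public static class SumCombiner
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          ctx.write(key, new IntWritable(sum));  // same types in, same types out
        }
      }

      public static void configure(Job job) {
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // Emitting a different value type from the combiner (e.g. MapWritable)
        // would break the reducer, which still expects the map output types.
        job.setCombinerClass(SumCombiner.class);
      }
    }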

Could we use different output Format for the Mapper and Combiner?

2011-02-16 Thread Stanley Xu
Dear all, I am writing a map-reduce job today in which I hope to use different output formats for the Mapper and Combiner. I am using Text as the output format of the Mapper and MapWritable as the output format of the Combiner. But it looks like hadoop doesn't support that yet? I have some code like the following:

Re: ProgramDriver JobControl in 20.2 API

2011-02-16 Thread Harsh J
Hey, On Wed, Feb 16, 2011 at 2:02 PM, Joachim Van den Bogaert wrote: > Hi, > > > > I have a couple of questions: > > > > 1.   What is the best way to create a composed MapReduce job in the 20.2 > API? Can you use JobControl, which is still located in the mapred namespace, > or is it better to

Re: Why jobtracker.jsp can safely call a non-thread-safe method of JT?

2011-02-16 Thread Min Zhou
Anyone can help me with this? Thanks, Min On Tue, Feb 15, 2011 at 11:16 AM, Min Zhou wrote: > Hi all, > > Hadoop JobTracker's http info server provides running/failed/completed > job information on the web through jobtracker.jsp. Lines below show > the logic of how the web retrieves that informati

ProgramDriver JobControl in 20.2 API

2011-02-16 Thread Joachim Van den Bogaert
Hi, I have a couple of questions: 1. What is the best way to create a composed MapReduce job in the 20.2 API? Can you use JobControl, which is still located in the mapred namespace, or is it better to avoid mixing APIs? 2. Has anyone ever worked with composed MapReduce jobs on
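On question 1, a minimal sketch of chaining two jobs with JobControl as it ships in 0.20 (org.apache.hadoop.mapred.jobcontrol, which still takes the old-API JobConf); the job names, paths, and dependency shape are assumptions:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.jobcontrol.Job;
    import org.apache.hadoop.mapred.jobcontrol.JobControl;

    public class ComposedJobs {
      public static void main(String[] args) throws Exception {
        // First stage (old-style JobConf, since 0.20's JobControl lives in mapred.*)
        JobConf conf1 = new JobConf(ComposedJobs.class);
        conf1.setJobName("stage-1");
        FileInputFormat.setInputPaths(conf1, new Path("/data/in"));
        FileOutputFormat.setOutputPath(conf1, new Path("/data/tmp"));

        // Second stage reads the first stage's output
        JobConf conf2 = new JobConf(ComposedJobs.class);
        conf2.setJobName("stage-2");
        FileInputFormat.setInputPaths(conf2, new Path("/data/tmp"));
        FileOutputFormat.setOutputPath(conf2, new Path("/data/out"));

        Job job1 = new Job(conf1, null);   // no dependencies
        Job job2 = new Job(conf2, null);
        job2.addDependingJob(job1);        // stage-2 waits for stage-1

        JobControl control = new JobControl("composed");
        control.addJob(job1);
        control.addJob(job2);

        // JobControl is a Runnable; run it in a thread and poll for completion
        Thread runner = new Thread(control);
        runner.start();
        while (!control.allFinished()) {
          Thread.sleep(1000);
        }
        control.stop();
      }
    }

For a simple linear chain, just calling one job's waitForCompletion() before submitting the next from a driver also works, without pulling in the jobcontrol classes.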