Re: Multiple Input Paths

2009-11-02 Thread L
Mark, Is the structure of both files the same? It makes even more sense to combine the files, if you can, as I have seen a considerable speed up when I've done that (at least when I've had small files to deal with). Lajos Mark Vigeant wrote: Hey, quick question: I'm writing a program that

RE: Multiple Input Paths

2009-11-02 Thread Mark Vigeant
nt: Monday, November 02, 2009 10:27 AM To: common-user@hadoop.apache.org Subject: Re: Multiple Input Paths Mark, Is the structure of both files the same? It makes even more sense to combine the files, if you can, as I have seen a considerable speed up when I've done that (at least when I'v

Re: Multiple Input Paths

2009-11-02 Thread Amogh Vasekar
Mark, Set-up for a mapred job consumes a considerable amount of time and resources, so a single job is preferred where possible. You can add multiple paths to your job, and if you need different processing logic depending upon the input being consumed, you can use parameter map.input.file in yo
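
A minimal sketch of the idea described above: one job, several input paths, and processing logic chosen from the name of the file currently being read. It is plain Java with no Hadoop dependency, so the path is passed in as an ordinary string; in a real job with the old mapred API it would presumably be read from the job configuration under the key map.input.file. The class name, the directory names "customers" and "orders", and the output format are all hypothetical, chosen only for illustration.

```java
// Sketch only: plain Java, no Hadoop classes. The inputFile argument stands in
// for the value a mapper would obtain from the "map.input.file" parameter.
final class InputDispatch {

    /** Hypothetical kinds of input, one per input directory. */
    enum Source { CUSTOMERS, ORDERS, UNKNOWN }

    /** Decide which kind of input a record belongs to from its file path. */
    static Source classify(String inputFile) {
        if (inputFile == null) {
            return Source.UNKNOWN;
        }
        if (inputFile.contains("/customers/")) {  // hypothetical directory name
            return Source.CUSTOMERS;
        }
        if (inputFile.contains("/orders/")) {     // hypothetical directory name
            return Source.ORDERS;
        }
        return Source.UNKNOWN;
    }

    /** Apply the processing logic that matches the record's source file. */
    static String process(String inputFile, String record) {
        switch (classify(inputFile)) {
            case CUSTOMERS:
                return "customer:" + record.trim();
            case ORDERS:
                return "order:" + record.trim();
            default:
                return "skipped";
        }
    }

    public static void main(String[] args) {
        System.out.println(process("hdfs://namenode/data/customers/part-00000", " alice "));
        System.out.println(process("hdfs://namenode/data/orders/part-00000", " 42 "));
    }
}
```

Classifying the path once per input split, rather than once per record, would avoid repeated string matching; the sketch does it per record only to stay short.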

RE: Multiple Input Paths

2009-11-02 Thread Mark Vigeant
Ok, thank you very much Amogh, I will redesign my program. -Original Message- From: Amogh Vasekar [mailto:am...@yahoo-inc.com] Sent: Monday, November 02, 2009 11:45 AM To: common-user@hadoop.apache.org Subject: Re: Multiple Input Paths Mark, Set-up for a mapred job consumes a

RE: Multiple Input Paths

2009-11-02 Thread Vipul Sharma
Mark, were you able to concatenate both the XML files together? What did you do to keep the resulting XML well formed? Regards, Vipul Sharma, Cell: 281-217-0761

RE: Multiple Input Paths

2009-11-03 Thread Mark Vigeant
nse at all? Sorry if it doesn't, feel free to ask more questions Mark -Original Message- From: Vipul Sharma [mailto:sharmavi...@gmail.com] Sent: Monday, November 02, 2009 7:48 PM To: common-user@hadoop.apache.org Subject: RE: Multiple Input Paths Mark, were you able to concatenate bot

RE: Multiple Input Paths

2009-11-03 Thread vipul sharma
Mark, thanks for the pointer. So as far as I understand, you are not using Hadoop's default split but your own split of one record, defined as everything between the starting tag and the end tag in your XML? So in a way you have one map per record? In my case this will not be efficien

Re: Multiple Input Paths

2009-11-03 Thread Amogh Vasekar
Message- From: Vipul Sharma [mailto:sharmavi...@gmail.com] Sent: Monday, November 02, 2009 7:48 PM To: common-user@hadoop.apache.org Subject: RE: Multiple Input Paths Mark, were you able to concatenate both the xml files together. What did you do to keep the resulting xml well formed? Regards, Vi

RE: Multiple Input Paths

2009-11-04 Thread Mark Vigeant
ommon-user@hadoop.apache.org Subject: Re: Multiple Input Paths Hi Mark, A future release of Hadoop will have a MultipleInputs class, akin to MultipleOutputs. This would allow you to have a different inputformat, mapper depending on the path you are getting the split from. It uses special Delegat

Re: Multiple Input Paths

2009-11-08 Thread Tom White
t I'm skeptical that I can even make that work... Is there a > way you know of that I could submit 2 mapper classes to the job? > > -Original Message- > From: Amogh Vasekar [mailto:am...@yahoo-inc.com] > Sent: Wednesday, November 04, 2009 1:50 AM > To: common-user@hado