Re: Pipelining data from map to reduce

2010-03-04 Thread Ashutosh Chauhan
Bharath, this idea has been kicking around in academia but has not made it into Apache yet. https://issues.apache.org/jira/browse/MAPREDUCE-1211 You can get a working prototype from: http://code.google.com/p/hop/ Ashutosh On Thu, Mar 4, 2010 at 09:06, E. Sammer wrote: > On 3/4/10 12:00 PM, bharath v wrote:

Re: Unexpected empty result problem (zero-sized part-### files)?

2010-02-20 Thread Ashutosh Chauhan
A log file with a name like pig_1234567890.log must be sitting in the directory from which you launched your Pig script. Can you send its contents? Ashutosh On Sat, Feb 20, 2010 at 16:41, jiang licht wrote: > I have a pig script as follows (see far below). It loads 2 data sets, > perform some f

Re: what does it mean when a job fails at 100%?

2009-11-13 Thread Ashutosh Chauhan
Hi Mike, The percentage reported is the percentage of records read by the framework, not the percentage of records processed. So, for the sake of example, let's say you have only one record in the data: the framework will report 100% as soon as it is read, even though you might be doing a lot of processing on that record and that processing
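
The distinction between "read" and "processed" can be illustrated with a minimal Python sketch. This is not Hadoop code: `run_task`, `failing_process`, and the progress log are hypothetical names made up for illustration. Progress is updated when a record is read, so a task with one record shows 100% before processing has finished, or even started to fail.

```python
def run_task(records, process, progress_log):
    """Simulate a task whose progress counts records read, not records processed."""
    total = len(records)
    for i, record in enumerate(records, start=1):
        # Progress is reported as soon as the record has been read...
        progress_log.append(100 * i // total)
        # ...while the actual work on that record happens afterwards.
        process(record)


def failing_process(record):
    # Hypothetical stand-in for user code that fails on a record.
    raise RuntimeError("processing failed for " + repr(record))


progress_log = []
try:
    run_task(["only-record"], failing_process, progress_log)
except RuntimeError:
    pass  # the task failed, but 100% had already been reported
```

After this runs, `progress_log` holds a single entry of 100 even though no record was processed successfully.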

Re: Hadoop dfs can't allocate memory with enough hard disk space when data gets huge

2009-10-19 Thread Ashutosh Chauhan
You might be hitting the "small files" problem. This has been discussed multiple times on the list; grepping through the archives will help. Also see http://www.cloudera.com/blog/2009/02/02/the-small-files-problem/ Ashutosh On Sun, Oct 18, 2009 at 22:57, Kunsheng Chen wrote: > I and running a ha
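
As a rough illustration of why many small files exhaust memory long before disk, here is a back-of-the-envelope Python sketch. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode heap per file, directory, or block object; that figure and the function name `namenode_heap_bytes` are assumptions for illustration, not measurements from this thread.

```python
# Assumption: roughly 150 bytes of NameNode heap per namespace object
# (file, directory, or block). This is a rule of thumb, not an exact figure.
BYTES_PER_OBJECT = 150


def namenode_heap_bytes(num_files, blocks_per_file=1):
    """Estimate NameNode heap used by num_files files (directories ignored)."""
    # Each file contributes one file object plus one object per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


# Ten million single-block files, regardless of how small each file is.
estimate = namenode_heap_bytes(10_000_000)
```

Under these assumptions the estimate is about 3 GB of heap for ten million small files, independent of how much disk space they occupy.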

Re: Error in FileSystem.get()

2009-10-15 Thread Ashutosh Chauhan
Each node reads its own conf files (mapred-site.xml, hdfs-site.xml, etc.). Make sure your configs are consistent on all nodes across the entire cluster and point to the correct fs. Hope it helps, Ashutosh On Thu, Oct 15, 2009 at 16:36, Bhupesh Bansal wrote: > Hey Folks, > > I am seeing a very weir

Re: Pig and Hive on the same data?

2009-09-30 Thread Ashutosh Chauhan
Hi Chris, Pig doesn't mandate Ctrl-A or any other character as the field delimiter. You can tell Pig which delimiter to use. For example, you can specify Ctrl-A as the field delimiter as follows: A = load 'mydata' using PigStorage('\u0001'); If you don't specify any delimiter, e.g. A = l
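
For readers unfamiliar with Ctrl-A as a delimiter: it is the character U+0001, the same one written as '\u0001' in the PigStorage call above. A small Python sketch of what such a record looks like and how it splits into fields follows; the sample row and the helper name `split_fields` are made up for illustration.

```python
CTRL_A = "\u0001"  # the Ctrl-A character


def split_fields(line, delimiter=CTRL_A):
    """Split one text record into fields on the given delimiter."""
    return line.rstrip("\n").split(delimiter)


# Hypothetical sample row with three fields separated by Ctrl-A.
row = CTRL_A.join(["1", "alice", "42"]) + "\n"
fields = split_fields(row)
```

Because the delimiter is just a parameter, the same data file can be read by any tool that is told to split on the same character.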

Re: Pregel

2009-09-03 Thread Ashutosh Chauhan
Hamburg is here: http://wiki.apache.org/hadoop/Hamburg Ashutosh On Thu, Sep 3, 2009 at 16:42, Mark Kerzner wrote: > Ok, then, I can join hamburg. Where is it? > > On Thu, Sep 3, 2009 at 3:12 PM, Amandeep Khurana wrote: > > > There is another project- Hamburg - on similar lines. Check that out

what to make of java.io.IOException: Premeture EOF from inputStream ?

2009-07-27 Thread Ashutosh Chauhan
Hi, In my map-reduce job, I see the following stacktrace in the syslog logs of my map tasks. It repeats about 4-5 times at roughly 10-minute intervals, and eventually the map tasks complete successfully. I am not sure what to make of this stacktrace. Are there repeated trials and then it eventually