Bharath,
This idea has been kicking around in academia, but it has not made it into Apache yet:
https://issues.apache.org/jira/browse/MAPREDUCE-1211
You can get a working prototype from:
http://code.google.com/p/hop/
Ashutosh
On Thu, Mar 4, 2010 at 09:06, E. Sammer wrote:
> On 3/4/10 12:00 PM, bharath v wrote:
A log file with a name like pig_1234567890.log should be sitting in the
directory from which you launched your Pig script. Can you send its contents?
Ashutosh
On Sat, Feb 20, 2010 at 16:41, jiang licht wrote:
> I have a pig script as follows (see far below). It loads 2 data sets,
> perform some f
Hi Mike,
The percentage reported is the percentage of records read by the framework, not
the percentage of records processed. For example, say you have only one record
in the data: the framework will report 100% as soon as it is read, even though
you might be doing a lot of processing on that record and that processing
You might be running into the "small files" problem. This has been
discussed multiple times on the list; grepping through the archives will help.
Also see http://www.cloudera.com/blog/2009/02/02/the-small-files-problem/
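To get a rough feel for whether small files are the issue, something like the following Python sketch may help. It is only an illustration: the 64 MB block size is an assumption (check dfs.block.size on your cluster), and small_file_report is a made-up helper that expects file sizes you have collected yourself, e.g. from a directory listing.

```python
# Assumed HDFS block size of 64 MB; verify dfs.block.size on your cluster.
BLOCK_SIZE = 64 * 1024 * 1024


def small_file_report(sizes, block_size=BLOCK_SIZE):
    """Summarize how many files are smaller than one block.

    sizes: iterable of file sizes in bytes (hypothetical input, e.g. parsed
    from a directory listing of the job's input path).
    """
    sizes = list(sizes)
    total = sum(sizes)
    return {
        "files": len(sizes),
        "small_files": sum(1 for s in sizes if s < block_size),
        "total_bytes": total,
        # Ceiling division: blocks needed if the same bytes were packed
        # into full-size blocks instead of one file each.
        "blocks_if_packed": -(-total // block_size),
    }


if __name__ == "__main__":
    # 1000 files of 1 KB each plus one 128 MB file.
    print(small_file_report([1024] * 1000 + [2 * BLOCK_SIZE]))
```

If "files" is far larger than "blocks_if_packed", the input is probably dominated by small files and packing them into fewer, larger files is worth a look.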
Ashutosh
On Sun, Oct 18, 2009 at 22:57, Kunsheng Chen wrote:
> I and running a ha
Each node reads its own conf files (mapred-site.xml, hdfs-site.xml, etc.).
Make sure your configs are consistent on all nodes across the entire cluster and
point to the correct fs.
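One way to check this is to compare the properties from each node's conf files. Below is a minimal Python sketch, assuming you have already copied the XML text of each node's file to one place; parse_hadoop_conf and diff_confs are hypothetical helper names, and the host names in the example are made up.

```python
import xml.etree.ElementTree as ET


def parse_hadoop_conf(xml_text):
    """Return {name: value} from the text of a Hadoop *-site.xml file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_confs(confs):
    """confs: {node_name: {prop: value}}.

    Return {prop: {node_name: value}} for every property whose value is
    not identical on all nodes (a missing property shows up as None).
    """
    all_keys = set()
    for props in confs.values():
        all_keys.update(props)
    mismatches = {}
    for key in sorted(all_keys):
        values = {node: props.get(key) for node, props in confs.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches


if __name__ == "__main__":
    template = (
        "<configuration><property><name>fs.default.name</name>"
        "<value>{}</value></property></configuration>"
    )
    confs = {
        "node1": parse_hadoop_conf(template.format("hdfs://nn1:8020")),
        "node2": parse_hadoop_conf(template.format("hdfs://nn2:8020")),
    }
    print(diff_confs(confs))
```

Any property that shows up in the output (fs.default.name in this made-up example) has different values on different nodes and is worth checking first.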
Hope it helps,
Ashutosh
On Thu, Oct 15, 2009 at 16:36, Bhupesh Bansal wrote:
> Hey Folks,
>
> I am seeing a very weir
Hi Chris,
Pig doesn't mandate Ctrl-A or any other character as the field
delimiter. You can tell Pig which delimiter to use. For example, you can
specify Ctrl-A as the field delimiter as follows:
A = load 'mydata' using PigStorage('\u0001');
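If it helps to see what a Ctrl-A delimited record looks like outside Pig, here is a small Python sketch (not Pig code). format_record and parse_record are made-up helpers and the field values are invented; the only thing taken from the load statement above is the '\u0001' character.

```python
CTRL_A = "\u0001"  # the same character passed to PigStorage('\u0001')


def format_record(fields, delimiter=CTRL_A):
    """Join field values into one delimited line (no trailing newline)."""
    return delimiter.join(str(f) for f in fields)


def parse_record(line, delimiter=CTRL_A):
    """Split one delimited line back into a list of string fields."""
    return line.rstrip("\n").split(delimiter)


if __name__ == "__main__":
    line = format_record(["alice", 30, "NYC"])
    print(repr(line))
    print(parse_record(line + "\n"))
```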
If you don't specify any delimiter, e.g. A = l
Hamburg is here: http://wiki.apache.org/hadoop/Hamburg
Ashutosh
On Thu, Sep 3, 2009 at 16:42, Mark Kerzner wrote:
> Ok, then, I can join hamburg. Where is it?
>
> On Thu, Sep 3, 2009 at 3:12 PM, Amandeep Khurana wrote:
>
> > There is another project- Hamburg - on similar lines. Check that out
Hi,
In my map-reduce job, I see the following stack trace in the syslog logs of my
map tasks. It repeats at roughly 10-minute intervals about 4-5
times, and eventually the map tasks complete successfully.
I am not sure what to make of this stacktrace. Are there repeated
trials and then it eventually