I just installed Hadoop and Pig yesterday on an Ubuntu Jaunty box.
Hadoop 0.18.3-6cloudera0.3.0
Apache Pig version 0.6.0 (r910629)
I have the Hadoop services running, can copy files to HDFS, and have
run the test for computing pi.
The problem I'm having is getting Pig to recognize my
Actually, I think the evaluation is correct.
Overriding the method to pass FALSE as the parameter improves the experience.
Secondly, in
org.apache.pig.Main
grunt.exec(); is called
which in turn calls
parser.parseStopOnError();
which calls
parseStopOnError(false);
Regards
Rohan
Dmitriy Ryaboy wrote:
This stuff is a bit convoluted, isn't it?
I think you may be right (I never use registerScript). Try an experiment?
On Thu, Mar 4, 2010 at 11:20 AM, Rohan Rai wrote:
> In addition
>
> Even
> org.apache.pig.tools.pigscript.parser.PigScriptParser (.jj)
> seems to tell that its not running in batch
Amazon's extension allows one to read from and write to both S3 and HDFS,
whereas the last time I checked, the non-Amazon version only allowed one
or the other, not both. My guess is that the MultiStorage in the regular
piggybank is not written to support multiple file systems.
Hi,
I forgot to respond, but what I was thinking is that there is a need for
one function that splits a string and returns a tuple, and another that
returns a bag. So if tokenize returns a bag, then yes, I agree with Bill that
split should return a tuple.
Am I making sense? :)
- O
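
To make the tuple-versus-bag distinction concrete, here is a minimal sketch in plain Java. It is not Pig code: the class `SplitVsTokenizeSketch` and both method names are hypothetical, a `List<String>` merely stands in for a tuple (positional fields), and a list of one-field records stands in for a bag (whose order a caller should not rely on).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SplitVsTokenizeSketch {
    // Tuple-like result: a single record whose fields keep their positions,
    // so field 0 is always the text before the first delimiter.
    // The delimiter is interpreted as a regular expression.
    public static List<String> splitToTuple(String input, String delimiterRegex) {
        return Arrays.asList(input.split(delimiterRegex));
    }

    // Bag-like result: a collection of one-field records. A List is used here
    // only for simplicity; callers should treat the order as unspecified.
    public static List<List<String>> tokenizeToBag(String input, String delimiterRegex) {
        List<List<String>> bag = new ArrayList<>();
        for (String token : input.split(delimiterRegex)) {
            bag.add(Collections.singletonList(token));
        }
        return bag;
    }

    public static void main(String[] args) {
        System.out.println(splitToTuple("a,b,c", ","));
        System.out.println(tokenizeToBag("a b c", " "));
    }
}
```

With a tuple-like result a caller can address a field by position; with a bag-like result the caller iterates over records instead.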
In addition
Even
org.apache.pig.tools.pigscript.parser.PigScriptParser (.jj)
seems to indicate that it's not running in batch mode.
Is this interpretation incorrect?
Regards
Rohan
Rohan Rai wrote:
That's what makes it confusing.
If you look, the parameter being passed is true, which is sameBatch
o
Hi,
Does anyone have experience running a MultiStorage-like UDF on Elastic
MapReduce? Basically we are trying to store output into multiple
directories based on certain field values. We have had some success
writing a UDF that extends MultiStorage in piggybank to write to HDFS,
but we couldn't get the sam
I kid, I kid...
On Thu, Mar 4, 2010 at 10:34 AM, Alan Gates wrote:
>
> On Mar 4, 2010, at 10:19 AM, Dmitriy Ryaboy wrote:
>
> Thanks to Gerrit and Bill who responded.
>> Unfortunately they said the exact opposite thing so we are still at an
>> impasse :-). Anyone else care to venture an opini
On Mar 4, 2010, at 10:19 AM, Dmitriy Ryaboy wrote:
Thanks to Gerrit and Bill who responded.
Unfortunately they said the exact opposite thing so we are still at an
impasse :-). Anyone else care to venture an opinion?
Cause if Alan and I have a committer fight, he'll win and y'all will
have to
Thanks to Gerrit and Bill who responded.
Unfortunately they said the exact opposite thing so we are still at an
impasse :-). Anyone else care to venture an opinion?
Cause if Alan and I have a committer fight, he'll win and y'all will have to
live with unordered split results :)
-D
On Mon, Mar 1,
That's what makes it confusing.
If you look, the parameter being passed is true, which is sameBatch,
in which case it should ideally not call setBatchOn:
if (!mInteractive && !sameBatch) {
    setBatchOn();
}
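
The guard quoted above can be checked in isolation. What follows is a toy sketch, not Pig's actual code: the class `BatchGuardSketch` and the method `wouldCallSetBatchOn` are hypothetical names, and only the boolean condition is taken from the snippet.

```java
public class BatchGuardSketch {
    // Mirrors only the condition from the quoted snippet:
    //   if (!mInteractive && !sameBatch) { setBatchOn(); }
    // Returns true when that guard would be entered.
    public static boolean wouldCallSetBatchOn(boolean interactive, boolean sameBatch) {
        return !interactive && !sameBatch;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        // Print the full truth table for the two flags.
        for (boolean interactive : values) {
            for (boolean sameBatch : values) {
                System.out.printf("mInteractive=%b sameBatch=%b -> guard entered: %b%n",
                        interactive, sameBatch,
                        wouldCallSetBatchOn(interactive, sameBatch));
            }
        }
    }
}
```

Whenever sameBatch is true the guard is skipped regardless of mInteractive, so at this particular call site setBatchOn would not be invoked. Whether batch mode is enabled elsewhere is not something this sketch can show.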
Dmitriy Ryaboy wrote:
Looks like it's on automatically.
Code below is from
Looks like it's on automatically.
Code below is from trunk, but I don't think this changed recently. I got rid
of exception handling for conciseness.
In PigServer:
public void registerScript(String fileName) throws IOException {
GruntParser grunt = new GruntParser(new FileReader(
Thanks Dmitriy
Just one more question.
registerScript allows one to register a Pig script in embedded mode.
So the confusion was whether it internally tries to optimize the script,
or whether setBatchOn has to be called explicitly.
Regards
Rohan
Dmitriy Ryaboy wrote:
1) Automatically, if you call it right. Look for
1) Automatically, if you call it right. Look for the setBatchOn and
executeBatch methods (I may be slightly off on the method names, going off
memory)
2) The optimizer moves stuff around and may be executing things in a
slightly different order than what you tell it. This can mean pushing up
proj
As an addendum:
How can I play with the Logical Plan?
Rohan Rai wrote:
When using the embedded PigServer and registering a Pig script for execution:
1) Does multi-query optimization happen automatically, or does it have to
be explicitly enabled?
2) Logical Plan: what can one infer from it?
3) Does the Block Size (d
When using the embedded PigServer and registering a Pig script for execution:
1) Does multi-query optimization happen automatically, or does it have to
be explicitly enabled?
2) Logical Plan: what can one infer from it?
3) Does the block size (defined in Hadoop) have an effect on performance
or the number of m