Re: ERROR 6017: Execution failed, while processing

Alan Gates Mon, 15 Mar 2010 16:05:00 -0700

In your example below, how would the results of these load functions be accessed in your main script?

I certainly see the value of #include plus functions (or #define if you prefer). Without functions, though, you'll have namespace clashes (any relation names used in the imported files will be visible to other imported files and to the main script), and the user will have to know the names of the input and output relations for the imported files so he can use them subsequently in his script. For example, if you had a pig script that implemented a certain type of join:


RETURN = join INPUT1 by $0, INPUT2 by $0;

Now the user has to know that INPUT1 and INPUT2 must be the names of his input relations and that the output relation will be named RETURN. This is also limited because we can't define which key(s) to do the join on. To make this useful we're going to want a macro or function ability so we can pass in names of inputs and other parameters (like which keys to join on), control the names of results, and have variable scoping.


That said, I'm all for it. I think it would make Pig much more usable.

Alan.



On Mar 15, 2010, at 2:58 PM, Dmitriy Ryaboy wrote:

Alan, this would be quite useful, as essentially this would allow developers to create functions by writing them into separate pig scripts and combining them as necessary.
For example, we have code that auto-generates load statements with fairly complex schemas based on protocol buffers (see
http://www.slideshare.net/hadoopusergroup/twitter-protobufs-and-hadoop-hug-021709).
It would be very handy to be able to say something like

#include common_jars.pig
#include load_tweets.pig
#include load_users.pig

#include filter_nonenglish_tweets.pig
#include geomap_users.pig

.. etc ..

-D
On Mon, Mar 15, 2010 at 2:23 PM, Alan Gates <ga...@yahoo-inc.com> wrote:
On Mar 12, 2010, at 10:36 AM, hc busy wrote:
Is there any work towards something like the C language's '#include' in Pig? My large pig script is actually developed separately in several smaller pig files. Individually the pig files do not run because they depend on previous scripts, but logically they are separate because each step does something different.

Currently the only thing existing along these lines is the exec command in grunt. I don't think we're opposed to a #include functionality, we just haven't done it. However, given that Pig doesn't have function calls, and presumably each Pig Latin script is self-contained, it isn't clear to me how useful it will be.

Alan.
