Hi,
One grammar is responsible for lexing (tokenization), the second one for
parsing, and the last one, a tree grammar, for generating the plan from the
AST output of the parsing.
These are the main ones, but there are also some "minor" ones like the AST
validator or AST printer.
If you had to add a ke
I've taken some time to understand how a Logical Plan progresses to a Physical
and MR Plan (thanks for the boost, Alan!)
My next question is centered around Logical Plan generation. If one were to
add a new keyword (sticking with the theme in my last message, say,
SUPERSPECIALJOIN), that keywo
Awesome -- I really appreciate that insight. Is that recorded anywhere? If
not, then perhaps I'll spend some time writing about how these things are
implemented in the wiki for when others come along with similar questions.
Thanks, Alan!
Many operators, such as join and group by, are not implemented by a single
physical operation. Also, they are spread through the code as they have
logical components and physical components. The logical components of join are
in org.apache.pig.newplan.logical.relational.LOJoin.java. That gets
Thanks Russell -- That's really useful.
Just for kicks and giggles: Where would I look in the code base to see how the
JOIN keyword is implemented? I've found the built-in functions, but not the
keywords (JOIN, GROUP, etc). Perhaps that would give me some hints. Perhaps
it'll show me that a
You can write an EvalFunc UDF that depends on a sort, and there are
several in piggybank that do so. COR (the correlate UDF) is such an
example. You call these UDFs on a relation after ordering it.
For example:
answers = foreach (group data by key)
{
    sorted = order data by value;
    generate m
Hi,
I'm fairly new to writing UDFs and Pig in general. I want to be able to write
a UDF that can take advantage of MapReduce's sorting of data. Specifically,
I'm trying to conceive how I'd write a UDF to do a specialized join or a pivot.
In both cases, sorting would be useful. EvalFunc seems