[ 
https://issues.apache.org/jira/browse/PIG-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169357#comment-13169357
 ] 

[email protected] commented on PIG-2422:
----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3196/
-----------------------------------------------------------

Review request for pig.


Summary
-------

In many cases, we have seen users spend a lot of time debugging simple schema 
definition mistakes in Jython UDFs.
I believe adding some log messages could help users in these cases.

For example, the script below fails with an exception (field row doesn't 
exist) at line P because the schema definition is not annotated properly (the 
@ is missing). There are also no error messages from the Python side.

register 'schemabug.py' using jython as schemabug;
A = load 'schemabugA.txt' using PigStorage() as (x : chararray, y : chararray, z : long);
M = group A by (x, y);
N = foreach M generate schemabug.numberrows(A) as udfout; 
O = foreach N generate FLATTEN(udfout);
P = foreach O generate row.x;
dump P;

schemabug.py
------------
outputSchema("numberrows:bag{rownum:tuple(row:tuple(x:chararray,y:chararray,z:long),number:long)}")
def numberrows(inBag):
  outBag = []
  number = 0
  for row in inBag:
    number = number + 1
    tup = (row, number)
    outBag.append(tup)
  return outBag
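
For comparison, here is a minimal sketch of why the missing @ goes unnoticed. 
Without it, outputSchema(...) is an ordinary expression statement: it builds a 
decorator and throws it away, so the function is never annotated and Python 
has nothing to complain about. The outputSchema defined here is only a 
stand-in (an assumption, so the sketch runs outside Pig) for the decorator 
that Pig makes available to Jython scripts, and numberrows_unannotated is a 
hypothetical name used to keep both variants side by side.

```python
# Stand-in for the decorator Pig provides to Jython UDF scripts.
# Assumption: it records the schema string on the function object.
def outputSchema(schema_str):
    def wrap(func):
        func.outputSchema = schema_str
        return func
    return wrap

SCHEMA = ("numberrows:bag{rownum:tuple("
          "row:tuple(x:chararray,y:chararray,z:long),number:long)}")

# Bug: no '@'. The call returns a decorator that is immediately discarded,
# so the function defined next carries no schema at all.
outputSchema(SCHEMA)
def numberrows_unannotated(inBag):
    return [(row, i + 1) for i, row in enumerate(inBag)]

# Fix: with '@' the decorator is applied to the function.
@outputSchema(SCHEMA)
def numberrows(inBag):
    outBag = []
    number = 0
    for row in inBag:
        number = number + 1
        outBag.append((row, number))
    return outBag

print(hasattr(numberrows_unannotated, "outputSchema"))  # schema missing
print(hasattr(numberrows, "outputSchema"))              # schema attached
```

With the @ in place, the declared schema should reach Pig, so row.x in line P 
can be resolved. A log message at registration time for functions that end up 
without a schema would point users at exactly this kind of mistake.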


This addresses bug PIG-2422.
    https://issues.apache.org/jira/browse/PIG-2422


Diffs
-----


Diff: https://reviews.apache.org/r/3196/diff


Testing
-------


Thanks,

Vivek


                
> Add log messages for Jython schema definitions
> ----------------------------------------------
>
>                 Key: PIG-2422
>                 URL: https://issues.apache.org/jira/browse/PIG-2422
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.9.1, 0.10
>            Reporter: Vivek Padmanabhan
>            Assignee: Vivek Padmanabhan
>            Priority: Minor
>             Fix For: 0.9.1
>
>         Attachments: PIG-2422_1.patch
>
>
> In many cases, we have seen users spend a lot of time debugging simple 
> schema definition mistakes in Jython UDFs.
> I believe adding some log messages could help users in these cases.
> For example, the script below fails with an exception (field row doesn't 
> exist) at line P because the schema definition is not annotated properly 
> (the @ is missing). There are also no error messages from the Python side.
> register 'schemabug.py' using jython as schemabug;
> A = load 'schemabugA.txt' using PigStorage() as (  x : chararray,y : 
> chararray, z : long );
> M = group A by (x, y);
> N = foreach M generate schemabug.numberrows(A) as udfout; 
> O = foreach N generate FLATTEN(udfout);
> P = foreach O generate row.x;
> dump P;
> schemabug.py
> ------------
> outputSchema("numberrows:bag{rownum:tuple(row:tuple(x:chararray,y:chararray,z:long),number:long)}")
> def numberrows(inBag):
>   outBag = []
>   number = 0
>   for row in inBag:
>     number = number + 1
>     tup = (row, number)
>     outBag.append(tup)
>   return outBag


        
