[jira] [Updated] (PIG-1824) Support import modules in Jython UDF

2011-05-17 Thread Woody Anderson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Woody Anderson updated PIG-1824:


Attachment: 1824_final.patch

OK, my bad!

testcase=full.package.path doesn't even run the test, so though I claimed that 
the tests were passing, in fact it was simply that JUnit could run.


Here's a new patch:
there was an extra line that I mistakenly didn't delete when creating the 
re-trunked code.

This patch will pass the tests.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt
>
>
> Currently, a Jython UDF script doesn't support the Jython import statement, as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let the user explicitly specify the 
> module to ship? 
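For readers who want to try the UDF quoted above outside of Pig, here is a minimal sketch in plain Python. The `outputSchema` decorator below is only a stand-in written for this example (it is an assumption about how such a decorator could behave, not Pig's implementation), and `module_source_path` is a hypothetical helper showing one way to find the file behind an imported module, which is the file that would have to reach the backend.

```python
import re


def outputSchema(schema):
    # Stand-in for the decorator Pig makes available to Jython UDFs.
    # Assumption for this sketch: it only records the schema string.
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap


@outputSchema("word:chararray")
def resplit(content, regex, index):
    # Split content on the regex and return the piece at the given index.
    return re.compile(regex).split(content)[index]


def module_source_path(module):
    # Hypothetical helper: the file an imported module was loaded from,
    # or None when the module has no file (for example a built-in).
    return getattr(module, "__file__", None)


print(resplit("a,b,c", ",", 1))
print(module_source_path(re))
```

Whether Pig should discover such paths automatically or rely on an explicit ship clause is exactly the open question in the issue; the sketch takes no position on it.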

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-2077) Project UDF output inside a non-foreach statement fail on 0.8

2011-05-17 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2077:


Attachment: PIG-2077-1.patch

PIG-2077-1.patch is only for Pig 0.8. However, the test case should be 
committed to 0.8, 0.9 and trunk.

> Project UDF output inside a non-foreach statement fail on 0.8
> -
>
> Key: PIG-2077
> URL: https://issues.apache.org/jira/browse/PIG-2077
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.8.1
>
> Attachments: PIG-2077-1.patch
>
>
> The following script fails on 0.8:
> {code}
> A = load '1.txt' as (tracking_id, day:chararray);
> B = load '2.txt' as (tracking_id, timestamp:chararray);
> C = JOIN A by (tracking_id, day) LEFT OUTER, B by (tracking_id,  
> STRSPLIT(timestamp, ' ').$0);
> explain C;
> {code}
> Error stack:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.get(ArrayList.java:324)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.findReferent(ProjectExpression.java:207)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:121)
> at 
> org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:193)
> at 
> org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
> at 
> org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:83)
> at 
> org.apache.pig.newplan.logical.relational.LOJoin.accept(LOJoin.java:149)
> at 
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:262)
> This is not a problem on 0.9 or trunk, since LogicalExpPlanMigrationVistor was 
> dropped in 0.9.
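To make the intent of the failing script concrete, here is a rough plain-Python sketch of what the JOIN is asking for: a left outer join of A and B on (tracking_id, day), where B's day is the first space-separated piece of its timestamp, as in `STRSPLIT(timestamp, ' ').$0`. The sample rows and the helper names `day_of` and `left_outer_join` are made up for illustration, and the sketch ignores Pig details such as null-key handling.

```python
def day_of(timestamp):
    # Mirrors STRSPLIT(timestamp, ' ').$0: split on a space, keep the first piece.
    return timestamp.split(" ")[0]


def left_outer_join(a_rows, b_rows):
    # Index B rows by the composite key (tracking_id, day derived from timestamp).
    b_index = {}
    for tracking_id, timestamp in b_rows:
        key = (tracking_id, day_of(timestamp))
        b_index.setdefault(key, []).append((tracking_id, timestamp))

    joined = []
    for tracking_id, day in a_rows:
        matches = b_index.get((tracking_id, day))
        if matches:
            for match in matches:
                joined.append((tracking_id, day) + match)
        else:
            # LEFT OUTER: keep the unmatched A row and pad B's fields with None.
            joined.append((tracking_id, day, None, None))
    return joined


# Hypothetical sample data, not taken from 1.txt or 2.txt.
a = [("t1", "2011-05-17"), ("t2", "2011-05-16")]
b = [("t1", "2011-05-17 10:00:00")]
print(left_outer_join(a, b))
```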



[jira] [Created] (PIG-2077) Project UDF output inside a non-foreach statement fail on 0.8

2011-05-17 Thread Daniel Dai (JIRA)
Project UDF output inside a non-foreach statement fail on 0.8
-

 Key: PIG-2077
 URL: https://issues.apache.org/jira/browse/PIG-2077
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.1
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.8.1


The following script fails on 0.8:
{code}
A = load '1.txt' as (tracking_id, day:chararray);
B = load '2.txt' as (tracking_id, timestamp:chararray);
C = JOIN A by (tracking_id, day) LEFT OUTER, B by (tracking_id,  
STRSPLIT(timestamp, ' ').$0);
explain C;
{code}

Error stack:
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.get(ArrayList.java:324)
at 
org.apache.pig.newplan.logical.expression.ProjectExpression.findReferent(ProjectExpression.java:207)
at 
org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:121)
at 
org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:193)
at 
org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
at 
org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
at 
org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at 
org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:83)
at 
org.apache.pig.newplan.logical.relational.LOJoin.accept(LOJoin.java:149)
at 
org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at 
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:262)

This is not a problem on 0.9 or trunk, since LogicalExpPlanMigrationVistor was 
dropped in 0.9.



[jira] [Updated] (PIG-2029) Inconsistency in Pig Stats reports

2011-05-17 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-2029:
--

Attachment: PIG-2029.patch

> Inconsistency in Pig Stats reports 
> ---
>
> Key: PIG-2029
> URL: https://issues.apache.org/jira/browse/PIG-2029
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1, 0.9.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.10
>
> Attachments: PIG-2029.patch
>
>
> I have a Pig script which reports varying stats for the same M/R job (same 
> inputs). Sometimes PigStats reports all the stats (such as 
> Maps, Reduces, MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, MinReduceTime 
> and AvgReduceTime) for the M/R job as 0. Sometimes it reports them correctly.
> Enclosed are the stderr logs for 2 runs. Notice that in Run 1, 
> job_201103091134_556600 has 0 in all the columns, whereas in 
> Run 2, Hadoop job job_201104272229_75693 has some valid values. 
> The actual Job Tracker link shows that they are non-empty. This points to a 
> bug in the interaction of the PigStats module with the JobTracker.
> Run 1:
> {quote}
> Job Stats (time in seconds):
> JobId MapsReduces MaxMapTime  MinMapTIme  AvgMapTime  
> MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201103091134_556458   160 100 552 191 368 1257
> 371 392 
> IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P
>DISTINCT,MULTI_QUERY
> job_201103091134_556600   0   0   0   0   0   0   
> 0   0   UNION5  MULTI_QUERY,MAP_ONLY/user/viraj/dir,,
> job_201103091134_556601   7   100 17  8   14  200 
> 15  27  CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER   
> job_201103091134_556602   0   0   0   0   0   0   
> 0   0   CNJOIN3,GNJOIN3,sampleNJOIN3GROUP_BY,COMBINER   
> job_201103091134_556603   0   0   0   0   0   0   
> 0   0   CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER   
> job_201103091134_556604   2   100 13  7   10  34  
> 13  31  CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER   
> job_201103091134_556644   0   0   0   0   0   0   
> 0   0   ONJOIN15SAMPLER 
> job_201103091134_556645   0   0   0   0   0   0   
> 0   0   ONJOIN25SAMPLER 
> job_201103091134_556646   0   0   0   0   0   0   
> 0   0   ONJOIN3 SAMPLER 
> job_201103091134_556654   0   0   0   0   0   0   
> 0   0   ONJOIN19SAMPLER 
> job_201103091134_556662   0   0   0   0   0   0   
> 0   0   ONJOIN19ORDER_BY,COMBINER
> ..
> {quote}
> Run 2:
> {quote}
> Job Stats (time in seconds):
> JobId MapsReduces MaxMapTime  MinMapTIme  AvgMapTime  
> MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201104272229_75503159 100 484 192 353 396 
> 308 321 
> IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P
>DISTINCT,MULTI_QUERY
> job_201104272229_7569318  0   31  14  24  0   
> 0   UNION5 MULTI_QUERY,MAP_ONLY /user/viraj/dir,
> job_201104272229_756947   100 34  13  22  46  
> 20  25  CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER   
> job_201104272229_75695125 100 19  11  15  32  
> 18  26  CNJOIN3,GNJOIN3,sampleNJOIN3GROUP_BY,COMBINER   
> job_201104272229_756981   100 12  12  12  13  
> 9   11  CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER   
> job_201104272229_757022   100 21  5   13  35  
> 22  26  CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER   
> job_201104272229_757241   1   4   4   4   11  
> 11  11  ONJOIN15SAMPLER 
> job_201104272229_757250   0   0   0   0   0   
> 0   ONJOIN25SAMPLER 
> job_201104272229_757266   1   8   6   8   24  
> 24  24  ONJOIN3 SAMPLER 
> job_201104272229_757290   0   0   0   0   0   
> 0   ONJOIN19SAMPLER 
> job_2011

[jira] [Commented] (PIG-2029) Inconsistency in Pig Stats reports

2011-05-17 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035010#comment-13035010
 ] 

Richard Ding commented on PIG-2029:
---

Currently Pig prints zero (0) if the max/min/avg map/reduce time isn't 
available when querying Hadoop through the Hadoop client API. This is 
misleading. I propose that we change those values to 'n/a', as follows:

{code}
Job Stats (time in seconds):
JobId                    Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  Alias  Feature  Outputs
job_201104272229_434232  2     10       354         220         287         168            149            163            IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P  DISTINCT,MULTI_QUERY
job_201104272229_434319  2     0        9           3           6           0              0              0              UNION5  MULTI_QUERY,MAP_ONLY  /user/rding/verifypigstats2-UNION5,
job_201104272229_434320  2     10       n/a         n/a         n/a         n/a            n/a            n/a            CNJOIN3,GNJOIN3,sampleNJOIN3  GROUP_BY,COMBINER
job_201104272229_434321  1     10       5           5           5           23             9              17             CNJOIN25,GNJOIN25,sampleNJOIN25  GROUP_BY,COMBINER
job_201104272229_434322  2     10       n/a         n/a         n/a         n/a            n/a            n/a            CNJOIN15,GNJOIN15,sampleNJOIN15  GROUP_BY,COMBINER
job_201104272229_434323  2     10       n/a         n/a         n/a         n/a            n/a            n/a            CNJOIN19,GNJOIN19,sampleNJOIN19  GROUP_BY,COMBINER
job_201104272229_434331  2     1        n/a         n/a         n/a         n/a            n/a            n/a            ONJOIN15  SAMPLER
job_201104272229_434332  2     1        n/a         n/a         n/a         n/a            n/a            n/a            ONJOIN3  SAMPLER
job_201104272229_434333  1     1        2           2           2           13             13             13             ONJOIN25  SAMPLER
job_201104272229_434334  1     1        1           1           1           12             12             12             ONJOIN19  SAMPLER
job_201104272229_434342  1     10       2           2           2           16             8              11             ONJOIN25  ORDER_BY,COMBINER
{code}
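The proposal can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the code in PIG-2029.patch: it assumes an unavailable timing is represented as `None`, and the names `format_time` and `format_job_row` are hypothetical.

```python
def format_time(seconds):
    # Render a task time; None stands for "not available from Hadoop"
    # (an assumption made for this sketch).
    return "n/a" if seconds is None else str(seconds)


def format_job_row(job_id, maps, reduces, times):
    # times holds the six values MaxMapTime .. AvgReduceTime, in column order.
    cells = [job_id, str(maps), str(reduces)] + [format_time(t) for t in times]
    return "\t".join(cells)


# A job whose timings could not be fetched, and one where they could.
print(format_job_row("job_201104272229_434320", 2, 10, [None] * 6))
print(format_job_row("job_201104272229_434321", 1, 10, [5, 5, 5, 23, 9, 17]))
```

Keeping a genuine 0 distinct from a missing value is the point: a map-only job can legitimately report 0 for its reduce times, while 'n/a' says the number was never obtained.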

> Inconsistency in Pig Stats reports 
> ---
>
> Key: PIG-2029
> URL: https://issues.apache.org/jira/browse/PIG-2029
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1, 0.9.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.10
>
>
> I have a Pig script which reports varying stats for the same M/R job (same 
> inputs). Sometimes PigStats reports all the stats (such as 
> Maps, Reduces, MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, MinReduceTime 
> and AvgReduceTime) for the M/R job as 0. Sometimes it reports them correctly.
> Enclosed are the stderr logs for 2 runs. Notice that in Run 1, 
> job_201103091134_556600 has 0 in all the columns, whereas in 
> Run 2, Hadoop job job_201104272229_75693 has some valid values. 
> The actual Job Tracker link shows that they are non-empty. This points to a 
> bug in the interaction of the PigStats module with the JobTracker.
> Run 1:
> {quote}
> Job Stats (time in seconds):
> JobId MapsReduces MaxMapTime  MinMapTIme  AvgMapTime  
> MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201103091134_556458   160 100 552 191 368 1257
> 371 392 
> IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P
>DISTINCT,MULTI_QUERY
> job_201103091134_556600   0   0   0   0   0   0   
> 0   0   UNION5  MULTI_QUERY,MAP_ONLY/user/viraj/dir,,
> job_201103091134_556601   7   100 17  8   14  200 
> 15  27  CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER   
> job_201103091134_556602   0   0   0   0   0   0   
> 0   0   CNJOIN3,GNJOIN3,sampleNJOIN3GROUP_BY,COMBINER   
> job_201103091134_556603   0   0   0   0   0   0   
> 0   0   CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER   
> job_201103091134_556604   2   100 13  7   10  34  
> 13  31  CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER   
> job_201103091134_556644   0   0   0   0   0   0   
> 0   0   ONJOIN15SAMPLER 
> job_201103091134_556645   0   0   0   0   0   0   
> 0   0   ONJOIN25SAMPLER 
> job_201103091134_556646   0   0   0   0   0   0   
> 0   0   ONJOIN3 SAMPLER 
> job_201103091134_556654   0   0   0   0  

[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-17 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034932#comment-13034932
 ] 

Woody Anderson commented on PIG-1824:
-

Hmm, I ran each of those tests via:

ant -noclasspath test -Dtestcase=org.apache.pig.test.TestScriptUDF
etc., and they all passed.

Is your environment clean?
% printenv | grep YTHON
(should be empty)

Is there anything else I should be doing to mirror your test framework 
(without having to run all the tests, which takes 18 hours)?
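The environment check above can also be done from Python, which may be handy on machines without `printenv`. This is a small sketch; `python_related_vars` is a hypothetical helper name, and it matches the substring "YTHON" anywhere in the `NAME=value` text, as the grep does.

```python
import os


def python_related_vars(env):
    # Return 'NAME=value' entries containing 'YTHON', like `printenv | grep YTHON`.
    entries = ("%s=%s" % (name, value) for name, value in env.items())
    return sorted(entry for entry in entries if "YTHON" in entry)


# An empty list means the environment is clean in the sense asked about above.
print(python_related_vars(os.environ))
```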

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch, 1824x.patch, TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt
>
>
> Currently, a Jython UDF script doesn't support the Jython import statement, as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let the user explicitly specify the 
> module to ship? 



[jira] [Commented] (PIG-1890) Fix piggybank unit test TestAvroStorage

2011-05-17 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034925#comment-13034925
 ] 

Daniel Dai commented on PIG-1890:
-

It seems it should call POProject.getNext(DataBag) instead. Projecting a 
single item assumes the item already has the correct type and needs no 
conversion. The issue is probably caused by plan generation, which produces a 
wrong result type for POProject.

> Fix piggybank unit test TestAvroStorage
> ---
>
> Key: PIG-1890
> URL: https://issues.apache.org/jira/browse/PIG-1890
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.9.0
>Reporter: Daniel Dai
>Assignee: Jakob Homan
> Fix For: 0.9.0
>
> Attachments: PIG-1890-1.patch
>
>
> TestAvroStorage fails on trunk. There are two reasons:
> 1. After PIG-1680, we call LoadFunc.setLocation one more time.
> 2. The schema for AvroStorage seems to be wrong. For example, in the first test 
> case, testArrayDefault, the schema for "in" is set to "PIG_WRAPPER: (FIELD: 
> {PIG_WRAPPER: (ARRAY_ELEM: float)})". It seems PIG_WRAPPER is redundant. This 
> issue was hidden until PIG-1188 was checked in.



[jira] [Updated] (PIG-1824) Support import modules in Jython UDF

2011-05-17 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1824:


Attachment: TEST-org.apache.pig.test.TestScriptUDF.txt
TEST-org.apache.pig.test.TestScriptLanguage.txt
TEST-org.apache.pig.test.TestGrunt.txt

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch, 1824x.patch, TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt
>
>
> Currently, a Jython UDF script doesn't support the Jython import statement, as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let the user explicitly specify the 
> module to ship? 



[jira] [Updated] (PIG-1824) Support import modules in Jython UDF

2011-05-17 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1824:


Status: Open  (was: Patch Available)

I ran the unit tests and saw issues with most of the Python-oriented tests.  
I'll attach the logs from the failing tests.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch, 1824x.patch
>
>
> Currently, a Jython UDF script doesn't support the Jython import statement, as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let the user explicitly specify the 
> module to ship? 
