[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2013-05-06 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649924#comment-13649924
 ] 

Rohini Palaniswamy commented on PIG-1824:
-

To use a Jython install, its Lib directory must be on the Jython search path, either
 * via the environment variable JYTHON_HOME=jy_home or JYTHON_PATH=jy_home/Lib:..., or
 * by having jython-standalone.jar on the classpath.
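
As a rough illustration of the first option, the sketch below lists the directories that the two environment variables named above would contribute. The helper name `candidate_lib_dirs` is hypothetical, and treating JYTHON_PATH as an `os.pathsep`-separated list is an assumption; this is not a description of how Pig or Jython resolves the path internally.

```python
import os

def candidate_lib_dirs(env):
    """Return the library directories implied by env (hypothetical helper)."""
    dirs = []
    home = env.get("JYTHON_HOME")
    if home:
        # JYTHON_HOME points at the install root; the standard library is in its Lib dir.
        dirs.append(os.path.join(home, "Lib"))
    path = env.get("JYTHON_PATH")
    if path:
        # Assumption: JYTHON_PATH is a list of directories separated by os.pathsep.
        dirs.extend(p for p in path.split(os.pathsep) if p)
    return dirs

if __name__ == "__main__":
    # Report which of the implied directories actually exist on this machine.
    for d in candidate_lib_dirs(os.environ):
        print(d, "(exists)" if os.path.isdir(d) else "(missing)")
```

If neither variable is set, the list is empty, which matches the case where only the standalone jar can supply the standard library.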

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10.0
>
> Attachments: 1824a.patch, 1824b.patch, 1824c.patch, 1824d.patch, 
> 1824_final.patch, 1824.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt
>
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2013-05-06 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649920#comment-13649920
 ] 

Rohini Palaniswamy commented on PIG-1824:
-

You need to have the jython/Lib directory on the classpath; we bundle it with our 
deployment. Otherwise you need jython-standalone.jar instead of jython.jar, as in 
Pig 0.11. 



[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2013-05-03 Thread Martin Gerlach (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13648793#comment-13648793
 ] 

Martin Gerlach commented on PIG-1824:
-

Doesn't work for me, either (with codecs module). Pig version is 0.10.0-cdh4.1.2




[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2013-01-29 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565541#comment-13565541
 ] 

Russell Jurney commented on PIG-1824:
-

This does not actually work for me, in either Pig 0.10 or Pig 0.10.1. I can't 
include the 're' module via 'import re', or I get this error:

Caused by: Traceback (most recent call last):
  File "udfs.py", line 20, in <module>
    import re
ImportError: No module named re
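
One way to narrow down an error like this is to try the import in isolation and look at the search path the interpreter is actually using. The helper below is a hypothetical diagnostic sketch, not part of Pig; it avoids the `except ... as` form so that it should also parse on older interpreters such as Jython 2.5, though that has not been verified here.

```python
import sys

def check_import(name):
    """Try to import a module and return (ok, detail) (hypothetical helper)."""
    try:
        __import__(name)
        return True, "imported %s" % name
    except ImportError:
        # sys.exc_info() works on both old and new interpreters.
        exc = sys.exc_info()[1]
        return False, "%s; sys.path=%r" % (exc, sys.path)

if __name__ == "__main__":
    ok, detail = check_import("re")
    print(detail)
```

If the reported `sys.path` contains no entry for the Jython Lib directory or the standalone jar, the standard library is simply not visible to the interpreter.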



[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-20 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037043#comment-13037043
 ] 

Olga Natkovich commented on PIG-1824:
-

Let's get it committed! Thanks, Woody, for contributing!

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-20 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037039#comment-13037039
 ] 

Richard Ding commented on PIG-1824:
---

Patch passed e2e python tests.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-19 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036514#comment-13036514
 ] 

Olga Natkovich commented on PIG-1824:
-

I believe that Richard is running some additional tests. Once he is done, he is 
planning to commit the patch.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-19 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036496#comment-13036496
 ] 

Woody Anderson commented on PIG-1824:
-

cool. can we get this into trunk so i don't have to keep fixing the patches?


> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-18 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035542#comment-13035542
 ] 

Richard Ding commented on PIG-1824:
---

The new patch fixed the unit test errors reported earlier. I have one 
(different) failed test in TestGrunt, not sure if it's related to the patch. 

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-17 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034932#comment-13034932
 ] 

Woody Anderson commented on PIG-1824:
-

hmm.. i ran each of those tests via:

ant -noclasspath test -Dtestcase=org.apache.pig.test.TestScriptUDF
etc. and they all passed.

is your environment clean?
% printenv | grep YTHON
(should be empty)
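
The same check can be sketched in Python for setups where a shell one-liner is awkward. The function name `python_related_entries` is hypothetical; like the grep above, it matches on the whole NAME=value entry, so a value containing "YTHON" is reported too.

```python
import os

def python_related_entries(env):
    """Return NAME=value entries containing 'YTHON', like `printenv | grep YTHON`."""
    entries = ["%s=%s" % (name, value) for name, value in env.items()]
    return sorted(e for e in entries if "YTHON" in e)

if __name__ == "__main__":
    leftovers = python_related_entries(os.environ)
    print(leftovers if leftovers else "environment is clean")
```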

is there anything else i should be doing to try to mirror your test framework 
(while not having to run all tests for the 18 hours that that requires)?

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch, 1824x.patch, TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-12 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032594#comment-13032594
 ] 

Alan Gates commented on PIG-1824:
-

Woody, 

This patch now conflicts with the changes that were checked in as part of 
PIG-2056.  I don't understand how to resolve the conflicts.  You could upload a 
new patch or just tell me how to do the resolution so I can continue testing.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-10 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031274#comment-13031274
 ] 

Alan Gates commented on PIG-1824:
-

I'll start running the tests and such.  I also want to add some end-to-end 
tests.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-08 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030597#comment-13030597
 ] 

Julien Le Dem commented on PIG-1824:


+1 for inclusion from me.
Thanks for including the comments, Woody.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-06 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030116#comment-13030116
 ] 

Woody Anderson commented on PIG-1824:
-

i'm not sure what's really left to keep this out of the next release, given 
we've been going back and forth over issues that don't even affect functionality.
but, there are other jython related bugs in the pipe for 0.10 anyway, so 
perhaps having them all in the same release is a good idea from a feature 
grouping perspective.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch, 
> 1824d.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-03 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028366#comment-13028366
 ] 

Woody Anderson commented on PIG-1824:
-

understood.
adding that null check/throw etc. is just a change that is unrelated to this 
bug. I can bundle it up as all the related lines of code are being changed by 
this bug anyway, but that's why i didn't do it originally.

I'll add a throw similar to current impl of getScriptAsStream

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.9.0
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028362#comment-13028362
 ] 

Julien Le Dem commented on PIG-1824:


Hi Woody,
I had misread the code about automatic deletion. You're right it deletes only 
if it was created by Pig.

I understand that the null check is superfluous and that the warning is somewhat 
incorrect. 
To me there should be either no null check in that case, or an exception thrown 
if the stream is null. This is about debuggability of the code. If someone changes the 
behavior of getScriptAsStream(), there should be an exception in your code at 
that point, not somewhere else. It also helps with understanding the code, so 
that the reader does not wonder why it does nothing when the stream is null 
(because it's never null, but then why do we check? etc.).
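
The fail-at-the-call-site idea can be sketched outside Java. The following is a plain Python stand-in, not the Pig code: `get_script_as_stream` and `init` are hypothetical analogues of the methods under discussion, and the in-memory `scripts` dict replaces the file system and classpath lookup.

```python
import io

def get_script_as_stream(scripts, path):
    """Stand-in for getScriptAsStream(): look up path in an in-memory dict."""
    data = scripts.get(path)
    if data is None:
        raise ValueError("Could not find script " + path)
    return io.BytesIO(data)

def init(scripts, path):
    stream = get_script_as_stream(scripts, path)
    if stream is None:
        # Unreachable today; if the lookup is ever changed to return None,
        # the failure surfaces here instead of somewhere further along.
        raise RuntimeError("script stream is None for " + path)
    return stream.read()
```

The alternative with the same debuggability is to drop the check in `init` entirely; what the comment argues against is a check that silently skips the work.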

otherwise it looks good. Thanks!

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.9.0
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch, 1824c.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-04-25 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025007#comment-13025007
 ] 

Woody Anderson commented on PIG-1824:
-

agree:

in re: PYTHON_CACHEDIR: the code behaves as you wish, in that it only deletes 
the dir if it (pig) created it.
sorry for not being clear in comments about that, but if you read the 
code you'll see it.

if we can't write, i (pig) was creating an alternate directory. It may be 
possible to pre-populate this, and i understand (and had) the desire to have an 
error instead of a new directory, but I was initially experiencing this error:
{code}
*sys-package-mgr*: can't create package cache dir, '/grid/0/Releases/pig-0.8.0..1103222002-20110401-000/share/pig-0.8.0..1103222002/lib/cachedir/packages'
{code}

which is why i added the 'is writable' check, but after reviewing (per your 
comment), it seems that cachedir is not set on the grid (at least at the point 
when the static block runs). If left as null, it seems to default to some grid 
location that is not writable (and thus doesn't work), but if i set it to a 
writable tmp first, it works.
so.. i can safely agree that an error if the dir isn't writable is both 
desirable and works.
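
The policy agreed on here (use a writable temp dir when nothing is configured, raise when a configured dir is not writable, and delete only a dir that was created by the tool) can be sketched as follows. This is an illustrative Python stand-in, not the patch itself; the function names and the `pig_jython_` prefix are hypothetical.

```python
import os
import shutil
import tempfile

def prepare_cache_dir(configured):
    """Return (path, created_here) following the policy described above."""
    if configured is None:
        # Nothing configured: create a writable temp dir that we own.
        return tempfile.mkdtemp(prefix="pig_jython_"), True
    if not (os.path.isdir(configured) and os.access(configured, os.W_OK)):
        # A user-supplied dir that cannot be used is an error,
        # not a cue to silently pick a different one.
        raise RuntimeError("cache dir is not writable: %s" % configured)
    return configured, False

def cleanup_cache_dir(path, created_here):
    # Delete only what we created; never remove a directory the user supplied.
    if created_here:
        shutil.rmtree(path, ignore_errors=True)
```

Carrying the `created_here` flag alongside the path is what keeps a user's pre-populated cache safe at exit.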

as for the getScriptAsStream():
i followed the existing code convention on that one, though i didn't like it 
either.
again, if you read down a bit you'll see that the impl of getScriptAsStream() 
is:
{code}
..
if (is == null) {
    throw new IllegalStateException(
        "Could not initialize interpreter (from file system or classpath) with " + scriptPath);
}
return is;
{code}

so, the null check is superfluous but does quiet the "not null check" warnings.
i didn't add an additional throw statement in this case b/c essentially, my 
code wouldn't add any _new_ errors that the existing code didn't already 
exhibit if somehow the impl of getScriptAsStream changed and could return null.

anyway, i'll upload a new patch to address the writable issue. if you think it's 
a big deal we can add an 'else throw' statement around getScriptAsStream

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.9.0
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-04-25 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024960#comment-13024960
 ] 

Julien Le Dem commented on PIG-1824:


Hi Woody,
This is a great feature. 
I agree with the static block comments, but I don't see how you could do it 
differently without a major refactoring of the existing code.
Here are comments/questions about some details of the implementation.

in JythonScriptEngine.Interpreter static block:
* If _PYTHON_CACHEDIR_ is provided, we will delete it on exit. Shouldn't we 
delete it only if it has been created by Pig? it is dangerous to delete 
something that we have not created. The user could shoot himself in the foot by 
providing something he cares about as the _PYTHON_CACHEDIR_.
* Also, if we can't write to the provided _PYTHON_CACHEDIR_ we create another 
one. Can the user pre-populate the cache dir? If yes we should throw an 
exception here.

in JythonScriptEngine.Interpreter.init():
* Something should fail if _is_ is null.
{code}
InputStream is = getScriptAsStream(path);
if (is != null) {
{code}

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.9.0
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-04-08 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017789#comment-13017789
 ] 

Woody Anderson commented on PIG-1824:
-

ok. i understand your thoughts on static, and mostly i have them too, but the 
PythonInterpreter is a static member of the Interpreter class, and the code i 
wrote must run BEFORE that interpreter is constructed.

Interpreter is a private inner class, so it cannot be caused to load before 
normal use patterns. So, moving the static block into the static block for 
Interpreter addresses your concerns.

import will not cause the static block to be executed btw, it's the first 
executed reference to the class. However, i take the point that some code could 
have been:
{code}
Class = JythonScriptEngine.class;
{code}
or something like that to cause the class to be loaded. Still, as i said: 
Interpreter static block addresses this, and the ctor is out b/c of the static 
nature of Interpreter.interpreter.

on second point:
i don't see the point of an includeResources() method: if it can be done, it can 
be done in init(), if not it won't be done. Why add a new method?


> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.9.0
>
> Attachments: 1824.patch, 1824a.patch
>
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
> return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 



[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-04-08 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017762#comment-13017762
 ] 

Alan Gates commented on PIG-1824:
-

On the issue of the static block, I dislike static initialization blocks 
because you're never sure when they are going to be called.  Someone adding 
"import o.a.p.s.j.JythonScriptEngine" somewhere in the code will result in 
changing when this is executed, possibly including times when it does not need 
to be executed.  I don't think just moving it into the Interpreter class as a 
static block will change that.  Can't it be in Interpreter's constructor?

On the second point, what I meant was, should there be a separate method 
ScriptEngine.includeResources()?  This would make clear to developers of future 
scripting engines that this is something they need to do.  The contract would 
then be that before Pig called ScriptEngine.registerFunction it would call 
includeResources().  I agree with you that, when possible, all scripting engine 
implementations should include their resources.  I was not suggesting a 
supportsFeature() method.  For situations where it cannot be supported, 
includeResources would be a no-op.
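
To make the proposed contract concrete, here is a minimal sketch. It is written 
in Python only for brevity (the real ScriptEngine is a Java class), and the 
class names, the calls list, and the register() helper are hypothetical 
stand-ins, not anything from the patch. It only illustrates the ordering: the 
caller always invokes the resource hook first, and engines that cannot support 
it inherit a no-op.

```python
class ScriptEngine(object):
    """Hypothetical Python stand-in for the Java ScriptEngine base class."""

    def __init__(self):
        # Records the order of calls, for illustration only.
        self.calls = []

    def include_resources(self, script_path):
        # Default is a no-op: engines that cannot ship resources inherit this.
        self.calls.append(("include_resources", script_path, False))

    def register_function(self, script_path, name):
        self.calls.append(("register_function", script_path, name))


class JythonLikeEngine(ScriptEngine):
    """Hypothetical engine that is able to resolve and ship its imports."""

    def include_resources(self, script_path):
        # A real engine would locate and package the imported modules here.
        self.calls.append(("include_resources", script_path, True))


def register(engine, script_path, name):
    # The contract: include_resources() is always called before registration.
    engine.include_resources(script_path)
    engine.register_function(script_path, name)
```

With this shape the caller never needs a supportsFeature()-style check, 
because calling the hook on an engine that does nothing is harmless.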





[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-04-08 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017549#comment-13017549
 ] 

Woody Anderson commented on PIG-1824:
-

1. I could re-work the initialization into the static block of the inner class 
"Interpreter"; it simply needs to be done before the interpreter is allocated. 
I'm not sure what you mean by not wanting a cache dir when using Python UDFs or 
control flow. Can you clarify?
2. Separate the logic out of init into what? I think it should, in general, be 
the contract of any script environment to handle resource inclusion (if 
possible). Are you imagining some scenario where init(file,..) would not 
actually parse/internalize the code inside init()? I don't much care where the 
code is parsed and added to a ScriptEngine, but when it is, it should handle 
all other evaluated resources that are necessary to succeed. In the current 
API, a user-provided script file is given to init(), so that's where it must do 
this. There is really no other place to evaluate resource inclusions, and I 
think I might not be understanding your suggestion. As for other ScriptEngines 
that may not be able to support this concept, are you suggesting a 
"supportsFeature()" method that we use to test various SEs to determine if 
they can support this (or other) features? I'm not sure what we'd do with this 
knowledge.



[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-04-06 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016475#comment-13016475
 ] 

Alan Gates commented on PIG-1824:
-

A couple of questions:

# Based on my analysis the static init block that this patch adds to 
JythonScriptEngine will only get invoked once we know we have Jython in the 
mix.  Is that correct?  We don't want to be invoking this when there are no 
Python UDFs or Python control flow in use.
# Right now the code to do this is part of the init of the 
JythonScriptEngine.  Should we make this a separate method in ScriptEngine 
so that other languages can also add this kind of functionality?  I would not 
make it abstract, since some languages may not be able to do this.  But it 
seems like it makes for a cleaner interface.


> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.8.0, 0.9.0, 0.10
>
> Attachments: 1824.patch, 1824a.patch
>
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 



[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-03-31 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014086#comment-13014086
 ] 

Woody Anderson commented on PIG-1824:
-

The following may not be immediately self-evident to all developers:

Import statements that execute from within runtime function calls will not work 
(unless the dependency has already been satisfied statically), e.g.:
{code}
def resplit(content, regex, index):
 import re
 return re.compile(regex).split(content)[index]
{code}

will not work b/c the import is not attempted until after the job has been 
defined, built, and deployed.
This import practice is frowned upon and is used very rarely. If you happen to 
be doing it (I'll assume you have a good reason), then you probably know how to 
fix it. If you're using someone else's code that is written like this, you can 
satisfy the dependency by explicitly importing the module up front. This will 
cause it to be added to the jar, and subsequent uses will succeed.
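
As a sketch of that workaround (the @outputSchema decorator is left out so the 
snippet also runs outside Pig), adding a top-level import alongside the 
function-level one satisfies the dependency up front:

```python
# Top-level import: executed when the script is first evaluated, so the
# dependency is satisfied before the job is built.
import re


def resplit(content, regex, index):
    # The late import is now harmless, because 're' was already imported above.
    import re
    return re.compile(regex).split(content)[index]
```

For example, resplit("a,b,c", ",", 1) returns "b".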


> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.8.0, 0.9.0, 0.10
>
> Attachments: 1824.patch
>
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 



[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-03-29 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012608#comment-13012608
 ] 

Woody Anderson commented on PIG-1824:
-

This code, as originally written, cannot work:
{code}
import re
@outputSchema("y:bag{t:tuple(word:chararray)}")
def strsplittobag(content,regex):
    return re.compile(regex).split(content)
{code}

The reason is that split returns a list of strings, not a list of tuples, so the 
JythonFunction casting will fail. I've created a ticket for these kinds of 
'obvious' type coercions: https://issues.apache.org/jira/browse/PIG-1942

As such, I am going to change the example code for this ticket to something 
that will work once 'import re' works.
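
For illustration only (this is not the change made to the ticket's example), 
one way to make the original function match its declared schema would be to 
wrap each piece in a one-element tuple. The sketch assumes a bag is returned as 
a list of tuples, and omits the @outputSchema decorator so it runs outside Pig:

```python
import re


def strsplittobag(content, regex):
    # A bag is modeled as a list of tuples, so wrap each string in a 1-tuple
    # instead of returning the bare list of strings from split().
    return [(word,) for word in re.compile(regex).split(content)]
```

For example, strsplittobag("a b c", " ") returns [("a",), ("b",), ("c",)].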

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("y:bag{t:tuple(word:chararray)}")
> def strsplittobag(content,regex):
>     return re.compile(regex).split(content)
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 



[jira] Commented: (PIG-1824) Support import modules in Jython UDF

2011-03-01 Thread David Ciemiewicz (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000982#comment-13000982
 ] 

David Ciemiewicz commented on PIG-1824:
---

I don't think it is appropriate to just leave this up to the end user to figure 
this stuff out.

Especially when the errors won't be discovered until the user attempts to run 
the code on the grid
then must decipher the errors
then must track down the individual dependency files
then must try to figure out how to ship the necessary files
then must try to track down why it still doesn't work because the import files 
contained dependencies on imported files
then must track down the subsequent dependencies
then ...

If Jython itself does not provide hooks to enumerate all dependencies after 
parsing, would it be possible to build a tool which recurses the imports and 
then provides information to the end user on how to package all the 
dependencies for shipping (or, better, just does it)?

Couldn't this be a requirement for all language bindings: to provide a method or 
script for enumerating all dependent files, even if the interpreter 
implementation in Java doesn't provide this functionality natively?
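
A rough sketch of what such an enumeration could look like, in plain Python 
rather than anything Pig-specific: snapshot sys.modules, import the module, and 
report the source files of everything that was newly loaded, which picks up 
transitive imports as well. The function name and the throwaway pig1824_a / 
pig1824_b modules are made up for the demo, and modules that were already 
loaded before the snapshot would not be reported.

```python
import importlib
import os
import sys
import tempfile


def files_loaded_by(module_name):
    """Import module_name and return the files of every module that was
    newly loaded as a result (direct and transitive imports)."""
    before = set(sys.modules)
    importlib.import_module(module_name)
    found = set()
    for name in set(sys.modules) - before:
        path = getattr(sys.modules[name], "__file__", None)
        if path:  # built-in modules have no file to ship
            found.add(path)
    return found


# Demo with two throwaway modules: pig1824_a imports pig1824_b.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "pig1824_b.py"), "w") as f:
    f.write("VALUE = 1\n")
with open(os.path.join(demo_dir, "pig1824_a.py"), "w") as f:
    f.write("import pig1824_b\n")
sys.path.insert(0, demo_dir)

# Both files are reported, even though only pig1824_a was named.
shipped = files_loaded_by("pig1824_a")
```

The resulting set of paths is the kind of list a user (or the engine itself) 
would need in order to package the dependencies.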

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Reporter: Richard Ding
>Assignee: Richard Ding
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("y:bag{t:tuple(word:chararray)}")
> def strsplittobag(content,regex):
>     return re.compile(regex).split(content)
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 





[jira] Commented: (PIG-1824) Support import modules in Jython UDF

2011-01-26 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987143#action_12987143
 ] 

Alan Gates commented on PIG-1824:
-

+1 to Ashutosh's comment.  Also, this won't port well as we add UDFs in new 
languages.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Reporter: Richard Ding
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("y:bag{t:tuple(word:chararray)}")
> def strsplittobag(content,regex):
>     return re.compile(regex).split(content)
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1824) Support import modules in Jython UDF

2011-01-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987105#action_12987105
 ] 

Ashutosh Chauhan commented on PIG-1824:
---

Unless there is a Java API provided by the Jython interpreter that lists all the 
dependencies of a Jython script, trying to figure out all the module 
dependencies ourselves will be close to writing a linker, won't it? I think it 
will be easier to let the user specify and ship their modules in the meantime.
