[ 
https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766746#action_12766746
 ] 

Alan Gates commented on PIG-928:
--------------------------------

I ran some quick and sloppy performance tests on this.  I ran it using both BSF 
and direct bindings to groovy.  I also ran it using the builtin TOKENIZE 
function in Pig.  I had it read 5000 lines of text.  The groovy (or TOKENIZE) 
functions handle splitting the line, then we do a standard group/count to count 
the words.  I got the following results:

Groovy using BSF:  55.070 seconds
Groovy direct bindings:  58.560 seconds
TOKENIZE:  2.554 seconds

So roughly a 22x slowdown using this.  That's pretty painful.  I know string 
translation between languages can be bad.  I don't know how much of this is 
inter-language bindings and how much is Groovy.  When I get a chance I'll try 
this in Python and see if I get similar numbers.
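
For reference, the slowdown can be worked out directly from the three timings 
above.  This is only a small sketch of the arithmetic; the variable names are 
made up for illustration and nothing here is part of the patch or of Pig.

```python
# Timings in seconds, copied from the results listed above.
timings = {
    "Groovy using BSF": 55.070,
    "Groovy direct bindings": 58.560,
    "TOKENIZE": 2.554,
}

# Use the builtin TOKENIZE run as the baseline.
baseline = timings["TOKENIZE"]

# Slowdown of each variant relative to the baseline.
slowdown = {name: secs / baseline for name, secs in timings.items()}

for name, ratio in slowdown.items():
    print(f"{name}: {ratio:.1f}x")
```

On these numbers both Groovy variants come out a little over 20x slower than 
TOKENIZE, and the gap between BSF and direct bindings is small by comparison.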

> UDFs in scripting languages
> ---------------------------
>
>                 Key: PIG-928
>                 URL: https://issues.apache.org/jira/browse/PIG-928
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Alan Gates
>         Attachments: package.zip
>
>
> It should be possible to write UDFs in scripting languages such as python, 
> ruby, etc.  This frees users from needing to compile Java, generate a jar, 
> etc.  It also opens Pig to programmers who prefer scripting languages over 
> Java.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
