[ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766746#action_12766746 ]
Alan Gates commented on PIG-928:
--------------------------------

I ran some quick and sloppy performance tests on this. I ran it using both BSF and direct bindings to Groovy, and also using the builtin TOKENIZE function in Pig. Each run read 5000 lines of text. The Groovy (or TOKENIZE) function handles splitting the line, then we do a standard group/count to count the words. I got the following results:

Groovy using BSF:       55.070 seconds
Groovy direct bindings: 58.560 seconds
TOKENIZE:                2.554 seconds

So a slowdown of more than 20x using this. That's pretty painful. I know string translation between languages can be bad. I don't know how much of this is inter-language bindings and how much is Groovy. When I get a chance I'll try this in Python and see if I get similar numbers.

> UDFs in scripting languages
> ---------------------------
>
>                 Key: PIG-928
>                 URL: https://issues.apache.org/jira/browse/PIG-928
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Alan Gates
>         Attachments: package.zip
>
>
> It should be possible to write UDFs in scripting languages such as Python, Ruby, etc. This frees users from needing to compile Java, generate a jar, etc. It also opens Pig to programmers who prefer scripting languages over Java.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
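
For reference, the benchmark described in the comment above (split each line into words, then group and count) can be sketched in a few lines of Python, together with the slowdown ratios implied by the reported timings. This is only an illustration, not the script that was actually run: the `tokenize` and `word_count` helpers are hypothetical names, and the plain whitespace split is an assumption that does not reproduce the delimiter rules of Pig's TOKENIZE.

```python
from collections import Counter


def tokenize(line):
    # Hypothetical stand-in for the splitting step; a plain whitespace
    # split, which is an assumption and not Pig's TOKENIZE behaviour.
    return line.split()


def word_count(lines):
    # Equivalent of the "standard group/count": tally every token.
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return counts


# Timings copied from the comment, in seconds.
timings = {
    "Groovy using BSF": 55.070,
    "Groovy direct bindings": 58.560,
    "TOKENIZE": 2.554,
}

# Slowdown of each variant relative to the builtin TOKENIZE run.
baseline = timings["TOKENIZE"]
slowdown = {name: seconds / baseline for name, seconds in timings.items()}

if __name__ == "__main__":
    for name, ratio in slowdown.items():
        print(f"{name}: {ratio:.1f}x")
```

Dividing the reported timings this way puts both Groovy variants at a bit over 20 times the TOKENIZE run.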