Scott Carey brought Java 7 up in PIG-2643, and I think it's something we need to think about. When do we want to start taking advantage of new features that may not exist on Java 6? Do we ever?
2012/4/12 Scott Carey (Commented) (JIRA) <j...@apache.org>
>
> [ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252706#comment-13252706 ]
>
> Scott Carey commented on PIG-2643:
> ----------------------------------
>
> Another thought for this sort of thing:
>
> This might be achievable without bytecode generation, and with good
> performance, using Java 7 MethodHandles [1][2]. Of course, that would
> require Java 7, but Java 6 support ends later this year [3], about the
> time Pig 0.11 would be out anyway.
>
> [1] http://docs.oracle.com/javase/7/docs/api/java/lang/invoke/MethodHandle.html
> [2] http://stackoverflow.com/questions/8823793/methodhandle-what-is-it-all-about
> [3] https://blogs.oracle.com/henrik/entry/updated_java_6_eol_date
>
> > Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc
> > -------------------------------------------------------------------------------------------------
> >
> > Key: PIG-2643
> > URL: https://issues.apache.org/jira/browse/PIG-2643
> > Project: Pig
> > Issue Type: Improvement
> > Reporter: Jonathan Coveney
> > Assignee: Jonathan Coveney
> > Priority: Minor
> > Labels: codegen
> > Fix For: 0.11, 0.10.1
> >
> > Attachments: PIG-2643-0.patch
> >
> > This is basically to cut my teeth for much more ambitious code
> > generation down the line, but I think it could be performant and useful.
> >
> > The new syntax is:
> > {code}
> > a = load 'thing' as (x:chararray);
> > define concat InvokerGenerator('java.lang.String','concat','String');
> > define valueOf InvokerGenerator('java.lang.Integer','valueOf','String');
> > define valueOfRadix InvokerGenerator('java.lang.Integer','valueOf','String,int');
> > b = foreach a generate x, valueOf(x) as vOf;
> > c = foreach b generate x, vOf, valueOfRadix(x, 16) as vOfR;
> > d = foreach c generate x, vOf, vOfR, concat(concat(x, (chararray)vOf), (chararray)vOfR);
> > dump d;
> > {code}
> >
> > There are some differences between this version and Dmitriy's
> > implementation:
> > - It is no longer necessary to declare whether the method is static or
> >   not. This is gleaned via reflection.
> > - As per the above, it is no longer necessary to make the first argument
> >   be the type of the object to invoke the method on. If it is not a static
> >   method, then the type will implicitly be the type you need. So in the
> >   case of concat, it would need to be passed a tuple of two inputs: one
> >   for the method to be called against (as it is not static), and then the
> >   'String' that was specified. In the case of valueOf, because it IS
> >   static, the 'String' is the only value.
> > - The arguments are type sensitive. Integer means the Object Integer,
> >   whereas int (or long, or float, or boolean, etc.) refers to the
> >   primitive. This is necessary to properly reflect the arguments. Values
> >   passed in WILL, however, be properly unboxed as necessary.
> > - The return type will be reflected.
> >
> > This uses the ASM API to generate the bytecode, and then a custom
> > classloader to load it in. I will add caching of the generated code based
> > on the input strings, etc., but I wanted to get eyes and opinions on this.
> > I also need to benchmark, but it should be native speed (excluding a
> > little startup time to make the bytecode, but ASM is really fast).
> >
> > Another nice benefit is that this bypasses the need for the JDK, though
> > it adds a dependency on ASM (which is a super tiny dependency).
> >
> > Patch incoming.
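
To make Scott's suggestion a bit more concrete, here is a minimal sketch of what a MethodHandle-based invoker might look like. It is an illustration only, not the PIG-2643 patch and not an existing Pig class: the name MethodHandleInvoker and its invoke/isStatic methods are hypothetical, and it assumes the target method is public so that MethodHandles.publicLookup() can resolve it. As in the issue description, whether the method is static is gleaned via reflection, and for an instance method the first argument passed in is the object to call it on.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper for illustration; not part of Pig or the PIG-2643 patch.
final class MethodHandleInvoker {
    private final MethodHandle handle;
    private final boolean isStatic;

    // argTypes are type sensitive: Integer.class and int.class are different.
    MethodHandleInvoker(String className, String methodName, Class<?>... argTypes) {
        Method method;
        MethodHandle resolved;
        try {
            method = Class.forName(className).getMethod(methodName, argTypes);
            // Assumption: the method is public, so a public lookup is enough.
            resolved = MethodHandles.publicLookup().unreflect(method);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot resolve " + className + "." + methodName, e);
        }
        this.handle = resolved;
        // Static-ness is read from the reflected method, not declared by the caller.
        this.isStatic = Modifier.isStatic(method.getModifiers());
    }

    boolean isStatic() {
        return isStatic;
    }

    // For an instance method, args[0] is the receiver; the declared arguments follow.
    // Boxed values are unboxed by invokeWithArguments where the target takes primitives.
    Object invoke(Object... args) {
        try {
            return handle.invokeWithArguments(args);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] argv) {
        MethodHandleInvoker valueOf =
            new MethodHandleInvoker("java.lang.Integer", "valueOf", String.class);
        MethodHandleInvoker valueOfRadix =
            new MethodHandleInvoker("java.lang.Integer", "valueOf", String.class, int.class);
        MethodHandleInvoker concat =
            new MethodHandleInvoker("java.lang.String", "concat", String.class);

        System.out.println(valueOf.invoke("42"));        // 42
        System.out.println(valueOfRadix.invoke("ff", 16)); // 255
        System.out.println(concat.invoke("foo", "bar"));   // foobar
    }
}
```

Note that invokeWithArguments is the convenient entry point, and generally not the fastest one. A real UDF would probably adapt the handle once up front and call invokeExact per tuple, and whether that gets close to the generated bytecode is something we would have to benchmark.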