[ https://issues.apache.org/jira/browse/PIG-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183690#comment-13183690 ]
Scott Carey commented on PIG-2359:
----------------------------------

bq. The JVM will detect your checks and not do its own bounds checks if yours are sufficient

More info:

The JVM tries to eliminate array bounds checks, and it can do this in a few ways. The loop predication work in JRE 6u22 or so and later moves bounds checks from inside a loop to outside the loop when it can. Older JVMs can do something similar, but in fewer situations. This applies both to the intrinsic Java check and to any check you write yourself. In fact, the JVM tries to hoist all sorts of code out of the loop if it can, not just array bounds checks.

If it can prove that the index passed in is within range, it may eliminate the bounds check. This can be due to the index variable having a known range (0 to arr.length, for example) or a few other conditions.

For a public virtual method like Tuple.get() it will almost never be able to inline the call at the call site, and so it may never be able to prove that it can remove the bounds checks. In this sort of situation, there are two fast ways: don't check yourself and let the exception bubble up, or check yourself and handle the out-of-bounds condition yourself.

In general, catching an index-out-of-bounds exception is slower than checking yourself, since the JVM can prove that its own checks are useless with yours guarding them, and exception handling is much slower than a code branch.

When the method may be inlined aggressively (small private or effectively final methods especially), leaving manual checks out can be very fast, since the JVM may be able to prove that none are necessary at all at a given call site.

Variants can be performance tested and refined over time. It doesn't have to be perfect now.
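To make the variants in the comment above concrete, here is a minimal sketch in Java. The names BoundsCheckSketch, IntTuple, getChecked, getUnchecked, sum and getOrFallbackViaException are all hypothetical and exist only for illustration; this is not Pig's Tuple API and not the code in the attached patches. Whether the JIT actually removes a given check depends on the JVM version and the call site, so treat this as something to benchmark rather than as a guarantee.

```java
class BoundsCheckSketch {

    // Hypothetical primitive-backed tuple (an assumption, not Pig's Tuple).
    static final class IntTuple {
        private final int[] fields;

        IntTuple(int... fields) {
            this.fields = fields.clone();
        }

        int size() {
            return fields.length;
        }

        // Variant A: check yourself and handle the out-of-bounds case with a
        // plain branch. With this guard in place the JIT may be able to prove
        // that its own implicit check on fields[i] is redundant.
        int getChecked(int i, int fallback) {
            if (i < 0 || i >= fields.length) {
                return fallback;
            }
            return fields[i];
        }

        // Variant B: no manual check. The JVM's implicit check throws
        // ArrayIndexOutOfBoundsException and the caller lets it bubble up.
        int getUnchecked(int i) {
            return fields[i];
        }
    }

    // The index has a known range (0 <= i < size), so if getUnchecked is
    // inlined the JIT may be able to hoist or drop the bounds check.
    static long sum(IntTuple t) {
        long total = 0;
        for (int i = 0; i < t.size(); i++) {
            total += t.getUnchecked(i);
        }
        return total;
    }

    // The pattern the comment warns about: using the exception as control
    // flow. It gives the same result as getChecked, but the out-of-range
    // path is generally much more expensive than a branch.
    static int getOrFallbackViaException(IntTuple t, int i, int fallback) {
        try {
            return t.getUnchecked(i);
        } catch (ArrayIndexOutOfBoundsException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        IntTuple t = new IntTuple(1, 2, 3);
        System.out.println(sum(t));                              // prints 6
        System.out.println(t.getChecked(5, -1));                 // prints -1
        System.out.println(getOrFallbackViaException(t, 5, -1)); // prints -1
    }
}
```

getChecked and getOrFallbackViaException return the same values, so they can be swapped behind one interface and compared in a benchmark before settling on either.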
> Support more efficient Tuples when schemas are known
> ----------------------------------------------------
>
>                 Key: PIG-2359
>                 URL: https://issues.apache.org/jira/browse/PIG-2359
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: PIG-2359.1.patch, PIG-2359.2.patch, PIG-2359.3.patch, PIG-2359.4.patch
>
>
> Pig Tuples have significant overhead due to the fact that all the fields are Objects.
> When a Tuple only contains primitive fields (ints, longs, etc), it's possible to avoid this overhead, which would result in significant memory savings.