josiahyan commented on pull request #8214: URL: https://github.com/apache/arrow/pull/8214#issuecomment-695114615
@liyafan82 Thanks for taking a look! > I am a little surprised that JIT failed to transform the integer divistion to a right shift, as it can be easily deduced that the type width for an IntVector is always 4. I was pretty surprised too - it turned out that a huge chunk of cycles that we spent serializing data was devoted to this (BaseFixedWidthVector capacity bounds checking) + ArrowBuf bounds checking. This held true across the two/three JDK versions I used, across servers. It was not immediately evident where all the cycles are going on a Java-level profiler, but if you pull out the assembly, I think its quite clear (see the linked JIRA). > I am thinking about another way: can we simply save the value/validity buffer capacities? so we can further improve the performance by avoiding the if branch in getValueBufferValueCapacity and the if branch in capAtMaxInt I did this for a prototype internally - I think this is faster, but if done before even calling setSafe, implicitly assumes that no-one else is writing to the ArrowBuf. If I remember correctly, if we read the capacities directly and do these checks in getValueBufferValueCapacity, deeper in the setSafe call, performance was marginally comparable (at least within <20%?). However, this needs to be checked again. I wasn't too concerned about this, as we run with arrow.enable_unsafe_memory_access=false` set. We might have been able to make it faster on this front of checking the capacity, but its waaaay into diminishing returns (for us at least). We also lost a lot of performance over keeping `arrow.enable_unsafe_memory_access=false`, and I was wondering if there was a better way of eliminating all bounds checks, both on the BaseFixedWidthVector, and in ArrowBuf.chk. When we run our production code with `arrow.enable_unsafe_memory_access=false`, the breakdowns of the setSafe CPU usage (before this patch) according to an async sampling profiler are as following: 53% for handleSafe (which this patch is expected to remove the majority of) ~30% for the ArrowBuf.chk checks in the `set(index, value)` code. I was thinking that if we had a specialized vector append API, that guaranteed ownership of the buffer, we could throw out all of these bounds checks safely (in handleSafe, and ArrowBuf.chk), as well as have a restricted API to optimize building buffers out of. Something like the original [IntWriter](https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/complex/writer/BaseWriter.ScalarWriter.html) interface, but with stronger ownership semantics around the ArrowBuf? Is that something you think is possible/desirable? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org