On Fri, Aug 23, 2019, 11:49 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> I don't think we should couple this discussion with the implementation of >> large list, etc since I think those two concepts are independent. > > I'm still trying to balance in my mind which is a worse experience for > consumers of the libraries for these types. Claiming that Java supports > these types but throwing an exception when the Vectors exceed 32-bits or > just say they aren't supported until we have 64-bit support in Java. > The vector indexes being limited to 32 bits doesn't limit the addressing to 32 bit chunks of memory. For example, you're prime example before was image data. Having 2 billion images of 1mb images would still be supported without changing the index addressing. > >> I've asked some others on my team their opinions on the risk here. I >> think we should probably review some our more complex vector interactions >> and see how the jvm's assembly changes with this kind of change. Using >> microbenchmarking is good but I think we also need to see whether we're >> constantly inserting additional instructions or if in most cases, this >> actually doesn't impact instruction count. > > > Is this something that your team will take on? > Yeah, we need to look at this I think. Do you need a rebased version of the PR or is the existing one sufficient? > Rebase would help if possible. > Thanks, > Micah > > On Thu, Aug 22, 2019 at 8:56 PM Jacques Nadeau <jacq...@apache.org> wrote: > >> I don't think we should couple this discussion with the implementation of >> large list, etc since I think those two concepts are independent. >> >> I've asked some others on my team their opinions on the risk here. I >> think we should probably review some our more complex vector interactions >> and see how the jvm's assembly changes with this kind of change. Using >> microbenchmarking is good but I think we also need to see whether we're >> constantly inserting additional instructions or if in most cases, this >> actually doesn't impact instruction count. >> >> >> >> On Wed, Aug 21, 2019 at 12:18 PM Micah Kornfield <emkornfi...@gmail.com> >> wrote: >> >>> >>>> With regards to the reference implementation point. It is a good point. >>>> I'm on vacation this week. Unless you're pushing hard on this, can we pick >>>> this up and discuss more next week? >>> >>> >>> Hi Jacques, I hope you had a good rest. Any more thoughts on the >>> reference implementation aspect of this? >>> >>> >>>> To copy the sentiments from the 0.15.0 release thread, I think it >>>> would be best to decouple this discussion from the release timeline >>>> given how many people we have relying on regular releases coming out. >>>> We can keep continue making major 0.x releases until we're ready to >>>> release 1.0.0. >>> >>> >>> I'm OK with it as long as other stakeholders are. Timed releases are the >>> way to go. As stated on the release thread [1] we need a better mechanism >>> to avoid this type of issue arising again. The release thread also had >>> some more discussion on compatibility. >>> >>> Thanks, >>> Micah >>> >>> [1] >>> https://lists.apache.org/thread.html/d70feeceaf2570906ade117030b29887af7c77ca5c4a976e6d555920@%3Cdev.arrow.apache.org%3E >>> >>> >>> On Wed, Aug 14, 2019 at 3:23 PM Wes McKinney <wesmck...@gmail.com> >>> wrote: >>> >>>> On Sun, Aug 11, 2019 at 9:40 PM Micah Kornfield <emkornfi...@gmail.com> >>>> wrote: >>>> > >>>> > Hi Wes and Jacques, >>>> > See responses below. >>>> > >>>> > With regards to the reference implementation point. It is a good >>>> point. I'm >>>> > > on vacation this week. Unless you're pushing hard on this, can we >>>> pick this >>>> > > up and discuss more next week? >>>> > >>>> > >>>> > Sure thing, enjoy your vacation. I think the only practical >>>> implications >>>> > are it delays choices around implementing LargeList, LargeBinary, >>>> > LargeString in Java, which in turn might push out the 0.15.0 release. >>>> > >>>> >>>> To copy the sentiments from the 0.15.0 release thread, I think it >>>> would be best to decouple this discussion from the release timeline >>>> given how many people we have relying on regular releases coming out. >>>> We can keep continue making major 0.x releases until we're ready to >>>> release 1.0.0. >>>> >>>> > My stance on this is that I don't know how important it is for Java to >>>> > > support vectors over INT32_MAX elements. The use cases enabled by >>>> > > having very large arrays seem to be concentrated in the native code >>>> > > world (e.g. C/C++/Rust) -- that could just be >>>> implementation-centrism >>>> > > on my part, though. >>>> > >>>> > >>>> > A data point against this view is Spark has done work to eliminate 2GB >>>> > memory limits on its block sizes [1]. I don't claim to understand the >>>> > implications of this. Bryan might you have any thoughts here? I'm OK >>>> with >>>> > INT32_MAX, as well, I think we should think about what this means for >>>> > adding Large types to Java and implications for reference >>>> implementations. >>>> > >>>> > Thanks, >>>> > Micah >>>> > >>>> > [1] https://issues.apache.org/jira/browse/SPARK-6235 >>>> > >>>> > On Sun, Aug 11, 2019 at 6:31 PM Jacques Nadeau <jacq...@apache.org> >>>> wrote: >>>> > >>>> > > Hey Micah, >>>> > > >>>> > > Appreciate the offer on the compiling. The reality is I'm more >>>> concerned >>>> > > about the unknowns than the compiling issue itself. Any time you've >>>> been >>>> > > tuning for a while, changing something like this could be totally >>>> fine or >>>> > > cause a couple of major issues. For example, we've done a very >>>> large amount >>>> > > of work reducing heap memory footprint of the vectors. Are target >>>> is to >>>> > > actually get it down to 24 bytes per ArrowBuf and 24 bytes heap per >>>> vector >>>> > > (not including arrow bufs). >>>> > > >>>> > > With regards to the reference implementation point. It is a good >>>> point. >>>> > > I'm on vacation this week. Unless you're pushing hard on this, can >>>> we pick >>>> > > this up and discuss more next week? >>>> > > >>>> > > thanks, >>>> > > Jacques >>>> > > >>>> > > On Sat, Aug 10, 2019 at 7:39 PM Micah Kornfield < >>>> emkornfi...@gmail.com> >>>> > > wrote: >>>> > > >>>> > >> Hi Jacques, >>>> > >> I definitely understand these concerns and this change is risky >>>> because it >>>> > >> is so large. Perhaps, creating a new hierarchy, might be the >>>> cleanest way >>>> > >> of dealing with this. This could have other benefits like >>>> cleaning up >>>> > >> some >>>> > >> cruft around dictionary encode and "orphaned" method. Per past >>>> e-mail >>>> > >> threads I agree it is beneficial to have 2 separate reference >>>> > >> implementations that can communicate fully, and my intent here was >>>> to >>>> > >> close >>>> > >> that gap. >>>> > >> >>>> > >> Trying to >>>> > >> > determine the ramifications of these changes would be >>>> challenging and >>>> > >> time >>>> > >> > consuming against all the different ways we interact with the >>>> Arrow Java >>>> > >> > library. >>>> > >> >>>> > >> >>>> > >> Understood. I took a quick look at Dremio-OSS it seems like it >>>> has a >>>> > >> simple java build system? If it is helpful, I can try to get a >>>> fork >>>> > >> running that at least compiles against this PR. My plan would be >>>> to cast >>>> > >> any place that was changed to return a long back to an int, so in >>>> essence >>>> > >> the Dremio algorithms would reman 32-bit implementations. >>>> > >> >>>> > >> I don't have the infrastructure to test this change properly from >>>> a >>>> > >> distributed systems perspective, so it would still take some time >>>> from >>>> > >> Dremio to validate for regressions. >>>> > >> >>>> > >> I'm not saying I'm against this but want to make sure we've >>>> > >> > explored all less disruptive options before considering changing >>>> > >> something >>>> > >> > this fundamental (especially when I generally hold the view that >>>> large >>>> > >> cell >>>> > >> > counts against massive contiguous memory is an anti pattern to >>>> scalable >>>> > >> > analytical processing--purely subjective of course). >>>> > >> >>>> > >> >>>> > >> I'm open to other ideas here, as well. I don't think it is out of >>>> the >>>> > >> question to leave the Java implementation as 32-bit, but if we do, >>>> then I >>>> > >> think we should consider a different strategy for reference >>>> > >> implementations. >>>> > >> >>>> > >> Thanks, >>>> > >> Micah >>>> > >> >>>> > >> On Sat, Aug 10, 2019 at 5:09 PM Jacques Nadeau <jacq...@apache.org >>>> > >>>> > >> wrote: >>>> > >> >>>> > >> > Hey Micah, I didn't have a particular path in mind. Was thinking >>>> more >>>> > >> along >>>> > >> > the lines of extra methods as opposed to separate classes. >>>> > >> > >>>> > >> > Arrow hasn't historically been a place where we're writing >>>> algorithms in >>>> > >> > Java so the fact that they aren't there doesn't mean they don't >>>> exist. >>>> > >> We >>>> > >> > have a large amount of code that depends on the current behavior >>>> that is >>>> > >> > deployed in hundreds of customer clusters (you can peruse our >>>> dremio >>>> > >> repo >>>> > >> > to see how extensively we leverage Arrow if interested). Trying >>>> to >>>> > >> > determine the ramifications of these changes would be >>>> challenging and >>>> > >> time >>>> > >> > consuming against all the different ways we interact with the >>>> Arrow Java >>>> > >> > library. I'm not saying I'm against this but want to make sure >>>> we've >>>> > >> > explored all less disruptive options before considering changing >>>> > >> something >>>> > >> > this fundamental (especially when I generally hold the view that >>>> large >>>> > >> cell >>>> > >> > counts against massive contiguous memory is an anti pattern to >>>> scalable >>>> > >> > analytical processing--purely subjective of course). >>>> > >> > >>>> > >> > On Sat, Aug 10, 2019, 4:17 PM Micah Kornfield < >>>> emkornfi...@gmail.com> >>>> > >> > wrote: >>>> > >> > >>>> > >> > > Hi Jacques, >>>> > >> > > What avenue were you thinking for supporting both paths? I >>>> didn't >>>> > >> want >>>> > >> > > to pursue a different class hierarchy, because I felt like >>>> that would >>>> > >> > > effectively fork the code base, but that is potentially an >>>> option that >>>> > >> > > would allow us to have a complete reference implementation in >>>> Java >>>> > >> that >>>> > >> > can >>>> > >> > > fully interact with C++, without major changes to this code. >>>> > >> > > >>>> > >> > > For supporting both APIs on the same classes/interfaces, I >>>> think they >>>> > >> > > roughly fall into three categories, changes to input >>>> parameters, >>>> > >> changes >>>> > >> > to >>>> > >> > > output parameters and algorithm changes. >>>> > >> > > >>>> > >> > > For inputs, changing from int to long is essentially a no-op >>>> from the >>>> > >> > > compiler perspective. From the limited micro-benchmarking >>>> this also >>>> > >> > > doesn't seem to have a performance impact. So we could keep >>>> two >>>> > >> versions >>>> > >> > > of the methods that only differ on inputs, but it is not clear >>>> what >>>> > >> the >>>> > >> > > value of that would be. >>>> > >> > > >>>> > >> > > For outputs, we can't support methods "long getLength()" and >>>> "int >>>> > >> > > getLength()" in the same class, so we would be forced into >>>> something >>>> > >> like >>>> > >> > > "long getLength(boolean dummy)" which I think is a less >>>> desirable. >>>> > >> > > >>>> > >> > > For algorithm changes, there did not appear to be too many >>>> places >>>> > >> where >>>> > >> > we >>>> > >> > > actually loop over all elements (it is quite possible I missed >>>> > >> something >>>> > >> > > here), the ones that I did find I was able to mitigate >>>> performance >>>> > >> > > penalties as noted above. Some of the current implementation >>>> will >>>> > >> get a >>>> > >> > > lot slower for "large arrays", but we can likely fix those >>>> later or in >>>> > >> > this >>>> > >> > > PR with a nested while loop instead of 2 for loops. >>>> > >> > > >>>> > >> > > Thanks, >>>> > >> > > Micah >>>> > >> > > >>>> > >> > > >>>> > >> > > On Saturday, August 10, 2019, Jacques Nadeau < >>>> jacq...@apache.org> >>>> > >> wrote: >>>> > >> > > >>>> > >> > >> This is a pretty massive change to the apis. I wonder how >>>> nasty it >>>> > >> would >>>> > >> > >> be to just support both paths. Have you evaluated how complex >>>> that >>>> > >> > would be? >>>> > >> > >> >>>> > >> > >> On Wed, Aug 7, 2019 at 11:08 PM Micah Kornfield < >>>> > >> emkornfi...@gmail.com> >>>> > >> > >> wrote: >>>> > >> > >> >>>> > >> > >>> After more investigation, it looks like Float8Benchmarks at >>>> least >>>> > >> on my >>>> > >> > >>> machine are within the range of noise. >>>> > >> > >>> >>>> > >> > >>> For BitVectorHelper I pushed a new commit [1], seems to >>>> bring the >>>> > >> > >>> BitVectorHelper benchmarks back inline (and even with some >>>> > >> improvement >>>> > >> > >>> for >>>> > >> > >>> getNullCountBenchmark). >>>> > >> > >>> >>>> > >> > >>> Benchmark Mode Cnt >>>> Score >>>> > >> > >>> Error >>>> > >> > >>> Units >>>> > >> > >>> BitVectorHelperBenchmarks.allBitsNullBenchmark avgt 5 >>>> 3.821 ± >>>> > >> > >>> 0.031 >>>> > >> > >>> ns/op >>>> > >> > >>> BitVectorHelperBenchmarks.getNullCountBenchmark avgt 5 >>>> 14.884 ± >>>> > >> > >>> 0.141 >>>> > >> > >>> ns/op >>>> > >> > >>> >>>> > >> > >>> I applied the same pattern to other loops that I could find, >>>> and for >>>> > >> > any >>>> > >> > >>> "for (long" loop on the critical path, I broke it up into >>>> two loops. >>>> > >> > the >>>> > >> > >>> first loop does iteration by integer, the second finishes >>>> off for >>>> > >> any >>>> > >> > >>> long >>>> > >> > >>> values. As a side note it seems like optimization for loops >>>> using >>>> > >> long >>>> > >> > >>> counters at least have a semi-recent open bug for the JVM [2] >>>> > >> > >>> >>>> > >> > >>> Thanks, >>>> > >> > >>> Micah >>>> > >> > >>> >>>> > >> > >>> [1] >>>> > >> > >>> >>>> > >> > >>> >>>> > >> > >>>> > >> >>>> https://github.com/apache/arrow/pull/5020/commits/2ea2c1ae83e3baa7b9a99a6d06276d968df41797 >>>> > >> > >>> [2] https://bugs.openjdk.java.net/browse/JDK-8223051 >>>> > >> > >>> >>>> > >> > >>> On Wed, Aug 7, 2019 at 8:11 PM Micah Kornfield < >>>> > >> emkornfi...@gmail.com> >>>> > >> > >>> wrote: >>>> > >> > >>> >>>> > >> > >>> > Indeed, the BoundChecking and CheckNullForGet variables >>>> can make a >>>> > >> > big >>>> > >> > >>> > difference. I didn't initially run the benchmarks with >>>> these >>>> > >> turned >>>> > >> > on >>>> > >> > >>> > (you can see the result from above with >>>> Float8Benchmarks). Here >>>> > >> are >>>> > >> > >>> new >>>> > >> > >>> > numbers including with the flags enabled. It looks like >>>> using >>>> > >> longs >>>> > >> > >>> might >>>> > >> > >>> > be a little bit slower, I'll see what I can do to mitigate >>>> this. >>>> > >> > >>> > >>>> > >> > >>> > Ravindra also volunteered to try to benchmark the changes >>>> with >>>> > >> > Dremio's >>>> > >> > >>> > code on today's sync call. >>>> > >> > >>> > >>>> > >> > >>> > New >>>> > >> > >>> > >>>> > >> > >>> > Benchmark Mode >>>> Cnt Score >>>> > >> > >>> Error >>>> > >> > >>> > Units >>>> > >> > >>> > >>>> > >> > >>> > BitVectorHelperBenchmarks.allBitsNullBenchmark avgt 5 >>>> > >> 4.176 ± >>>> > >> > >>> 1.292 >>>> > >> > >>> > ns/op >>>> > >> > >>> > >>>> > >> > >>> > BitVectorHelperBenchmarks.getNullCountBenchmark avgt 5 >>>> > >> 26.102 ± >>>> > >> > >>> 0.700 >>>> > >> > >>> > ns/op >>>> > >> > >>> > >>>> > >> > >>> > Float8Benchmarks.copyFromBenchmark avgt 5 7.398 ± >>>> 0.084 >>>> > >> us/op >>>> > >> > >>> > >>>> > >> > >>> > Float8Benchmarks.readWriteBenchmark avgt 5 2.711 ± >>>> 0.057 >>>> > >> us/op >>>> > >> > >>> > >>>> > >> > >>> > >>>> > >> > >>> > >>>> > >> > >>> > old >>>> > >> > >>> > >>>> > >> > >>> > BitVectorHelperBenchmarks.allBitsNullBenchmark avgt 5 >>>> > >> 3.828 ± >>>> > >> > >>> 0.030 >>>> > >> > >>> > ns/op >>>> > >> > >>> > >>>> > >> > >>> > BitVectorHelperBenchmarks.getNullCountBenchmark avgt 5 >>>> > >> 20.611 ± >>>> > >> > >>> 0.188 >>>> > >> > >>> > ns/op >>>> > >> > >>> > >>>> > >> > >>> > Float8Benchmarks.copyFromBenchmark avgt 5 6.597 ± >>>> 0.462 >>>> > >> us/op >>>> > >> > >>> > >>>> > >> > >>> > Float8Benchmarks.readWriteBenchmark avgt 5 2.615 ± >>>> 0.027 >>>> > >> us/op >>>> > >> > >>> > >>>> > >> > >>> > On Wed, Aug 7, 2019 at 7:13 PM Fan Liya < >>>> liya.fa...@gmail.com> >>>> > >> > wrote: >>>> > >> > >>> > >>>> > >> > >>> >> Hi Gonzalo, >>>> > >> > >>> >> >>>> > >> > >>> >> Thanks for sharing the performance results. >>>> > >> > >>> >> I am wondering if you have turned off the flag >>>> > >> > >>> >> BoundsChecking#BOUNDS_CHECKING_ENABLED. >>>> > >> > >>> >> If not, the lower throughput should be expected. >>>> > >> > >>> >> >>>> > >> > >>> >> Best, >>>> > >> > >>> >> Liya Fan >>>> > >> > >>> >> >>>> > >> > >>> >> On Wed, Aug 7, 2019 at 10:23 PM Micah Kornfield < >>>> > >> > >>> emkornfi...@gmail.com> >>>> > >> > >>> >> wrote: >>>> > >> > >>> >> >>>> > >> > >>> >>> Hi Gonzalo, >>>> > >> > >>> >>> Thank you for the feedback. I wasn't aware of the JIT >>>> > >> > >>> implications. At >>>> > >> > >>> >>> least on the benchmark run they don't seem to have an >>>> impact. >>>> > >> > >>> >>> >>>> > >> > >>> >>> If there are other benchmarks that people have that can >>>> > >> validate if >>>> > >> > >>> this >>>> > >> > >>> >>> change will be problematic I would appreciate trying to >>>> run them >>>> > >> > >>> with the >>>> > >> > >>> >>> PR. I will try to run the ones for zeroing/popcnt >>>> tonight to >>>> > >> see >>>> > >> > if >>>> > >> > >>> >>> there >>>> > >> > >>> >>> is a change in those. >>>> > >> > >>> >>> >>>> > >> > >>> >>> -Micah >>>> > >> > >>> >>> >>>> > >> > >>> >>> >>>> > >> > >>> >>> >>>> > >> > >>> >>> On Wednesday, August 7, 2019, Gonzalo Ortiz Jaureguizar < >>>> > >> > >>> >>> golthir...@gmail.com> wrote: >>>> > >> > >>> >>> >>>> > >> > >>> >>> > I would recommend to take care with this kind of >>>> changes. >>>> > >> > >>> >>> > >>>> > >> > >>> >>> > I didn't try Arrow in more than one year, but by then >>>> the >>>> > >> > >>> performance >>>> > >> > >>> >>> was >>>> > >> > >>> >>> > quite bad in comparison with plain byte buffer access >>>> > >> > >>> >>> > (see >>>> http://git.net/apache-arrow-development/msg02353.html *) >>>> > >> > and >>>> > >> > >>> >>> > there are several optimizations that the JVM >>>> (specifically, >>>> > >> C2) >>>> > >> > >>> does >>>> > >> > >>> >>> not >>>> > >> > >>> >>> > apply when dealing with int instead of longs. One of >>>> the >>>> > >> > >>> >>> > most commons is the loop unrolling and vectorization. >>>> > >> > >>> >>> > >>>> > >> > >>> >>> > * It doesn't seem the best way to reference an old >>>> email on >>>> > >> the >>>> > >> > >>> list, >>>> > >> > >>> >>> but >>>> > >> > >>> >>> > it is the only result shown by Google >>>> > >> > >>> >>> > >>>> > >> > >>> >>> > El mié., 7 ago. 2019 a las 11:42, Fan Liya (< >>>> > >> > liya.fa...@gmail.com >>>> > >> > >>> >) >>>> > >> > >>> >>> > escribió: >>>> > >> > >>> >>> > >>>> > >> > >>> >>> >> Hi Micah, >>>> > >> > >>> >>> >> >>>> > >> > >>> >>> >> Thanks for your effort. The performance result looks >>>> good. >>>> > >> > >>> >>> >> >>>> > >> > >>> >>> >> As you indicated, ArrowBuf will take additional 12 >>>> bytes (4 >>>> > >> > bytes >>>> > >> > >>> for >>>> > >> > >>> >>> each >>>> > >> > >>> >>> >> of length, write index, and read index). >>>> > >> > >>> >>> >> Similar overheads also exist for vectors like >>>> > >> > >>> BaseFixedWidthVector, >>>> > >> > >>> >>> >> BaseVariableWidthVector, etc. >>>> > >> > >>> >>> >> >>>> > >> > >>> >>> >> IMO, such overheads are small enough to justify the >>>> change. >>>> > >> > >>> >>> >> Let's check if there are other overheads. >>>> > >> > >>> >>> >> >>>> > >> > >>> >>> >> Best, >>>> > >> > >>> >>> >> Liya Fan >>>> > >> > >>> >>> >> >>>> > >> > >>> >>> >> On Wed, Aug 7, 2019 at 3:30 PM Micah Kornfield < >>>> > >> > >>> emkornfi...@gmail.com >>>> > >> > >>> >>> > >>>> > >> > >>> >>> >> wrote: >>>> > >> > >>> >>> >> >>>> > >> > >>> >>> >> > Hi Liya Fan, >>>> > >> > >>> >>> >> > Based on the Float8Benchmark there does not seem to >>>> be any >>>> > >> > >>> >>> meaningful >>>> > >> > >>> >>> >> > performance difference on my machine. At least for >>>> me, the >>>> > >> > >>> >>> benchmarks >>>> > >> > >>> >>> >> are >>>> > >> > >>> >>> >> > not stable enough to say one is faster than the >>>> other (I've >>>> > >> > >>> pasted >>>> > >> > >>> >>> >> results >>>> > >> > >>> >>> >> > below). That being said my machine isn't >>>> necessarily the >>>> > >> most >>>> > >> > >>> >>> reliable >>>> > >> > >>> >>> >> for >>>> > >> > >>> >>> >> > benchmarking. >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > On an intuitive level, this makes sense to me, for >>>> the >>>> > >> most >>>> > >> > >>> part it >>>> > >> > >>> >>> >> seems >>>> > >> > >>> >>> >> > like the change just moves casting from "int" to >>>> "long" >>>> > >> > further >>>> > >> > >>> up >>>> > >> > >>> >>> the >>>> > >> > >>> >>> >> > stack for "PlatformDepdendent" operations. If >>>> there are >>>> > >> > other >>>> > >> > >>> >>> >> benchmarks >>>> > >> > >>> >>> >> > that you think are worth running let me know. >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > One downside performance wise I think for his >>>> change is it >>>> > >> > >>> >>> increases the >>>> > >> > >>> >>> >> > size of ArrowBuf objects, which I suppose could >>>> influence >>>> > >> > cache >>>> > >> > >>> >>> misses >>>> > >> > >>> >>> >> at >>>> > >> > >>> >>> >> > some level or increase the size of call-stacks, but >>>> this >>>> > >> > doesn't >>>> > >> > >>> >>> seem to >>>> > >> > >>> >>> >> > show up in the benchmark.. >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > Thanks, >>>> > >> > >>> >>> >> > Micah >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > Sample benchmark numbers: >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > [New Code] >>>> > >> > >>> >>> >> > Benchmark Mode Cnt >>>> Score >>>> > >> Error >>>> > >> > >>> >>> Units >>>> > >> > >>> >>> >> > Float8Benchmarks.copyFromBenchmark avgt 5 >>>> 15.441 ± >>>> > >> 0.469 >>>> > >> > >>> >>> us/op >>>> > >> > >>> >>> >> > Float8Benchmarks.readWriteBenchmark avgt 5 >>>> 14.057 ± >>>> > >> 0.115 >>>> > >> > >>> >>> us/op >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > [Old code] >>>> > >> > >>> >>> >> > Benchmark Mode Cnt >>>> Score >>>> > >> Error >>>> > >> > >>> >>> Units >>>> > >> > >>> >>> >> > Float8Benchmarks.copyFromBenchmark avgt 5 >>>> 16.248 ± >>>> > >> 1.409 >>>> > >> > >>> >>> us/op >>>> > >> > >>> >>> >> > Float8Benchmarks.readWriteBenchmark avgt 5 >>>> 14.150 ± >>>> > >> 0.084 >>>> > >> > >>> >>> us/op >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> > On Tue, Aug 6, 2019 at 1:18 AM Fan Liya < >>>> > >> liya.fa...@gmail.com >>>> > >> > > >>>> > >> > >>> >>> wrote: >>>> > >> > >>> >>> >> > >>>> > >> > >>> >>> >> >> Hi Micah, >>>> > >> > >>> >>> >> >> >>>> > >> > >>> >>> >> >> Thanks a lot for doing this. >>>> > >> > >>> >>> >> >> >>>> > >> > >>> >>> >> >> I am a little concerned about if there is any >>>> negative >>>> > >> > >>> performance >>>> > >> > >>> >>> >> impact >>>> > >> > >>> >>> >> >> on the current 32-bit-length based applications. >>>> > >> > >>> >>> >> >> Can we do some performance comparison on our >>>> existing >>>> > >> > >>> benchmarks? >>>> > >> > >>> >>> >> >> >>>> > >> > >>> >>> >> >> Best, >>>> > >> > >>> >>> >> >> Liya Fan >>>> > >> > >>> >>> >> >> >>>> > >> > >>> >>> >> >> >>>> > >> > >>> >>> >> >> On Tue, Aug 6, 2019 at 3:35 PM Micah Kornfield < >>>> > >> > >>> >>> emkornfi...@gmail.com> >>>> > >> > >>> >>> >> >> wrote: >>>> > >> > >>> >>> >> >> >>>> > >> > >>> >>> >> >>> There have been some previous discussions on the >>>> mailing >>>> > >> > about >>>> > >> > >>> >>> >> supporting >>>> > >> > >>> >>> >> >>> 64-bit lengths for Java ValueVectors (this is >>>> what the >>>> > >> IPC >>>> > >> > >>> >>> >> specification >>>> > >> > >>> >>> >> >>> and C++ support). I created a PR [1] that >>>> changes all >>>> > >> APIs >>>> > >> > >>> that I >>>> > >> > >>> >>> >> could >>>> > >> > >>> >>> >> >>> find that take an index to take an "long" instead >>>> of an >>>> > >> > "int" >>>> > >> > >>> (and >>>> > >> > >>> >>> >> >>> similarly change "size/rowcount" APIs). >>>> > >> > >>> >>> >> >>> >>>> > >> > >>> >>> >> >>> It is a big change, so I think it is worth >>>> discussing if >>>> > >> it >>>> > >> > is >>>> > >> > >>> >>> >> something >>>> > >> > >>> >>> >> >>> we >>>> > >> > >>> >>> >> >>> still want to move forward with. It would be >>>> nice to >>>> > >> come >>>> > >> > to >>>> > >> > >>> a >>>> > >> > >>> >>> >> >>> conclusion >>>> > >> > >>> >>> >> >>> quickly, ideally in the next few days, to avoid a >>>> lot of >>>> > >> > merge >>>> > >> > >>> >>> >> conflicts. >>>> > >> > >>> >>> >> >>> >>>> > >> > >>> >>> >> >>> The reason I did this work now is the C++ >>>> implementation >>>> > >> has >>>> > >> > >>> added >>>> > >> > >>> >>> >> >>> support >>>> > >> > >>> >>> >> >>> for LargeList, LargeBinary and LargeString arrays >>>> and >>>> > >> based >>>> > >> > on >>>> > >> > >>> >>> prior >>>> > >> > >>> >>> >> >>> discussions we need to have similar support in >>>> Java >>>> > >> before >>>> > >> > our >>>> > >> > >>> >>> next >>>> > >> > >>> >>> >> >>> release. Support 64-bit indexes means we can have >>>> full >>>> > >> > >>> >>> compatibility >>>> > >> > >>> >>> >> and >>>> > >> > >>> >>> >> >>> make the most use of the types in Java. >>>> > >> > >>> >>> >> >>> >>>> > >> > >>> >>> >> >>> Look forward to hearing feedback. >>>> > >> > >>> >>> >> >>> >>>> > >> > >>> >>> >> >>> Thanks, >>>> > >> > >>> >>> >> >>> Micah >>>> > >> > >>> >>> >> >>> >>>> > >> > >>> >>> >> >>> [1] https://github.com/apache/arrow/pull/5020 >>>> > >> > >>> >>> >> >>> >>>> > >> > >>> >>> >> >> >>>> > >> > >>> >>> >> >>>> > >> > >>> >>> > >>>> > >> > >>> >>> >>>> > >> > >>> >> >>>> > >> > >>> >>>> > >> > >> >>>> > >> > >>>> > >> >>>> > > >>>> >>>