I agree with Jacques here. Perhaps what is needed is an unsafe non-nullable array accessor layer, then there is no need for flags etc. We've already been writing a lot of such code in C++ (splitting between no-nulls and some-nulls paths, see also the BitBlockCounter stuff we've been doing, is such a thing available in Java yet?) and it hasn't seemed especially burdensome to me.
On Thu, Sep 10, 2020 at 9:37 AM Jacques Nadeau <jacq...@apache.org> wrote: > > This change is undesirable as it optimizes one path and makes several > others behave in unintended ways. What happens if a vector with nulls > shows up? What happens if a user sets a position to a null value in user > code when this flag set? > > If the answer to the above questions is the use is an advanced user, then > why can't they just call: > PlatformDependent.getInt(vector.memoryAddress() + position * 4). > > Why would we introduce something directly in the vector class for this > specialized use case? If the user is advanced, that short memory access > invocation seems fine to use. The whole idea with Arrow is that if you have > a specialized algorithm, you can hand write memory reads and writes because > you have predictable memory layout. > > On Wed, Sep 9, 2020 at 1:11 AM Fan Liya <liya.fa...@gmail.com> wrote: > > > Hi all, > > > > Thanks a lot for your previous feedback. > > > > Now we have made some investigation and prepared an initial PR supporting > > the non-nullable IntVector [1], as this represents a common scenario. > > Some initial observations and conclusions can be made. > > > > The basic idea of the PR is to provide a global static final flag > > (NON_NULLABLE_VECTORS_ENABLED) to enable/disable the new feature. > > 1. When the flag is enabled, manipulations to the validity buffer would be > > by-passed if the vector is non-nullable. > > 2. When the flag is disabled, the behavior is identical to the original > > code. > > > > *First*, we want to show that the change is small. To support the > > non-nullable IntVector, we need to change classes IntVector, > > BaseFixedWidthVector (the superclass of IntVector), and BaseValueVector > > (the superclass of BaseFixedWidthVector). The amount of changes (lines of > > code) to each class is given below > > > > Class > > > > Additions (#lines) > > > > Deletions (#lines) > > > > Total # lines > > > > IntVector > > > > 12 > > > > 5 > > > > 370 > > > > BaseFixedWidthVector > > > > 30 > > > > 11 > > > > 925 > > > > BaseValueVector > > > > 7 > > > > 0 > > > > 240 > > > > It can be seen that the change is small, relative to the class size. In > > addition, to support additional vector types, we only need to change the > > sub-classes, and no more need to change the super classes. > > > > *Second*, we want to show that the performance improvement is notable. To > > see this, we give the performance data of the IntBenchmark (with some > > benchmarks added in the PR). To make the performance as good as possible, > > we enable the ARROW_ENABLE_UNSAFE_MEMORY_ACCESS flag and disable > > the ARROW_ENABLE_NULL_CHECK_FOR_GET flag. Below, we give the data with the > > non-nullable vector flag turned off and on, respectively. > > > > > > > > > > > > > > > > > > *(Vector non-nullable flag off)Benchmark Mode Cnt > > Score Error UnitsIntBenchmarks.getInt avgt 5 > > 384.948 ± 3.336 ns/opIntBenchmarks.isNull avgt 5 > > 1301.005 ± 54.239 ns/opIntBenchmarks.setIntDirectly avgt 5 > > 15387.486 ± 555.749 ns/opIntBenchmarks.setWithValueHolder avgt 5 > > 15251.351 ± 134.286 ns/opIntBenchmarks.setWithWriter avgt 5 > > 28595.586 ± 932.528 ns/op* > > > > *(Vector non-nullable flag on)* > > > > > > > > > > *IntBenchmarks.getInt avgt 5 384.630 ± 1.547 > > ns/opIntBenchmarks.isNull avgt 5 3.004 ± 0.110 > > ns/opIntBenchmarks.setIntDirectly avgt 5 13511.605 ± 135.455 > > ns/opIntBenchmarks.setWithValueHolder avgt 5 13035.883 ± 196.081 > > ns/opIntBenchmarks.setWithWriter avgt 5 24734.825 ± 1603.708 > > ns/op* > > > > For the *getInt *operation, there is little performance difference. This > > is because we disable the ARROW_ENABLE_NULL_CHECK_FOR_GET flag, so > > manipulations to the validity buffer are by-passed, even if the > > non-nullable vector flag is off. When the ARROW_ENABLE_NULL_CHECK_FOR_GET > > is enabled, there is a 72.4% performance improvement gained by turning on > > the non-nullable vector flag. > > > > For the isNull operation, we see 3 orders of magnitude performance > > improvements by enabling the non-nullable vector flag. > > > > For other operations, we see 12.2%, 13.5% and 14.5% performance > > improvements by turning on the non-nullable vector flag. > > > > So it can be seen that notable performance improvements can be gained for > > non-nullable vectors. > > > > *Third*, we want to claim that for nullable vectors and scenarios when we > > turn off the non-nullable vector flag, the new changes do not introduce > > performance regression. Such concern is plausible, as our changes add some > > if-else branches to the code, which may degrade performance. > > > > We give the benchmark results of the original code, as below. > > > > *(original ocde)* > > > > > > > > > > > > *Benchmark Mode Cnt Score Error > > UnitsIntBenchmarks.getInt avgt 5 383.511 ± 0.156 > > ns/opIntBenchmarks.isNull avgt 5 1274.271 ± 19.092 > > ns/opIntBenchmarks.setIntDirectly avgt 5 15162.219 ± 194.956 > > ns/opIntBenchmarks.setWithValueHolder avgt 5 15247.640 ± 153.103 > > ns/opIntBenchmarks.setWithWriter avgt 5 28587.780 ± 160.458 > > ns/op* > > > > By comparing this set of results with the above results with non-nullable > > vector flag disabled, little performance difference can be observed, > > indicating no performance regression. > > > > By examining the generated assembly, it can be seen that JIT is smart > > enough to remove the if-else branch completely (the below screen-shot gives > > an example where the if branch is optimized away in the assembly) > > > > So if some users do not like this feature, because their vectors are > > always nullable, they can simply disable the flag, and no performance > > difference can be observed. > > > > Would you please give some feedback? > > If it looks good to you, maybe we can go ahead to support other types of > > vectors. > > Thank you in advance. > > > > Liya Fan > > > > [image: image.png] > > > > [1] https://github.com/apache/arrow/pull/8147 > > > > On Fri, Mar 13, 2020 at 9:47 PM Fan Liya <liya.fa...@gmail.com> wrote: > > > >> Hi Jacques, > >> > >> Thanks a lot for your valuable comments. > >> > >> I agree with you that collapsing nullable and non-nullable > >> implementations is a good idea, and it does not contradict with the idea of > >> introducing a fast code path, if it does not introduce much cost or code > >> complexity. > >> > >> The idea of word level checking is interesting. > >> > >> As you suggested, we will do more investigations and reconsider the > >> problem from a broader perspective. > >> > >> Best, > >> Liya Fan > >> > >> > >> > >> On Thu, Mar 12, 2020 at 9:27 AM Jacques Nadeau <jacq...@apache.org> > >> wrote: > >> > >>> Generally Ive found that this isnt an important optimization in the use > >>> cases we see. Memory overhead, especially with our Java shared allocation > >>> scheme is nominal. Optimizing null checks at the word level usually is > >>> much > >>> more impactful since non null and null runs are much more common on a > >>> shorter window common than they seeing those declared. > >>> > >>> In other words, I'd suggest you look at your problem with a broader > >>> perspective and see whether you're actually focused on optimizing the > >>> most > >>> important dimension. > >>> > >>> As an aside, the original Arrow Java code actually treated these concepts > >>> more distinctly and we consciously made a decision to collapse them to > >>> simplify real world use. > >>> > >>> I do think it makes to add a dirty read interface if you want. This would > >>> allow consumers of the interface to behave efficiently if they wanted to. > >>> > >>> One last note, efficient evaluation and processing should generally > >>> always > >>> work at the validity word level. Adding an extra if check at the word > >>> versus extra complexity of having an early out per batch seems like a > >>> pretty small life in the grand scheme of processing. > >>> > >>> On Wed, Mar 11, 2020, 9:15 AM Brian Hulette <hulet...@gmail.com> wrote: > >>> > >>> > > And there is a "nullable" metadata-only flag at the > >>> > > Field level. Could the same kinds of optimizations be implemented in > >>> > > Java without introducing a "nullable" concept? > >>> > > >>> > Note Liya Fan did suggest pulling the nullable flag from the Field > >>> when the > >>> > vector is created in item (1) of the proposed changes. > >>> > > >>> > Brian > >>> > > >>> > On Wed, Mar 11, 2020 at 5:54 AM Fan Liya <liya.fa...@gmail.com> wrote: > >>> > > >>> > > Hi Micah, > >>> > > > >>> > > Thanks a lot for your valuable comments. Please see my comments > >>> inline. > >>> > > > >>> > > > I'm a little concerned that this will change assumptions for at > >>> least > >>> > > some > >>> > > > of the clients using the library (some might always rely on the > >>> > validity > >>> > > > buffer being present). > >>> > > > >>> > > I can understand your concern and I am also concerned. > >>> > > IMO, the client should not depend on this assumption, as the > >>> > specification > >>> > > says "Arrays having a 0 null count may choose to not allocate the > >>> > validity > >>> > > bitmap." [1] > >>> > > That being said, I think it would be safe to provide a global flag to > >>> > > switch on/off the feature (as you suggested). > >>> > > > >>> > > > I think this is a good feature to have for the reasons you > >>> mentioned. > >>> > It > >>> > > > seems like there would need to be some sort of configuration bit > >>> to set > >>> > > for > >>> > > > this behavior. > >>> > > > >>> > > Good suggestion. We should be able to switch on and off the feature > >>> with > >>> > a > >>> > > single global flag. > >>> > > > >>> > > > But, I'd be worried about code complexity this would > >>> > > > introduce. > >>> > > > >>> > > I agree with you that code complexity is an important factor to > >>> consider. > >>> > > IMO, our proposal should not involve too much code change, or > >>> increase > >>> > code > >>> > > complexity too much. > >>> > > To prove this, maybe we need to show some small experimental code > >>> change. > >>> > > > >>> > > Best, > >>> > > Liya Fan > >>> > > > >>> > > [1] https://arrow.apache.org/docs/format/Columnar.html#logical-types > >>> > > > >>> > > On Wed, Mar 11, 2020 at 1:53 PM Micah Kornfield < > >>> emkornfi...@gmail.com> > >>> > > wrote: > >>> > > > >>> > > > Hi Liya Fan, > >>> > > > I'm a little concerned that this will change assumptions for at > >>> least > >>> > > some > >>> > > > of the clients using the library (some might always rely on the > >>> > validity > >>> > > > buffer being present). > >>> > > > > >>> > > > I think this is a good feature to have for the reasons you > >>> mentioned. > >>> > It > >>> > > > seems like there would need to be some sort of configuration bit > >>> to set > >>> > > for > >>> > > > this behavior. But, I'd be worried about code complexity this would > >>> > > > introduce. > >>> > > > > >>> > > > Thanks, > >>> > > > Micah > >>> > > > > >>> > > > On Tue, Mar 10, 2020 at 6:42 AM Fan Liya <liya.fa...@gmail.com> > >>> wrote: > >>> > > > > >>> > > > > Hi Wes, > >>> > > > > > >>> > > > > Thanks a lot for your quick reply. > >>> > > > > I think what you mentioned is almost exactly what we want to do > >>> in > >>> > > > Java.The > >>> > > > > concept is not important. > >>> > > > > > >>> > > > > Maybe there are only some minor differences: > >>> > > > > 1. In C++, the null_count is mutable, while for Java, once a > >>> vector > >>> > is > >>> > > > > constructed as non-nullable, its null count can only be 0. > >>> > > > > 2. In C++, a non-nullable array's validity buffer is null, while > >>> in > >>> > > Java, > >>> > > > > the buffer is an empty buffer, and cannot be changed. > >>> > > > > > >>> > > > > Best, > >>> > > > > Liya Fan > >>> > > > > > >>> > > > > On Tue, Mar 10, 2020 at 9:26 PM Wes McKinney < > >>> wesmck...@gmail.com> > >>> > > > wrote: > >>> > > > > > >>> > > > > > hi Liya, > >>> > > > > > > >>> > > > > > In C++ we elect certain faster code paths when the null count > >>> is 0 > >>> > or > >>> > > > > > computed to be zero. When the null count is 0, we do not > >>> allocate a > >>> > > > > > validity bitmap. And there is a "nullable" metadata-only flag > >>> at > >>> > the > >>> > > > > > Field level. Could the same kinds of optimizations be > >>> implemented > >>> > in > >>> > > > > > Java without introducing a "nullable" concept? > >>> > > > > > > >>> > > > > > - Wes > >>> > > > > > > >>> > > > > > On Tue, Mar 10, 2020 at 8:13 AM Fan Liya <liya.fa...@gmail.com > >>> > > >>> > > wrote: > >>> > > > > > > > >>> > > > > > > Dear all, > >>> > > > > > > > >>> > > > > > > A non-nullable vector is one that is guaranteed to contain no > >>> > > nulls. > >>> > > > We > >>> > > > > > > want to support non-nullable vectors in Java. > >>> > > > > > > > >>> > > > > > > *Motivations:* > >>> > > > > > > 1. It is widely used in practice. For example, in a database > >>> > > engine, > >>> > > > a > >>> > > > > > > column can be declared as not null, so it cannot contain null > >>> > > values. > >>> > > > > > > 2.Non-nullable vectors has significant performance advantages > >>> > > > compared > >>> > > > > > with > >>> > > > > > > their nullable conterparts, such as: > >>> > > > > > > 1) the memory space of the validity buffer can be saved. > >>> > > > > > > 2) manipulation of the validity buffer can be bypassed > >>> > > > > > > 3) some if-else branches can be replaced by sequential > >>> > > instructions > >>> > > > > (by > >>> > > > > > > the JIT compiler), leading to high throughput for the CPU > >>> > pipeline. > >>> > > > > > > > >>> > > > > > > *Potential Cost:* > >>> > > > > > > For nullable vectors, there can be extra checks against the > >>> > > > > nullablility > >>> > > > > > > flag. So we must change the code in a way that minimizes the > >>> > cost. > >>> > > > > > > > >>> > > > > > > *Proposed Changes:* > >>> > > > > > > 1. There is no need to create new vector classes. We add a > >>> final > >>> > > > > boolean > >>> > > > > > to > >>> > > > > > > the vector base classes as the nullability flag. The value > >>> of the > >>> > > > flag > >>> > > > > > can > >>> > > > > > > be obtained from the field when creating the vector. > >>> > > > > > > 2. Add a method "boolean isNullable()" to the root interface > >>> > > > > ValueVector. > >>> > > > > > > 3. If a vector is non-nullable, its validity buffer should > >>> be an > >>> > > > empty > >>> > > > > > > buffer (not null, so much of the existing logic can be left > >>> > > > unchanged). > >>> > > > > > > 4. For operations involving validity buffers (e.g. isNull, > >>> get, > >>> > > set), > >>> > > > > we > >>> > > > > > > use the nullability flag to bypass manipulations to the > >>> validity > >>> > > > > buffer. > >>> > > > > > > > >>> > > > > > > Therefore, it should be possible to support the feature with > >>> > small > >>> > > > code > >>> > > > > > > changes. > >>> > > > > > > > >>> > > > > > > BTW, please note that similar behaviors have already been > >>> > supported > >>> > > > in > >>> > > > > > C++. > >>> > > > > > > > >>> > > > > > > Would you please give your valueable feedback? > >>> > > > > > > > >>> > > > > > > Best, > >>> > > > > > > Liya Fan > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > >>> > >>