bq. with a base implementation that does not support nulls

+1
On Mon, Apr 1, 2013 at 1:32 PM, Nick Dimiduk <[email protected]> wrote:

> Thanks for the thoughtful response (and code!).
>
> I'm thinking I will press forward with a base implementation that does not
> support nulls. The idea is to provide an extensible set of interfaces, so I
> think this will not box us into a corner later. That is, a mirroring
> package could be implemented that supports null values and accepts
> the relevant trade-offs.
>
> Thanks,
> Nick
>
> On Mon, Apr 1, 2013 at 12:26 PM, Matt Corgan <[email protected]> wrote:
>
> > I spent some time this weekend extracting bits of our serialization code
> > to a public github repo at http://github.com/hotpads/data-tools.
> > Contributions are welcome - I'm sure we all have this stuff lying around.
> >
> > You can see I've bumped into the NULL problem in a few places:
> > * https://github.com/hotpads/data-tools/blob/master/src/main/java/com/hotpads/data/primitive/lists/LongArrayList.java
> > * https://github.com/hotpads/data-tools/blob/master/src/main/java/com/hotpads/data/types/floats/DoubleByteTool.java
> >
> > Looking back, I think my latest opinion on the topic is to reject
> > nullability as the rule since it can cause unexpected behavior and
> > confusion. It's cleaner to provide a wrapper class (so both LongArrayList
> > plus NullableLongArrayList) that explicitly defines the behavior, and
> > costs a little more in performance. If the user can't find a pre-made
> > wrapper class, it's not very difficult for each user to provide their own
> > interpretation of null and check for it themselves.
> >
> > If you reject nullability, the question becomes what to do in situations
> > where you're implementing existing interfaces that accept nullable
> > params. The LongArrayList above implements List<Long> which requires an
> > add(Long) method. In the above implementation I chose to swap nulls with
> > Long.MIN_VALUE, however I'm now thinking it best to force the user to
> > make that swap and then throw IllegalArgumentException if they pass null.
> >
> >
> > On Mon, Apr 1, 2013 at 11:41 AM, Doug Meil <[email protected]> wrote:
> >
> > >
> > > Hmmm... good question.
> > >
> > > I think that fixed width support is important for a great many rowkey
> > > constructs, so I'd rather see something like losing MIN_VALUE and
> > > keeping fixed width.
> > >
> > >
> > > On 4/1/13 2:00 PM, "Nick Dimiduk" <[email protected]> wrote:
> > >
> > > >Heya,
> > > >
> > > >Thinking about data types and serialization. I think null support is an
> > > >important characteristic for the serialized representations, especially
> > > >when considering the compound type. However, doing so is directly
> > > >incompatible with fixed-width representations for numerics. For
> > > >instance, if we want to have a fixed-width signed long stored in 8
> > > >bytes, where do you put null? float and double types can cheat a little
> > > >by folding negative and positive NaN's into a single representation
> > > >(this isn't strictly correct!), leaving a place to represent null. In
> > > >the long example case, the obvious choice is to reduce MAX_VALUE or
> > > >increase MIN_VALUE by one. This will allocate an additional encoding
> > > >which can be used for null. My experience working with scientific data,
> > > >however, makes me wince at the idea.
> > > >
> > > >The variable-width encodings have it a little easier. There's already
> > > >enough going on that it's simpler to make room.
> > > >
> > > >Remember, the final goal is to support order-preserving serialization.
> > > >This imposes some limitations on our encoding strategies. For instance,
> > > >it's not enough to simply encode null, it really needs to be encoded as
> > > >0x00 so as to sort lexicographically earlier than any other value.
> > > >
> > > >What do you think? Any ideas, experiences, etc?
> > > >
> > > >Thanks,
> > > >Nick
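
For readers following the thread, the trade-off can be illustrated with a rough Java sketch. This is not code from HBase or from the data-tools repo; the class name OrderedLongCodec and all of its methods are hypothetical, and the 0x00/0x01 header byte is only one assumed way to make room for null. The non-nullable encoder keeps the 8-byte fixed width and the full long range, and throws IllegalArgumentException on null as suggested above. The nullable variant spends one extra byte so that null encodes with a leading 0x00 and sorts before every other value.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of an order-preserving long codec; all names are assumptions. */
final class OrderedLongCodec {
    private OrderedLongCodec() {}

    /** Non-nullable, fixed width: 8 bytes, big-endian, sign bit flipped. */
    static byte[] encode(Long value) {
        if (value == null) {
            throw new IllegalArgumentException("null is not supported by the fixed-width encoding");
        }
        // Flipping the sign bit maps Long.MIN_VALUE to 00..00 and Long.MAX_VALUE to FF..FF,
        // so unsigned byte-wise comparison agrees with signed numeric order.
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }

    static long decode(byte[] bytes) {
        if (bytes == null || bytes.length != Long.BYTES) {
            throw new IllegalArgumentException("expected exactly " + Long.BYTES + " bytes");
        }
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }

    /** Nullable variant: 9 bytes. Header 0x00 means null, 0x01 means a value follows. */
    static byte[] encodeNullable(Long value) {
        if (value == null) {
            return new byte[1 + Long.BYTES]; // all zeros, sorts before any 0x01-prefixed value
        }
        return ByteBuffer.allocate(1 + Long.BYTES)
                .put((byte) 0x01)
                .putLong(value ^ Long.MIN_VALUE)
                .array();
    }

    static Long decodeNullable(byte[] bytes) {
        if (bytes == null || bytes.length != 1 + Long.BYTES) {
            throw new IllegalArgumentException("expected exactly " + (1 + Long.BYTES) + " bytes");
        }
        if (bytes[0] == 0x00) {
            return null;
        }
        return ByteBuffer.wrap(bytes, 1, Long.BYTES).getLong() ^ Long.MIN_VALUE;
    }

    /** Lexicographic comparison treating each byte as unsigned, as a sorted key store would. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

With this layout the non-nullable form loses no values from the long range, while the nullable form pays one byte per value rather than giving up MIN_VALUE or MAX_VALUE. Whether that cost is acceptable for rowkeys is exactly the question the thread leaves open.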
