Silly question... Null support. In a system where a column may or may not exist, how do you support null?
;-) In terms of a key, it's a primary key and can't be null. So what am I missing?

Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 1, 2013, at 10:26 PM, Nick Dimiduk <ndimi...@gmail.com> wrote:

> Furthermore, is it more important to support null values than to squeeze
> all representations into minimum size (4 bytes for int32, &c.)?
>
> On Apr 1, 2013 4:41 PM, "Nick Dimiduk" <ndimi...@gmail.com> wrote:
>
>> On Mon, Apr 1, 2013 at 4:31 PM, James Taylor <jtay...@salesforce.com> wrote:
>>
>>> From the SQL perspective, handling null is important.
>>
>> From your perspective, it is critical to support NULLs, even at the
>> expense of giving up fixed-width encodings altogether or of representing
>> the full range of values. That is, you'd rather be able to represent
>> NULL than -2^31?
>>
>>> On 04/01/2013 01:32 PM, Nick Dimiduk wrote:
>>>
>>>> Thanks for the thoughtful response (and code!).
>>>>
>>>> I'm thinking I will press forward with a base implementation that
>>>> does not support nulls. The idea is to provide an extensible set of
>>>> interfaces, so I think this will not box us into a corner later.
>>>> That is, a mirroring package could be implemented that supports null
>>>> values and accepts the relevant trade-offs.
>>>>
>>>> Thanks,
>>>> Nick
>>>>
>>>> On Mon, Apr 1, 2013 at 12:26 PM, Matt Corgan <mcor...@hotpads.com> wrote:
>>>>
>>>>> I spent some time this weekend extracting bits of our serialization
>>>>> code to a public GitHub repo at
>>>>> http://github.com/hotpads/data-tools .
>>>>> Contributions are welcome - I'm sure we all have this stuff laying
>>>>> around.
>>>>>
>>>>> You can see I've bumped into the NULL problem in a few places:
>>>>> * https://github.com/hotpads/data-tools/blob/master/src/main/java/com/hotpads/data/primitive/lists/LongArrayList.java
>>>>> * https://github.com/hotpads/data-tools/blob/master/src/main/java/com/hotpads/data/types/floats/DoubleByteTool.java
>>>>>
>>>>> Looking back, I think my latest opinion on the topic is to reject
>>>>> nullability as the rule since it can cause unexpected behavior and
>>>>> confusion. It's cleaner to provide a wrapper class (so both
>>>>> LongArrayList plus NullableLongArrayList) that explicitly defines
>>>>> the behavior, and costs a little more in performance. If the user
>>>>> can't find a pre-made wrapper class, it's not very difficult for
>>>>> each user to provide their own interpretation of null and check for
>>>>> it themselves.
>>>>>
>>>>> If you reject nullability, the question becomes what to do in
>>>>> situations where you're implementing existing interfaces that accept
>>>>> nullable params. The LongArrayList above implements List<Long>,
>>>>> which requires an add(Long) method. In the above implementation I
>>>>> chose to swap nulls with Long.MIN_VALUE; however, I'm now thinking
>>>>> it best to force the user to make that swap and then throw
>>>>> IllegalArgumentException if they pass null.
>>>>>
>>>>> On Mon, Apr 1, 2013 at 11:41 AM, Doug Meil
>>>>> <doug.m...@explorysmedical.com> wrote:
>>>>>
>>>>>> Hmmm... good question.
>>>>>>
>>>>>> I think that fixed-width support is important for a great many
>>>>>> rowkey construction cases, so I'd rather see something like losing
>>>>>> MIN_VALUE and keeping fixed width.
>>>>>>
>>>>>> On 4/1/13 2:00 PM, "Nick Dimiduk" <ndimi...@gmail.com> wrote:
>>>>>>
>>>>>>> Heya,
>>>>>>>
>>>>>>> Thinking about data types and serialization. I think null support
>>>>>>> is an important characteristic for the serialized representations,
>>>>>>> especially when considering the compound type. However, doing so
>>>>>>> is directly incompatible with fixed-width representations for
>>>>>>> numerics. For instance, if we want to have a fixed-width signed
>>>>>>> long stored in 8 bytes, where do you put null? float and double
>>>>>>> types can cheat a little by folding negative and positive NaNs
>>>>>>> into a single representation (this isn't strictly correct!),
>>>>>>> leaving a place to represent null. In the long example case, the
>>>>>>> obvious choice is to reduce MAX_VALUE or increase MIN_VALUE by
>>>>>>> one. This will allocate an additional encoding which can be used
>>>>>>> for null. My experience working with scientific data, however,
>>>>>>> makes me wince at the idea.
>>>>>>>
>>>>>>> The variable-width encodings have it a little easier. There's
>>>>>>> already enough going on that it's simpler to make room.
>>>>>>>
>>>>>>> Remember, the final goal is to support order-preserving
>>>>>>> serialization. This imposes some limitations on our encoding
>>>>>>> strategies. For instance, it's not enough to simply encode null;
>>>>>>> it really needs to be encoded as 0x00 so as to sort
>>>>>>> lexicographically earlier than any other value.
>>>>>>>
>>>>>>> What do you think? Any ideas, experiences, etc?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nick
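
To make the trade-off in the thread concrete, here is a minimal Java sketch of one option discussed above: keep the 8-byte fixed width, give up Long.MIN_VALUE, and use the freed all-zero encoding for null so that it sorts before every other value. The class and method names (NullableLongCodec, encode, decode, compareBytes) are hypothetical and made up for illustration; they are not taken from data-tools or HBase. Rejecting Long.MIN_VALUE with IllegalArgumentException follows the spirit of Matt's suggestion, and the sign-bit flip is a common way to get order-preserving bytes, assumed here rather than confirmed by anyone in the thread.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the API of data-tools or HBase.
final class NullableLongCodec {

    /** Encodes a nullable long into exactly 8 bytes; null becomes all 0x00. */
    static byte[] encode(Long value) {
        if (value == null) {
            return new byte[8]; // 0x00 x 8 sorts before any other encoding
        }
        if (value == Long.MIN_VALUE) {
            // This value would also encode to all zeros, so it is given up.
            throw new IllegalArgumentException("Long.MIN_VALUE is reserved for null");
        }
        // Flipping the sign bit makes big-endian unsigned byte order
        // agree with signed numeric order.
        return ByteBuffer.allocate(8).putLong(value ^ Long.MIN_VALUE).array();
    }

    /** Decodes 8 bytes produced by encode(); all zeros means null. */
    static Long decode(byte[] bytes) {
        long raw = ByteBuffer.wrap(bytes).getLong();
        return raw == 0L ? null : Long.valueOf(raw ^ Long.MIN_VALUE);
    }

    /** Unsigned lexicographic comparison, the order a byte-sorted store would use. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

With this sketch, encode(null) compares lower than encode(Long.MIN_VALUE + 1), and encode(-1L) compares lower than encode(0L), which is the ordering the original message asks for. The cost is exactly the one Nick winces at: one legitimate value can no longer be stored.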