For the record, I disagree with the assertion that the fear is around
'theoretical' performance regressions. I mentioned in a previous post that
I have customer data with required fields. As real as it gets.




On Thu, Mar 24, 2016 at 9:01 AM, Jacques Nadeau <[email protected]> wrote:

> I've created DRILL-4534 to track this issue.
>
> The reality is that the fear around theoretical performance regressions is
> holding this back. By making this effort cover both the removal of the
> required type and the switch to columnar null evaluation, those fears should
> be allayed. I've created both as subtasks under DRILL-4534. This is a
> breaking change (it changes the UDF interface), so I've associated it with
> the v2 release.
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Thu, Mar 24, 2016 at 8:48 AM, Jacques Nadeau <[email protected]>
> wrote:
>
> > My numbers show a declarative approach is unnecessary in execution.
> >
> > >> Having the right tools would help...
> >
> > Declarative is great in planning and should continue to exist. The right
> > tools will continue to exist.
> >
> > It seems like a number of people here are worried about the performance of
> > future features. I'm also focused on performance. Cleaning up mistakes is
> > the way we're going to get to the next level of performance.
> >
> > It is clear from my numbers that a columnar observation approach would be
> > a huge win across virtually all current workloads.
> >
> > I think there is a second dynamic here: this type of change, much like a
> > few others proposed right now, is not trivial: there are huge benefits to
> > what we're proposing, but it is possible that some workloads won't be as
> > good. That seems like a step-function change (and some would call it a
> > breaking change). I'm going to start a new thread on the creation of a v2
> > branch.
> >
> >
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Thu, Mar 24, 2016 at 8:38 AM, Aman Sinha <[email protected]>
> wrote:
> >
> >> With regard to the following:
> >>
> >> > The only time we use the "required" path is if the underlying data
> >> > guarantees that all the data will be non-null. I believe that path is
> >> > rarely used, poorly tested and provides only a small gain in performance
> >> > when used.
> >>
> >> The main reason this code path is less used is that currently there is no
> >> declarative way of specifying the required type.  Going forward, there are
> >> at least two features (probably several more) that would require a
> >> declarative approach:
> >>
> >>    1. INSERT INTO:  I recall discussions from last year where we wanted to
> >>    keep the merged schema in some metadata file.  This would allow an
> >>    insert row to be quickly rejected if its schema did not match the
> >>    merged schema.
> >>    2. Sort physical property of a column in files, in order to do a
> >>    merge-join or streaming aggregate without re-sorting the data.  This
> >>    physical property would also be declared in the metadata file.
> >>
> >> Once this functionality is added (I am not sure of the timeline, but
> >> likely in a few months) we could leverage the same declarative approach
> >> for NOT NULL attributes of the underlying data.
> >>
> >> For data warehouse offloads (a major use-case of Drill), we need to make
> >> the ForeignKey-PrimaryKey joins (assume both are guaranteed to be non-null
> >> for this scenario) as fast as possible to compete with the RDBMSs.  Having
> >> the right tools would help...
> >>
> >>
> >> On Wed, Mar 23, 2016 at 2:55 PM, Jacques Nadeau <[email protected]>
> >> wrote:
> >>
> >> > There seems to be a lot of confusion on this thread.
> >> >
> >> > We have a large amount of code that separates physical representations
> >> > of data that can possibly be null from data that can't be null. We have
> >> > a rigid concept in MajorType of whether data is nullable or required. If
> >> > we change from one to the other, that is a schema change inside of Drill
> >> > (and treated much the same as changing from Integer to Map). As we
> >> > compile expression trees, we have to constantly manage whether or not
> >> > items are nullable. We also don't cast between the two. So UDFs, Vector
> >> > classes, code generation, schema management, and schema change handling
> >> > are all much more complicated because of this fact. I proposed this
> >> > complexity initially, but looking at the continued cost and nominal
> >> > benefit, I think it was a mistake.
> >> >
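> >> > To make that cost concrete, here is a rough sketch of how the
> >> > nullable/required split surfaces in the type metadata today (package and
> >> > builder names recalled from org.apache.drill.common.types; treat the
> >> > details as approximate rather than authoritative):
> >> >
> >> >   import org.apache.drill.common.types.TypeProtos.DataMode;
> >> >   import org.apache.drill.common.types.TypeProtos.MajorType;
> >> >   import org.apache.drill.common.types.TypeProtos.MinorType;
> >> >
> >> >   public class NullabilityIsSchema {
> >> >     public static void main(String[] args) {
> >> >       // Same logical column type, two different modes.
> >> >       MajorType requiredInt = MajorType.newBuilder()
> >> >           .setMinorType(MinorType.INT)
> >> >           .setMode(DataMode.REQUIRED)   // the "required" path
> >> >           .build();
> >> >       MajorType nullableInt = MajorType.newBuilder()
> >> >           .setMinorType(MinorType.INT)
> >> >           .setMode(DataMode.OPTIONAL)   // the "nullable" path
> >> >           .build();
> >> >
> >> >       // The two MajorTypes are not equal, so a batch that flips from one
> >> >       // mode to the other is handled as a schema change, much like an
> >> >       // Integer column turning into a Map.
> >> >       System.out.println(requiredInt.equals(nullableInt));  // false
> >> >     }
> >> >   }
> >> >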
> >> > The only time we use the "required" path is if the underlying data
> >> > guarantees that all the data will be non-null. I believe that path is
> >> > rarely used, poorly tested, and provides only a small gain in performance
> >> > when used. In essence, it creates a permutation nightmare (just like us
> >> > having too many minor types) with marginal benefit.
> >> >
> >> > The proposal here is to correct that mistake.
> >> >
> >> > **Separately**, Drill should take better advantage of observed not-null
> >> > data.
> >> >
> >> > >> You may not generate not-null data, but a lot of data is not-null.
> >> >
> >> > Yes! You are 100% correct. Drill often chews through large amounts of
> >> > data that is annotated as nullable but has no nulls. For example, we run
> >> > benchmarks on TPCH data. The TPCH dataset doesn't have nulls. However,
> >> > we store the data as nullable (to be consistent with how virtually all
> >> > systems generate the data). As such, *Drill uses the nullable path* for
> >> > the entirety of execution. This is a great opportunity for performance
> >> > improvements. However, it is orthogonal to whether we remove the code
> >> > path above **since it doesn't use it**. Ultimately we should allow the
> >> > execution engine to decide the operation path **rather than having a
> >> > schema-level concept** that creates more code combinations and schema
> >> > change.
> >> >
> >> > My additional perspective is that the cruft from the mistake above makes
> >> > the right approach, using observed nulls instead of annotated nulls,
> >> > substantially harder to implement and reduces the likelihood that it
> >> > will be implemented.
> >> >
> >> > With regard to columnar benefits for calculations (which I again argue
> >> > are actually orthogonal to the initial proposal), I put together an
> >> > ideal-condition test. In reality, we have more indirection and I'd
> >> > actually expect a larger benefit from moving to columnar null evaluation
> >> > than this test shows. (For example: (1) everybody still runs with bounds
> >> > checking, which introduces an additional check for each null bit, and
> >> > (2) we always read memory values in addition to null bits before
> >> > inspecting the null bits.) As you can see below, having a columnar
> >> > approach means that performance varies little depending on nullability.
> >> > Optimizing for the columnar no-nulls case provides 5-6% additional
> >> > performance, which seems like an optimization to do later, compared to
> >> > where we should be focused: moving to columnar execution.
> >> >
> >> > Benchmark                                          Mode  Cnt     Score     Error  Units
> >> > ColumnarComparisons.a_plus_b_columnar              avgt  200  2059.743 ±   9.625  ns/op
> >> > ColumnarComparisons.a_plus_b_non_null              avgt  200  1934.380 ±  10.279  ns/op
> >> > ColumnarComparisons.a_plus_b_current_drill         avgt  200  6737.569 ± 396.452  ns/op
> >> > ColumnarComparisons.a_plus_b_plus_c_columnar       avgt  200  2565.702 ±  12.139  ns/op
> >> > ColumnarComparisons.a_plus_b_plus_c_non_null       avgt  200  2437.322 ±  12.875  ns/op
> >> > ColumnarComparisons.a_plus_b_plus_c_current_drill  avgt  200  9010.913 ± 475.392  ns/op
> >> >
> >> > This comes out as:
> >> >
> >> > columnar a+b 0.5ns/record
> >> > current a+b 1.7ns/record
> >> > no-null a+b 0.5ns/record
> >> > columnar a+b+c 0.6ns/record
> >> > current a+b+c 2.25ns/record
> >> > no-null a+b+c 0.6ns/record
> >> >
> >> > relative differences:
> >> > columnar versus current (a+b): 3.2x
> >> > columnar versus current (a+b+c): 3.5x
> >> > columnar null eval versus no-nulls (a+b): 1.06x
> >> > columnar null eval versus no-nulls (a+b+c): 1.05x
> >> >
> >> > Code here: https://gist.github.com/jacques-n/70fa5afdeadba28ea398
> >> >
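> >> > For anyone who doesn't want to open the gist: the following is not the
> >> > benchmark code itself, just a stripped-down sketch over plain Java arrays
> >> > of the two evaluation styles being compared.
> >> >
> >> >   public class NullEvalStyles {
> >> >
> >> >     // Row-wise style (what the generated code does today): branch on the
> >> >     // null bits of every input, per value.
> >> >     static void rowWise(int[] a, boolean[] aSet, int[] b, boolean[] bSet,
> >> >                         int[] out, boolean[] outSet) {
> >> >       for (int i = 0; i < out.length; i++) {
> >> >         if (aSet[i] && bSet[i]) {
> >> >           out[i] = a[i] + b[i];
> >> >           outSet[i] = true;
> >> >         } else {
> >> >           outSet[i] = false;
> >> >         }
> >> >       }
> >> >     }
> >> >
> >> >     // Columnar style: compute the values unconditionally in one tight
> >> >     // loop, then combine validity in a second loop, so nullability
> >> >     // barely affects the cost of the hot loop.
> >> >     static void columnar(int[] a, boolean[] aSet, int[] b, boolean[] bSet,
> >> >                          int[] out, boolean[] outSet) {
> >> >       for (int i = 0; i < out.length; i++) {
> >> >         out[i] = a[i] + b[i];
> >> >       }
> >> >       for (int i = 0; i < out.length; i++) {
> >> >         outSet[i] = aSet[i] && bSet[i];
> >> >       }
> >> >     }
> >> >   }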
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Jacques Nadeau
> >> > CTO and Co-Founder, Dremio
> >> >
> >> > On Wed, Mar 23, 2016 at 11:58 AM, Parth Chandra <[email protected]>
> >> wrote:
> >> >
> >> > > Hmm. I may not have expressed my thoughts clearly.
> >> > > What I was suggesting was that 'non-null' data exists in all data
> >> > > sets. (I have at least two data sets from users with Drill in
> >> > > production (sorry, cannot share the data) that have required fields in
> >> > > parquet files.) The fields may not be marked as such in the metadata,
> >> > > or the data source may not have any such metadata, but if we can
> >> > > identify the type as non-null, then we can (and should) take advantage
> >> > > of it. If we are already taking advantage of it, then we should not
> >> > > make any changes without understanding the tradeoffs.
> >> > > So in the spirit of understanding that, I'd like to ask two questions:
> >> > > 1) Where specifically are you suggesting code complexity will decrease?
> >> > > You mentioned UDFs. Where else do you see that the code is more
> >> > > complex?
> >> > > 2) Do we have any experiments to show whether columnar processing
> >> > > benefits from eliminating required fields?
> >> > >
> >> > >
> >> > >
> >> > > On Wed, Mar 23, 2016 at 8:36 AM, Jacques Nadeau <[email protected]
> >
> >> > > wrote:
> >> > >
> >> > > > I agree that we should focus on real benefits versus theories.
> >> > > >
> >> > > > Reduction in code complexity is a real benefit. Performance benefits
> >> > > > from having required types are theoretical. Dot drill files don't
> >> > > > exist, so they should have little bearing on this conversation.
> >> > > >
> >> > > > We rarely generate required data. Most tools never generate it. The
> >> > > > reason the question is about actual deployments is that they would
> >> > > > be a real factor counterbalancing the drive for code simplification,
> >> > > > rather than something theoretical. A theoretical future performance
> >> > > > regression shouldn't stop code improvement. If it did, we wouldn't
> >> > > > make any progress.
> >> > > >
> >> > > > What about your own internal benchmark tests? If removing required
> >> > > > types doesn't impact them, doesn't that mean this hasn't been a
> >> > > > point of focus?
> >> > > > On Mar 22, 2016 8:36 PM, "Parth Chandra" <[email protected]>
> wrote:
> >> > > >
> >> > > > > I don't know if the main question is whether people have parquet
> >> > > > > (or other) files which have required fields or not. With something
> >> > > > > like a dot drill file, a user can supply a schema or format for
> >> > > > > data that does not carry schema, and we can certainly use the same
> >> > > > > mechanism to indicate knowledge of nullability. The question is
> >> > > > > whether we can take advantage of knowing whether data is null or
> >> > > > > not to get better performance.
> >> > > > >
> >> > > > > Any argument that applies to taking advantage of non-nullability
> >> > > > > at the batch level applies to taking advantage of non-nullability
> >> > > > > at the schema level.
> >> > > > >
> >> > > > > I'm not entirely convinced that the reduction of code complexity
> >> > > > > will ultimately lead to a performance gain. Sure, it improves
> >> > > > > maintainability, but what specific improvements are you thinking
> >> > > > > of that will increase performance?
> >> > > > >
> >> > > > > If you recommend some areas of improvement that become possible
> >> > > > > as a result of this change, then I would suggest we run some
> >> > > > > experiments before we make any change.
> >> > > > >
> >> > > > > It is a capital mistake to theorize before one has data, etc...
> >> > > > >
> >> > > > > A 15% performance drop is not something to be ignored, I would
> >> think.
> >> > > > >
> >> > > > > Parth
> >> > > > >
> >> > > > > On Tue, Mar 22, 2016 at 5:40 PM, Jacques Nadeau <
> >> [email protected]>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > Re Performance:
> >> > > > > >
> >> > > > > > I think the main question is what portion of people's data is
> >> > > > > > actually marked as non-nullable in Parquet files? (We already
> >> > > > > > treat json, avro, kudu, and hbase (except row key) as nullable.
> >> > > > > > We do treat csv as non-nullable (array), but I think these
> >> > > > > > workloads start with conversion to Parquet.)  Early on, we
> >> > > > > > typically benchmarked Drill using required fields in Parquet. At
> >> > > > > > the time, we actually hacked the Pig code to get something to
> >> > > > > > even generate this format. (I believe, to this day, Pig only
> >> > > > > > generates nullable fields in Parquet.) After some time, we
> >> > > > > > recognized that basically every tool was producing Parquet files
> >> > > > > > that were nullable and ultimately moved the benchmark
> >> > > > > > infrastructure to using nullable types to do a better job of
> >> > > > > > representing real-world workloads.
> >> > > > > >
> >> > > > > > Based on my (fuzzy) recollection, working with nullable types
> >> > > > > > had a 10-15% performance impact versus working on required
> >> > > > > > types, so I think there is a performance impact, but I think the
> >> > > > > > population of users who have non-nullable Parquet files is
> >> > > > > > small. If I recall, I believe Impala also creates nullable
> >> > > > > > Parquet files. Not sure what Spark does. I believe Hive has also
> >> > > > > > made this change recently or is doing it (deprecating non-nulls
> >> > > > > > in their internals).
> >> > > > > >
> >> > > > > > If we move forward with this, I would expect there initially
> >> > > > > > would be a decrease in performance when data is held as
> >> > > > > > non-nullable, given we previously observed this. However, I
> >> > > > > > believe the reduction in code complexity would lead us to
> >> > > > > > improve other things more quickly. Which leads me to...
> >> > > > > >
> >> > > > > > Re: Why
> >> > > > > >
> >> > > > > > Drill suffers from code complexity. This hurts forward progress.
> >> > > > > > One example is the fact that we have to generate all nullable
> >> > > > > > permutations of functions. (For example, if we have three
> >> > > > > > arguments, we have to generate 8 separate functions to work with
> >> > > > > > the combinations of argument nullabilities.) This leads to
> >> > > > > > vastly more reliance on compile-time templating, which is a
> >> > > > > > maintenance headache. Additionally, it makes the runtime code
> >> > > > > > generation more complicated and error prone. Testing is also
> >> > > > > > more expensive because we now have twice as many paths to both
> >> > > > > > validate and maintain. Realistically, we should try to move to
> >> > > > > > more columnar algorithms, which would provide a bigger lift than
> >> > > > > > this declared-schema nullability optimization. This is because
> >> > > > > > many workloads have rare nulls, so we can actually optimize
> >> > > > > > better at the batch level. Creating three code paths (nullable
> >> > > > > > observed non-null, nullable observed null, and non-null) makes
> >> > > > > > this substantially more complicated. We want to invest in this
> >> > > > > > area, but the code complexity of nullable versus required makes
> >> > > > > > these tasks harder and less likely to happen. So, in essence,
> >> > > > > > I'm arguing that this is a small short-term loss that leads to
> >> > > > > > better code quality and ultimately faster performance.
> >> > > > > >
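> >> > > > > > To make the permutation point concrete, a toy illustration (the
> >> > > > > > generated class names here are hypothetical, not actual Drill
> >> > > > > > holders):
> >> > > > > >
> >> > > > > >   public class PermutationCount {
> >> > > > > >     public static void main(String[] args) {
> >> > > > > >       // Each of three inputs can be REQUIRED or NULLABLE, so the
> >> > > > > >       // templating has to materialize every combination.
> >> > > > > >       String[] modes = {"Required", "Nullable"};
> >> > > > > >       int count = 0;
> >> > > > > >       for (String a : modes) {
> >> > > > > >         for (String b : modes) {
> >> > > > > >           for (String c : modes) {
> >> > > > > >             System.out.printf("AddFunction_%s_%s_%s%n", a, b, c);
> >> > > > > >             count++;
> >> > > > > >           }
> >> > > > > >         }
> >> > > > > >       }
> >> > > > > >       // Prints 8 (= 2^3); with only nullable vectors it is 1.
> >> > > > > >       System.out.println(count);
> >> > > > > >     }
> >> > > > > >   }
> >> > > > > >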
> >> > > > > > Do others have real-world observations on the frequency of
> >> > > > > > required fields in Parquet files?
> >> > > > > >
> >> > > > > > thanks,
> >> > > > > > Jacques
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Jacques Nadeau
> >> > > > > > CTO and Co-Founder, Dremio
> >> > > > > >
> >> > > > > > On Tue, Mar 22, 2016 at 3:08 PM, Parth Chandra <
> >> [email protected]>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > > I'm not entirely convinced that this would have no performance
> >> > > > > > > impact. Do we have any experiments?
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Tue, Mar 22, 2016 at 1:36 PM, Jacques Nadeau <
> >> > > [email protected]>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > > My suggestion is that we use explicit observation at the
> >> > > > > > > > > batch level. If there are no nulls, we can optimize this
> >> > > > > > > > > batch. This would ultimately improve over our current
> >> > > > > > > > > situation, where most parquet and all json data is
> >> > > > > > > > > nullable, so we don't optimize. I'd estimate that the vast
> >> > > > > > > > > majority of Drill's workloads are marked nullable whether
> >> > > > > > > > > they are or not. So what we're really suggesting is
> >> > > > > > > > > deleting a bunch of code which is rarely in the execution
> >> > > > > > > > > path.
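> >> > > > > > > > >
> >> > > > > > > > > A hedged sketch of what that batch-level observation could
> >> > > > > > > > > look like (not code from Drill itself; the vector
> >> > > > > > > > > accessor/mutator names are from memory and the nullCount()
> >> > > > > > > > > helper is hypothetical):
> >> > > > > > > > >
> >> > > > > > > > >   import org.apache.drill.exec.vector.NullableIntVector;
> >> > > > > > > > >
> >> > > > > > > > >   public class BatchNullObservation {
> >> > > > > > > > >     static void addBatch(NullableIntVector a, NullableIntVector b,
> >> > > > > > > > >                          NullableIntVector out, int recordCount) {
> >> > > > > > > > >       if (nullCount(a, recordCount) == 0 && nullCount(b, recordCount) == 0) {
> >> > > > > > > > >         // No nulls observed in this batch: tight loop, no bit checks.
> >> > > > > > > > >         for (int i = 0; i < recordCount; i++) {
> >> > > > > > > > >           out.getMutator().setSafe(i,
> >> > > > > > > > >               a.getAccessor().get(i) + b.getAccessor().get(i));
> >> > > > > > > > >         }
> >> > > > > > > > >       } else {
> >> > > > > > > > >         // Null-aware path, used only for batches that contain nulls.
> >> > > > > > > > >         for (int i = 0; i < recordCount; i++) {
> >> > > > > > > > >           if (!a.getAccessor().isNull(i) && !b.getAccessor().isNull(i)) {
> >> > > > > > > > >             out.getMutator().setSafe(i,
> >> > > > > > > > >                 a.getAccessor().get(i) + b.getAccessor().get(i));
> >> > > > > > > > >           }
> >> > > > > > > > >         }
> >> > > > > > > > >       }
> >> > > > > > > > >     }
> >> > > > > > > > >
> >> > > > > > > > >     // Hypothetical helper: nulls observed in the first recordCount slots.
> >> > > > > > > > >     static int nullCount(NullableIntVector v, int recordCount) {
> >> > > > > > > > >       int nulls = 0;
> >> > > > > > > > >       for (int i = 0; i < recordCount; i++) {
> >> > > > > > > > >         if (v.getAccessor().isNull(i)) {
> >> > > > > > > > >           nulls++;
> >> > > > > > > > >         }
> >> > > > > > > > >       }
> >> > > > > > > > >       return nulls;
> >> > > > > > > > >     }
> >> > > > > > > > >   }
> >> > > > > > > > >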
> >> > > > > > > > On Mar 22, 2016 1:22 PM, "Aman Sinha" <
> [email protected]
> >> >
> >> > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > I was thinking about it more after sending the previous
> >> > > > > > > > > concerns. Agreed, this is an execution-side change... but
> >> > > > > > > > > some details need to be worked out. If the planner
> >> > > > > > > > > indicates to the executor that a column is non-nullable
> >> > > > > > > > > (e.g. a primary key), the run-time generated code is more
> >> > > > > > > > > efficient since it does not have to check the null bit.
> >> > > > > > > > > Are you thinking we would use the existing nullable vector
> >> > > > > > > > > and add some additional metadata (at a record batch level
> >> > > > > > > > > rather than record level) to indicate non-nullability?
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > On Tue, Mar 22, 2016 at 12:27 PM, Jacques Nadeau <
> >> > > > > [email protected]
> >> > > > > >
> >> > > > > > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hey Aman, I believe both Steven and I were only
> >> > > > > > > > > > suggesting removal from execution, not planning. It
> >> > > > > > > > > > seems like your concerns are all related to planning,
> >> > > > > > > > > > and the real tradeoffs in execution are nominal.
> >> > > > > > > > > > On Mar 22, 2016 9:03 AM, "Aman Sinha" <
> >> > [email protected]>
> >> > > > > wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > While it is true that there is code complexity due to
> >> > > > > > > > > > > the required type, what would we be trading off? Some
> >> > > > > > > > > > > important considerations:
> >> > > > > > > > > > >   - We don't currently have null count statistics,
> >> > > > > > > > > > > which would need to be implemented for various data
> >> > > > > > > > > > > sources
> >> > > > > > > > > > >   - Primary keys in the RDBMS sources (or rowkeys in
> >> > > > > > > > > > > hbase) are always non-null, and although today we may
> >> > > > > > > > > > > not be doing optimizations to leverage that, one could
> >> > > > > > > > > > > easily add a rule that converts WHERE primary_key IS
> >> > > > > > > > > > > NULL to a FALSE filter.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Tue, Mar 22, 2016 at 7:31 AM, Dave Oshinsky <
> >> > > > > > > > > [email protected]>
> >> > > > > > > > > > > wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Hi Jacques,
> >> > > > > > > > > > > > Marginally related to this, I made a small change in
> >> > > > > > > > > > > > PR-372 (DRILL-4184) to support variable widths for
> >> > > > > > > > > > > > decimal quantities in Parquet.  I found the (decimal)
> >> > > > > > > > > > > > vectoring code to be very difficult to understand
> >> > > > > > > > > > > > (probably because it's overly complex, but also
> >> > > > > > > > > > > > because I'm new to Drill code in general), so I made
> >> > > > > > > > > > > > a small, surgical change in my pull request to
> >> > > > > > > > > > > > support keeping track of variable widths (lengths)
> >> > > > > > > > > > > > and null booleans within the existing fixed-width
> >> > > > > > > > > > > > decimal vectoring scheme.  Can my changes be
> >> > > > > > > > > > > > reviewed/accepted, and then we can discuss how to
> >> > > > > > > > > > > > fix this properly long-term?
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > Dave Oshinsky
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > -----Original Message-----
> >> > > > > > > > > > > > From: Jacques Nadeau [mailto:[email protected]]
> >> > > > > > > > > > > > Sent: Monday, March 21, 2016 11:43 PM
> >> > > > > > > > > > > > To: dev
> >> > > > > > > > > > > > Subject: Re: [DISCUSS] Remove required type
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Definitely in support of this. The required type is
> >> > > > > > > > > > > > a huge maintenance and code-complexity nightmare
> >> > > > > > > > > > > > that provides little to no benefit. As you point
> >> > > > > > > > > > > > out, we can do better performance optimizations
> >> > > > > > > > > > > > through null count observation, since most sources
> >> > > > > > > > > > > > are nullable anyway.
> >> > > > > > > > > > > > On Mar 21, 2016 7:41 PM, "Steven Phillips" <
> >> > > > > [email protected]>
> >> > > > > > > > > wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > I have been thinking about this for a while now,
> >> > > > > > > > > > > > > and I feel it would be a good idea to remove the
> >> > > > > > > > > > > > > Required vector types from Drill and only use the
> >> > > > > > > > > > > > > Nullable version of vectors. I think this will
> >> > > > > > > > > > > > > greatly simplify the code. It will also simplify
> >> > > > > > > > > > > > > the creation of UDFs. As is, if a function has
> >> > > > > > > > > > > > > custom null handling (i.e. INTERNAL), the function
> >> > > > > > > > > > > > > has to be separately implemented for each
> >> > > > > > > > > > > > > permutation of nullability of the inputs. But if
> >> > > > > > > > > > > > > Drill data types are always nullable, this
> >> > > > > > > > > > > > > wouldn't be a problem.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > I don't think there would be much impact on
> >> > > > > > > > > > > > > performance. In practice, I think the required
> >> > > > > > > > > > > > > type is used very rarely. And there are other ways
> >> > > > > > > > > > > > > we can optimize for when a column is known to have
> >> > > > > > > > > > > > > no nulls.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Thoughts?
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
