[ 
https://issues.apache.org/jira/browse/SPARK-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983318#comment-13983318
 ] 

Michael Armbrust commented on SPARK-1649:
-----------------------------------------

Why do you think it would be better to have the nullability bit in the data 
type?  Both attribute references and struct fields already have a nullable bit, 
so we can always describe whether a given attribute can be null.

Right now we use primitive datatypes mostly as enums, so adding this bit to 
them would mean that everywhere we pattern match on a datatype we would need to 
include a wildcard for nullability.  This would also require a pretty big 
change to all expressions, since right now we determine nullability propagation 
independently of the datatype.
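
To make the trade-off concrete, here is a minimal, self-contained sketch. None of these definitions are the actual Catalyst classes: `Attribute`, `Add`, `EnumStyle`, `WithNullableBit`, the `N`-prefixed types, and `defaultSize` are hypothetical names used only for illustration. The first half models the current approach (types as enum-like objects, nullability carried by the attribute and propagated by the expression), and the second half models what pattern matches might look like if the bit moved into the type.

```scala
// Simplified stand-ins for illustration; these are NOT the real Spark SQL classes.

// Current approach: primitive types are enum-like singletons.
sealed trait DataType
case object IntegerType extends DataType
case object BooleanType extends DataType
case object StringType extends DataType

// Nullability lives on the expression, next to (not inside) the data type.
sealed trait Expression {
  def dataType: DataType
  def nullable: Boolean
}

final case class Attribute(name: String, dataType: DataType, nullable: Boolean)
    extends Expression

final case class Add(left: Expression, right: Expression) extends Expression {
  def dataType: DataType = left.dataType
  // Propagation is decided without looking at the data type at all.
  def nullable: Boolean = left.nullable || right.nullable
}

object EnumStyle {
  // Matching on the type needs no extra arguments. The sizes are made up.
  def defaultSize(dt: DataType): Int = dt match {
    case IntegerType => 4
    case BooleanType => 1
    case StringType  => 4096
  }
}

// Hypothetical alternative: every type carries its own nullable bit.
sealed trait NDataType { def nullable: Boolean }
final case class NIntegerType(nullable: Boolean = true) extends NDataType
final case class NBooleanType(nullable: Boolean = true) extends NDataType
final case class NStringType(nullable: Boolean = true) extends NDataType

object WithNullableBit {
  // Every match now needs a wildcard for the bit, even when it is irrelevant.
  def defaultSize(dt: NDataType): Int = dt match {
    case NIntegerType(_) => 4
    case NBooleanType(_) => 1
    case NStringType(_)  => 4096
  }
}
```

In the second version, `NIntegerType(true)` and `NIntegerType(false)` are also distinct values, so any code that compares types for equality would have to decide whether the bit should take part in the comparison.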

> DataType should contain nullable bit
> ------------------------------------
>
>                 Key: SPARK-1649
>                 URL: https://issues.apache.org/jira/browse/SPARK-1649
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.1.0
>            Reporter: Andre Schumacher
>            Priority: Critical
>
> For the underlying storage layer, recording in the data type itself whether 
> a column is nullable would simplify tasks such as schema conversion and 
> predicate filter determination. The DataType type could then look like 
> this:
> abstract class DataType(nullable: Boolean = true)
> Concrete subclasses could then override the nullable val. Mostly this could 
> be left as the default but when types can be contained in nested types one 
> could optimize for, e.g., arrays with elements that are nullable and those 
> that are not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
