[ https://issues.apache.org/jira/browse/SPARK-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097724#comment-15097724 ]
Earthson Lu commented on SPARK-12746:
-------------------------------------

OK, I see. :)

If there is no nullability in ML, how could we implement a Transformer that fills in missing values (which are always represented as NULL)? I think we need to support nullability for preprocessing, so that we can get clean data for further operations. I can't imagine a situation where we can do nothing when the data contains NULL.

- - -

I think the type-checking API is independent of nullability in ML. It is common for one transformer to accept either BooleanType or IntegerType. Maybe it would be a good idea to implement the test conditions and the assertions separately.

> ArrayType(_, true) should also accept ArrayType(_, false)
> ---------------------------------------------------------
>
>                 Key: SPARK-12746
>                 URL: https://issues.apache.org/jira/browse/SPARK-12746
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, SQL
>    Affects Versions: 1.6.0
>            Reporter: Earthson Lu
>
> I see CountVectorizer has a schema check for ArrayType that requires
> ArrayType(StringType, true).
> ArrayType(StringType, false) is just a special case of ArrayType(StringType, true),
> but it will not pass this type check.
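To make the proposal concrete, here is a minimal sketch of the relaxed check, with the test condition kept separate from the assertion as suggested above. It is not Spark's code: `StringType`, `IntegerType` and `ArrayType` are small stand-in classes written for this example, and `accepts` and `require` are hypothetical helper names.

```python
from dataclasses import dataclass


# Stand-in types for illustration only; these are NOT Spark's classes.
@dataclass(frozen=True)
class StringType:
    pass


@dataclass(frozen=True)
class IntegerType:
    pass


@dataclass(frozen=True)
class ArrayType:
    element_type: object
    contains_null: bool = True


def accepts(expected, actual) -> bool:
    """Test condition: can a column of type `actual` be used where `expected` is required?"""
    if isinstance(expected, ArrayType) and isinstance(actual, ArrayType):
        # A non-nullable array is a special case of a nullable one,
        # but a nullable array must not pass where nulls are forbidden.
        null_ok = expected.contains_null or not actual.contains_null
        return null_ok and accepts(expected.element_type, actual.element_type)
    return expected == actual


def require(expected, actual) -> None:
    """Assertion: raise if the test condition does not hold."""
    if not accepts(expected, actual):
        raise TypeError(f"expected {expected}, but got {actual}")


# The case from the issue: ArrayType(StringType, false) offered where
# ArrayType(StringType, true) is required now passes.
require(ArrayType(StringType(), True), ArrayType(StringType(), False))
```

Keeping `accepts` as a plain predicate means a transformer that supports several input types could combine several calls with `or` and raise only once, with its own message.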