[ 
https://issues.apache.org/jira/browse/SPARK-21199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklyn Dsouza updated SPARK-21199:
------------------------------------
    Description: 
There are cases where nulls end up in vector columns in DataFrames. Currently 
there is no way to fill in these nulls because it's not possible to create a 
literal vector column expression using lit().

Also, the entire PySpark ML API fails when it encounters nulls, which makes 
such data hard to work with.

I think that either vector support should be added to the imputer or vectors 
should be supported in column expressions so they can be used in a coalesce.
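
Until one of those options exists, a possible workaround is a Python UDF that
substitutes a default vector for nulls. The sketch below is only an
illustration, not part of Spark: make_filler and fill_null_vectors are
hypothetical helper names, and a dense zero vector is assumed to be an
acceptable fill value.

```python
def make_filler(default):
    """Return a function that maps None to `default` and passes other values through."""
    def fill(value):
        return default if value is None else value
    return fill


def fill_null_vectors(df, col_name, size):
    """Replace nulls in the vector column `col_name` with a dense zero vector of length `size`."""
    # Imported here so that make_filler can be exercised without a Spark installation.
    from pyspark.ml.linalg import Vectors, VectorUDT
    from pyspark.sql.functions import col, udf

    zero = Vectors.dense([0.0] * size)  # assumption: zeros are an acceptable fill value
    fill_udf = udf(make_filler(zero), VectorUDT())
    return df.withColumn(col_name, fill_udf(col(col_name)))
```

For example, df = fill_null_vectors(df, "features", 3) would fill a
hypothetical 3-element "features" column. Every row passes through the Python
worker, so this is likely slower than a native coalesce would be.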

[~mlnick]

  was:
There are cases where nulls end up in vector columns in dataframes. Currently 
there is no way to fill in these nulls because its not possible to create a 
literal vector column expression using lit().

Also the entire pyspark ml api will fail when they encounter nulls so this 
makes it hard to work with the data.

I think that either vector support should be added to the imputer or vectors 
should be supported in column expressions so they can be used in a coalesce.

@mlnick


> Its not possible to impute Vector types
> ---------------------------------------
>
>                 Key: SPARK-21199
>                 URL: https://issues.apache.org/jira/browse/SPARK-21199
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.0, 2.1.1
>            Reporter: Franklyn Dsouza



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
