[ https://issues.apache.org/jira/browse/SPARK-22250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204059#comment-16204059 ]
Fernando Pereira edited comment on SPARK-22250 at 10/13/17 7:28 PM:
--------------------------------------------------------------------

I have to admit I was not aware of that option. Nevertheless, even though that would silence all complaints, I find it a bit extreme, and I believe better handling of these cases would be valuable. A float can be initialized by an int in almost every language, and Java is no exception. And since Pandas is supported, taking NumPy types into consideration would also be really nice, especially for large arrays. I could work on this feature if the community considers it worthwhile.


> Be less restrictive on type checking
> ------------------------------------
>
>                 Key: SPARK-22250
>                 URL: https://issues.apache.org/jira/browse/SPARK-22250
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.0.0
>            Reporter: Fernando Pereira
>            Priority: Minor
>
> I find types.py._verify_type() often too restrictive. E.g.:
> {code}
> TypeError: FloatType can not accept object 0 in type <type 'int'>
> {code}
> I believe it would be generally acceptable to fill a float field with an int, especially since some formats (e.g. JSON) give you no way of inferring the type correctly.
> Another situation involves other equivalent numerical types, such as array.array or NumPy arrays. A NumPy scalar int is not accepted as an int, and these arrays always have to be converted down to plain lists, which can be prohibitively large and computationally expensive.
> Any thoughts?
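
For reference, a minimal sketch that reproduces the reported behaviour (assuming a local Spark 2.x session with schema verification left enabled, which is the default; the exact error wording comes from types.py._verify_type() and may vary between releases):

{code}
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, FloatType

spark = SparkSession.builder.master("local[1]").getOrCreate()
schema = StructType([StructField("x", FloatType(), True)])

try:
    # A plain Python int for a FloatType field is rejected during verification,
    # even though the conversion to float would be lossless:
    spark.createDataFrame([(0,)], schema)
except TypeError as e:
    print(e)  # FloatType can not accept object 0 in type <type 'int'>

# The same data passes once it is manually cast to float:
spark.createDataFrame([(0.0,)], schema).show()
{code}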
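
And a sketch of the NumPy side of the request: a NumPy scalar int is not accepted as an int, so today the whole array has to be downgraded to plain Python objects first (the session setup and array size below are illustrative assumptions):

{code}
import numpy as np
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType

spark = SparkSession.builder.master("local[1]").getOrCreate()
int_schema = StructType([StructField("x", IntegerType(), True)])

try:
    # A NumPy scalar is not a plain Python int, so verification rejects it too:
    spark.createDataFrame([(np.int32(1),)], int_schema)
except TypeError as e:
    print(e)  # e.g. IntegerType can not accept object 1 in type <type 'numpy.int32'>

# Current workaround: convert the whole array down to plain Python ints,
# which is costly in both memory and CPU for large arrays:
arr = np.arange(100000, dtype=np.int32)
df = spark.createDataFrame([(v,) for v in arr.tolist()], int_schema)
print(df.count())  # 100000
{code}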