[ https://issues.apache.org/jira/browse/SPARK-22250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204059#comment-16204059 ]
Fernando Pereira edited comment on SPARK-22250 at 10/13/17 7:28 PM:
--------------------------------------------------------------------

I have to admit I was not aware of that option. Nevertheless, even though that would silence all complaints, I find it a bit extreme, and I believe better handling of these cases would be valuable. A float can be initialized by an int in almost every language, and Java is no exception. And since Pandas is supported, taking NumPy types into consideration would also be really nice, especially for large arrays. I could work on this feature if the community considers it worthwhile.


> Be less restrictive on type checking
> ------------------------------------
>
>                 Key: SPARK-22250
>                 URL: https://issues.apache.org/jira/browse/SPARK-22250
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.0.0
>            Reporter: Fernando Pereira
>            Priority: Minor
>
> I find types.py._verify_type() often too restrictive. E.g.:
> {code}
> TypeError: FloatType can not accept object 0 in type <type 'int'>
> {code}
> I believe it would be generally acceptable to fill a float field with an int, especially since some formats (e.g. JSON) give you no way of inferring the type correctly.
> Another situation involves other equivalent numerical types, such as array.array or NumPy arrays. A NumPy scalar int is not accepted as an int, and these arrays always have to be converted down to plain lists, which can be prohibitively large and computationally expensive.
> Any thoughts?
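
For reference, a minimal sketch that reproduces the reported behaviour (assuming a local Spark 2.x session with schema verification left enabled, which is the default; the exact error wording comes from types.py._verify_type() and may vary between releases):

{code}
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, FloatType

spark = SparkSession.builder.master("local[1]").getOrCreate()
schema = StructType([StructField("x", FloatType(), True)])

try:
    # A plain Python int for a FloatType field is rejected during verification,
    # even though the conversion to float would be lossless:
    spark.createDataFrame([(0,)], schema)
except TypeError as e:
    print(e)  # FloatType can not accept object 0 in type <type 'int'>

# The same data passes once it is manually cast to float:
spark.createDataFrame([(0.0,)], schema).show()
{code}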
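
And a sketch of the NumPy side of the request: a NumPy scalar int is not accepted as an int, so today the whole array has to be downgraded to plain Python objects first (the session setup and array size below are illustrative assumptions):

{code}
import numpy as np
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType

spark = SparkSession.builder.master("local[1]").getOrCreate()
int_schema = StructType([StructField("x", IntegerType(), True)])

try:
    # A NumPy scalar is not a plain Python int, so verification rejects it too:
    spark.createDataFrame([(np.int32(1),)], int_schema)
except TypeError as e:
    print(e)  # e.g. IntegerType can not accept object 1 in type <type 'numpy.int32'>

# Current workaround: convert the whole array down to plain Python ints,
# which is costly in both memory and CPU for large arrays:
arr = np.arange(100000, dtype=np.int32)
df = spark.createDataFrame([(v,) for v in arr.tolist()], int_schema)
print(df.count())  # 100000
{code}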