[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314627#comment-15314627 ]
Nick Pentreath commented on SPARK-15746:
----------------------------------------

I'd say hold off on working on it until we decide which approach to take, but once that is done, sure.

> SchemaUtils.checkColumnType with VectorUDT prints instance details in error message
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-15746
>                 URL: https://issues.apache.org/jira/browse/SPARK-15746
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Nick Pentreath
>            Priority: Minor
>
> Currently, many feature transformers in {{ml}} use {{SchemaUtils.checkColumnType(schema, ..., new VectorUDT)}} to check that a column's type is an ({{ml.linalg}}) vector.
> The resulting error message contains "instance" info for the {{VectorUDT}}, i.e. something like this:
> {code}
> java.lang.IllegalArgumentException: requirement failed: Column features must be of type org.apache.spark.ml.linalg.VectorUDT@3bfc3ba7 but was actually StringType.
> {code}
> A solution would be either to amend {{SchemaUtils.checkColumnType}} to print the error message using {{getClass.getName}}, or to create a {{private[spark] case object VectorUDT extends VectorUDT}} for convenience, since it is used so often (incidentally, this would also make it easier to put {{VectorUDT}} into lists of data types, e.g. for schema validation, UDAFs, etc.).
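
For reference, a minimal sketch of what the first option might look like, assuming the current {{checkColumnType(schema, colName, dataType, msg)}} shape in {{org.apache.spark.ml.util.SchemaUtils}}; the body below is illustrative rather than the actual implementation:

{code}
// Illustrative sketch only (not the actual Spark implementation): report the
// expected type via getClass.getName so that a UDT instance such as
// `new VectorUDT` no longer prints its identity hash in the error message.
package org.apache.spark.ml.util

import org.apache.spark.sql.types.{DataType, StructType}

private[spark] object SchemaUtils {

  def checkColumnType(
      schema: StructType,
      colName: String,
      dataType: DataType,
      msg: String = ""): Unit = {
    val actualDataType = schema(colName).dataType
    val message = if (msg != null && msg.trim.nonEmpty) " " + msg else ""
    require(actualDataType.equals(dataType),
      s"Column $colName must be of type ${dataType.getClass.getName} " +
        s"but was actually $actualDataType.$message")
  }
}
{code}

With this change the example above would report "... must be of type org.apache.spark.ml.linalg.VectorUDT but was actually StringType." instead of including the instance hash.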
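
The second option might look roughly like the sketch below; it assumes {{ml.linalg.VectorUDT}} is non-final with a no-arg constructor, and that the object is defined in the same file as the class so it can act as its companion:

{code}
// Illustrative sketch only: a singleton that can be passed wherever a
// VectorUDT instance is needed, e.g. collected into lists of DataTypes for
// schema validation. Assumes the VectorUDT class is non-final, has a no-arg
// constructor, and that this object sits in the same source file as the class.
package org.apache.spark.ml.linalg

private[spark] case object VectorUDT extends VectorUDT

// Hypothetical call site in a transformer's transformSchema:
//   SchemaUtils.checkColumnType(schema, inputColName, VectorUDT)
// instead of:
//   SchemaUtils.checkColumnType(schema, inputColName, new VectorUDT)
{code}

Callers could then pass the singleton instead of {{new VectorUDT}}, and since case objects pick up a name-based {{toString}} (assuming no base class already provides one), the error message would render the expected type as simply "VectorUDT".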