[ https://issues.apache.org/jira/browse/PHOENIX-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205567#comment-14205567 ]
Samarth Jain commented on PHOENIX-1430:
---------------------------------------

[~robertroland] - can you try with Phoenix 4.2.1? We have updated the array data type names in PDataType. For example, the SQL type name VARCHAR_ARRAY is now VARCHAR ARRAY.

> Spark queries against tables with VARCHAR ARRAY columns fail
> ------------------------------------------------------------
>
>                 Key: PHOENIX-1430
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1430
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.1
>            Reporter: Robert Roland
>
> Running Phoenix 4.1 against HDP 2.2 Preview, I'm unable to execute queries in Spark against tables that contain VARCHAR ARRAY columns. Given the error, I think it's likely to affect any array column.
>
> Given the following table schema:
> {noformat}
> CREATE TABLE ARRAY_TEST_TABLE (
>   ID BIGINT NOT NULL,
>   STRING_ARRAY VARCHAR[]
>   CONSTRAINT pk PRIMARY KEY (ID));
> {noformat}
> I am unable to execute a query via Spark, using the PhoenixInputFormat:
> {noformat}
> val phoenixConf = new PhoenixPigConfiguration(new Configuration())
> phoenixConf.setSelectStatement("SELECT ID, STRING_ARRAY FROM ARRAY_TEST_TABLE")
> phoenixConf.setSelectColumns("ID,STRING_ARRAY")
> phoenixConf.setSchemaType(SchemaType.QUERY)
> phoenixConf.configure("sandbox.hortonworks.com:2181:/hbase-unsecure", "ARRAY_TEST_TABLE", 100)
> val phoenixRDD = sc.newAPIHadoopRDD(phoenixConf.getConfiguration,
>   classOf[PhoenixInputFormat],
>   classOf[NullWritable],
>   classOf[PhoenixRecord])
> val count = phoenixRDD.count()
> {noformat}
> I get the following error:
> {noformat}
> java.lang.RuntimeException: java.sql.SQLException: org.apache.phoenix.schema.IllegalDataException: Unsupported sql type: VARCHAR ARRAY
>   at org.apache.phoenix.pig.hadoop.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:162)
>   at org.apache.phoenix.pig.hadoop.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:88)
>   at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:94)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>   at scala.Option.getOrElse(Option.scala:120)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1135)
>   at org.apache.spark.rdd.RDD.count(RDD.scala:904)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$4.apply$mcV$sp(PhoenixRDDTest.scala:147)
>   ...
> Cause: java.sql.SQLException: org.apache.phoenix.schema.IllegalDataException: Unsupported sql type: VARCHAR ARRAY
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:947)
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1171)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:315)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:284)
>   at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:289)
>   at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:210)
>   at org.apache.phoenix.compile.FromCompiler.getResolverForQuery(FromCompiler.java:158)
>   at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:300)
>   at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:290)
>   at org.apache.phoenix.jdbc.PhoenixStatement.compileQuery(PhoenixStatement.java:926)
>   ...
> Cause: org.apache.phoenix.schema.IllegalDataException: Unsupported sql type: VARCHAR ARRAY
>   at org.apache.phoenix.schema.PDataType.fromSqlTypeName(PDataType.java:6977)
>   at org.apache.phoenix.schema.PColumnImpl.createFromProto(PColumnImpl.java:195)
>   at org.apache.phoenix.schema.PTableImpl.createFromProto(PTableImpl.java:848)
>   at org.apache.phoenix.coprocessor.MetaDataProtocol$MetaDataMutationResult.constructFromProto(MetaDataProtocol.java:158)
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:939)
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1171)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:315)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:284)
>   at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:289)
>   at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:210)
> {noformat}
> Using sqlline to investigate the column's type, it looks like it's considered "VARCHAR ARRAY" instead of "VARCHAR_ARRAY" (truncated for brevity):
> {noformat}
> 0: jdbc:phoenix:localhost:2181:/hbase-unsecur> !columns ARRAY_TEST_TABLE
> +------------+-------------+------------------+--------------+------------+----------------+
> | TABLE_CAT  | TABLE_SCHEM | TABLE_NAME       | COLUMN_NAME  | DATA_TYPE  | TYPE_NAME      |
> +------------+-------------+------------------+--------------+------------+----------------+
> | null       | null        | ARRAY_TEST_TABLE | ID           | -5         | BIGINT         |
> | null       | null        | ARRAY_TEST_TABLE | STRING_ARRAY | 2003       | VARCHAR ARRAY  |
> +------------+-------------+------------------+--------------+------------+----------------+
> {noformat}
> The PDataType class defines VARCHAR_ARRAY as such:
> {noformat}
> VARCHAR_ARRAY("VARCHAR_ARRAY", PDataType.ARRAY_TYPE_BASE + PDataType.VARCHAR.getSqlType(), PhoenixArray.class, null) { ...
> }
> {noformat}
> The first parameter is the sqlTypeName, which is "VARCHAR_ARRAY", but the lookup appears to use "VARCHAR ARRAY" (space instead of underscore).
>
> I'm not sure if the fix here is to change those values, or if it lies deep inside MetaDataEndpointImpl, where the protobuf returned to the client on a getTable call is built.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
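The underscore-vs-space mismatch described above can be reproduced in isolation. The sketch below is hypothetical and simplified (the class and enum names are invented, and this is not Phoenix's actual PDataType code): an enum keyed by an underscore-style sqlTypeName cannot resolve the space-separated name the server reports for the column, which is the shape of the failure in PDataType.fromSqlTypeName.

```java
// Minimal sketch (hypothetical names, not Phoenix's real implementation) of
// an enum that declares its SQL type name with an underscore, as pre-4.2.1
// Phoenix did, and a lookup that compares the declared name verbatim.
public class SqlTypeNameLookup {
    enum PDataTypeSketch {
        VARCHAR("VARCHAR"),
        VARCHAR_ARRAY("VARCHAR_ARRAY"); // pre-4.2.1 style: underscore in the name

        private final String sqlTypeName;

        PDataTypeSketch(String sqlTypeName) {
            this.sqlTypeName = sqlTypeName;
        }

        // Resolve a SQL type name by exact comparison against each constant's
        // declared name; returns null when nothing matches (Phoenix throws
        // IllegalDataException("Unsupported sql type: ...") at this point).
        static PDataTypeSketch fromSqlTypeName(String name) {
            for (PDataTypeSketch t : values()) {
                if (t.sqlTypeName.equals(name)) {
                    return t;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        // The server-side metadata reports "VARCHAR ARRAY" (with a space), so
        // the underscore-keyed lookup misses:
        System.out.println(PDataTypeSketch.fromSqlTypeName("VARCHAR ARRAY"));  // null
        // The underscore form resolves as expected:
        System.out.println(PDataTypeSketch.fromSqlTypeName("VARCHAR_ARRAY")); // VARCHAR_ARRAY
    }
}
```

Renaming the constants' sqlTypeName to the space-separated form, as Samarth's comment says 4.2.1 did, makes both sides agree and the lookup succeed.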