[
https://issues.apache.org/jira/browse/PHOENIX-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205639#comment-14205639
]
Robert Roland commented on PHOENIX-1430:
----------------------------------------
That seems to have fixed it. I was using the version of Phoenix bundled into
HDP 2.2 Preview while in development.
Thanks for the amazingly fast response!
> Spark queries against tables with VARCHAR ARRAY columns fail
> ------------------------------------------------------------
>
> Key: PHOENIX-1430
> URL: https://issues.apache.org/jira/browse/PHOENIX-1430
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.1
> Reporter: Robert Roland
>
> Running Phoenix 4.1 against HDP 2.2 Preview, I'm unable to execute queries in
> Spark against tables that contain VARCHAR ARRAY columns. Given the error, I
> think it's likely to affect any array column.
> Given the following table schema:
> {noformat}
> CREATE TABLE ARRAY_TEST_TABLE (
>   ID BIGINT NOT NULL,
>   STRING_ARRAY VARCHAR[],
>   CONSTRAINT pk PRIMARY KEY (ID));
> {noformat}
> I am unable to execute a query via Spark, using the PhoenixInputFormat:
> {noformat}
> val phoenixConf = new PhoenixPigConfiguration(new Configuration())
> phoenixConf.setSelectStatement("SELECT ID, STRING_ARRAY FROM ARRAY_TEST_TABLE")
> phoenixConf.setSelectColumns("ID,STRING_ARRAY")
> phoenixConf.setSchemaType(SchemaType.QUERY)
> phoenixConf.configure("sandbox.hortonworks.com:2181:/hbase-unsecure", "ARRAY_TEST_TABLE", 100)
> val phoenixRDD = sc.newAPIHadoopRDD(phoenixConf.getConfiguration,
>   classOf[PhoenixInputFormat],
>   classOf[NullWritable],
>   classOf[PhoenixRecord])
> val count = phoenixRDD.count()
> {noformat}
> I get the following error:
> {noformat}
> java.lang.RuntimeException: java.sql.SQLException: org.apache.phoenix.schema.IllegalDataException: Unsupported sql type: VARCHAR ARRAY
>   at org.apache.phoenix.pig.hadoop.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:162)
>   at org.apache.phoenix.pig.hadoop.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:88)
>   at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:94)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>   at scala.Option.getOrElse(Option.scala:120)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1135)
>   at org.apache.spark.rdd.RDD.count(RDD.scala:904)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$4.apply$mcV$sp(PhoenixRDDTest.scala:147)
>   ...
> Cause: java.sql.SQLException: org.apache.phoenix.schema.IllegalDataException: Unsupported sql type: VARCHAR ARRAY
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:947)
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1171)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:315)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:284)
>   at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:289)
>   at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:210)
>   at org.apache.phoenix.compile.FromCompiler.getResolverForQuery(FromCompiler.java:158)
>   at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:300)
>   at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:290)
>   at org.apache.phoenix.jdbc.PhoenixStatement.compileQuery(PhoenixStatement.java:926)
>   ...
> Cause: org.apache.phoenix.schema.IllegalDataException: Unsupported sql type: VARCHAR ARRAY
>   at org.apache.phoenix.schema.PDataType.fromSqlTypeName(PDataType.java:6977)
>   at org.apache.phoenix.schema.PColumnImpl.createFromProto(PColumnImpl.java:195)
>   at org.apache.phoenix.schema.PTableImpl.createFromProto(PTableImpl.java:848)
>   at org.apache.phoenix.coprocessor.MetaDataProtocol$MetaDataMutationResult.constructFromProto(MetaDataProtocol.java:158)
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:939)
>   at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1171)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:315)
>   at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:284)
>   at org.apache.phoenix.compile.FromCompiler$BaseColumnResolver.createTableRef(FromCompiler.java:289)
>   at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:210)
> {noformat}
> Investigating the column's type with sqlline, it appears the type is reported
> as "VARCHAR ARRAY" rather than "VARCHAR_ARRAY" (output truncated for brevity):
> {noformat}
> 0: jdbc:phoenix:localhost:2181:/hbase-unsecur> !columns ARRAY_TEST_TABLE
> +------------+-------------+------------------+--------------+------------+---------------+
> | TABLE_CAT  | TABLE_SCHEM | TABLE_NAME       | COLUMN_NAME  | DATA_TYPE  | TYPE_NAME     |
> +------------+-------------+------------------+--------------+------------+---------------+
> | null       | null        | ARRAY_TEST_TABLE | ID           | -5         | BIGINT        |
> | null       | null        | ARRAY_TEST_TABLE | STRING_ARRAY | 2003       | VARCHAR ARRAY |
> +------------+-------------+------------------+--------------+------------+---------------+
> {noformat}
> The PDataType class defines VARCHAR_ARRAY as such:
> {noformat}
> VARCHAR_ARRAY("VARCHAR_ARRAY", PDataType.ARRAY_TYPE_BASE + PDataType.VARCHAR.getSqlType(), PhoenixArray.class, null) { ... }
> {noformat}
> The first parameter there is the sqlTypeName, which is "VARCHAR_ARRAY", but
> the lookup appears to use "VARCHAR ARRAY" (a space instead of an
> underscore).
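To make the mismatch concrete, here is a minimal sketch (my own illustrative code, not Phoenix's actual implementation) of a fromSqlTypeName-style lookup keyed on the registered name, showing why the space-separated form coming back from the server would fail:

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: mimics a lookup keyed on each type's
// registered sqlTypeName string, as PDataType appears to do.
public class TypeNameLookup {
    private static final Map<String, Integer> BY_NAME = new HashMap<>();
    static {
        // The enum registers the underscore form of the name...
        BY_NAME.put("VARCHAR_ARRAY", Types.ARRAY); // Types.ARRAY == 2003
    }

    public static int fromSqlTypeName(String sqlTypeName) {
        Integer sqlType = BY_NAME.get(sqlTypeName);
        if (sqlType == null) {
            // ...so the space-separated form misses the map, producing
            // an "Unsupported sql type" error like the one in the trace.
            throw new IllegalArgumentException("Unsupported sql type: " + sqlTypeName);
        }
        return sqlType;
    }

    public static void main(String[] args) {
        System.out.println(fromSqlTypeName("VARCHAR_ARRAY")); // resolves to 2003
        fromSqlTypeName("VARCHAR ARRAY"); // throws IllegalArgumentException
    }
}
```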
> I'm not sure whether the fix is to change those values, or whether it lies
> deeper inside MetaDataEndpointImpl, where the ProtoBuf returned to the
> client is built when a getTable call occurs.
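One possible client-side shape of a fix, sketched here purely as an illustration (the class and method names are mine, not from Phoenix), would be to normalize the display form back to the enum's underscore form before the lookup:

```java
// Hypothetical workaround sketch, not Phoenix's actual fix: map the
// display form ("VARCHAR ARRAY") to the enum form ("VARCHAR_ARRAY").
public class TypeNameNormalizer {
    public static String normalize(String sqlTypeName) {
        return sqlTypeName.replace(' ', '_');
    }

    public static void main(String[] args) {
        System.out.println(normalize("VARCHAR ARRAY")); // prints VARCHAR_ARRAY
    }
}
```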
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)