[ https://issues.apache.org/jira/browse/SPARK-42976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim updated SPARK-42976:
---------------------------------
    Issue Type: Question  (was: Improvement)

> spark sql Disable vectorized faild
> -----------------------------------
>
>                 Key: SPARK-42976
>                 URL: https://issues.apache.org/jira/browse/SPARK-42976
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Shell, SQL
>    Affects Versions: 3.3.2
>         Environment: spark :3.3_2.12
>                      hive : 3.1.1
>                      iceberg: iceberg-spark-runtime-3.3_2.12-1.2.0
>            Reporter: liu
>            Priority: Blocker
>              Labels: features
>
> spark-sql start config:
>
> {code:java}
> ./spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.2.0 \
> --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
> --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
> --conf spark.sql.catalog.spark_catalog.type=hive \
> --conf spark.sql.iceberg.handle-timestamp-without-timezone=true \
> --conf spark.sql.parquet.binaryAsString=true \
> --conf spark.sql.parquet.enableVectorizedReader=false \
> --conf spark.sql.parquet.enableNestedColumnVectorizedReader=true \
> --conf spark.sql.parquet.recordLevelFilter=true {code}
> I have set spark.sql.parquet.enableVectorizedReader=false, but when I query
> an Iceberg Parquet table, the following error occurs:
>
> {code:java}
> at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
> at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:498)
> at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:286)
> at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
> at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.UnsupportedOperationException: Cannot support vectorized reads for column [hzxm] optional binary hzxm = 8 with encoding DELTA_BYTE_ARRAY. Disable vectorized reads to read this table/file
> at org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator.initDataReader(VectorizedPageIterator.java:100)
> at org.apache.iceberg.parquet.BasePageIterator.initFromPage(BasePageIterator.java:140)
> at org.apache.iceberg.parquet.BasePageIterator$1.visit(BasePageIterator.java:105)
> at org.apache.iceberg.parquet.BasePageIterator$1.visit(BasePageIterator.java:96)
> at org.apache.iceberg.shaded.org.apache.parquet.column.page.DataPageV2.accept(DataPageV2.java:192)
> at org.apache.iceberg.parquet.BasePageIterator.setPage(BasePageIterator.java:95)
> at org.apache.iceberg.parquet.BaseColumnIterator.advance(BaseColumnIterator.java:61)
> at org.apache.iceberg.parquet.BaseColumnIterator.setPageSource(BaseColumnIterator.java:50)
> at org.apache.iceberg.arrow.vectorized.parquet.VectorizedColumnIterator.setRowGroupInfo(Vec
> {code}
> *{color:#FF0000}Caused by: java.lang.UnsupportedOperationException: Cannot support vectorized reads for column [hzxm] optional binary hzxm = 8 with encoding DELTA_BYTE_ARRAY. Disable vectorized reads to read this table/file{color}*
>
> It seems that this parameter has no effect. How can I turn off vectorized
> reads so that I can query the table successfully?
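
A possible explanation, judging only from the stack trace above: the exception is thrown by org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator, that is, by Iceberg's own Parquet reader rather than Spark's built-in one, so spark.sql.parquet.enableVectorizedReader likely does not apply to this read path. Iceberg exposes a separate table property, read.parquet.vectorization.enabled, for its reader. A minimal sketch, assuming a hypothetical table name db.tbl (the issue does not name the table):

```sql
-- Hypothetical table name: replace db.tbl with the Iceberg table being queried.
-- Turns off Iceberg's vectorized Parquet reader for this table only.
ALTER TABLE db.tbl SET TBLPROPERTIES ('read.parquet.vectorization.enabled' = 'false');
```

This has not been checked against the reporter's environment. It changes table metadata, so the setting also affects other readers of the same table.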