fuzing opened a new issue, #12804:
URL: https://github.com/apache/iceberg/issues/12804
### Query engine
spark 3.5
### Question
I'm having difficulty querying Iceberg tables through the Spark Thrift
server over ODBC. Calling procedures such as system.rewrite_data_files and
system.expire_snapshots works fine.
I'm using a spark-iceberg Docker container, and its entrypoint.sh looks
similar to:
```
start-master.sh -p 7077
start-worker.sh spark://spark-iceberg:7077
start-history-server.sh
start-thriftserver.sh \
  --driver-java-options "-Dderby.system.home=/tmp/derby" \
  --packages "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.8.1,org.apache.iceberg:iceberg-aws-bundle:1.8.1" \
  --conf "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
  --conf "spark.sql.catalogImplementation=in-memory" \
  --conf "spark.sql.defaultCatalog=somecatalog" \
  --conf "spark.sql.catalog.somecatalog=org.apache.iceberg.spark.SparkCatalog" \
  --conf "spark.sql.catalog.somecatalog.warehouse=s3://bucket-name" \
  --conf "spark.sql.catalog.somecatalog.s3.endpoint=http://minio:9005" \
  --conf "spark.sql.catalog.somecatalog.uri=http://iceberg-rest:8181" \
  --conf "spark.sql.catalog.somecatalog.type=rest" \
  --conf "spark.sql.catalog.somecatalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO"
tail -f /dev/null
```
Running the following through ODBC works fine (i.e. snapshots are
expired):
```
CALL system.expire_snapshots(
table => 'somecatalog.analytics.events',
older_than => TIMESTAMP '2025-04-01 00:00:00.000');
```
However, the following query does not:
```
SELECT * FROM somecatalog.analytics.events LIMIT 5;
```
It comes back with the following error:
```
Table or view not found: somecatalog.analytics.events
```
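A minimal sketch of checks that could help narrow this down, by asking the Thrift server session which catalogs, namespaces and tables it can actually see. It assumes `beeline` is on the PATH and that the Thrift server listens on the default `jdbc:hive2://localhost:10000`; the `check_catalog` helper name and the `BEELINE`/`THRIFT_URL` variables are hypothetical, and no output is shown because these commands have not been run against this setup.

```shell
# Hypothetical helper: run a few catalog diagnostics through the Thrift server.
# Assumption: beeline is on PATH and the server uses the default port 10000.
BEELINE="${BEELINE:-beeline}"
THRIFT_URL="${THRIFT_URL:-jdbc:hive2://localhost:10000}"

check_catalog() {
  # Each statement runs in its own beeline invocation so that one failure
  # does not hide the results of the others.
  for stmt in \
    "SET spark.sql.defaultCatalog;" \
    "SHOW CATALOGS;" \
    "SHOW NAMESPACES IN somecatalog;" \
    "SHOW TABLES IN somecatalog.analytics;"
  do
    "$BEELINE" -u "$THRIFT_URL" -e "$stmt"
  done
}

# Usage (not executed here):
#   check_catalog
```

If `SHOW TABLES IN somecatalog.analytics` lists `events` while the `SELECT` still fails, the problem is more likely in how the ODBC session resolves the table name than in the catalog configuration itself.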
If I launch spark-sql as follows, the same query does work:
```
spark-sql \
  --packages "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.8.1,org.apache.iceberg:iceberg-aws-bundle:1.8.1" \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.defaultCatalog=somecatalog \
  --conf spark.sql.catalogImplementation=in-memory \
  --conf spark.sql.catalog.somecatalog.uri=http://iceberg-rest:8181/ \
  --conf spark.sql.catalog.somecatalog.type=rest \
  --conf spark.sql.catalog.fusing.warehouse=s3://bucket-name \
  --conf spark.sql.catalog.somecatalog=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.somecatalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
  --conf spark.sql.catalog.somecatalog.s3.endpoint=http://minio:9005
```
Then:
```
SELECT * FROM somecatalog.analytics.events LIMIT 5;
```
This works fine. I'm also using StarRocks for read-only access to the
same catalog and table, and querying from there works as well.
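Because the two launch commands were written out separately, one way to rule out configuration drift between them is to feed both launchers the same flags. The following is only a sketch: `with_iceberg_conf` is a hypothetical wrapper name, and the values are copied from the Thrift server command.

```shell
# Hypothetical wrapper: append one shared set of Iceberg flags to any Spark launcher.
# Values are copied from the Thrift server command in entrypoint.sh.
with_iceberg_conf() {
  "$@" \
    --packages "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.8.1,org.apache.iceberg:iceberg-aws-bundle:1.8.1" \
    --conf "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
    --conf "spark.sql.catalogImplementation=in-memory" \
    --conf "spark.sql.defaultCatalog=somecatalog" \
    --conf "spark.sql.catalog.somecatalog=org.apache.iceberg.spark.SparkCatalog" \
    --conf "spark.sql.catalog.somecatalog.warehouse=s3://bucket-name" \
    --conf "spark.sql.catalog.somecatalog.s3.endpoint=http://minio:9005" \
    --conf "spark.sql.catalog.somecatalog.uri=http://iceberg-rest:8181" \
    --conf "spark.sql.catalog.somecatalog.type=rest" \
    --conf "spark.sql.catalog.somecatalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO"
}

# Usage (not executed here):
#   with_iceberg_conf start-thriftserver.sh --driver-java-options "-Dderby.system.home=/tmp/derby"
#   with_iceberg_conf spark-sql
```

With both entry points launched this way, any remaining difference in behaviour cannot be attributed to a mismatch in the `--conf` values.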
Any assistance on how to get the thriftserver working correctly for SQL
queries would be much appreciated.
Thank you!