andygrove commented on PR #1920:
URL: https://github.com/apache/datafusion-comet/pull/1920#issuecomment-3004952533
I added the following method to `FileReader` locally:
```java
/** Sets the projected columns to be read later via {@link #readNextRowGroup()} */
public void setRequestedSchemaFromSpecs(List<ParquetColumnSpec> projection) {
  paths.clear();
  for (ParquetColumnSpec columnSpec : projection) {
    ColumnDescriptor col = Utils.buildColumnDescriptor(columnSpec);
    paths.put(ColumnPath.get(col.getPath()), col);
  }
}
```
I can now compile Iceberg, but I get an exception at runtime, and I do not
yet understand why:
```
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.4.3
      /_/
Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 11.0.27)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sql(s"CREATE TABLE IF NOT EXISTS t1 (c0 INT, c1 STRING) USING iceberg")
25/06/25 08:12:29 INFO core/src/lib.rs: Comet native library version 0.9.0 initialized
25/06/25 08:12:29 WARN CometExecRule: Comet cannot execute some parts of this plan natively (set spark.comet.explainFallback.enabled=false to disable this logging):
CreateTable [COMET: CreateTable is not supported]
res0: org.apache.spark.sql.DataFrame = []
scala> spark.sql(s"INSERT INTO t1 VALUES ${(0 until 10000).map(i => (i, i)).mkString(",")}")
25/06/25 08:12:32 WARN CometExecRule: Comet cannot execute some parts of this plan natively (set spark.comet.explainFallback.enabled=false to disable this logging):
AppendData [COMET: AppendData is not supported]
+- LocalTableScan [COMET: LocalTableScan is not supported]
res1: org.apache.spark.sql.DataFrame = []
scala> spark.sql(s"SELECT * from t1").show()
25/06/25 08:12:35 WARN CometExecRule: Comet cannot execute some parts of this plan natively (set spark.comet.explainFallback.enabled=false to disable this logging):
CollectLimit [COMET: CollectLimit is not supported]
+- Project
+- BatchScan spark_catalog.default.t1 [COMET: Unsupported scan: org.apache.iceberg.spark.source.SparkBatchQueryScan. Comet Scan only supports Parquet and Iceberg Parquet file formats, BatchScan spark_catalog.default.t1 is not supported]
25/06/25 08:12:35 WARN CheckAllocator: More than one DefaultAllocationManager on classpath. Choosing first found
25/06/25 08:12:35 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 32)
java.lang.NoSuchMethodError: 'void org.apache.comet.parquet.FileReader.setRequestedSchemaFromSpecs(java.util.List)'
    at org.apache.iceberg.parquet.CometVectorizedParquetReader$FileIterator.newCometReader(CometVectorizedParquetReader.java:222)
```
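
A `NoSuchMethodError` for a method that exists at compile time often means the JVM loaded a different build of the class at runtime than the one Iceberg was compiled against, for example an older Comet jar earlier on the classpath. That is only a guess for this case and has not been confirmed. The sketch below is one way to check it: it prints where `org.apache.comet.parquet.FileReader` was loaded from and whether the loaded class exposes `setRequestedSchemaFromSpecs(List)`. `ClasspathCheck`, `codeSourceOf`, and `hasMethod` are hypothetical helper names, not part of Comet or Iceberg; only standard JDK reflection calls are used.

```java
import java.security.CodeSource;
import java.util.List;

/** Hypothetical helper (not part of Comet or Iceberg) for inspecting what the JVM loaded. */
public class ClasspathCheck {

  /** Returns the jar or directory a class was loaded from, or a placeholder if none is reported. */
  static String codeSourceOf(Class<?> cls) {
    CodeSource source = cls.getProtectionDomain().getCodeSource();
    return (source == null || source.getLocation() == null)
        ? "<no code source reported>"
        : source.getLocation().toString();
  }

  /** Returns true if the class exposes a public method with this name and parameter types. */
  static boolean hasMethod(Class<?> cls, String name, Class<?>... parameterTypes) {
    try {
      cls.getMethod(name, parameterTypes);
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Class and method names are taken from the stack trace above.
    String className = "org.apache.comet.parquet.FileReader";
    try {
      Class<?> cls = Class.forName(className);
      System.out.println("loaded from: " + codeSourceOf(cls));
      System.out.println("has setRequestedSchemaFromSpecs(List): "
          + hasMethod(cls, "setRequestedSchemaFromSpecs", List.class));
    } catch (ClassNotFoundException e) {
      System.out.println(className + " is not on this classpath");
    }
  }
}
```

The check needs to run with the same classpath as the failing task. If the reported location is not the locally built Comet jar, or the method is reported missing, a stale jar is the likely cause.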