+1, Thanks Jack and team for getting the discussion started with this
proposal!
Much of this is well aligned with what we noticed when implementing RBAC
for Polaris Catalog, namely that even if a more complicated User/Role
structure exists outside of the catalog, it's necessary to be able to
---
Hi Fokko,
Thanks so much for sharing. I am using version 3.2.1. Is this not supported in
3.2.1?
I do get the error with the `col` syntax:
df2.writeTo(spark_table_path).using("iceberg").overwrite(col("tid") >= 2)
The stack trace would look like this:
---
Hey Ha,
What version of Spark are you using? Can you share the whole stack trace? I
tried to reproduce it locally and it worked fine:
pyspark --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSession
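For anyone trying to reproduce this, a fuller launch line (a sketch following the Iceberg quickstart docs; the catalog name `local` and the warehouse path are placeholders you would adjust) would look roughly like:

```shell
pyspark --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.local.type=hadoop \
  --conf spark.sql.catalog.local.warehouse=/tmp/warehouse
```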
It's been a while since the discussion started. There were no objections
to the proposal, so I went ahead and started to implement "M1" of the
document [1].
[1] PR for M1: https://github.com/apache/iceberg/pull/10603
Robert
On 19.06.24 15:11, Robert Stupp wrote:
Alright, I've just creat
---
Hi Ajantha,
Thanks for replying! The example, however, is in Java; I figure that syntax
probably only works for Java and Scala. I tried the equivalent in
PySpark but still got `Column is not iterable` with:
df.writeTo(spark_table_path).using("iceberg").overwrite(col("time") > target_times)
---
Hi,
Please refer to this doc:
https://iceberg.apache.org/docs/nightly/spark-writes/#overwriting-data
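Since the row-level overwrite examples there are in Java/Scala, here is a rough PySpark equivalent (an untested sketch only: `df` and the table name `local.db.table` are placeholders, and it assumes a PySpark version whose `DataFrameWriterV2` supports `overwrite(condition)`):

```python
from pyspark.sql.functions import col

# df is an existing DataFrame of replacement rows; "local.db.table" is a
# placeholder Iceberg table. overwrite(condition) replaces the rows matching
# the predicate with the contents of df in one atomic commit.
df.writeTo("local.db.table").overwrite(col("tid") >= 2)
```

If this still raises `Column is not iterable`, the Python API on that Spark version may simply not accept a Column argument there, in which case trying a newer Spark is probably the quickest check.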
We do have some test cases for the same:
https://github.com/apache/iceberg/blob/91fbcaa62c25308aa815557dd2c0041f75530705/spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/PartitionedWritesT