RE: Re: Support Securable Objects in Iceberg REST Catalog

2024-06-28 Thread Dennis Huo
+1, Thanks Jack and team for getting the discussion started with this proposal! Much of this is well aligned with what we noticed when implementing RBAC for Polaris Catalog, namely that even if a more complicated User/Role structure exists outside of the catalog, it's necessary to be able to

RE: Iceberg - PySpark overwrite with a condition

2024-06-28 Thread Ha Cao
Hi Fokko, Thanks so much for sharing. I am using version 3.2.1. Is this not supported in 3.2.1? I do get the error with the `col` syntax: df2.writeTo(spark_table_path).using("iceberg").overwrite(col("tid") >= 2) The stack trace would look like this: ---

Re: Iceberg - PySpark overwrite with a condition

2024-06-28 Thread Fokko Driesprong
Hey Ha, What version of Spark are you using? Can you share the whole stack trace? I tried to reproduce it locally and it worked fine: pyspark --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2 \ --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSession

Re: [DISCUSSION] Addressing security questions in the Iceberg REST specification

2024-06-28 Thread Robert Stupp
It's been a while since the discussion started. There were no objections to the proposal, so I went ahead and started to implement "M1" of the document [1]. [1] PR for M1: https://github.com/apache/iceberg/pull/10603 Robert On 19.06.24 15:11, Robert Stupp wrote: Alright, I've just creat

RE: Iceberg - PySpark overwrite with a condition

2024-06-28 Thread Ha Cao
Hi Ajantha, Thanks for replying! The example, however, is in Java. I figure that syntax probably only works for Java and Scala. I have tried the equivalent in PySpark but still got `Column is not iterable` with: df.writeTo(spark_table_path).using("iceberg").overwrite(col("time") > target_times
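For reference, a minimal PySpark sketch of the conditional overwrite being discussed in this thread. The table name `catalog.db.events` and the session config are placeholders, and the SQL workaround at the end is an assumption (a non-atomic DELETE-then-append, not the exact fix the list settled on):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Hypothetical session; assumes the Iceberg Spark runtime jar and a catalog
# named "catalog" are already configured.
spark = (
    SparkSession.builder
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .getOrCreate()
)

df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["tid", "data"])

# Conditional overwrite via DataFrameWriterV2: rows matching the filter are
# replaced by df's rows. `overwrite` expects a pyspark.sql.Column, which is
# why passing something else (or running on a PySpark build without the
# Column overload) can surface errors like "Column is not iterable".
df.writeTo("catalog.db.events").using("iceberg").overwrite(col("tid") >= 2)

# Rough SQL workaround (assumption, not from the thread): Iceberg's
# row-level DELETE followed by an append. Note this is two commits,
# not one atomic overwrite.
# spark.sql("DELETE FROM catalog.db.events WHERE tid >= 2")
# df.writeTo("catalog.db.events").append()
```

Running this requires a Spark session with an Iceberg catalog configured, so it is a sketch of the API shape rather than a standalone script.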

Re: Iceberg - PySpark overwrite with a condition

2024-06-28 Thread Ajantha Bhat
Hi, Please refer to this doc: https://iceberg.apache.org/docs/nightly/spark-writes/#overwriting-data We do have some test cases for the same: https://github.com/apache/iceberg/blob/91fbcaa62c25308aa815557dd2c0041f75530705/spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/PartitionedWritesT