Sounds good.
Another benefit of the struct model is that it's more extensible in the
future when we need to disambiguate the same table that appears multiple
times in the MV query tree.
This could happen with time travel queries or branching. We may end up
adding additional properties like a
Hi Benny, I have responded to the comment.
I would suggest that we use this thread to evaluate properties model vs top
level metadata model (to avoid discussion drift).
If we have feedback on the actual properties used in the properties model
as defined in the PR, we can have the discussion
Hi Walaa,
I left comments in your spec PR:
https://github.com/apache/iceberg/pull/10280#pullrequestreview-2061922169
My last question about use cases was really about incremental refresh with
aggregates. But I think this might be too complicated to try to
model/discuss now and so I agree with
+1 for a JSON/BSON type. We also had the same discussion internally, and
a JSON type would play well with, for example, the SUPER type in
Redshift: https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html.
It could also provide better integration with the Trino JSON type.
Looking
Iceberg has a Java library, which is currently the most complete
implementation of the spec (compared to other languages such as Python and
Rust). You
can certainly use the Java library directly to write and commit data to
Iceberg. But you will likely need to implement quite a bit of code for
things
Completely naive question, since I'm not at all familiar with the
technologies. I wanted to demonstrate using Iceberg files as a way to
ingest lots of data and persist it to S3. It seems like it can do this,
but I have a feeling I need tools like Spark to do it. Is that true? Or
can I hook it up