Re: [DISCUSS] iceberg-rust: pyiceberg_core 0.1.0 Release

2024-08-28 Thread Fokko Driesprong
Thanks for driving this Sung, this is very exciting! 1. The transforms are a good first thing to address. 2. I agree with Xuanwo that for flexibility we can decouple them. 3. Automation is probably easier than doing it manually (otherwise we would have to document the steps). Kind regards, Fokko

Re: [VOTE] Merge REST Spec change to add RemovePartitionSpecsUpdate update type

2024-08-26 Thread Fokko Driesprong
+1 Op ma 26 aug 2024 om 22:00 schreef Yufei Gu : > +1 > Yufei > > > On Mon, Aug 26, 2024 at 11:06 AM Ryan Blue > wrote: > >> +1 >> >> On Mon, Aug 26, 2024 at 11:04 AM Amogh Jahagirdar <2am...@gmail.com> >> wrote: >> >>> I've opened a PR [1] to add a RemovePartitionSpecsUpdate update type so >>>

Re: [VOTE] Release Apache Iceberg 1.6.1 RC2

2024-08-24 Thread Fokko Driesprong
+1 (binding) - Verified signatures, checksums and ran the tests locally Kind regards, Fokko Op vr 23 aug 2024 om 20:51 schreef Piotr Findeisen < piotr.findei...@gmail.com>: > +1 (non-binding) > > Trino integration > > https://github.com/trinodb/trino/actions/runs/10529992246/job/29179087096?pr=

Re: [DISCUSS] Variant Spec Location

2024-08-22 Thread Fokko Driesprong
ndation. > >>>>>>>>>> > >>>>>>>>>> Yufei > >>>>>>>>>> > >>>>>>>>>> On Wed, Aug 14, 2024 at 7:51 PM Gang Wu > >>> wrote: > >>>>>>>

Re: [VOTE] Release Apache Iceberg 1.6.1 RC1

2024-08-21 Thread Fokko Driesprong
Hey Eduard, I think it relates to this PR. It contains a CVE and would be good to be backported. We wanted to include it in 1.6.1 if we needed another RC, but that didn't happen, so I think we didn't cherry-pick it to 1.6.x branch. Kind regards, Fokk

Re: Type promotion in v3

2024-08-20 Thread Fokko Driesprong
ot something that needs to be done now but laying the ground-work > is useful). Similar to the point above, we should be opinionated about this. For example, historically we've been parsing dates strictly; see DateTimeUtil <https://github.com/apache/iceberg/blob/main/api/s

Re: [DISCUSS] Adding RemovePartitionSpecsUpdate update type to REST

2024-08-20 Thread Fokko Driesprong
+1 Thanks for working on this Op di 20 aug 2024 om 04:16 schreef xianjin : > +1 from my side as well. > > Sent from my iPhone > > On Aug 20, 2024, at 9:09 AM, Yufei Gu wrote: > >  > > +1, the new spec looks good to me. It seems like the client-side handling > the heavy lifting of figuring out w

Re: [VOTE] Spec changes in preparation for v3

2024-08-19 Thread Fokko Driesprong
+1 Op ma 19 aug 2024 om 22:01 schreef Russell Spitzer < russell.spit...@gmail.com>: > +1 - Feels duplicative to vote here and approve on the PR > > On Mon, Aug 19, 2024 at 2:41 PM Ryan Blue wrote: > >> Hi everyone, >> >> I'd like to vote on PR #10948 >>

Re: [DISCUSS] Iceberg 1.6.1 release

2024-08-19 Thread Fokko Driesprong
erged on Jul 26th and it would be >> great to make it available to downstream projects. >> >> I volunteer to help with Iceberg 1.6.1 release, to share the operational >> cost. >> >> >> Best >> Piotr >> >> >> >> On Thu, 8 Aug 202

Re: Type promotion in v3

2024-08-19 Thread Fokko Driesprong
Thanks Ryan for bringing this up, that's an interesting problem, let me think about this. we can persist schema_id in the DataFile This was also my first thought. The two drawbacks are: - Distribute all the schemas to the executors, and we have to do the lookup and comparison there. -

Re: Table schema and partition spec update

2024-08-19 Thread Fokko Driesprong
Hey Peter, Thanks for raising this since I recently ran into the same issue. The APIs that we have today nicely hide the field IDs from the user, which is great. I do think all the methods are in there to evolve the schema to the desired one, however, we don't have a way to control the field-IDs.

Re: [VOTE] Release Apache Iceberg Rust 0.3.0 RC1

2024-08-19 Thread Fokko Driesprong
+1 (binding) Thanks Xuanwo for running this release, and sorry for the late vote, I was doing additional tests against Tabular and had to flex my tiny Rust muscle a bit. - Validated the signatures and checksums - Checked out the licenses

Re: [VOTE] Release Apache PyIceberg 0.7.1rc2

2024-08-14 Thread Fokko Driesprong
+1 (binding) Thanks Sung for running this 🙌 - Validated signatures/checksums/license - Ran some basic tests (3.10) Kind regards, Fokko Op wo 14 aug 2024 om 19:57 schreef André Luis Anastácio : > >- validated signatures and checksums > > >- checked license > > >- ran tests and test-

Re: [DISCUSS] Cleanup svn dev/iceberg

2024-08-14 Thread Fokko Driesprong
Thanks! Kind regards, Fokko Op wo 14 aug 2024 om 17:57 schreef Xuanwo : > Got it. I will clean them up. > > On Wed, Aug 14, 2024, at 23:54, Fokko Driesprong wrote: > > Hey Xuanwo, > > Feel free to clean those up as they should have been cleaned up a long > time ago.

Re: [DISCUSS] Cleanup svn dev/iceberg

2024-08-14 Thread Fokko Driesprong
Hey Xuanwo, Feel free to clean those up as they should have been cleaned up a long time ago. I'm also happy to do it myself, let me know! Kind regards, Fokko Op wo 14 aug 2024 om 17:49 schreef Xuanwo : > Hi, > > The dev branch of SVN is used to host artifacts awaiting a vote. It > increases the

Re: [VOTE] Merge REST spec clarification on how servers should handle unknown updates/requirements

2024-08-14 Thread Fokko Driesprong
+1 Thanks for clarifying this Kind regards, Fokko Op wo 14 aug 2024 om 04:34 schreef xianjin : > +1 > > On Aug 14, 2024, at 2:24 AM, Ryan Blue > wrote: > >  > +1 > > On Tue, Aug 13, 2024 at 8:59 AM Yufei Gu wrote: > >> +1 >> Yufei >> >> >> On Tue, Aug 13, 2024 at 8:57 AM Eduard Tudenhöfner <

Re: [DISCUSS] Variant Spec Location

2024-08-14 Thread Fokko Driesprong
+1 to what's already being said here. It is good to copy the spec to Iceberg and add context that's specific to Iceberg, but at the same time, we should maintain compatibility. Kind regards, Fokko Op wo 14 aug 2024 om 15:30 schreef Manu Zhang : > +1 to copy the spec into our repository. I think

Re: Welcome Péter, Amogh and Eduard to the Apache Iceberg PMC

2024-08-14 Thread Fokko Driesprong
Congratulations and welcome! Kind regards, Fokko Op wo 14 aug 2024 om 06:23 schreef Xuanwo : > Congrats! Thanks for your contribution. > > On Wed, Aug 14, 2024, at 11:32, Renjie Liu wrote: > > Congratulations, everyone! > > On Wed, Aug 14, 2024 at 11:14 AM roryqi wrote: > > Congrats! > > Steven

Re: [DISCUSS] Start iceberg-rust 0.3.0 release process

2024-08-14 Thread Fokko Driesprong
Thanks Xuanwo for driving this, very excited to see this happening. Let me know if there is anything I can help with! Kind regards, Fokko Op wo 14 aug 2024 om 08:58 schreef Xuanwo : > Hello, everyone > > I'm starting this thread to discuss initiating the release process for > iceberg-rust 0.3.0.

Re: [DISCUSS] Filesystem in PyIceberg

2024-08-12 Thread Fokko Driesprong
Hi André, First of all, thanks for raising this. Maintenance routines are a long-awaited functionality in PyIceberg. The FileIO concept is not limited to PyIceberg, but is also present in Java

Re: [DISCUSS] Flink 1.20: make FLIP-27 default in SQL and mark the old FlinkSource as deprecated

2024-08-12 Thread Fokko Driesprong
Hey Steven, That sounds very exciting! I'm not a heavy Flink user, but I don't see any issues enabling it on Flink 1.20. We should make it explicit in the changelog, and if possible give some hints on how to drain the Flink jobs. Kind regards, Fokko Op ma 12 aug 2024 om 04:57 schreef Steven Wu :

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-09 Thread Fokko Driesprong
b.com/apache/iceberg-python/pull/1026 > > Sung > > On Thu, Aug 8, 2024 at 9:29 AM André Luis Anastácio > wrote: > >> I fixed an overwrite error that, I think, would be good to include in the >> 0.7.1 release https://github.com/apache/iceberg-python/pull/1023 >>

Re: [DISCUSS] Iceberg 1.6.1 release

2024-08-08 Thread Fokko Driesprong
Hey Piotr, We had some delays with the Avro 1.12.0 release, mostly because all the languages were released at once. On the Avro devlist, I suggested releasing 1.11.4 just for Java because of the CVE. Realistically this would be around 1-2 weeks. Does that sound reasonable? Kind regards, Fokko Op

Re: [DISCUSS] Release Avro Java 1.11.4

2024-08-08 Thread Fokko Driesprong
> On Fri, Aug 9, 2024, at 00:26, Ryan Blue wrote: > > +1 for releasing Avro Java separately. > > On Thu, Aug 8, 2024 at 8:28 AM Fokko Driesprong wrote: > > Hi everyone, > > In light of the recent discussion of releasing artifacts separately [1], I > would like to discus

[DISCUSS] Release Avro Java 1.11.4

2024-08-08 Thread Fokko Driesprong
Hi everyone, In light of the recent discussion of releasing artifacts separately [1], I would like to discuss releasing Java 1.11.4. Since Java 1.12.0 only supports JDK11+, I think it is important to also do a release of Java (which includes several CVE patches). I would like to hear if there are a

Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2024-08-08 Thread Fokko Driesprong
Coming from PyIceberg, I have concerns as this proposal focuses on SQL-based engines, while Python-based systems often work with data frames. Adding imperative languages like Python would make this proposal more inclusive. Kind regards, Fokko Op do 8 aug 2024 om 10:27 schreef Piotr Findeisen :

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-08 Thread Fokko Driesprong
Thanks Eduard for raising all the PRs. I like the approach of the server-side configuration, that way the catalog is in charge of providing a character that's suitable for them. Kind regards, Fokko Op do 8 aug 2024 om 12:27 schreef Eduard Tudenhöfner < etudenhoef...@apache.org>: > I've

Re: [DISCUSS] PyIceberg: Remove optional support for instance-level identifier in Catalog and Table APIs

2024-08-08 Thread Fokko Driesprong
Hey Sung, Thanks for raising this. This has also been on my list for a very long time, but I was reluctant to do it because of the incompatible change, as you already mentioned. However, I think it is good to remove this sooner rather than later. I just went over the PR

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-08 Thread Fokko Driesprong
> > > value in getting the proposed config property out as early as > possible for > > > > the larger community. > > > > > > > > I'm still on the fence regarding 17.0.0 upgrade. There are clear > > > > functional upsides, but I feel that co

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-06 Thread Fokko Driesprong
lication may have to upgrade >> their PyArrow versions which could be a deterrent (or a welcome nudge). >> Would it be worth starting that discussion on a separate thread? >> >> Sung >> >> On 2024/08/02 17:57:17 Fokko Driesprong wrote: >> > Hey Sung, >> >

Re: [DISCUSS] Iceberg-rust based Ruby bindings

2024-08-06 Thread Fokko Driesprong
Hi Chris, Thanks for raising this. Do you know how big the Ruby data community is? I think the most important part is that it gets some traction and will continue to be maintained. I fully agree that building on top of iceberg-rust makes a lot of sense, since also with PyIceberg we're running int

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-02 Thread Fokko Driesprong
having Rust and Python code in a single repository. There are some exceptions like Pydantic (pydantic <https://github.com/pydantic/pydantic>, pydantic-core <https://github.com/pydantic/pydantic-core>). Kind regards, Fokko Op vr 2 aug 2024 om 20:11 schreef Fokko Driesprong : > Th

Re: [DISCUSS] Use iceberg-rust for PyIceberg Bucket Transform

2024-08-02 Thread Fokko Driesprong
Hey everyone, In the beginning of PyIceberg, one of the goals was to keep PyIceberg pure Python. At some point, we've added a Cython Avro decoder because of performance reasons, but we still have a pure Python fallback. Today you can still do metadata operations using s3fs without any native code.
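
The pure-Python-with-native-fallback approach described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the module name `hypothetical_native_decoder` is invented and is not PyIceberg's actual layout. The pure-Python function decodes a single Avro zigzag-encoded variable-length long, which is the kind of hot path a compiled decoder would typically replace.

```python
import io


def read_long_pure(stream) -> int:
    """Pure-Python fallback: decode one Avro zigzag varint long from a binary stream."""
    byte = stream.read(1)[0]
    value = byte & 0x7F
    shift = 7
    # The high bit of each byte signals that another byte follows.
    while byte & 0x80:
        byte = stream.read(1)[0]
        value |= (byte & 0x7F) << shift
        shift += 7
    # Undo zigzag encoding: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
    return (value >> 1) ^ -(value & 1)


try:
    # Hypothetical compiled extension (assumption, not a real package).
    from hypothetical_native_decoder import read_long
except ImportError:
    # No native code available: keep working with the pure-Python version.
    read_long = read_long_pure


if __name__ == "__main__":
    print(read_long(io.BytesIO(b"\x02")))
```

With this arrangement callers only ever import `read_long`; whether the compiled or the pure-Python implementation is used is decided once at import time.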

Re: [DISCUSS] Use iceberg-rust as pyiceberg file io

2024-08-02 Thread Fokko Driesprong
Thanks for driving this Xuanwo, I already suggested this in my talk back at the Spark Summit to see if we can spark some interest, and it is exciting to see this materialize. For the IO abstraction, I think the FileIO is the best option. We already have the interface

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-02 Thread Fokko Driesprong
Hey Sung, Typically we only push patches into the minor versions; we could also go to version 0.8.0 immediately. Regarding the memory consumption, thanks for putting those numbers together! I would also love to get #929, so we can push down the

Re: [VOTE] Clarify "File System Tables" in the table spec

2024-08-01 Thread Fokko Driesprong
+1 (binding) Op do 1 aug 2024 om 09:57 schreef Eduard Tudenhöfner < etudenhoef...@apache.org>: > +1 (non-binding) > > On Thu, Aug 1, 2024 at 6:52 AM Micah Kornfield > wrote: > >> +1 (non-binding) >> >> On Wed, Jul 31, 2024 at 5:12 PM Ryan Blue wrote: >> >>> As promised in the discussion thread,

Re: Catalog Questions

2024-07-30 Thread Fokko Driesprong
Hey Taher, You're right! Iceberg uses a catalog among others to maintain consistency; you can read more about it here. The choice of a catalog depends on your organization and how your setup is organized. For example, if you don't use Hive metastore

Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0

2024-07-30 Thread Fokko Driesprong
Jack, no atomic drop table support: this seems pretty fixable, as you can change > the semantics of dropping a table to be deleting the latest table version > hint file, instead of having to delete everything in the folder. I feel > that actually also fits the semantics of purge/no-purge better.

Re: [ANNOUNCE] Apache PyIceberg release 0.7.0

2024-07-30 Thread Fokko Driesprong
So many great new features, thanks everyone for contributing, and thanks Sung for running the release! Kind regards, Fokko Op wo 31 jul 2024 om 05:00 schreef Jack Ye : > Thank you Sung for managing the release! And many thanks to everyone that > participated! > > Best, > Jack Ye > > > On Tue, Ju

Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0

2024-07-30 Thread Fokko Driesprong
, do I choose REST, or do I choose (or even just build) a >>>>>>> storage-only Iceberg catalog? I feel I would actually choose the later. >>>>>>> >>>>>>> Going back to the discussion points, my current take of this topic >>>>

Re: [DISCUSS][BYLAWS] Moving forward on the bylaws

2024-07-29 Thread Fokko Driesprong
elease page <https://github.com/apache/iceberg/pull/10806>, which includes feedback on a recent PR <https://github.com/apache/iceberg/pull/10787>. Let me know what you think. Kind regards, Fokko Op ma 29 jul 2024 om 12:27 schreef Fokko Driesprong : > Hey everyone, > > JB,

Re: [DISCUSS][BYLAWS] Moving forward on the bylaws

2024-07-29 Thread Fokko Driesprong
Hey everyone, JB, I fully agree with you. For clarity and consistency, we should point to the docs. The starting point for the roles can be as simple as this. For the release manager, I clarified the how-to-release page

Re: [VOTE] Release Apache PyIceberg 0.7.0rc2

2024-07-27 Thread Fokko Driesprong
Hey everyone, I just yanked the release from PyPi. I still encourage everyone to test out PyIceberg 0.7.0rc1 to check if everything works on their end and give all the awesome new features a go. Since the release has been yanked, and releases are immutable in PyPi, there are two ways forward:

Re: [DISCUSS][BYLAWS] Moving forward on the bylaws

2024-07-25 Thread Fokko Driesprong
Hey Micah, I took a look at the PR, and I think that's a good start. Regarding 2: I like the idea of the state diagram, but it feels to me that there are a lot of states in there. Maybe we can take the Airflow AIP states as a start

Re: Dropping JDK 8 support

2024-07-23 Thread Fokko Driesprong
Hey everyone, I'm also in favor of dropping JDK8. To give some context in the ecosystem, next to Spark 4, a lot of projects are moving beyond Java 8: 1. Arrow dropped JDK8 support last week which will be part of the next 18.0.0 release. 2.

[ANNOUNCE] Welcoming new committers and PMC members

2024-07-23 Thread Fokko Driesprong
Hi everyone, The Iceberg PMC is excited to announce new committers and PMC members to the Apache Iceberg project. New committers: - Kevin Liu (kevinjqliu) - Piotr Findeisen (findepi) - Sung Yun (syun64) - Xuanwo (xuanwo) New members of the PMC: - Honah (ho

Re: Building with JDK 21

2024-07-22 Thread Fokko Driesprong
Thanks for summarizing this, Piotr. I believe having a separate thread on dropping Java 8 is the right thing to do. We want to be as transparent about these changes as possible. Kind regards, Fokko Driesprong Op ma 22 jul 2024 om 14:37 schreef Piotr Findeisen < piotr.findei...@gmail.

Re: [VOTE] Release Apache Iceberg 1.6.0 RC1

2024-07-22 Thread Fokko Driesprong
+1 (binding) - Validated checksums and signatures - Checked licenses - Compiled and ran the tests locally using JDK8 - Ran examples Kind regards, Fokko Op ma 22 jul 2024 om 07:05 schreef Ajantha Bhat : > +1 (non-binding) > > * validated checksum and signature > * checked license docs & ran RAT
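
The "validated checksums" step that recurs in these release votes can be sketched in a few lines. This is a minimal sketch, not the project's official verification script: it assumes the checksum file holds the hex digest as its first whitespace-separated token (the layout tools such as `shasum` produce), and the function names are made up for illustration. Signature verification is a separate step and is not covered here.

```python
import hashlib


def sha512_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-512 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, checksum_text: str) -> bool:
    """Compare a file against the contents of its .sha512 file.

    Assumption: the first whitespace-separated token is the hex digest.
    """
    tokens = checksum_text.split()
    if not tokens:
        return False
    return sha512_of(path) == tokens[0].lower()
```

A release verifier would read the published checksum file for an artifact, pass its text as `checksum_text`, and treat a `False` result as a reason to vote -1.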

Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0

2024-07-18 Thread Fokko Driesprong
Hey Ryan and others, Thanks for bringing this up. I would be in favor of removing the HadoopTableOperations, mostly because of the reasons that you already mentioned, but also because it is not fully in line with the first principles of Iceberg (being object store native) as it uses fi

Re: [VOTE] Merge table spec clarifications on time travel and equality deletes

2024-07-16 Thread Fokko Driesprong
+1 (binding) Thanks Micah for the clarification, much appreciated Kind regards, Fokko Op ma 15 jul 2024 om 22:35 schreef Micah Kornfield : > I'd like to raise on modifying the table specification with clarifications > on time travel and equality deletes [1][2]. The PRs have links to prior > ma

Re: [VOTE] Release Apache Iceberg 1.6.0 RC0

2024-07-15 Thread Fokko Driesprong
Thanks JB for running the release! +1 (binding) - Checked signatures and checksums - Ran license check - Ran tests - Verified against example notebooks - Ran some tests regarding the split of the uri/oauth2-server-uri Kind regards, F

Re: [DISCUSS] Merging specification clarifications

2024-07-15 Thread Fokko Driesprong
Hey Micah, Thanks for raising this. I was going over all the open PRs on the table spec, and I think it would be great to get these in since they provide some valuable clarification. I think a VOTE is the most straightforward way to get it in, you can find an example here

Re: [DISCUSS] Formalized File IO Properties

2024-07-10 Thread Fokko Driesprong
Hey Xuanwo, Thanks for raising this. - The S3 properties are largely covered under the S3FileIO page: https://iceberg.apache.org/docs/nightly/aws/#s3-fileio. But it looks like some important ones are missing indeed. I've raised an issue here

Re: [DISCUSS] Fix property names in REST spec for statistics / partition statistics

2024-07-10 Thread Fokko Driesprong
Hey everyone, I'm fine with a vote, it is a change to the spec indeed, but it is because of a discrepancy between the reference implementation and the spec, so therefore you can also see it as fixing a bug. Let me give some context around how this is done for PyIceberg and Iceberg-Rust. Clients

Re: [DISCUSS] Enable the discussion tab for iceberg github repos

2024-07-10 Thread Fokko Driesprong
Thanks for raising this. I would also prefer discussions over a user mailing-list since it has a lower barrier. We could also first enable this on Iceberg-rust and evaluate it after a while to see the added value and then decide for Python and Java? WDYT? Kind regards, Fokko Op wo 10 jul 2024 om

Re: Building with JDK 21

2024-07-10 Thread Fokko Driesprong
Thanks Piotr for raising this and summing it up so far. The timing of deprecating is always hard, but it looks like there is a lot of traction within the Java data ecosystem to move to a later version: - Avro 1.12.0 will be JDK17+ - Spark 4.x will be JDK17+ - Arrow 18 will be JDK11+ -

Re: [INFO] Preparing the Apache Iceberg 1.6.0 release

2024-07-02 Thread Fokko Driesprong
Hey everyone, thanks for jumping in here. On the run Flink without Hadoop PR, I've removed it from the milestone since after discussing with Peter on the PR, we've agreed that there needs to be a more fundamental fix. Also, it would be good to have tests to ensure that it runs without Hadoop. I'l

Re: Iceberg - PySpark overwrite with a condition

2024-06-30 Thread Fokko Driesprong
how() +--+---+ | name|age| +--+---+ | Fokko| 1| | Gurbe| 2| |Pieter| 2| +--+---+ >>> >>> new_person = [('Joe', 2)] >>> df_overwrite = spark.createDataFrame(new_person, ['name', 'age']) >>> >>> from pyspark.sql.

Re: Iceberg - PySpark overwrite with a condition

2024-06-28 Thread Fokko Driesprong
Hey Ha, What version of Spark are you using? Can you share the whole stack trace? I tried to reproduce it locally and it worked fine: pyspark --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2\ --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSession

Re: [INFO] Preparing the Apache Iceberg 1.6.0 release

2024-06-25 Thread Fokko Driesprong
it, but it doesn't hurt to >> mention it here so that it does not go under our radar. There is no >> breaking change in this release, FYI. >> >> >> >> Regards, >> >> >> >> Alex >> >> >> >> On Wed, Ju

Re: [Early Feedback] Variant and Subcolumnarization Support

2024-06-25 Thread Fokko Driesprong
's authors go over the open issues and try to resolve low-hanging fruit. This will clean up the proposal already quite a bit. Then we can come up with a list of open questions (happy to help) and have a meeting to discuss these. WDYT? Kind regards, Fokko Driesprong Op vr 31 mei 2024 om 18:54

Re: Feedback Collection: Bylaws in Iceberg

2024-06-24 Thread Fokko Driesprong
itial response, I think there is value in the bylaws, but I'm a firm believer in people over process (community over code?). I'll go over the Google-doc tomorrow morning in detail. Kind regards, Fokko Driesprong Op ma 24 jun 2024 om 21:20 schreef Ryan Blue : > Here is my original

Re: [DISCUSSION] Preparing the Apache iceberg-rust 0.3.0 release

2024-06-20 Thread Fokko Driesprong
> > >>> we just have to make sure that we're not able to write metadata > without field-IDs because that would violate the spec (and cause > potentially compatibility issues down the road). > >> > >> > >> We need to wait for a while for the av

Re: [DISCUSSION] Preparing the Apache iceberg-rust 0.3.0 release

2024-06-19 Thread Fokko Driesprong
+1 all in for a release looking at the amount of great features that are staged. Thank you Renjie and Xuanwo for attending the community sync, especially looking at your time zone. Renjie raised the issue of not having field-IDs based resolution when reading Avro. I think we're fine for leaving th

Re: Agenda Community Sync 19th June

2024-06-19 Thread Fokko Driesprong
Hey everyone, Thanks for the input. I've collected everything in the notes; feel free to make suggestions or edits. Thanks Brian for running the recording. Looking forward to seeing everyone later today! Kind regards

Re: Agenda Community Sync 19th June

2024-06-18 Thread Fokko Driesprong
Hey Jan, Thanks for raising this. Let me jot down the highlights, and feel free to add what you'd like to discuss. I'm personally looking forward to an update on the materialized views. Kind regards, Fokko Op di 18 jun 2024 om 20:28 schreef Jan Kaul : > Hi all, > > I was wondering whether there

Re: [INFO] Preparing the Apache Iceberg 1.6.0 release

2024-06-12 Thread Fokko Driesprong
Hi JB, thanks for raising this. - With the Gradle version update, we will be able to upgrade to Parquet > 1.14.0 We might want to defer this until Parquet 1.14.1 gets released. There is an issue found with Jackson that prohibits Spark from upgrad

Re: Addressing security questions in the Iceberg REST specification

2024-05-31 Thread Fokko Driesprong
T servers to opt for integrating with any standard >>> OAuth2 / >>> >>>> OIDC provider (e.g. Okta, Keycloak, Authelia). >>> >>>> >>> >>>> I agree with both of these points; again I don't think the >>> intentio

Re: Addressing security questions in the Iceberg REST specification

2024-05-28 Thread Fokko Driesprong
Hey Robert, Sorry for the late reply as I was out last week. I'm not an OAuth guru either, but some context from my end. * Credentials (for example username/password) must _never_ be sent to > the resource server, only to the authorization server. In an earlier discussion

Re: GitHub issue labels

2024-05-27 Thread Fokko Driesprong
Hey Manu, I don't explicitly use the labels, but they help me to categorize the issues mentally. I agree that there is room for improvement as there are more issues being raised every day. Other communities also have interesting approaches, such as: - Triage label: When a new bug, improvement

Re: [VOTE] Release Apache Iceberg 1.5.2 RC0

2024-05-02 Thread Fokko Driesprong
+1 (binding) Thanks for going through this once more! - Ran the signatures and checksums - Checked the licenses - Ran some sample checks with Spark 3.5 (Scala 2.12) Kind regards, Fokko Op do 2 mei 2024 om 15:51 schreef Eduard Tudenhoefner : > +1 (non-binding) > > * validated checksum and signa

Re: [ANNOUNCE] Apache PyIceberg release 0.6.1

2024-04-30 Thread Fokko Driesprong
Awesome! Thanks for running this release Honah 🙌 Kind regards, Fokko Op wo 1 mei 2024 om 06:48 schreef Honah J. : > I'm pleased to announce the release of Apache PyIceberg 0.6.1! > > Apache Iceberg is an open table format for huge analytic datasets. Iceberg > delivers high query performance for

Re: [VOTE] Release Apache Iceberg 1.5.1 RC0

2024-04-23 Thread Fokko Driesprong
Sorry for being late to the party! +1 (binding) - Checked checksum, signature and licenses - Ran example notebooks Kind regards, Fokko Op di 23 apr 2024 om 22:58 schre

Re: [VOTE] Release Apache PyIceberg 0.6.1rc3

2024-04-18 Thread Fokko Driesprong
Thanks Honah for the quick follow-up with RC3. +1 binding - Ran the signatures, checksums, and licenses. - Double-checked that it installs from a clean Python 3.10 doc

Re: [VOTE] Release Apache PyIceberg 0.6.1rc2

2024-04-17 Thread Fokko Driesprong
re is the poetry.lock file that provides reproducible CI builds, and this is missing from the tar.gz (where it will try to install the latest and greatest). Kind regards, Fokko Driesprong Op do 18 apr 2024 om 04:21 schreef Kevin Liu : > +1 (non binding) > > Downloaded specific commit f

Re: [VOTE] Release Apache PyIceberg 0.6.1rc2

2024-04-17 Thread Fokko Driesprong
Hey everyone, First of all, thanks Honah for running the release! +1 (binding) from my end - I checked the signature, hashes, and licenses and all look good. - Ran some local tests. Kind regards, Fokko Op di 16 apr 2024 om 05:55

Re: Looking for help with Pyflink and Iceberg

2024-04-10 Thread Fokko Driesprong
Hey Frank, Thanks for reaching out here. I spent some cycles a while ago to remove the Hadoop requirement from Flink. There were a lot of APIs that needed to change, which caused me not to follow through with it. But this might help you in getting PyFlink up and running since it contains an example s

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-05 Thread Fokko Driesprong
Hey everyone, First of all, thanks for all the votes. Regarding the discussion around the NOTICE: we all agree that when something is bundled, it needs to be added to the notice. However, Layne's Law of Debate comes into play: what's the definition of bundling? To e

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Fokko Driesprong
+1 (binding) - Checked the signature and the checksum - Ran the example notebooks against 0.6.1rc1 - Did some checks locally and all looks good! Thanks Honah for running the release! Kind regards, Fokko Op do 4 apr 2024 om 17:56 schr

Re: [PROPOSAL] Improvement on our PR flows

2024-03-20 Thread Fokko Driesprong
this? Kind regards, Fokko Driesprong Op wo 13 mrt 2024 om 13:17 schreef Renjie Liu : > Hi, JB: > > Your proposal looks great to me. We should definitely have a vote for a > proposal impacting the spec, and the model is great. > > On Tue, Mar 12, 2024 at 10:55 PM Jean-Baptiste Ono

Re: [DISCUSS] What do we plan for Iceberg 2.0.0 ?

2024-03-13 Thread Fokko Driesprong
Hey JB, Thanks for raising this. Sorry for the late reply, but I was OOO last week. I think in general the progress is being kept on the spec itself. Also, some features are already available (default values in Python, and nanosecond timestamps

Re: [DISCUSS] Iceberg board report - March 2024

2024-03-12 Thread Fokko Driesprong
Thanks Ryan, That looks comprehensive, thanks for taking the time to compile the report. I have a few suggestions for the release section: - Name the releases by name: Python → PyIceberg. If people want to look it up, just googling the name will bring them to it directly. - Split the rel

Re: [ANNOUNCE] Apache Iceberg release 1.5.0

2024-03-12 Thread Fokko Driesprong
Thanks for running the release Ajantha. It is great to see view support being released on the Java side 🎉 Thanks everyone for the hard work in making this release happen! Including all our new contributors! Kind regards, Fokko

New committer: Renjie Liu

2024-03-08 Thread Fokko Driesprong
Hi everyone, The Project Management Committee (PMC) for Apache Iceberg has invited Renjie Liu to become a committer and we are pleased to announce that he has accepted. We're very excited to have Renjie as a committer as he's leading the effort of bringing Iceberg to the Rust world. Being a commi

Re: [VOTE] Release Apache Iceberg 1.5.0 RC6

2024-03-08 Thread Fokko Driesprong
+1 (binding) Thanks again for working on this Ajantha and Eduard. - Checked checksum and signature - Ran a modified version of dbt-spark to take advantage of the views and it worked great! Cheers, Fokko Op za 9 mrt 2024 om 06:35 schreef Szehon Ho : > +1 (binding) > > * Verified signature > *

New committer: Bryan Keller

2024-03-05 Thread Fokko Driesprong
Hi everyone, The Project Management Committee (PMC) for Apache Iceberg has invited Bryan Keller to become a committer and we are pleased to announce that he has accepted. Bryan was contributing to Iceberg before it was even open-source, did a lot of work on the topic of metadata generation, and i

Re: [VOTE] Release Apache Iceberg 1.5.0 RC4

2024-03-01 Thread Fokko Driesprong
+1 (binding) - Checked checksum and signature - Ran a modified version of dbt-spark to take advantage of the views, and it worked like a charm! 🥳 Cheers, Fokko Op vr 1 mrt 2024 om 06:43 schreef Ajantha Bhat : > Gentle reminder. > > On Wed, Feb 28, 2024 at 8:34 PM Eduard Tudenhoefner > wrote: >

Re: Gravitino an Iceberg REST catalog service

2024-02-29 Thread Fokko Driesprong
Hey everyone, Thanks for raising this. I think a test-jar would be a great first step. We already maintain "service" considering JDBC, Hive, etc catalogs. REST Catalog ref impl in Iceberg would be the same. What I think Ryan means by a service is having to maintain Postgres (JDBC backend), Hive

Re: [VOTE] Release Apache Iceberg 1.5.0 RC0

2024-02-20 Thread Fokko Driesprong
Just using this thread to come back to the NOTICE discussion. This came also up with the latest Python release, and I spent quite a bit of time on it. If it's "used" section is not strictly required in NOTICE from a legal > perspective, the embedded dependencies should be mentioned (either > under

Re: [VOTE] Release Apache Iceberg Rust 0.2.0 RC1

2024-02-20 Thread Fokko Driesprong
error: no file found at: /home/blue/tmp/apache-iceberg-rust-0.2.0-src >>make: *** [Makefile:33: cargo-sort] Error 1 >> >> >> >> On Mon, Feb 19, 2024 at 11:00 AM Jack Ye wrote: >> >>> +1 (binding) >>> >>> Verified checksum, signatur

Re: [VOTE] Release Apache PyIceberg 0.6.0rc6

2024-02-19 Thread Fokko Driesprong
+1 (binding) I've checked signatures and checksums, checked the licenses, and did some checks around writing. Kind regards, Fokko Op ma 19 feb 2024 om 03:07 schreef Amogh Jahagirdar : > +1 non-binding > Verified signatures, checksum, and license > Ran unit/integ tests on Python 3.10.4 > Ran ad-

Java Iceberg 2.0: Hadoop upgrade

2024-02-16 Thread Fokko Driesprong
Hi everyone, I want to discuss adding the Hadoop upgrade to the list after moving to Iceberg 2.0. We still compile against Hadoop 2.7.3 to ensure we support as many users as possible. Hadoop 2.7.3 was released August 2016 and is not maintained anymore

[VOTE] Release Apache Iceberg Rust 0.2.0 RC1

2024-02-15 Thread Fokko Driesprong
Hello, Apache Iceberg Rust Community, This is a call for a vote to release Apache Iceberg Rust version 0.2.0. The tag to be voted on is 0.2.0-rc.1. This first release provides integration with the REST catalog and a lot of scaffolding that's needed for reading the data. The release candidate:

Re: [VOTE] Release Apache PyIceberg 0.6.0rc4

2024-02-11 Thread Fokko Driesprong
That makes sense. I've updated the PR: https://github.com/apache/iceberg-python/pull/410/ PTAL. Kind regards, Fokko Op zo 11 feb 2024 om 03:58 schreef Justin Mclean : > HI, > > For the Thrift and Hive ones, we have an optional dependency that ships > the content under the vendor/ directory: > ht

Re: [DISCUSS] iceberg-rust 0.2.0 release

2024-02-10 Thread Fokko Driesprong
Hey Renjie, That would be great. I'm happy to do the committer/PMC side of things. Let's coordinate on the release tracking issue: https://github.com/apache/iceberg-rust/issues/180 Kind regards, Fokko Driesprong Op wo 7 feb 2024 om 03:36 schreef Xuanwo : >

Re: [VOTE] Release Apache PyIceberg 0.6.0rc4

2024-02-10 Thread Fokko Driesprong
Hi Justin, Dan, Thanks for checking this. For the Avro one, we copied parts of the decompression and binary decoder for the internal PyIceberg implementation (that reads from an Iceberg schema, rather than from an Avro schema). I checked the Avro NOTICE, and there isn't anything relevant. I notic

Re: [DISCUSS] iceberg-rust 0.2.0 release

2024-02-06 Thread Fokko Driesprong
ss here. >> >> -Dan >> >> On Wed, Jan 31, 2024 at 9:36 AM Fokko Driesprong >> wrote: >> >>> I'm all for the 0.2.0 release. Kudos to all the work so far. While the >>> functionality is limited today, a lot of things are already in prog

Re: [Discuss] Change iceberg-python and iceberg-go CI Settings to only require approval for first time contributors

2024-02-02 Thread Fokko Driesprong
+1 Op vr 2 feb 2024 om 08:47 schreef Eduard Tudenhoefner : > +1 > > > > On Fri 2. Feb 2024 at 04:56 Drew wrote: > >> +1 >> >> Thanks for bringing this up for PyIceberg Honah >> >> On Thu, Feb 1, 2024 at 5:35 PM Honah J. wrote: >> >>> Hello everyone >>> >>> Inspired by our recent discussion rega

Re: [PROPOSAL] Create user mailing list ?

2024-02-02 Thread Fokko Driesprong
±0 for having a user mailing list. I don't believe that having more channels will lead to better support. I agree that the archiving capabilities of Slack are limited, and the search is sub-optimal. But we should also make sure that the questions asked are also integrated into the documentation. T

Re: [DISCUSS] Change iceberg-rust CI Settings to only require approval for new github users

2024-01-31 Thread Fokko Driesprong
much faster. Also from a reviewer perspective, I like to know if the CI passes before reviewing and this also takes a bit of time. I don't think there is much risk since the Actions have limited permissions, and all the repositories are actively looked at. Kind regards, Fokko Driesprong Op wo 31 j
