I'll start work on making a table with currently supported operations. There are a lot of gaps right now, but I'd like to start closing out some of the most important ones in the coming weeks/months.
On Tue, Mar 5, 2019 at 4:11 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
> We may want to do that eventually, but I think it isn't necessary right
> now. I also don't know how we would determine what must be implemented to
> form that minimum support threshold. Not having features means not being
> able to read tables that use them. If there is no Parquet support, then you
> can't read Parquet tables. Same thing with encryption. I don't see a
> minimum being valuable, compared to a table that makes it clear what is
> supported. We should probably make that table, though!
>
> On Thu, Feb 28, 2019 at 3:39 PM Xabriel Collazo Mojica <xcoll...@adobe.com>
> wrote:
>
>> Regarding:
>>
>> Would every feature added to the Java version need to be mirrored in
>> Python?
>>
>> I think that the spec should be used to coordinate across
>> implementations, but that those implementations can have different features
>> and degrees of support. It would be fine if python didn’t have support for
>> the encryption structures until someone needs that support from Python and
>> adds it. Otherwise, we’re asking too much of contributors: go fix this in
>> another language that you may not know or be comfortable in.
>>
>> Should then the spec have a feature support matrix stating minimum
>> support needed? As in the customary MAY, SHOULD, MUST, etc. [1]?
>>
>> [1]: https://www.ietf.org/rfc/rfc2119.txt
>>
>> *Xabriel J Collazo Mojica* | Senior Software Engineer | Adobe |
>> xcoll...@adobe.com
>>
>> *From: *Ryan Blue <rb...@netflix.com.INVALID>
>> *Reply-To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>, "
>> rb...@netflix.com" <rb...@netflix.com>
>> *Date: *Thursday, February 28, 2019 at 3:23 PM
>> *To: *Matt Cheah <mch...@palantir.com>
>> *Cc: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>
>> *Subject: *Re: [DISCUSS] Python implementation
>>
>> The only difficulty I can think of is that we will need to remove the
>> python directory from the source tarball when we build it. Shouldn't be a
>> big problem.
>>
>> rb
>>
>> On Thu, Feb 28, 2019 at 2:08 PM Matt Cheah <mch...@palantir.com> wrote:
>>
>> I’m wondering how significant the maintenance burden is for maintaining
>> two release cycles from the same repository? I would imagine that it would
>> be less burden concentrated in one place if we had separate repositories at
>> least to start with. Then when we have confidence in the readiness of the
>> Python work we can merge it into Iceberg proper and have the release
>> publish both versions.
>>
>> -Matt Cheah
>>
>> *From: *Daniel Weeks <daniel.c.we...@gmail.com>
>> *Reply-To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>
>> *Date: *Thursday, February 28, 2019 at 1:47 PM
>> *To: *"dev@iceberg.apache.org" <dev@iceberg.apache.org>, Ryan Blue <
>> rb...@netflix.com>
>> *Subject: *Re: [DISCUSS] Python implementation
>>
>> I agree with this approach.
>>
>> Since this is an entirely new implementation for python, it makes more
>> sense to take the initial version (pending any additional review/comments)
>> and then continue to iterate from that point.
>> It would be very difficult
>> to break up into smaller commits and work through incrementally without
>> adding a lot of value (though going forward we should lean into more
>> incremental contributions).
>>
>> I do think that Matt brings up some good points and initially I would
>> lean into keeping a single repo and if we find there are more contributions
>> in other languages that we reconsider separating the repos to keep them
>> from impacting releases.
>>
>> Also, want to call out a huge thanks to Ted for all the work they did to
>> contribute to this and Uwe for reviewing.
>>
>> -Dan
>>
>> On Thu, Feb 28, 2019, 12:26 PM Ryan Blue <rb...@netflix.com.invalid>
>> wrote:
>>
>> Hi everyone,
>>
>> One of our contributors, Ted, has done a lot of work on an initial python
>> implementation and Uwe was kind enough to review it. Here's the PR:
>>
>> https://github.com/apache/incubator-iceberg/pull/54
>>
>> Because this is a brand-new implementation, the PR is huge: 157 new
>> files. That makes it really tough to review in depth, and also really time
>> consuming to update and maintain. What I suggest is committing the PR as-is
>> now that it has passed a round of reviews. Then we can improve it in
>> smaller pull requests.
>>
>> Are there any objections to this plan or other thoughts?
>>
>> I think that the python implementation would not be included in the first
>> Apache Iceberg release. I would prefer to release the python implementation
>> on a separate release cycle so that Java blockers don't prevent a Python
>> bug fix and vice versa.
>>
>> rb
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>
> --
> Ryan Blue
> Software Engineer
> Netflix