Hi,

Thanks for your input.

Every time there is a new major release, all new development shifts to
that new API and users of previous APIs are left behind. It is not just a
matter of SemVer and the size of version numbers; the whole development
effort moves on top of the new API. I disagree that software that has a
major release every 3 months and no maintenance window for previous
versions is stable.

I alluded to the Tokio example because Tokio 1.0 recently became the
runtime of Rust-based AWS Lambda functions [1]; this commitment is only
possible by enforcing API stability and maintenance beyond a 3-month period
(at least 3 years in their case).

Also, imo the current major version number is not meaningless: divided by
the software's age, it gives the historical release rate, which is usually
a good predictor of the pattern used in future releases.

The evidence is that we haven't been able to support any version for any
period of time; recently, Andrew has been doing amazing work supporting the
latest version for a period of 3 months. I.e. an application that depends
on `arrow = ^5.0` has a support window of 3 months. Given that we have not
backported any security fixes to previous versions, it is reasonable to
assume that security patches are also applied within a 3-month period only.

As a contributor to arrow2, I would rather not have arrow2 under Apache
Arrow than have to release it under its current versioning and scheduling
(this is similar to some of Julia's concerns). As a contributor to Apache
Arrow, I currently cannot guarantee a maintenance window for arrow-rs for
any period of time because it is unsafe by design and I do not have the
motivation to fix it. As both, I am confident that the core of arrow2 will
soon reach a point where we can live with it and develop on top of it for
at least a year. This is not true of the whole API surface, though: there
are APIs that we will need to change more often until stability can be
promised.
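To make the `arrow = ^5.0` point concrete, here is a minimal sketch of how a
caret requirement resolves. It assumes Cargo's documented caret rule for full
`major.minor.patch` requirements (the left-most non-zero component must not
change) and ignores pre-release tags. `Version` and `caret_matches` are
hypothetical helpers written for this illustration; they are not part of
Cargo or of the `semver` crate.

```rust
/// A plain three-part version; pre-release and build metadata are ignored.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

impl Version {
    pub const fn new(major: u64, minor: u64, patch: u64) -> Self {
        Version { major, minor, patch }
    }
}

/// Returns true if `candidate` satisfies the caret requirement `^req`,
/// where `req` is a full `major.minor.patch` version.
pub fn caret_matches(req: Version, candidate: Version) -> bool {
    // Never resolve to something older than what was requested.
    // The derived ordering compares major, then minor, then patch.
    if candidate < req {
        return false;
    }
    if req.major > 0 {
        // ^5.0.0 accepts any 5.x.y at or above 5.0.0, but not 6.0.0.
        candidate.major == req.major
    } else if req.minor > 0 {
        // ^0.2.0 accepts 0.2.x only: in 0.X, a minor bump is breaking.
        candidate.major == 0 && candidate.minor == req.minor
    } else {
        // ^0.0.3 accepts exactly 0.0.3.
        candidate.major == 0 && candidate.minor == 0 && candidate.patch == req.patch
    }
}
```

Under this rule, a dependency on `^5.0.0` picks up 5.0.1 or 5.3.0 but never
6.0.0, so a fix that is only released in 6.x does not reach an application
pinned to the 5.x line unless it is backported.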
So, I am requesting that we tie the discussion of arrow2 to how it will be
released. Could a middle ground be somewhere along the lines of splitting
the crate into smaller crates that are versioned independently? I.e.
continue to release `arrow` under the same versioning and cadence, and
create 3 new crates, arrow-core, arrow-compute, and arrow-io (see also
[2]), based on arrow2's code base, that would have their own versioning at
0.X until stability is achieved. The migration of the `arrow` crate to
arrow2's API would be to re-export from the smaller crates (e.g.
`pub use arrow_core::array`).

[1] https://crates.io/crates/lambda_runtime/0.3.1/dependencies
[2] https://github.com/jorgecarleitao/arrow2/issues/257

Best,
Jorge

On Thu, Aug 5, 2021 at 11:53 PM Adam Lippai <a...@rigo.sk> wrote:

> Not taking sides, just two technical notes below.
>
> Semver.org clearly defines
> (https://semver.org/#how-do-i-know-when-to-release-100) the versions >1.0.0.
> * If it's used in production, it's 1.0.0.
> * If it provides an API others depend on then it's 1.0.0.
> * If you intend to keep backward compatibility, it's 1.0.0.
> TL;DR: 1.0.0 represents the version from which point we guarantee that
> non-production releases are marked (alpha, beta, rc) and that breaking
> (API), backwards-incompatible changes result in a major version bump. This
> we already do, 4x per year.
>
> The second fact is that arrow2 uses the arrow name, but it doesn't have
> apache governance. It's not released from GitHub.com/apache, there are no
> formal releases, there are no votes. This is not correct or fair usage of
> the brand (on the same level as DataFuse, or db-benchmark calling a custom
> R implementation arrow) even if it's "unofficial". My understanding is that
> arrow2 can be an unofficial implementation with a different name or an
> arrow-rs experiment with the intention to merge the code, but not both.
>
> I think both issues could be solved and I really value and like the arrow2
> work so far. That's the right way. I hope we'll see it in prod either way
> as soon as it's ready.
>
> Best regards,
> Adam Lippai
>
> On Wed, Aug 4, 2021, 08:25 QP Hou <houqp....@gmail.com> wrote:
>
> > Just my two cents.
> >
> > I think we all have the same goal here, which is to accelerate the
> > transitioning of arrow to arrow2 as the official arrow rust
> > implementation.
> >
> > In my opinion, the biggest gain we can get from merging two projects
> > into one repo is to have some kind of a policy to enforce that every
> > new feature/test added to the current arrow implementation also needs
> > to be added to the arrow2 implementation. This way, we can make sure
> > the gap between arrow and arrow2 is closing on every iteration.
> > Without this, I tend to agree with Jorge that merging two repos would
> > add more overhead to his work and slow him down.
> >
> > For those who want to contribute to arrow2 to accelerate the
> > transition, I don't think they would have a problem sending PRs to the
> > arrow2 repo. For those who are not interested in contributing to
> > arrow2, merging the arrow2 code base into the current arrow-rs repo
> > won't incentivize them to contribute. Merging arrow2 into the current
> > arrow-rs repo could help with discovery. But I think this can be
> > achieved by adding a big note in the current arrow-rs README to
> > encourage contributions to the arrow2 repo as well.
> >
> > At the end of the day, Jorge is currently the sole active contributor
> > to the arrow2 implementation, so I think he would have the most say on
> > what's the most productive way to push arrow2 forward. The only
> > concern I have with regards to merging arrow2 into arrow-rs right now
> > is that Jorge spends all the effort to do the merge, and then it turns
> > out that he is still the only active contributor to arrow2 within
> > arrow-rs, but with more overhead that he has to deal with.
> >
> > As for maintaining semantic versioning for arrow2, Andy had a good
> > point that we could still release arrow2 with its own versioning even
> > if we merge it into the arrow-rs repo. So I don't think we should
> > worry/focus too much about versioning in our discussion. Velocity to
> > close the gap between arrow-rs and arrow2 is the most important thing.
> >
> > Lastly, I do agree with Andrew that it would be good to only maintain
> > a single arrow crate in crates.io in the long run. As he mentioned,
> > when the current arrow2 code base becomes stable, we could still
> > release it under the arrow namespace in crates.io with a major version
> > bump. The absolute value in the major version doesn't really matter as
> > long as we stick to the convention that a breaking change will result
> > in a major version bump.
> >
> > Thanks,
> > QP
> >
> > On Tue, Aug 3, 2021 at 5:31 PM paddy horan <paddyho...@hotmail.com> wrote:
> >
> > > Hi Jorge,
> > >
> > > I see value in consolidating development in a single repo and
> > > releasing under the existing arrow crate. Regarding versioning, I
> > > think once we follow semantic versioning we are fine. I don't think
> > > it's worth migrating to a different repo and crate to comply with the
> > > de-facto standard you mention.
> > >
> > > Just one person's opinion though,
> > > Paddy
> > >
> > > -----Original Message-----
> > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > Sent: Tuesday, August 3, 2021 5:23 PM
> > > To: dev@arrow.apache.org
> > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > >
> > > Hi Paddy,
> > >
> > > > What do you think about moving Arrow2 into the main Arrow repo where
> > > > it is only enabled via an "experimental" feature flag?
> > >
> > > AFAIK this is already possible:
> > > * add `arrow2 = { version = "0.2.0", optional = true }` to Cargo.toml
> > > * add `#[cfg(feature = "arrow2")]\npub mod arrow2;\n` to lib.rs
> > >
> > > We do this kind of thing to expose APIs from non-arrow crates such as
> > > parts of the parquet-format-rs crate, and it is generally the way to
> > > go when a crate wants to expose a third-party API.
> > >
> > > I would not recommend doing this, though: by exposing arrow2 from
> > > arrow, we double the compilation time and binary size of all
> > > dependencies that activate the flag. Furthermore, there are users of
> > > arrow2 that do not need the arrow crate, which this model would not
> > > support.
> > >
> > > AFAIK where development happens is unrelated to this aspect, Rust
> > > enables this by design.
> > >
> > > > but also this would be a clear signal that Arrow2 is <1.0.
> > > > the experimental flag will be a clear signal to the existing Arrow
> > > > community that Arrow2 is the future but that it is <1.0
> > >
> > > arrow2 is already <1.0 <https://crates.io/crates/arrow2>.
> > > My argument is that the arrow/arrow-flight/parquet are not versioned
> > > according to the Rust community standards: It is a de facto practice
> > > in Rust to delay major releases until the API is stable.
> > > Tokio's blog post about their 1.0
> > > <https://tokio.rs/blog/2020-12-tokio-1-0> (i.e. "[...] we commit to
> > > holding back on a Tokio 2.0 release for at least 3 years."). 10 most
> > > downloaded crates:
> > >
> > > * https://crates.io/crates/rand (0.8.4)
> > > * https://crates.io/crates/syn (1.0.74)
> > > * https://crates.io/crates/libc (0.2.98)
> > > * https://crates.io/crates/rand_core (0.6.3)
> > > * quote (1.0.9)
> > > * unicode-xid (0.2.2)
> > > * proc-macro2 (1.0.28)
> > > * cfg-if (1.0.0)
> > > * https://crates.io/crates/serde (1.0.126)
> > > * bitflags (1.2.1)
> > >
> > > These are small crates with a small scope, but even larger projects
> > > share the same pattern:
> > >
> > > * crossbeam <https://crates.io/crates/crossbeam> (0.8.1)
> > > * rocket <https://crates.io/crates/rocket> (0.5)
> > > * polars <https://crates.io/crates/polars> (0.14.8)
> > > * tower <https://crates.io/crates/tower> (0.4.8)
> > > * Tokio <https://crates.io/crates/tokio> (1.9.0)
> > > * hyper <https://crates.io/crates/hyper> (0.14.11)
> > >
> > > Crates that arrow depends on
> > > <https://github.com/apache/arrow-rs/blob/master/arrow/Cargo.toml>,
> > > that DataFusion depends on
> > > <https://github.com/apache/arrow-datafusion/blob/master/datafusion/Cargo.toml>,
> > > all share the same pattern of being either 0.X, 1.X when their API is
> > > stable, and 2.X when they needed a large change in the API. This
> > > contrasts with Apache Arrow's releases where we are now at 5.0 (and we
> > > have yet to arrive at a safe design).
> > >
> > > > existing users will be well supported in this transition
> > >
> > > How so? imo people either PR to the arrow/arrow2 code base or they
> > > won't.
> > > This is largely independent of where the development of either arrow2
> > > or arrow happens; people google the crate, click on the repository
> > > link and file an issue or a PR.
> > >
> > > > In general, I think the longer that development proceeds in separate
> > > > repos the harder it will be to eventually merge the two in a way that
> > > > supports existing users.
> > >
> > > How so? I may be mistaken, but API design is unrelated to which repo
> > > the development happens on: it is primarily driven by who is designing
> > > it and from where or who they are inspired by.
> > > Both arrow and parquet's crate design are inspired by the C++
> > > implementation and have gradually been migrated to "idiomatic" Rust,
> > > as "idiomatic" is becoming more well defined in Rust.
> > > Arrow2 is inspired by the current crate and the pains of using it in
> > > DataFusion. Datafuse, a fork of datafusion, recently migrated to arrow2
> > > <https://github.com/datafuselabs/datafuse/pull/1239>: +1,947 −3,484,
> > > which shows that the crate is capturing important patterns from the
> > > arrow crate and exposing ones that are useful / result in less code
> > > for the same or higher performance.
> > >
> > > On the opposite side, merging the development of crates under the same
> > > repo leads to: more triaging of PRs; more work for releases and
> > > changelogging; tagging based on crates; multiple READMEs in subpaths
> > > of the repo, curation of the CI to accommodate this, a workspace with
> > > many crates each with its own set of dependencies, increasing
> > > compilation and development; mixed commit logs, difficulties in
> > > reverts and cherry-picks; more difficult to find stuff in the repo.
> > > See e.g. how tokio-rs does it: <https://github.com/tokio-rs>, even for
> > > small crates like bytes <https://github.com/tokio-rs/bytes>.
> > >
> > > Best,
> > > Jorge
> > >
> > > On Tue, Aug 3, 2021 at 3:13 PM paddy horan <paddyho...@hotmail.com> wrote:
> > >
> > > > Hi Jorge,
> > > >
> > > > What do you think about moving Arrow2 into the main Arrow repo where
> > > > it is only enabled via an "experimental" feature flag? This would
> > > > allow development of Arrow2 to proceed in the main repo but also this
> > > > would be a clear signal that Arrow2 is <1.0. When we feel ready (i.e.
> > > > Arrow2 is 1.0) we can release it in the next main release with Arrow2
> > > > being the default and move the existing implementation behind a
> > > > "legacy" feature flag.
> > > >
> > > > Here is why I think this might work well:
> > > > - People contributing to the Arrow project will naturally contribute
> > > >   to Arrow2. At the moment, some people will still contribute to Arrow
> > > >   instead of Arrow2 just by virtue of it being the "official"
> > > >   implementation. However, if both are in one repo people will want to
> > > >   contribute to the "future", i.e. Arrow2.
> > > > - the experimental flag will be a clear signal to the existing Arrow
> > > >   community that Arrow2 is the future but that it is <1.0
> > > > - existing users will be well supported in this transition
> > > > - In general, I think the longer that development proceeds in
> > > >   separate repos the harder it will be to eventually merge the two in
> > > >   a way that supports existing users.
> > > >
> > > > Do you think this would work?
> > > >
> > > > Paddy
> > > >
> > > > -----Original Message-----
> > > > From: Jorge Cardoso Leitão <jorgecarlei...@gmail.com>
> > > > Sent: Monday, August 2, 2021 1:59 PM
> > > > To: dev@arrow.apache.org
> > > > Subject: Re: [Discuss] [Rust] Arrow2/parquet2 going foward
> > > >
> > > > Hi,
> > > >
> > > > Sorry for the delay.
> > > >
> > > > If there is a path towards an official release under a <1.0.0
> > > > versioning schema aligned with the rest of the Rust ecosystem and in
> > > > line with the stability of the API, then IMO we should move all
> > > > development to within Apache experimental asap (I can handle this and
> > > > the likely IP clearance round). If we require a release >=1.X.Y to it
> > > > and/or a schedule, then I prefer to keep expectations aligned and
> > > > postpone any movement.
> > > >
> > > > Under the move situation, I was thinking of something as follows:
> > > >
> > > > * gradually stop maintaining "arrow" in crates, offering a
> > > >   maintenance window over which we release patches (*)
> > > > * work towards achieving feature parity on arrow2/parquet2 on the
> > > >   experimental repos.
> > > > * keep releasing arrow2/parquet2 under a 0.X model during the step
> > > >   above (**)
> > > > * migrate to arrow-rs and archive experimentals (***)
> > > > * break arrow2 in smaller crates so that we can version the APIs at a
> > > >   different cadence
> > > > * once a crate reaches some stability (this is always opinionated,
> > > >   but it is fine), we bump it to 1.0 and announce a maintenance plan
> > > >   ala tokio <https://tokio.rs/blog/2020-12-tokio-1-0>.
> > > >
> > > > (*) e.g. "we will continue to patch the arrow crate up to at least 6
> > > > months starting after the first release of arrow2 that supports
> > > > a) nested parquet read and write
> > > > b) union array (including IPC integration tests)
> > > > c) map array (including IPC integration tests)"
> > > >
> > > > (**) officially or un-officially (I would suggest officially so that
> > > > we can acknowledge everyone's work on it, but no strong feelings)
> > > >
> > > > (***) something like:
> > > > 1. place arrow2 on top of a clear arrow repo so that the full
> > > >    contribution history up to that point is preserved
> > > > 2. make arrow-rs the home of arrow2 (i.e. we start releasing arrow2
> > > >    from arrow-rs) and archive the experimental repos; create
> > > >    arrow-rs-parquet or something for parquet2.
> > > >
> > > > In summary, the core pain point for me is the current versioning of
> > > > arrow, which I feel is incompatible with my goals for arrow2 and the
> > > > ecosystem I envision it supporting :)
> > > >
> > > > Best,
> > > > Jorge
> > > >
> > > > On Fri, Jul 30, 2021 at 8:44 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > > > I think it would also be fine to push "beta" arrow2 crates out of a
> > > > > repo under apache/ so long as they are not marked on crates.io as
> > > > > being Apache-official releases. There's a possible slippery slope
> > > > > there, but as long as we are on a path to formalizing the releases
> > > > > I think it is okay.
> > > > >
> > > > > On Fri, Jul 30, 2021 at 1:07 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > >
> > > > > > Jorge -- do you feel like we have a resolution on what to do with
> > > > > > arrow2 in the near term?
> > > > > >
> > > > > > The current state of affairs seems to me that arrow2 is released
> > > > > > from https://github.com/jorgecarleitao/arrow2 to crates.io (which
> > > > > > is fine). Are you happy with keeping development in the
> > > > > > jorgecarleitao repo where you will retain maximal control and
> > > > > > flexibility until it is ready to start integrating?
> > > > > >
> > > > > > Or would you prefer to put it into one of the apache repos and
> > > > > > subject its development and release to the normal Arrow
> > > > > > governance model (tarball, vote, etc)?
> > > > > >
> > > > > > Since you are the primary author/architect I think you should
> > > > > > have a substantial say at this stage.
> > > > > >
> > > > > > Andrew
> > > > > >
> > > > > > On Tue, Jul 27, 2021 at 7:16 PM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > >
> > > > > > > I would be happy with this approach. Thank you for the
> > > > > > > suggestion.
> > > > > > >
> > > > > > > This hybrid approach of both arrow and arrow2 in the same repo
> > > > > > > seems better to me than separate repos.
> > > > > > >
> > > > > > > What I really care about is ensuring we don't have two
> > > > > > > crates/APIs indefinitely -- as long as we are continually
> > > > > > > making progress towards unification that is what is important
> > > > > > > to me.
> > > > > > >
> > > > > > > Andrew
> > > > > > >
> > > > > > > On Tue, Jul 27, 2021 at 1:40 PM Andy Grove <andygrov...@gmail.com> wrote:
> > > > > > >
> > > > > > >> Apologies for being late to this discussion.
> > > > > > >>
> > > > > > >> There is a hybrid option to consider here where we add the
> > > > > > >> arrow2 code into the arrow crate as a separate module, so we
> > > > > > >> release one crate containing the "old" API (which we can mark
> > > > > > >> as deprecated) as well as the new API. Java did a similar
> > > > > > >> thing a long time ago with "java.io" versus "java.nio" (new
> > > > > > >> IO).
> > > > > > >>
> > > > > > >> I agree that the versioning wouldn't be ideal, but this seems
> > > > > > >> like it might be a pragmatic compromise?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >>
> > > > > > >> Andy.
> > > > > > >>
> > > > > > >> On Tue, Jul 20, 2021 at 5:41 AM Andrew Lamb <al...@influxdata.com> wrote:
> > > > > > >>
> > > > > > >> > What I meant is that when you decide arrow2 is suitable for
> > > > > > >> > release to existing arrow users, I stand ready to help you
> > > > > > >> > incorporate it into arrow.
> > > > > > >> >
> > > > > > >> > All the feedback I have heard so far from the rest of the
> > > > > > >> > community is that we are ready. One might even say we are
> > > > > > >> > anxious to do so :)
> > > > > > >> >
> > > > > > >> > Andrew