I am torn on this.

On one hand, I am a big fan of components that are standalone - that have
no more dependencies than necessary, and are self-evidently
standalone. So, I think that re-absorbing sqlparser-rs back into
DataFusion would not be a good step. It would reduce the perception
that it is standalone.

On the other hand, it sounds as if sqlparser-rs would benefit by
having an Apache-like community around it. DataFusion isn't a perfect
fit - there is not much overlap between DataFusion and sqlparser-rs
users - but it takes a lot of effort to create and run a top-level
project, and DataFusion is already up and running.

The tension is that people want to consume components that they
perceive to be standalone, and yet the ASF wants to create communities
that produce either a single large component or sets of highly-coupled
components. The ASF used to do 'umbrella projects' whose sub-projects
were in the same subject area but had few or no dependencies on one
another. For example, Apache DB [ https://db.apache.org/ ] has JDO,
Derby and Torque, and Commons included many useful Java libraries.
Umbrella projects caused problems during the Jakarta and Hadoop eras,
and are now strongly discouraged at the ASF.

I think the ASF is wrong on this. I think it needs to provide a home
for medium-sized projects such as sqlparser-rs in an existing
top-level project; maybe those projects grow into top-level projects,
or maybe they remain medium-sized projects. This is especially
necessary in the Rust community, where there are many exciting
projects, but they are almost all happening outside the ASF. (This is
exactly where Java was in ~2005. Maybe we need a rust-commons or
rust-db?)

My conclusion is to leave sqlparser-rs where it is for now, but to
continue talking about what might be an attractive home for it in the ASF.

Julian

On Mon, Feb 26, 2024 at 8:12 AM Andrew Lamb <al...@influxdata.com> wrote:
>
> Sorry for the late reply,
>
> I think sqlparser-rs users are quite a bit more varied than DataFusion and
> there is not a large overlap between the contributors of the two projects.
> I currently seem to be the one reviewing / merging most sqlparser-rs
> reviews, and I would definitely love some more help.
>
> However, given that the project is not an Apache project, I did not have
> good luck attracting help.  A related discussion is here [1].
>
> If the DataFusion community would like to accelerate releases, we can also
> try to do that without bringing it into Apache governance. Specifically, it
> would be great to have help reviewing the PRs -- the actual release process
> is pretty low overhead. The reviews are what take the vast majority of the
> maintenance time.
>
> Andrew
>
> [1]: https://github.com/sqlparser-rs/sqlparser-rs/issues/818
>
>
>
> On Sat, Feb 17, 2024 at 4:44 PM Aldrin <octalene....@pm.me.invalid> wrote:
>
> > do users of sqlparser-rs mostly use datafusion? I don't know the
> > community, but it seems like it would be an annoying change for users who
> > use it with a different query engine. Just a thought
> >
> > Sent from Proton Mail <https://proton.me/mail/home> for iOS
> >
> >
> > On Sat, Feb 17, 2024 at 10:26, Andy Grove <andygrov...@gmail.com> wrote:
> >
> > I agree that it simplifies shipping new SQL features in DataFusion since we
> > can develop the changes in the parser concurrently with the changes in
> > other DataFusion crates and then release them all together.
> >
> > The name of the crate would not need to change, so downstream users should
> > see no impact.
> >
> > We would need to decide if we want to keep a separate version number or
> > bring it in line with DataFusion version numbers (I have no preference
> > either way).
> >
> >
> >
> > On Sat, Feb 17, 2024 at 11:09 AM Mehmet Ozan Kabak <o...@synnada.ai>
> > wrote:
> >
> > > Doing this will probably reduce the time-to-ship for DataFusion features
> > > that need parsing support due to increased convenience, so I’m inclined
> > to
> > > see it in a positive light.
> > >
> > > What would be the impact of doing this on people who use only
> > > sqlparser-rs, if any?
> > >
> > > > On Feb 17, 2024, at 7:16 PM, Andy Grove <andygrov...@gmail.com> wrote:
> > > >
> > > > The sqlparser-rs project [1] seems to have become the de-facto SQL
> > parser
> > > > for Rust, with almost 4 million downloads so far. This was originally
> > > part
> > > > of DataFusion very early on, and I moved it into a separate project
> > > because
> > > > it seemed useful for other projects. This was before DataFusion was
> > known
> > > > as a composable query engine, and with hindsight, I probably should
> > have
> > > > left it as part of the DataFusion project.
> > > >
> > > > Now that DataFusion has a reputation as a composable query engine, I
> > > think
> > > > it would make sense to move this code back into DataFusion, where it
> > > would
> > > > benefit from a larger community of maintainers.
> > > >
> > > > I would like to hear thoughts from the Apache Arrow / DataFusion
> > > community.
> > > > Does this seem like a good idea?
> > > >
> > > > Thanks,
> > > >
> > > > Andy.
> > > >
> > > > [1] https://github.com/sqlparser-rs/sqlparser-rs
> > >
> > >
> >
> >
