Hi Julian,

Thanks for posting your thoughts.

[As a Crail committer]: We agree that the notion of "we" creates confusion.
The Crail blog follows the trend in community projects, where a blog post
falls into one of two categories. In the first, a developer talks about
recent improvements, features, performance evaluations, etc. In the
second, a user presents how they used the system for their use case. The
Albis blog post falls into the second category. We can (and, for future
reference, should) categorize posts and clearly mark them that way. We
would also encourage anyone in the community who tries Crail to reach out
to us and present their story on the Crail blog. Crail is committed to
providing the best possible performance to all its users, be it Albis,
Arrow, ORC, or Parquet.

[As a developer of Albis and user of Crail]: I understand your sentiment
regarding the format wars, and it is not the aim of Albis to establish yet
another file format. Albis started as a prototype to quickly "explore"
various design choices for storing relational data for a variety of
scenarios with high-performance storage/networking devices - the kind of
devices Crail targets. This is something that I cannot easily do with
Arrow, ORC, or Parquet with HDFS (or something similar) within a reasonable
effort and time frame, as they have all already chosen certain design
points and trade-offs. Crail and Albis are not tied to each other, nor is
either preferred over other choices, though since both come from the same
set of developers, I can see why the confusion arises. Having said this, I
will be happy to contribute the findings from Albis back to the Arrow
community, and would appreciate any help with that. I had a brief
discussion with Julien Le Dem at the last DataWorks Summit in San Jose
about Albis as well. I have not done a thorough investigation of Arrow
over Crail, but perhaps that is something that can be picked up now as a
starting point.

I hope this clarifies the confusion. We will fix the blog post.

Thanks,
--
Animesh

On Tue, Sep 4, 2018 at 9:59 PM Julian Hyde <[email protected]> wrote:

> I just read the blog post [1] about Crail and file formats. (I have to
> declare my interests up front: I have been a huge supporter of Apache
> Arrow, and I am a PMC member. I’m speaking here as an Arrow contributor and
> enthusiast, not as a mentor of Crail.)
>
> I am a bit troubled about the endorsement of Albis in a Crail blog post.
> For example, "we have developed a new file format called Albis”. Since the
> blog post is not signed, I take it that “We” means the authors of the paper
> [2] mentioned in the blog post. But I hope that “we” does not mean “we as
> Crail committers and PMC members".
>
> I know that there are different forces at play if you work for a
> corporation, or are a researcher, or are an idealistic open source. As a
> researcher, you need to invent new stuff and prove that it is better than
> everything that has been done before.
>
> But I’ve been through the file format wars — ORC vs Parquet — driven in
> large part by two competing vendors. It was sickening, and a huge waste of
> effort. Please, please don’t let this happen again. If you want to make
> Crail successful, you should make it absolutely clear to the Arrow, ORC and
> Parquet communities that you will help to make Crail work as well as it
> possibly can.
>
> Also, on paper Albis looks very similar to Arrow, and the performance gap
> is fairly narrow. If you have found insights that would improve Arrow, I
> encourage you to share them and make Arrow better. It may be good research
> practice to accentuate the differences between the two, but it’s good open
> source practice to find consensus between technologies, and merge
> communities. There is a lot of work to be done, and too few people to do it.
>
> Lastly, I know I seem to be giving mixed messages here. I do believe that
> content about Crail will help drive engagement and build community
> (controversial content even more so). I am delighted that the Crail team is
> writing blog posts and posting them to Twitter. But be careful not to
> alienate communities that could help Crail gain widespread adoption.
>
> Julian
>
> [1] http://crail.incubator.apache.org/blog/2018/08/sql-p1.html
>
> [2] https://www.usenix.org/conference/atc18/presentation/trivedi
