One of the possible aims for the cookbook is to interlink the function
docs and the cookbook.  Both the R and Python docs include tests that check
that every output matches what is expected, so we can immediately see if a
code change renders any recipe incorrect.  Decoupling cookbook updates from
docs updates may therefore not be necessary.
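
As a rough illustration of what a tested recipe can look like, here is a
minimal sketch using Python's standard doctest module.  This is an assumption
for illustration only, not the cookbook's actual test setup; the names
column_mean and run_recipe_tests are hypothetical.

```python
import doctest


def column_mean(values):
    """Hypothetical recipe: compute the mean of a column of numbers.

    The expected output is written next to the call, so a code change
    that alters the result makes the recipe's test fail.

    >>> column_mean([1, 2, 3, 4])
    2.5
    """
    return sum(values) / len(values)


def run_recipe_tests():
    """Run the examples in column_mean's docstring.

    Returns a (failed, attempted) tuple.
    """
    parser = doctest.DocTestParser()
    # Build a doctest from the recipe's docstring, exposing the recipe itself.
    test = parser.get_doctest(
        column_mean.__doc__, {"column_mean": column_mean}, "column_mean", None, 0
    )
    result = doctest.DocTestRunner(verbose=False).run(test)
    return result.failed, result.attempted
```

A CI job could call run_recipe_tests() and fail the build whenever the first
element of the returned tuple is non-zero.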

That said, there has been mention of having versions of the cookbook tied
to released versions of Arrow, which sounds like a great idea.

The repo also includes a Makefile that handles all the relevant setup, which
should hopefully simplify things for users.  The R cookbook uses bookdown,
which lets a reader click an 'edit' button that automatically creates a fork
where they can edit the cookbook and submit a PR directly from GitHub.

It'd be great to see a lot of recipes in multiple languages.  However, in the
document of possible recipes circulated previously, we identified slightly
different recipe needs for R and Python.  This may be further complicated by
writing for slightly different audiences: from what I understand, the
pyarrow implementation may be geared more towards people building on top of
the low-level bindings, whereas in R we have both that audience and folks
who just want to make their dplyr code run faster without needing to know
much about the details of Arrow.

I wonder, though, if we could still achieve that by having an additional
page that points to the recipes that *are* common to both cookbooks.

On Thu, 8 Jul 2021 at 10:07, Antoine Pitrou <anto...@python.org> wrote:

>
> Hi Rares,
>
> Documentation bugs and improvement requests are welcome, feel free to
> file them on the JIRA!
>
> Regards
>
> Antoine.
>
>
> Le 08/07/2021 à 01:45, Rares Vernica a écrit :
> > Awesome! We would find C++ versions of these recipes very useful. From our
> > experience the C++ API is much much harder to deal with and error prone
> > than the R/Python one.
> >
> > Cheers,
> > Rares
> >
> > On Wed, Jul 7, 2021 at 9:07 AM Alessandro Molina <
> > alessan...@ursacomputing.com> wrote:
> >
> >> Yes, that was mostly what I meant when I wrote that the next step is
> >> opening a PR against the apache/arrow repository itself :D
> >> We moved forward in a separate repository initially to be able to cycle
> >> more quickly, but we reached a point where we think we can start
> >> integrating the cookbook with the Arrow documentation itself.
> >>
> >> If instead it's preferred to move forward the effort into its own separated
> >> repository (apache/arrow-cookbook) that's an option too, we are open to
> >> suggestions from the community.
> >>
> >> On Wed, Jul 7, 2021 at 5:57 PM Wes McKinney <wesmck...@gmail.com>
> wrote:
> >>
> >>> What do you think about developing this cookbook in an Apache Arrow
> >>> repository (it could be something like apache/arrow-cookbook, if not
> >>> part of the main development repo)? Creating expanded documentation
> >>> resources for learning how to use Apache Arrow to solve problems seems
> >>> certainly within the bounds of the community's objectives.
> >>>
> >>> On Wed, Jul 7, 2021 at 5:52 PM Alessandro Molina
> >>> <alessan...@ursacomputing.com> wrote:
> >>>>
> >>>> We finally have a first preview of the cookbook available for R and
> >>> Python,
> >>>> for anyone interested the two versions are visible at
> >>>> http://ursacomputing.com/arrow-cookbook/py/index.html and
> >>>> http://ursacomputing.com/arrow-cookbook/r/index.html
> >>>> A new version of the cookbook is automatically published on each new
> >>> recipe.
> >>>>
> >>>> After gathering feedback from interested parties and users, our plan
> >> for
> >>>> the next step would be to open a PR against the arrow repository and
> >>>> automate publishing the cookbook via github actions.
> >>>>
> >>>> At the moment the recipes implemented are nearly half of those that were
> >>>> identified in the dedicated Google Docs (
> >>>> https://docs.google.com/document/d/1v-jK_9osnLvAnAjLOM_frgzakjFhLpUi8OC0MlKpxzw/edit?ts=60c73189#heading=h.m7fas2talgy5
> >>>> ) so if you have recipes to suggest feel free to leave comments on that
> >>>> document or suggest edits.
> >>>>
> >>>>
> >>>> On Mon, Jun 21, 2021 at 10:34 AM Alessandro Molina <
> >>>> alessan...@ursacomputing.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I'd like to share with the ML an idea which me and Nic Crane have
> >> been
> >>>>> experimenting with. It's still in the early stage, but we hope to
> >> turn
> >>> it
> >>>>> into a PR for Arrow documentation soon.
> >>>>>
> >>>>> The idea is to work on a Cookbook, a collection of ready made
> >> recipes,
> >>> on
> >>>>> how to use Arrow that both end users and developers of third party
> >>>>> libraries can refer to when they need to look up "the arrow way" of
> >>> doing
> >>>>> something.
> >>>>>
> >>>>> While the arrow documentation reports all features and functions that
> >>> are
> >>>>> available in arrow, it's not always obvious how to best combine them
> >>> for a
> >>>>> new user. Sometimes the solution ends up being more complicated than
> >>>>> necessary or performs badly due to not obvious side effects like
> >>> unexpected
> >>>>> memory copies etc.
> >>>>>
> >>>>> For this reason we thought about starting a documentation that users
> >>> can
> >>>>> refer to on how to combine arrow features to achieve the results they
> >>> care
> >>>>> about.
> >>>>>
> >>>>> We wrote a short document explaining the idea at
> >>>>>
> >>>
> >>
> https://docs.google.com/document/d/1v-jK_9osnLvAnAjLOM_frgzakjFhLpUi8OC0MlKpxzw/edit?usp=sharing
> >>>>>
> >>>>> The core idea behind the cookbook is that all recipes should be
> >>> testable,
> >>>>> so it should be possible to add a CI phase for the cookbook that
> >>> verifies
> >>>>> that all the recipes still work with the current version of Arrow and
> >>> lead
> >>>>> to the expected results.
> >>>>>
> >>>>> At the moment we started it in a separate repository (
> >>>>> https://github.com/ursacomputing/arrow-cookbook ), but we are yet
> >>> unsure
> >>>>> if it should live inside arrow/docs or its own directory (IE:
> >>>>> arrow/cookbook) or its own repository. In the end it's fairly
> >> decoupled
> >>>>> from the rest of Arrow and the documentation, which would have the
> >>> benefit
> >>>>> of allowing a dedicated release cycle every time new recipes are
> >> added
> >>> (at
> >>>>> least in the early phase).
> >>>>>
> >>>>> We are also looking for more ideas about recipes that would be good
> >>>>> candidates for inclusion, so if any of you has thoughts about which
> >>> recipes
> >>>>> we should add please feel free to comment on the document or reply by
> >>> mail
> >>>>> suggesting more recipes.
> >>>>>
> >>>>> Any suggestion for improvements is appreciated! We hope to have
> >>> something
> >>>>> we can release with the next Arrow release.
> >>>>>
> >>>
> >>
> >
>


-- 
Nic Crane
_______________________
@nic_crane <https://twitter.com/nic_crane>
https://thisisnic.github.io/
