What do you think about developing this cookbook in an Apache Arrow repository (it could be something like apache/arrow-cookbook, if not part of the main development repo)? Creating expanded documentation resources for learning how to use Apache Arrow to solve problems certainly seems within the bounds of the community's objectives.
On Wed, Jul 7, 2021 at 5:52 PM Alessandro Molina <alessan...@ursacomputing.com> wrote:
>
> We finally have a first preview of the cookbook available for R and Python.
> For anyone interested, the two versions are visible at
> http://ursacomputing.com/arrow-cookbook/py/index.html and
> http://ursacomputing.com/arrow-cookbook/r/index.html
> A new version of the cookbook is automatically published whenever a new
> recipe is added.
>
> After gathering feedback from interested parties and users, our plan for
> the next step would be to open a PR against the arrow repository and
> automate publishing the cookbook via GitHub Actions.
>
> At the moment the recipes implemented are nearly half of those that were
> identified in the dedicated Google Doc (
> https://docs.google.com/document/d/1v-jK_9osnLvAnAjLOM_frgzakjFhLpUi8OC0MlKpxzw/edit?ts=60c73189#heading=h.m7fas2talgy5
> ), so if you have recipes to suggest, feel free to leave comments on that
> document or suggest edits.
>
>
> On Mon, Jun 21, 2021 at 10:34 AM Alessandro Molina <
> alessan...@ursacomputing.com> wrote:
>
> > Hi,
> >
> > I'd like to share with the ML an idea that Nic Crane and I have been
> > experimenting with. It's still in the early stage, but we hope to turn it
> > into a PR for Arrow documentation soon.
> >
> > The idea is to work on a Cookbook, a collection of ready-made recipes on
> > how to use Arrow, that both end users and developers of third party
> > libraries can refer to when they need to look up "the arrow way" of doing
> > something.
> >
> > While the arrow documentation reports all features and functions that are
> > available in arrow, it's not always obvious to a new user how best to
> > combine them. Sometimes the solution ends up being more complicated than
> > necessary or performs badly due to non-obvious side effects like unexpected
> > memory copies etc.
> >
> > For this reason we thought about starting documentation that users can
> > refer to on how to combine arrow features to achieve the results they care
> > about.
> >
> > We wrote a short document explaining the idea at
> > https://docs.google.com/document/d/1v-jK_9osnLvAnAjLOM_frgzakjFhLpUi8OC0MlKpxzw/edit?usp=sharing
> >
> > The core idea behind the cookbook is that all recipes should be testable,
> > so it should be possible to add a CI phase for the cookbook that verifies
> > that all the recipes still work with the current version of Arrow and lead
> > to the expected results.
> >
> > At the moment we have started it in a separate repository (
> > https://github.com/ursacomputing/arrow-cookbook ), but we are not yet sure
> > whether it should live inside arrow/docs, in its own directory (i.e.
> > arrow/cookbook), or in its own repository. In the end it's fairly decoupled
> > from the rest of Arrow and the documentation, so a separate repository
> > would have the benefit of allowing a dedicated release cycle every time new
> > recipes are added (at least in the early phase).
> >
> > We are also looking for more ideas about recipes that would be good
> > candidates for inclusion, so if any of you has thoughts about which recipes
> > we should add, please feel free to comment on the document or reply by mail
> > suggesting more recipes.
> >
> > Any suggestion for improvements is appreciated! We hope to have something
> > we can release with the next Arrow release.
> >