Re: state of the art in org-mode tables e.g. join, etc
That is remarkably slim code to get those results! Cook, Malcolm writes: > John, > > Checkout what R sqldf package makes easy: > > ** aggregation example > > Examples from https://github.com/tbanel/orgaggregate > > > #+NAME: original > | Day | Color | Level | Quantity | > |---+---+---+--| > | Monday| Red |30 | 11 | > | Monday| Blue |25 |3 | > | Tuesday | Red |51 | 12 | > | Tuesday | Red |45 | 15 | > | Tuesday | Blue |33 | 18 | > | Wednesday | Red |27 | 23 | > | Wednesday | Blue |12 | 16 | > | Wednesday | Blue |15 | 15 | > | Thursday | Red |39 | 24 | > | Thursday | Red |41 | 29 | > | Thursday | Red |49 | 30 | > | Friday| Blue | 7 |5 | > | Friday| Blue | 6 |8 | > | Friday| Blue |11 |9 | > > #+PROPERTY: header-args:R :session *R* > > #+begin_src R :results none > library(sqldf) > #+end_src > > > #+begin_src R :var original=original :colnames yes > sqldf('select Color, count(*) from original group by Color;') > #+end_src > > #+RESULTS: > | Color | count(*) | > |---+--| > | Blue |7 | > | Red |7 | > > > > ** join example > > Example from https://github.com/tbanel/orgtbljoin > > #+name: nutrition > | type | Fiber | Sugar | Protein | Carb | > |--+---+---+-+--| > | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | > | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | > | onion| 1.3 | 4.4 | 1.3 | 9.0 | > | egg | 0 | 18.3 |31.9 | 18.3 | > | rice | 0.2 | 0 | 1.5 | 16.0 | > | bread| 0.7 | 0.7 | 3.3 | 16.0 | > | orange | 3.1 | 11.9 | 1.3 | 17.6 | > | banana | 2.1 | 9.9 | 0.9 | 18.5 | > | tofu | 0.7 | 0.5 | 6.6 | 1.4 | > | nut | 2.6 | 1.3 | 4.9 | 7.2 | > | corn | 4.7 | 1.8 | 2.8 | 21.3 | > > > #+name: recipe > | type | quty | > |--+--| > | onion| 70 | > | tomatoe | 120 | > | eggplant | 300 | > | tofu | 100 | > > > #+begin_src R :var recipe=recipe :var nutrition=nutrition :colnames yes > sqldf('select * from recipe, nutrition where recipe.type=nutrition.type') > #+end_src > > #+RESULTS: > | type | quty | type | Fiber | Sugar | Protein | Carb | > |--+--+--+---+---+-+--| > | onion| 70 | onion| 1.3 | 4.4 | 1.3 |9 | > | tomatoe | 120 | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | > | eggplant | 300 | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | > | tofu | 100 | tofu | 0.7 | 0.5 | 6.6 | 1.4 | > > > > This should also be possible but I cannot get it to work now: > > #+begin_src R :var recipe=recipe :var nutrition=nutrition :colnames yes > :prologue sqldf(' :epilogue ') > select * from recipe, nutrition where recipe.type=nutrition.type > #+end_src > > > > > > From: Emacs-orgmode On Behalf > Of John Kitchin > Sent: Sunday, February 21, 2021 10:24 > To: Tim Cross > Cc: org-mode-email > Subject: Re: state of the art in org-mode tables e.g. join, etc > > ATTENTION: This email came from an external source. Do not open attachments > or click on links from unknown senders or unexpected emails. > > For fun, here is the sqlite equivalent of the Pandas example using the same > tables as before > > > ** aggregation example > > Examples from https://github.com/tbanel/orgaggregate > > > #+NAME: original > | Day | Color | Level | Quantity | > |---+---+---+--| > | Monday| Red |30 | 11 | > | Monday| Blue |25 |3 | > | Tuesday | Red |51 | 12 | > | Tuesday | Red |45 | 15 | > | Tuesday | Blue |33 | 18 | > | Wednesday | Red |27 | 23 | > | Wednesday | Blue |12 | 16 | > | Wednesday | Blue |15 | 15 | > | Thursday | Red |39 | 24 | > | Thursday | Red |41 | 29 | > | Thursday | Red |49 | 30 | > | Friday| Blue | 7 |5 | > | Friday| Blue | 6 |8 | > | Friday| Blue |11 |9 | > > > #+begin_src sqlite :db ":memory:" :var orgtable=original :colnames yes > drop table if exists testtable; > create table testtable(Day str, Color str, Level int, Quantity int); > .mode csv testtable > .import $orgtable testtable > select Color, count(*) from testtable group by Color; > #+end_src > > #+RESULT
Re: state of the art in org-mode tables e.g. join, etc
Thanks for the link! It looks like some useful functions there. It would be nice to integrate some of those with the rich output of a Jupyter kernel so you could get native org tables automatically in org-mode. Derek Feichtinger writes: > Hi John, > > I invested time some years ago in preparing babel examples, and a lot of > the description went into using tables. The most detailed documents I > had for elisp and python. > > In order to be productive, e.g. for producing all kinds of scientific > graphs, but also for doing the finances and planning for our scientific > computing section I ended up the same as you with mostly going to python > and leveraging Pandas. I think all of us end up using ":colnames no" as > the most convenient solution. > > https://github.com/dfeich/org-babel-examples/blob/master/python3/python3-babel.org > > (especially look at the Pandas section 10) > > In that file I also tangle a python library "orgbabelhelper" that is > available in Conda and PyPi. I mainly use that to work with my tables. > > Best regards > Derek -- Professor John Kitchin Doherty Hall A207F Department of Chemical Engineering Carnegie Mellon University Pittsburgh, PA 15213 412-268-7803 @johnkitchin http://kitchingroup.cheme.cmu.edu
Re: state of the art in org-mode tables e.g. join, etc
Malcolm, thanks, and, yes, i'm of mixed mind, myself. cheers, Greg
RE: state of the art in org-mode tables e.g. join, etc
Greg, Of course, I’m not surprised by the results of your efforts. Nice! I myself don’t prefer the tidyverse, mainly except for ggplot, and instead find myself reaching for sqldf or data.tables where such benefit is needed. YMMV, Malcolm From: Greg Minshall Sent: Monday, February 22, 2021 02:13 To: Cook, Malcolm Cc: John Kitchin ; Tim Cross ; org-mode-email Subject: Re: state of the art in org-mode tables e.g. join, etc ATTENTION: This email came from an external source. Do not open attachments or click on links from unknown senders or unexpected emails. Malcolm, > Checkout what R sqldf package makes easy: very nice! Greg ps -- (feeling a challenge... :) for base R, dplyr::inner_join, the following seem to work (i apologize that i don't know how people embed org-frags in e-mail, or how important that format might be?) #+NAME: original | Day | Color | Level | Quantity | |---+---+---+--| | Monday | Red | 30 | 11 | | Monday | Blue | 25 | 3 | | Tuesday | Red | 51 | 12 | | Tuesday | Red | 45 | 15 | | Tuesday | Blue | 33 | 18 | | Wednesday | Red | 27 | 23 | | Wednesday | Blue | 12 | 16 | | Wednesday | Blue | 15 | 15 | | Thursday | Red | 39 | 24 | | Thursday | Red | 41 | 29 | | Thursday | Red | 49 | 30 | | Friday | Blue | 7 | 5 | | Friday | Blue | 6 | 8 | | Friday | Blue | 11 | 9 | #+PROPERTY: header-args:R :session *R* #+begin_src R :results none library(dplyr) #+end_src #+begin_src R :var original=original :colnames yes as.data.frame(table(Color=original$Color)) #+end_src #+RESULTS: | Color | Freq | |---+--| | Blue | 7 | | Red | 7 | *** join example Example from https://github.com/tbanel/orgtbljoin #+name: nutrition | type | Fiber | Sugar | Protein | Carb | |--+---+---+-+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | onion | 1.3 | 4.4 | 1.3 | 9.0 | | egg | 0 | 18.3 | 31.9 | 18.3 | | rice | 0.2 | 0 | 1.5 | 16.0 | | bread | 0.7 | 0.7 | 3.3 | 16.0 | | orange | 3.1 | 11.9 | 1.3 | 17.6 | | banana | 2.1 | 9.9 | 0.9 | 18.5 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | | nut | 2.6 | 1.3 | 4.9 | 7.2 | | corn | 4.7 | 1.8 | 2.8 | 21.3 | #+name: recipe | type | quty | |--+--| | onion | 70 | | tomatoe | 120 | | eggplant | 300 | | tofu | 100 | #+begin_src R :var recipe=recipe :var nutrition=nutrition :colnames yes dplyr::inner_join(nutrition, recipe) #+end_src #+RESULTS: | type | Fiber | Sugar | Protein | Carb | quty | |--+---+---+-+--+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | 300 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | 120 | | onion | 1.3 | 4.4 | 1.3 | 9 | 70 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | 100 |
Re: state of the art in org-mode tables e.g. join, etc
Hi John, I invested time some years ago in preparing babel examples, and a lot of the description went into using tables. The most detailed documents I had for elisp and python. In order to be productive, e.g. for producing all kinds of scientific graphs, but also for doing the finances and planning for our scientific computing section I ended up the same as you with mostly going to python and leveraging Pandas. I think all of us end up using ":colnames no" as the most convenient solution. https://github.com/dfeich/org-babel-examples/blob/master/python3/python3-babel.org (especially look at the Pandas section 10) In that file I also tangle a python library "orgbabelhelper" that is available in Conda and PyPi. I mainly use that to work with my tables. Best regards Derek -- Paul Scherrer Institut Dr. Derek Feichtinger Phone: +41 56 310 47 33 Group Head HPC and Emerging Technologies Email: derek.feichtin...@psi.ch Building/Room No. OHSA/D17 Forschungsstrasse 111 CH-5232 Villigen PSI On Sun, Feb 21 2021, John Kitchin wrote: > For fun, here is the sqlite equivalent of the Pandas example using the same > tables as before > > > ** aggregation example > > Examples from https://github.com/tbanel/orgaggregate > > > #+NAME: original > | Day | Color | Level | Quantity | > |---+---+---+--| > | Monday| Red |30 | 11 | > | Monday| Blue |25 |3 | > | Tuesday | Red |51 | 12 | > | Tuesday | Red |45 | 15 | > | Tuesday | Blue |33 | 18 | > | Wednesday | Red |27 | 23 | > | Wednesday | Blue |12 | 16 | > | Wednesday | Blue |15 | 15 | > | Thursday | Red |39 | 24 | > | Thursday | Red |41 | 29 | > | Thursday | Red |49 | 30 | > | Friday| Blue | 7 |5 | > | Friday| Blue | 6 |8 | > | Friday| Blue |11 |9 | > > > #+begin_src sqlite :db ":memory:" :var orgtable=original :colnames yes > drop table if exists testtable; > create table testtable(Day str, Color str, Level int, Quantity int); > .mode csv testtable > .import $orgtable testtable > select Color, count(*) from testtable group by Color; > #+end_src > > #+RESULTS: > | Color | count(*) | > |---+--| > | Blue |7 | > | Red |7 | > > ** join example > > Example from https://github.com/tbanel/orgtbljoin > > #+name: nutrition > | type | Fiber | Sugar | Protein | Carb | > |--+---+---+-+--| > | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | > | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | > | onion| 1.3 | 4.4 | 1.3 | 9.0 | > | egg | 0 | 18.3 |31.9 | 18.3 | > | rice | 0.2 | 0 | 1.5 | 16.0 | > | bread| 0.7 | 0.7 | 3.3 | 16.0 | > | orange | 3.1 | 11.9 | 1.3 | 17.6 | > | banana | 2.1 | 9.9 | 0.9 | 18.5 | > | tofu | 0.7 | 0.5 | 6.6 | 1.4 | > | nut | 2.6 | 1.3 | 4.9 | 7.2 | > | corn | 4.7 | 1.8 | 2.8 | 21.3 | > > > #+name: recipe > | type | quty | > |--+--| > | onion| 70 | > | tomatoe | 120 | > | eggplant | 300 | > | tofu | 100 | > > > #+begin_src sqlite :db ":memory:" :var nut=nutrition rec=recipe :colnames > yes > drop table if exists nutrition; > drop table if exists recipe; > create table nutrition(type str, Fiber float, Sugar float, Protein float, > Carb float); > create table recipe(type str, quty int); > > .mode csv nutrition > .import $nut nutrition > > .mode csv recipe > .import $rec recipe > > select * from recipe, nutrition where recipe.type=nutrition.type; > #+end_src > > #+RESULTS: > | type | quty | type | Fiber | Sugar | Protein | Carb | > |--+--+--+---+---+-+--| > | onion| 70 | onion| 1.3 | 4.4 | 1.3 | 9.0 | > | tomatoe | 120 | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | > | eggplant | 300 | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | > | tofu | 100 | tofu | 0.7 | 0.5 | 6.6 | 1.4 | > > > John > > --- > Professor John Kitchin > Doherty Hall A207F > Department of Chemical Engineering > Carnegie Mellon University > Pittsburgh, PA 15213 > 412-268-7803 > @johnkitchin > http://kitchingroup.cheme.cmu.edu > > > > On Sun, Feb 21, 2021 at 10:03 AM John Kitchin > wrote: > >> Thanks Tim and Greg. I had mostly come to the same conclusions that it is >> probably best to outsource this. I worked out some examples from >> the orgtbljoin and orgaggregate packages with Pandas below, in case anyone >> is interested in seeing how it works. A key point is using the ":colnames >> no" header args to get the column names for Pandas. It seems like a pretty >> good approach. >> >> * org-mode tables with Pandas >> ** Aggregating from a table >> >> Examples from https://github.com/tbanel/orgaggregate >> >> >> #+NAME: original >> | Day |
Re: state of the art in org-mode tables e.g. join, etc
Malcolm, > Checkout what R sqldf package makes easy: very nice! Greg ps -- (feeling a challenge... :) for base R, dplyr::inner_join, the following seem to work (i apologize that i don't know how people embed org-frags in e-mail, or how important that format might be?) #+NAME: original | Day | Color | Level | Quantity | |---+---+---+--| | Monday| Red |30 | 11 | | Monday| Blue |25 |3 | | Tuesday | Red |51 | 12 | | Tuesday | Red |45 | 15 | | Tuesday | Blue |33 | 18 | | Wednesday | Red |27 | 23 | | Wednesday | Blue |12 | 16 | | Wednesday | Blue |15 | 15 | | Thursday | Red |39 | 24 | | Thursday | Red |41 | 29 | | Thursday | Red |49 | 30 | | Friday| Blue | 7 |5 | | Friday| Blue | 6 |8 | | Friday| Blue |11 |9 | #+PROPERTY: header-args:R :session *R* #+begin_src R :results none library(dplyr) #+end_src #+begin_src R :var original=original :colnames yes as.data.frame(table(Color=original$Color)) #+end_src #+RESULTS: | Color | Freq | |---+--| | Blue |7 | | Red |7 | *** join example Example from https://github.com/tbanel/orgtbljoin #+name: nutrition | type | Fiber | Sugar | Protein | Carb | |--+---+---+-+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | onion| 1.3 | 4.4 | 1.3 | 9.0 | | egg | 0 | 18.3 |31.9 | 18.3 | | rice | 0.2 | 0 | 1.5 | 16.0 | | bread| 0.7 | 0.7 | 3.3 | 16.0 | | orange | 3.1 | 11.9 | 1.3 | 17.6 | | banana | 2.1 | 9.9 | 0.9 | 18.5 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | | nut | 2.6 | 1.3 | 4.9 | 7.2 | | corn | 4.7 | 1.8 | 2.8 | 21.3 | #+name: recipe | type | quty | |--+--| | onion| 70 | | tomatoe | 120 | | eggplant | 300 | | tofu | 100 | #+begin_src R :var recipe=recipe :var nutrition=nutrition :colnames yes dplyr::inner_join(nutrition, recipe) #+end_src #+RESULTS: | type | Fiber | Sugar | Protein | Carb | quty | |--+---+---+-+--+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | 300 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | 120 | | onion| 1.3 | 4.4 | 1.3 |9 | 70 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | 100 |
RE: state of the art in org-mode tables e.g. join, etc
John, Checkout what R sqldf package makes easy: ** aggregation example Examples from https://github.com/tbanel/orgaggregate #+NAME: original | Day | Color | Level | Quantity | |---+---+---+--| | Monday| Red |30 | 11 | | Monday| Blue |25 |3 | | Tuesday | Red |51 | 12 | | Tuesday | Red |45 | 15 | | Tuesday | Blue |33 | 18 | | Wednesday | Red |27 | 23 | | Wednesday | Blue |12 | 16 | | Wednesday | Blue |15 | 15 | | Thursday | Red |39 | 24 | | Thursday | Red |41 | 29 | | Thursday | Red |49 | 30 | | Friday| Blue | 7 |5 | | Friday| Blue | 6 |8 | | Friday| Blue |11 |9 | #+PROPERTY: header-args:R :session *R* #+begin_src R :results none library(sqldf) #+end_src #+begin_src R :var original=original :colnames yes sqldf('select Color, count(*) from original group by Color;') #+end_src #+RESULTS: | Color | count(*) | |---+--| | Blue |7 | | Red |7 | ** join example Example from https://github.com/tbanel/orgtbljoin #+name: nutrition | type | Fiber | Sugar | Protein | Carb | |--+---+---+-+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | onion| 1.3 | 4.4 | 1.3 | 9.0 | | egg | 0 | 18.3 |31.9 | 18.3 | | rice | 0.2 | 0 | 1.5 | 16.0 | | bread| 0.7 | 0.7 | 3.3 | 16.0 | | orange | 3.1 | 11.9 | 1.3 | 17.6 | | banana | 2.1 | 9.9 | 0.9 | 18.5 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | | nut | 2.6 | 1.3 | 4.9 | 7.2 | | corn | 4.7 | 1.8 | 2.8 | 21.3 | #+name: recipe | type | quty | |--+--| | onion| 70 | | tomatoe | 120 | | eggplant | 300 | | tofu | 100 | #+begin_src R :var recipe=recipe :var nutrition=nutrition :colnames yes sqldf('select * from recipe, nutrition where recipe.type=nutrition.type') #+end_src #+RESULTS: | type | quty | type | Fiber | Sugar | Protein | Carb | |--+--+--+---+---+-+--| | onion| 70 | onion| 1.3 | 4.4 | 1.3 |9 | | tomatoe | 120 | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | eggplant | 300 | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tofu | 100 | tofu | 0.7 | 0.5 | 6.6 | 1.4 | This should also be possible but I cannot get it to work now: #+begin_src R :var recipe=recipe :var nutrition=nutrition :colnames yes :prologue sqldf(' :epilogue ') select * from recipe, nutrition where recipe.type=nutrition.type #+end_src From: Emacs-orgmode On Behalf Of John Kitchin Sent: Sunday, February 21, 2021 10:24 To: Tim Cross Cc: org-mode-email Subject: Re: state of the art in org-mode tables e.g. join, etc ATTENTION: This email came from an external source. Do not open attachments or click on links from unknown senders or unexpected emails. For fun, here is the sqlite equivalent of the Pandas example using the same tables as before ** aggregation example Examples from https://github.com/tbanel/orgaggregate #+NAME: original | Day | Color | Level | Quantity | |---+---+---+--| | Monday| Red |30 | 11 | | Monday| Blue |25 |3 | | Tuesday | Red |51 | 12 | | Tuesday | Red |45 | 15 | | Tuesday | Blue |33 | 18 | | Wednesday | Red |27 | 23 | | Wednesday | Blue |12 | 16 | | Wednesday | Blue |15 | 15 | | Thursday | Red |39 | 24 | | Thursday | Red |41 | 29 | | Thursday | Red |49 | 30 | | Friday| Blue | 7 |5 | | Friday| Blue | 6 |8 | | Friday| Blue |11 |9 | #+begin_src sqlite :db ":memory:" :var orgtable=original :colnames yes drop table if exists testtable; create table testtable(Day str, Color str, Level int, Quantity int); .mode csv testtable .import $orgtable testtable select Color, count(*) from testtable group by Color; #+end_src #+RESULTS: | Color | count(*) | |---+--| | Blue |7 | | Red |7 | ** join example Example from https://github.com/tbanel/orgtbljoin #+name: nutrition | type | Fiber | Sugar | Protein | Carb | |--+---+---+-+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | onion| 1.3 | 4.4 | 1.3 | 9.0 | | egg | 0 | 18.3 |31.9 | 18.3 | | rice | 0.2 | 0 | 1.5 | 16.0 | | bread| 0.7 | 0.7 | 3.3 | 16.0 | | orange | 3.1 | 11.9 | 1.3 | 17.6 | | banana | 2.1 | 9.9 | 0.9 | 18.5 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | | nut | 2.6 | 1.3 | 4.9 | 7.2 | | corn | 4.7 | 1.8 | 2.8 | 21.3
Re: state of the art in org-mode tables e.g. join, etc
For fun, here is the sqlite equivalent of the Pandas example using the same tables as before ** aggregation example Examples from https://github.com/tbanel/orgaggregate #+NAME: original | Day | Color | Level | Quantity | |---+---+---+--| | Monday| Red |30 | 11 | | Monday| Blue |25 |3 | | Tuesday | Red |51 | 12 | | Tuesday | Red |45 | 15 | | Tuesday | Blue |33 | 18 | | Wednesday | Red |27 | 23 | | Wednesday | Blue |12 | 16 | | Wednesday | Blue |15 | 15 | | Thursday | Red |39 | 24 | | Thursday | Red |41 | 29 | | Thursday | Red |49 | 30 | | Friday| Blue | 7 |5 | | Friday| Blue | 6 |8 | | Friday| Blue |11 |9 | #+begin_src sqlite :db ":memory:" :var orgtable=original :colnames yes drop table if exists testtable; create table testtable(Day str, Color str, Level int, Quantity int); .mode csv testtable .import $orgtable testtable select Color, count(*) from testtable group by Color; #+end_src #+RESULTS: | Color | count(*) | |---+--| | Blue |7 | | Red |7 | ** join example Example from https://github.com/tbanel/orgtbljoin #+name: nutrition | type | Fiber | Sugar | Protein | Carb | |--+---+---+-+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | onion| 1.3 | 4.4 | 1.3 | 9.0 | | egg | 0 | 18.3 |31.9 | 18.3 | | rice | 0.2 | 0 | 1.5 | 16.0 | | bread| 0.7 | 0.7 | 3.3 | 16.0 | | orange | 3.1 | 11.9 | 1.3 | 17.6 | | banana | 2.1 | 9.9 | 0.9 | 18.5 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | | nut | 2.6 | 1.3 | 4.9 | 7.2 | | corn | 4.7 | 1.8 | 2.8 | 21.3 | #+name: recipe | type | quty | |--+--| | onion| 70 | | tomatoe | 120 | | eggplant | 300 | | tofu | 100 | #+begin_src sqlite :db ":memory:" :var nut=nutrition rec=recipe :colnames yes drop table if exists nutrition; drop table if exists recipe; create table nutrition(type str, Fiber float, Sugar float, Protein float, Carb float); create table recipe(type str, quty int); .mode csv nutrition .import $nut nutrition .mode csv recipe .import $rec recipe select * from recipe, nutrition where recipe.type=nutrition.type; #+end_src #+RESULTS: | type | quty | type | Fiber | Sugar | Protein | Carb | |--+--+--+---+---+-+--| | onion| 70 | onion| 1.3 | 4.4 | 1.3 | 9.0 | | tomatoe | 120 | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | eggplant | 300 | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tofu | 100 | tofu | 0.7 | 0.5 | 6.6 | 1.4 | John --- Professor John Kitchin Doherty Hall A207F Department of Chemical Engineering Carnegie Mellon University Pittsburgh, PA 15213 412-268-7803 @johnkitchin http://kitchingroup.cheme.cmu.edu On Sun, Feb 21, 2021 at 10:03 AM John Kitchin wrote: > Thanks Tim and Greg. I had mostly come to the same conclusions that it is > probably best to outsource this. I worked out some examples from > the orgtbljoin and orgaggregate packages with Pandas below, in case anyone > is interested in seeing how it works. A key point is using the ":colnames > no" header args to get the column names for Pandas. It seems like a pretty > good approach. > > * org-mode tables with Pandas > ** Aggregating from a table > > Examples from https://github.com/tbanel/orgaggregate > > > #+NAME: original > | Day | Color | Level | Quantity | > |---+---+---+--| > | Monday| Red |30 | 11 | > | Monday| Blue |25 |3 | > | Tuesday | Red |51 | 12 | > | Tuesday | Red |45 | 15 | > | Tuesday | Blue |33 | 18 | > | Wednesday | Red |27 | 23 | > | Wednesday | Blue |12 | 16 | > | Wednesday | Blue |15 | 15 | > | Thursday | Red |39 | 24 | > | Thursday | Red |41 | 29 | > | Thursday | Red |49 | 30 | > | Friday| Blue | 7 |5 | > | Friday| Blue | 6 |8 | > | Friday| Blue |11 |9 | > > > #+BEGIN_SRC ipython :var data=original :colnames no > import pandas as pd > > pd.DataFrame(data[1:], columns=data[0]).groupby('Color').size() > #+END_SRC > > #+RESULTS: > :results: > # Out [1]: > # text/plain > : Color > : Blue7 > : Red 7 > : dtype: int64 > :end: > > The categorical stuff here is just to get the days sorted the same way as > the example. It is otherwise not needed. I feel there should be a more > clever way to do this, but didn't think of it. > > #+BEGIN_SRC ipython :var data=original :colnames no > df = pd.DataFrame(data[1:], columns=data[0]) > days = ['Monday', 'Tuesday',
Re: state of the art in org-mode tables e.g. join, etc
Thanks Tim and Greg. I had mostly come to the same conclusions that it is probably best to outsource this. I worked out some examples from the orgtbljoin and orgaggregate packages with Pandas below, in case anyone is interested in seeing how it works. A key point is using the ":colnames no" header args to get the column names for Pandas. It seems like a pretty good approach. * org-mode tables with Pandas ** Aggregating from a table Examples from https://github.com/tbanel/orgaggregate #+NAME: original | Day | Color | Level | Quantity | |---+---+---+--| | Monday| Red |30 | 11 | | Monday| Blue |25 |3 | | Tuesday | Red |51 | 12 | | Tuesday | Red |45 | 15 | | Tuesday | Blue |33 | 18 | | Wednesday | Red |27 | 23 | | Wednesday | Blue |12 | 16 | | Wednesday | Blue |15 | 15 | | Thursday | Red |39 | 24 | | Thursday | Red |41 | 29 | | Thursday | Red |49 | 30 | | Friday| Blue | 7 |5 | | Friday| Blue | 6 |8 | | Friday| Blue |11 |9 | #+BEGIN_SRC ipython :var data=original :colnames no import pandas as pd pd.DataFrame(data[1:], columns=data[0]).groupby('Color').size() #+END_SRC #+RESULTS: :results: # Out [1]: # text/plain : Color : Blue7 : Red 7 : dtype: int64 :end: The categorical stuff here is just to get the days sorted the same way as the example. It is otherwise not needed. I feel there should be a more clever way to do this, but didn't think of it. #+BEGIN_SRC ipython :var data=original :colnames no df = pd.DataFrame(data[1:], columns=data[0]) days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'] df['Day'] = pd.Categorical(df['Day'], categories=days, ordered=True) (df .groupby('Day') .agg({'Level': 'mean', 'Quantity': 'sum'}) .sort_values('Day')) #+END_SRC #+RESULTS: :results: # Out [2]: # text/plain :Level Quantity : Day : Monday 27.514 : Tuesday 43.045 : Wednesday 18.054 : Thursday43.083 : Friday 8.022 : Saturday NaN 0 : Sunday NaN 0 [[file:/var/folders/3q/ht_2mtk52hl7ydxrcr87z2grgn/T/ob-ipython-htmlMnDA9a.html]] :end: ** Joining tables Example from https://github.com/tbanel/orgtbljoin #+name: nutrition | type | Fiber | Sugar | Protein | Carb | |--+---+---+-+--| | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | | onion| 1.3 | 4.4 | 1.3 | 9.0 | | egg | 0 | 18.3 |31.9 | 18.3 | | rice | 0.2 | 0 | 1.5 | 16.0 | | bread| 0.7 | 0.7 | 3.3 | 16.0 | | orange | 3.1 | 11.9 | 1.3 | 17.6 | | banana | 2.1 | 9.9 | 0.9 | 18.5 | | tofu | 0.7 | 0.5 | 6.6 | 1.4 | | nut | 2.6 | 1.3 | 4.9 | 7.2 | | corn | 4.7 | 1.8 | 2.8 | 21.3 | #+name: recipe | type | quty | |--+--| | onion| 70 | | tomatoe | 120 | | eggplant | 300 | | tofu | 100 | #+BEGIN_SRC ipython :var nut=nutrition recipe=recipe :colnames no nutrition = pd.DataFrame(nut[1:], columns=nut[0]) rec = pd.DataFrame(recipe[1:], columns=recipe[0]) pd.merge(rec, nutrition, on='type') #+END_SRC #+RESULTS: :results: # Out [4]: # text/plain :type quty Fiber Sugar Protein Carb : 0 onion701.34.4 1.3 9.0 : 1 tomatoe 1200.62.1 0.8 3.4 : 2 eggplant 3002.53.2 0.8 8.6 : 3 tofu 1000.70.5 6.6 1.4 :end: John --- Professor John Kitchin Doherty Hall A207F Department of Chemical Engineering Carnegie Mellon University Pittsburgh, PA 15213 412-268-7803 @johnkitchin http://kitchingroup.cheme.cmu.edu On Sun, Feb 21, 2021 at 1:54 AM Tim Cross wrote: > > Greg Minshall writes: > > > John, > > > >> Is there a state of the art in using org-tables as little databases > >> with joins and stuff? > > > > i have to admit i do all that with an R code source block. (the dplyr > > package has the relevant joins, e.g. dplyr::inner_join().) and, in R, > > ":colnames yes" as a header argument gives you header lines on results. > > (maybe that's ?now? for "all" languages?) > > > > For really complex joins and ad hoc queries, I would do similar or put > the data into sqlite. For more simple ones, I just define a table which > uses table formulas to extract the values from the other tables - the > downside being the tables need to have the same data ordering or the > formulas need to be somewhat complex. Provided the tables have the same > number of records in the same order, table formulas are usually fairly > easy. > > I did think about writing some elisp functions to use in my table > formulas to make things easier, but then decided I was just re-inventing > and well defined
Re: state of the art in org-mode tables e.g. join, etc
Greg Minshall writes: > John, > >> Is there a state of the art in using org-tables as little databases >> with joins and stuff? > > i have to admit i do all that with an R code source block. (the dplyr > package has the relevant joins, e.g. dplyr::inner_join().) and, in R, > ":colnames yes" as a header argument gives you header lines on results. > (maybe that's ?now? for "all" languages?) > For really complex joins and ad hoc queries, I would do similar or put the data into sqlite. For more simple ones, I just define a table which uses table formulas to extract the values from the other tables - the downside being the tables need to have the same data ordering or the formulas need to be somewhat complex. Provided the tables have the same number of records in the same order, table formulas are usually fairly easy. I did think about writing some elisp functions to use in my table formulas to make things easier, but then decided I was just re-inventing and well defined database solution and figured when I need it, just use sqlite. However, it has been a while since I needed this level of complexity, so perhaps things have moved on and there are better ways now. -- Tim Cross
Re: state of the art in org-mode tables e.g. join, etc
John, > Is there a state of the art in using org-tables as little databases > with joins and stuff? i have to admit i do all that with an R code source block. (the dplyr package has the relevant joins, e.g. dplyr::inner_join().) and, in R, ":colnames yes" as a header argument gives you header lines on results. (maybe that's ?now? for "all" languages?) Greg https://dplyr.tidyverse.org/articles/two-table.html?q=join#mutating-joins