Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-22 Thread Jack Kamm
Jack Kamm  writes:

> Liu Hui  writes:
>
>> I think these objects need to be shown in a single column rather than
>> two. Besides, if the Python code eventually becomes too complex, I think
>> maintaining it outside ob-python.el, as suggested by Ihor, is a good
>> idea.
>
> Thanks for reporting these misbehaving examples. I think the root of the
> problem is `org-babel-script-escape', which is too aggressive in
> recursively converting strings to lists. We may need to rewrite our own
> implementation for ob-python.
>
> Also, I agree that moving the python code to an external file will be
> helpful in handling these more complex cases.
>
> I may leave these tasks for future patches. In the meantime, we may have
> to recommend ":results verbatim" for these more complex cases that
> ":results table" doesn't fully handle yet.

Pushed the patch now, with one final change: I decided to leave dicts as
strings by default, converting them to tables only when ":results table"
is explicitly set. I think it's better this way for now, because of the
misbehaving examples you pointed out -- table conversion is not yet
fully robust for complex dicts containing complicated objects or
structures.
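
The rule described above can be sketched in Python roughly as follows.
This is an illustration only, not the code that was pushed:
`format_dict_result' is a hypothetical helper, and `result_params' is
assumed to be the list of ":results" keywords, as in the
`__org_babel_python_format_value' function quoted elsewhere in this
thread. The recursive conversion mirrors the `dict2table' helper from
the second patch.

```python
def dict_to_rows(value):
    """Recursively turn dicts into lists of (key, value) pairs."""
    if isinstance(value, dict):
        return [(k, dict_to_rows(v)) for k, v in value.items()]
    if isinstance(value, (list, tuple)):
        return [dict_to_rows(x) for x in value]
    return value


def format_dict_result(result, result_params):
    """Return the text to write for RESULT (hypothetical helper)."""
    # Convert only on explicit request; otherwise keep the plain repr,
    # so complex dicts are not mangled by a table conversion.
    if isinstance(result, dict) and "table" in result_params:
        result = dict_to_rows(result)
    return str(result)
```

With `["value"]' as the parameters the dict comes back as its usual
repr; with `["table"]' it comes back as a list of pairs that the Elisp
side can turn into rows.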



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-22 Thread Jack Kamm
Ihor Radchenko  writes:

> +1
> Don't forget to update
> https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-python.html
> (note how the docs already have an example of org formatting from python)

Thanks! Done now:
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=579e8c572345c42ad581d3ddf0f484567d55a787

And updated Worg as well:
https://git.sr.ht/~bzg/worg/commit/7c7d352be72271ae73f31ddffa0f48d225b34259



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-21 Thread Liu Hui
> Thanks for reporting these misbehaving examples. I think the root of the
> problem is `org-babel-script-escape', which is too aggressive in
> recursively converting strings to lists. We may need to rewrite our own
> implementation for ob-python.
>
> Also, I agree that moving the python code to an external file will be
> helpful in handling these more complex cases.
>
> I may leave these tasks for future patches. In the meantime, we may have
> to recommend ":results verbatim" for these more complex cases that
> ":results table" doesn't fully handle yet.

Understood. Thanks again for your work!



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-20 Thread Jack Kamm
Liu Hui  writes:

> I think these objects need to be shown in a single column rather than
> two. Besides, if the Python code eventually becomes too complex, I think
> maintaining it outside ob-python.el, as suggested by Ihor, is a good
> idea.

Thanks for reporting these misbehaving examples. I think the root of the
problem is `org-babel-script-escape', which is too aggressive in
recursively converting strings to lists. We may need to rewrite our own
implementation for ob-python.

Also, I agree that moving the python code to an external file will be
helpful in handling these more complex cases.

I may leave these tasks for future patches. In the meantime, we may have
to recommend ":results verbatim" for these more complex cases that
":results table" doesn't fully handle yet.



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-20 Thread Ihor Radchenko
Jack Kamm  writes:

> In the meantime, I'm planning to squash and apply my patch as is. Then
> afterwards, I can start working on a followup patch to move some Python
> code into a separate file (and coordinate with emacs-devel if
> necessary).

+1
Don't forget to update
https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-python.html
(note how the docs already have an example of org formatting from python)

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at 



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-20 Thread Ihor Radchenko
Jack Kamm  writes:

> Ihor Radchenko  writes:
>
>> Similar to the existing LaTeX formatters, one may write a Python package
>> that will pretty-print Org markup as text.
>
> This sounds interesting -- are these LaTeX formatters external to Org?
> Could you provide a link/reference?

https://docs.sympy.org/latest/tutorials/intro-tutorial/printing.html
https://jeltef.github.io/PyLaTeX/current/examples/basic.html




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-20 Thread Jack Kamm
Ihor Radchenko  writes:

> We might add the code into a separate proper python file. Then, we can
> use the contents of that file to retrieve the variable value.
>
> We already do the same thing for CSL style files and odt schema/style.

Thanks, I think this is a good idea, and will make the python code
easier to maintain.

And thanks also for the pointer to oc-csl and ox-odt -- I think I should
be able to implement this by following their example.

It seems like there will be an extra logistical step, to make sure the
extra python file is added to emacs as well. I'm not familiar with the
details of how we sync Org into Emacs, but will start to look into it.

In the meantime, I'm planning to squash and apply my patch as is. Then
afterwards, I can start working on a followup patch to move some Python
code into a separate file (and coordinate with emacs-devel if
necessary).



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-20 Thread Jack Kamm
Ihor Radchenko  writes:

> Similar to the existing LaTeX formatters, one may write a Python package
> that will pretty-print Org markup as text.

This sounds interesting -- are these LaTeX formatters external to Org?
Could you provide a link/reference?



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-20 Thread Liu Hui
> > Here we can use '{}'.format(df.index.name) to show the name of index
>
> Patch has been updated to print the index name when it is non-None.

Thanks! It would be nice to also support MultiIndex names using
`result.index.names', e.g.

#+begin_src python :results table
import numpy as np
import pandas as pd

df = pd.DataFrame({
"A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
"B": ["one", "one", "two", "three", "two", "two", "one", "three"],
"C": np.random.randn(8),
"D": np.random.randn(8)})
return df.groupby(["A", "B"]).agg('sum').round(3)
#+end_src
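
One possible way to build such a header is sketched below. It is not
taken from the patch: `frame_to_rows' is a hypothetical name, and the
function only relies on `index.names', `columns' and `iterrows()', so
it is written against any DataFrame-like object rather than importing
pandas itself. MultiIndex columns are not handled.

```python
def frame_to_rows(df):
    """Convert a DataFrame-like object to rows, index names in the header."""
    # index.names has one entry per index level; unnamed levels are None.
    names = ["" if n is None else str(n) for n in df.index.names]
    header = names + [str(c) for c in df.columns]
    rows = [header, None]  # None marks a horizontal rule, as in the patch
    for idx, row in df.iterrows():
        # With a MultiIndex, idx is a tuple holding one label per level.
        labels = list(idx) if isinstance(idx, tuple) else [idx]
        rows.append(labels + list(row))
    return rows
```

For the groupby example above, this would give a header starting with
"A" and "B", followed by the value columns.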

Another problem is the display of objects like datetime, e.g.

#+begin_src python :results table
import pandas as pd
s = pd.Series(range(3), index=pd.date_range("2000", freq="D", periods=3))
return s.to_frame()
#+end_src

#+RESULTS:
|           | 0                             |   |
|-----------+-------------------------------+---|
| Timestamp | (2000-01-01 00:00:00 freq= D) | 0 |
| Timestamp | (2000-01-02 00:00:00 freq= D) | 1 |
| Timestamp | (2000-01-03 00:00:00 freq= D) | 2 |

#+begin_src python
from pathlib import Path
import numpy as np

return {'a': 1, 'path': Path('/'), 'array': np.zeros(3)}
#+end_src

#+RESULTS:
| a     | 1         |           |
| path  | PosixPath | (/)       |
| array | array     | ((0 0 0)) |

I think these objects need to be shown in a single column rather than
two. Besides, if the Python code eventually becomes too complex, I think
maintaining it outside ob-python.el, as suggested by Ihor, is a good
idea.
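
Until the conversion handles such objects, a user-side workaround is
to build the table text in Python and return it with ":results raw",
which skips the Elisp-side conversion. The sketch below assumes
exactly that setup; `org_cell' and `dict_to_org_table' are
hypothetical helpers, not part of ob-python.

```python
def org_cell(value):
    """Render one value as the text of a single Org table cell."""
    text = str(value).replace("\n", " ")
    # A literal "|" would end the cell; Org spells it \vert{} in tables.
    return text.replace("|", r"\vert{}")


def dict_to_org_table(mapping):
    """Return Org table text with one row per key, one cell per value."""
    return "\n".join(
        "| {} | {} |".format(org_cell(k), org_cell(v))
        for k, v in mapping.items()
    )
```

Each value is passed through str(), so a Path or an array ends up as
the text of a single cell instead of being split across two.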



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-19 Thread Ihor Radchenko
Jack Kamm  writes:

> As to the broader point, I agree there are many more features that would
> be nice to add to ob-python results handling. But making ob-python too
> complex will be difficult to maintain, especially since the Python code
> is all in quoted strings without proper linting.

We might add the code into a separate proper python file. Then, we can
use the contents of that file to retrieve the variable value.

We already do the same thing for CSL style files and odt schema/style.




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-19 Thread Ihor Radchenko
Jack Kamm  writes:

>> What about :results graphics file ?
>
> Not entirely sure what you mean here.

Never mind. I was mixing the meaning of header args in my mind after all
the previous discussions.




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-19 Thread Ihor Radchenko
Jack Kamm  writes:

> So I am thinking now about how we could make this more extensible in
> the future. One idea is to create a Python package for interfacing with
> Org Babel, and release it on PyPI. If we detect the package is installed,
> then we can delegate to it for results formatting. And the community
> could contribute results handling for all sorts of Python objects to
> that package.
>
> That is just one idea for improving extensibility -- I'm not sure it's
> the best, and am open to other suggestions as well.

Similar to the existing LaTeX formatters, one may write a Python package
that will pretty-print Org markup as text. Not just for Org babel - it
might be useful in general.

And we do not need to support such formatters explicitly in ob-python.
Users can simply arrange to call the formatters in their code blocks by
the usual means.
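
As a rough idea of what such a formatter could look like, here is a
minimal sketch of a function that pretty-prints rows as Org table
text. The name `rows_to_org' and the convention that a None row stands
for a horizontal rule are assumptions made for this example; no such
package is referenced in this thread.

```python
def rows_to_org(rows):
    """Format a list of rows as Org table text; a None row is a rule."""
    width = max((len(r) for r in rows if r is not None), default=0)
    lines = []
    for row in rows:
        if row is None:
            # Horizontal rule spanning every column, e.g. |---+---|
            lines.append("|" + "+".join(["---"] * width) + "|")
        else:
            # Pad short rows with empty cells so all rows align.
            cells = [str(c) for c in row] + [""] * (width - len(row))
            lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

A code block could print or return this text with ":results raw", and
Org will realign the columns the next time the table is edited.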




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-18 Thread Jack Kamm
gerard.vermeu...@posteo.net writes:

> I do not know how much this "abuse" of defconst is frowned
> upon (elisp manual says defconst is advisory), but maybe it
> can be advertised as a feature.

org-babel-python--def-format-value is a "private" variable (it has a
double dash "--" in its name).  Therefore it's not generally recommended
to modify it.

Of course, elisp doesn't have true private variables or functions, and
you are free to change things as you wish -- this is one of the perks of
Emacs :) But be warned: since this is a private variable, we make no
guarantees, and may break things in backward-incompatible ways in the
future.

As to the broader point, I agree there are many more features that would
be nice to add to ob-python results handling. But making ob-python too
complex will be difficult to maintain, especially since the Python code
is all in quoted strings without proper linting.

So I am thinking now about how we could make this more extensible in
the future. One idea is to create a Python package for interfacing with
Org Babel, and release it on PyPI. If we detect the package is installed,
then we can delegate to it for results formatting. And the community
could contribute results handling for all sorts of Python objects to
that package.

That is just one idea for improving extensibility -- I'm not sure it's
the best, and am open to other suggestions as well.



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-18 Thread Jack Kamm
Ihor Radchenko  writes:

> This is an ORG-NEWS entry for Version 9.4. Is it an intentional change?

Sorry, that was an accident. I've reverted it now:
https://github.com/jackkamm/org-mode/commit/f12a695d67bc5c06013d9fbe0af844c9739e347a

>> @@ -142,7 +144,9 @@ (defun org-babel-python-table-or-string (results)
>>"Convert RESULTS into an appropriate elisp value.
>>  If the results look like a list or tuple, then convert them into an
>>  Emacs-lisp table, otherwise return the results as a string."
>> -  (let ((res (org-babel-script-escape results)))
>> +  (let ((res (if (string-equal "{" (substring results 0 1))
>> + results ;don't covert dicts to elisp
>> +   (org-babel-script-escape results
>
> You may also need to update the docstring for
> `org-babel-python-table-or-string' after this change.

That change got reverted in a subsequent update, when I changed dicts to
be returned as tables by default instead of strings. So there's no need
to update the docstring anymore.

>> -body)))
>> -   (`value (let ((tmp-file (org-babel-temp-file "python-")))
>> +(if graphics-file
>> +(format 
>> org-babel-python--output-graphics-wrapper
>> +body graphics-file)
>> +  body
>> +   (`value (let ((results-file (or graphics-file
>> +   (org-babel-temp-file "python-"
>
> What about :results graphics file ?

Not entirely sure what you mean here.

With ":results graphics file", graphics-file will be non-nil --
org-babel-execute:python passes graphics-file on to
org-babel-python-evaluate and then to
org-babel-python-evaluate-external-process. In the case of ":results
graphics file output", org-babel-python--output-graphics-wrapper is used
to save pyplot.gcf(). With ":results graphics file value",
org-babel-python--def-format-value saves the result with
Figure.savefig().
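
In Python terms, the two cases amount to something like the sketch
below. It is only an illustration of the dispatch, not the wrapper
code in ob-python: `save_graphics' and its `current_figure' argument
are hypothetical, the latter existing only so the output case can be
exercised without a live pyplot session.

```python
def save_graphics(result_params, graphics_file, value=None,
                  current_figure=None):
    """Save a figure to GRAPHICS_FILE for output or value results."""
    if "output" in result_params:
        if current_figure is None:
            # Output results: whatever the block drew on the current
            # pyplot figure is what gets saved.
            from matplotlib import pyplot
            current_figure = pyplot.gcf()
        current_figure.savefig(graphics_file)
    else:
        # Value results: the block is expected to return a Figure.
        value.savefig(graphics_file)
```

If a value block returns something without a savefig() method, this
sketch simply fails with an AttributeError.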



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-18 Thread Jack Kamm
Liu Hui  writes:

> Hi,
>
> Thank you for the patch!

Thanks for your feedback, I've incorporated it into
https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023

More specifically, here:
https://github.com/jackkamm/org-mode/commit/af1d18314073446045395ff7a3d1de0303e92586

> Do we need to limit the table/list size by default, or handle them
> only with relevant result type (e.g. `table/list')? Dataframe/array
> are often large.

I've updated the patch so that Dataframe/Array are converted to table
only when ":results table" is explicitly set now. If ":results table" is
not set, they will be returned as string by default.

So code blocks that return large dataframes/arrays can continue to be
safely run.

Note I did make an additional change to Numpy array default behavior:
Previously, numpy arrays would be returned as table, but get mangled
when they were very large, e.g.:

  #+begin_src python
  import numpy as np
  return np.zeros((30,40))
  #+end_src
  
  #+RESULTS:
  | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) | ... | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) |

But now, Numpy array is returned in string form by default, in the same
format as in Jupyter:

  #+begin_src python
  import numpy as np
  return np.zeros((30,40))
  #+end_src
  
  #+RESULTS:
  : array([[0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        ...,
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.]])


>> +if isinstance(result, pandas.DataFrame):
>> +result = [[''] + list(result.columns), None] + \
>
> Here we can use '{}'.format(df.index.name) to show the name of index

Patch has been updated to print the index name when it is non-None.

> Maybe `org-babel-python--def-format-value' can be evaluated only once
> in the session mode? It would shorten the string sent to the python
> shell, where temp files are used for long strings.

Patch has been updated to evaluate `org-babel-python--def-format-value'
once per session.



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-18 Thread Jack Kamm
Ihor Radchenko  writes:

>>>  #+begin_src python :results list
>>>return {"a": 1, "b": 2}
>>>  #+end_src
>>>
>>>  #+RESULTS:
>>>  - a :: 1
>>>  - b :: 2
>>
>> This seems harder, and may require more widespread changes beyond
>> ob-python. In particular, I think we'd need to change
>> `org-babel-insert-result' so that it can call `org-list-to-org' with a
>> list of type "descriptive" instead of "unordered" here:
>
> Actually, (org-list-to-org '(unordered ("a :: b") ("c :: d")))
> will just work.
>
> We do not support nested lists when transforming output anyway. So,
> unordered/descriptive does not matter in practice.

You're right, thanks for the suggestion.

I've added it now to
https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023

More specifically, here:
https://github.com/jackkamm/org-mode/commit/0440caa3326b867a3a15d5f92a6f99cbf94c14d5



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-18 Thread gerard . vermeulen




On 18.08.2023 06:37, gerard.vermeu...@posteo.net wrote:
> On 17.08.2023 14:10, Ihor Radchenko wrote:
>> gerard.vermeu...@posteo.net writes:
>>
>>> Your patches allow anyone to change org-babel-python--def-format-value.
>>> For instance, I want to use black to "pretty-print" certain tree-like
>>> structures
>>
>> Could you simply add extra code to transform the output as needed?
>
> Yes, it is a way to switch between Jack's first and second set of
> patches if one would like.  Or to add code to transform other Python
> data structures.

I take back the switching between Jack's first and second set of patches,
but I stand by "to add code to transform other Python data structures".




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-17 Thread gerard . vermeulen




On 17.08.2023 14:10, Ihor Radchenko wrote:
> gerard.vermeu...@posteo.net writes:
>
>> Your patches allow anyone to change org-babel-python--def-format-value.
>> For instance, I want to use black to "pretty-print" certain tree-like
>> structures
>
> Could you simply add extra code to transform the output as needed?

Yes, it is a way to switch between Jack's first and second set of
patches if one would like.  Or to add code to transform other Python
data structures.






Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-17 Thread Ihor Radchenko
gerard.vermeu...@posteo.net writes:

> Your patches allow anyone to change org-babel-python--def-format-value.
> For instance, I want to use black to "pretty-print" certain tree-like 
> structures

Could you simply add extra code to transform the output as needed?




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-17 Thread Ihor Radchenko
Jack Kamm  writes:

> I attach a 2nd patch implementing this. It also makes ":results table"
> the default return type for dict. (Use ":results verbatim" to get the
> dict as a string instead).

Thanks!

>>  #+begin_src python :results list
>>return {"a": 1, "b": 2}
>>  #+end_src
>>
>>  #+RESULTS:
>>  - a :: 1
>>  - b :: 2
>
> This seems harder, and may require more widespread changes beyond
> ob-python. In particular, I think we'd need to change
> `org-babel-insert-result' so that it can call `org-list-to-org' with a
> list of type "descriptive" instead of "unordered" here:
>
> https://git.sr.ht/~bzg/org-mode/tree/cc435cba71a99ee7b12676be3b6e1211a9cb7285/item/lisp/ob-core.el#L2535

Actually, (org-list-to-org '(unordered ("a :: b") ("c :: d")))
will just work.

We do not support nested lists when transforming output anyway. So,
unordered/descriptive does not matter in practice.
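
On the Python side, producing such "key :: value" items from a dict
takes very little code. The following is a minimal sketch with
hypothetical helper names, not necessarily what ob-python ends up
doing:

```python
def dict_to_list_items(mapping):
    """Turn a dict into 'key :: value' strings, one per entry."""
    return ["{} :: {}".format(k, v) for k, v in mapping.items()]


def items_to_org(items):
    """Render the items as a plain (unordered) Org list."""
    return "\n".join("- " + item for item in items)
```

Because the " :: " separator is part of each item's text, an unordered
list of these strings already reads as a description list in Org.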




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-17 Thread Ihor Radchenko
Jack Kamm  writes:

> Following up on a discussion from last month [1], I am reviving my
> proposal from a couple years ago [2] to improve ob-python results
> handling. Since it's a relatively large change, I am sending it to the
> list for review before applying the patch.

Some comments on the patch itself.

> @@ -2041,8 +2056,8 @@ to switch to the new signature.
>  *** Python session return values must be top-level expression statements
>  
>  Python blocks with ~:session :results value~ header arguments now only
> -return a value if the last line is a top-level expression statement.
> -Also, when a None value is returned, "None" will be printed under
> +return a value if the last line is a top-level expression statement,
> +otherwise the result is None. Also, None will now show up under
>  "#+RESULTS:", as it already did with ~:results value~ for non-session
>  blocks.

This is an ORG-NEWS entry for Version 9.4. Is it an intentional change?
  
> @@ -142,7 +144,9 @@ (defun org-babel-python-table-or-string (results)
>"Convert RESULTS into an appropriate elisp value.
>  If the results look like a list or tuple, then convert them into an
>  Emacs-lisp table, otherwise return the results as a string."
> -  (let ((res (org-babel-script-escape results)))
> +  (let ((res (if (string-equal "{" (substring results 0 1))
> + results ;don't covert dicts to elisp
> +   (org-babel-script-escape results

You may also need to update the docstring for
`org-babel-python-table-or-string' after this change.

> - body)))
> -   (`value (let ((tmp-file (org-babel-temp-file "python-")))
> +(if graphics-file
> +(format 
> org-babel-python--output-graphics-wrapper
> +body graphics-file)
> +  body
> +   (`value (let ((results-file (or graphics-file
> +(org-babel-temp-file "python-"

What about :results graphics file ?




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-17 Thread gerard . vermeulen




On 17.08.2023 06:04, Jack Kamm wrote:

> I attach a 2nd patch implementing this. It also makes ":results table"
> the default return type for dict. (Use ":results verbatim" to get the
> dict as a string instead).
>
> I am also putting a branch with these changes here:
> https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023


Happy to see that ob-python gets so much love!

Your patches allow anyone to change org-babel-python--def-format-value.
For instance, I want to use black to "pretty-print" certain tree-like
structures, and I now have this in my init.el:

(with-eval-after-load 'ob-python
  (setq org-babel-python--def-format-value "\
def __org_babel_python_format_value(result, result_file, result_params):
    with open(result_file, 'w') as f:
        if 'graphics' in result_params:
            result.savefig(result_file)
        elif 'pp' in result_params:
            import black
            f.write(black.format_str(repr(result), mode=black.Mode()))
        else:
            if not set(result_params).intersection(\
                    ['scalar', 'verbatim', 'raw']):
                try:
                    import pandas
                except ImportError:
                    pass
                else:
                    if isinstance(result, pandas.DataFrame):
                        result = [[''] + list(result.columns), None] + \
                            [[i] + list(row) for i, row in result.iterrows()]
                    elif isinstance(result, pandas.Series):
                        result = list(result.items())
                try:
                    import numpy
                except ImportError:
                    pass
                else:
                    if isinstance(result, numpy.ndarray):
                        result = result.tolist()
            f.write(str(result))"))

Without your patches I use advice to override
org-babel-python-format-session-value, which is worse IMO.

This also allows anyone to format for instance AstroPy tables
(https://docs.astropy.org/en/stable/table/).

I do not know how much this "abuse" of defconst is frowned
upon (elisp manual says defconst is advisory), but maybe it
can be advertised as a feature.

Best regards -- Gerard




Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-16 Thread Liu Hui
Hi,

Thank you for the patch!

> Next, for numpy arrays and pandas dataframes/series: these are
> converted to tables, for example:
>
> #+begin_src python
>   import pandas as pd
>   import numpy as np
>
>   return pd.DataFrame(np.array([[1,2,3],[4,5,6]]),
>   columns=['a','b','c'])
> #+end_src
>
> #+RESULTS:
> |   | a | b | c |
> |---+---+---+---|
> | 0 | 1 | 2 | 3 |
> | 1 | 4 | 5 | 6 |
>
> To avoid conversion, you can specify "raw", "verbatim", "scalar", or
> "output" in the ":results" header argument.

Do we need to limit the table/list size by default, or handle them
only with the relevant result type (e.g. `table/list')? Dataframes and
arrays are often large. The following results were previously truncated
by default, which can be tweaked via np.set_printoptions and
pd.set_option.

#+begin_src python
import numpy as np
return np.random.randint(10, size=(30,40))
#+end_src

#+begin_src python
import numpy as np
return np.random.rand(20,3,4,5)
#+end_src

#+begin_src python
import pandas as pd
import numpy as np

d = {'col1': np.random.rand(100), 'col2': np.random.rand(100)}
return pd.DataFrame(d)
#+end_src

> +def __org_babel_python_format_value(result, result_file, result_params):
> +    with open(result_file, 'w') as f:
> +        if 'graphics' in result_params:
> +            result.savefig(result_file)
> +        elif 'pp' in result_params:
> +            import pprint
> +            f.write(pprint.pformat(result))
> +        else:
> +            if not set(result_params).intersection(\
> +                    ['scalar', 'verbatim', 'raw']):
> +                try:
> +                    import pandas
> +                except ImportError:
> +                    pass
> +                else:
> +                    if isinstance(result, pandas.DataFrame):
> +                        result = [[''] + list(result.columns), None] + \

Here we can use '{}'.format(df.index.name) to show the name of index

>  (defun org-babel-python-format-session-value
>  (src-file result-file result-params)
>"Return Python code to evaluate SRC-FILE and write result to RESULT-FILE."
> -  (format "\
> +  (concat org-babel-python--def-format-value
> +  (format "

Maybe `org-babel-python--def-format-value' can be evaluated only once
in the session mode? It would shorten the string sent to the python
shell, where temp files are used for long strings.



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-16 Thread Jack Kamm
Ihor Radchenko  writes:

> What about 
>
>  #+begin_src python :results table
>return {"a": 1, "b": 2}
>  #+end_src
>
>  #+RESULTS:
>  | a | 1 |
>  | b | 2 |

I attach a 2nd patch implementing this. It also makes ":results table"
the default return type for dict. (Use ":results verbatim" to get the
dict as a string instead).

I am also putting a branch with these changes here:
https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023

>
> or 
>
>  #+begin_src python :results list
>return {"a": 1, "b": 2}
>  #+end_src
>
>  #+RESULTS:
>  - a :: 1
>  - b :: 2

This seems harder, and may require more widespread changes beyond
ob-python. In particular, I think we'd need to change
`org-babel-insert-result' so that it can call `org-list-to-org' with a
list of type "descriptive" instead of "unordered" here:

https://git.sr.ht/~bzg/org-mode/tree/cc435cba71a99ee7b12676be3b6e1211a9cb7285/item/lisp/ob-core.el#L2535

From c24d2eeb3b8613df9b9c23583a4b26a6c0934931 Mon Sep 17 00:00:00 2001
From: Jack Kamm 
Date: Wed, 16 Aug 2023 20:27:10 -0700
Subject: [PATCH 2/2] ob-python: Convert dicts to tables

This commit to be squashed with its parent before applying
---
 etc/ORG-NEWS  |  8 +++-
 lisp/ob-python.el | 12 +---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 2630554ae..509011737 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -578,11 +578,9 @@ tested property is actually present.
 
 *** =ob-python.el=: Support for more result types and plotting
 
-=ob-python= now recognizes numpy arrays, and pandas dataframes/series,
-and will convert them to org-mode tables when appropriate.
-
-In addition, dict results are now returned in appropriate string form,
-instead of being mangled as they were previously.
+=ob-python= now recognizes dictionaries, numpy arrays, and pandas
+dataframes/series, and will convert them to org-mode tables when
+appropriate.
 
 When the header argument =:results graphics= is set, =ob-python= will
 use matplotlib to save graphics. The behavior depends on whether value
diff --git a/lisp/ob-python.el b/lisp/ob-python.el
index 35a82afc0..3d987da2f 100644
--- a/lisp/ob-python.el
+++ b/lisp/ob-python.el
@@ -144,9 +144,7 @@ (defun org-babel-python-table-or-string (results)
   "Convert RESULTS into an appropriate elisp value.
 If the results look like a list or tuple, then convert them into an
 Emacs-lisp table, otherwise return the results as a string."
-  (let ((res (if (string-equal "{" (substring results 0 1))
-                 results ;don't covert dicts to elisp
-               (org-babel-script-escape results))))
+  (let ((res (org-babel-script-escape results)))
 (if (listp res)
 (mapcar (lambda (el) (if (eq el 'None)
  org-babel-python-None-to el))
@@ -242,6 +240,14 @@ (defconst org-babel-python--def-format-value "\
         else:
             if not set(result_params).intersection(\
                     ['scalar', 'verbatim', 'raw']):
+                def dict2table(res):
+                    if isinstance(res, dict):
+                        return [(k, dict2table(v)) for k, v in res.items()]
+                    elif isinstance(res, list) or isinstance(res, tuple):
+                        return [dict2table(x) for x in res]
+                    else:
+                        return res
+                result = dict2table(result)
                 try:
                     import pandas
                 except ImportError:
-- 
2.41.0



Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-16 Thread Ihor Radchenko
Jack Kamm  writes:

> Starting with dicts: these are no longer mangled. The current behavior
> (before patch) is like so:
>
> #+begin_src python
>   return {"a": 1, "b": 2}
> #+end_src
>
> #+RESULTS:
> | a | : | 1 | b | : | 2 |
>
> But after the patch they appear like so:
>
> #+begin_src python
>   return {"a": 1, "b": 2}
> #+end_src
>
> #+RESULTS:
> : {'a': 1, 'b': 2}

What about 

 #+begin_src python :results table
   return {"a": 1, "b": 2}
 #+end_src

 #+RESULTS:
 | a | 1 |
 | b | 2 |

or 

 #+begin_src python :results list
   return {"a": 1, "b": 2}
 #+end_src

 #+RESULTS:
 - a :: 1
 - b :: 2




[PATCH] ob-python results handling for dicts, dataframes, arrays, and plots

2023-08-15 Thread Jack Kamm
Following up on a discussion from last month [1], I am reviving my
proposal from a couple years ago [2] to improve ob-python results
handling. Since it's a relatively large change, I am sending it to the
list for review before applying the patch.

The patch changes how ob-python handles the following types of
results:

- Dictionaries
- Numpy arrays
- Pandas dataframes and series
- Matplotlib figures

Starting with dicts: these are no longer mangled. The current behavior
(before patch) is like so:

#+begin_src python
  return {"a": 1, "b": 2}
#+end_src

#+RESULTS:
| a | : | 1 | b | : | 2 |

But after the patch they appear like so:

#+begin_src python
  return {"a": 1, "b": 2}
#+end_src

#+RESULTS:
: {'a': 1, 'b': 2}

Next, for numpy arrays and pandas dataframes/series: these are
converted to tables, for example:

#+begin_src python
  import pandas as pd
  import numpy as np

  return pd.DataFrame(np.array([[1,2,3],[4,5,6]]),
  columns=['a','b','c'])
#+end_src

#+RESULTS:
|   | a | b | c |
|---+---+---+---|
| 0 | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 |

To avoid conversion, you can specify "raw", "verbatim", "scalar", or
"output" in the ":results" header argument.

Finally, for plots: ob-python now supports the ":results graphics"
header argument. The behavior depends on whether output or value
results are used. For output results, the current figure (pyplot.gcf)
is cleared before evaluating, and the resulting figure is then saved.
For value results, the block is expected to return a matplotlib Figure,
which is saved. To set the figure size, do it from within Python.

Here is an example of how to plot:

#+begin_src python :results output graphics file :file boxplot.svg
  import matplotlib.pyplot as plt
  import seaborn as sns
  plt.figure(figsize=(5, 5))
  tips = sns.load_dataset("tips")
  sns.boxplot(x="day", y="tip", data=tips)
#+end_src

Compared to the original version of this patch [2], I tried to
simplify and streamline things as much as possible, since this is a
relatively large and complex change. For example, the handling for
dict objects is much more simplistic now. And there are other
miscellaneous changes to the code structure which I hope improve the
clarity a bit.

[1] https://list.orgmode.org/caoqtw-n9re7fdrm1apmo8x5lrzmjfn_zjht3rvaf4x+s5m_...@mail.gmail.com/
[2] https://list.orgmode.org/87eenpfe77@gmail.com/

From 468eeaa69660a18d8b0503e5a68c275301d6e6ae Mon Sep 17 00:00:00 2001
From: Jack Kamm 
Date: Mon, 7 Sep 2020 09:58:30 -0700
Subject: [PATCH] ob-python: Results handling for dicts, dataframes, arrays,
 plots

* lisp/ob-python.el (org-babel-execute:python): Parse graphics-file
from params, and pass it to `org-babel-python-evaluate'.
(org-babel-python-table-or-string): Prevent `org-babel-script-escape'
from mangling dict results.
(org-babel-python--def-format-value): Python code for formatting
value results before returning.
(org-babel-python-wrapper-method): Removed.  Instead use part of the
string directly in `org-babel-python-evaluate-external-process'.
(org-babel-python-pp-wrapper-method): Removed.  Pretty printing is now
handled by `org-babel-python--def-format-value'.
(org-babel-python--output-graphics-wrapper): New constant.  Python
code to save graphical output.
(org-babel-python--exec-tmpfile): Removed.  Instead use the raw string
directly in `org-babel-python-evaluate-session'.
(org-babel-python--def-format-value): New constant.  Python function
to format and save value results to file.  Includes handling for
graphics, dataframes, and arrays.
(org-babel-python-format-session-value): Updated to use
`org-babel-python--def-format-value' for formatting value result.
(org-babel-python-evaluate): New parameter graphics-file.  Pass
graphics-file onto downstream helper functions.
(org-babel-python-evaluate-external-process): New parameter
graphics-file.  Use `org-babel-python--output-graphics-wrapper' for
graphical output.  For value result, use
`org-babel-python--def-format-value'.
(org-babel-python-evaluate-session): New parameter graphics-file.  Use
`org-babel-python--output-graphics-wrapper' for graphical output.
Replace the removed constant `org-babel-python--exec-tmpfile' with the
string directly.  Rename local variable tmp-results-file to
results-file, which may take the value of graphics-file when provided.
(org-babel-python-async-evaluate-session): New parameter
graphics-file.  Use `org-babel-python--output-graphics-wrapper' for
graphical output.  Rename local variable tmp-results-file to
results-file, which may take the value of graphics-file when provided.
---
 etc/ORG-NEWS  |  19 +-
 lisp/ob-python.el | 164 --
 2 files changed, 119 insertions(+), 64 deletions(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 11fdf2825..2630554ae 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -576,6 +576,21 @@ of all rel