[issue23980] Documentation for format units starting with 'e' is inconsistent

2020-04-19 Thread Zackery Spytz


Change by Zackery Spytz :


--
versions: +Python 3.7, Python 3.8, Python 3.9 -Python 2.7, Python 3.4, Python 
3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2020-04-19 Thread Zackery Spytz


Change by Zackery Spytz :


--
keywords: +patch
nosy: +ZackerySpytz
nosy_count: 7.0 -> 8.0
pull_requests: +18944
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/19610

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2017-02-06 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +haypo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2017-02-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

LGTM.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2017-01-28 Thread Martin Panter

Martin Panter added the comment:

The O! and O& units are in a similar situation. They just use a different font 
and descriptive name, rather than a specific type:

``O!`` (object) [*typeobject*, PyObject \*]
``O&`` (object) [*converter*, *anything*]

Following this lead, you could write:

``es`` (:class:`str`) [*encoding*, char \*buffer]
``et`` (. . .) [*encoding*, char \*buffer]
``es#`` (:class:`str`) [*encoding*, char \*buffer, int buffer_length]
``et#`` (. . .) [*encoding*, char \*buffer, int buffer_length]

The text description should explain what *encoding* is, but it appears it may 
already do that well enough.

--
nosy: +martin.panter

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2015-04-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 17.04.2015 05:50, Larry Hastings wrote:
 
 Larry Hastings added the comment:
 
 The e variants (typically) allocate a buffer for you, since it's pretty
 much unknown how long the encoded data will be.
 
 All four will do it if you pass in a NULL pointer.  es# and et# can reuse
 an existing buffer, because you can also pass in its size.

Right.

 So I guess the e descriptions need to have the additional * removed
 or the paragraph has to be updated and all other listings need
 to be converted to precise types (that would be my preference).
 
 Here's the problem with removing the additional *.  The first argument
 to encoding is a static string.  You literally pass in the char * string,
 not a pointer to a variable containing the address of the string.  e.g.
 
   PyArg_ParseTuple(es, args, utf-8, buffer);
 
 So how do we annotate that?  [char, char *]?

A literal static string is really a pointer as well. The compiler
will allocate and initialize the string and then provide a pointer
to reference it. It is also possible to pass in the pointer
directly - it just needs to be a const char *, i.e. one who's target
value doesn't change.

So for es this would have to be [const char, char *], but I like
your  idea more: [ const char *,  char **]

 I wonder why no one has noticed in all these years.
 
 Because nobody ever freaking uses the e format units.  But by golly
 I'm going to see that the docs are right and Clinic supports them
 correctly--or die trying.

It's not used much, but there are a few cases in the posixmodule and
the Mac modules of the Python 2.7 stdlib, as well as in extension modules,
of course. The Python 3.4 stdlib does not seem to use them anymore
(the uses were replaced with custom solutions or the modules removed).

I guess people never found out about those parser markers or
simply always fetch Unicode as object rather than as encoded
string.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23980
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2015-04-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 17.04.2015 06:42, Larry Hastings wrote:
 
 Larry Hastings added the comment:
 
 One more idea.  We annotate with an  when you pass in a pointer to a 
 variable.  So format unit 'i' would get [ int], 's' would get [ char *], 
 and 'es#' would get [char *,  char *,  int].

That sounds like a nice solution.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23980
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2015-04-16 Thread Larry Hastings

Larry Hastings added the comment:

One more idea.  We annotate with an  when you pass in a pointer to a variable. 
 So format unit 'i' would get [ int], 's' would get [ char *], and 'es#' 
would get [char *,  char *,  int].

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23980
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2015-04-16 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 17.04.2015 01:38, Larry Hastings wrote:
 
 Documentation is here:
 
 https://docs.python.org/3/c-api/arg.html#arg-parsing
 
 
 The first line of documentation for each format unit follows this convention:
 formatunit (pythontype) [arguments, to, pyarg_parsetuple]
 
 These represent the format unit itself, followed by the Python type it 
 consumes in parentheses, followed by the C types it outputs in square 
 brackets.  Thus
 i (int) [int]
 means the format unit is 'i', it consumes a Python 'int', and it produces a C 
 'int'.  Similarly,
 s (str) [const char *]
 means the format unit is 's', it consumes a Python 'str', and it produces a C 
 'const char *'.
 
 When you call PyArg_ParseTuple (AndKeywords), you pass in a pointer to the 
 thing you expect.  If it gives you an int, you pass in my_int. So the type 
 of the expression you pass in for 'i' is actually int *.  And the type you 
 pass in for 's' is actually char **.
 
 The format units that deal with encodings are a bit weirder.  You actually 
 pass in a const char * string first, followed by the buffer you want to write 
 data too.  Technically the types of the values you pass in for es are 
 const char *, char **.  But the documentation for es says
 es (str) [const char *encoding, char **buffer]

You need to pass in a variable which will then be set up to point to a
buffer which will be written too :-)

The e variants (typically) allocate a buffer for you, since it's pretty
much unknown how long the encoded data will be.

 This led me to believe that I actually had to pass in a char *** for 
 buffer!  Which is wrong and doing so makes your programs explode-y.

Indeed :-)

 The documentation should
 
 * explain this first-line convention precisely, and
 
 * use the types consistently.
 
 My suspicion is that the things in brackets have to be the precise C type, 
 e.g. int * for i, char ** for s, const char *, char ** for es.

The paragraph under Parsing argument says:


In the following description, the quoted form is the format unit; the entry in 
(round) parentheses
is the Python object type that matches the format unit; and the entry in 
[square] brackets is the
type of the C variable(s) whose address should be passed.


So I guess the e descriptions need to have the additional * removed
or the paragraph has to be updated and all other listings need
to be converted to precise types (that would be my preference).

I wonder why no one has noticed in all these years. I apparently had
understood the listings back to be precise C types back in the days
I added the documentation for the e codes:
https://hg.python.org/cpython-fullhistory/rev/3ae06c57d09e).

The descriptions for the codes do clarify what is going on, though.

--
nosy: +lemburg

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23980
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2015-04-16 Thread Larry Hastings

New submission from Larry Hastings:

Documentation is here:

https://docs.python.org/3/c-api/arg.html#arg-parsing


The first line of documentation for each format unit follows this convention:
formatunit (pythontype) [arguments, to, pyarg_parsetuple]

These represent the format unit itself, followed by the Python type it consumes 
in parentheses, followed by the C types it outputs in square brackets.  Thus
i (int) [int]
means the format unit is 'i', it consumes a Python 'int', and it produces a C 
'int'.  Similarly,
s (str) [const char *]
means the format unit is 's', it consumes a Python 'str', and it produces a C 
'const char *'.

When you call PyArg_ParseTuple (AndKeywords), you pass in a pointer to the 
thing you expect.  If it gives you an int, you pass in my_int. So the type of 
the expression you pass in for 'i' is actually int *.  And the type you pass 
in for 's' is actually char **.

The format units that deal with encodings are a bit weirder.  You actually pass 
in a const char * string first, followed by the buffer you want to write data 
too.  Technically the types of the values you pass in for es are const char 
*, char **.  But the documentation for es says
es (str) [const char *encoding, char **buffer]

This led me to believe that I actually had to pass in a char *** for buffer!  
Which is wrong and doing so makes your programs explode-y.


The documentation should

* explain this first-line convention precisely, and

* use the types consistently.

My suspicion is that the things in brackets have to be the precise C type, e.g. 
int * for i, char ** for s, const char *, char ** for es.

--
assignee: larry
components: Documentation
messages: 241292
nosy: benjamin.peterson, georg.brandl, larry
priority: normal
severity: normal
stage: needs patch
status: open
title: Documentation for format units starting with 'e' is inconsistent
type: enhancement
versions: Python 2.7, Python 3.4, Python 3.5

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23980
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23980] Documentation for format units starting with 'e' is inconsistent

2015-04-16 Thread Larry Hastings

Larry Hastings added the comment:

 The e variants (typically) allocate a buffer for you, since it's pretty
 much unknown how long the encoded data will be.

All four will do it if you pass in a NULL pointer.  es# and et# can reuse
an existing buffer, because you can also pass in its size.

 So I guess the e descriptions need to have the additional * removed
 or the paragraph has to be updated and all other listings need
 to be converted to precise types (that would be my preference).

Here's the problem with removing the additional *.  The first argument
to encoding is a static string.  You literally pass in the char * string,
not a pointer to a variable containing the address of the string.  e.g.

  PyArg_ParseTuple(es, args, utf-8, buffer);

So how do we annotate that?  [char, char *]?

 I wonder why no one has noticed in all these years.

Because nobody ever freaking uses the e format units.  But by golly
I'm going to see that the docs are right and Clinic supports them
correctly--or die trying.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23980
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com