[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-19 Thread STINNER Victor


STINNER Victor  added the comment:

I am closing this issue until we can agree on an API.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-12 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8682




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-11 Thread STINNER Victor


STINNER Victor  added the comment:

Petr Viktorin asked me to open a wider discussion about this issue on 
python-dev. I just reverted my first change and posted:
https://mail.python.org/pipermail/python-dev/2018-September/155150.html

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-11 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 998b80636690ffbdb0a278810d9c031fad38631d by Victor Stinner in 
branch 'master':
Revert "bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)" (GH-9187)
https://github.com/python/cpython/commit/998b80636690ffbdb0a278810d9c031fad38631d


--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-11 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8625




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-11 Thread Petr Viktorin


Petr Viktorin  added the comment:

> The PEP 399 requires that C accelerator behaves exactly as Python, [...]

It does not. PEP 399 requires that the C code pass the same *test suite*. And
error messages in particular tend not to be checked in tests.
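
A minimal sketch of that point, assuming two hypothetical stand-in functions (py_impl and c_impl are not real CPython code): a shared test that checks only the exception type passes for both, even though the messages name the type differently.

```python
import unittest
from collections import OrderedDict


def py_impl(obj):
    # Stand-in for pure-Python code: reports the short type name.
    raise TypeError(f"expected int, got {type(obj).__name__}")


def c_impl(obj):
    # Stand-in for a C accelerator: reports a module-qualified name, as
    # Py_TYPE(obj)->tp_name can for types implemented in C.
    cls = type(obj)
    raise TypeError(f"expected int, got {cls.__module__}.{cls.__qualname__}")


class SharedTests(unittest.TestCase):
    def test_rejects_wrong_type(self):
        for impl in (py_impl, c_impl):
            with self.subTest(impl=impl.__name__):
                # Only the exception type is checked, not the message.
                with self.assertRaises(TypeError):
                    impl(OrderedDict())
```

A test would catch the difference only if it used something like assertRaisesRegex() on the message text.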

Anyway, I don't see how that applies to replacing `Py_TYPE(obj)->tp_name` by 
`%T`. The real reason for *that* change is removing borrowed references, right?
I have not yet seen a good reason why Py_TYPE(obj) is bad. The reasons you give 
in https://pythoncapi.readthedocs.io/bad_api.html#borrowed-references are about 
tagged pointers and PyList_GetItem(), but Py_TYPE() is very different.

I don't think the reasons are strong enough to add new API to 
PyUnicode_FromFormat().

--
nosy: +petr.viktorin




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-09 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

> in error messages

And in reprs. It is common to format a repr as "{typename}(...)" or 
"<{typename}(...)>". The difference is whether the typename is a short or fully 
qualified name.
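
For illustration only, the two repr styles could look like this in Python. Point and qualified_repr() are hypothetical names chosen for the sketch.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # "{typename}(...)" style, using the short name.
        return f"{type(self).__name__}({self.x!r}, {self.y!r})"

    def qualified_repr(self):
        # "<{typename}(...)>" style, using the module-qualified name.
        cls = type(self)
        return f"<{cls.__module__}.{cls.__qualname__}({self.x!r}, {self.y!r})>"


p = Point(1, 2)
short_form = repr(p)
qualified_form = p.qualified_repr()
```

The short form is stable wherever the class is defined. The qualified form changes with the defining module.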

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-09 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

> Ok, I wrote PR 9122 to add %t format and modify %T format:

Nice!

I agree that it is easy to use _PyType_Name() directly. But using 
_PyType_FullName() instead of tp_name can be cumbersome because it returns a 
new object and needs error handling.

> Or do you want to add a new formatter to type.__format__() to expose %T at 
> the Python level, f"{type(obj).__module__}.{type(obj).__qualname__}"?

Yes, I think we need a convenient way of formatting a fully qualified name that
omits the module name for types in the builtins module. It is equivalent to
Py_TYPE(obj)->tp_name for extension types, which is currently the most popular
way to format a type name in error messages.

There are several open issues about inconsistent error messages between the
Python and C implementations, because the former use type(obj).__name__ or
obj.__class__.__name__, and the latter use Py_TYPE(obj)->tp_name. I hope we
will finally fix this.

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-09 Thread STINNER Victor


STINNER Victor  added the comment:

> I think we need to handle only two cases: short and fully qualified names. 
> __qualname__ without __module__ doesn't make sense, and the value of tp_name 
> depends on implementation details (is it Python class or builtin class, heap 
> class or dynamic class?). Maybe use %t and %T?

Ok, I wrote PR 9122 to add %t format and modify %T format:

* %t <=> type(obj).__name__
* %T <=> f"{type(obj).__module__}.{type(obj).__qualname__}"
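
A rough Python-level emulation of these two codes, as a sketch only: format_with_type() is a hypothetical helper, not part of the PR, and it handles nothing but a literal "%t" or "%T" in the template.

```python
from collections import OrderedDict


def format_with_type(fmt, obj):
    """Hypothetical emulation of the proposed %t and %T codes."""
    cls = type(obj)
    short = cls.__name__                           # %t
    full = f"{cls.__module__}.{cls.__qualname__}"  # %T
    # str.replace() is case-sensitive, so the two codes do not collide.
    return fmt.replace("%t", short).replace("%T", full)


short_msg = format_with_type("'%t' object is not callable", OrderedDict())
full_msg = format_with_type("'%T' object is not callable", OrderedDict())
builtin_msg = format_with_type("'%T' object is not callable", 42)
```

With this literal reading of the %T definition, a builtin such as int comes out as "builtins.int".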


> But we may want to support formatting the name of the type itself and the 
> name of the object's type. This give us 4 variants.

Again, I'm not sure about these ones. _PyType_Name() can be used for %t-like
formatting directly on a type. Later we can expose _PyType_FullName() (a
function that I added in my latest PR) as a private function for %T-like
formatting directly on a type.


> For old string formatting we can introduce new % codes (with possible 
> modifiers). But in modern string formatting "T" can have meaning for some 
> types (e.g. for datetime). We can implement __format__ for the type type 
> itself (though it can cause confusion if cls.__format__() is different from 
> cls.__format__(instance)), but for formatting the name of  the object's type 
> (as in your original proposition) we need to add a new  conversion flag like 
> "!r".

I'm not sure that I understood correctly.

Do you want to add a third formatter in PyUnicode_FromFormat() which would use
Py_TYPE(obj)->tp_name? I dislike Py_TYPE(obj)->tp_name, since my intent is to
conform to PEP 399: tp_name is not accessible at the Python level, only
type(obj).__name__ and type(obj).__qualname__ are.

Or do you want to add a new formatter to type.__format__() to expose %T at the 
Python level, f"{type(obj).__module__}.{type(obj).__qualname__}"?
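
To make the type.__format__() idea concrete, here is a sketch that uses a metaclass rather than changing type itself. NameFormatMeta and the "t"/"T" format specs are hypothetical choices for illustration.

```python
class NameFormatMeta(type):
    """Hypothetical metaclass: format(cls, spec) returns a type name."""

    def __format__(cls, spec):
        if spec == "t":  # short name
            return cls.__name__
        if spec == "T":  # fully qualified name
            return f"{cls.__module__}.{cls.__qualname__}"
        return super().__format__(spec)


class Widget(metaclass=NameFormatMeta):
    pass


w = Widget()
# The caller still has to pass the type explicitly, not the instance.
message = f"unexpected {type(w):t} instance"

try:
    format(w, "t")
    instance_supported = True
except TypeError:
    # Formatting the instance does not go through the metaclass.
    instance_supported = False
```

Only the type object understands the spec. Formatting the instance needs either type(obj) spelled out or a new conversion flag.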

Currently, type(obj).__name__ is the most popular way to format a type name.
Would modifying *existing* error messages break backward compatibility?

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-09 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8576




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-08 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I think we need to handle only two cases: short and fully qualified names. 
__qualname__ without __module__ doesn't make sense, and the value of tp_name 
depends on implementation details (is it Python class or builtin class, heap 
class or dynamic class?). Maybe use %t and %T?

But we may want to support formatting the name of the type itself and the name
of the object's type. This gives us 4 variants.

For old string formatting we can introduce new % codes (with possible
modifiers). But in modern string formatting "T" can have meaning for some types
(e.g. for datetime). We can implement __format__ for the type type itself
(though it can cause confusion if cls.__format__() is different from
cls.__format__(instance)), but for formatting the name of the object's type
(as in your original proposition) we need to add a new conversion flag like
"!r".

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-08 Thread STINNER Victor


STINNER Victor  added the comment:

An alternative would be to add multiple formatters. Example:

* %Tn: type name, type.__name__, Py_TYPE(obj)->tp_name
* %Tq: qualified name, type.__qualname__
* %Ts: short name, never contains "."
* %Tf: fully qualified name, "module.qualified.name"
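
One possible Python-level reading of the four variants, sketched with a nested class so that the names actually differ. name_variants() is hypothetical, and mapping %Tn to __name__ is an assumption: in C, tp_name may also carry a module prefix.

```python
class Outer:
    class Inner:
        pass


def name_variants(obj):
    """Hypothetical Python equivalents of the four proposed codes."""
    cls = type(obj)
    return {
        "%Tn": cls.__name__,
        "%Tq": cls.__qualname__,
        # Short name: whatever follows the last dot, so never contains ".".
        "%Ts": cls.__qualname__.rpartition(".")[2],
        "%Tf": f"{cls.__module__}.{cls.__qualname__}",
    }


variants = name_variants(Outer.Inner())
```

In pure Python, %Tn and %Ts coincide under this reading. They could differ only where tp_name is module-qualified.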

What do you think, Serhiy?

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-07 Thread STINNER Victor


STINNER Victor  added the comment:

> In some cases we have the type itself, not an instance. So it makes sense to 
> make %T an equivalent of arg->tp_name instead of Py_TYPE(arg)->tp_name.

"arg->tp_name" is rare in the code base, whereas "Py_TYPE(arg)->tp_name" is a 
very common pattern.

I'm not sure that a formatter is needed for "arg->tp_name", you can already use 
"%s" with "arg->tp_name", no?

My intent was to make the C code less verbose, respect PEP 399, but also
indirectly avoid the implicit borrowed reference of Py_TYPE() :-)
https://pythoncapi.readthedocs.io/bad_api.html#borrowed-references


> On Python side, you need to output either short name obj.__class__.__name__ 
> or full qualified name obj.__class__.__module__ + '.' + 
> obj.__class__.__qualname__ with exception that the module name should be 
> omitted if it is 'builtins'.

Sometimes, the qualified name would be more appropriate, but it's sometimes 
tricky to decide if the short name, qualified name or fully qualified name is 
the "right" name... So I chose to restrict this issue to the most common case, 
Py_TYPE(arg)->tp_name :-)

Ah, and changing strings risks breaking backward compatibility. For example, it
could cause issues with pickle. So it should be discussed on a case-by-case
basis.

Moreover, if you want to change a string, the Python code should be updated as
well. I suggest opening a new issue to discuss that.

Don't get me wrong, I'm interested in making these changes, but it's a wider
project :-)


> obj.__class__.__qualname__ if obj.__class__.__module__ == 'builtins' else 
> f'{obj.__class__.__module__}.{obj.__class__.__qualname__}'
>
> The case of the module name '__main__' can be handled specially too.
> Obviously it is desirable to have a more handy way of writing such expression.

To be honest, I also considered proposing a second formatter to do something
like that :-) But as you explained, I'm not sure which name is the right one:
qualified or fully qualified (with module name)?

First of all, would it help to have a *function* to get these names? Maybe we 
could first use such functions before discussing adding a new formatter in 
PyUnicode_FromFormat()?

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-07 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

IIRC there was a similar issue or discussion on one of the mailing lists, where
the idea of adding this feature on the Python side too was considered. It would
be better to find this discussion before pushing this change.

In some cases we have the type itself, not an instance. So it makes sense to 
make %T an equivalent of arg->tp_name instead of Py_TYPE(arg)->tp_name.

On the Python side, you need to output either the short name
obj.__class__.__name__ or the fully qualified name
obj.__class__.__module__ + '.' + obj.__class__.__qualname__, with the exception
that the module name should be omitted if it is 'builtins'.

obj.__class__.__qualname__ if obj.__class__.__module__ == 'builtins' else 
f'{obj.__class__.__module__}.{obj.__class__.__qualname__}'

The case of the module name '__main__' can be handled specially too.
Obviously it is desirable to have a more handy way of writing such expression.
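
As a sketch, the expression above could be wrapped in a helper like this. fully_qualified_name() and its hide_modules parameter are hypothetical names, not an existing API.

```python
from collections import OrderedDict


def fully_qualified_name(obj, hide_modules=("builtins",)):
    """Return the type name of obj, omitting the module for some modules.

    hide_modules is a hypothetical parameter: pass ("builtins", "__main__")
    to treat '__main__' specially as well.
    """
    cls = type(obj)
    if cls.__module__ in hide_modules:
        return cls.__qualname__
    return f"{cls.__module__}.{cls.__qualname__}"


builtin_name = fully_qualified_name(1)
stdlib_name = fully_qualified_name(OrderedDict())
```

Here builtin_name is "int" and stdlib_name is "collections.OrderedDict".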

On the C side, the problem is that tp_name means different things depending on
the kind of type. In some cases it is a short name, in other cases it is a
fully qualified name. It is not easy to write code that produces the same
output in Python and C. I have added a helper, _PyType_Name(), that helps to
solve part of these issues. If you want to output a short name (just
cls.__name__ in Python), use _PyType_Name(cls) instead of cls->tp_name. But
this doesn't help when you need to output a fully qualified name.
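
A Python sketch of the short-name idea, assuming the helper simply keeps whatever follows the last dot of tp_name. short_name_from_tp_name() is a hypothetical function and the sample tp_name strings are illustrative.

```python
def short_name_from_tp_name(tp_name):
    """Keep only the text after the last dot, if there is one."""
    return tp_name.rpartition(".")[2]


examples = {
    # A module-qualified tp_name, as some types implemented in C use.
    "collections.OrderedDict": short_name_from_tp_name("collections.OrderedDict"),
    # A tp_name that is already short, as for a class defined in Python.
    "MyClass": short_name_from_tp_name("MyClass"),
}
```

Going the other way, from a short tp_name to a fully qualified name, needs the module from somewhere else, which is the part this does not solve.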

--
nosy: +serhiy.storchaka




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-07 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8557




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-07 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 886483e2b9bbabf60ab769683269b873381dd5ee by Victor Stinner in 
branch 'master':
bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)
https://github.com/python/cpython/commit/886483e2b9bbabf60ab769683269b873381dd5ee


--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-06 Thread Eric V. Smith


Eric V. Smith  added the comment:

I don't think you have to worry about %T being used by other formatting 
functions. If (heaven forbid) dates were ever supported by 
PyUnicode_FromFormat(), there would have to be a way to switch from "normal" 
argument processing to argument-specific formatting specifiers, anyway.

--
nosy: +eric.smith




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-06 Thread STINNER Victor


STINNER Victor  added the comment:

Oh, PyUnicode_FromFormat() has %A to format as ASCII, whereas printf() already 
has %A but for a different meaning:

   a, A   For a conversion, the double argument is converted to hexadecimal
   notation (using the letters abcdef) in the style [-]0xh.p+-; for A
   conversion the prefix 0X, the letters ABCDEF, and the exponent separator P
   is used. (...)

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-06 Thread STINNER Victor


STINNER Victor  added the comment:

I cannot find %T in printf() manual pages on Fedora 28 (Linux).

I can find it in the strftime() documentation:

   %T The time in 24-hour notation (%H:%M:%S).  (SU)

But I don't think that it's an issue since printf() and strftime() formatters 
are exclusive, no? For example, strftime() %s means "number of seconds since 
the Epoch" whereas printf() %s means a "const char*" byte string.
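
The same separation exists at the Python level. As an analogy only, the sketch uses %d because Python's time.strftime() does not portably support %T or %s.

```python
import time

# printf-style formatting: %d means "format an integer".
as_printf = "%d" % 42

# strftime-style formatting: %d means "day of the month".
epoch = time.gmtime(0)  # 1970-01-01 00:00:00 UTC
as_strftime = time.strftime("%d", epoch)
```

Each function has its own mini-language, so reusing a letter in one does not affect the other.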

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-06 Thread STINNER Victor


STINNER Victor  added the comment:

My previous attempt to fix that issue, 7 years ago: bpo-10833 :-)

See also bpo-7330 where I implemented width and precision (ex: "%5.3s") in 
PyUnicode_FromFormat().

--




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-06 Thread STINNER Victor


Change by STINNER Victor :


--
keywords: +patch
pull_requests: +8539
stage:  -> patch review




[issue34595] PyUnicode_FromFormat(): add %T format for an object type name

2018-09-06 Thread STINNER Victor


New submission from STINNER Victor :

PEP 399 requires that a C accelerator behave exactly like the Python code, but
a lot of C code truncates the type name to an arbitrary length: 80, 100, 200,
up to 500 (not sure if it's a number of bytes or characters).
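
A small Python sketch of the divergence, assuming a C message built with a "%.80s"-style precision. The class name and message text are made up for illustration.

```python
# Build a class whose name is longer than the cap (26 * 5 = 130 characters).
class_name = "VeryLongGeneratedClassName" * 5
LongType = type(class_name, (), {})
obj = LongType()

# C-style message: the precision caps the name at 80 characters.
c_style = "expected str, got %.80s" % type(obj).__name__

# Python-style message: no cap, the full name is shown.
py_style = f"expected str, got {type(obj).__name__}"
```

The two messages agree for ordinary names and differ only once the name exceeds the cap.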

Py_TYPE(obj)->tp_name is a common pattern: it would be nice to have a new "%T" 
format in PyUnicode_FromFormat(), so it can be used directly in PyErr_Format(), 
to format an object type name.

The attached PR implements the proposed %T format and modifies unicodeobject.c
to use it.

I propose to then write a second PR to modify all C code of CPython using
Py_TYPE(obj)->tp_name to use the new %T format.

--
components: Interpreter Core
messages: 324675
nosy: vstinner
priority: normal
severity: normal
status: open
title: PyUnicode_FromFormat(): add %T format for an object type name
versions: Python 3.8
