Re: [Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393

2018-04-19 Thread INADA Naoki
>
> I suppose that many users will start porting to Python 3 only in 2020, after
> the 2.7 EOL. After that time we shouldn't support compatibility with 2.7 and
> can start emitting deprecation warnings at runtime. One or two releases after
> that, we can make the corresponding public APIs always fail and remove the
> private APIs and data fields.
>

Python 3.8 is planned to be released on 2019-10-20, just before the 2.7 EOL.
My current thought is:

* In 3.8, we make sure the deprecated APIs emit warnings (at compile time if
  possible, at runtime otherwise).

* If the deprecation is adopted smoothly, drop them in 3.9 (mid 2021).
  Otherwise, removal is postponed to 3.10 (late 2023).

>
> There are other functions which expect that data is aligned to sizeof(long)
> or 8 bytes.
>
> Siphash hashing is special because it is called not just for strings and
> bytes, but for memoryview, which doesn't guarantee any alignment.
>

Oh, I'm sad to hear that...

> Note that after removing the wchar_t* field the gap will not go away, because
> the size of the structure should be a multiple of the alignment of the first
> field (which is a pointer).

Of course, we would need a hack for packing.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393

2018-04-18 Thread Serhiy Storchaka

13.04.18 16:27, INADA Naoki wrote:
> Then, I want to reschedule the removal of these APIs.
> Can we remove them in 3.8, 3.9, or 3.10?
> I prefer as soon as possible.


I suppose that many users will start porting to Python 3 only in 2020,
after the 2.7 EOL. After that time we shouldn't support compatibility with
2.7 and can start emitting deprecation warnings at runtime. One or two
releases after that, we can make the corresponding public APIs always fail
and remove the private APIs and data fields.



> Slightly off topic, there is a 4-byte alignment gap in the unicode object
> on 64-bit platforms.
>
> typedef struct {
>     ...
>     struct {
>         unsigned int interned:2;
>         unsigned int kind:3;
>         unsigned int compact:1;
>         unsigned int ascii:1;
>         unsigned int ready:1;
>         unsigned int :24;
>     } state;  // 4 bytes
>
>     // implicit 4-byte gap here
>
>     wchar_t *wstr;  // 8 bytes
> } PyASCIIObject;
>
> So, I think we can save 12 bytes instead of 8 when removing wstr.
> Or we can save 4 bytes right away by moving `wstr` before `state`.
>
> Of course, that requires siphash to support 4-byte-aligned data instead of
> 8-byte-aligned data.


There are other functions which expect that data is aligned to 
sizeof(long) or 8 bytes.


Siphash hashing is special because it is called not just for strings and 
bytes, but for memoryview, which doesn't guarantee any alignment.
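
To illustrate the alignment concern: code that consumes its input in 8-byte words can avoid assuming any alignment by copying each word out with memcpy. The sketch below is not CPython's siphash implementation. `load_u64` and `fold_words` are hypothetical names, the XOR fold is only a toy stand-in for a real mixing step, and words are read in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a CPython function): read 8 bytes from an
 * address with no alignment guarantee. memcpy has no alignment
 * requirement, so this is well defined for any valid pointer. */
static uint64_t load_u64(const void *p)
{
    uint64_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Toy stand-in for a hash loop: XOR together every complete 8-byte word
 * of the buffer. Trailing bytes (fewer than 8) are ignored here. */
static uint64_t fold_words(const unsigned char *data, size_t len)
{
    uint64_t acc = 0;
    for (size_t i = 0; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        acc ^= load_u64(data + i);
    }
    return acc;
}
```

Casting `data + i` to `const uint64_t *` and dereferencing it would instead require 8-byte alignment, which a memoryview buffer does not promise.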


Note that after removing the wchar_t* field the gap will not go away,
because the size of the structure should be a multiple of the alignment
of the first field (which is a pointer).
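
This rounding can be checked with sizeof and offsetof. The sketch below uses a simplified stand-in for the object with `wstr` already removed. The header field names and the use of `ptrdiff_t` in place of `Py_ssize_t` and `Py_hash_t` are assumptions for illustration, not the real CPython definitions. On typical 64-bit ABIs the 4 bytes after `state` turn into trailing padding rather than disappearing.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Simplified stand-in for PyASCIIObject with wstr removed.
 * The first four fields only approximate the real object header. */
typedef struct {
    ptrdiff_t refcnt;   /* stand-in for the reference count */
    void *type;         /* stand-in for the type pointer */
    ptrdiff_t length;
    ptrdiff_t hash;
    struct {
        unsigned int interned:2;
        unsigned int kind:3;
        unsigned int compact:1;
        unsigned int ascii:1;
        unsigned int ready:1;
        unsigned int :24;
    } state;
} AsciiNoWstr;

/* The size of a struct is always a multiple of its own alignment, and
 * that alignment is at least the alignment of the pointer member. */
static_assert(sizeof(AsciiNoWstr) % alignof(AsciiNoWstr) == 0,
              "struct size is a multiple of its alignment");

/* Bytes of padding the compiler adds after the last member. */
static size_t trailing_padding(void)
{
    size_t end_of_state = offsetof(AsciiNoWstr, state)
                        + sizeof(((AsciiNoWstr *)0)->state);
    return sizeof(AsciiNoWstr) - end_of_state;
}
```

With 8-byte pointers and a 4-byte `unsigned int`, `state` ends at offset 36 and the struct is rounded up to 40, so `trailing_padding()` is 4.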




[Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393

2018-04-13 Thread INADA Naoki
Hi,

PEP 393 [1] deprecates some Unicode APIs relating to Py_UNICODE.
The PEP doesn't provide a schedule for removing them, but the APIs are
marked "will be removed in 4.0" in the documentation.
Once they are removed, we can drop the `wchar_t *` member from the unicode
object. It takes 8 bytes on 64-bit platforms.

[1]: "Flexible String Representation" https://www.python.org/dev/peps/pep-0393/


I thought Python 4.0 would be the version after 3.9, but Guido has a different
idea. He said the following in the Zulip chat (we're trying it out for now).

> No, 4.0 is not just what comes after 3.9 -- the major number change would 
> indicate some kind of major change somewhere (like possibly the Gilectomy, 
> which changes a lot of the C APIs). If we have more than 10 3.x versions, 
> we'll just live with 3.10, 3.11 etc.


And about these APIs, he said:

>> Unicode objects have some "Deprecated since version 3.3, will be removed in
>> version 4.0" APIs (PEP 393).
>> When removing them, we can reduce the PyUnicode size by about 8-12 bytes.
>
> We should be able to deprecate these sooner by updating the docs.


So I want to reschedule the removal of these APIs.
Can we remove them in 3.8, 3.9, or 3.10?
I prefer as soon as possible.

---

Slightly off topic, there is a 4-byte alignment gap in the unicode object
on 64-bit platforms.

typedef struct {
    ...
    struct {
        unsigned int interned:2;
        unsigned int kind:3;
        unsigned int compact:1;
        unsigned int ascii:1;
        unsigned int ready:1;
        unsigned int :24;
    } state;  // 4 bytes

    // implicit 4-byte gap here

    wchar_t *wstr;  // 8 bytes
} PyASCIIObject;

So, I think we can save 12 bytes instead of 8 when removing wstr.
Or we can save 4 bytes right away by moving `wstr` before `state`.

Of course, that requires siphash to support 4-byte-aligned data instead of
8-byte-aligned data.
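
The gap can be measured with offsetof. This is only a sketch: the header fields (and `ptrdiff_t` standing in for `Py_ssize_t` and `Py_hash_t`) are assumed stand-ins for the elided part of the struct, and `AsciiCurrent`, `AsciiReordered`, and the helper functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

struct state_bits {
    unsigned int interned:2;
    unsigned int kind:3;
    unsigned int compact:1;
    unsigned int ascii:1;
    unsigned int ready:1;
    unsigned int :24;
};

/* Current order: state, then wstr. Header fields are stand-ins. */
typedef struct {
    ptrdiff_t refcnt;
    void *type;
    ptrdiff_t length;
    ptrdiff_t hash;
    struct state_bits state;
    wchar_t *wstr;
} AsciiCurrent;

/* Proposed order: wstr moved before state. */
typedef struct {
    ptrdiff_t refcnt;
    void *type;
    ptrdiff_t length;
    ptrdiff_t hash;
    wchar_t *wstr;
    struct state_bits state;
} AsciiReordered;

/* Padding between the end of state and the start of wstr. */
static size_t gap_before_wstr(void)
{
    return offsetof(AsciiCurrent, wstr)
         - (offsetof(AsciiCurrent, state) + sizeof(struct state_bits));
}

/* Offset right after state in the reordered layout, i.e. where character
 * data could begin if it were packed directly behind the struct fields. */
static size_t packed_end_reordered(void)
{
    return offsetof(AsciiReordered, state) + sizeof(struct state_bits);
}
```

With 8-byte pointers and a 4-byte `unsigned int`, `gap_before_wstr()` is 4 and `packed_end_reordered()` is 44, while sizeof reports 48 for both layouts. The 4 bytes are saved only if the character data starts at offset 44, which is 4-byte aligned but not 8-byte aligned.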

Regards,
-- 
INADA Naoki  