[issue4751] Patch for better thread support in hashlib

2009-01-03 Thread ebfe

Changes by ebfe knabberknusperh...@yahoo.de:


Removed file: http://bugs.python.org/file12557/md5module_small_locks.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue4818] Patch for thread-support in md5module.c

2009-01-03 Thread ebfe

New submission from ebfe knabberknusperh...@yahoo.de:

Here is another patch, this time for the fallback md5 module. I know
that situations where OpenSSL is not present but threading is are rare.
However, they might occur out there, and the md5module needed some love
anyway:

- The MD5 class from the fallback module can now also use threads with
'small locks'.
- The behaviour regarding unicode data input is now consistent with
what the OpenSSL-driven classes do.
- Some code cleanup.


I might work on the sha modules as well over the next few days. sha256.c
still accepts 's#'...


Also see issue #4751

--
files: md5module_small_locks.diff
keywords: patch
messages: 78947
nosy: ebfe
severity: normal
status: open
title: Patch for thread-support in md5module.c
Added file: http://bugs.python.org/file12565/md5module_small_locks.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4818
___



[issue4751] Patch for better thread support in hashlib

2009-01-03 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

Haypo, we can probably reduce overhead by defining ENTER_HASHLIB like this:

#define ENTER_HASHLIB(obj) \
    if ((obj)->lock) { \
        if (!PyThread_acquire_lock((obj)->lock, 0)) { \
            Py_BEGIN_ALLOW_THREADS \
            PyThread_acquire_lock((obj)->lock, 1); \
            Py_END_ALLOW_THREADS \
        } \
    }
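
The structure of this macro (try a non-blocking acquire first, and only take the expensive path when the lock is contended) can be sketched in Python with threading.Lock. This is only an analogy: Python code cannot release the GIL explicitly, and the helper name acquire_fast_then_slow is hypothetical, not part of any patch.

```python
import threading

def acquire_fast_then_slow(lock):
    """Acquire `lock` and report which path was taken.

    Mirrors the macro's shape: a non-blocking attempt first,
    then a blocking wait only if the lock is contended.
    """
    if lock.acquire(blocking=False):
        return "fast"
    lock.acquire()  # blocking wait; the C macro releases the GIL around this
    return "slow"

lock = threading.Lock()
first = acquire_fast_then_slow(lock)    # uncontended, lock is now held

# Release the lock from a timer thread so the second call has to wait.
threading.Timer(0.05, lock.release).start()
second = acquire_fast_then_slow(lock)   # contended
lock.release()
```

In the uncontended case only the cheap attempt runs, which is the overhead reduction the macro is aiming for.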

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4738] Patch to make zlib-objects better support threads

2009-01-02 Thread ebfe

Changes by ebfe knabberknusperh...@yahoo.de:


Removed file: http://bugs.python.org/file12466/zlib_threads-2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___



[issue4738] Patch to make zlib-objects better support threads

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

Here is a small test-script with concurrent access to a single
compressobj. The original patch will immediately deadlock.

The patch attached releases the GIL before trying to get the zlib-lock.
This allows the other thread to release the zlib-lock but comes at the
cost of one additional GIL lock/unlock.
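
A stress test of this kind might look roughly like the sketch below. It is not the attached zlibtest2.py; the thread count, chunk size, round count and join timeout are arbitrary assumptions chosen for illustration.

```python
import threading
import zlib

shared = zlib.compressobj()     # one object, deliberately shared by all threads
chunk = b"x" * 65536
produced = []                   # sizes of the returned chunks; order is irrelevant

def worker(rounds=50):
    for _ in range(rounds):
        produced.append(len(shared.compress(chunk)))

threads = [threading.Thread(target=worker, daemon=True) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=30)          # a deadlock shows up as a thread that never finishes

deadlocked = any(t.is_alive() for t in threads)
print("deadlocked" if deadlocked else "all threads finished")
```

The compressed output is interleaved across threads and not meant to be decompressed; the script only checks that every call returns.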

Added file: http://bugs.python.org/file12531/zlib_threads-3.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___



[issue4738] Patch to make zlib-objects better support threads

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

test-script

Added file: http://bugs.python.org/file12532/zlibtest2.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

Releasing the GIL is somewhat expensive and should be avoided if
possible. I've moved LEAVE_HASHLIB in EVP_update so the object gets
unlocked before we call Py_END_ALLOW_THREADS. This is *only* possible
because EVP_update does not use the object beyond those lines.

Here is a new patch and a small test-script.

Added file: http://bugs.python.org/file12533/hashopenssl_threads-4.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

test-script

Added file: http://bugs.python.org/file12534/hashlibtest2.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

Changes by ebfe knabberknusperh...@yahoo.de:


Removed file: http://bugs.python.org/file12461/hashopenssl_threads-3.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

gnarf, actually it should be 'threads.append(Hasher(md))' in the script :-\

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

I don't think this is actually worth the trouble. You run into situations
where one thread might decide that it needs a lock now while other
threads are already in the to-be-locked area.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

I don't think so.

The interface should stay simple - python has very few such magic knobs.
People will optimize for their own box as you said - and that code will
run worse on all the others...

Besides, we've lived so long with single-threaded openssl. Let's make
HASHLIB_GIL_MINSIZE such that there is no risk of additional overhead
introduced by this patch and refer to its current value in the
hashlib module's documentation.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

haypo, the patch will not compile when WITH_THREADS is not defined. The
'lock' member in the object structure is not present without
WITH_THREADS; however, the line 'if (self->lock == NULL && view.len >=
HASHLIB_GIL_MINSIZE)' will always refer to it.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

Here is another patch, this time for the fallback md5 module. I know
that situations where OpenSSL is not present but threading is are rare.
However, they might occur out there, and the md5module needed some love
anyway:

- The MD5 class from the fallback module can now also use threads with
'small locks'.
- The behaviour regarding unicode data input is now consistent with
what the OpenSSL-driven classes do.
- Some code cleanup.


I might work on the sha modules as well over the next few days. sha256.c
still accepts 's#'...

Added file: http://bugs.python.org/file12557/md5module_small_locks.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4746] Misguiding wording 3.0 c-api reference

2008-12-29 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

Whenever the documentation says "you must not", it really says "don't do
that or your application *will* crash, burn and die"... Of course I can
allocate storage for the string, copy its content and then free or -
nothing will happen. How would it cause a crash - it's my own pointer.

That's exactly the line between "not required to", "should not" and
"must not": The current wording suggests that I may not even touch e.g.
malloc, which is confusing and in fact to be ignored in its current state.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4746
___



[issue4757] reject unicode in zlib

2008-12-27 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

I don't think Python 2.x should be changed - but 3.0 or 3.1 should be:

 - Characters don't mean a thing in zlib-land, all operations are based
on bytes and their (implicit) default encoding. This behaviour is hidden
and somewhat violates the rule of least surprise.
 - type(zlib.decompress(zlib.compress('abc'))) == bytes anyway
 - Changing from s* to y* forces the programmer to use .encode() on their
strings (e.g. zlib.compress('abc'.encode())), which very clearly shows
what's happening. If you want to compress and decompress Python 3
strings, you *must* share the same character encoding; think of
zlib.compress('hôńè') and str(zlib.decompress(x)) with different locales.
 - Other modules (hashlib comes to my mind...) already reject Unicode
objects for the same argument.
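
The encoding point can be illustrated with a short sketch. It assumes a current Python 3 interpreter, where zlib.compress() does reject str; the choice of UTF-8 and Latin-1 is only for illustration.

```python
import zlib

text = "hôńè"

try:
    zlib.compress(text)             # str is rejected; bytes are required
    rejected = False
except TypeError:
    rejected = True

# Both sides have to agree on the encoding explicitly.
packed = zlib.compress(text.encode("utf-8"))
restored = zlib.decompress(packed).decode("utf-8")

# Decoding with a different encoding silently yields different text.
wrong = zlib.decompress(packed).decode("latin-1")
```

With the explicit .encode() and .decode() calls the mismatch is visible in the code instead of being hidden behind a default encoding.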

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4757
___



[issue4732] Object allocation stress leads to segfault on RHEL

2008-12-27 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

I can't reproduce the problem here.

Python 2.5.2 running on Linux lueg-desktop 2.6.24-22-generic #1 SMP Mon
Nov 24 18:32:42 UTC 2008 i686 GNU/Linux

--
nosy: +ebfe

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4732
___



[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe

New submission from ebfe knabberknusperh...@yahoo.de:

The hashlib functions provided by _hashopenssl.c hold the GIL all the
time although the underlying openssl-library is basically thread-safe.
I've attached a patch (svn diff) which basically does four things:

* If Python is compiled with thread support, the EVPobject is extended
by an additional PyThread_type_lock which protects each object individually.
* The 'update' function releases the GIL if the to-be-hashed object is a
Bytes object and therefore provides trustworthy locking (all other types,
including subclasses, are not trustworthy!). This allows multiple
threads to do hashing in parallel.
* The EVP_hash function removes duplicated code.
* The situation regarding unicode objects is now more meaningful. Upon
passing a unicode string to the .update() function, the original hashlib
throws "TypeError: object supporting the buffer API required", which is
confusing. I think it's perfectly valid not to accept unicode strings as
input, and people should be required to call str.encode() on their strings
before hashing, so that a well-defined byte representation of their strings
gets hashed. Therefore I patched the MY_GET_BUFFER_VIEW_OR_ERROUT macro to
throw "TypeError: Unicode-objects must be encoded before hashing". This
also fixes issue #1118.


I've tested this patch and did not run into problems. CPU occupancy
relies on the buffer-size passed to .update() as releasing the GIL is
basically not worth the effort for very small buffers. More testing may
be needed...
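
From the Python side, the behaviour for unicode input described above can be sketched as follows. The sketch assumes a current Python 3 interpreter and uses sha256 purely as an example; the exact wording of the TypeError message varies between versions, so only the exception type is checked.

```python
import hashlib

h = hashlib.sha256()

try:
    h.update("abc")                 # str input is rejected
    raised = False
except TypeError:
    raised = True

# Encoding first gives a well-defined byte representation to hash.
h.update("abc".encode("utf-8"))
digest = h.hexdigest()
```

The failed update() call leaves the hash state untouched, so the digest is that of the three bytes b'abc'.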

--
components: Library (Lib)
files: hashopenssl_threads.diff
keywords: patch
messages: 78297
nosy: ebfe
severity: normal
status: open
title: Patch for better thread support in hashlib
type: performance
versions: Python 3.0
Added file: http://bugs.python.org/file12453/hashopenssl_threads.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe

Changes by ebfe knabberknusperh...@yahoo.de:


Removed file: http://bugs.python.org/file12453/hashopenssl_threads.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

Thanks for the advice.

Antoine, maybe you could clarify the situation regarding buffer locks
for me. In older versions of PEP 3118 the PyBUF_LOCK flag was still
present, but it doesn't seem to have made its way into the final draft.
Is it safe to assume that a buffer view will not change until release()
is called, for all types supporting the buffer protocol in py3k?

I've done some testing and the overhead of releasing and re-locking the
GIL is definitely a performance problem when trying to hash many small
strings (doubled runtime for 100,000 times b'abc'). I've taken on
haypo's patch to release the GIL only when the buffer is larger than 10kb.

Added file: http://bugs.python.org/file12461/hashopenssl_threads-3.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4738] Patch to make zlib-objects better support threads

2008-12-26 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

new svn diff attached

- GIL is now released for adler32 and crc32 if the buffer is larger than
5kb (we don't want to risk burning CPU cycles on GIL handling)
- adler32 got its param by s# but now uses s* - why s# anyway?
- ENTER_ZLIB no longer gives away the GIL. It's dangerous and useless as
there is no pressure on the object's lock.
- deflateCopy() and inflateCopy() are not worth the trouble.

Added file: http://bugs.python.org/file12463/zlib_threads-2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___



[issue4738] Patch to make zlib-objects better support threads

2008-12-26 Thread ebfe

Changes by ebfe knabberknusperh...@yahoo.de:


Removed file: http://bugs.python.org/file12448/zlib_threads.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___



[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

Here is another simple benchmarker. For me it shows almost perfect
scaling (2 cores = 196% performance) if the buffer put into .update() is
large enough.

I deliberately did not move Py_BEGIN_ALLOW_THREADS into EVP_hash as we
might call this function without having some lock on the input buffer.

The 10kb limit was based on my own computer (MacBook Pro 2x2.5GHz) and
is somewhat more-safe-than-sorry.
Hashing is *very* fast on modern CPUs and working on many small strings
becomes very inefficient when releasing the GIL all the time. Just try
to hash 10240 bytes vs. 10241 bytes.
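
A benchmark along these lines might be sketched as follows. This is not the attached hashlibtest.py; the buffer size, round count, thread counts and the use of sha256 are assumptions, and the timings depend entirely on the machine and interpreter.

```python
import hashlib
import threading
import time

def hash_many(data, rounds, out, idx):
    """Feed `data` into one hash object `rounds` times and store the digest."""
    h = hashlib.sha256()
    for _ in range(rounds):
        h.update(data)
    out[idx] = h.hexdigest()

def run(nthreads, data, rounds):
    """Run `nthreads` independent hashers and return (elapsed seconds, digests)."""
    out = [None] * nthreads
    threads = [threading.Thread(target=hash_many, args=(data, rounds, out, i))
               for i in range(nthreads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, out

big = b"\0" * (1024 * 1024)         # large enough for the GIL to be released
t1, d1 = run(1, big, 20)
t2, d2 = run(2, big, 20)
print(f"1 thread: {t1:.3f}s, 2 threads: {t2:.3f}s for twice the work")
```

If the second run takes about as long as the first while doing twice the work, the hashing is scaling across cores; rerunning with a buffer of a few bytes and many more rounds shows the small-buffer case.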

Added file: http://bugs.python.org/file12465/hashlibtest.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___



[issue4738] Patch to make zlib-objects better support threads

2008-12-26 Thread ebfe

ebfe knabberknusperh...@yahoo.de added the comment:

new svn diff attached

the indentation in this file is not my fault, it has tabs all over it...

The 5kb limit protects against the overhead of releasing the GIL. With
very small buffers the overall runtime in my benchmark tends to double.
I set it based on my testing and it remains arbitrary to a certain
degree. Set the limit to 1 and try 1,000,000 times b'abc'...

May I also suggest changing the zlib module to accept y* instead of s*:
 - Internally zlib operates on bytes; characters don't mean a thing in
zlib-land.
 - We rely on s* performing the encoding into the default encoding for us.
This behaviour is hidden from the programmer and somewhat violates the rule
of least surprise.
 - type(zlib.decompress(zlib.compress('abc'))) == bytes
 - Changing from s* to y* forces the programmer to use .encode() on their
strings (e.g. zlib.compress('abc'.encode())), which very clearly shows
what's happening.

Added file: http://bugs.python.org/file12466/zlib_threads-2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___



[issue4738] Patch to make zlib-objects better support threads

2008-12-26 Thread ebfe

Changes by ebfe knabberknusperh...@yahoo.de:


Removed file: http://bugs.python.org/file12463/zlib_threads-2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___



[issue4746] Misguiding wording 3.0 c-api reference

2008-12-25 Thread ebfe

New submission from ebfe knabberknusperh...@yahoo.de:

Quote from http://docs.python.org/3.0/c-api/arg.html, regarding the s
argument:


s (string or Unicode object) [const char *]

Convert a Python string or Unicode object to a C pointer to a
character string. You must not provide storage for the string itself; a
pointer to an existing string is stored into the character pointer
variable whose address you pass. 


I guess the phrase "you must not provide storage" is a failed
translation and not meant like that. It should say "you are not required
to provide storage". It's confusing to have such strong wording without
reason.

--
assignee: georg.brandl
components: Documentation
messages: 78281
nosy: ebfe, georg.brandl
severity: normal
status: open
title: Misguiding wording 3.0 c-api reference
versions: Python 3.0

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4746
___



[issue4738] Patch to make zlib-objects better support threads

2008-12-24 Thread ebfe

New submission from ebfe knabberknusperh...@yahoo.de:

My application needs to pack and unpack workunits all day long and does
this using multiple threading.Threads. I've noticed that the zlib module
seems to use only one thread at a time when using [de]compressobj(). As
the comment in the source file zlibmodule.c already says, the module uses
a global lock to protect the object from concurrent access by different
threads. While the C functions release the GIL while waiting for the global
lock, only one thread at a time can use zlib.
My app ends up using only one CPU to compress/decompress its workunits...

The patch (svn diff to ) attached here fixes this problem by extending
the compressobj-structure by an additional member to create
object-specific locks and removes the global lock. The lock protects
each compressobj individually and allows multiple python threads to use
zlib in parallel, utilizing all available CPUs.
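
The usage pattern in question, several threading.Thread workers each packing workunits with its own compressobj, could look like the following minimal sketch. The payloads and thread count are made up for illustration; whether the threads actually occupy several CPUs depends on the interpreter's zlib module releasing the GIL and locking per object.

```python
import threading
import zlib

def pack(payload, results, idx):
    """Compress one workunit with a compressobj private to this thread."""
    comp = zlib.compressobj()
    results[idx] = comp.compress(payload) + comp.flush()

payloads = [bytes([i]) * 200_000 for i in range(4)]
results = [None] * len(payloads)

threads = [threading.Thread(target=pack, args=(p, results, i))
           for i, p in enumerate(payloads)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every workunit must survive the round trip unchanged.
roundtrip_ok = all(zlib.decompress(r) == p for r, p in zip(results, payloads))
```

Because no compressobj is shared between threads here, a per-object lock is never contended, which is the case the patch is meant to speed up.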

--
components: None
files: zlib_threads.diff
keywords: patch
messages: 78266
nosy: ebfe
severity: normal
status: open
title: Patch to make zlib-objects better support threads
type: performance
versions: Python 2.5, Python 2.6, Python 2.7, Python 3.0, Python 3.1
Added file: http://bugs.python.org/file12440/zlib_threads.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4738
___