[issue39408] Add support for SQLCipher

2020-01-21 Thread Sebastian Noack


Sebastian Noack  added the comment:

Yes, I could use LD_LIBRARY_PATH (after copying /usr/lib/libsqlcipher.so.0 to 
/some/folder/libsqlite3.so), or alternatively LD_PRELOAD, and the sqlite3 
stdlib module will just work as-is with SQLCipher. The latter is in fact what 
I'm doing at the moment, but this is quite a hack, and it's not portable to 
macOS or Windows.

Alternatively, I could fork the sqlite3 stdlib module, have it built against 
SQLCipher, and redistribute it. But I'd rather not go there.

That's why I'd love to see built-in support for SQLCipher in upstream Python, 
and as it is a drop-in replacement for SQLite3 which the stdlib already comes 
with bindings for, it seems to be a fairly small change on your end.

--

___
Python tracker 
<https://bugs.python.org/issue39408>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39408] Add support for SQLCipher

2020-01-21 Thread Sebastian Noack


Sebastian Noack  added the comment:

Well, the stdlib already depends on a third-party library here, i.e. SQLite3. 
SQLCipher is a drop-in replacement for SQLite3 that adds support for encrypted 
databases. In order to use SQLCipher, I'd have to build the sqlite3 module 
against SQLCipher (instead of SQLite3). As it's a drop-in replacement, no 
further changes are required (unless, rather than having the SQLCipher bindings 
exposed as a separate module, we want to enable it through an argument to 
sqlite3.connect).

--




[issue39408] Add support for SQLCipher

2020-01-21 Thread Sebastian Noack


New submission from Sebastian Noack :

SQLCipher is an industry-standard technology for managing encrypted SQLite 
databases. It has been implemented as a fork of SQLite3, so the sqlite3 stdlib 
module builds as-is against it. But rather than a fork (of this module), 
I'd rather see integration of SQLCipher in upstream Python. I'm happy to 
volunteer if this change has any chance of landing.

By just adding 2 lines to the cpython repository (and changing ~10 lines), I 
could make SQLCipher (based on the current sqlite3 module) available as a 
separate module (e.g. sqlcipher or sqlite3.cipher). However, IMO the ideal 
interface would be sqlite3.connect(..., sqlcipher=True).

Any thoughts?

--
messages: 360373
nosy: Sebastian.Noack
priority: normal
severity: normal
status: open
title: Add support for SQLCipher




[issue30297] Recursive starmap causes Segmentation fault

2017-05-09 Thread Sebastian Noack

Sebastian Noack added the comment:

Thanks for your response, both of you. All you said makes sense.

Just for the record, I wouldn't necessarily expect 200k nested iterators to 
work. Even if it could be made to work, I guess it would use way too much 
memory. But a RuntimeError would be much preferable to a crash.

For the code above, the fix would be to immediately convert the iterator 
returned by starmap() to a list. But in the end, regardless of this additional 
operation, the code didn't perform well, so I tossed it and used OpenSSL's 
PBKDF2 implementation through the ctypes module.

Still, I'm somewhat concerned that code like this can cause an unexpected 
crash that cannot be handled, depending on runtime variables. Could this 
perhaps even pose a security vulnerability? It seems to be a buffer 
overflow, after all.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30297>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue30297] Recursive starmap causes Segmentation fault

2017-05-07 Thread Sebastian Noack

Sebastian Noack added the comment:

I just noticed that the segfault can also be reproduced with Python 2 [1]. So 
please ignore what I said before about this not being the case.

While it is debatable whether using a lazily evaluated object with so many 
recursions is a good idea in the first place, causing the interpreter to 
crash with a segfault still seems concerning to me.

[1]: https://github.com/mitsuhiko/python-pbkdf2/issues/2

--




[issue30297] Recursive starmap causes Segmentation fault

2017-05-07 Thread Sebastian Noack

New submission from Sebastian Noack:

If I run the following code (on Python 3.5.3, Linux), the interpreter crashes 
with a segfault:


import hashlib
import hmac
from itertools import starmap
from operator import xor
from struct import Struct

# Not shown in the original report; a big-endian 32-bit pack as in the
# referenced pbkdf2 module is assumed here.
_pack_int = Struct('>I').pack

def pbkdf2_bin(data, salt, iterations=1000, keylen=24, hashfunc=None):
    hashfunc = hashfunc or hashlib.sha1
    mac = hmac.new(data, None, hashfunc)
    def _pseudorandom(x, mac=mac):
        h = mac.copy()
        h.update(x)
        return h.digest()
    buf = []
    for block in range(1, -(-keylen // mac.digest_size) + 1):
        rv = u = _pseudorandom(salt + _pack_int(block))
        for i in range(iterations - 1):
            u = _pseudorandom(u)
            rv = starmap(xor, zip(rv, u))
        buf.extend(rv)
    return bytes(buf[:keylen])

pbkdf2_bin(b'1234567890', b'1234567890', 20, 32)


I was able to track it down to the line buf.extend(rv), which apparently is 
causing the segfault. Note that rv is a lazily evaluated starmap. I also get a 
segfault if I evaluate it by other means (e.g. by passing it to the list 
constructor). However, if I evaluate it immediately by wrapping the starmap 
call in the list constructor, the code works as expected. But I wasn't yet 
able to further isolate the issue. FWIW, the Python 2 version [1] of this 
code works just fine without forcing immediate evaluation of the starmap.
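The workaround described above (forcing immediate evaluation of each starmap) 
can be sketched as follows. Since the function computes plain PBKDF2-HMAC, its 
output can be checked against hashlib.pbkdf2_hmac. The _pack_int helper is not 
shown in the report above; a big-endian 32-bit pack, as in the referenced 
pbkdf2 module, is assumed:

```python
import hashlib
import hmac
from itertools import starmap
from operator import xor
from struct import Struct

_pack_int = Struct('>I').pack  # assumed definition; not shown in the report

def pbkdf2_bin(data, salt, iterations=1000, keylen=24, hashfunc=None):
    hashfunc = hashfunc or hashlib.sha1
    mac = hmac.new(data, None, hashfunc)

    def _pseudorandom(x, mac=mac):
        h = mac.copy()
        h.update(x)
        return h.digest()

    buf = []
    for block in range(1, -(-keylen // mac.digest_size) + 1):
        rv = u = _pseudorandom(salt + _pack_int(block))
        for i in range(iterations - 1):
            u = _pseudorandom(u)
            # evaluate immediately instead of building nested lazy iterators
            rv = list(starmap(xor, zip(rv, u)))
        buf.extend(rv)
    return bytes(buf[:keylen])

# Matches the stdlib implementation of PBKDF2-HMAC-SHA1:
assert pbkdf2_bin(b'password', b'salt', 1000, 20) == \
    hashlib.pbkdf2_hmac('sha1', b'password', b'salt', 1000, 20)
```

With the list() call in place, each iteration is fully evaluated before the 
next one wraps it, so no deep chain of iterators is ever built.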

Note that the code posted, except for the bits I changed in order to make it 
compatible with Python 3, is under the copyright of Armin Ronacher, who 
published it under the BSD license.

[1]: https://github.com/mitsuhiko/python-pbkdf2

--
messages: 293192
nosy: Sebastian.Noack
priority: normal
severity: normal
status: open
title: Recursive starmap causes Segmentation fault
type: crash
versions: Python 3.5




[issue24527] The MimeTypes class cannot ignore global files per instance

2015-06-29 Thread Sebastian Noack

New submission from Sebastian Noack:

In order to prevent the mimetypes module from considering global files and 
registry entries, you have to call mimetypes.init([]). However, this enforces 
that behavior globally, and it only works if the module hasn't been 
initialized yet.

There is also a similar argument in the mimetypes.MimeTypes() constructor; 
however, the list of files passed there is considered additionally. There is 
no way to prevent an individual MimeTypes instance from considering global files.

An ignore_global_types option would be trivial to add to the MimeTypes 
constructor, and it would be extremely useful.
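The global workaround described above looks like this; the built-in default 
table is still loaded, only the external sources are skipped:

```python
import mimetypes

# Global opt-out: pass an empty file list so that neither the OS-wide
# mime.types files nor (on Windows) the registry are consulted.
mimetypes.init(files=[])

# The built-in defaults still apply:
print(mimetypes.guess_type('example.html')[0])  # text/html
```

Because init() mutates module-level state, this affects every MimeTypes 
instance created afterwards, which is exactly the problem this report is about.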

--
components: Library (Lib)
messages: 245930
nosy: Sebastian Noack
priority: normal
severity: normal
status: open
title: The MimeTypes class cannot ignore global files per instance
type: behavior

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24527



[issue8800] add threading.RWLock

2012-10-02 Thread Sebastian Noack

Sebastian Noack added the comment:

Exactly, with my implementation the lock acquired first will be granted 
first. There is no way that either shared or exclusive locks can starve, and 
therefore it should satisfy all use cases. Since you can only share simple 
data structures like integers across processes, I also found that this seems 
to be the only policy (apart from ignoring the acquisition order entirely) 
that can be implemented for multiprocessing.

I have also looked at the seqlock algorithm, which seems to be great for use 
cases where the exclusive lock is acquired rather rarely and where the 
reader code is in fact read-only and can therefore be repeated. But in any 
other case a seqlock would break your code. However, the algorithm is ultra 
simple and can't be implemented as a lock-like object anyway. You could 
implement it as a context manager, but that would hide the fact that the 
reader code will be repeated. So if you find that a seqlock is what you 
need for your specific use case, you can just use the algorithm like below:

lock = multiprocessing.Lock()             # serializes writers
count = multiprocessing.RawValue('I', 0)  # odd while a write is in progress

def do_read():
  while True:
    start = count.value
    if start % 2:            # a write is in progress, retry
      continue
    data = ...
    if count.value != start:  # a write happened meanwhile, retry
      continue
    return data

def do_write(data):
  with lock:
    count.value += 1
    # write data
    count.value += 1

I have also experimented with implementing a shared/exclusive lock on top of a 
pipe and UNIX file locks (https://gist.github.com/3818148). However, it works 
only on Unix and only with processes (not threads). It also turned out that 
UNIX file locks don't implement an acquisition order, so exclusive locks can 
starve, which renders it useless for most use cases.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8800



[issue8800] add threading.RWLock

2012-10-02 Thread Sebastian Noack

Sebastian Noack added the comment:

@Kristján: Uhh, that is a huge amount of code, more than twice as much (not 
counting tests) as my implementation, to accomplish the same. And it seems 
that there is not much code shared between the threading and multiprocessing 
implementations. And for what? Ah right, to make the API suck as much as the 
Windows API does. Please tell me more about good coding practice. ;)

--




[issue8800] add threading.RWLock

2012-10-01 Thread Sebastian Noack

Sebastian Noack added the comment:

Yes, you could also look at the shared/exclusive lock as one lock with 
different states. But this approach is neither more common (have a look at 
Java's ReadWriteLock [1] for example, which works just like my patch does, 
except that a factory is returned instead of a tuple), nor does it provide any 
of the benefits I have mentioned before (same API as Lock and RLock, better 
compatibility with existing code and the with statement, ability to pass the 
shared or exclusive lock around separately). But maybe we could satisfy 
everybody by following Richard's and Antoine's suggestion of returning a 
named tuple. So you could use the ShrdExclLock both ways:

# use a single object
lock = ShrdExclLock()

with lock.shared:
  ...

with lock.exclusive:
  ...

# unpack the object into two variables and pass them around separately
shrd_lock, excl_lock = ShrdExclLock()

Thread(target=reader, args=(shrd_lock,)).start()
Thread(target=writer, args=(excl_lock,)).start()


The majority of us seems to prefer the terms shared and exclusive. However, I 
can't deny that the terms read and write are more common, even though there 
are also notable examples where the terms shared and exclusive are used [2] 
[3]. But let us ignore what others call it for now, and get to the origin of 
both sets of terms, in order to figure out which fits best into Python:

shared/exclusive - abstract description of what it is
read/write       - best known use case

The reason why I prefer the terms shared and exclusive is that they are more 
distinct and less likely to be misunderstood. Also, naming a generic 
implementation after a specific use case is bad API design, and I don't know 
any other case in the Python core library where that was done.


[1] 
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/locks/ReadWriteLock.html
[2] http://www.postgresql.org/docs/9.2/static/explicit-locking.html
[3] http://www.unix.com/man-page/freebsd/9/SX/

--




[issue8800] add threading.RWLock

2012-10-01 Thread Sebastian Noack

Sebastian Noack added the comment:

@richard: I'm sorry, but both of my patches contain changes to 
'Lib/threading.py' and can be applied on top of Python 3.3.0. So can you 
explain what you mean by missing the changes to threading.py?

--




[issue8800] add threading.RWLock

2012-10-01 Thread Sebastian Noack

Sebastian Noack added the comment:

 If you want to argue it this way, I counter that the attributes
 shared and exclusive apply to the type of access to the
 protected object you are talking about, and yet, the name suggest
 that they are attributes of the lock itself.

A lock's sole purpose is to synchronize access to a protected object or 
context, so naming a lock after its type of protection absolutely makes sense. 
Those names are also not supposed to be attributes of the lock; rather, two 
locks (a shared and an exclusive lock) should be created, which might be 
returned as a namedtuple for convenience.

 In that sense, reader lock and writer lock, describe attributes
 of the user of the lock, and the verbs readlock and writelock
 describe the operation being requested.

The user of the lock isn't necessarily a reader or writer; this is just one of 
many possible use cases. For example, in a server application a 
shared/exclusive lock might be used to protect a connection to the client. 
Every time a thread wants to use the connection, a shared lock must be 
acquired, and when a thread wants to shut down the connection, the exclusive 
lock must be acquired, in order to ensure that it doesn't interrupt any thread 
still processing a request for that connection. In that case you clearly 
wouldn't call the users readers and writers.


 The patch looks like it was produced using git rather than hg, so
 perhaps Rietveld got confused by this.  In that case it is a bug
 in Rietveld that it produced a partial review instead of producing
 no review.

Yes, I have imported the Python 3.3.0 tree into a local git repository and 
created the patch that way. Since patches generated with git can still be 
applied with the 'patch' program, I hope that isn't a problem.


 Although using namedtuple is probably a good idea, I don't think it
 really adds much flexibility.  This example could just as easily be
 written

  selock = ShrdExclLock()

  Thread(target=reader, args=(selock.shared,)).start()
  Thread(target=writer, args=(selock.exclusive,)).start()

Yes, that is true, but in some cases it is more convenient to be able to 
unpack the shared/exclusive lock into two variables with a one-liner. And 
defining a namedtuple doesn't require any extra code compared to defining a 
class that holds both locks; in fact it needs less code.

However, the flexibility comes from having two lock objects, no matter how 
they are accessed, instead of, as suggested by Kristján, a single lock object 
which just provides proxies for use with the with statement.


 I also think it is time to drop the writer preference model, since
 it just adds complexity with doubtful benefits.  Sebastian's model
 also does that.

I have implemented the simplest possible acquisition order: the lock acquired 
first will be granted first. Without that (or a more advanced policy), in 
applications with concurrent threads/processes that heavily use the shared 
lock, the exclusive lock can never be acquired, because there is always a 
shared lock held, and before it is released the next shared lock will be 
acquired.

--




[issue8800] add threading.RWLock

2012-10-01 Thread Sebastian Noack

Sebastian Noack added the comment:

I would love to see how other people would implement a shared/exclusive lock 
that can be acquired from different processes. However, it really seems that 
nobody has done it before. If you know a reference implementation, I would be 
more than happy.

There are plenty of implementations for threading only, but they won't work 
with multiprocessing, due to the limitations in the ways you can share data 
between processes.

--




[issue8800] add threading.RWLock

2012-10-01 Thread Sebastian Noack

Sebastian Noack added the comment:

Thanks, but as I already said, there are a lot of shared/exclusive lock 
implementations that can be acquired from different threads. But we need one 
that works with threading as well as with multiprocessing.

And by the way, POSIX is the standard for implementing UNIX-like systems, not 
an industry standard for implementing anything else, including high-level 
languages like Python.

--




[issue8800] add threading.RWLock

2012-09-30 Thread Sebastian Noack

Sebastian Noack added the comment:

I was just waiting for a comment pointing out that my patch comes without 
tests. :) Note that we are still discussing the implementation, and this patch 
is just a proof of concept. Since the way it is implemented and the API it 
provides could still change, it's quite pointless to write tests until we 
have at least agreed on the API.

I have uploaded a new patch. The way it is implemented now is more like how 
the Barrier is implemented: the common code is shared in the threading module, 
and the shared/exclusive lock objects can be pickled now. I have also fixed a 
bug related to acquiring locks in non-blocking mode.

However, the code still uses c_uint, but ctypes (and 
multiprocessing.sharedctypes) is only imported when ShrdExclLock is called, so 
it is just a lazy dependency now. The reason why I am using ctypes instead of 
Python integers for threading and a BufferWrapper for multiprocessing (as the 
Barrier does) is that 2 of the 4 counters need to be incremented continuously, 
and c_uint has the nice feature that it can overflow, in contrast to Python 
integers and integers in arrays. Also, that way the implementation is simpler, 
and there doesn't seem to be much difference under the hood between using 
BufferWrapper() and RawValue().
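The wraparound relied on here can be demonstrated in isolation:

```python
from ctypes import c_uint

n = c_uint(2**32 - 1)  # largest value a 32-bit unsigned counter can hold
n.value += 1           # wraps around to 0 instead of growing like a Python int
print(n.value)         # 0
```

A plain Python integer would instead grow without bound, which is why the 
counters would otherwise need explicit modulo arithmetic.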

A shared/exclusive lock isn't one lock but two locks, which are synchronized 
but must be acquired separately. It is similar to a pipe, which isn't one 
file, but one file connected to another file that reads whatever you have 
written into the first one. So it isn't strange to create two lock objects, 
just as it isn't strange that os.pipe() returns two file descriptors.

Also, having a separate lock object for the shared and the exclusive lock, 
each providing the same API (as Lock and RLock), gives you huge flexibility. 
You can acquire both locks using the with statement or pass them around 
separately. So, for example, when you have a function, thread or child process 
that should only be able to acquire either the shared or the exclusive lock, 
you don't have to pass both locks. That also means that existing code that 
expects a lock-like object will be compatible with both the shared and the 
exclusive lock.

--
Added file: 
http://bugs.python.org/file27363/Added-ShrdExclLock-to-threading-and-multiprocessing-2.patch




[issue8800] add threading.RWLock

2012-09-29 Thread Sebastian Noack

Sebastian Noack added the comment:

I would love to see a reader/writer lock implementation shipped with Python's 
threading (and multiprocessing) module. But I have some issues with the patch:

1. I would avoid the terms 'read' and 'write', as those terms refer to just 
one of many use cases. A better and more generic name would be shared and 
exclusive lock.

2. If we add a new synchronization primitive to the threading module, we 
really should also add it to the multiprocessing module, for consistency and 
to keep switching between threading and multiprocessing as easy as it is 
right now.

3. The methods rdlock() and wrlock() might block even if you call them with 
blocking=False. That's because they acquire the internal lock in a blocking 
fashion before they would return False.

4. As Antoine already pointed out, it is a bad idea to make acquiring the 
exclusive (write) lock the default behavior. That clearly violates the Zen of 
Python, since explicit is better than implicit.

5. The issue above only arises from the idea that the RWLock should provide 
the same API as the Lock and RLock primitives, so that everywhere a lock 
primitive is expected, you can pass either a Lock, RLock or RWLock. That is 
actually a good idea, but in that case you should explicitly specify whether 
to pass the shared (read) or the exclusive (write) lock.

Both issues 4. and 5. only arise from the idea that a shared/exclusive lock 
should be implemented as a single class. But having two different lock 
primitives, one for the shared lock and one for the exclusive lock, and a 
function returning a pair of those, would be much more flexible, pythonic and 
compatible with existing lock primitives.

def ShrdExclLock():
  class _ShrdLock(object):
    def acquire(self, blocking=True):
      ...

    def release(self):
      ...

    def __enter__(self):
      self.acquire()
      return self

    def __exit__(self, exc_type, exc_value, tb):
      self.release()

  class _ExclLock(object):
    def acquire(self, blocking=True):
      ...

    def release(self):
      ...

    def __enter__(self):
      self.acquire()
      return self

    def __exit__(self, exc_type, exc_value, tb):
      self.release()

  return _ShrdLock(), _ExclLock()

# create a shared/exclusive lock
shrd_lock, excl_lock = ShrdExclLock()
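For illustration, a minimal working version of the sketch above, built on a 
single condition variable, might look like this. This is my own sketch, not 
the patch attached to this issue, and it deliberately omits the 
first-come-first-served acquisition order discussed in the thread (so writers 
can starve under heavy shared use):

```python
import threading

def ShrdExclLock():
    cond = threading.Condition()
    state = {'readers': 0, 'writer': False}

    class _ShrdLock(object):
        def acquire(self, blocking=True):
            with cond:
                while state['writer']:
                    if not blocking:
                        return False
                    cond.wait()
                state['readers'] += 1
                return True

        def release(self):
            with cond:
                state['readers'] -= 1
                if not state['readers']:
                    cond.notify_all()

        def __enter__(self):
            self.acquire()
            return self

        def __exit__(self, exc_type, exc_value, tb):
            self.release()

    class _ExclLock(object):
        def acquire(self, blocking=True):
            with cond:
                while state['writer'] or state['readers']:
                    if not blocking:
                        return False
                    cond.wait()
                state['writer'] = True
                return True

        def release(self):
            with cond:
                state['writer'] = False
                cond.notify_all()

        def __enter__(self):
            self.acquire()
            return self

        def __exit__(self, exc_type, exc_value, tb):
            self.release()

    return _ShrdLock(), _ExclLock()
```

Multiple threads can hold the shared lock at once, while the exclusive lock 
waits until all of them have released it; both objects support the same 
acquire/release/with API as Lock and RLock.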

--
nosy: +Sebastian.Noack




[issue8800] add threading.RWLock

2012-09-29 Thread Sebastian Noack

Sebastian Noack added the comment:

Using a lock as a context manager is the same as calling 
lock.acquire(blocking=True), and it will in fact block while waiting for 
another thread to release the lock. In your code, the internal lock is indeed 
held just for a very short period of time while acquiring or releasing a 
shared or exclusive lock, but it might add up to a notable amount of time 
depending on how many concurrent threads are using the same RWLock and how 
slow/busy your computer is.

But what made me reconsider my point are the following facts:

1. For example, when you acquire a shared (read) lock in non-blocking mode and 
False is returned, you assume that another thread is holding an exclusive 
(write) lock. But that isn't necessarily the case if False is also returned 
when the internal lock is held by another thread, for example in order to 
acquire or release another shared (read) lock.

2. The internal lock must also be acquired in order to release a 
shared/exclusive lock, and the 'release' method (at least as implemented for 
Lock and RLock) doesn't have a 'blocking' argument anyway.

For these reasons, I think it is OK to block while waiting for the internal 
lock, even if the shared/exclusive lock was acquired in non-blocking mode. At 
least it seems to lead to fewer unexpected side effects than returning False 
while the internal lock is held.

--




[issue8800] add threading.RWLock

2012-09-29 Thread Sebastian Noack

Sebastian Noack added the comment:

I've added a new patch that implements a shared/exclusive lock, as described 
in my comments above, for the threading and multiprocessing modules.

--
Added file: 
http://bugs.python.org/file27350/Added-ShrdExclLock-to-threading-and-multiprocessing.patch
