[issue19395] unpickled LZMACompressor is crashy
cantor added the comment:

Python 3.3 version: I tried this code and got a slightly faster processing time than when running lzma.compress() on its own. Could this be improved upon?

import lzma
from functools import partial
from threading import Thread

def split_len(seq, length):
    return [str.encode(seq[i:i+length]) for i in range(0, len(seq), length)]

class CompressClass(Thread):
    def __init__(self, data, c):
        Thread.__init__(self)
        self.exception = False
        self.data = data
        self.datacompressed = b""
        self.c = c

    def getException(self):
        return self.exception

    def getOutput(self):
        return self.datacompressed

    def run(self):
        self.datacompressed = self.c.compress(self.data)

def launch_multiple_lzma(data, c):
    present = CompressClass(data, c)
    present.start()
    present.join()
    return present.getOutput()

def threaded_lzma_map(sequence, threads):
    lzc = lzma.LZMACompressor()
    blocksize = int(round(len(sequence)/threads))
    lzc_partial = partial(launch_multiple_lzma, c=lzc)
    lzlist = list(map(lzc_partial, split_len(sequence, blocksize)))
    out_flush = lzc.flush()
    return b"".join(lzlist + [out_flush])

threaded_lzma_map(sequence, threads=16)

--
___
Python tracker <http://bugs.python.org/issue19395>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
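[Editorial note: a minimal sketch, not from the original message, of why the threaded version above cannot actually run compress() calls in parallel. All threads share one LZMACompressor, and each compress() call mutates that shared state, so the chunks must be fed strictly in order. The sequential chunked pattern is byte-for-byte equivalent to a single lzma.compress() call.]

```python
import lzma

data = b"AJKGJFKSHFKLHALWEHAIHWEOIAH" * 100

# Split into fixed-size chunks, mirroring split_len() above.
chunks = [data[i:i + 64] for i in range(0, len(data), 64)]

lzc = lzma.LZMACompressor()
parts = [lzc.compress(chunk) for chunk in chunks]  # order matters: shared state
parts.append(lzc.flush())                          # emit the stream trailer
result = b"".join(parts)

# Chunked sequential compression matches one-shot compression exactly.
assert result == lzma.compress(data)
```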
[issue19395] unpickled LZMACompressor is crashy
cantor added the comment:

In Python 2.7.3 this kind of works, however it is less efficient than the pure lzma.compress().

from threading import Thread
from backports import lzma
from functools import partial
import multiprocessing

class CompressClass(Thread):
    def __init__(self, data, c):
        Thread.__init__(self)
        self.exception = False
        self.data = data
        self.datacompressed = ""
        self.c = c

    def getException(self):
        return self.exception

    def getOutput(self):
        return self.datacompressed

    def run(self):
        self.datacompressed = self.c.compress(self.data)

def split_len(seq, length):
    return [seq[i:i+length] for i in range(0, len(seq), length)]

def launch_multiple_lzma(data, c):
    print 'cores'
    present = CompressClass(data, c)
    present.start()
    present.join()
    return present.getOutput()

def threaded_lzma_map(sequence, threads):
    lzc = lzma.LZMACompressor()
    blocksize = int(round(len(sequence)/threads))
    lzc_partial = partial(launch_multiple_lzma, c=lzc)
    lzlist = map(lzc_partial, split_len(sequence, blocksize))
    #pool = multiprocessing.Pool()
    #lzclist = pool.map(lzc_partial, split_len(sequence, blocksize))
    #pool.close()
    #pool.join()
    out_flush = lzc.flush()
    return "".join(lzlist + [out_flush])

sequence = 'AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'

lzma.compress(sequence) == threaded_lzma_map(sequence, threads=16)

Any way this could be improved?

--
___
Python tracker <http://bugs.python.org/issue19395>
___
[issue19395] unpickled LZMACompressor is crashy
cantor added the comment:

Just to mention that map() (i.e. the non-parallel version) works:

import lzma
from functools import partial
import multiprocessing

def run_lzma(data, c):
    return c.compress(data)

def split_len(seq, length):
    return [str.encode(seq[i:i+length]) for i in range(0, len(seq), length)]

sequence = 'AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'
threads = 3
blocksize = int(round(len(sequence)/threads))
strings = split_len(sequence, blocksize)

# map works
lzc = lzma.LZMACompressor()
out = list(map(lzc.compress, strings))
out_flush = lzc.flush()
result = b"".join(out + [out_flush])
lzma.compress(str.encode(sequence)) == result  # True

# map with the use of a partial function works as well
lzc = lzma.LZMACompressor()
lzc_partial = partial(run_lzma, c=lzc)
out = list(map(lzc_partial, strings))
out_flush = lzc.flush()
result = b"".join(out + [out_flush])
lzma.compress(str.encode(sequence)) == result

--
___
Python tracker <http://bugs.python.org/issue19395>
___
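[Editorial note: a sketch of one pattern that does parallelize, not proposed in the thread itself. Compressing each chunk as its own independent .xz stream with the module-level lzma.compress() needs no shared compressor state, and lzma.decompress() transparently handles concatenated streams. The per-chunk calls below run sequentially, but each is self-contained, so pool.map(lzma.compress, chunks) would be a drop-in parallel replacement; the trade-off is slightly larger output, since every chunk carries its own stream headers.]

```python
import lzma

data = b"AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFE" * 50
chunks = [data[i:i + 1024] for i in range(0, len(data), 1024)]

# Each chunk becomes an independent, self-contained .xz stream;
# no LZMACompressor object is shared between calls.
compressed = b"".join(lzma.compress(chunk) for chunk in chunks)

# lzma.decompress() decompresses all concatenated streams in turn.
assert lzma.decompress(compressed) == data
```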
[issue19395] lzma hangs for a very long time when run in parallel using Python's multiprocessing module?
Changes by cantor:

--
components: -ctypes
___
Python tracker <http://bugs.python.org/issue19395>
___
[issue19395] lzma hangs for a very long time when run in parallel using Python's multiprocessing module?
Changes by cantor:

--
nosy: +nadeem.vawda
___
Python tracker <http://bugs.python.org/issue19395>
___
[issue19395] lzma hangs for a very long time when run in parallel using Python's multiprocessing module?
cantor added the comment:

lzma

--
___
Python tracker <http://bugs.python.org/issue19395>
___
[issue19395] lzma hangs for a very long time when run in parallel using Python's multiprocessing module?
New submission from cantor:

import lzma
from functools import partial
import multiprocessing

def run_lzma(data, c):
    return c.compress(data)

def split_len(seq, length):
    return [str.encode(seq[i:i+length]) for i in range(0, len(seq), length)]

def lzma_mp(sequence, threads=3):
    lzc = lzma.LZMACompressor()
    blocksize = int(round(len(sequence)/threads))
    strings = split_len(sequence, blocksize)
    lzc_partial = partial(run_lzma, c=lzc)
    pool = multiprocessing.Pool()
    lzc_pool = list(pool.map(lzc_partial, strings))
    pool.close()
    pool.join()
    out_flush = lzc.flush()
    return b"".join(lzc_pool + [out_flush])

sequence = 'AJKGJFKSHFKLHALWEHAIHWEOIAH IOAHIOWEHIOHEIOFEAFEASFEAFWEWEWFQWEWQWQGEWQFEWFDWEWEGEFGWEG'

lzma_mp(sequence, threads=3)

--
components: ctypes
messages: 201278
nosy: cantor
priority: normal
severity: normal
status: open
title: lzma hangs for a very long time when run in parallel using Python's multiprocessing module?
type: behavior
versions: Python 3.3
___
Python tracker <http://bugs.python.org/issue19395>
___
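[Editorial note: the hang stems from handing a live LZMACompressor to multiprocessing. Worker arguments travel by pickling, and a compressor holds C-level liblzma state that cannot be meaningfully serialized; this issue was ultimately resolved by making such objects refuse to pickle at all. A quick sketch, independent of the report above, showing the behavior on current CPython:]

```python
import lzma
import pickle

lzc = lzma.LZMACompressor()
try:
    pickle.dumps(lzc)
    outcome = "pickled"
except TypeError:
    # Modern CPython refuses to pickle the compressor outright rather
    # than handing the child process a broken, crash-prone copy.
    outcome = "refused"

print(outcome)  # → refused
```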
[issue18038] Unhelpful error message on invalid encoding specification
New submission from Max Cantor:

When you specify a nonexistent encoding at the top of a file, like so for example:

# -*- coding: fakefakefoobar -*-

the following exception occurs:

SyntaxError: encoding problem: with BOM

This is very unhelpful, especially in cases where you might have made a typo in the encoding.

--
components: Library (Lib)
messages: 189840
nosy: Max.Cantor
priority: normal
severity: normal
status: open
title: Unhelpful error message on invalid encoding specification
type: behavior
versions: Python 2.7
___
Python tracker <http://bugs.python.org/issue18038>
___
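[Editorial note: an illustrative sketch, not part of the report. The situation can be reproduced without a source file by compiling a byte string carrying a bogus coding cookie; Python 3 releases have since improved the message to name the unknown encoding instead of mentioning a BOM.]

```python
# A byte string with a coding cookie naming a nonexistent codec.
source = b"# -*- coding: fakefakefoobar -*-\nx = 1\n"

try:
    compile(source, "<demo>", "exec")
except SyntaxError as exc:
    # Python 2.7 reported the unhelpful "encoding problem: with BOM";
    # modern Python 3 names the offending codec in the message instead.
    print(exc.msg)
```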
[issue17769] python-config --ldflags gives broken output when statically linking Python with --as-needed
New submission from Max Cantor:

On certain Linux distributions such as Ubuntu, the linker is invoked by default with --as-needed, which has an undesirable side effect when linking static libraries: it is bad at detecting required symbols, and the order of libraries on the command line becomes significant.

Right now, on my Ubuntu 12.10 system with a custom 32-bit version of Python, I get the following command output:

mcantor@hpmongo:~$ /opt/pym32/bin/python-config --ldflags
-L/opt/pym32/lib/python2.7/config -lpthread -ldl -lutil -lm -lpython2.7 -Xlinker -export-dynamic

When linking a project with those flags, I get the following error:

/usr/bin/ld: /opt/pym32/lib/python2.7/config/libpython2.7.a(dynload_shlib.o): undefined reference to symbol 'dlopen@@GLIBC_2.1'
/usr/bin/ld: note: 'dlopen@@GLIBC_2.1' is defined in DSO /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../i386-linux-gnu/libdl.so so try adding it to the linker command line
/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../i386-linux-gnu/libdl.so: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status

To resolve the error, I moved -ldl and -lutil *AFTER* -lpython2.7, so the relevant chunk of my gcc command line looked like this:

-L/opt/pym32/lib/python2.7/config -lpthread -lm -lpython2.7 -ldl -lutil -Xlinker -export-dynamic

I have no idea why --as-needed has such an unpleasant side effect when static libraries are being used, and it's arguable from my perspective that this behavior is the real bug. However it's equally likely that there's a good reason for that behavior, like it causes a slowdown during leap-years on Apple IIs or something. So here I am. python-config ought to respect the quirks of --as-needed when outputting its ldflags.
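[Editorial note: a hypothetical link invocation illustrating the reordering described above; the object file name and /opt/pym32 prefix are illustrative, taken from the report, not a general recipe. With --as-needed, a library is kept only if symbols it provides are already known to be undefined, so -ldl and -lutil must appear after the static libpython2.7.a that references them.]

```shell
# Hypothetical link line: -ldl and -lutil moved after -lpython2.7 so that
# --as-needed sees libpython's undefined references (e.g. dlopen) first.
gcc project.o -o project \
    -L/opt/pym32/lib/python2.7/config \
    -lpthread -lm -lpython2.7 -ldl -lutil \
    -Xlinker -export-dynamic
```

Alternatively, GNU ld's --no-as-needed can be switched on around the library group (-Wl,--no-as-needed ... -Wl,--as-needed) to make the link order-insensitive again, at the cost of possibly linking unneeded DSOs.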
--
components: Build, Cross-Build
messages: 187121
nosy: Max.Cantor
priority: normal
severity: normal
status: open
title: python-config --ldflags gives broken output when statically linking Python with --as-needed
type: behavior
versions: Python 2.7
___
Python tracker <http://bugs.python.org/issue17769>
___