[issue14596] struct.unpack memory leak

2013-05-17 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 6707637f68ca by Serhiy Storchaka in branch 'default':
Issue #14596: The struct.Struct() objects now use more compact implementation.
http://hg.python.org/cpython/rev/6707637f68ca

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14596
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue14596] struct.unpack memory leak

2013-05-17 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

We already lost this improvement for 3.3. :(

Could we now close the issue?

--




[issue14596] struct.unpack memory leak

2013-05-17 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Closing indeed!

--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue14596] struct.unpack memory leak

2013-05-14 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I don't think Serhiy's patch should be blocked by a larger issue. I suppose you 
could rebase easily over his changes.

--
versions: +Python 3.4 -Python 2.7, Python 3.2, Python 3.3




[issue14596] struct.unpack memory leak

2013-05-14 Thread Meador Inge

Meador Inge added the comment:

> I don't think Serhiy's patch should be blocked by a larger issue.
> I suppose you could rebase easily over his changes.

Where rebase=undo, sure.  The changes for issue3132 are pretty
extensive (the basic data structures are changed).  And as mentioned
in msg165892, msg188840, and msg125617 I have already proposed and
implemented this optimization many months back.

If we feel that this optimization is really critical, then I agree
let's not hold it up and I will just work around it with my patch for
issue3132.  I don't see it as that critical, but I understand that the
PEP 3118 changes are dragging on and this optimization might be important
for some now.

--




[issue14596] struct.unpack memory leak

2013-05-14 Thread Antoine Pitrou

Antoine Pitrou added the comment:

On Tuesday, 14 May 2013 at 16:37, Meador Inge wrote:
> If we feel that this optimization is really critical, then I agree
> let's not hold it up and I will just work around it with my patch for
> issue3132.  I don't see it as that critical, but I understand that the
> PEP 3118 changes are dragging on and this optimization might be important
> for some now.

Are you sure the PEP 3118 changes will land in 3.4? It would be a pity
to lose a simple improvement because it was deferred to a bigger change.

--




[issue14596] struct.unpack memory leak

2013-05-14 Thread Meador Inge

Meador Inge added the comment:

> Are you sure the PEP 3118 changes will land in 3.4? It would be a pity
> to lose a simple improvement because it was deferred to a bigger
> change.

No, I am not sure.  That is why I said that I understand if others felt
this bug was critical to fix now since the PEP 3118 changes were dragging
on.  In that case I will just rework my patch.

I am not trying to stand in the way of this patch.  I just wanted folks
to be aware that this approach was implemented in the PEP 3118 work.

--




[issue14596] struct.unpack memory leak

2013-05-10 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

So what about a more compact Struct object? This is not only a memory issue (a 
Struct object can require several times more memory than the data being 
processed), but also a performance issue (the time needed to create such an 
object is proportional to its size).

Here is a patch with updated __sizeof__ method and tests.

--
Added file: http://bugs.python.org/file30194/struct_repeat_2.patch




[issue14596] struct.unpack memory leak

2013-05-10 Thread Meador Inge

Meador Inge added the comment:


> Serhiy Storchaka added the comment:
>
> So what about more compact Struct object?

I already implemented the count optimization as part of my patch for
implementing PEP 3118 in issue3132.  I need to rebaseline the patch; it has
gotten stale.  Hopefully there is still interest in the PEP 3118 work and I
can get that patch pushed through.

--




[issue14596] struct.unpack memory leak

2012-07-20 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

> I do think issue (3) should be fixed, but a separate issue should be opened 
> for it.

Issue #15402.

--




[issue14596] struct.unpack memory leak

2012-07-19 Thread Meador Inge

Meador Inge mead...@gmail.com added the comment:

I just read through all this and see three separate points being discussed:

  1. The unbounded caching behavior.
  2. A more compact representation for repeat counts.
  3. Correct __sizeof__ support for struct.

For issue (1) I think this is unfortunate, but I don't think any code changes 
are required because there is already a way to get the cached and uncached 
behavior by using the free function or a Struct object, respectively.  I do 
think adjusting the documentation is appropriate.

As a side note, we do have a private function named 'struct._clearcache' that 
is used by the regression tests.  If others really think the caching is a 
problem we could make that public.
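
The cached and uncached paths mentioned above can be sketched as follows. This is only an illustration: unpack_and_forget is a hypothetical helper name, and struct._clearcache is the private function referred to in this comment, so it is looked up defensively because it may be absent or change without notice.

```python
import struct

def unpack_and_forget(fmt, data):
    """Use the cached module-level function, then drop the module cache."""
    values = struct.unpack(fmt, data)
    # _clearcache is private (it exists for the regression tests), so it
    # may be missing on some implementations; look it up defensively.
    clear = getattr(struct, "_clearcache", None)
    if clear is not None:
        clear()
    return values

print(unpack_and_forget("<2H", b"\x01\x00\x02\x00"))
```

Clearing the whole cache after every call gives up the benefit of caching, so this is mostly useful after a burst of one-off format strings.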

Issues (2) and (3) are hijacking this issue IMO.  However, since they are being 
discussed...  I already implemented something like (2) when reworking the 
struct data structures for PEP 3118 in issue3132.  Hopefully I will get the PEP 
3118 patch pushed through one of these days.

I do think issue (3) should be fixed, but a separate issue should be opened for 
it.  This issue should just address the caching behavior.  Serhiy, if you open 
another issue for the __sizeof__ change, then I promise to review ASAP.

--
stage:  -> patch review




[issue14596] struct.unpack memory leak

2012-07-18 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Please, can anyone do the review?

--




[issue14596] struct.unpack memory leak

2012-07-18 Thread Meador Inge

Changes by Meador Inge mead...@gmail.com:


--
nosy: +meador.inge




[issue14596] struct.unpack memory leak

2012-06-23 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Reduction of memory consumption of struct is a new feature. Any chance to 
commit struct_repeat.patch+struct_sizeof.patch today and to get this feature in 
Python 3.3?

--




[issue14596] struct.unpack memory leak

2012-06-23 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

I'm still not convinced that something like struct_repeat.patch is necessary.  
So unless someone else wants to own this issue and review the struct_repeat, 
I'd say that it's too late for 3.3.

--




[issue14596] struct.unpack memory leak

2012-06-23 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Currently the internal representation of a Struct with a small format string
may consume an unexpectedly large amount of memory, and this representation
may be invisibly cached. With the patch, you get a large internal
representation only for large format strings, which is expected.

And how about struct_sizeof.patch? Currently sys.getsizeof() returns the wrong
result for Struct:

28
>>> sys.getsizeof(struct.Struct('100B'))
28

The patch (it is compatible with both Struct representations) fixes it:

52
>>> sys.getsizeof(struct.Struct('100B'))
1240
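
For readers who want to repeat this measurement, here is a small sketch. The format strings are arbitrary examples, and the absolute numbers depend on the platform, the Python version, and which Struct representation is in use, so no particular output is implied.

```python
import struct
import sys

# Report the size of Struct objects built from different format strings.
# Only the relative sizes are meaningful; absolute values vary by build.
for fmt in ("B", "100B", "B" * 100, "BHILfdspP"):
    label = fmt if len(fmt) <= 12 else fmt[:9] + "..."
    size = sys.getsizeof(struct.Struct(fmt))
    print("%-12s %d bytes" % (label, size))
```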

--




[issue14596] struct.unpack memory leak

2012-06-23 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

The struct_sizeof patch looks fine, but lacks tests.  I think it might be 
reasonable to call this a bugfix.

--




[issue14596] struct.unpack memory leak

2012-06-23 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Here is Struct.__sizeof__ patch with tests.

--
Added file: http://bugs.python.org/file26115/struct_sizeof-2.patch

diff -r 53fc7f59c7bb Lib/test/test_struct.py
--- a/Lib/test/test_struct.py   Sat Jun 23 20:28:32 2012 +0200
+++ b/Lib/test/test_struct.py   Sat Jun 23 23:39:25 2012 +0300
@@ -572,6 +572,16 @@
         s = struct.Struct('i')
         s.__init__('ii')
 
+    def test_sizeof(self):
+        self.assertGreater(sys.getsizeof(struct.Struct('BHILfdspP')),
+                           sys.getsizeof(struct.Struct('B')))
+        self.assertGreaterEqual(sys.getsizeof(struct.Struct('123B')),
+                                sys.getsizeof(struct.Struct('B')))
+        self.assertGreaterEqual(sys.getsizeof(struct.Struct('B' * 123)),
+                                sys.getsizeof(struct.Struct('123B')))
+        self.assertGreaterEqual(sys.getsizeof(struct.Struct('123xB')),
+                                sys.getsizeof(struct.Struct('B')))
+
 def test_main():
     run_unittest(StructTest)
 
diff -r 53fc7f59c7bb Modules/_struct.c
--- a/Modules/_struct.c Sat Jun 23 20:28:32 2012 +0200
+++ b/Modules/_struct.c Sat Jun 23 23:39:25 2012 +0300
@@ -1752,6 +1752,19 @@
     return PyLong_FromSsize_t(self->s_size);
 }
 
+static PyObject *
+s_sizeof(PyStructObject *self, void *unused)
+{
+    Py_ssize_t res;
+    formatcode *code;
+
+    res = sizeof(PyStructObject) + sizeof(formatcode);
+    for (code = self->s_codes; code->fmtdef != NULL; code++) {
+        res += sizeof(formatcode);
+    }
+    return PyLong_FromSsize_t(res);
+}
+
 /* List of functions */
 
 static struct PyMethodDef s_methods[] = {
@@ -1760,6 +1773,8 @@
     {"unpack",          s_unpack,       METH_O, s_unpack__doc__},
     {"unpack_from",     (PyCFunction)s_unpack_from, METH_VARARGS|METH_KEYWORDS,
                     s_unpack_from__doc__},
+    {"__sizeof__",      (PyCFunction)s_sizeof, METH_NOARGS,
+     "Returns size in memory, in bytes"},
     {NULL,       NULL}          /* sentinel */
 };
 



[issue14596] struct.unpack memory leak

2012-04-23 Thread Robert Elsner

Robert Elsner robert.elsn...@googlemail.com added the comment:

Well then at least the docs need an update. I simply fail to see how a
cache memory leak constitutes "just fine" (the caching behavior of
struct.unpack is not documented; if somebody wants caching, he ought to
use struct.Struct.unpack, which does cache and does not leak). Something
like a warning, "struct.unpack might display memory leaks when parsing
big files using large format strings", might be sufficient. I do not like
the idea of code failing outside some undocumented use-case.
Especially as those problems usually indicate some underlying design
flaw. I did not review the proposed patch but might find time to have a
look in a few months.

cheers

On 20.04.2012 19:56, Mark Dickinson wrote:
>
> Mark Dickinson dicki...@gmail.com added the comment:
>
> IMO, the struct module does what it's intended to do just fine here.  I don't
> see a big need for any change.  I'd propose closing this as won't fix.
 

--




[issue14596] struct.unpack memory leak

2012-04-23 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Others of our functions that do caching have been fixed so that the cache does 
not grow unbounded (usually by using lru_cache, I think).  IMO a cache that is 
unbounded by default is a bug.
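
A bounded cache of the kind described here can be sketched in user code with functools.lru_cache. The helper names compiled and bounded_unpack and the limit of 100 entries are assumptions made for the illustration; this is not how the struct module itself is implemented. The bound applies to the number of entries, not to the memory each entry uses.

```python
import functools
import struct

@functools.lru_cache(maxsize=100)  # 100 is an arbitrary, assumed bound
def compiled(fmt):
    """Return a Struct for fmt, keeping only the most recent formats."""
    return struct.Struct(fmt)

def bounded_unpack(fmt, data):
    """Unpack data using the bounded cache of compiled formats."""
    return compiled(fmt).unpack(data)

# Use 200 distinct formats; older entries are evicted as new ones arrive.
for n in range(1, 201):
    bounded_unpack("<%dB" % n, bytes(n))

print(compiled.cache_info().currsize)
```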

--
nosy: +r.david.murray




[issue14596] struct.unpack memory leak

2012-04-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

AFAIU, it's not unbounded; rather, it's the memory footprint of individual 
cached objects that can grow senselessly.

--




[issue14596] struct.unpack memory leak

2012-04-23 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

If that is the case, then a doc footnote that a large repeat count will result 
in a large cache seems appropriate.  Some way to control the max size of the 
cache would also be a reasonable enhancement request, at which point the cache 
size issue could be documented along with the cache control.

--




[issue14596] struct.unpack memory leak

2012-04-23 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Here is a patch that implements method __sizeof__ for Struct. This can
be used to limit the caching. In any case, it is useful to know the real
memory consumption of the object.

I'm not sure of the correctness of the method implementation.

--
Added file: http://bugs.python.org/file25325/struct_sizeof.patch

diff -r c820aa9c0c00 Modules/_struct.c
--- a/Modules/_struct.c Fri Apr 20 18:04:03 2012 -0400
+++ b/Modules/_struct.c Mon Apr 23 16:57:18 2012 +0300
@@ -1752,6 +1752,19 @@
     return PyLong_FromSsize_t(self->s_size);
 }
 
+static PyObject *
+s_sizeof(PyStructObject *self, void *unused)
+{
+    Py_ssize_t res;
+    formatcode *code;
+
+    res = sizeof(PyStructObject) + sizeof(formatcode);
+    for (code = self->s_codes; code->fmtdef != NULL; code++) {
+        res += sizeof(formatcode);
+    }
+    return PyLong_FromSsize_t(res);
+}
+
 /* List of functions */
 
 static struct PyMethodDef s_methods[] = {
@@ -1760,6 +1773,8 @@
     {"unpack",          s_unpack,       METH_O, s_unpack__doc__},
     {"unpack_from",     (PyCFunction)s_unpack_from, METH_VARARGS|METH_KEYWORDS,
                     s_unpack_from__doc__},
+    {"__sizeof__",      (PyCFunction)s_sizeof, METH_NOARGS,
+     "Returns size in memory, in bytes"},
     {NULL,       NULL}          /* sentinel */
 };
 



[issue14596] struct.unpack memory leak

2012-04-20 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

IMO, the struct module does what it's intended to do just fine here.  I don't 
see a big need for any change.  I'd propose closing this as won't fix.

--




[issue14596] struct.unpack memory leak

2012-04-18 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea




[issue14596] struct.unpack memory leak

2012-04-16 Thread Robert Elsner

New submission from Robert Elsner robert.elsn...@googlemail.com:

When unpacking multiple files with _variable_ length, struct.unpack leaks 
massive amounts of memory. The corresponding functions from numpy (fromfile) or 
the array (fromfile) standard lib module behave as expected.

I prepared a minimal testcase illustrating the problem on 

Python 2.6.6 (r266:84292, Dec 26 2010, 22:31:48) 
[GCC 4.4.5] on linux2

This is a severe limitation when reading big files where performance is 
critical. The struct.Struct class does not display this behavior. Note that the 
variable length of the buffer is necessary to reproduce the problem (as is 
usually the case with real data files).
I suspect this is due to some internal buffer in the struct module not being 
freed after use.
I did not test on later Python versions, but could not find a related bug in 
the tracker.
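
The attached script is not reproduced in this archive. As a scaled-down sketch of the pattern described above, assuming each "file" is a buffer of 4-byte unsigned integers of unpredictable length, the loop below shows how variable lengths lead to a different format string on almost every call. The sizes and the fixed seed are arbitrary choices for the illustration.

```python
import random
import struct

random.seed(0)  # fixed seed so the run is repeatable
formats_seen = set()

for _ in range(30):
    # Hypothetical stand-in for a data file of unpredictable length.
    count = random.randint(1000, 2000)
    data = bytes(4 * count)
    fmt = "<%dI" % count
    # Module-level call: the compiled form of fmt goes into the module cache.
    values = struct.unpack(fmt, data)
    formats_seen.add(fmt)

print(len(formats_seen), "distinct format strings were used")
```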

--
components: Library (Lib)
files: unpack_memory_leak.py
messages: 158418
nosy: Robert.Elsner
priority: normal
severity: normal
status: open
title: struct.unpack memory leak
versions: Python 2.6
Added file: http://bugs.python.org/file25238/unpack_memory_leak.py




[issue14596] struct.unpack memory leak

2012-04-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Do you see the same results with Python 2.7?  Python 2.6 is only receiving 
security bugfixes at this point.

--
nosy: +mark.dickinson




[issue14596] struct.unpack memory leak

2012-04-16 Thread Robert Elsner

Robert Elsner robert.elsn...@googlemail.com added the comment:

I would love to test but I am in a production environment atm and can't really 
spare the time to set up a test box. But maybe somebody with access to 2.7 on 
linux could test it with the supplied script (just start it and it should 
happily eat 8GB of memory - I think most users are going to notice ;)

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

I suspect that this is due to the struct module cache, which caches Struct 
instances corresponding to formats used.  If that's true, there's no real leak 
as such.

As a test, what happens if you increase your xrange(30) to xrange(300)?  (And 
perhaps decrease the size of the struct itself a bit to compensate).  You 
should see that memory usage stays constant after the first ~100 runs.

Using Struct directly is a good workaround if this is a problem.
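
A minimal sketch of that workaround, assuming the data is a run of little-endian 4-byte unsigned integers; unpack_uncached is a hypothetical helper name. The compiled format lives only as long as the local Struct object, so nothing is kept in the module cache.

```python
import struct

def unpack_uncached(fmt, data):
    """Unpack data with a throwaway Struct, bypassing the module cache."""
    # The compiled format is owned by this local object and is released
    # together with it when the function returns.
    s = struct.Struct(fmt)
    return s.unpack(data)

# Hypothetical variable-length payloads: each length needs its own format.
for count in (3, 5, 8):
    payload = b"".join(n.to_bytes(4, "little") for n in range(count))
    print(count, unpack_uncached("<%dI" % count, payload))
```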

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Robert Elsner

Robert Elsner robert.elsn...@googlemail.com added the comment:

Well seems like 3.1 is in the Debian repos as well. Same memory leak. So it is 
very unlikely it has been fixed in 2.7. I modified the test case to be 
compatible to 3.1 and 2.6.

--
versions: +Python 3.1
Added file: http://bugs.python.org/file25239/unpack_memory_leak.py




[issue14596] struct.unpack memory leak

2012-04-16 Thread Robert Elsner

Robert Elsner robert.elsn...@googlemail.com added the comment:

Well, the problem is that performance is severely degraded when calling unpack 
multiple times. I do not know the size of the files in advance, and they might 
vary in size from 1M to 1G. I could use some fixed-size buffer, which is 
inefficient depending on the file size (too big or too small). And if I change 
the buffer on the fly, I end up with the memory leak. I think the caching 
should take into account the available memory on the system. The no_leak 
function has comparable performance without the leak. And I think there is no 
point in caching Struct instances when they go out of scope and cannot be 
accessed anymore? If I let it slip from the scope, I do not want to use it 
thereafter. Especially considering that struct.Struct behaves as expected, as 
do array.fromfile and numpy.fromfile.

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> I suspect that this is due to the struct module cache, which caches
> Struct instances corresponding to formats used.  If that's true,
> there's no real leak as such.

Well, the posted code creates 30 struct instances. That shouldn't exhaust the 
memory of a 8GB box (which it does here)...

--
nosy: +pitrou
type:  -> resource usage
versions: +Python 2.7, Python 3.2, Python 3.3 -Python 2.6, Python 3.1




[issue14596] struct.unpack memory leak

2012-04-16 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

Yes, the problem is present in Python 2.7, Python 3.2 and Python 3.3. The 
random size of the structure is important: if you remove the randint, the 
leak disappears.

The memory is not released in cached structures, which are created by the 
module-level unpack.

I am working on it now.

--
nosy: +storchaka




[issue14596] struct.unpack memory leak

2012-04-16 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

It appears the storage of Struct instances is rather inefficient when there's a 
repeat code such as 48L. In this example, 48 almost identical structures 
describing the L format (struct _formatcode) will be created. You can guess 
what happens with large repeat counts...

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

> It appears the storage of Struct instances is rather inefficient when 
> there's a repeat code such as 48L

Right.  Repeat counts aren't directly supported in the underlying 
PyStructObject;  a format string containing repeat counts is effectively 
'compiled' to a series of (type, offset, size) triples before it can be used.  
The caching is there to save repeated compilations when the same format string 
is used repeatedly.
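
The expansion described here can be modelled in a few lines of Python. This is a simplified illustration and not the C implementation: expand_format and compact_format are hypothetical names, only standard-size integer codes are handled, alignment is ignored, and the repeat field in the compact form is an assumption about how a more compact layout could avoid the duplication.

```python
import re
import struct

_ITEM = re.compile(r"(\d*)([bBhHiIlLqQ])")

def expand_format(fmt):
    """Model the expanded layout: one (code, offset, size) triple per value."""
    triples = []
    offset = 0
    for count, code in _ITEM.findall(fmt):
        size = struct.calcsize("<" + code)  # standard size, no alignment
        for _ in range(int(count or 1)):
            triples.append((code, offset, size))
            offset += size
    return triples

def compact_format(fmt):
    """Model a compact layout: one (code, offset, size, repeat) per code."""
    entries = []
    offset = 0
    for count, code in _ITEM.findall(fmt):
        size = struct.calcsize("<" + code)
        repeat = int(count or 1)
        entries.append((code, offset, size, repeat))
        offset += size * repeat
    return entries

# The expanded form grows with the repeat count; the compact form does not.
print(len(expand_format("48L")), len(compact_format("48L")))
```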

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Perhaps the best quick fix would be to only cache small PyStructObjects, for 
some value of 'small'.  (Total size < a few hundred bytes, perhaps.)
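
The idea can be sketched at the Python level as follows. This is not the proposed C change: get_struct, _small_cache, and the 512-byte threshold are assumptions for the illustration, and the sketch relies on sys.getsizeof() reporting a meaningful size for Struct objects.

```python
import struct
import sys

_MAX_CACHED_BYTES = 512   # hypothetical threshold for "small"
_small_cache = {}

def get_struct(fmt):
    """Return a Struct for fmt, caching it only if it is small."""
    s = _small_cache.get(fmt)
    if s is None:
        s = struct.Struct(fmt)
        # Large compiled formats are handed back without being cached,
        # so they are freed as soon as the caller drops them.
        if sys.getsizeof(s) <= _MAX_CACHED_BYTES:
            _small_cache[fmt] = s
    return s

print(get_struct("<BHI").unpack(bytes(7)))
```

A real cache would also need a limit on the number of entries, which this sketch leaves out.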

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> Perhaps the best quick fix would be to only cache small
> PyStructObjects, for some value of 'small'.  (Total size < a few
> hundred bytes, perhaps.)

Or perhaps not care at all? Is there a use case for huge repeat counts?
(limiting cacheability could decrease performance in existing
applications)

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Robert Elsner

Robert Elsner robert.elsn...@googlemail.com added the comment:

Well I stumbled across this leak while reading big files. And what is
the point of having a fast C-level unpack when it can not be used with
big files?
I am not averse to the idea of caching the format string but if the
cache grows beyond a reasonable size, it should be freed. And
reasonable is not the number of objects contained but the amount of
memory it consumes. And caching an arbitrary amount of data (8GB here)
is a waste of memory.

And reading the Python docs, struct.Struct.unpack which is _not_
affected from the memory leak is supposed to be faster. Quote:

> class struct.Struct(format)
>
> Return a new Struct object which writes and reads binary data according to 
> the format string format. Creating a Struct object once and calling its 
> methods is more efficient than calling the struct functions with the same 
> format since the format string only needs to be compiled once.

Caching in case of struct.Struct is straightforward: As long as the
object exists, the format string is cached and if the object is no
longer accessible, its memory gets freed - including the cached format
string. The problem is with the magic creation of struct.Struct
objects by struct.unpack that linger around even after all associated
variables are no longer in scope.

Using for example fixed 1MB buffer to read files (regardless of size)
incurs a huge performance penalty. Reading everything at once into
memory using struct.unpack (or with the same speed struct.Struct.unpack)
is the fastest way. Approximately 40% faster than array.fromfile and
70% faster than numpy.fromfile.

I read some unspecified report about a possible memory leak in
struct.unpack but the author did not investigate further. It took me
quite some time to figure out what exactly happens. So there should be
at least a warning about this (ugly) behavior when reading big files for
speed and a pointer to a quick workaround (using struct.Struct.unpack).

cheers

On 16.04.2012 15:59, Antoine Pitrou wrote:
>
> Antoine Pitrou pit...@free.fr added the comment:
>
>> Perhaps the best quick fix would be to only cache small
>> PyStructObjects, for some value of 'small'.  (Total size < a few
>> hundred bytes, perhaps.)
>
> Or perhaps not care at all? Is there a use case for huge repeat counts?
> (limiting cacheability could decrease performance in existing
> applications)
 

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

> Or perhaps not care at all?

That's also possible. :-)   IMO, Robert's use-case doesn't really match the 
intended use-case for struct (parsing structures of values laid out like a 
C-struct ).  There the caching makes sense.

--




[issue14596] struct.unpack memory leak

2012-04-16 Thread Serhiy Storchaka

Serhiy Storchaka storch...@gmail.com added the comment:

The proposed patch uses a more compact encoding format of large structures.

--
keywords: +patch
Added file: http://bugs.python.org/file25242/struct_repeat.patch
