Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
Antoine, what do you want to do with this one? Without a good test case the
OP's original issue is undiagnosable.
--
assignee: rhettinger -> pitrou
versions: +Python 3.1
___
Python
Shawn swal...@opensolaris.org added the comment:
I specifically mentioned *SPARC* as the performance problem area, but the reply
about 0.5s to dump fails to mention on what platform they tested
My problem is not undiagnosable. I'll be happy to provide you with even more
data files. But I
Antoine Pitrou pit...@free.fr added the comment:
Raymond, I'll follow up in private with Shawn. All the recent performance
improvements done on JSON (in 3.2) mean the issue can be closed IMO.
--
resolution: -> out of date
status: open -> closed
___
Valentin Kuznetsov vkuz...@gmail.com added the comment:
Antoine,
indeed, both patches improved time and memory footprint. The latest
patch shows only 1.1GB RAM usage and is very fast. What worries me,
though, is that memory is not released back to the system. Is this the
case? I just added
Antoine Pitrou pit...@free.fr added the comment:
> Antoine,
> indeed, both patches improved time and memory footprint. The latest
> patch shows only 1.1GB RAM usage and is very fast. What worries me,
> though, is that memory is not released back to the system. Is this the
> case? I just added
Valentin Kuznetsov vkuz...@gmail.com added the comment:
Nope, none of the three json implementations releases the memory. I used
your patched one, the one shipped with 2.6, and cjson. The one which comes
with 2.6 reaches 2GB, then releases 200MB and stays at 1.8GB during
sleep. The cjson
Antoine Pitrou pit...@free.fr added the comment:
> Nope, none of the three json implementations releases the memory. I used
> your patched one, the one shipped with 2.6, and cjson. The one which comes
> with 2.6 reaches 2GB, then releases 200MB and stays at 1.8GB during
> sleep. The cjson reaches
Valentin Kuznetsov vkuz...@gmail.com added the comment:
I made data local, but adding del shows the same behavior.
This is the test
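import json
import time

def test():
    # Load the file inside a function so that data is a local name.
    source = open('mangled.json', 'r')
    data = json.load(source)
    source.close()
    del data

test()
# Keep the process alive so its memory usage can be observed externally.
time.sleep(20)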
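
One way to tell "Python never freed the objects" apart from "the freed memory was not handed back to the operating system" is to watch Python-level allocations rather than the process size. The following is a minimal sketch under stated assumptions: it uses `tracemalloc`, which needs Python 3.4+ and so is not available on the 2.6 interpreter discussed in this thread, and it decodes a small synthetic payload as a hypothetical stand-in for mangled.json. The helper name `load_and_drop` is illustrative only.

```python
import json
import tracemalloc


def load_and_drop(text):
    """Decode text, drop the result, and return traced bytes (held, after)."""
    data = json.loads(text)
    held, _peak = tracemalloc.get_traced_memory()
    del data  # last reference goes away; the objects are freed here
    after, _peak = tracemalloc.get_traced_memory()
    return held, after


# Synthetic stand-in for the real data file (hypothetical contents).
text = json.dumps([{"key%d" % i: i} for i in range(10000)])

tracemalloc.start()
held, after = load_and_drop(text)
tracemalloc.stop()

print("traced while held: %d bytes, after del: %d bytes" % (held, after))
```

If the traced figure drops after `del` while the resident size reported by the operating system stays high, the objects were released by Python and the remaining footprint is likely held by the allocator, not by a leak in the decoder.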
--
Shawn swal...@opensolaris.org added the comment:
The attached patch doubles write times for my particular case when
applied to simplejson trunk using python 2.6.2. Not good.
--
___
Python tracker rep...@bugs.python.org
Antoine Pitrou pit...@free.fr added the comment:
> The attached patch doubles write times for my particular case when
> applied to simplejson trunk using python 2.6.2. Not good.
What do you mean by write times? The patch only affects decoding.
--
___
Shawn swal...@opensolaris.org added the comment:
You are right, an environment anomaly led me to falsely believe that
this had somehow affected encoding performance.
I had repeated the test many times with and without the patch using
simplejson trunk and wrongly concluded that the patch was to
Shawn swal...@opensolaris.org added the comment:
I've attached a sample JSON file that is much slower to write out on
some systems as described in the initial comment.
If you were to restructure the contents of this file into more of a tree
structure instead of the flat array structure it uses
Antoine Pitrou pit...@free.fr added the comment:
> However, this bug is about the serializer (encoder). So perhaps the
> decode performance patch should be a separate bug?
You're right, I've filed a separate bug for it: issue7451.
--
stage: patch review -> needs patch
Changes by Antoine Pitrou pit...@free.fr:
Removed file: http://bugs.python.org/file15450/json-opts2.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6594
___
Antoine Pitrou pit...@free.fr added the comment:
Your example takes 0.5s to dump here.
--
Antoine Pitrou pit...@free.fr added the comment:
Here is a new patch with an internal memo dict to reuse equal keys, and
some tests.
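
The idea of a memo dict that reuses equal keys can be illustrated in pure Python. This is only a sketch of the technique, not the attached patch (which changes the decoder itself): it assumes a Python version whose `json.loads` accepts `object_pairs_hook`, and the helper name `make_key_memo_hook` is hypothetical.

```python
import json


def make_key_memo_hook():
    """Return an object_pairs_hook that shares one string object per distinct key."""
    memo = {}

    def hook(pairs):
        # setdefault returns the first string seen for each equal key,
        # so every decoded dict ends up referencing the same key object.
        return {memo.setdefault(key, key): value for key, value in pairs}

    return hook


text = '[{"name": "a", "size": 1}, {"name": "b", "size": 2}]'
rows = json.loads(text, object_pairs_hook=make_key_memo_hook())

first_key_row0 = list(rows[0])[0]
first_key_row1 = list(rows[1])[0]
print(first_key_row0 is first_key_row1)
```

For documents made of many records with the same field names, sharing key objects this way avoids keeping one copy of each key string per record.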
--
stage: -> patch review
versions: +Python 3.2
Added file: http://bugs.python.org/file15450/json-opts2.patch
___
Changes by Antoine Pitrou pit...@free.fr:
Removed file: http://bugs.python.org/file15444/json-opts.patch
Valentin Kuznetsov vkuz...@gmail.com added the comment:
Hi,
I'm sorry for the delay, I was busy. Here is a test data file:
http://www.lns.cornell.edu/~vk/files/mangled.json
Its size is 150 MB, 50 MB less than the original, because I was forced
to scramble the values.
The tests with stock json module in
Antoine Pitrou pit...@free.fr added the comment:
> Using cjson module, I observed 180MB of RAM utilization
> source = open('mangled.json', 'r')
> data = cjson.encode(source.read())
> cjson is about 10 times faster!
This is simply wrong. You should be using cjson.decode(), not
cjson.encode().
If
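
The difference between the two calls can be shown with the standard library, using `json.dumps` and `json.loads` as stand-ins for `cjson.encode` and `cjson.decode` (an assumption made for illustration; cjson itself is not used here). Encoding the file contents treats the whole text as one string and never parses it, which would be consistent with the low memory use and short run time reported above.

```python
import json

text = '{"a": [1, 2, 3]}'

# Encoding: the input is treated as a plain string, so the result is
# just a JSON string literal wrapping the original text.
encoded = json.dumps(text)

# Decoding: the input is parsed into Python objects.
decoded = json.loads(text)

print(type(encoded).__name__, type(decoded).__name__)
```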
Antoine Pitrou pit...@free.fr added the comment:
That said, it is possible to further improve json by reducing the number
of memory allocations and temporary copies. Here is an experimental
(meaning: not polished) patch which gains 40% in decoding speed in your
example (9 seconds versus 15).
We
Valentin Kuznetsov vkuz...@gmail.com added the comment:
Oops, that explains why I saw such small memory usage with cjson. I
constructed the tests on the fly.
Regarding the data structure. Unfortunately it's out of my hands. The
data comes from data-service. So, I can't do much and can only report
Valentin Kuznetsov vkuz...@gmail.com added the comment:
Hi,
I just found this bug and would like to add my experience with
performance of large JSON docs. I have a few JSON docs about 180MB in
size which I read from data-services. I use python2.6, run on Linux, 64-bit
node w/ 16GB of RAM and
Bob Ippolito b...@redivi.com added the comment:
Did you try the trunk of simplejson? It doesn't work quite the same way as
the current json module in Python 2.6+.
Without the data or a tool to produce data that causes the problem, there
isn't much I can do to help.
--
Antoine Pitrou pit...@free.fr added the comment:
As Raymond said, and besides, when you talk about penalty, please
explain what the baseline is. Otherwise it's a bit hard to follow.
(and I stress again that SPARC is a niche platform, even Niagara :-);
moreover, Niagara is throughput-oriented
Shawn swal...@opensolaris.org added the comment:
First, I want to apologise for not providing more detail initially.
Notably, one thing you may want to be aware of is that I'm using python
2.4.4 with the latest version of simplejson. So my timings and
assumptions here are based on the fact
Antoine Pitrou pit...@free.fr added the comment:
I'm not sure there's anything we should do about this. Some
architectures are unreasonably slow at some things, and the old SPARC
implementations are a niche nowadays. I suppose you may witness the same
kinds of slowdowns if you use cPickle rather
Shawn swal...@opensolaris.org added the comment:
As I mentioned, there are also noticeable performance penalties on recent
SPARC systems, such as Niagara T1000, T2000, etc. The degradation is
just less obvious (a 10-15 second penalty instead of a 20 or 30 second
penalty). While x86 enjoys no
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
Are you sure that recursion depth is the issue? Have you tried the same
number and kind of objects listed serially (unnested)? This would help
rule-out memory allocation issues and would instead confirm that it has
something
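
The comparison suggested here can be sketched as a small benchmark: serialize the same number of dicts and keys once as deeply nested chains and once as flat dicts, then compare timings. This is a rough sketch under assumptions: the helper names, the depth of 50, and the count of 500 are arbitrary illustrative choices, and `time.perf_counter` requires Python 3.3+.

```python
import json
import time


def make_nested(depth):
    """Chain of dicts: depth keys, each nested one level below the previous."""
    root = {}
    node = root
    for i in range(depth):
        child = {}
        node["k%d" % i] = child
        node = child
    return root


def make_flat(count):
    """Single dict with count keys, each mapping to an empty dict."""
    return {"k%d" % i: {} for i in range(count)}


def time_dumps(obj, repeat=5):
    """Best-of-repeat wall-clock time for json.dumps(obj), in seconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        json.dumps(obj)
        best = min(best, time.perf_counter() - start)
    return best


# Both payloads contain the same number of dicts and keys.
nested = [make_nested(50) for _ in range(500)]
flat = [make_flat(50) for _ in range(500)]

print("nested: %.4fs  flat: %.4fs" % (time_dumps(nested), time_dumps(flat)))
```

A large gap between the two timings on the affected platform would point at nesting depth; similar timings would point elsewhere, for example at memory allocation.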
New submission from Shawn swal...@opensolaris.org:
The json serializer's performance (when using the C speedups) appears to
be tied to the depth of the structure being serialized on some systems.
In particular, dict structures that are more than a few levels deep,
especially when they contain
Changes by Brett Cannon br...@python.org:
--
priority: -> low
___
Python-bugs-list mailing