Re: [pypy-dev] Change to the frontpage of speed.pypy.org
I see. There is an easy solution for that, at least for the moment: enabling zooming. I just did that, and you can now use zooming in a timeline plot to select a narrower y-axis range or just view a particular area in detail. A single click resets the zoom level. If that is not enough, we can discuss a better solution when you have more time. 2011/3/9 Laura Creighton l...@openend.se: In a message of Tue, 08 Mar 2011 18:17:17 +0100, Miquel Torres writes: you mean this timeline, right?: http://speed.pypy.org/timeline/?ben=spectral-norm Because the December 22 result is so high, the y-axis maximum goes up to 2.5, thus leaving less space for the more interesting range around 1, right? yes. Regarding Mozilla, do you mean this site?: http://arewefastyet.com/ I can see their timelines have some holes, probably failed runs... I was seeing something else, and I don't have a URL. I think that what I was seeing is what they use to make the arewefastyet.com site. I see a problem with the approach you suggest. Entering an arbitrary maximum y-axis number is not a good thing. I think the onus is on the benchmark infrastructure not to send results that aren't statistically significant. See JavaStats (http://www.elis.ugent.be/en/JavaStats), or ReBench (https://github.com/smarr/ReBench). I don't think you understand what I want. Sorry if I was unclear. I am fine with the way that the benchmarks are displayed right now, but I want a way to dynamically go there and say: I want to throw away all data that is higher than a certain figure, or lower than a certain one, because right now I am only interested in results in a certain range. I'm not looking to change what the benchmark says for everybody who looks at it, or to change how it is presented in general. I just want a way to zoom in and only see results in the range that interests me. You and anybody else might have a different range that interests you, and you should be free to get this as well. 
Something that can be done on the Codespeed side is to treat differently points that have a too-high stddev. In the aforementioned spectral-norm timeline, the stddev floor is around 0.0050, while the spike has a 0.30 stddev, much higher. A strict mode could be implemented that invalidates or hides statistically unsound data. The problem is that I want to throw away arbitrary amounts of data regardless of whether they are statistically significant or not, on the basis that I know what I want to see, and this other stuff is getting in the way or being distracting. Btw, I had written to the arewefastyet guys about the possibility of configuring a Codespeed instance for them. We may yet see collaboration there ;-) Miquel Laura ___ pypy-dev@codespeak.net http://codespeak.net/mailman/listinfo/pypy-dev
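Such a strict mode could be as simple as the following sketch (the point structure and the cutoff factor are invented for illustration; Codespeed's actual data model differs):

```python
def filter_unstable(points, floor, factor=10.0):
    """Keep only results whose stddev is within `factor` times the noise floor."""
    return [p for p in points if p["stddev"] <= factor * floor]

# Made-up timeline data mimicking the spectral-norm case discussed above:
points = [
    {"value": 1.02, "stddev": 0.005},
    {"value": 2.40, "stddev": 0.30},   # the spike: stddev far above the floor
    {"value": 1.01, "stddev": 0.006},
]
stable = filter_unstable(points, floor=0.0050)
print(len(stable))  # the spike is hidden, the two stable points remain
```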
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
And I want the 'latest results' stuff gone from the front page. It's as misleading as ever. And I have done a poll of developers. It's not just me. Nobody finds it valuable. Because things stay red forever, we all ignore it all the time and go directly to the raw results of runs, which is what we are interested in. This also tells us of improvements, which we are also interested in, because unexpected improvements often mean something is very wrong. Ok, that is pretty clear. Explaining that they aren't current, or latest, results, but instead 'sometime in the past when we were bad once' is getting irritating. Can you please move it someplace else so I don't have to have this conversation with PyCon attendees any more? Sorry about that. I have removed it from the site. Later, when PyCon is over, we can discuss and work out a better design for informing developers that a given build may have broken things. This way is not working. Yes, we can figure out a better approach after PyCon. And I don't think that you can use the geometric mean to prove a thing with this. So I think talking about it makes us look bad -- we are making claims based on either bad science or pseudo-science. I agree that it wasn't the best explanation. I have removed most of the text, so that it doesn't explicitly say that it means PyPy *is* X times faster. It now just states that the geometric mean is X times faster. If it is still too much, we can remove all the text. But then some people won't understand the numbers. We can also remove the second plot. The mean of a set of benchmarks that does not represent a balanced real-world task mix is of course not very good. But then again, the world is complicated. And, as a normal Python developer, I find the second plot extremely interesting, because it gives me a ballpark idea of where the PyPy project is heading. 
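For context, the geometric mean reported on the page is the n-th root of the product of the per-benchmark speedups; a small sketch with invented numbers shows why it is less dominated by a few extreme results than a plain average:

```python
import math

def geometric_mean(speedups):
    # n-th root of the product of the ratios, computed in log space
    # for numerical stability
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))

# made-up mix: two math-heavy ~20-30x results, four "typical" 2-3x results
speedups = [20.0, 30.0, 2.0, 2.5, 3.0, 2.0]
print("geometric mean: %.2f" % geometric_mean(speedups))
print("arithmetic mean: %.2f" % (sum(speedups) / len(speedups)))
```

The arithmetic mean is pulled far toward the two outliers, while the geometric mean stays much closer to the typical case.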
Before I decide to use PyPy in production instead of CPython, I will do my own testing for *my* application, but I assure you that not having a ballpark average won't be a plus when considering whether to invest the time to test and adopt PyPy. But that is my opinion, of course ;-) Have fun at PyCon! Miquel 2011/3/9 Laura Creighton l...@openend.se: In a message of Tue, 08 Mar 2011 20:20:06 +0100, Miquel Torres writes: Ok, I just committed the changes. They address two general cases: - You want to know how fast PyPy is *now* compared to CPython in different benchmark scenarios, or tasks. - You want to know how PyPy has been *improving* overall over the last releases. That is now answered on the front page, and the reports are now much less prominent (I didn't change the logic because it is something I want to do properly, not just as a hack for speed.pypy). - I have not yet addressed the smaller is better point. I am aware that the wording of the faster on average text needs to be improved (I am discussing it with Holger even now ;). Please chime in so that we can have a good paragraph that is informative and short enough while at the same time not being misleading. Miquel The graphic is lovely. You have a spelling error: s/taks/task/. Many of us are at PyCon now, so working on wording may not be something we have time for now. I am not sure that the geometric mean of all benchmarks gives you anything meaningful, so I would have avoided saying anything like that. More specifically, I think that there is a division between some highly mathematical programs, where you might get a speedup of 20 to 30 times CPython, and the benchmarks which I find much more meaningful, those that represent actual Python programs -- where I think we are typically only between 2 and 3 times faster. The only reason to have some of the benchmarks is because they are well known. So people expect them. 
But doing very well on them is not actually all that significant -- it would be easy to write something that is great at running these contrived, synthetic benchmarks, but really lousy at running real Python code. And I don't think that you can use the geometric mean to prove a thing with this. So I think talking about it makes us look bad -- we are making claims based on either bad science or pseudo-science. And I want the 'latest results' stuff gone from the front page. It's as misleading as ever. And I have done a poll of developers. It's not just me. Nobody finds it valuable. Because things stay red forever, we all ignore it all the time and go directly to the raw results of runs, which is what we are interested in. This also tells us of improvements, which we are also interested in, because unexpected improvements often mean something is very wrong. The whole thing has the same problem as those popup windows 'do you really want to delete that file? confirm y/n'. You get used to typing y. Then you do it when you meant not to save the file. The red pages get ignored for precisely the same reason. We're all used to all the red, which
[pypy-dev] [PATCH] Fix segmentation fault on parsing empty list-assignments
This same patch is on bitbucket at https://bitbucket.org/price/pypy, where I've sent a pull request. Holger Krekel suggested on IRC that I send mail here. If others have different preferences for how to submit a patch, let me know.

Before this patch, [] = [] would abort the interpreter, with a segmentation fault if in pypy-c. A segmentation fault is always bad, but in this case the code is furthermore valid Python, if not very useful. (In my commit message on bitbucket, I incorrectly said it only affects invalid Python, like [] += [].)

Greg

diff -r eb44d135f334 -r 0db4ac049ea2 pypy/interpreter/astcompiler/asthelpers.py
--- a/pypy/interpreter/astcompiler/asthelpers.py  Tue Mar 08 11:14:36 2011 -0800
+++ b/pypy/interpreter/astcompiler/asthelpers.py  Wed Mar 09 03:26:54 2011 -0800
@@ -40,9 +40,10 @@
         return self.elts
 
     def set_context(self, ctx):
-        for elt in self.elts:
-            elt.set_context(ctx)
-        self.ctx = ctx
+        if self.elts:
+            for elt in self.elts:
+                elt.set_context(ctx)
+            self.ctx = ctx
 
 
 class __extend__(ast.Attribute):
diff -r eb44d135f334 -r 0db4ac049ea2 pypy/interpreter/astcompiler/test/test_compiler.py
--- a/pypy/interpreter/astcompiler/test/test_compiler.py  Tue Mar 08 11:14:36 2011 -0800
+++ b/pypy/interpreter/astcompiler/test/test_compiler.py  Wed Mar 09 03:26:54 2011 -0800
@@ -70,6 +70,9 @@
 
     st = simple_test
 
+    def error_test(self, source, exc_type):
+        py.test.raises(exc_type, self.simple_test, source, None, None)
+
     def test_long_jump(self):
         func = """def f(x):
     y = 0
@@ -98,11 +101,13 @@
         self.simple_test(stmt, "type(x)", int)
 
     def test_tuple_assign(self):
+        yield self.error_test, "() = 1", SyntaxError
         yield self.simple_test, "x,= 1,", "x", 1
         yield self.simple_test, "x,y = 1,2", "x,y", (1, 2)
         yield self.simple_test, "x,y,z = 1,2,3", "x,y,z", (1, 2, 3)
         yield self.simple_test, "x,y,z,t = 1,2,3,4", "x,y,z,t", (1, 2, 3, 4)
         yield self.simple_test, "x,y,x,t = 1,2,3,4", "x,y,t", (3, 2, 4)
+        yield self.simple_test, "[] = []", "1", 1
         yield self.simple_test, "[x]= 1,", "x", 1
         yield self.simple_test, "[x,y] = [1,2]", "x,y", (1, 2)
         yield self.simple_test, "[x,y,z] = 1,2,3", "x,y,z", (1, 2, 3)
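For readers wondering why [] = [] is legal at all: assigning to an empty target list unpacks an empty iterable, and only fails at runtime when the iterable is non-empty. A quick illustration in ordinary Python (not part of the patch):

```python
# Unpacking into an empty list target is legal, and a no-op:
[] = []
[] = ""
[] = set()

# The RHS is still checked at runtime: a non-empty iterable
# gives "too many values to unpack".
try:
    [] = [1]
except ValueError as e:
    print("raised:", e)
```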
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
On Wed, Mar 9, 2011 at 8:19 AM, Massa, Harald Armin c...@ghum.de wrote: I really, really like the new display! And it motivated me to dig into the data ... which is a great result on its own. The first question for myself was: hey, why is it slow on slowspitfire, and, btw, what is slowspitfire? Could that be something that my application does, too? But I was unable to find out what slowspitfire is doing ... I found spitfire, which does some HTML templating stuff, and deduced that slowspitfire will do some slow HTML templating stuff. Where did I click wrong? Is there a path down to the slowspitfire.py file, or an explanation of what slowspitfire is doing? Harald https://bitbucket.org/pypy/benchmarks/src/b93caae762a0/unladen_swallow/performance/bm_spitfire.py It's creating a very large template table (1000x1000 elements, I think). The explanation of why it's slow is a bit longish. It's a combination of factors, including very long lists with GC objects in them, using ''.join(list) instead of cStringIO (the latter is faster, and yes, it is a bug) and a bit of other factors. -- GHUM GmbH Harald Armin Massa Spielberger Straße 49 70435 Stuttgart 0173/9409607 Amtsgericht Stuttgart, HRB 734971 - persuadere. et programmare
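The ''.join(list) vs. cStringIO point can be illustrated with a small sketch; io.StringIO stands in here for Python 2's cStringIO, and since the actual timings depend on interpreter and GC, only the two allocation patterns are shown:

```python
import io

def build_join(parts):
    # keep every chunk alive in one long list, concatenate once at the end
    buf = []
    for p in parts:
        buf.append(p)
    return "".join(buf)

def build_stringio(parts):
    # stream chunks into a growing character buffer instead of a long
    # list of GC objects
    buf = io.StringIO()
    for p in parts:
        buf.write(p)
    return buf.getvalue()

parts = ["<td>%d</td>" % i for i in range(1000)]
print(build_join(parts) == build_stringio(parts))  # identical output
```

Both build the same string; the difference the thread discusses is purely about how the intermediate data is held, which is what interacted badly with PyPy's GC at the time.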
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
Hey Miquel. A small feature request ;-) Can we get a favicon? Cheers, fijal
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
In a message of Wed, 09 Mar 2011 09:26:29 +0100, Miquel Torres writes: I see. There is an easy solution for that, at least for the moment: enabling zooming. I just did that, and you can now use zooming in a timeline plot to select a narrower y-axis range or just view a particular area in detail. A single click resets the zoom level. If that is not enough, we can discuss a better solution when you have more time. This is terrific. Thank you very much. Laura
Re: [pypy-dev] [PATCH] Fix segmentation fault on parsing empty list-assignments
On 09/03/11 13:29, Greg Price wrote: This same patch is on bitbucket at https://bitbucket.org/price/pypy, where I've sent a pull request. Holger Krekel suggested on IRC that I send mail here. If others have different preferences for how to submit a patch, let me know. Hi Greg, I have pushed your commit upstream, thanks for the help! ciao, anto
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
Hi all, On Wed, Mar 9, 2011 at 14:21, Maciej Fijalkowski fij...@gmail.com wrote: On Wed, Mar 9, 2011 at 8:19 AM, Massa, Harald Armin c...@ghum.de wrote: I really, really like the new display! Congratulations to Miquel for the great work! A minor comment about the homepage: the answer to How fast is PyPy? would better stay close to the question, i.e. above Plot 1 (at least with the current wording). But I was unable to find out what slowspitfire is doing ... I found spitfire, which does some HTML templating stuff, and deduced that slowspitfire will do some slow HTML templating stuff. Where did I click wrong? Is there a path down to the slowspitfire.py file, or an explanation of what slowspitfire is doing? Is there a place for this information on the website? I propose a link to the source on each benchmark page. Additionally, on the frontpage the individual benchmark names could be links to the benchmark page, like in the grid view. The specific benchmark has no description: http://speed.pypy.org/timeline/?exe=1%2C3&base=2%2B35&ben=slowspitfire&env=tannit&revs=200 Following the spitfire_cstringio template, I propose the following rewording of the answer below (I'm not entirely happy with it, but I guess Maciej can fix it easily if it's too bad): slowspitfire: Uses the Spitfire template system to build a 1000x1000-cell HTML table; it differs from spitfire in ways that make it slower on PyPy: it uses ''.join(list) instead of the cStringIO module, has very long lists with GC objects in them, and some other smaller problems. https://bitbucket.org/pypy/benchmarks/src/b93caae762a0/unladen_swallow/performance/bm_spitfire.py It's creating a very large template table (1000x1000 elements, I think). The explanation of why it's slow is a bit longish. It's a combination of factors, including very long lists with GC objects in them, using ''.join(list) instead of cStringIO (the latter is faster, and yes, it is a bug) and a bit of other factors. 
Another small problem I had with zooming (which is really cool, BTW): There is an easy solution for that, at least for the moment: enabling zooming. I just did that, and you can now use zooming in a timeline plot to select a narrower y-axis range or just view a particular area in detail. A single click resets the zoom level. While trying this I clicked on a revision; I immediately clicked back, but I was taken too far back, to the grid of all benchmarks, which loads slowly enough for one to notice. If you instead click Back from a specific benchmark page, you are brought back to the home page. Fixing this without loading a separate page for each plot seems hard; however, it seems that e.g. Facebook handles this by modifying the URL part after #, so that the page is not reloaded from scratch. But I'm no web developer, so you probably know better than me. Cheers, -- Paolo Giarrusso - Ph.D. Student http://www.informatik.uni-marburg.de/~pgiarrusso/
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
Thank you very, very much. I am sorry if I was short with you last night, I was very tired. This is great. I am fine with saying that the geometric mean is X times faster. I'd like to come up with a good way to say that we are faster on real programs, not just contrived examples, but this will take work and thought. Maybe some of the people who are on this list but not at PyCon can come up with something. Thank you very much. I'm very happy now. Laura
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
But I was unable to find out what slowspitfire is doing ... I found spitfire, which does some HTML templating stuff, and deduced that slowspitfire will do some slow HTML templating stuff. Where did I click wrong? Is there a path down to the slowspitfire.py file, or an explanation of what slowspitfire is doing? You definitely have a point, and there are features planned that should make benchmark discovery more user-friendly. 2011/3/9 Massa, Harald Armin c...@ghum.de: I really, really like the new display! And it motivated me to dig into the data ... which is a great result on its own. The first question for myself was: hey, why is it slow on slowspitfire, and, btw, what is slowspitfire? Could that be something that my application does, too? But I was unable to find out what slowspitfire is doing ... I found spitfire, which does some HTML templating stuff, and deduced that slowspitfire will do some slow HTML templating stuff. Where did I click wrong? Is there a path down to the slowspitfire.py file, or an explanation of what slowspitfire is doing? Harald -- GHUM GmbH Harald Armin Massa Spielberger Straße 49 70435 Stuttgart 0173/9409607 Amtsgericht Stuttgart, HRB 734971 - persuadere. et programmare
Re: [pypy-dev] Change to the frontpage of speed.pypy.org
I am sorry if I was short with you last night, I was very tired. No problem Laura, that only shows that you care ;-) It would have also been much easier to discuss things if I had finished the changes a couple of days earlier, instead of in the middle of your arrival at PyCon. But I didn't have the time. Life's like that. Criticism is sometimes hard to swallow, but very necessary to improve, as a person and as a project. As long as people use reasoned arguments and are respectful, it is not only OK to do so: it is your duty! ;-) So thanks to you too. Cheers, Miquel 2011/3/9 Laura Creighton l...@openend.se: Thank you very, very much. I am sorry if I was short with you last night, I was very tired. This is great. I am fine with saying that the geometric mean is X times faster. I'd like to come up with a good way to say that we are faster on real programs, not just contrived examples, but this will take work and thought. Maybe some of the people who are on this list but not at PyCon can come up with something. Thank you very much. I'm very happy now. Laura
[pypy-dev] Regex for RPython
I know rlib has a regex module, but it seems to be limited to only recognizing characters and not returning groups. Is there a full-featured regex parser somewhere that is supported by RPython? Thanks, Timothy -- “One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.” (Robert Firth)
[pypy-dev] wrong precedence of __radd__ vs list __iadd__
The following program works in CPython, but fails in PyPy:

class C(object):
    def __radd__(self, other):
        other.append(1)
        return other

z = []
z += C()
print z  # should be [1]

In PyPy, this fails with TypeError: 'C' object is not iterable. The issue is that PyPy is allowing the list's inplace_add behavior to take precedence over the C object's __radd__ method, while CPython does the reverse. A similar issue occurs in the following program:

class C(object):
    def __rmul__(self, other):
        other *= 2
        return other
    def __index__(self):
        return 3

print [1] * C()  # should be [1, 1]

where PyPy instead prints [1, 1, 1]. In Python, if the LHS of a foo-augmented assignment has an in-place foo method defined, then that method is given the first opportunity to perform the operation, followed by the foo method and the RHS's reverse-foo methods. http://docs.python.org/reference/datamodel.html#object.__iadd__ But: CPython's 'list' type does not have any numeric methods defined at the C level, in-place or otherwise. See PyList_Type in listobject.c, where tp_as_number is null. So an in-place addition falls through to the RHS's nb_add, if present; for a class with metaclass 'type' and an __radd__() method this is slot_nb_add() from typeobject.c, which calls __radd__(). So when z += [1,2] runs in CPython, it works via a further wrinkle. The meat of the implementation of INPLACE_ADD is PyNumber_InPlaceAdd() in abstract.c. When all numeric methods fail to handle the addition, this function falls back to sequence methods, looking for sq_inplace_concat or sq_concat methods on the LHS. 'list' has these methods, so its sq_inplace_concat method handles the operation. Similarly, PyNumber_InPlaceMultiply() tries all numeric methods first, before falling back to sq_inplace_repeat or sq_repeat. In PyPy, by contrast, it doesn't look like there's any logic for falling back to a concatenate method if numeric methods fail. 
Instead, if I'm reading correctly, the inplace_add__List_ANY() method in pypy.objspace.std.listobject runs *before* any numeric methods on the RHS are tried. For the narrow case at hand, a sufficient hack for e.g. addition would be to teach inplace_add__List_ANY() to look for an __radd__() first. From a quick grep through CPython, full compatibility by that approach would require similar hacks for bytes, array.array, collections.deque, str, unicode, buffer, and bytearray; plus the users of structseq.c, including type(sys.float_info), pwd.struct_passwd, grp.struct_group, posix.stat_result, time.struct_time, and several others. Those types in CPython all have sq_concat and not nb_add or nb_inplace_add, so they will all permit an RHS's __radd__ to take precedence, but then fall back on concatenation if no such method exists. A more comprehensive approach would teach the generic dispatch code that sequence methods fall after numeric methods in the dispatch sequence. Then each sequence type would need to identify its sequence methods to that code. Perhaps a hybrid approach is best. Assume that no type with a sq_concat also has a nb_add, and no type with a sq_repeat also has a nb_multiply. (I believe this is true for all built-in, stdlib, and pure-Python types.) Then whenever we define a method {inplace_,}{add,mul}__Foo_ANY for a sequence type Foo, it's enough for that method to check for __r{add,mul}__ on the RHS. So we can write a generic helper function and use that in each such method. Does that last approach sound reasonable? I'm happy to go and implement it, but I'm open to other suggestions. I've posted tests (which fail) at https://bitbucket.org/price/pypy-queue/changeset/9dd9c2a5116a Greg
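The proposed generic helper might look roughly like the following sketch in plain Python (all names here are invented for illustration; the real RPython multimethod code would look quite different):

```python
def try_reflected_first(lhs, rhs, rmeth_name, seq_fallback):
    # Mirror CPython's order: give the RHS's reflected numeric method
    # (e.g. __radd__) a chance before the LHS's sequence concatenation.
    rmeth = getattr(type(rhs), rmeth_name, None)
    if rmeth is not None:
        return rmeth(rhs, lhs)
    return seq_fallback(lhs, rhs)

# Hypothetical use for list += rhs:
def list_inplace_add(lst, rhs):
    return try_reflected_first(lst, rhs, "__radd__",
                               lambda l, r: l.__iadd__(list(r)))

class C(object):
    def __radd__(self, other):
        other.append(1)
        return other

z = []
z = list_inplace_add(z, C())
print(z)  # → [1]: the __radd__ path wins, matching CPython
```

With an ordinary list on the right, type(rhs) has no __radd__, so the helper falls through to the concatenation path, which is exactly the fallback order the email describes for CPython.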
Re: [pypy-dev] Regex for RPython
On Wed, Mar 9, 2011 at 5:30 PM, Timothy Baldridge tbaldri...@gmail.com wrote: I know rlib has a regex module, but it seems to be limited to only recognizing characters and not returning groups. Is there a full-featured regex parser somewhere that is supported by RPython? Thanks, Timothy -- “One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.” (Robert Firth) The PyPy Python interpreter just uses the stdlib regex parser, so no, I don't think there is a regex parser. Alex -- I disapprove of what you say, but I will defend to the death your right to say it. -- Evelyn Beatrice Hall (summarizing Voltaire) The people's good is the highest law. -- Cicero