Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Miquel Torres
I see.

There is an easy solution for that, at least for the moment: enabling
zooming. I just did that, and you can now use zooming in a timeline
plot to select a narrower yaxis range or just view a particular area
in detail. A single click resets the zoom level.

If that is not enough, we can discuss a better solution when you have more time.



2011/3/9 Laura Creighton l...@openend.se:
 In a message of Tue, 08 Mar 2011 18:17:17 +0100, Miquel Torres writes:
you mean this timeline, right?:
http://speed.pypy.org/timeline/?ben=spectral-norm

Because the December 22 result is so high, the yaxis maximum goes up
to 2.5, leaving less space for the more interesting below-1 range,
right?

 yes


Regarding mozilla, do you mean this site?: http://arewefastyet.com/
I can see their timelines have some holes, probably failed runs...

 I was seeing something else, and I don't have a URL. I think that what
 I was seeing is what they use to make the arewefastyet.com site.

I see a problem with the approach you suggest. Entering an arbitrary
maximum yaxis number is not a good thing. I think the onus is on the
benchmark infrastructure not to send results that aren't
statistically significant. See Javastats
(http://www.elis.ugent.be/en/JavaStats), or ReBench
(https://github.com/smarr/ReBench).

 I don't think you understand what I want. Sorry if I was unclear.
 I am fine with the way that the benchmarks are displayed right now,
 but I want a way to dynamically go there and say, "I want to throw
 away all data that is higher than a certain figure, or lower than
 a certain one, because right now I am only interested in results
 in a certain range."

 I'm not looking to change what the benchmark says for everybody
 who looks at it, or change how it is presented in general.  I just
 want a way to zoom in and only see results in the range that
 interests me.  You and anybody else might have a different
 range that interests you, and you should be free to get this as well.

Something that can be done on the Codespeed side is to treat points
that have too high a stddev differently. In the aforementioned
spectral-norm timeline, the stddev floor is around 0.0050, while the
spike has a stddev of 0.30, much higher. A strict mode could be
implemented that invalidates or hides statistically unsound data.
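For illustration, such a strict mode could be as simple as a relative-stddev cutoff. The dict keys and the 10% threshold below are my own assumptions, not Codespeed's actual data model:

```python
def strict_filter(points, max_rel_stddev=0.1):
    """Drop timeline points whose stddev is too large relative to the
    measured value (threshold and field names are illustrative only)."""
    return [p for p in points
            if p["stddev"] <= max_rel_stddev * p["value"]]

points = [
    {"revision": "r40000", "value": 0.95, "stddev": 0.005},  # typical run
    {"revision": "r40123", "value": 2.40, "stddev": 0.30},   # noisy spike
]
print(strict_filter(points))  # keeps only the low-stddev point
```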

 The problem is that I want to throw away arbitrary amounts of data
 regardless of whether they are statistically significant or not,
 on the basis of "I know what I want to see, and this other stuff
 is getting in the way or being distracting."

Btw., I had written to the arewefastyet guys about the possibility of
configuring a Codespeed instance for them. We may yet see
collaboration there ;-)

Miquel

 Laura

___
pypy-dev@codespeak.net
http://codespeak.net/mailman/listinfo/pypy-dev


Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Miquel Torres
 And I want the 'latest results' stuff gone from the front page.
 It's as misleading as ever.  And I have done a poll of developers.  It's
 not just me.  Nobody finds it valuable.  Because things stay red forever,
 we all ignore it all the time and go directly to the raw results of
 runs, which is what we are interested in.  This also tells us of
 improvements, which we are also interested in, because unexpected
 improvements often mean something is very wrong.
Ok, that is pretty clear.

 Explaining that they aren't current, or latest results, but instead
 'sometime in the past when we were bad once' is getting irritating.
 Can you please move it someplace else so I don't have to have this
 conversation with pycon attendees any more?
Sorry about that.
I have removed it from the site.

 Later, when pycon is over, we can discuss and work out a better design
 for informing developers that a given build may have broken things.  This
 way is not working.

Yes, we can figure out a better approach after PyCon.

 And I don't think that you can use the geometric mean to prove a thing
 with this.  So I think talking about it makes us look bad -- we are
 making claims based on either bad science, or pseudo-science.
I agree that it wasn't the best explanation. I have removed most text,
so that it doesn't explicitly say that it means PyPy *is* X times
faster. It now just states that the geometric mean is X times faster.
If it is still too much, we can remove all text. But then some people
won't understand the numbers.

We can also remove the second plot.

The mean of a set of benchmarks that does not represent a balanced
real-world task mix is of course not very good. But then again the
world is complicated. And, as a normal Python developer, I find the
second plot extremely interesting, because it gives me a ballpark idea
of where the PyPy project is heading to. Before I decide to use PyPy
in production instead of CPython, I will do my own testing for *my*
application, but I assure you that not having a ballpark average won't
be a plus in considering to invest the time to test and adopt PyPy.
But that is my opinion of course ;-)
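
For context, the front-page figure is the geometric mean of the per-benchmark speedup ratios; a quick sketch with made-up ratios shows how it damps outliers compared to the arithmetic mean:

```python
import math

def geometric_mean(ratios):
    """Geometric mean of speedup ratios; the ratios here are made up."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

ratios = [25.0, 2.5, 3.0, 2.0]               # one math-heavy outlier
print(sum(ratios) / len(ratios))             # arithmetic mean: 8.125
print(round(geometric_mean(ratios), 2))      # geometric mean: ~4.4
```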

Have fun at PyCon!
Miquel


2011/3/9 Laura Creighton l...@openend.se:
 In a message of Tue, 08 Mar 2011 20:20:06 +0100, Miquel Torres writes:
Ok, I just committed the changes.

They address two general cases:
- You want to know how fast PyPy is *now* compared to CPython in
different benchmark scenarios, or tasks.
- You want to know how PyPy has been *improving* overall over the last
releases

That is now answered on the front page, and the reports are now much
less prominent (I didn't change the logic because it is something I
want to do properly, not just as a hack for speed.pypy).
- I have not yet addressed the "smaller is better" point.

I am aware that the wording of the "faster on average" text needs to
be improved (I am discussing it with Holger even now ;). Please chime in
so that we can have a good paragraph that is informative and short
enough while at the same time not being misleading.

Miquel

 The graphic is lovely.

 You have a spelling error: s/taks/task/.

 Many of us are at PyCon now, so working on wording may not be
 something we have time for now.  I am not sure that the geometric
 mean of all benchmarks give you anything meaningful, so I would
 have avoided saying anything like that.  More specifically, I think
 that there is a division between some highly mathematical programs,
 where you might get a speedup of 20 to 30 times CPython, and the
 benchmarks which I find much more meaningful, those that represent
 actual Python programs -- where I think we are typically only between
 2 and 3 times faster.

 The only reason to have some of the benchmarks is because they are
 well known.  So people expect them.   But doing very well on them is
 not actually all that significant -- it would be easy to write something
 that is great at running these contrived, synthetic benchmarks, but
 really lousy at running real python code.

 And I don't think that you can use the geometric mean to prove a thing
 with this.  So I think talking about it makes us look bad -- we are
 making claims based on either bad science, or pseudo-science.

 And I want the 'latest results' stuff gone from the front page.
 It's as misleading as ever.  And I have done a poll of developers.  It's
 not just me.  Nobody finds it valuable.  Because things stay red forever,
 we all ignore it all the time and go directly to the raw results of
 runs, which is what we are interested in.  This also tells us of
 improvements, which we are also interested in, because unexpected
 improvements often mean something is very wrong.

 The whole thing has the same problem as those popup windows 'do you
 really want to delete that file? confirm y/n'.  You get used to typing
 y.  Then you do it when you meant not to delete the file.  The red pages
 get ignored for precisely the same reason.  We're all used to all the
 red, which 

[pypy-dev] [PATCH] Fix segmentation fault on parsing empty list-assignments

2011-03-09 Thread Greg Price
This same patch is on bitbucket at https://bitbucket.org/price/pypy,
where I've sent a pull request. Holger Krekel suggested on IRC that I
send mail here. If others have different preferences for how to submit
a patch, let me know.

Before this patch, "[] = []" would abort the interpreter, with a
segmentation fault when run in pypy-c. A segmentation fault is always
bad, but in this case, moreover, the code is valid Python, if not very
useful. (In my commit message on bitbucket, I incorrectly said it only
affects invalid Python, like "[] += [].")

Greg


diff -r eb44d135f334 -r 0db4ac049ea2 pypy/interpreter/astcompiler/asthelpers.py
--- a/pypy/interpreter/astcompiler/asthelpers.py Tue Mar 08 11:14:36 2011 -0800
+++ b/pypy/interpreter/astcompiler/asthelpers.py Wed Mar 09 03:26:54 2011 -0800
@@ -40,9 +40,10 @@
         return self.elts

     def set_context(self, ctx):
-        for elt in self.elts:
-            elt.set_context(ctx)
-        self.ctx = ctx
+        if self.elts:
+            for elt in self.elts:
+                elt.set_context(ctx)
+            self.ctx = ctx


 class __extend__(ast.Attribute):
diff -r eb44d135f334 -r 0db4ac049ea2
pypy/interpreter/astcompiler/test/test_compiler.py
--- a/pypy/interpreter/astcompiler/test/test_compiler.py Tue Mar 08
11:14:36 2011 -0800
+++ b/pypy/interpreter/astcompiler/test/test_compiler.py Wed Mar 09
03:26:54 2011 -0800
@@ -70,6 +70,9 @@

     st = simple_test

+    def error_test(self, source, exc_type):
+        py.test.raises(exc_type, self.simple_test, source, None, None)
+
     def test_long_jump(self):
         func = """def f(x):
     y = 0
@@ -98,11 +101,13 @@
         self.simple_test(stmt, type(x), int)

     def test_tuple_assign(self):
+        yield self.error_test, "() = 1", SyntaxError
         yield self.simple_test, "x, = 1,", "x", 1
         yield self.simple_test, "x,y = 1,2", "x,y", (1, 2)
         yield self.simple_test, "x,y,z = 1,2,3", "x,y,z", (1, 2, 3)
         yield self.simple_test, "x,y,z,t = 1,2,3,4", "x,y,z,t", (1, 2, 3, 4)
         yield self.simple_test, "x,y,x,t = 1,2,3,4", "x,y,t", (3, 2, 4)
+        yield self.simple_test, "[] = []", "1", 1
         yield self.simple_test, "[x] = 1,", "x", 1
         yield self.simple_test, "[x,y] = [1,2]", "x,y", (1, 2)
         yield self.simple_test, "[x,y,z] = 1,2,3", "x,y,z", (1, 2, 3)
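
For reference, the behavior the patch is aiming for (i.e. CPython's) can be checked directly; this sketch is mine, not part of the patch:

```python
# "[] = []" is valid Python: an empty target list unpacks an empty iterable.
exec("[] = []")
print("empty-to-empty assignment accepted")

# Unpacking a non-empty iterable into zero targets fails, but only at
# runtime, with a ValueError rather than a crash.
try:
    exec("[] = [1]")
except ValueError as exc:
    print("ValueError:", exc)
```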

Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Maciej Fijalkowski
On Wed, Mar 9, 2011 at 8:19 AM, Massa, Harald Armin c...@ghum.de wrote:
 I really, really like the new display!

 And it motivated me to dig into the data ... which is a great result on its 
 own.

 The first question for myself was "hey, why is it slow on
 slowspitfire, and, btw, what is slowspitfire? Could that be something
 that my application does, too?"

 But I was unable to find out what slowspitfire is doing ... I found
 spitfire, which does some HTML templating stuff, and deduced that
 slowspitfire will do some slow HTML templating stuff. Where did I
 click wrong? Is there a path down to the slowspitfire.py file or an
 explanation of what slowspitfire is doing?

 Harald


https://bitbucket.org/pypy/benchmarks/src/b93caae762a0/unladen_swallow/performance/bm_spitfire.py

It's creating a very large template table (1000x1000 elements, I think).

The explanation of why it's slow is a bit longish. It's a combination
of factors, including very long lists with GC objects in them, using
''.join(list) instead of cStringIO (the latter is faster and yes, it
is a bug), and a few other factors.
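
The two string-building strategies mentioned are functionally equivalent; this Python 3 sketch (io.StringIO standing in for the old cStringIO, with a made-up cell count) just illustrates the difference:

```python
import io

def build_with_join(cells):
    parts = []
    for c in cells:
        parts.append(c)          # grows one very long list of objects
    return ''.join(parts)

def build_with_stringio(cells):
    buf = io.StringIO()          # accumulates into a single buffer
    for c in cells:
        buf.write(c)
    return buf.getvalue()

cells = ['<td>%d</td>' % i for i in range(10)]
assert build_with_join(cells) == build_with_stringio(cells)
print(len(build_with_join(cells)))
```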


 --
 GHUM GmbH
 Harald Armin Massa
 Spielberger Straße 49
 70435 Stuttgart
 0173/9409607

 Amtsgericht Stuttgart, HRB 734971
 -
 persuadere.
 et programmare


Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Maciej Fijalkowski
Hey Miquel.

A small feature request ;-) Can we get favicon?

Cheers,
fijal


Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Laura Creighton
In a message of Wed, 09 Mar 2011 09:26:29 +0100, Miquel Torres writes:
I see.

There is an easy solution for that, at least for the moment: enabling
zooming. I just did that, and you can now use zooming in a timeline
plot to select a narrower yaxis range or just view a particular area
in detail. A single click resets the zoom level.

If that is not enough, we can discuss a better solution when you have
more time.

This is terrific.  Thank you very much.

Laura


Re: [pypy-dev] [PATCH] Fix segmentation fault on parsing empty list-assignments

2011-03-09 Thread Antonio Cuni
On 09/03/11 13:29, Greg Price wrote:
 This same patch is on bitbucket at https://bitbucket.org/price/pypy,
 where I've sent a pull request. Holger Krekel suggested on IRC that I
 send mail here. If others have different preferences for how to submit
 a patch, let me know.

Hi Greg,
I have pushed your commit upstream, thanks for help!

ciao,
anto


Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Paolo Giarrusso
Hi all,

On Wed, Mar 9, 2011 at 14:21, Maciej Fijalkowski fij...@gmail.com wrote:
 On Wed, Mar 9, 2011 at 8:19 AM, Massa, Harald Armin c...@ghum.de wrote:
  I really, really like the new display!
Congratulations to Miquel for the great work!

A minor comment about the homepage: the answer to How fast is PyPy?
would better stay close to the question, i.e. above Plot 1 (at least
with the current wording).

  But I was unable to find out what slowspitfire is doing ... I found
  spitfire, which does some HTML templating stuff, and deduced that
  slowspitfire will do some slow HTML templating stuff. Where did I
  click wrong?

  Is there a path down to the slowspitfire.py file or an
  explanation what slowspitfire is doing?

Is there a place to add this information to the website? I propose
a link to the source in each benchmark page.
Additionally, on the frontpage the individual benchmark names could be
links to the benchmark page, like in the grid view.

The specific benchmark has no description:
http://speed.pypy.org/timeline/?exe=1%2C3&base=2%2B35&ben=slowspitfire&env=tannit&revs=200

Following the spitfire_cstringio template, I propose the following
rewording of the answer below (I'm not entirely happy with it, but I
guess Maciej can fix it easily if it's too bad):
"slowspitfire: Uses the Spitfire template system to build a
1000x1000-cell HTML table. It differs from spitfire, and is slower on
PyPy, because it uses ''.join(list) instead of the cStringIO module,
builds very long lists with GC objects in them, and has some other
smaller problems."

 https://bitbucket.org/pypy/benchmarks/src/b93caae762a0/unladen_swallow/performance/bm_spitfire.py

 It's creating a very large template table (1000x1000 elements, I think).

 The explanation of why it's slow is a bit longish. It's a combination
 of factors, including very long lists with GC objects in them, using
 ''.join(list) instead of cStringIO (the latter is faster and yes, it
 is a bug), and a few other factors.

Another small problem I had with zooming (which is really cool, BTW):

There is an easy solution for that, at least for the moment: enabling
zooming. I just did that, and you can now use zooming in a timeline
plot to select a narrower yaxis range or just view a particular area
in detail. A single click resets the zoom level.

While trying this I clicked on a revision; I immediately clicked
Back, but I was taken too far back, to the grid of all benchmarks,
which loads slowly enough for one to notice. If you instead click
Back from a specific benchmark page, you are brought back to the
home page.
Fixing this without loading a separate page for each plot seems hard;
however, it seems that e.g. Facebook handles this by modifying the URL
fragment after the #, so that the page is not reloaded from scratch.
But I'm no web developer, so you probably know better than me.

Cheers,
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/


Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Laura Creighton
Thank you very, very much.  I am sorry if I was short with you last
night, I was very tired.  This is great.  I am fine with
saying that the geometric mean is X times faster.  I'd  like to come up
with a good way to say that we are faster on real programs, not just
contrived examples, but this will take work and thought.  Maybe some
of the people who are on this list but not at pycon can come up with
something.

Thank you very much.  I'm very happy now.

Laura


Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Miquel Torres
 But I was unable to find out what slowspitfire is doing ... I found
 spitfire, which does some HTML templating stuff, and deduced that
 slowspitfire will do some slow HTML templating stuff. Where did I
 click wrong? Is there a path down to the slowspitfire.py file or an
 explanation what slowspitfire is doing?

You definitely have a point, and there are features planned that
should make benchmark-discovery more user-friendly.



2011/3/9 Massa, Harald Armin c...@ghum.de:
 I really, really like the new display!

 And it motivated me to dig into the data ... which is a great result on its 
 own.

 The first question for myself was hey, why is it slow on
 slowspitfire, and, btw, what is slowspitfire? Could that be something
 that my application does, too?

 But I was unable to find out what slowspitfire is doing ... I found
 spitfire, which does some HTML templating stuff, and deduced that
 slowspitfire will do some slow HTML templating stuff. Where did I
 click wrong? Is there a path down to the slowspitfire.py file or an
 explanation what slowspitfire is doing?

 Harald


 --
 GHUM GmbH
 Harald Armin Massa
 Spielberger Straße 49
 70435 Stuttgart
 0173/9409607

 Amtsgericht Stuttgart, HRB 734971
 -
 persuadere.
 et programmare



Re: [pypy-dev] Change to the frontpage of speed.pypy.org

2011-03-09 Thread Miquel Torres
 I am sorry if I was short with you last night, I was very tired.

No problem Laura, that only shows that you care ;-)
It would have also been much easier to discuss things if I had
finished the changes a couple of days earlier instead of in the middle
of your arrival at PyCon. But I didn't have the time. Life's like
that.

Criticism is sometimes hard to swallow, but very necessary to improve,
as a person and as a project.
As long as people use reasoned arguments and are respectful, it is not
only OK to do so: It is your duty! ;-)

So thanks to you too.

Cheers,
Miquel


2011/3/9 Laura Creighton l...@openend.se:
 Thank you very, very much.  I am sorry if I was short with you last
 night, I was very tired.  This is great.  I am fine with
 saying that the geometric mean is X times faster.  I'd  like to come up
 with a good way to say that we are faster on real programs, not just
 contrived examples, but this will take work and thought.  Maybe some
 of the people who are on this list but not at pycon can come up with
 something.

 Thank you very much.  I'm very happy now.

 Laura



[pypy-dev] Regex for RPython

2011-03-09 Thread Timothy Baldridge
I know rlib has a regex module, but it seems to be limited to only
recognizing characters and not returning groups. Is there a
full-featured regex parser somewhere that is supported by RPython?

Thanks,

Timothy

-- 
“One of the main causes of the fall of the Roman Empire was
that–lacking zero–they had no way to indicate successful termination
of their C programs.”
(Robert Firth)


[pypy-dev] wrong precedence of __radd__ vs list __iadd__

2011-03-09 Thread Greg Price
The following program works in CPython, but fails in PyPy:

  class C(object):
def __radd__(self, other):
  other.append(1)
  return other

  z = []
  z += C()
  print z  # should be [1]

In PyPy, this fails with TypeError: 'C' object is not iterable.

The issue is that PyPy is allowing the list's inplace_add behavior to
take precedence over the C object's __radd__ method, while CPython
does the reverse.

A similar issue occurs in the following program:

  class C(object):
  def __rmul__(self, other):
  other *= 2
  return other
  def __index__(self):
  return 3

  print [1] * C() # should be [1,1]

where PyPy instead prints [1,1,1].


In Python, if the LHS of a foo-augmented assignment has an in-place
foo method defined, then that method is given the first opportunity to
perform the operation, followed by the foo method and the RHS's
reverse-foo methods.
http://docs.python.org/reference/datamodel.html#object.__iadd__

But: CPython's 'list' type does not have any numeric methods defined
at the C level, in-place or otherwise. See PyList_Type in
listobject.c, where tp_as_number is null. So an in-place addition
falls through to the RHS's nb_add, if present, and for a class with
metaclass 'type' and an __radd__() method this is slot_nb_add() from
typeobject.c, which calls __radd__().

So when z += [1,2] runs in CPython, it works via a further wrinkle.
The meat of the implementation of INPLACE_ADD is PyNumber_InPlaceAdd()
in abstract.c. When all numeric methods fail to handle the addition,
this function falls back to sequence methods, looking for
sq_inplace_concat or sq_concat methods on the LHS. 'list' has these
methods, so its sq_inplace_concat method handles the operation.

Similarly, PyNumber_InPlaceMultiply() tries all numeric methods first,
before falling back to sq_inplace_repeat or sq_repeat.
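
That fallback order can be sketched in pure Python. This is my approximation of the C-level dispatch for a list LHS; the function name is illustrative, not actual CPython API:

```python
def list_inplace_add(lhs, rhs):
    """Approximate PyNumber_InPlaceAdd for `lhs += rhs` when lhs is a list.

    'list' has no nb_add/nb_inplace_add at the C level, so the numeric
    protocol only consults the RHS's __radd__; only if that fails does
    dispatch fall back to the list's sq_inplace_concat (here, extend)."""
    radd = getattr(type(rhs), '__radd__', None)
    if radd is not None:
        result = radd(rhs, lhs)
        if result is not NotImplemented:
            return result
    lhs.extend(rhs)        # sq_inplace_concat fallback
    return lhs

class C(object):
    def __radd__(self, other):
        other.append(1)
        return other

print(list_inplace_add([], C()))      # [1], matching CPython's z += C()
print(list_inplace_add([1], [2]))     # [1, 2], normal concatenation
```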

In PyPy, by contrast, it doesn't look like there's any logic for
falling back to a concatenate method if numeric methods fail. Instead,
if I'm reading correctly, the inplace_add__List_ANY() method in
pypy.objspace.std.listobject runs *before* any numeric methods on the
RHS are tried.


For the narrow case at hand, a sufficient hack for e.g. addition would
be to teach inplace_add__List_ANY() to look for an __radd__() first.
From a quick grep through CPython, full compatibility by that approach
would require similar hacks for bytes, array.array, collections.deque,
str, unicode, buffer, and bytearray; plus the users of structseq.c,
including type(sys.float_info), pwd.struct_passwd, grp.struct_group,
posix.stat_result, time.struct_time, and several others. Those types
in CPython all have sq_concat and not nb_add or nb_inplace_add, so
they will all permit an RHS's __radd__ to take precedence, but then
fall back on concatenation if no such method exists.

A more comprehensive approach would teach the generic dispatch code
about sequence methods falling after numeric methods in the dispatch
sequence. Then each sequence type would need to identify its sequence
methods for that code.

Perhaps a hybrid approach is best. Assume that no type with a
sq_concat also has a nb_add, and no type with a sq_repeat also has a
nb_multiply. (I believe this is true for all built-in, stdlib, and
pure-Python types.) Then whenever we define a method
{inplace_,}{add,mul}__Foo_ANY, for a sequence type Foo, it's enough
for that method to check for __r{add,mul}__ on the RHS. So we can
write a generic helper function and use that in each such method.

Does that last approach sound reasonable? I'm happy to go and
implement it, but I'm open to other suggestions.


I've posted tests (which fail) at
  https://bitbucket.org/price/pypy-queue/changeset/9dd9c2a5116a

Greg


Re: [pypy-dev] Regex for RPython

2011-03-09 Thread Alex Gaynor
On Wed, Mar 9, 2011 at 5:30 PM, Timothy Baldridge tbaldri...@gmail.com wrote:

 I know rlib has a regex module, but it seems to be limited to only
 recognizing characters and not returning groups. Is there a
 full-featured regex parser somewhere that is supported by RPython?

 Thanks,

 Timothy

 --
 “One of the main causes of the fall of the Roman Empire was
 that–lacking zero–they had no way to indicate successful termination
 of their C programs.”
 (Robert Firth)


The PyPy Python interpreter just uses the stdlib regex parser, so no,
I don't think there is such a regex parser.

Alex

-- 
I disapprove of what you say, but I will defend to the death your right to
say it. -- Evelyn Beatrice Hall (summarizing Voltaire)
The people's good is the highest law. -- Cicero