Re: Building CPython

2015-05-17 Thread Marko Rauhamaa

Iterating. A line still missing:

 
### Simple OO Framework

class _O: pass

def make_object(*procedures, base=None, bases=None):
    o = _O()
    methods = {}
    o.__methods__ = methods
    o.__derived__ = None
    if base is not None:
        _inherit_single(o, base)
    elif bases is not None:
        _inherit_multi(o, bases)
    for procedure in procedures:
        methods[procedure.__name__] = procedure
        def method(*args, __procedure__=procedure, __dispatch__=True,
                   **kwargs):
            if not __dispatch__ or o.__derived__ is None:
                return __procedure__(*args, **kwargs)
            derived = o
            while derived.__derived__ is not None:
                derived = derived.__derived__
            return getattr(derived, __procedure__.__name__)(*args, **kwargs)
        setattr(o, procedure.__name__, method)
    return o

def _inherit_single(o, base):
    methods = o.__methods__
    for name, method in base.__methods__.items():
        methods[name] = method
        setattr(o, name, method)
    base.__derived__ = o

def _inherit_multi(o, bases):
    for base in bases:
        _inherit_single(o, base)

def delegate(method, *args, **kwargs):
    return method(*args, __dispatch__=False, **kwargs)

### Used as follows

def TCPClient():
    def connect(address):
        pass
    def shut_down():
        pass
    return make_object(connect, shut_down)

def SMTPClient():
    tcp_client = TCPClient()
    def connect(address):
        delegate(tcp_client.connect, address)
        do_stuff()
    def send_message(message):
        pass
    return make_object(connect, send_message, base=tcp_client)

client = SMTPClient()
client.connect(None)
 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-17 Thread Jonas Wielicki
On 16.05.2015 02:55, Gregory Ewing wrote:
 BartC wrote:
 For example, there is a /specific/ byte-code called BINARY_ADD, which
 then proceeds to call a /generic/ binary-op handler! This throws away
 the advantage of knowing at byte-code generation time exactly which
 operation is needed.
 
 While inlining the binary-op handling might give you a
 slightly shorter code path, it wouldn't necessarily speed
 anything up. It's possible, for example, that the shared
 binary-op handler fits in the instruction cache, but the
 various inlined copies of it don't, leading to a slowdown.
 
 The only way to be sure about things like that is to try
 them and measure. The days when you could predict the speed
 of a program just by counting the number of instructions
 executed are long gone.

That, and also, the days when you could guess the number of
instructions executed from looking at the code are gone. Compilers,
and especially C or C++ compilers, are huge beasts with an insane number
of different optimizations which yield pretty impressive results. Not to
mention that they may know the architecture you’re targeting and can
optimize each build for a different architecture; which is not really
possible if you do optimizations which e.g. rely on cache
characteristics or instruction timings or interactions by hand.

I changed my habits to just trust my compiler a few years ago and have
more readable code in exchange for that. The compiler does a fairly
great job, although gcc still outruns clang for *my* usecases. YMMV.

regards,
jwi




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-17 Thread BartC

On 17/05/2015 13:25, Jonas Wielicki wrote:

On 16.05.2015 02:55, Gregory Ewing wrote:

BartC wrote:

For example, there is a /specific/ byte-code called BINARY_ADD, which
then proceeds to call a /generic/ binary-op handler! This throws away
the advantage of knowing at byte-code generation time exactly which
operation is needed.


While inlining the binary-op handling might give you a
slightly shorter code path, it wouldn't necessarily speed
anything up. It's possible, for example, that the shared
binary-op handler fits in the instruction cache, but the
various inlined copies of it don't, leading to a slowdown.

The only way to be sure about things like that is to try
them and measure. The days when you could predict the speed
of a program just by counting the number of instructions
executed are long gone.


That, and also, the days when you could guess the number of
instructions executed from looking at the code are gone. Compilers,
and especially C or C++ compilers, are huge beasts with an insane number
of different optimizations which yield pretty impressive results. Not to
mention that they may know the architecture you’re targeting and can
optimize each build for a different architecture; which is not really
possible if you do optimizations which e.g. rely on cache
characteristics or instruction timings or interactions by hand.

I changed my habits to just trust my compiler a few years ago and have
more readable code in exchange for that. The compiler does a fairly
great job, although gcc still outruns clang for *my* usecases.



YMMV.


It does. For my interpreter projects, gcc -O3 does a pretty good job.

For running a suite of standard benchmarks ('spectral', 'fannkuch', 
'binary-tree', all that lot) in the bytecode language under test, then 
gcc is 30% faster than my own language/compiler. (And 25% faster than 
clang.)


(In that project, gcc can do a lot of inlining, which doesn't seem to be 
practical in CPython as functions are all over the place.)


However, when I plug in an ASM dispatcher to my version (which tries to 
deal with simple bytecodes or some common object types before passing 
control to the HLL to deal with), then I can get /twice as fast/ as gcc 
-O3. (For real programs the difference is narrower, but usually still 
faster than gcc.)


(This approach I don't think will work with CPython, because there don't 
appear to be any simple cases for ASM to deal with! The ASM dispatcher 
keeps essential globals such as the stack pointer and program counter in 
registers, and uses chained 'threaded' code rather than function calls. 
A proportion of byte-codes need to be handled in this environment, 
otherwise it could actually slow things down, as the switch to/from HLL 
code is expensive.)


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-16 Thread Marko Rauhamaa
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 On Sat, 16 May 2015 11:59 pm, Marko Rauhamaa wrote:
 supports multiple inheritance without classes. Maybe I should port that
 to Python...

 I'd like to see it, but somehow I don't think that your Scheme object
 system is another name for closures. We were talking about closures,
 weren't we?

Ok, here's a quick port that I have barely tried out:



### Simple OO Framework

class _O: pass

def make_object(*procedures, base=None, bases=None):
    o = _O()
    methods = {}
    setattr(o, '%methods', methods)
    if base is not None:
        inherit_single(o, base)
    elif bases is not None:
        inherit_multi(o, bases)
    for procedure in procedures:
        methods[procedure.__name__] = procedure
        setattr(o, procedure.__name__, procedure)
    return o

def inherit_single(o, base):
    methods = getattr(o, '%methods')
    for name, method in getattr(base, '%methods').items():
        methods[name] = method
        setattr(o, name, method)

def inherit_multi(o, bases):
    for base in bases:
        inherit_single(o, base)

### Used as follows

def TCPClient():
    def connect(socket_address):
        ...
    return make_object(connect)

def SMTPClient():
    tcp_client = TCPClient()
    def connect(host):
        tcp_client.connect((host, 25))
    def send_message(message):
        ...
    return make_object(connect, send_message, base=tcp_client)

client = SMTPClient()
client.connect('mail.example.com')


 I mean, people had to *debate* the introduction of closures? There were
 three competing proposals for them, plus an argument for don't add them.
 Some people say closures were added in Java 7, others say closures have
 been there since the beginning, and James Gosling himself says that Java
 used inner classes instead of closures but the result was painful...

I'm with those who say anonymous and named inner classes have been there
forever and serve the purpose of closures. Yes, Java's boilerplate
requirements are painful, but if you don't like that, use Python.

 It seems to me that in the Java community, there's a lot of confusion over
 closures, which are seen as an advanced (and rather scary) feature. Hardly
 routine.

Importantly, anonymous inner classes have been in active use by Java
newbs from day one.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-16 Thread Marko Rauhamaa
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 A couple more negatives:

 - no such thing as inheritance;

Untrue. My simple Scheme object system (125 lines incl. documentation)
supports multiple inheritance without classes. Maybe I should port that
to Python...

 - is-a relationship tests don't work;

From the ducktyping point of view, that is an advantage. The whole
Linnaean categorization of objects is unnecessary ontological chaff. How
many times have people here had to advise newcomers not to inspect type
and instance relations of objects and just call the method?

 - an unfamiliar idiom for most people;

That's impossible to ascertain objectively. Java and JavaScript
programmers (of all people!) routinely deal with closures.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-16 Thread Steven D'Aprano
On Sat, 16 May 2015 11:59 pm, Marko Rauhamaa wrote:

 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
 
 A couple more negatives:

 - no such thing as inheritance;
 
 Untrue. My simple Scheme object system (125 lines incl. documentation)

Ah yes, I've seen Javascript code like that too. Each line is thirty
thousand characters long...

*wink*


 supports multiple inheritance without classes. Maybe I should port that
 to Python...

I'd like to see it, but somehow I don't think that your Scheme object
system is another name for closures. We were talking about closures,
weren't we?

It sounds like you have implemented a form of prototype-based object
programming. The question of whether prototype-OOP has inheritance is an
interesting one. Clearly prototypes implement something *like* inheritance,
but it is based on delegation or copying. Delegation-based cloning is quite
close to class-based inheritance, but copying-based cloning is not.

My sense is that I prefer to say that prototypes don't have inheritance in
the same sense as classes, but they have something that plays the same role
as inheritance. But if you want to call it inheritance, I can't really
argue.
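For concreteness, here is a minimal sketch (mine, not Marko's code) of
delegation-based prototype cloning in Python; `Proto` and its `clone`
method are invented names for illustration:

```python
class Proto:
    """Hypothetical prototype object: unknown attribute lookups are
    delegated to the prototype at access time, so later changes to the
    prototype remain visible in clones (unlike copying-based cloning,
    which snapshots the prototype's slots)."""
    def __init__(self, proto=None, **slots):
        self.__dict__.update(slots)
        self._proto = proto

    def __getattr__(self, name):
        # Called only when normal lookup fails: walk up the chain.
        if self._proto is None:
            raise AttributeError(name)
        return getattr(self._proto, name)

    def clone(self, **slots):
        return Proto(proto=self, **slots)

base = Proto(greeting='hello')
child = base.clone(name='world')
assert child.greeting == 'hello'   # delegated to the prototype
base.greeting = 'hi'
assert child.greeting == 'hi'      # delegation sees the update
```

Replacing the delegation with a `self.__dict__.update(proto.__dict__)` copy in `clone` would give the copying-based behaviour, which is where the resemblance to class-based inheritance breaks down.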



 - is-a relationship tests don't work;
 
 From the ducktyping point of view, that is an advantage. The whole
 Linnaean categorization of objects is unnecessary ontological chaff. How
 many times have people here had to advise newcomers not to inspect type
 and instance relations of objects and just call the method?

Um, is that a trick question? I don't remember the last time.



 - an unfamiliar idiom for most people;
 
 That's impossible to ascertain objectively. Java and JavaScript
 programmers (of all people!) routinely deal with closures.

I don't think so. I can't say for Javascript, but for Java, there's a lot of
confusion around closures, they've been described as evil with the
recommendation not to use them, and people cannot even agree when they were
introduced!

http://java.dzone.com/articles/whats-wrong-java-8-currying-vs
www.javaworld.com/javaworld/jw-06-2008/jw-06-closures.html


I mean, people had to *debate* the introduction of closures? There were
three competing proposals for them, plus an argument for don't add them.
Some people say closures were added in Java 7, others say closures have
been there since the beginning, and James Gosling himself says that Java
used inner classes instead of closures but the result was painful...

It seems to me that in the Java community, there's a lot of confusion over
closures, which are seen as an advanced (and rather scary) feature. Hardly
routine.


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-16 Thread Steven D'Aprano
On Sat, 16 May 2015 06:08 pm, Marko Rauhamaa wrote:

 Note that almost identical semantics could be achieved without a class.
 Thus, these two constructs are almost identical:
[...]

 IOW, the class is a virtually superfluous concept in Python. Python
 probably got it without much thought (other languages at the time had
 it). It comes with advantages and disadvantages:

Your example is effectively just a way of using closures instead of a class
instance.

Almost anything you can do with classes, you can do with closures. The big
advantage of classes over closures is that you have an interface to access
arbitrary class attributes, while you would need a separate closure for
each and every attribute you want access to.

For example, here is a sketch:

class K:
    def method(self, arg):
        return self.spam + arg


k = K()
the_method = k.method  # bound method


as a closure becomes:


def make_closure(instance):  # instance is equivalent to self above
    def method(arg):
        return instance.spam + arg
    return method

the_method = make_closure(obj)  # Some object with a spam field.


The big advantage of a closure is that you have much more strict
encapsulation. The big disadvantage of a closure is that you have much more
strict encapsulation.
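To make that point concrete, a small sketch (an invented example; `Counter` and `make_counter` are not from the thread):

```python
class Counter:
    def __init__(self):
        self.count = 0
    def incr(self):
        self.count += 1
        return self.count

def make_counter():
    count = 0
    def incr():
        nonlocal count
        count += 1
        return count
    return incr

c = Counter()
c.incr()
c.count = 99               # class state is open: easy to inspect or patch
assert c.incr() == 100

incr = make_counter()
incr()
# The closed-over state is not an ordinary attribute; short of poking at
# incr.__closure__[0].cell_contents, there is nothing to read or patch.
assert not hasattr(incr, 'count')
assert incr() == 2
```

The same property that makes the closure safe from accidental interference also makes it opaque to debuggers, pickling, and monkey-patching, hence "advantage and disadvantage".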


  + improves readability

I wouldn't say that.


  + makes objects slightly smaller
 
  + makes object instantiation slightly faster

Are you sure? Have you actually benchmarked this?

 
  - goes against the grain of ducktyping
 
  - makes method calls slower
 
  - makes method call semantics a bit tricky


A couple more negatives:


- no such thing as inheritance;

- is-a relationship tests don't work;

- an unfamiliar idiom for most people;



Also, at least with Python's implementation, a couple of mixed blessings:

± closures are closed against modification;

± internals of the closure are strictly private;




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-16 Thread Marko Rauhamaa
BartC b...@freeuk.com:

 I suppose in many cases an object will have no attributes of its own,
 and so it can rapidly bypass the first lookup.

Almost all objects have quite a few instance attributes. That's what
tells objects apart.

 I don't understand the need for an object creation (to represent A.B
 so that it can call it?) but perhaps such an object can already exist,
 prepared ready for use.

Note that almost identical semantics could be achieved without a class.
Thus, these two constructs are almost identical:

class C:
    def __init__(self, x):
        self.x = x

    def square(self):
        return self.x * self.x

    def cube(self):
        return self.x * self.square()

##

class O: pass

def C(x):
    o = O()

    def square():
        return x * x

    def cube():
        return x * square()

    o.square = square
    o.cube = cube
    return o


IOW, the class is a virtually superfluous concept in Python. Python
probably got it without much thought (other languages at the time had
it). It comes with advantages and disadvantages:

 + improves readability

 + makes objects slightly smaller

 + makes object instantiation slightly faster

 - goes against the grain of ducktyping

 - makes method calls slower

 - makes method call semantics a bit tricky


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-16 Thread Marko Rauhamaa
Marko Rauhamaa ma...@pacujo.net:

 Ok, here's a quick port that I have barely tried out:

And here's a more complete port (with some possible dunder abuse):


### Simple OO Framework

class _O: pass

def make_object(*procedures, base=None, bases=None):
    o = _O()
    methods = {}
    o.__methods__ = methods
    o.__derived__ = None
    if base is not None:
        _inherit_single(o, base)
    elif bases is not None:
        _inherit_multi(o, bases)
    for procedure in procedures:
        methods[procedure.__name__] = procedure
        def method(*args, __procedure__=procedure, __dispatch__=True, **kwargs):
            if not __dispatch__ or o.__derived__ is None:
                return __procedure__(*args, **kwargs)
            derived = o
            while derived.__derived__ is not None:
                derived = derived.__derived__
            return getattr(derived, __procedure__.__name__)(*args, **kwargs)
        setattr(o, procedure.__name__, method)
    return o

def _inherit_single(o, base):
    methods = o.__methods__
    for name, method in base.__methods__.items():
        methods[name] = method
        setattr(o, name, method)

def _inherit_multi(o, bases):
    for base in bases:
        _inherit_single(o, base)

def delegate(method, *args, **kwargs):
    return method(*args, __dispatch__=False, **kwargs)

### Used as follows

def TCPClient():
    def connect(address):
        pass
    def shut_down():
        pass
    return make_object(connect, shut_down)

def SMTPClient():
    tcp_client = TCPClient()
    def connect(address):
        delegate(tcp_client.connect, address)
        do_stuff()
    def send_message(message):
        pass
    return make_object(connect, send_message, base=tcp_client)

client = SMTPClient()
client.connect(None)



Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Marko Rauhamaa
Gregory Ewing greg.ew...@canterbury.ac.nz:

 BartC wrote:
 It appears to be those = and + operations in the code above
 where much of the time is spent. When I trace out the execution paths
 a bit more, I'll have a better picture of how many lines of C code
 are involved in each iteration.

 The path from decoding a bytecode to the C code that implements it can
 be rather convoluted, but there are reasons for each of the
 complications -- mainly to do with supporting the ability to override
 operators with C and/or Python code.

 If you removed those abilities, the implemention would be simpler, and
 possibly faster. But then the language wouldn't be Python any more.

I agree that Python's raison d'être is its dynamism and expressive
power. It definitely shouldn't be sacrificed for performance.

However, in some respects, Python might be going overboard with its
dynamism; are all those dunder methods really needed? Must false be
defined so broadly? Must a method lookup necessarily involve object
creation?


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Chris Angelico
On Fri, May 15, 2015 at 6:59 PM, Marko Rauhamaa ma...@pacujo.net wrote:
 However, in some respects, Python might be going overboard with its
 dynamism; are all those dunder methods really needed?

Yes - at least, most of them. As regards operators, there are three
options: either you have magic methods for all of them (Python style),
or none of them (Java style, no operator overloading), or you
hybridize and permit just a handful of them (and then you have to
decide which). There's really no reason not to have them. The other
dunder methods are a mixed bag; some are to allow you to customize
object creation itself (a class's __new__ method could be considered
equivalent to an instance-specific __call__ method on the type
object), some let you pretend to be different types of number
(__int__, __index__, __complex__), which allows you to duck-type
integerness rather than having to subclass int and rely on magic; and
others let you customize the behaviour of well-known functions, such
as __len__ for len() and __repr__ for repr(). Without dunder methods
for all of these, it would be difficult to make an object play
nicely with the overall Python ecosystem. There are others, though,
which are less crucial (__getstate__ and so on for pickle), but you
can ignore them if you're not using those features.
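As a sketch of how those "well-known function" dunders let an object play nicely with the ecosystem (`Bitfield` is an invented example, not a real API):

```python
class Bitfield:
    """Hypothetical example: dunder methods plug a custom type into
    built-ins without subclassing int."""
    def __init__(self, value, width):
        self.value, self.width = value, width
    def __repr__(self):              # customizes repr() and REPL display
        return f'Bitfield({self.value:#x}, width={self.width})'
    def __len__(self):               # customizes len()
        return self.width
    def __index__(self):             # duck-typed integerness, no subclassing
        return self.value

b = Bitfield(0b1011, width=4)
assert len(b) == 4
assert hex(b) == '0xb'                  # __index__ makes hex() accept it
assert 'abcdefghijklmnop'[b] == 'l'     # and makes b usable as an index
assert repr(b) == 'Bitfield(0xb, width=4)'
```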

 Must false be defined so broadly?

Different languages define true and false differently. REXX says that
1 is true and 0 is false, and anything else is an error. Pike says
that the integer 0 is false and anything else is true; the philosophy
is that a thing is true and the absence of any thing is false.
Python says that an empty thing is false and a non-empty thing is
true; if next(iter(x)) raises StopIteration, x is probably false, and
vice versa. All three have their merits, all three have their
consequences. One consequence of Python's model is that a __bool__
method is needed on any object that might be empty (unless it defines
__len__, in which case that makes a fine fall-back); it's normal in
Python code to distinguish between "if x:" and "if x is not None:",
where the former sees if x has anything in it, but the latter sees
whether x even exists. (More or less.)
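A small sketch of the __len__ fall-back and the two tests (my example; `Batch` and `describe` are invented names):

```python
class Batch:
    """Hypothetical container: it defines only __len__, so truth
    testing falls back to len(obj) != 0."""
    def __init__(self, items):
        self.items = list(items)
    def __len__(self):
        return len(self.items)

def describe(x):
    if x is None:          # does x even exist?
        return 'missing'
    if not x:              # does x have anything in it?
        return 'present but empty'
    return 'non-empty'

assert describe(None) == 'missing'
assert describe(Batch([])) == 'present but empty'
assert describe(Batch([1, 2])) == 'non-empty'
```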

 Must a method lookup necessarily involve object creation?

Actually, no. Conceptually, this method call:

foo.bar(1, 2, 3)

involves looking up 'foo' in the current namespace, looking up
attribute 'bar' on it, treating the result as a function, and calling
it with three integer objects as its arguments. And the language
definition demands that this work even if the foo.bar part is broken
out:

def return_function():
    return foo.bar

def call_function():
    return_function()(1, 2, 3)

But a particular Python implementation is most welcome to notice the
extremely common situation of method calls and optimize it. I'm not
sure if PyPy does this, but I do remember reading about at least one
Python that does; CPython has an optimization for the actual memory
allocations involved, though I think it does actually construct some
sort of object for each one; as long as the resulting behaviour is
within spec, objects needn't be created just to be destroyed.

Dynamism doesn't have to be implemented naively, just as long as the
slow path is there if anyone needs it.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread BartC

On 15/05/2015 07:05, Gregory Ewing wrote:

BartC wrote:

It appears to be those = and + operations in the code above where
much of the time is spent. When I trace out the execution paths a bit
more, I'll have a better picture of how many lines of C code are
involved in each iteration.


The path from decoding a bytecode to the C code that
implements it can be rather convoluted, but there are
reasons for each of the complications -- mainly to
do with supporting the ability to override operators
with C and/or Python code.

If you removed those abilities, the implemention
would be simpler, and possibly faster. But then the
language wouldn't be Python any more.


That's the challenge; programs must still work as they did before. (But 
I suppose it can be exasperating for CPython developers if 99% of 
programs could be made faster but can't be because of the 1% that rely 
on certain features).


Still, I'm just seeing what there is in CPython that limits the 
performance, and whether anything can be done /easily/ without going to 
solutions such as PyPy which are unsatisfactory IMO (by being even more 
fantastically complicated, but they don't always give a speed-up either).


For example, there is a /specific/ byte-code called BINARY_ADD, which 
then proceeds to call a /generic/ binary-op handler! This throws away 
the advantage of knowing at byte-code generation time exactly which 
operation is needed.


(Unless I'm just looking at a bunch of macros or maybe there is some 
inlining going on with the compiler reducing all those table-indexing 
operations for 'Add', with direct accesses, but it didn't look like it. 
I'm just glad it doesn't use C++ which would have made it truly 
impossible to figure out what's going on.)


(BTW since I'm having to use Linux to compile this anyway, is there a 
tool available that will tell me whether something in the C sources is a 
function or macro? (And preferably where the definition might be located.))


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Gregory Ewing

BartC wrote:
It appears to be those = and + operations in the code above where 
much of the time is spent. When I trace out the execution paths a bit 
more, I'll have a better picture of how many lines of C code are 
involved in each iteration.


The path from decoding a bytecode to the C code that
implements it can be rather convoluted, but there are
reasons for each of the complications -- mainly to
do with supporting the ability to override operators
with C and/or Python code.

If you removed those abilities, the implemention
would be simpler, and possibly faster. But then the
language wouldn't be Python any more.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Marko Rauhamaa
wxjmfa...@gmail.com:

 Implement unicode correctly.

Did they reject your patch?


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Marko Rauhamaa
wxjmfa...@gmail.com:

 On Friday, 15 May 2015 at 11:20:25 UTC+2, Marko Rauhamaa wrote:
 wxjmfa...@gmail.com:
 
  Implement unicode correctly.
 Did they reject your patch?

 You can not patch something that is wrong by design.

Are you saying the Python language spec is unfixable or that the CPython
implementation is unfixable?

If CPython is unfixable, you can develop a better Python implementation.

If Python itself is unfixable, what brings you here?


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Steven D'Aprano
On Fri, 15 May 2015 08:52 pm, Marko Rauhamaa wrote:

 wxjmfa...@gmail.com:
 
 On Friday, 15 May 2015 at 11:20:25 UTC+2, Marko Rauhamaa wrote:
 wxjmfa...@gmail.com:
 
  Implement unicode correctly.
 Did they reject your patch?

 You can not patch something that is wrong by design.
 
 Are you saying the Python language spec is unfixable or that the CPython
 implementation is unfixable?

JMF is obsessed with a trivial and artificial performance regression in the
handling of Unicode strings since Python 3.3, which introduced a
significant memory optimization for Unicode strings. Each individual string
uses a code unit no larger than necessary, thus if a string contains
nothing but ASCII or Latin 1 characters, it will use one byte per
character; if it fits into the Basic Multilingual Plane, two bytes per
character; and only use four bytes per character if there are astral
characters in the string.

(That is, Python strings select from a Latin-1, UCS-2 and UTF-32 encoded
form at creation time, according to the largest code point in the string.)
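The effect is easy to observe with sys.getsizeof. (Exact byte counts vary between CPython versions, so only the ordering is asserted here.)

```python
import sys

# Three 100-character strings whose widest code point differs:
ascii_s  = 'a' * 100             # all code points < 256:  1 byte per char
bmp_s    = '\u0394' * 100        # GREEK CAPITAL DELTA:    2 bytes per char
astral_s = '\U0001F600' * 100    # emoji outside the BMP:  4 bytes per char

# Per PEP 393, each string picks the narrowest representation that fits.
assert sys.getsizeof(ascii_s) < sys.getsizeof(bmp_s) < sys.getsizeof(astral_s)
```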

The benefit of this is that most strings will use 1/2 or 1/4 of the memory
that they otherwise would need, which gives an impressive memory saving.
That leads to demonstrable speed-ups in real-world code, however it is
possible to find artificial benchmarks that experience a slowdown compared
to Python 3.2.

JMF found one such artificial benchmark, involving creating and throwing
away many strings as fast as possible without doing any work with them, and
from this has built this fantasy in his head that Python is not compliant
with the Unicode spec and is logically, mathematically broken.


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Chris Angelico
On Fri, May 15, 2015 at 8:14 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 (If anything, using an implicit boolean test will be faster than an
 explicit manual test, because it doesn't have to call the len() global.)

Even more so: Some objects may be capable of determining their own
lengths, but can ascertain their own emptiness more quickly. So len(x)
might have to chug chug chug to figure out exactly how many results
there are (imagine a database query or something), where bool(x)
merely has to see whether or not a single one exists (imagine a
database query with LIMIT 1 tacked on).

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Marko Rauhamaa
Chris Angelico ros...@gmail.com:

 On Fri, May 15, 2015 at 6:59 PM, Marko Rauhamaa ma...@pacujo.net wrote:
 Must a method lookup necessarily involve object creation?

 Actually, no.
 [...]
 a particular Python implementation is most welcome to notice the
 extremely common situation of method calls and optimize it.

I'm not sure that is feasible given the way it has been specified. You'd
have to prove the class attribute lookup produces the same outcome in
consecutive method references.

Also:

>>> class X:
...     def f(self): pass
...
>>> x = X()
>>> f = x.f
>>> ff = x.f
>>> f is ff
False

Would a compliant Python implementation be allowed to respond True?

Maybe. At least method objects seem immutable:

>>> f.__name__
'f'
>>> f.__name__ = 'g'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'method' object has no attribute '__name__'
>>> f.__eq__ = None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'method' object attribute '__eq__' is read-only
>>> f.xyz = 'xyz'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'method' object has no attribute 'xyz'


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Christian Gollwitzer

Am 14.05.15 um 20:50 schrieb Terry Reedy:

On 5/14/2015 1:11 PM, Chris Angelico wrote:


2) make test - run the entire test suite. Takes just as long every
time, but most of it won't have changed.


The test runner has an option, -jn, to run tests in n processes instead
of just 1.  On my 6-core Pentium, -j5 cuts the time to almost exactly
1/5th of what it takes otherwise.  -j10 seems faster but I have not
timed it.  I suspect that 'make test' does not use the -j option.



Just to clarify, -j is an option of GNU make to run the Makefile in 
parallel. Unless the Makefile is buggy, this should result in the same 
output. You can also set an environment variable to enable this 
permanently (until you log out) like


export MAKEFLAGS=-j5

Put this into your .bashrc or .profile, and it'll become permanent.

Christian
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Chris Angelico
On Fri, May 15, 2015 at 10:10 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 The benefit of this is that most strings will use 1/2 or 1/4 of the memory
 that they otherwise would need, which gives an impressive memory saving.
 That leads to demonstrable speed-ups in real-world code, however it is
 possible to find artificial benchmarks that experience a slowdown compared
 to Python 3.2.

It's also possible to find a number of situations in which a narrow
build of 3.2 was faster than 3.3, due to the buggy handling of
surrogates. I've no idea whether jmf is still complaining against that
basis or not, as I don't see his posts.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Steven D'Aprano
On Fri, 15 May 2015 06:59 pm, Marko Rauhamaa wrote:

 However, in some respects, Python might be going overboard with its
 dynamism; are all those dunder methods really needed? Must false be
 defined so broadly? Must a method lookup necessarily involve object
 creation?

Yes, what do you mean, and no.

(1) Yes, the dunder methods are necessary to support operator overloading
and various other protocols. The existence of dunder methods doesn't have
any runtime costs except for that related to memory usage. Fortunately, the
dunder methods only exist on the class itself, not each and every instance.

(2) What do you mean by false being defined so broadly?

The singleton instance False is not defined broadly at all. It is a built-in
constant, and there's only one of it. In Python 3, False is even a
keyword, so you cannot redefine the name. (It's not a keyword in Python 2
because of historical reasons.)

Perhaps you are talking about falsey (false-like) instances, e.g. 

if []: ...
else: ...

will run the else block. That's quite broad, in a sense, but it has no real
runtime cost over and above a more Pascal-ish language would force you to
have:

if len([]) == 0: ...
else: ...

Because True and False are singletons, the overhead of testing something in
a boolean context is no greater than the cost of testing the same condition
by hand. That is, it makes no difference whether you manually compare the
list's length to zero, or let its __nonzero__ or __bool__ method do the
same. (If anything, using an implicit boolean test will be faster than an
explicit manual test, because it doesn't have to call the len() global.)
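
A quick way to check that claim yourself (a sketch only; absolute timings
vary by machine and interpreter build):

```python
import timeit

# The implicit boolean test and the explicit len() comparison always agree:
for seq in ([], [1], "", "x", (), (0,)):
    assert bool(seq) == (len(seq) != 0)

# Rough micro-benchmark of the two spellings of "is this list empty?":
implicit = timeit.timeit("if s: pass", setup="s = []", number=100_000)
explicit = timeit.timeit("if len(s) == 0: pass", setup="s = []", number=100_000)
print(f"implicit: {implicit:.4f}s  explicit len(): {explicit:.4f}s")
```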

(3) Method lookups don't *necessarily* have to involve object creation. The
only semantics which Python requires (so far as I understand it) are:

- Taking a reference to an unbound method:

ref = str.upper

  may return the function object itself, which clearly already exists.

- Taking a reference to a bound method:

ref = "some string".upper

  must return a method object, but it doesn't have to be recreated from
  scratch each and every time. It could cache it once, then always return
  that. That, I believe, is an implementation detail.

- Calling a method may bypass creating a method, and just use the 
  function object directly, provided Python knows that the descriptor
  has no side-effects.

For example, since FunctionType.__get__ has no side-effects, a Python
interpreter could special-case function objects and avoid calling the
descriptor protocol when it knows that the method is just going to be
called immediately.
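
A small runnable sketch of the equivalence that would make such a
special case safe (hypothetical class names, not CPython internals):

```python
class C:
    def f(self):
        return 42

c = C()
func = C.__dict__['f']   # the plain function object stored on the class

# Normal route: the descriptor protocol builds a bound method, then calls it.
assert c.f() == 42
assert func.__get__(c, C)() == 42   # what happens under the hood

# The shortcut: call the function directly, passing the instance as the
# first argument, skipping method-object creation entirely.
assert func(c) == 42
```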


But... having said all that... how do you know that these issues are
bottlenecks that eliminating them would speed up the code by any
significant amount?




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Mark Lawrence

On 15/05/2015 11:52, Marko Rauhamaa wrote:

wxjmfa...@gmail.com:


Le vendredi 15 mai 2015 11:20:25 UTC+2, Marko Rauhamaa a écrit :

wxjmfa...@gmail.com:


Implement unicode correctly.

Did they reject your patch?


You can not patch something that is wrong by design.


Are you saying the Python language spec is unfixable or that the CPython
implementation is unfixable?

If CPython is unfixable, you can develop a better Python implementation.

If Python itself is unfixable, what brings you here?

Marko



I forgot to mention earlier that I report all his rubbish as abuse on 
google groups.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Steven D'Aprano
On Fri, 15 May 2015 08:50 pm, Marko Rauhamaa wrote:

 Chris Angelico ros...@gmail.com:
 
 On Fri, May 15, 2015 at 6:59 PM, Marko Rauhamaa ma...@pacujo.net wrote:
 Must a method lookup necessarily involve object creation?

 Actually, no.
 [...]
 a particular Python implementation is most welcome to notice the
 extremely common situation of method calls and optimize it.
 
 I'm not sure that is feasible given the way it has been specified. You'd
 have to prove the class attribute lookup produces the same outcome in
 consecutive method references.

Sure. But some implementations may have a more, um, flexible approach to
correctness, and offer more aggressive optimizations which break the letter
of Python's semantics but work for 90% of cases. Just because CPython
doesn't do so, doesn't mean that some new implementation might not offer a
series of aggressive optimizations which the caller (or maybe the module?)
can turn on as needed, e.g.:

- assume methods never change;
- assume classes are static;
- assume built-in names always refer to the known built-in;

etc. Such an optimized Python, when running with those optimizations turned
on, is not *strictly* Python, but "buyer beware" applies here. If the
optimizations break your code or make testing hard, don't use it.


 Also:
 
 >>> class X:
 ...   def f(self): pass
 ...
 >>> x = X()
 >>> f = x.f
 >>> ff = x.f
 >>> f is ff
 False
 
 Would a compliant Python implementation be allowed to respond True?

Certainly.

When you retrieve x.f, Python applies the usual attribute lookup code,
which simplified looks like this:

if 'f' in x.__dict__:
    attr = x.__dict__['f']
else:
    for K in type(x).__mro__:
        # Walk the parent classes of x in the method resolution order
        if 'f' in K.__dict__:
            attr = K.__dict__['f']
            break
    else:  # no break
        raise AttributeError
# if we get here, we know x.f exists and is bound to attr
# now apply the descriptor protocol (simplified)
if hasattr(attr, '__get__'):
    attr = attr.__get__(x, type(x))
# Finally we can call x.f() -- attr is now bound, so x isn't passed again
return attr(*args)

Functions have a __get__ method which returns the method object! Imagine
they look something like this:

class FunctionType:
    def __call__(self, *args, **kwargs):
        ...  # Actually call the code that does stuff

    def __get__(self, instance, cls):
        if instance is None:
            # Looked up on the class, not on an instance
            return self
        return MethodType(self, instance)  # self is the function


This implementation creates a new method object every time you look it up.
But functions *could* do this:

    def __get__(self, instance, cls):
        if instance is None:
            # Looked up on the class, not on an instance
            return self
        if self._method is None:
            self._method = MethodType(self, instance)  # Cache it.
        return self._method



What's more, a compliant implementation could reach the if we get here
point in the lookup procedure above, and do this:

# if we get here, we know attr exists
if type(attr) is FunctionType:  # Fast pointer comparison.
    return attr(x, *args)
else:
    # do the descriptor protocol thing, and then call attr
    ...


It can only do this if it knows that x.f is a real function, not some sort
of callable or function subclass, because in that case who knows what
side-effects the __get__ method might have.

How much time would it save? Probably very little. After all, unless the
method call itself did bugger-all work, the time to create the method
object is probably insignificant. But it's a possible optimization.
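
One way to gauge "how much time" (a rough sketch; the numbers vary wildly by
machine and interpreter): compare a loop that re-binds the method on every
call with one that hoists the bound method out of the loop.

```python
import timeit

setup = """
class C:
    def m(self):
        pass
c = C()
bound = c.m        # method object created once, up front
"""
# Attribute lookup plus method-object creation on every call:
per_call = timeit.timeit("c.m()", setup=setup, number=100_000)
# Pre-bound method, no per-call creation:
hoisted = timeit.timeit("bound()", setup=setup, number=100_000)
print(f"lookup each call: {per_call:.4f}s  pre-bound: {hoisted:.4f}s")
```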



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Mark Lawrence

On 15/05/2015 10:20, Marko Rauhamaa wrote:

wxjmfa...@gmail.com:


Implement unicode correctly.


Did they reject your patch?

Marko



Please don't feed him, it's been obvious for years that he hasn't the 
faintest idea what he's talking about.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Chris Angelico
On Sat, May 16, 2015 at 1:00 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Fri, May 15, 2015 at 6:43 AM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 How much time would it save? Probably very little. After all, unless the
 method call itself did bugger-all work, the time to create the method
 object is probably insignificant. But it's a possible optimization.

 An interesting alternative (if it's not already being done) might be
 to maintain a limited free-list of method objects, removing the need
 to allocate memory for one before filling it in with data.

It is already done. Some stats were posted recently to python-dev, and
(if I read them correctly) method objects are among the free-list
types. So the actual memory (de)allocations are optimized, and all
that's left is setting a couple of pointers to select an object and a
function.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Marko Rauhamaa
BartC b...@freeuk.com:

 What /is/ a method lookup? Is it when you have this:

  A.B()

 and need to find whether the expression A (or its class or type) has a
 name B associated with it? (And it then needs to check whether B is
 something that can be called.)

 If so, does that have to be done using Python's Dict mechanism? (Ie.
 searching for a key 'B' by name and seeing if the object associated
 with it is a method. That does not sound efficient.)

That is a general feature among high-level programming languages. In
Python, it is even more complicated:

 * first the object's dict is looked up for the method name

 * if the method is not found (it usually isn't), the dict of the
   object's class is consulted

 * if the method is found (it usually is), a function object is
   instantiated that delegates to the class's method and embeds a self
   reference to the object into the call

IOW, two dict lookups plus an object construction for each method call.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Mark Lawrence

On 15/05/2015 23:44, Marko Rauhamaa wrote:

BartC b...@freeuk.com:


What /is/ a method lookup? Is it when you have this:

  A.B()

and need to find whether the expression A (or its class or type) has a
name B associated with it? (And it then needs to check whether B is
something that can be called.)

If so, does that have to be done using Python's Dict mechanism? (Ie.
searching for a key 'B' by name and seeing if the object associated
with it is a method. That does not sound efficient.)


That is a general feature among high-level programming languages. In
Python, it is even more complicated:

  * first the object's dict is looked up for the method name

  * if the method is not found (it usually isn't), the dict of the
object's class is consulted

  * if the method is found (it usually is), a function object is
instantiated that delegates to the class's method and embeds a self
reference to the object to the call

IOW, two dict lookups plus an object construction for each method call.


Marko



As a picture paints a thousand words, is anybody aware of a site or sites 
that show this diagrammatically? I think I and possibly others would 
find it far easier to grasp.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread MRAB

On 2015-05-16 01:43, BartC wrote:

On 15/05/2015 23:44, Marko Rauhamaa wrote:

BartC b...@freeuk.com:


What /is/ a method lookup? Is it when you have this:

  A.B()

and need to find whether the expression A (or its class or type) has a
name B associated with it? (And it then needs to check whether B is
something that can be called.)

If so, does that have to be done using Python's Dict mechanism? (Ie.
searching for a key 'B' by name and seeing if the object associated
with it is a method. That does not sound efficient.)


That is a general feature among high-level programming languages. In
Python, it is even more complicated:

  * first the object's dict is looked up for the method name

  * if the method is not found (it usually isn't), the dict of the
object's class is consulted

  * if the method is found (it usually is), a function object is
instantiated that delegates to the class's method and embeds a self
reference to the object to the call

IOW, two dict lookups plus an object construction for each method call.


OK, I didn't know that objects have their own set of attributes that are
distinct from the class they belong to. I really ought to learn more
Python!

(Yet, I have this crazy urge now to create my own bytecode interpreter
for, if not exactly Python itself, then an equivalent language. Just to
see if I can do any better than CPython, given the same language
restraints.

Although I'm hampered a little by not knowing Python well enough. Nor
OOP, but those are minor details... Anyway it sounds more fun than
trying to decipher the layers of macros and conditional code that appear
to be the CPython sources.)

   IOW, two dict lookups plus an object construction for each method call.

I suppose in many cases an object will have no attributes of its own,
and so it can rapidly bypass the first lookup. I don't understand the
need for an object creation (to represent A.B so that it can call it?)
but perhaps such an object can already exist, prepared ready for use.


It's possible to do:

f = A.B
...
f()

so it's necessary to have an object for A.B.

The question is how much you would gain from optimising A.B() as a
special case (increase in speed vs increase in complexity).

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Terry Reedy

On 5/15/2015 5:54 PM, BartC wrote:


What /is/ a method lookup? Is it when you have this:

  A.B()


This is parsed as (A.B)()


and need to find whether the expression A (or its class or type) has a
name B associated with it?


Yes.  Dotted names imply an attribute lookup.


(And it then needs to check whether B is something that can be called.)


The object resulting from the attribute lookup, A.B (not B exactly), is 
called in a separate operation (with a separate bytecode). It depends on 
the object having a .__call__ method.  The .__call__ method is 
*executed* (rather than *called*, which would lead to infinite regress).


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Gregory Ewing

BartC wrote:
For example, there is a /specific/ byte-code called BINARY_ADD, which 
then proceeds to call a /generic/ binary-op handler! This throws away 
the advantage of knowing at byte-code generation time exactly which 
operation is needed.


While inlining the binary-op handling might give you a
slightly shorter code path, it wouldn't necessarily speed
anything up. It's possible, for example, that the shared
binary-op handler fits in the instruction cache, but the
various inlined copies of it don't, leading to a slowdown.

The only way to be sure about things like that is to try
them and measure. The days when you could predict the speed
of a program just by counting the number of instructions
executed are long gone.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread BartC

On 15/05/2015 23:44, Marko Rauhamaa wrote:

BartC b...@freeuk.com:


What /is/ a method lookup? Is it when you have this:

  A.B()

and need to find whether the expression A (or its class or type) has a
name B associated with it? (And it then needs to check whether B is
something that can be called.)

If so, does that have to be done using Python's Dict mechanism? (Ie.
searching for a key 'B' by name and seeing if the object associated
with it is a method. That does not sound efficient.)


That is a general feature among high-level programming languages. In
Python, it is even more complicated:

  * first the object's dict is looked up for the method name

  * if the method is not found (it usually isn't), the dict of the
object's class is consulted

  * if the method is found (it usually is), a function object is
instantiated that delegates to the class's method and embeds a self
reference to the object to the call

IOW, two dict lookups plus an object construction for each method call.


OK, I didn't know that objects have their own set of attributes that are 
distinct from the class they belong to. I really ought to learn more 
Python!


(Yet, I have this crazy urge now to create my own bytecode interpreter 
for, if not exactly Python itself, then an equivalent language. Just to 
see if I can do any better than CPython, given the same language 
restraints.


Although I'm hampered a little by not knowing Python well enough. Nor 
OOP, but those are minor details... Anyway it sounds more fun than 
trying to decipher the layers of macros and conditional code that appear 
to be the CPython sources.)


 IOW, two dict lookups plus an object construction for each method call.

I suppose in many cases an object will have no attributes of its own, 
and so it can rapidly bypass the first lookup. I don't understand the 
need for an object creation (to represent A.B so that it can call it?) 
but perhaps such an object can already exist, prepared ready for use.


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread BartC

On 15/05/2015 09:59, Marko Rauhamaa wrote:


The path from decoding a bytecode to the C code that implements it can
be rather convoluted, but there are reasons for each of the
complications -- mainly to do with supporting the ability to override
operators with C and/or Python code.

If you removed those abilities, the implemention would be simpler, and
possibly faster. But then the language wouldn't be Python any more.


I agree that Python's raison d'être is its dynamism and expressive
power. It definitely shouldn't be sacrificed for performance.

However, in some respects, Python might be going overboard with its
dynamism; are all those dunder methods really needed? Must false be
defined so broadly? Must a method lookup necessarily involve object
creation?


What /is/ a method lookup? Is it when you have this:

 A.B()

and need to find whether the expression A (or its class or type) has a 
name B associated with it? (And it then needs to check whether B is 
something that can be called.)


If so, does that have to be done using Python's Dict mechanism? (Ie. 
searching for a key 'B' by name and seeing if the object associated with 
it is a method. That does not sound efficient.)


(And I guess Python's classes come into play so if B is not part of A's 
class then it might be part of some base-class. I can see that it can 
get complicated, but I don't use OO so can't speculate further.)


(In the language whose implementation I'm comparing with CPython's, it 
doesn't have classes. A.B() can still appear, but .B will need to be an 
attribute that the (bytecode) compiler already knows from a prior 
definition (usually, some struct or record if A is an expression).


If there is only one .B it knows, then a simple check that A is the 
correct type is all that is needed. Otherwise a runtime search through 
all .Bs (and a compact table will have been set up for this) is needed 
to find the .B that matches A's type. But this is all still pretty quick.


In this context, A.B will need to be some function pointer, and A.B() 
will call it.)


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Ian Kelly
On Fri, May 15, 2015 at 6:43 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 How much time would it save? Probably very little. After all, unless the
 method call itself did bugger-all work, the time to create the method
 object is probably insignificant. But it's a possible optimization.

An interesting alternative (if it's not already being done) might be
to maintain a limited free-list of method objects, removing the need
to allocate memory for one before filling it in with data.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Ian Kelly
On Fri, May 15, 2015 at 9:00 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Fri, May 15, 2015 at 6:43 AM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 How much time would it save? Probably very little. After all, unless the
 method call itself did bugger-all work, the time to create the method
 object is probably insignificant. But it's a possible optimization.

 An interesting alternative (if it's not already being done) might be
 to maintain a limited free-list of method objects, removing the need
 to allocate memory for one before filling it in with data.

Looks like it is already being done:
https://hg.python.org/cpython/file/e7c7431f91b2/Objects/methodobject.c#l7
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Steven D'Aprano
On Sat, 16 May 2015 09:27 am, Mark Lawrence wrote:

 On 15/05/2015 23:44, Marko Rauhamaa wrote:
 BartC b...@freeuk.com:

 What /is/ a method lookup? Is it when you have this:

   A.B()

 and need to find whether the expression A (or its class or type) has a
 name B associated with it? (And it then needs to check whether B is
 something that can be called.)

 If so, does that have to be done using Python's Dict mechanism? (Ie.
 searching for a key 'B' by name and seeing if the object associated
 with it is a method. That does not sound efficient.)

It's not as inefficient as you may think. Dicts are hash tables, and hash
tables are a standard computer science data structure for performing very
fast searches at constant (or near constant) time.

The dict is basically an array of pointers to (key, value). To look a name
up in the dict, you hash the string which gives you an index into the
array, then look at that position. If it is blank, you know there is no
match. If it points to a string, you compare that string to your string. If
they are equal, then you have a match. If they aren't equal, you have a
collision, and you have to look elsewhere (details differ) but typically
you don't end up looking in more than one or two positions. So all pretty
fast, and close enough to constant time.

To speed things up even further, I think that the hash value is cached with
the string, so it only needs to be calculated the first time.
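
A toy version of that probing scheme (heavily simplified; CPython's real
probe sequence and table layout differ):

```python
def lookup(table, key):
    """Minimal open-addressing lookup over a list of (key, value) slots."""
    n = len(table)
    i = hash(key) % n                  # hash the string -> index into array
    for _ in range(n):
        slot = table[i]
        if slot is None:
            raise KeyError(key)        # blank slot: no match
        if slot[0] == key:             # compare the stored key to ours
            return slot[1]             # equal: we have a match
        i = (i + 1) % n                # collision: look elsewhere
    raise KeyError(key)

# Build a tiny table the same way, probing past collisions:
table = [None] * 8
for k, v in [("bark", "method"), ("name", "laddy")]:
    i = hash(k) % 8
    while table[i] is not None:
        i = (i + 1) % 8
    table[i] = (k, v)

assert lookup(table, "bark") == "method"
```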



 That is a general feature among high-level programming languages. In
 Python, it is even more complicated:

   * first the object's dict is looked up for the method name

   * if the method is not found (it usually isn't), the dict of the
 object's class is consulted

   * if the method is found (it usually is), a function object is
 instantiated that delegates to the class's method and embeds a self
 reference to the object to the call

It's the other way around. The function object already exists: you created
it by writing `def method(self, *args): ... ` inside the class body. def
always makes a function. It's the *method* object which is created on the
fly, delegating to the function.



 IOW, two dict lookups plus an object construction for each method call.


 Marko

 
 As a picture paints a thousand words, is anybody aware of a site or sites
 that show this diagrammatically? I think I and possibly others would
 find it far easier to grasp.

No I'm not aware of any such site, but I can try to make it more obvious
with an example.

Suppose we have a hierarchy of classes, starting from the root of the
hierarchy (object) to a specific instance:

class Animal(object): 
    pass

class Mammal(Animal):
    pass

class Dog(Mammal):
    def bark(self): ...

laddy = Dog()


We then look up a method:

laddy.bark()

In a non-dynamic language like Java, the compiler knows exactly where bark
is defined (in the Dog class) and can call it directly. In dynamic
languages like Python, the compiler can't be sure that bark hasn't been
shadowed or overridden at runtime, so it has to search for the first match
found. Simplified:

* Does laddy.__dict__ contain the key "bark"? If so, we have a match.

* For each class in the MRO (Method Resolution Order), namely 
  [Dog, Mammal, Animal, object], does the class __dict__ contain the
  key "bark"? If so, we have a match.

* Do any of those classes in the MRO have a __getattr__ method? If
  so, then try calling __getattr__("bark") and see if it returns 
  a match.

* Give up and raise AttributeError.

[Aside: some of the things I haven't covered: __slots__, __getattribute__,
how metaclasses can affect this.]
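
The search order those steps walk through can be inspected directly, using
the Dog hierarchy defined above:

```python
class Animal(object):
    pass

class Mammal(Animal):
    pass

class Dog(Mammal):
    def bark(self):
        return "woof"

laddy = Dog()

# The MRO the lookup walks, in order:
assert Dog.__mro__ == (Dog, Mammal, Animal, object)
# The instance dict is empty, so the class dicts get searched:
assert 'bark' not in laddy.__dict__
assert 'bark' in Dog.__dict__
assert laddy.bark() == "woof"
```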

In the case of laddy.bark, the matching attribute is found as
Dog.__dict__['bark']:

temp = Dog.__dict__['bark']  # laddy.bark is a function

At this point, the descriptor protocol applies. You can ignore this part if
you like, and just pretend that laddy.bark returns a method instead of a
function, but if you want to know what actually happens in all its gory
details, it is something like this (again, simplified):

* Does the attribute have a __get__ method? If not, then we just 
  use the object as-is, with no changes.

* But if it does have a __get__ method, then it is a descriptor and
  we call the __get__ method to get the object we actually use.


Since functions are descriptors, we get this:

temp = temp.__get__(laddy, Dog)  # returns a method object

and finally we can call the method:

temp()  # laddy.bark()
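
Put together as runnable code (a condensed version of the Dog class above):

```python
class Dog:
    def bark(self):
        return "woof"

laddy = Dog()

temp = Dog.__dict__['bark']       # attribute lookup finds the plain function
temp = temp.__get__(laddy, Dog)   # descriptor protocol binds it to laddy
assert temp() == "woof"           # the actual call
assert temp == laddy.bark         # same bound method as the normal spelling
```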


None of these individual operations are particularly expensive, nor are
there a lot of them. For a typical instance, the MRO usually contains only
two or three classes, and __dict__ lookups are fast. Nevertheless, even
though each method lookup is individually *almost* as fast as the sort of
pre-compiled all-but-instantaneous access which Java can do, it all adds
up. So in Java, a long chain of dots:

foo.bar.baz.foobar.spam.eggs.cheese

can be resolved at compile-time, and takes no more time than


Re: Building CPython

2015-05-15 Thread Chris Angelico
On Sat, May 16, 2015 at 11:55 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 but in Python's case it has to be resolved at run-time, so if you care about
 speed, you should try to avoid long chains of dots in performance critical
 loops. E.g. instead of:

 for x in sequence:
     foo.bar.baz.foobar.spam.eggs.cheese(x)

 you can write:

 cheese = foo.bar.baz.foobar.spam.eggs.cheese
 for x in sequence:
     cheese(x)

Like all advice, of course, this should not be taken on its own. Some
code tries to avoid long chains of dots by adding wrapper methods:

for x in sequence:
    foo.cheese(x)

where foo.cheese() delegates to self.bar.cheese() and so on down the
line. That, of course, will be far FAR slower, in pretty much any
language (unless the compiler can in-line the code completely, in
which case it's effectively being optimized down to the original
anyway); the dots aren't as bad as you might think :)

Just always remember the one cardinal rule of optimization: MEASURE
FIRST. You have no idea how slow something is until you measure it.

(I'm not addressing my comments to Steven here, who I'm fairly sure is
aware of all of this(!), but it's his post that gave the example that
I'm quoting.)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Mark Lawrence

On 16/05/2015 02:55, Steven D'Aprano wrote:

On Sat, 16 May 2015 09:27 am, Mark Lawrence wrote:


On 15/05/2015 23:44, Marko Rauhamaa wrote:

BartC b...@freeuk.com:


What /is/ a method lookup? Is it when you have this:

   A.B()

and need to find whether the expression A (or its class or type) has a
name B associated with it? (And it then needs to check whether B is
something that can be called.)

If so, does that have to be done using Python's Dict mechanism? (Ie.
searching for a key 'B' by name and seeing if the object associated
with it is a method. That does not sound efficient.)


It's not as inefficient as you may think. Dicts are hash tables, and hash
tables are a standard computer science data structure for performing very
fast searches at constant (or near constant) time.

The dict is basically an array of pointers to (key, value). To look a name
up in the dict, you hash the string which gives you an index into the
array, then look at that position. If it is blank, you know there is no
match. If it points to a string, you compare that string to your string. If
they are equal, then you have a match. If they aren't equal, you have a
collision, and you have to look elsewhere (details differ) but typically
you don't end up looking in more than one or two positions. So all pretty
fast, and close enough to constant time.

To speed things up even further, I think that the hash value is cached with
the string, so it only needs to be calculated the first time.




That is a general feature among high-level programming languages. In
Python, it is even more complicated:

   * first the object's dict is looked up for the method name

   * if the method is not found (it usually isn't), the dict of the
 object's class is consulted

   * if the method is found (it usually is), a function object is
 instantiated that delegates to the class's method and embeds a self
 reference to the object to the call


It's the other way around. The function object already exists: you created
it by writing `def method(self, *args): ... ` inside the class body. def
always makes a function. It's the *method* object which is created on the
fly, delegating to the function.




IOW, two dict lookups plus an object construction for each method call.


Marko



As a picture paints a thousand words, is anybody aware of a site or sites
that show this diagrammatically? I think I and possibly others would
find it far easier to grasp.


No I'm not aware of any such site, but I can try to make it more obvious
with an example.

Suppose we have a hierarchy of classes, starting from the root of the
hierarchy (object) to a specific instance:

class Animal(object):
 pass

class Mammal(Animal):
 pass

class Dog(Mammal):
 def bark(self): ...

laddy = Dog()


We then look up a method:

laddy.bark()

In a non-dynamic language like Java, the compiler knows exactly where bark
is defined (in the Dog class) and can call it directly. In dynamic
languages like Python, the compiler can't be sure that bark hasn't been
shadowed or overridden at runtime, so it has to search for the first match
found. Simplified:

* Does laddy.__dict__ contain the key "bark"? If so, we have a match.

* For each class in the MRO (Method Resolution Order), namely
   [Dog, Mammal, Animal, object], does the class __dict__ contain the
   key "bark"? If so, we have a match.

* Do any of those classes in the MRO have a __getattr__ method? If
   so, then try calling __getattr__("bark") and see if it returns
   a match.

* Give up and raise AttributeError.

[Aside: some of the things I haven't covered: __slots__, __getattribute__,
how metaclasses can affect this.]

In the case of laddy.bark, the matching attribute is found as
Dog.__dict__['bark']:

 temp = Dog.__dict__['bark']  # laddy.bark is a function

At this point, the descriptor protocol applies. You can ignore this part if
you like, and just pretend that laddy.bark returns a method instead of a
function, but if you want to know what actually happens in all its gory
details, it is something like this (again, simplified):

* Does the attribute have a __get__ method? If not, then we just
   use the object as-is, with no changes.

* But if it does have a __get__ method, then it is a descriptor and
   we call the __get__ method to get the object we actually use.


Since functions are descriptors, we get this:

 temp = temp.__get__(laddy, Dog)  # returns a method object

and finally we can call the method:

 temp()  # laddy.bark()


None of these individual operations are particularly expensive, nor are
there a lot of them. For a typical instance, the MRO usually contains only
two or three classes, and __dict__ lookups are fast. Nevertheless, even
though each method lookup is individually *almost* as fast as the sort of
pre-compiled all-but-instantaneous access which Java can do, it all adds
up. So in Java, a long chain of dots:

 foo.bar.baz.foobar.spam.eggs.cheese

can be resolved at 

Re: Building CPython

2015-05-15 Thread Terry Reedy

On 5/15/2015 4:59 AM, Marko Rauhamaa wrote:

Must a method lookup necessarily involve object creation?


Where it matters, inside loops, method lookup can be avoided after doing 
it once.


for i in range(100): ob.meth(i)

versus

meth = ob.meth
for i in range(100): meth(i)

For working with a single stack:

a = []
apush = a.append
apop = a.pop
# Now apush(x), apop(), and test with 'if a:' as desired.

Being able to do this was part of my rationale for adding list.pop, about 
15 years ago.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-15 Thread Terry Reedy

On 5/15/2015 6:51 AM, Christian Gollwitzer wrote:

Am 14.05.15 um 20:50 schrieb Terry Reedy:

On 5/14/2015 1:11 PM, Chris Angelico wrote:


2) make test - run the entire test suite. Takes just as long every
time, but most of it won't have changed.


The test runner has an option, -jn, to run tests in n processes instead
of just 1.  On my 6-core Pentium, -j5 cuts the time to almost exactly 1/5th
of what it takes otherwise.  -j10 seems faster but I have not timed it.  I
suspect that 'make test' does not use the -j option.



Just to clarify, -j is an option of GNU make to run the Makefile in
parallel. Unless the Makefile is buggy, this should result in the same
output. You can also set an environment variable to enable this
permanently (until you log out) like

export MAKEFLAGS=-j5

Put this into your .bashrc or .profile, and it'll become permanent.


This is not applicable when running on Windows.  I was specifically 
referring, for instance, to entering at command prompt


current_dir python -m test -j10

AFAIK, test.__main__ does not look at environment variables.

ps. I have been wondering why 'j' rather than, say 'c' or 'p' was used 
for 'run in parallel on multiple cores'.  Your comment suggests that 'j' 
follows precedent.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread BartC

On 14/05/2015 17:09, Chris Angelico wrote:

On Fri, May 15, 2015 at 1:51 AM, BartC b...@freeuk.com wrote:

OK, the answer seems to be No then - you can't just trivially compile the C
modules that comprise the sources with the nearest compiler to hand. So much
for C's famous portability!

(Actually, I think you already lost me on your first line.)



If you want to just quickly play around with CPython's sources, I
would strongly recommend getting yourself a Linux box. Either spin up
some actual hardware with actual Linux, or grab a virtualization
engine like VMWare, VirtualBox, etc, etc, and installing into a VM.
With a Debian-based Linux (Debian, Ubuntu, Mint, etc), you should
simply be able to:

sudo apt-get build-dep python3


Actually I had VirtualBox with Ubuntu, but I don't know my way around 
Linux and preferred doing things under Windows (and with all my own tools).


But it's now building under Ubuntu.

(Well, I'm not sure what it's doing exactly; the instructions said type 
make, then make test, then make install, and it's still doing make test.


I hope there's a quicker way of re-building an executable after a minor 
source file change, otherwise doing any sort of development is going to 
be impractical.)


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread BartC

On 14/05/2015 18:11, Chris Angelico wrote:

On Fri, May 15, 2015 at 3:02 AM, BartC b...@freeuk.com wrote:



I hope there's a quicker way of re-building an executable after a minor
source file change, otherwise doing any sort of development is going to be
impractical.)


The whole point of 'make' is to rebuild only the parts that need to be
rebuilt (either they've changed, or they depend on something that was
changed). Sometimes practically everything needs to be rebuilt, if you
do some really fundamental change, but generally not.

The three parts to the build process are:

1) make - actually generate an executable. Takes ages the first time,
will be a lot quicker if you haven't changed much.
2) make test - run the entire test suite. Takes just as long every
time, but most of it won't have changed.
3) make install (needs root access, so probably 'sudo make install') -
install this as your primary build of Python.

When you start tinkering, I suggest just running make; rerunning the
test suite isn't necessary till you're all done, and even then it's
only important for making sure that your change hasn't broken anything
anywhere else. Test your actual changes by simply running the
freshly-built Python - most likely that'll be ./python.


OK, thanks. I didn't even know where the executable was put! Now I don't 
need 'make install', while 'make test' I won't bother with any more.


Making a small change and typing 'make' took 5 seconds, which is 
reasonable enough (although I had to use the copy of the source in 
Windows to find where the main.c file I needed was located).


Now Python 3.4.3 says Bart's Python.


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread Dave Angel

On 05/14/2015 01:02 PM, BartC wrote:

On 14/05/2015 17:09, Chris Angelico wrote:

On Fri, May 15, 2015 at 1:51 AM, BartC b...@freeuk.com wrote:

OK, the answer seems to be No then - you can't just trivially compile
the C
modules that comprise the sources with the nearest compiler to hand.
So much
for C's famous portability!

(Actually, I think you already lost me on your first line.)



If you want to just quickly play around with CPython's sources, I
would strongly recommend getting yourself a Linux box. Either spin up
some actual hardware with actual Linux, or grab a virtualization
engine like VMWare, VirtualBox, etc, etc, and installing into a VM.
With a Debian-based Linux (Debian, Ubuntu, Mint, etc), you should
simply be able to:

sudo apt-get build-dep python3


Actually I had VirtualBox with Ubuntu, but I don't know my way around
Linux and preferred doing things under Windows (and with all my own tools).

But it's now building under Ubuntu.

(Well, I'm not sure what it's doing exactly; the instructions said type
make, then make test, then make install, and it's still doing make test.

I hope there's a quicker way of re-building an executable after a minor
source file change, otherwise doing any sort of development is going to
be impractical.)



That's what make is good for.  It compares the datestamps of the source 
files against the obj files (etc.) and recompiles only when the source 
is newer.  (It's more complex, but that's the idea)


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread Chris Angelico
On Fri, May 15, 2015 at 3:02 AM, BartC b...@freeuk.com wrote:
 Actually I had VirtualBox with Ubuntu, but I don't know my way around Linux
 and preferred doing things under Windows (and with all my own tools).

 But it's now building under Ubuntu.

 (Well, I'm not sure what it's doing exactly; the instructions said type
 make, then make test, then make install, and it's still doing make test.

 I hope there's a quicker way of re-building an executable after a minor
 source file change, otherwise doing any sort of development is going to be
 impractical.)

The whole point of 'make' is to rebuild only the parts that need to be
rebuilt (either they've changed, or they depend on something that was
changed). Sometimes practically everything needs to be rebuilt, if you
do some really fundamental change, but generally not.

The three parts to the build process are:

1) make - actually generate an executable. Takes ages the first time,
will be a lot quicker if you haven't changed much.
2) make test - run the entire test suite. Takes just as long every
time, but most of it won't have changed.
3) make install (needs root access, so probably 'sudo make install') -
install this as your primary build of Python.

When you start tinkering, I suggest just running make; rerunning the
test suite isn't necessary till you're all done, and even then it's
only important for making sure that your change hasn't broken anything
anywhere else. Test your actual changes by simply running the
freshly-built Python - most likely that'll be ./python. Working that
way is fairly quick - you can tweak some C code, see the results, and
go back and tweak some more, all without pausing for a sword fight.
But once you go and rerun the full test suite, well... that's when
it's time for some:

https://xkcd.com/303/

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread Chris Angelico
On Fri, May 15, 2015 at 3:32 AM, BartC b...@freeuk.com wrote:
 OK, thanks. I didn't even know where the executable was put! Now I don't
 need 'make install', while 'make test' I won't bother with any more.

 Making a small change and typing 'make' took 5 seconds, which is reasonable
 enough (although I had to use the copy of the source in Windows to find
 where the main.c file I needed was located).

 Now Python 3.4.3 says Bart's Python.

Haha. I don't usually bother rebranding my builds of things; though
that's partly because normally I'm making very few changes, and all in
the hope that they'll be accepted upstream anyway.

Incidentally, a quick 'make' can range anywhere from a fraction of a
second to quite a long time, depending mainly on the speed of your
hard drive and the performance of your disk cache. On my Linux, a
null make (ie when literally nothing has changed - just rerunning
make) takes about half a second, and that's dealing with a module
that's failing to build. When you rebuild lots of times all at once,
you'll pretty much be working in RAM the whole time, assuming you have
enough of it available.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread Terry Reedy

On 5/14/2015 1:11 PM, Chris Angelico wrote:


2) make test - run the entire test suite. Takes just as long every
time, but most of it won't have changed.


The test runner has an option, -jn, to run tests in n processes instead 
of just 1.  On my 6 core pentium, -j5 cuts time to almost exactly 1/5th 
of otherwise.  -j10 seems faster but I have not timed it.  I suspect that 
'make test' does not use the -j option.



--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread Marko Rauhamaa
BartC b...@freeuk.com:

 That's a shame because I wanted to tinker with the main dispatcher
 loop to try and find out what exactly is making it slow. Nothing that
 seems obvious at first sight.

My guess is the main culprit is attribute lookup in two ways:

 * Each object attribute reference involves a dictionary lookup.

 * Each method call involves *two* dictionary lookups plus an object
   creation.

Tell us what you find out.
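Both effects can be observed from Python itself (a small sketch):

```python
class C:
    def meth(self):
        return 42

c = C()

# Attribute access is a dict lookup: the method's underlying function
# object lives in the class __dict__.
assert C.__dict__["meth"] is c.meth.__func__

# Every c.meth access creates a fresh bound-method object:
m1 = c.meth
m2 = c.meth
assert m1 == m2      # same function bound to the same instance...
assert m1 is not m2  # ...but a new object is created per lookup
print(m1())  # 42
```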


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread Chris Angelico
On Fri, May 15, 2015 at 1:51 AM, BartC b...@freeuk.com wrote:
 OK, the answer seems to be No then - you can't just trivially compile the C
 modules that comprise the sources with the nearest compiler to hand. So much
 for C's famous portability!

 (Actually, I think you already lost me on your first line.)

 That's a shame because I wanted to tinker with the main dispatcher loop to
 try and find out what exactly is making it slow. Nothing that seems obvious
 at first sight. (The comments even talk about improving branch prediction on
 certain architectures, even though the performance is a couple of magnitudes
 away from that kind of optimisation being relevant.)

C's portability isn't really sufficient for building a huge project,
so what you generally end up with is a huge slab of common code that
doesn't change from platform to platform, plus a (relatively) tiny
section of platform-specific code, such as makefiles/project files,
linker definitions, and so on. When you start hacking on CPython, you
don't generally have to consider which platform you're aiming at, as
long as you're building on one of the ones that's well supported;
trying to port Python to a new compiler on a known OS is actually
about as much work as porting it to a known compiler on a new OS. (I
know this, because I've attempted both - using mingw on Windows, and
gcc on OS/2. It's a big job either way.)

If you want to just quickly play around with CPython's sources, I
would strongly recommend getting yourself a Linux box. Either spin up
some actual hardware with actual Linux, or grab a virtualization
engine like VMWare, VirtualBox, etc, etc, and installing into a VM.
With a Debian-based Linux (Debian, Ubuntu, Mint, etc), you should
simply be able to:

sudo apt-get build-dep python3

to get all the build dependencies for Python 3; that, plus the source
code, should be enough to get you a'building. Similar simplicities are
available for other Linux distros, but I'll let someone else recommend
them.

Even when you have all the appropriate build tools, Windows can be at
times a pain for building software on. The general philosophy of
Windows is that you should normally be taking ready-made binaries; the
general philosophy of Linux is that it's perfectly normal to spin up
your own binaries from the distributed source code. It's not
impossible to build on Windows, by any means, but be prepared for
extra hassles and less support from the OS than you might like.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread BartC

On 13/05/2015 23:34, Terry Reedy wrote:

On 5/13/2015 3:36 PM, BartC wrote:

I'm interested in playing with the CPython sources. I need to be able to
build under Windows, but don't want to use make files (which rarely work
properly), nor do a 6GB installation of Visual Studio Express which is
what seems to be needed (I'm hopeless with complicated IDEs anyway).


Once hg or tortoisehg (I use this) and VSE are installed and the
repository cloned, you are half done. At command prompt, with top directory
of repository as current directory enter
tools\scripts\external.bat
Double-clicking file in Explorer does not work.
Usually only needs to be done once per branch after x.y.0 release as
dependencies are usually not updated for bugfix releases.

Then in same directory enter
pcbuild\python.sln
or double click in Explorer or open VSE and open this file.
Hit F7, wait until you get a line like
== Build: 1 succeeded, 0 failed, 24 up-to-date, 1 skipped,
hit F5, pin python_d to taskbar (optional, but handy), and go.


OK, the answer seems to be No then - you can't just trivially compile 
the C modules that comprise the sources with the nearest compiler to 
hand. So much for C's famous portability!


(Actually, I think you already lost me on your first line.)

That's a shame because I wanted to tinker with the main dispatcher loop 
to try and find out what exactly is making it slow. Nothing that seems 
obvious at first sight. (The comments even talk about improving branch 
prediction on certain architectures, even though the performance is a 
couple of magnitudes away from that kind of optimisation being relevant.)


Perhaps I was hoping there were some options turned on by default which, 
if disabled, would suddenly double the speed of simple benchmarks. Now I 
won't be able to find out...


--
Bartc

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread BartC

On 14/05/2015 17:29, Marko Rauhamaa wrote:

BartC b...@freeuk.com:


That's a shame because I wanted to tinker with the main dispatcher
loop to try and find out what exactly is making it slow. Nothing that
seems obvious at first sight.


My guess is the main culprit is attribute lookup in two ways:

  * Each object attribute reference involves a dictionary lookup.

  * Each method call involves *two* dictionary lookups plus an object
creation.

Tell us what you find out.


I'm just starting but I can tell you that it isn't because debug mode 
(Py_DEBUG defined) was left on by mistake!


What is interesting however is that on the very simple test I'm doing (a 
while loop incrementing a variable up to 100 million), the timings under 
Windows are:


Python 2.5     9.2 seconds
Python 3.1    13.1
Python 3.4.3  17.0
Python 3.4.3  14.3  (under Ubuntu on same machine, using the version
                     I built today)

That's quite a big range!

PyPy does it in 0.7 seconds. The same program within my own *simple* 
bytecode interpreter (not for Python) has a fastest time of 1.5 seconds 
but makes use of ASM. A version 100% in (gcc) C can manage 2 seconds.


So far I haven't found anything that explains the discrepancy (the 
languages are different, and mine is simpler, but the Python code isn't 
doing anything that complicated, only LOAD_FASTs and such, and LOAD_FAST 
is apparently just an array access).


But the nearly 2:1 difference between new and old Python versions is 
also intriguing.


def whiletest():
    i=0
    while i<=100000000:
        i=i+1

whiletest()
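The per-iteration bytecodes can be listed with dis (a sketch; exact 
opcode names vary between CPython versions):

```python
import dis

def whiletest():
    i = 0
    while i <= 100000000:
        i = i + 1

# Each iteration executes a handful of simple opcodes:
# local-variable loads, a constant load, a compare, an add and a store.
ops = [ins.opname for ins in dis.Bytecode(whiletest)]
print(ops)
```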

--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread MRAB

On 2015-05-14 22:55, BartC wrote:

On 14/05/2015 17:29, Marko Rauhamaa wrote:

BartC b...@freeuk.com:


That's a shame because I wanted to tinker with the main dispatcher
loop to try and find out what exactly is making it slow. Nothing that
seems obvious at first sight.


My guess is the main culprit is attribute lookup in two ways:

  * Each object attribute reference involves a dictionary lookup.

  * Each method call involves *two* dictionary lookups plus an object
creation.

Tell us what you find out.


I'm just starting but I can tell you that it isn't because debug mode
(Py_DEBUG defined) was left on by mistake!

What is interesting however is that on the very simple test I'm doing (a
while loop incrementing a variable up to 100 million), the timings under
Windows are:

Python 2.5     9.2 seconds
Python 3.1    13.1
Python 3.4.3  17.0
Python 3.4.3  14.3  (under Ubuntu on same machine, using the version
                     I built today)

That's quite a big range!

PyPy does it in 0.7 seconds. The same program within my own *simple*
bytecode interpreter (not for Python) has a fastest time of 1.5 seconds
but makes use of ASM. A version 100% in (gcc) C can manage 2 seconds.

So far I haven't found anything that explains the discrepancy (the
languages are different, and mine is simpler, but the Python code isn't
doing anything that complicated, only LOAD_FASTs and such, and LOAD_FAST
is apparently just an array access).

But the nearly 2:1 difference between new and old Python versions is
also intriguing.

def whiletest():
    i=0
    while i<=100000000:
        i=i+1

whiletest()


Python 2.x has int and long; Python 3 has int, which is the old 'long'.
Try Python 2 with longs.
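The point can be approximated from Python 3, where every int is 
arbitrary-precision (the old Python 2 'long'): arithmetic cost grows 
with the size of the number (a rough sketch; timings vary):

```python
import timeit

# Same operation on a loop-counter-sized value vs. a 100-digit value.
small = timeit.timeit("x + 1", setup="x = 1", number=1000000)
big = timeit.timeit("x + 1", setup="x = 10**100", number=1000000)

print("word-sized int: %.4fs  100-digit int: %.4fs" % (small, big))
```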

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-14 Thread BartC

On 14/05/2015 22:55, BartC wrote:

 def whiletest():
     i=0
     while i<=100000000:
         i=i+1

 whiletest()



Python 2.5     9.2 seconds
Python 3.1    13.1
Python 3.4.3  17.0
Python 3.4.3  14.3  (under Ubuntu on same machine, using the version
                     I built today)

That's quite a big range!

PyPy does it in 0.7 seconds. The same program within my own *simple*
bytecode interpreter (not for Python) has a fastest time of 1.5 seconds
but makes use of ASM. A version 100% in (gcc) C can manage 2 seconds.


(Actually my ASM-aided code took 0.5 seconds (some crucial section had 
been commented out). Faster than PyPy and just using simple brute-force 
methods.)



So far I haven't find anything that explains the discrepancy (the
languages are different, mine is simpler, but the Python code isn't
doing anything that complicated, only LOAD_FASTs and such, and LOAD_FAST
is apparently just an array access.


It turns out the LOAD_FASTs were the simple byte-codes!

I'm just starting to find out just how much of a big complicated mess 
this project really is. I wouldn't be surprised if there aren't many 
people who actually understand it all, and that would explain why no-one 
seems to have had much luck in getting the speed up (if anything, it's 
getting slower).


I still have no idea yet exactly what an object comprises. I know that 
even an innocuous-looking LOAD_CONST actually loads a tuple from a table 
(complete with flag testing and bounds checks, although taking those out 
made no discernible difference).


It appears to be those <= and + operations in the code above where 
much of the time is spent. When I trace out the execution paths a bit 
more, I'll have a better picture of how many lines of C code are 
involved in each iteration.


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-13 Thread Mark Lawrence

On 13/05/2015 20:36, BartC wrote:

I'm interested in playing with the CPython sources. I need to be able to
build under Windows, but don't want to use make files (which rarely work
properly), nor do a 6GB installation of Visual Studio Express which is
what seems to be needed (I'm hopeless with complicated IDEs anyway).

Is it possible to do this by using mingw-gcc to compile the .c files of
the Python sources one by one, or is it one of those complicated
projects where some of the source is generated as it goes along?

I thought I'd start with the source file containing Py_Main and continue
from there, but some modules compile and some don't, obscure errors that
I don't want to investigate unless it's going to be worthwhile (ie.
eventually ending up with a python.exe that can run simple .py programs).



Before you spend too much time on the mingw route, check out the 
outstanding issues for it on the bug tracker.  Then you'll maybe realise 
that using the supported VS setup is far easier.  Everything you need is 
in the PCBuild directory, including pcbuild.sln for use with VS and 
build.bat for command line use.  Details here 
https://docs.python.org/devguide/setup.html#windows


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Building CPython

2015-05-13 Thread Terry Reedy

On 5/13/2015 3:36 PM, BartC wrote:

I'm interested in playing with the CPython sources. I need to be able to
build under Windows, but don't want to use make files (which rarely work
properly), nor do a 6GB installation of Visual Studio Express which is
what seems to be needed (I'm hopeless with complicated IDEs anyway).


Once hg or tortoisehg (I use this) and VSE are installed and the 
repository cloned, you are half done. At command prompt, with top directory 
of repository as current directory enter

tools\scripts\external.bat
Double-clicking file in Explorer does not work.
Usually only needs to be done once per branch after x.y.0 release as 
dependencies are usually not updated for bugfix releases.


Then in same directory enter
pcbuild\python.sln
or double click in Explorer or open VSE and open this file.
Hit F7, wait until you get a line like
== Build: 1 succeeded, 0 failed, 24 up-to-date, 1 skipped,
hit F5, pin python_d to taskbar (optional, but handy), and go.

And read devguide.

--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list