Re: ESR "Waning of Python" post

2018-10-17 Thread Brian Oney via Python-list



On October 17, 2018 7:56:51 AM GMT+02:00, Marko Rauhamaa  
wrote:
>I can't be positive about swapping. I don't remember hearing thrashing.
>However, I do admit running emacs for months on end and occasionally
>with huge buffers so the resident size can be a couple of gigabytes.
>
That's a pretty good stress test for any program, especially one with so much 
human interaction.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-17 Thread Marko Rauhamaa
Paul Rubin :

> Marko Rauhamaa  writes:
>> Emacs occasionally hangs for about a minute to perform garbage
>> collection.
>
> I've never experienced that, especially with more recent versions that I
> think do a little bit of heap tidying in the background.  Even in the
> era of much slower computers I never saw an Emacs GC pause of more than
> a second or two unless something had run amuck and exhausted memory.
> It's always near imperceptible in my experience now.  Is your system
> swapping or something?

I can't be positive about swapping. I don't remember hearing thrashing.
However, I do admit running emacs for months on end and occasionally
with huge buffers so the resident size can be a couple of gigabytes.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-16 Thread Marko Rauhamaa
Paul Rubin :

> But it's possible to do parallel GC with bounded latency. Perry
> Cheng's 2001 PhD thesis says how to do it and is fairly readable:
>
> http://reports-archive.adm.cs.cmu.edu/anon/2001/CMU-CS-01-174.pdf

Thanks. On a quick glance, it is difficult to judge what the worst-case
time and space behavior is, as the thesis mixes theory and practice and
leans heavily on practice. The thesis says in its introduction:

   A real-time collector comprises two important features: pauses are
   bounded by some reasonably small value and the mutator can make
   sufficient progress between pauses. Different collectors meet these
   conditions with varying degrees of success and their viability
   depends on application needs. It is important to note that a
   collector must also complete collection within a reasonable time. A
   "real-time" collector which mereloy stops collections whenever it
   runs out of time would be hard real-time but useless if it never
   finishes a collection. In such cases, memory is soon exhausted. As
   with other real-time applications, the most important distinction
   among real-time collectors is the strength of the guarantee.

> If you hang out with users of Lisp, Haskell, Ocaml, Java, Ruby, etc.,
> they (like Python users) have all kinds of complaints about their
> languages, but GC pauses aren't a frequent topic of those complaints.

I don't suffer from it, either.

> Most applications don't actually care about sub-millisecond realtime.
> They just want pauses to be small or infrequent enough to not interfere
> with interactively using a program.  If there's a millisecond pause
> every few seconds of operation and an 0.2 second pause a few times an
> hour, that's usually fine.

Emacs occasionally hangs for about a minute to perform garbage
collection.

Similarly, Firefox occasionally becomes unresponsive for a long time,
and I'm guessing it's due to GC.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: ESR "Waning of Python" post

2018-10-16 Thread Ryan Johnson
Have seen this waning of python thread so many times. Hoping it would have 
waned by now. Lol.

Sent from Mail for Windows 10

From: jfine2...@gmail.com
Sent: Tuesday, October 16, 2018 12:42 PM
To: python-list@python.org
Subject: Re: ESR "Waning of Python" post


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-16 Thread jfine2358
On Tuesday, October 16, 2018 at 8:00:26 AM UTC+1, Marko Rauhamaa wrote:
>https://making.pusher.com/golangs-real-time-gc-in-theory-and-practice/

I'm all in favour of collecting useful URLs. Here's some more suggestions:

https://stackoverflow.com/questions/4491260/explanation-of-azuls-pauseless-garbage-collector
https://pdfs.semanticscholar.org/9770/fc9baf0f2b6c7521f00958973657bf03337d.pdf
https://www.researchgate.net/publication/220800769_Tax-and-spend_Democratic_scheduling_for_real-time_garbage_collection
http://digg.com/2018/private-garbage-collection-propublica
http://flyingfrogblog.blogspot.com/

Aside: One of the above is not about software garbage collection. Can you guess 
which one?

-- 
Jonathan




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-16 Thread Marko Rauhamaa
Paul Rubin :

> Marko Rauhamaa  writes:
>>> Right, if I need near realtime behaviour and must live
>>> with [C]Python's garbage collector.
>> Or any other GC ever invented.
>
> There are realtime ones, like the Azul GC for Java, that have bounded
> delay in the milliseconds or lower. The total overhead is higher
> though.

I'd be interested in a definitive, non-anecdotal analysis on the topic.
Do you happen to have a link?

One reference I found stated there was no upper bound for heap use:

  A second cost of concurrent garbage collection is unpredictable heap
  growth. The program can allocate arbitrary amounts of memory while the
  GC is running.

  https://making.pusher.com/golangs-real-time-gc-in-theory-and-practice/

If that worst-case behavior were tolerated, it would be trivial to
implement real-time GC: just let the objects pile up and never reclaim.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-15 Thread Chris Angelico
On Tue, Oct 16, 2018 at 4:18 PM dieter  wrote:
>
> Marko Rauhamaa  writes:
> > Or you could blame the parts of the software that create too many
> > long-term objects.
>
> I do not do that because I understand why in my application
> there are many long living objects.
>
> > You shouldn't blame the parts of the software that churn out zillions of
> > short-term objects.
>
> I do precisely that: the blamed component produced a very large
> number of short-living objects -- without need, and although it should
> have been aware that it operates on mass data - among other things from
> the fact that its complete environment took special care to work with
> this mass data efficiently.

Exactly. Long-term objects are NOT a problem. Tell me, how many
objects get created as a web app boots up and then are never destroyed
for the lifetime of that process? To find out, I added this line just
before a very VERY small Flask app of mine goes into its main loop:

import gc; print(len(gc.get_objects()), "objects currently tracked")

There are over 40,000 of them. Now, I can't say for sure that every
one of those objects will stick around till the process shuts down,
but I'd say a lot of them will. They're modules, functions, types,
class dictionaries... oh, and the GC doesn't track strings or numbers,
so that's another whole huge slab of objects that you'd have to count.
If you replace the refcounting GC with a pure mark-and-sweep, you have
to check every single one of them every time you do a GC pass.

(For reference, running the same GC check in an empty Python
interpreter gives around five thousand tracked objects, still mostly
functions.)
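The aside about untracked strings and numbers is easy to verify:
gc.is_tracked reports whether the cycle collector follows a given
object (a small sketch using only the stdlib gc module):

```python
import gc

# Atomic objects such as ints and strings can never participate in a
# reference cycle, so CPython's cycle collector does not track them.
print(gc.is_tracked(42))          # False
print(gc.is_tracked("hello"))     # False

# Containers that may hold trackable objects are tracked.
print(gc.is_tracked([]))          # True
print(gc.is_tracked({"a": []}))   # True

# The population a pure mark-and-sweep pass would have to traverse:
print(len(gc.get_objects()), "objects currently tracked")
```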

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-15 Thread dieter
Marko Rauhamaa  writes:
> dieter :
>> Marko Rauhamaa  writes:
>>> Keeping the number of long-term objects low is key.
>>
>> Right, if I need near realtime behaviour and must live
>> with [C]Python's garbage collector.
>
> Or any other GC ever invented.

There are "realtime garbage collection" algorithms. For them,
the creation of new objects must cooperate with the garbage
collection - likely with the need to acquire a lock for a short
period. But there is no need, in principle, to block all
"normal" activity for the complete garbage collection (as [C]Python's
garbage collector did at the time of my problem).

>> But a web application does not usually need near-realtime behaviour.
>> An occasional (maybe once in a few days) garbage collection and the
>> associated reduced response time is acceptable.
>> A problem arises only if a badly designed component quite frequently
>> produces hundreds of thousands of temporary objects, likely
>> triggering (frequent) garbage collections.
>
> But I think you are barking up the wrong tree. You could rightly blame
> GC itself as an unworkable paradigm and switch to, say, C++ or Rust.

I am happy that [C]Python uses mainly reference counting for
its memory management and that GC is used quite sparingly.

> Or you could blame the parts of the software that create too many
> long-term objects.

I do not do that because I understand why in my application
there are many long living objects.

> You shouldn't blame the parts of the software that churn out zillions of
> short-term objects.

I do precisely that: the blamed component produced a very large
number of short-living objects -- without need, and although it should
have been aware that it operates on mass data - among other things from
the fact that its complete environment took special care to work with
this mass data efficiently.

I solved my problem by replacing this single component with one that
knows what it is doing. No need to rewrite the complete application,
get rid of Python object caches or even switch to a different
language.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-15 Thread Marko Rauhamaa
dieter :

> Marko Rauhamaa  writes:
>> Keeping the number of long-term objects low is key.
>
> Right, if I need near realtime behaviour and must live
> with [C]Python's garbage collector.

Or any other GC ever invented.

> But a web application does not usually need near-realtime behaviour.
> An occasional (maybe once in a few days) garbage collection and the
> associated reduced response time is acceptable.
> A problem arises only if a badly designed component quite frequently
> produces hundreds of thousands of temporary objects, likely
> triggering (frequent) garbage collections.

But I think you are barking up the wrong tree. You could rightly blame
GC itself as an unworkable paradigm and switch to, say, C++ or Rust.

Or you could blame the parts of the software that create too many
long-term objects.

You shouldn't blame the parts of the software that churn out zillions of
short-term objects.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-14 Thread dieter
Marko Rauhamaa  writes:
> dieter :
> ...
>> Definitely. The application concerned was a long running web application;
>> caching was an important feature to speed up its typical use cases.
>
> As an optimization technique, I suggest turning the cache into a "binary
> blob" opaque to GC, or using some external component like SQLite.

This was a Python application and as such, it was primarily working with
Python objects. And it was a complex application heavily depending
on subframeworks, many of them internally using caching to speed
things up, the details hidden from the application.
Redesigning the application to use an alternative caching approach
was out of the question.


> Keeping the number of long-term objects low is key.

Right, if I need near realtime behaviour and must live
with [C]Python's garbage collector.

But a web application does not usually need near-realtime behaviour.
An occasional (maybe once in a few days) garbage collection and the
associated reduced response time is acceptable.
A problem arises only if a badly designed component quite frequently
produces hundreds of thousands of temporary objects, likely
triggering (frequent) garbage collections.


> Note that Python creates a temporary object every time you invoke a
> method. CPython removes them quickly through reference counting, but
> other Python implementations just let GC deal with them, and that's
> generally ok.

The initial point was that you must carefully look at the context
for which you design a solution and choose appropriately (among
other things, the implementation language).
In my case, the web application framework was fixed ("Zope")
and therefore so were the language ("Python") and its implementation
("CPython").
"Normal" temporary objects did not pose a problem (due to reference
counting); only the mass creation of temporary objects can
be problematic (as GC is triggered before reference counting
releases the temporary objects again).

>
>
> Marko

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-13 Thread Marko Rauhamaa
Paul Rubin :
> Note that Java has a lot of [GC] options to choose from:
> https://docs.oracle.com/javase/9/gctuning/available-collectors.htm

I'm all for GC, but Java's GC tuning options are the strongest
counter-argument against it. The options just shift the blame from the
programming language to the operator of the software.

For GC to be acceptable, you shouldn't ever have to tune it. And I've
seen it in action. A customer complains about bad performance. The
system engineer makes a tailored GC recipe to address the issue, which
may help for a short while.

Here's my rule of thumb. Calculate how much memory you need for
long-term objects. Don't let the application exceed that amount.
Multiply the amount by 10 and allocate that much RAM for your
application.
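As a worked example of that rule of thumb (the factor of 10 is
Marko's stated heuristic, not a measured constant):

```python
def gc_ram_budget(long_term_bytes: int, headroom: int = 10) -> int:
    """Provision RAM at a multiple of the long-term working set so
    the collector always has plenty of slack to work with."""
    return long_term_bytes * headroom

# e.g. 200 MB of long-lived objects -> provision about 2 GB of RAM
print(gc_ram_budget(200 * 1024**2) // 1024**2, "MB")   # 2000 MB
```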

> Another approach is Erlang's, where the application is split into a
> lot of small lightweight processes, each of which has its own GC and
> heap. So while some of them are GC'ing, the rest can keep running.

So the lightweight processes don't share any data. That may be a fine
approach.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-13 Thread Marko Rauhamaa
dieter :
> Marko Rauhamaa  writes:
>> However, I challenge the notion that creating hundreds of thousands of
>> temporary objects is stupid. I suspect that the root cause of the
>> lengthy pauses is that the program maintains millions of *nongarbage*
>> objects in RAM (a cache, maybe?).
>
> Definitely. The application concerned was a long running web application;
> caching was an important feature to speed up its typical use cases.

As an optimization technique, I suggest turning the cache into a "binary
blob" opaque to GC, or using some external component like SQLite.
Keeping the number of long-term objects low is key.

Note that Python creates a temporary object every time you invoke a
method. CPython removes them quickly through reference counting, but
other Python implementations just let GC deal with them, and that's
generally ok.
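That per-invocation temporary is easy to observe: every attribute
lookup of a method on an instance builds a fresh bound-method object
(a small sketch):

```python
class Greeter:
    def hello(self):
        return "hi"

g = Greeter()

# Each lookup of g.hello creates a brand-new bound-method object;
# it becomes garbage the moment the call returns.
m1 = g.hello
m2 = g.hello
print(m1 is m2)    # False: two distinct temporary objects
print(m1())        # hi
```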


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-13 Thread jfine2358
On Friday, October 12, 2018 at 8:41:12 PM UTC+1, Paul Rubin wrote:

> 1) If you keep the existing refcount mechanism, you have to put locks
> around all the refcounts, which kills performance since refcounts are
> updated all the time.

I think BUFFERED multi-core reference count garbage collection is possible. If 
so, then locks are not needed. I explain this in this thread:

[Python-ideas] Multi-core reference count garbage collection
https://groups.google.com/forum/#!topic/python-ideas/xRPdu3ZGeuk
https://mail.python.org/pipermail/python-ideas/2018-July/052054.html

-- 
Jonathan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-13 Thread Peter J. Holzer
On 2018-10-12 14:07:56 -0500, Tim Daneliuk wrote:
> On 10/11/2018 12:15 AM, Gregory Ewing wrote:
> > But it's not like that at all. As far as I know, all the
> > attempts that have been made so far to remove the GIL have
> > led to performance that was less than satisfactory. It's a
> > hard problem that we haven't found a good solution to yet.
> 
> Do you happen to have a reference that explains what the issues are
> for GIL removal?

I'm certainly not an expert on CPython internals, but what I've gathered
from talks and discussions on the topic is that the CPython interpreter
accesses shared state a lot (the reference count fields are an obvious
example, but there are others), so to remove the GIL you would have to
replace it with a myriad of small locks which are taken and released all
the time - this adds a lot of overhead compared to a single lock which
is basically always taken and just released before blocking syscalls. 

(If you ask your favourite search engine for "gilectomy", you'll
probably find lots of info)

It might be better to start from scratch: Design a new VM suitable for
Python which can run mostly without locks. But this is of course a lot
of work with no guarantee of success. (Jython and IronPython did
something similar, although they didn't design new VMs but reused VMs
designed for other languages.)

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-13 Thread dieter
Marko Rauhamaa  writes:
> dieter :
> ...
>> I work in the domain of web applications. And I had a nasty
>> experience there with garbage collection: occasionally, the web
>> application stopped responding for about a minute. A (quite
>> difficult) analysis revealed that some (stupid) component created,
>> in some situations (a search), hundreds of thousands of temporary
>> objects and thereby triggered a complete garbage collection. The
>> garbage collector started its mark and sweep phase to detect
>> unreachable objects - traversing a graph of millions of objects.
>>
>> As garbage collection becomes drastically more complex if the object
>> graph can change during this phase (and this was Python), a global
>> lock prevented any other activity -- leading to the observed
>> latencies.
>
> Yes. The occasional global freeze is unavoidable in any
> garbage-collected runtime environment regardless of the programming
> language.
>
> However, I challenge the notion that creating hundreds of thousands of
> temporary objects is stupid. I suspect that the root cause of the
> lengthy pauses is that the program maintains millions of *nongarbage*
> objects in RAM (a cache, maybe?).

Definitely. The application concerned was a long running web application;
caching was an important feature to speed up its typical use cases.

I do not say that creating hundreds of thousands of temporary objects
is always stupid. But in this case, those temporary objects were
created early on to wrap the document ids found in an index entry,
just to get a comfortable interface for accessing the corresponding
documents. While the index authors were aware that they were handling
mass data and therefore stored it compactly as C-level objects with
efficient C-level filtering operations, the search author neglected
this aspect and wrapped all document ids in Python objects.
"search" is essentially a filtering operation; typically, you need to
access far fewer documents (at most those in a prefiltered result set)
than document ids (the input to the filtering); in this case, it is
stupid to create temporary objects for all document ids in order to
access far fewer documents later in a comfortable way.
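A hypothetical sketch of the contrast dieter describes - `Document`
and the predicates are illustrative names, not Zope's actual API:

```python
class Document:
    """Comfortable wrapper around a bare document id."""
    def __init__(self, doc_id):
        self.doc_id = doc_id

def search_eager(doc_ids, predicate):
    # The anti-pattern: wrap every id up front, creating one temporary
    # object per id even though most are filtered away immediately.
    return [doc for doc in map(Document, doc_ids) if predicate(doc)]

def search_lazy(doc_ids, id_predicate):
    # Filter on the raw ids first; wrap only the survivors.
    return [Document(i) for i in doc_ids if id_predicate(i)]

ids = range(100_000)
hits = search_lazy(ids, lambda i: i % 10_000 == 0)
print(len(hits))   # 10 wrapper objects instead of 100,000
```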

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Gregory Ewing

Paul Rubin wrote:

I even wonder what happens if you turn Py_INCREF etc. into no-ops,
install the Boehm garbage collector in a stop-the-world mode, and
disable the GIL.


I suspect you would run into problems with things that need
mutual exclusion but don't do any locking of their own, because
the GIL is assumed to take care of it.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Neil Cerutti
On 2018-10-12, Peter J. Holzer  wrote:
> Neil Cerutti said:
>> I imagine that if I stuck with Go long enough I'd develop a
>> new coding style that didn't involve creating useful data
>> types.
>
> I haven't used Go for any real project yet (that may change
> next year - we'll see whether I love it or hate it), but I
> don't see why you wouldn't create useful data types in your Go
> programs. Go isn't object-oriented, but that doesn't make its
> type system useless (I've certainly created lots of useful data
> types in C).

Yeah, my comment was fairly flippant. It isn't that you can't
make new data types in Go, it's that you can't transparently use
Go's syntax facilities with them. In Python (and C++), if the
language provides a facility, my datatype can usually take direct
advantage of it, e.g., for loops or iterators. I didn't stick
around long enough to get used to the restrictions on this in Go.
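For contrast, the Python side of that point: any user-defined type
that implements the iterator protocol plugs straight into the `for`
statement and the built-ins (a minimal sketch):

```python
class Countdown:
    """User-defined type that works transparently with for loops."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator satisfies the iterator protocol for us.
        n = self.start
        while n > 0:
            yield n
            n -= 1

# The language facility (for / list / sum) applies directly:
print(list(Countdown(3)))   # [3, 2, 1]
print(sum(Countdown(4)))    # 10
```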

-- 
Neil Cerutti
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Chris Angelico
On Sat, Oct 13, 2018 at 7:25 AM Vito De Tullio  wrote:
>
> Chris Angelico wrote:
>
> >> Reference counting was likely a bad idea to begin with.
> >
> > Then prove CPython wrong by making a fantastically better
> > implementation that uses some other form of garbage collection.
>
> I'm not talking about the "goodness" of the implementations, but AFAIK
> Jython and IronPython don't have the refcount GC (and they don't even
> have the GIL...)
>

Yes, which proves that it's viable. I notice that neither of them has
swept CPython away by massively outperforming it, so perhaps
reference counting isn't such a terrible idea after all?

I'm not disputing that refcounting has its downsides, but "likely a
bad idea to begin with" seems to be a bit beyond reasonable.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Vito De Tullio
Chris Angelico wrote:

>> Reference counting was likely a bad idea to begin with.
> 
> Then prove CPython wrong by making a fantastically better
> implementation that uses some other form of garbage collection.

I'm not talking about the "goodness" of the implementations, but AFAIK
Jython and IronPython don't have the refcount GC (and they don't even
have the GIL...)


-- 
By ZeD

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Tim Daneliuk
On 10/12/2018 11:43 AM, Skip Montanaro wrote:
> I sort of skimmed ESR's post, and sort of skimmed this thread, so
> obviously I'm totally qualified to offer my observations on the post
> and follow ups. :-)

Skip -

In the 15-ish years I've been reading this group, this has NEVER been
an obstacle for posters :P

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Tim Daneliuk
On 10/11/2018 12:15 AM, Gregory Ewing wrote:
> Paul Rubin wrote [concerning GIL removal]:
>> It's weird that Python's designers were willing to mess up the user
>> language in the 2-to-3 transition but felt that the C API had to be kept
>> sacrosanct.  Huge opportunities were blown at multiple levels.
> 
> You say that as though we had a solution for GIL removal all
> thought out and ready to go, and the only thing that stopped us
> is that it would have required changing the C API.
> 
> But it's not like that at all. As far as I know, all the
> attempts that have been made so far to remove the GIL have
> led to performance that was less than satisfactory. It's a
> hard problem that we haven't found a good solution to yet.
> 


Do you happen to have a reference that explains what the issues are
for GIL removal?


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Peter J. Holzer
On 2018-10-11 17:56:43 +, Neil Cerutti wrote:
> On 2018-10-10, Paul Rubin  wrote:
> > Neil Cerutti  writes:
> >>
> >>> the GIL (15/16th of his CPUs are unused..)
> >> Channels are a big selling point of Go, no argument there.
> >
> > The unused CPUs are not about channels (Python has Queue which
> > is similar).  They're about the GIL, which is an implementation
> > artifact of CPython that's hard to eliminate without modifying
> > CPython's C API.  ESR didn't say this explicitly but it came to
> > mind when reading his post: It's weird that Python's designers
> > were willing to mess up the user language in the 2-to-3
> > transition but felt that the C API had to be kept sacrosanct.
> > Huge opportunities were blown at multiple levels.
> 
> OK. I assumed he planned to parallelize computations in Go by
> offloading to concurrent channels--

You mean goroutines, not channels. Channels are a mechanism which
goroutines use to communicate with each other. In Python terms,
goroutines are sort of like threads, channels like queues.

> I guess I don't understand why he can't do that in Python.

The Go runtime can efficiently distribute goroutines over all cores. The
CPython runtime can't do the same with threads (mostly because of the
GIL, as already discussed). You can run several Python processes and let
them communicate via some IPC mechanism (indeed the Python
multiprocessing module provides Pipe and Queue classes for this
purpose), but IPC is generally more expensive than communication within
the same process. So the overhead may be too high if you need very
fine-grained communication.

> I imagine that if I stuck with Go long enough I'd develop a new coding
> style that didn't involve creating useful data types.

I haven't used Go for any real project yet (that may change next year -
we'll see whether I love it or hate it), but I don't see why you
wouldn't create useful data types in your Go programs. Go isn't
object-oriented, but that doesn't make its type system useless (I've
certainly created lots of useful data types in C).

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Skip Montanaro
I sort of skimmed ESR's post, and sort of skimmed this thread, so
obviously I'm totally qualified to offer my observations on the post
and follow ups. :-)

Eric makes a mistake, in my opinion, confusing his particular
application with the mainstream, when in fact it seems pretty
specialized to me. Consequently, he foresees the waning of Python, in
my opinion, incorrectly.

As to the problems posed by the GIL, I am aware of serious attempts to
remove it at least as far back as Python ~ 1.4 (by Greg Smith, as I
recall). His experience was what has been reported in this thread.
While he could improve CPU utilization for multi-threaded
applications, that came at an unacceptable performance cost for
single-threaded applications, which I believe still dominate the
application landscape today. Beyond performance, there is the issue
that the GIL keeps extension authors from shooting themselves in the
foot. When your code is called, the GIL ensures that nobody else can
stomp on your data when you are executing.

Skip Montanaro
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Chris Angelico
On Fri, Oct 12, 2018 at 5:51 PM Marko Rauhamaa  wrote:
> > Python uses the GIL mainly because it uses reference counting (with
> > almost constant changes to potentially concurrently used objects) for
> > memory management. Dropping the GIL would mean dropping reference
> > counting likely in favour of garbage collection.
>
> Reference counting was likely a bad idea to begin with.

Then prove CPython wrong by making a fantastically better
implementation that uses some other form of garbage collection. The
language spec does not stipulate refcounting, and in fact very clearly
does NOT stipulate what kind of garbage collection is used, so if
refcounting is such a bad idea, you should be able to take advantage
of CPython's flaw and make something better.

Go ahead. Put your code where your mouth is.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread Marko Rauhamaa
dieter :

> Every system you use has its advantages and its drawbacks.
> Depending on the specific context (problem, resources, knowledge, ...),
> you must choose an appropriate one.

Yep. I use Python for numerous tasks professionally and at home. Just
this past week I used it to plan a junior soccer winter tournament.
Python verifies the various team and match constraints and performs
Sudoku-solver-style match order generation.

> Python uses the GIL mainly because it uses reference counting (with
> almost constant changes to potentially concurrently used objects) for
> memory management. Dropping the GIL would mean dropping reference
> counting likely in favour of garbage collection.

Reference counting was likely a bad idea to begin with.

> I work in the domain of web applications. And I had a nasty
> experience there with garbage collection: occasionally, the web
> application stopped responding for about a minute. A (quite
> difficult) analysis revealed that some (stupid) component created, in
> some situations (a search), hundreds of thousands of temporary
> objects and thereby triggered a complete garbage collection. The
> garbage collector started its mark and sweep phase to detect
> unreachable objects - traversing a graph of millions of objects.
>
> As garbage collection becomes drastically more complex if the object
> graph can change during this phase (and this was Python), a global
> lock prevented any other activity -- leading to the observed
> latencies.

Yes. The occasional global freeze is unavoidable in any
garbage-collected runtime environment regardless of the programming
language.

However, I challenge the notion that creating hundreds of thousands of
temporary objects is stupid. I suspect that the root cause of the
lengthy pauses is that the program maintains millions of *nongarbage*
objects in RAM (a cache, maybe?).

> If I remember right, there are garbage collection schemes that
> can operate safely without stopping other concurrent work.

There are heuristics, but I believe the worst case is the same.

> Nevertheless, even those garbage collectors have a significant impact
> on performance when they become active (at apparently
> non-deterministic times) and that may be unacceptable for some
> applications.

If performance is key, Python is probably not the answer. Python's
dynamism makes it necessarily much slower than, say, Java or Go.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ESR "Waning of Python" post

2018-10-12 Thread dieter
Ben Finney  writes:
> ...
> Is it your position that the described behaviour is not a problem? Do
> you hold that position because you think multi-core machines are not a
> sector that Python needs to be good at? Or that the described behaviour
> doesn't occur? Or something else?

Every system you use has its advantages and its drawbacks.
Depending on the specific context (problem, resources, knowledge, ...),
you must choose an appropriate one.

Python uses the GIL mainly because it uses reference counting
(with almost constant changes to potentially concurrently used objects)
for memory management.
Dropping the GIL would mean dropping reference counting
likely in favour of garbage collection.

I work in the domain of web applications. And I made there a nasty
experience with garbage collection: occasionally, the web application
stopped to respond for about a minute. A (quite difficult) analysis
revealed that some (stupid) component created in some situations
(a search) hundreds of thousands of temporary objects and thereby
triggered a complete garbage collection. The garbage collector
started its mark and sweep phase to detect unreachable objects - traversing
a graph of millions of objects.
As garbage collection becomes drastically more complex
if the object graph can change during this phase (and this was Python),
a global lock prevented any other activity -- leading to the observed
latencies.

When I remember right, there are garbage collection schemes that
can operate safely without stopping other concurrent work.
Nevertheless, even those garbage collectors have a significant
impact on performance when they become active (at apparently
non-deterministic times) and that may be unacceptable for
some applications.
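One common, if blunt, workaround for such bursts -- sketched here with a hypothetical stand-in for the offending component -- is to pause the cyclic collector around the allocation spike so the collection happens at a quieter moment:

```python
import gc

def build_results(n):
    # Hypothetical stand-in for the offending search component: a burst
    # of n short-lived container objects, each tracked by the cyclic GC.
    return [{"hit": i} for i in range(n)]

# Creating hundreds of thousands of containers can trip the generation
# thresholds and trigger a full collection mid-request; pausing the
# collector around the burst defers that work.
gc.disable()
try:
    results = build_results(300_000)
finally:
    gc.enable()

print(len(results))
```

Reference counting still reclaims acyclic garbage while the collector is disabled; only cycle detection is deferred.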





Re: ESR "Waning of Python" post

2018-10-11 Thread Neil Cerutti
On 2018-10-10, Paul Rubin  wrote:
> Neil Cerutti  writes:
>> As Stephen said, it's sort of silly not to be aware of those
>> issues going in.
>
> If you're saying ESR messed up by using Python in the first
> place for that program, that's not a great advert for Python
> either.

I meant Stefan, by the way. Sorry, Stefan.

I actually do kind of think that, but ESR is a much better
and more experienced programmer than me and a whole lot more
knowledgable about the resource usage his program requires. So take
that part of my opinion with a grain of salt, if you didn't
already have one. Somebody in his comments section was
incredulous that his software could possibly bog down with such a
"small" code base.

It's concerning to me that he glossed over his decision process
when abandoning the idea of offloading the part of the work
that's getting stuck in one core to a suitable engine. To me,
that has to be easier than rewriting from scratch in a new
language, but we don't have enough information to judge.

One issue I didn't address was his criticism of deploying Python
programs, thanks to library location issues. As I understand it,
Go's library system has its own huge bloc of complainants regarding
versioning hell, with a solution that's vaporware. Python's
library issues can be addressed with virtualenv or py2exe type
tools.
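For reference, the virtualenv approach has been in the standard library since Python 3.3 as the venv module; a minimal sketch (the directory name is arbitrary):

```python
import os
import tempfile
import venv

# Create an isolated environment in a scratch directory; pass
# with_pip=True to venv.create() to bootstrap pip into it as well.
env_dir = os.path.join(tempfile.mkdtemp(), "env")
venv.create(env_dir)

# pyvenv.cfg marks the directory as a virtual environment.
print(os.path.exists(os.path.join(env_dir, "pyvenv.cfg")))
```

Activating the environment (or invoking its interpreter directly) then pins library lookups to that directory, which is the deployment story being defended here.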

I also disliked how he lumped current problems with Python in
together with "ancient" history.

It was definitely a good enough article for a blog post but would
need rewriting for a journal or a magazine.

>>> the GIL (15/16th of his CPUs are unused..)
>> Channels are a big selling point of Go, no argument there.
>
> The unused CPUs are not about channels (Python has Queue which
> is similar).  They're about the GIL, which is an implementation
> artifact of CPython that's hard to eliminate without modifying
> CPython's C API.  ESR didn't say this explicitly but it came to
> mind when reading his post: It's weird that Python's designers
> were willing to mess up the user language in the 2-to-3
> transition but felt that the C API had to be kept sacrosanct.
> Huge opportunities were blown at multiple levels.

OK. I assumed he planned to parallelize computations in Go by
offloading to concurrent channels--I guess I don't understand why
he can't do that in Python.

>> You must carefully use a different set of functions and
>> operators to regard the bytes as unicode code-points. So Go
>> makes it easy to do things incorrectly
>
> That seems like a deficiency in Go's type system rather than in
> the UTF8 representation per se.  

I agree. Python could've used UTF8 as the internal representation
as well, but there's no need to go back into that jmxfauth horror
show.

>> On the other hand, I only used Go until it made me feel really
>> annoyed
>
> Yeah I haven't used Go a whole lot either, for similar and
> other reasons.  ESR's criticisms of Python are imho mostly
> valid but I don't think Go is the answer.  It's too low level,
> showing its roots in C. Haskell and Ocaml have their own
> deficiencies but I think they are closer to the right
> direction.  I.e. I'd like something with a serious type system
> and a more functional-programming flavor than either Go or
> Python currently have.

A big reason he chose Go in the article is that he regarded it as
more simple to translate Python->Go than those other options.
It's hard to believe that good Python code would make good Go
code, but at least it's a starting point. I imagine that if I
stuck with Go long enough I'd develop a new coding style that
didn't involve creating useful data types. I've used C++ and
Python so long I just don't think that way.

-- 
Neil Cerutti


Re: ESR "Waning of Python" post

2018-10-11 Thread Tomasz Rola
On Thu, Oct 11, 2018 at 06:22:13PM +1100, Chris Angelico wrote:
[...]
> 
> There's a huge difference between deciding on using some different
> language for a project, and going on a massive ire-filled rant.

I agree; in fact, this is the posture that I myself have
adopted in practice.

I have read the article, which was interesting, but am yet to read the
comments. I have a question. What is the size of data? I could not
spot it. If one had read it into one huge byte array, how many bytes
would it be? ESR says he makes a graph of it and there are ~350k
objects - i.e. nodes, I assume? How many edges (arcs, lines) for a
node, on average? And at most?

I am curious, as always. Because it is nice of him to rant, but
technical/mathematical side of decision making is still a bit too
foggy for me and I cannot assess it.

-- 
Regards,
Tomasz Rola

--
** A C programmer asked whether computer had Buddha's nature.  **
** As the answer, master did "rm -rif" on the programmer's home**
** directory. And then the C programmer became enlightened...  **
** **
** Tomasz Rola  mailto:tomasz_r...@bigfoot.com **


Re: ESR "Waning of Python" post

2018-10-11 Thread Chris Angelico
On Thu, Oct 11, 2018 at 6:43 PM Thomas Jollans  wrote:
> The gist is that the GIL is a problem only for relatively few problems
> (e.g. games that need limited-scale low-latency parallelism). Most of
> the time, you either only need one process in the first place, or you
> can take full advantage of your multi-core machine, or multiple
> multi-core machines, using multiple processes (with ipyparallel or whatever)

Right. If someone goes on a long rant saying how he wasn't able to
write his device driver in Python, and he's giving up on the language
and going to C, would that be taken as an affront to Python, or the
knell of the language, or anything like that? No, it'd be "well,
that's not Python's role". But if someone needs to manage a billion
teensy sub-jobs and then finds that Python is unsuitable, it's clearly
the GIL's fault, and this is a fundamental flaw in the language, and
we should all move to Go for all our coding because Python utterly
sucks. Why? Why not just let other languages do things differently,
and have other design tradeoffs?

This is one fairly specific class of problem which is poorly served by
Python's/CPython's design. (I'm not sure how much of this could be
done differently in an alternative implementation; so far, we haven't
seen a "Python for embarrassingly parallel problems" implementation,
so I suspect part of it is language design.) So pick a different
language *for those problems*.

ChrisA


Re: ESR "Waning of Python" post

2018-10-11 Thread Thomas Jollans

On 11/10/2018 09:11, Ben Finney wrote:

Chris Angelico  writes:


In actual fact, it's not a problem per se. It's a design choice, and
every alternative choice tried so far has even worse problems. THAT is
why we still have it.


That reads to me like a rejection of the point made in the blog post:
that the GIL prevents Python from taking proper advantage of multi-core
machines.

In other words: Yes, it's a design decision, but that design decision
causes the problems described.

Is it your position that the described behaviour is not a problem? Do
you hold that position because you think multi-core machines are not a
sector that Python needs to be good at? Or that the described behaviour
doesn't occur? Or something else?



I recently watched this talk by Raymond Hettinger on concurrency which 
gives some perspective on this question especially in the first ten 
minutes: https://www.youtube.com/watch?v=9zinZmE3Ogk


The gist is that the GIL is a problem only for relatively few problems 
(e.g. games that need limited-scale low-latency parallelism). Most of 
the time, you either only need one process in the first place, or you 
can take full advantage of your multi-core machine, or multiple 
multi-core machines, using multiple processes (with ipyparallel or whatever)



Re: ESR "Waning of Python" post

2018-10-11 Thread Chris Angelico
On Thu, Oct 11, 2018 at 6:12 PM Ben Finney  wrote:
>
> Chris Angelico  writes:
>
> > In actual fact, it's not a problem per se. It's a design choice, and
> > every alternative choice tried so far has even worse problems. THAT is
> > why we still have it.
>
> That reads to me like a rejection of the point made in the blog post:
> that the GIL prevents Python from taking proper advantage of multi-core
> machines.
>
> In other words: Yes, it's a design decision, but that design decision
> causes the problems described.
>
> Is it your position that the described behaviour is not a problem? Do
> you hold that position because you think multi-core machines are not a
> sector that Python needs to be good at? Or that the described behaviour
> doesn't occur? Or something else?

Multi-core machines are important, but even on multi-core machines,
most Python processes don't need more than one. AFAIK, every single
alternative to the GIL has resulted in a measurable performance
penalty when running on a single core. (Happy to be proven wrong if
that's not the case.) So if you want better multi-core performance,
you MUST accept a single-core penalty.

Frankly, I don't see a problem with saying "Python doesn't make it
easy to write code that floods eight cores with work, therefore I will
choose a different language for this job". It doesn't mean Python is a
bad language. It just means that Python is not the one and only
language that all code must forever be written in.

There's a huge difference between deciding on using some different
language for a project, and going on a massive ire-filled rant.

ChrisA


Re: ESR "Waning of Python" post

2018-10-11 Thread Ben Finney
Chris Angelico  writes:

> In actual fact, it's not a problem per se. It's a design choice, and
> every alternative choice tried so far has even worse problems. THAT is
> why we still have it.

That reads to me like a rejection of the point made in the blog post:
that the GIL prevents Python from taking proper advantage of multi-core
machines.

In other words: Yes, it's a design decision, but that design decision
causes the problems described.

Is it your position that the described behaviour is not a problem? Do
you hold that position because you think multi-core machines are not a
sector that Python needs to be good at? Or that the described behaviour
doesn't occur? Or something else?

-- 
 \  “A hundred times every day I remind myself that […] I must |
  `\   exert myself in order to give in the same measure as I have |
_o__)  received and am still receiving” —Albert Einstein |
Ben Finney



Re: ESR "Waning of Python" post

2018-10-11 Thread Chris Angelico
On Thu, Oct 11, 2018 at 4:21 PM Gregory Ewing
 wrote:
>
> Paul Rubin wrote [concerning GIL removal]:
> > It's weird that Python's designers were willing to mess up the user
> > language in the 2-to-3 transition but felt that the C API had to be kept
> > sacrosanct.  Huge opportunities were blown at multiple levels.
>
> You say that as though we had a solution for GIL removal all
> thought out and ready to go, and the only thing that stopped us
> is that it would have required changing the C API.
>
> But it's not like that at all. As far as I know, all the
> attempts that have been made so far to remove the GIL have
> led to performance that was less than satisfactory. It's a
> hard problem that we haven't found a good solution to yet.
>

In actual fact, it's not a problem per se. It's a design choice, and
every alternative choice tried so far has even worse problems. THAT is
why we still have it.

ChrisA


Re: ESR "Waning of Python" post

2018-10-10 Thread Gregory Ewing

Paul Rubin wrote [concerning GIL removal]:

It's weird that Python's designers were willing to mess up the user
language in the 2-to-3 transition but felt that the C API had to be kept
sacrosanct.  Huge opportunities were blown at multiple levels.


You say that as though we had a solution for GIL removal all
thought out and ready to go, and the only thing that stopped us
is that it would have required changing the C API.

But it's not like that at all. As far as I know, all the
attempts that have been made so far to remove the GIL have
led to performance that was less than satisfactory. It's a
hard problem that we haven't found a good solution to yet.

--
Greg


Re: ESR "Waning of Python" post

2018-10-10 Thread Neil Cerutti
On 2018-10-09, Paul Rubin  wrote:
> If anyone cares, Eric Raymond posted a big rant saying
> basically he's giving up on Python and porting a big program
> he's working on to Go. Reasons he gives are

> performance (Go is 40x faster for his app)
> memory footprint (high overhead of simple Python objects cause
> his 64GB 16 core box to OOM on his data)

As Stephen said, it's sort of silly not to be aware of those
issues going in.

> the GIL (15/16th of his CPUs are unused, of course there are
> ways around that but I'm summarizing what he says even when I
> don't fully agree),

Channels are a big selling point of Go, no argument there. Using
them right is considerably trickier than it appears at first, but
they have good syntax and feel lightweight.
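Python's nearest stdlib analogue -- mentioned elsewhere in this thread -- is queue.Queue; a minimal producer/consumer sketch, with a None sentinel standing in for closing a channel:

```python
import queue
import threading

def worker(jobs, out):
    # Pull jobs until the sentinel arrives -- roughly what Go code
    # does by ranging over a channel until it is closed.
    while True:
        item = jobs.get()
        if item is None:
            break
        out.put(item * 2)

jobs, out = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(jobs, out))
t.start()

for i in range(5):
    jobs.put(i)
jobs.put(None)  # sentinel, standing in for close(ch)
t.join()

results = sorted(out.get() for _ in range(5))
print(results)
```

The syntax is clunkier than Go's, and of course these threads still share one GIL, so this buys concurrency but not CPU parallelism.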

> Unicode (he says Go's uniform use of UTF8 is better than
> Python's bloaty codepoint lists),

Go's system for character encoding is objectively worse, IMHO.

Both Python and Go require you to decode to an internal unicode
storage format on the way into your program, and to encode it
again on the way out.

But the internal storage formats are not equally usable. The
internal storage format is UTF8 in Go, but it's regarded simply as
bytes by most normal operations and functions. You must carefully
use a different set of functions and operators to regard the
bytes as unicode code-points. So Go makes it easy to do things
incorrectly, a la Python 2, which is a benefit only if you just
don't care to do things correctly.
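A small Python-side contrast (the string and the counts are chosen for illustration): str operations work in code points by default, and the byte-level view has to be requested explicitly -- the opposite of Go's default:

```python
s = "naïve"

# str operations count code points, so indexing and len are
# character-oriented by default.
assert len(s) == 5
assert s[2] == "ï"

# The UTF-8 byte view must be asked for explicitly, and "ï"
# occupies two bytes there.
b = s.encode("utf-8")
assert len(b) == 6
assert b[2:4].decode("utf-8") == "ï"

print(len(s), len(b))
```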

On the other hand, I only used Go until it made me feel really
annoyed that I couldn't build my own data types and interfaces
without feeling like they were 2nd or 3rd class citizens, forced
to carry around heavy, clunking chains, while the builtin types
and interfaces enjoyed unfair syntax and usability privileges.

I tried to be open-minded about the error propagation mechanism
in Go, but it remained stubbornly frustrating, especially when
designing my own interface.

> It is ranty and there are parts I don't agree with, but I think
> it is worth reading.  It is around 300 lines, followed by
> several pages of reader comments.
>
> http://esr.ibiblio.org/?p=8161

Thanks for sharing it.

-- 
Neil Cerutti