Re: Progress on the Gilectomy
Why not make the garbage collector check the reference count before freeing objects? Only C extensions would increment the ref count, while Python code would just use the garbage collector, making the ref count 0. That way even the existing C extensions would continue to work. Regarding Java using all the memory, that's not really true. It has a default heap size which may exceed the total memory in a particular environment (Android). -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 06/22/2017 10:26 PM, Rustom Mody wrote: Lawrence d'Oliveiro was banned on 30th Sept 2016 till end-of-year https://mail.python.org/pipermail/python-list/2016-September/714725.html Is there still a ban? My apologies to Lawrence, I completely forgot. The ban is now lifted. -- ~Ethan~ -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
Gregory Ewing: > Lawrence D’Oliveiro wrote: >> what WOULD you consider to be so “representative”? > > I don't claim any of them to be representative. Different GC > strategies have different characteristics. My experiences with Hotspot were a bit disheartening. GC is a winning concept provided that you don't have to strategize too much. In practice, it seems tweaking the GC parameters is a frequent necessity. On the other hand, I believe much of the trouble comes from storing too much information in the heap. Applications shouldn't have semipersistent multigigabyte lookup structures kept in RAM, at least not in numerous small objects. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
Lawrence D’Oliveiro wrote: what WOULD you consider to be so “representative”? I don't claim any of them to be representative. Different GC strategies have different characteristics. -- Greg -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Marko Rauhamaa wrote: And, BTW, my rule of thumb came from experiences with the Hotspot JRE. I wouldn't take a Java implementation to be representative of the behaviour of GC systems in general. -- Greg -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Thursday, June 22, 2017 at 4:28:03 AM UTC+5:30, Steve D'Aprano wrote: > On Thu, 22 Jun 2017 08:23 am, breamoreboy wrote: > > > Don't you know that Lawrence D’Oliveiro has been banned from the mailing > > list > > as he hasn't got a clue what he's talking about, > > That's not why he was given a ban. Being ignorant is not a crime -- if it > were, > a lot more of us would be banned, including all newbies. Lawrence d'Oliveiro was banned on 30th Sept 2016 till end-of-year https://mail.python.org/pipermail/python-list/2016-September/714725.html Is there still a ban? -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Fri, 23 Jun 2017 01:07 am, breamore...@gmail.com wrote: > 11 comments on the thread "Instagram: 40% Py3 to 99% Py3 in 10 months" showing > that he knows as much about Unicode as LDO knows about garbage collection. Who cares? Every time he opens his mouth to write absolute rubbish he just makes a fool of himself. Why do you let it upset you? -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Jun 22, 2017 4:03 PM, "Chris Angelico" wrote: On Fri, Jun 23, 2017 at 5:22 AM, CFK wrote: > On Jun 22, 2017 9:32 AM, "Chris Angelico" wrote: > > On Thu, Jun 22, 2017 at 11:24 PM, CFK wrote: >> When >> I draw memory usage graphs, I see sawtooth waves to the memory usage which >> suggest that the garbage builds up until the GC kicks in and reaps the >> garbage. > > Interesting. How do you actually measure this memory usage? Often, > when a GC frees up memory, it's merely made available for subsequent > allocations, rather than actually given back to the system - all it > takes is one still-used object on a page and the whole page has to be > retained. > > As such, a "create and drop" usage model would tend to result in > memory usage going up for a while, but then remaining stable, as all > allocations are being fulfilled from previously-released memory that's > still owned by the process. > > > I'm measuring it using a bit of a hack; I use psutil.Popen > (https://pypi.python.org/pypi/psutil) to open a simulation as a child > process, and in a tight loop gather the size of the resident set and the > number of virtual pages currently in use of the child. The sawtooths are > about 10% (and decreasing) of the size of the overall memory usage, and are > probably due to different stages of the simulation doing different things. > That is an educated guess though, I don't have strong evidence to back it > up. > > And, yes, what you describe is pretty close to what I'm seeing. The longer > the simulation has been running, the smoother the memory usage gets. Ah, I think I understand.
So the code would be something like this:

Phase one:
    Create a bunch of objects
    Do a bunch of simulation
    Destroy a bunch of objects
    Simulate more
    Destroy all the objects used in this phase, other than the result
Phase two:
    Like phase one

In that case, yes, it's entirely possible that the end of a phase could signal a complete cleanup of intermediate state, with the consequent release of memory to the system. (Or, more likely, a near-complete cleanup, with release of MOST of memory.) Very cool bit of analysis you've done there. Thank you! And, yes, that is essentially what is going on (or was in that version of the simulator; I'm in the middle of a big refactor to speed things up and expect the memory usage patterns to change) Thanks, Cem Karan -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Fri, Jun 23, 2017 at 5:22 AM, CFK wrote: > On Jun 22, 2017 9:32 AM, "Chris Angelico" wrote: > > On Thu, Jun 22, 2017 at 11:24 PM, CFK wrote: >> When >> I draw memory usage graphs, I see sawtooth waves to the memory usage which >> suggest that the garbage builds up until the GC kicks in and reaps the >> garbage. > > Interesting. How do you actually measure this memory usage? Often, > when a GC frees up memory, it's merely made available for subsequent > allocations, rather than actually given back to the system - all it > takes is one still-used object on a page and the whole page has to be > retained. > > As such, a "create and drop" usage model would tend to result in > memory usage going up for a while, but then remaining stable, as all > allocations are being fulfilled from previously-released memory that's > still owned by the process. > > > I'm measuring it using a bit of a hack; I use psutil.Popen > (https://pypi.python.org/pypi/psutil) to open a simulation as a child > process, and in a tight loop gather the size of the resident set and the > number of virtual pages currently in use of the child. The sawtooths are > about 10% (and decreasing) of the size of the overall memory usage, and are > probably due to different stages of the simulation doing different things. > That is an educated guess though, I don't have strong evidence to back it > up. > > And, yes, what you describe is pretty close to what I'm seeing. The longer > the simulation has been running, the smoother the memory usage gets. Ah, I think I understand. So the code would be something like this:

Phase one:
    Create a bunch of objects
    Do a bunch of simulation
    Destroy a bunch of objects
    Simulate more
    Destroy all the objects used in this phase, other than the result
Phase two:
    Like phase one

In that case, yes, it's entirely possible that the end of a phase could signal a complete cleanup of intermediate state, with the consequent release of memory to the system.
(Or, more likely, a near-complete cleanup, with release of MOST of memory.) Very cool bit of analysis you've done there. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Thursday, June 22, 2017 at 11:07:36 AM UTC-4, bream...@gmail.com wrote: > On Wednesday, June 21, 2017 at 11:58:03 PM UTC+1, Steve D'Aprano wrote: > > On Thu, 22 Jun 2017 08:23 am, breamoreboy wrote: > > > > > Don't you know that Lawrence D’Oliveiro has been banned from the mailing > > > list > > > as he hasn't got a clue what he's talking about, > > > > That's not why he was given a ban. Being ignorant is not a crime -- if it > > were, > > a lot more of us would be banned, including all newbies. > > > > > just like the RUE? > > > > What is your obsession with wxjmfauth? You repeatedly mention him in > > unrelated > > discussions. > > > > 11 comments on the thread "Instagram: 40% Py3 to 99% Py3 in 10 months" > showing that he knows as much about Unicode as LDO knows about garbage > collection. You've been asked to stop making personal attacks before. Please stop. --Ned. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Jun 22, 2017 9:32 AM, "Chris Angelico" wrote: On Thu, Jun 22, 2017 at 11:24 PM, CFK wrote: > When > I draw memory usage graphs, I see sawtooth waves to the memory usage which > suggest that the garbage builds up until the GC kicks in and reaps the > garbage. Interesting. How do you actually measure this memory usage? Often, when a GC frees up memory, it's merely made available for subsequent allocations, rather than actually given back to the system - all it takes is one still-used object on a page and the whole page has to be retained. As such, a "create and drop" usage model would tend to result in memory usage going up for a while, but then remaining stable, as all allocations are being fulfilled from previously-released memory that's still owned by the process. I'm measuring it using a bit of a hack; I use psutil.Popen (https://pypi.python.org/pypi/psutil) to open a simulation as a child process, and in a tight loop gather the size of the resident set and the number of virtual pages currently in use of the child. The sawtooths are about 10% (and decreasing) of the size of the overall memory usage, and are probably due to different stages of the simulation doing different things. That is an educated guess though, I don't have strong evidence to back it up. And, yes, what you describe is pretty close to what I'm seeing. The longer the simulation has been running, the smoother the memory usage gets. Thanks, Cem Karan -- https://mail.python.org/mailman/listinfo/python-list
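The psutil-based measurement Cem describes can be approximated with the standard library alone by reading the child's /proc/<pid>/statm entry. A rough sketch, assuming Linux (the sample_child_rss helper and its timing parameters are illustrative, not from the original post; psutil's Process.memory_info() gives the same numbers portably):

```python
import os
import subprocess
import sys
import time

def sample_child_rss(argv, interval=0.05, duration=0.4):
    """Spawn a child process and periodically sample its resident set size.

    Reads /proc/<pid>/statm, so this sketch is Linux-only.
    Returns a list of RSS samples in bytes.
    """
    page_size = os.sysconf("SC_PAGE_SIZE")
    child = subprocess.Popen(argv)
    samples = []
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline and child.poll() is None:
            try:
                with open(f"/proc/{child.pid}/statm") as f:
                    # second field of statm is resident set size, in pages
                    rss_pages = int(f.read().split()[1])
                samples.append(rss_pages * page_size)
            except FileNotFoundError:
                break  # child exited between poll() and open(), or no /proc
            time.sleep(interval)
    finally:
        child.wait()
    return samples

# Sample a short-lived child interpreter as a stand-in for the simulator.
rss = sample_child_rss([sys.executable, "-c", "import time; time.sleep(0.3)"])
```

Plotting such samples over a long run is what produces the sawtooth graphs discussed in the thread; the sampling loop itself is cheap enough not to disturb the child.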
Re: Progress on the Gilectomy
On Fri, Jun 23, 2017 at 1:48 AM, Marko Rauhamaa wrote: > Chris Angelico : > >> not "aim for 400MB because the garbage collector is only 10% >> efficient". Get yourself a better garbage collector. Employ Veolia or >> something. > > It's about giving GC room (space- and timewise) to operate. Also, you > don't want your memory consumption to hit the RAM ceiling even for a > moment. Again, if you'd said to *leave 10% room*, I would be inclined to believe you (eg to use no more than 3.5ish gig when you have four available), but not to leave 90% room. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Marko Rauhamaa: > Chris Angelico : > >> not "aim for 400MB because the garbage collector is only 10% >> efficient". Get yourself a better garbage collector. Employ Veolia or >> something. > > It's about giving GC room (space- and timewise) to operate. Also, you > don't want your memory consumption to hit the RAM ceiling even for a > moment. And, BTW, my rule of thumb came from experiences with the Hotspot JRE. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Chris Angelico: > not "aim for 400MB because the garbage collector is only 10% > efficient". Get yourself a better garbage collector. Employ Veolia or > something. It's about giving GC room (space- and timewise) to operate. Also, you don't want your memory consumption to hit the RAM ceiling even for a moment. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Thu, Jun 22, 2017 at 11:27 PM, Marko Rauhamaa wrote: > CFK : > >> Yes, and this is why I suspect CPython would work well too. My usage >> pattern may be similar to Python usage patterns. The only way to know for >> sure is to try it and see what happens. > > I have a rule of thumb that your application should not need more than > 10% of the available RAM. If your server has 4 GB of RAM, your > application should only need 400 MB. The 90% buffer should be left for > the GC to maneuver. *BOGGLE* I could see a justification in saying "aim for 400MB, because then unexpected spikes won't kill you", or "aim for 400MB to ensure that you can run multiple instances of the app for load balancing", or "aim for 400MB because you don't want to crowd out the database and the disk cache", but not "aim for 400MB because the garbage collector is only 10% efficient". Get yourself a better garbage collector. Employ Veolia or something. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
CFK: > Yes, and this is why I suspect CPython would work well too. My usage > pattern may be similar to Python usage patterns. The only way to know for > sure is to try it and see what happens. I have a rule of thumb that your application should not need more than 10% of the available RAM. If your server has 4 GB of RAM, your application should only need 400 MB. The 90% buffer should be left for the GC to maneuver. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Thu, Jun 22, 2017 at 11:24 PM, CFK wrote: > When > I draw memory usage graphs, I see sawtooth waves to the memory usage which > suggest that the garbage builds up until the GC kicks in and reaps the > garbage. Interesting. How do you actually measure this memory usage? Often, when a GC frees up memory, it's merely made available for subsequent allocations, rather than actually given back to the system - all it takes is one still-used object on a page and the whole page has to be retained. As such, a "create and drop" usage model would tend to result in memory usage going up for a while, but then remaining stable, as all allocations are being fulfilled from previously-released memory that's still owned by the process. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Jun 22, 2017 12:38 AM, "Paul Rubin" wrote: Lawrence D’Oliveiro writes: > while “memory footprint” depends on how much memory is actually being > retained in accessible objects. If the object won't be re-accessed but is still retained by gc, then refcounting won't free it either. > Once again: The trouble with GC is, it doesn’t know when to kick in: > it just keeps on allocating memory until it runs out. When was the last time you encountered a problem like that in practice? It's almost never an issue. "Runs out" means reached an allocation threshold that's usually much smaller than the program's memory region. And as you say, you can always manually trigger a gc if the need arises. I'm with Paul and Steve on this. I've had to do a **lot** of profiling on my simulator to get it to run at a reasonable speed. Memory usage seems to follow an exponential decay curve, hitting a strict maximum that strongly correlates with the number of live objects in a given simulation run. When I draw memory usage graphs, I see sawtooth waves to the memory usage which suggest that the garbage builds up until the GC kicks in and reaps the garbage. In short, only an exceptionally poorly written GC would exhaust memory before reaping garbage. Thanks, Cem Karan -- https://mail.python.org/mailman/listinfo/python-list
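The sawtooth Cem describes can be reproduced in miniature with CPython's own collector: the generation-0 allocation counter climbs until it crosses the first threshold, then a collection knocks it back down. A sketch (the 3000-iteration workload is an arbitrary illustration):

```python
import gc

gc.enable()
gc.collect()                      # start from a clean slate
keep, counts = [], []
for i in range(3000):
    keep.append([i])              # each new list is tracked by the cyclic GC
    counts.append(gc.get_count()[0])

# The gen-0 counter climbs by roughly one per tracked allocation, then
# drops back toward zero whenever a young-generation collection runs --
# the same sawtooth shape seen in the memory-usage graphs.
rises = any(b > a for a, b in zip(counts, counts[1:]))
drops = any(b < a for a, b in zip(counts, counts[1:]))
print(rises, drops)
```

The drops happen long before memory pressure: they are driven purely by the allocation-count threshold, which is the point being argued in this subthread.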
Re: Progress on the Gilectomy
On Jun 21, 2017 1:38 AM, "Paul Rubin" wrote: Cem Karan writes: > I'm not too sure how much of performance impact that will have. My > code generates a very large number of tiny, short-lived objects at a > fairly high rate of speed throughout its lifetime. At least in the > last iteration of the code, garbage collection consumed less than 1% > of the total runtime. Maybe this is something that needs to be done > and profiled to see how well it works? If the gc uses that little runtime and your app isn't suffering from the added memory fragmentation, then it sounds like you're doing fine. Yes, and this is why I suspect CPython would work well too. My usage pattern may be similar to Python usage patterns. The only way to know for sure is to try it and see what happens. > I **still** can't figure out how they managed to do it, How it works (i.e. what the implementation does) is quite simple and understandable. The amazing thing is that it doesn't leak memory catastrophically. I'll have to read through the code then, just to see what they are doing. Thanks, Cem Karan -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
Lawrence D’Oliveiro writes: > while “memory footprint” depends on how much memory is actually being > retained in accessible objects. If the object won't be re-accessed but is still retained by gc, then refcounting won't free it either. > Once again: The trouble with GC is, it doesn’t know when to kick in: > it just keeps on allocating memory until it runs out. When was the last time you encountered a problem like that in practice? It's almost never an issue. "Runs out" means reached an allocation threshold that's usually much smaller than the program's memory region. And as you say, you can always manually trigger a gc if the need arises. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy (Posting On Python-List Prohibited)
On Thu, 22 Jun 2017 10:30 am, Lawrence D’Oliveiro wrote: > Once again: The trouble with GC is, it doesn’t know when to kick in: it just > keeps on allocating memory until it runs out. Once again: no it doesn't. Are you aware that CPython has a GC? (Or rather, a *second* GC, apart from the reference counter.) It runs periodically to reclaim dead objects in cycles that the reference counter won't free. It runs whenever the number of allocations minus the number of deallocations exceeds certain thresholds, and you can set and query the thresholds using gc.set_threshold() and gc.get_threshold(). CPython alone disproves your assertion that GCs "keep on allocating memory until it runs out". Are you aware that there is more than one garbage collection algorithm? Apart from reference-counting GC, there are also "mark and sweep" GCs, generational GCs (like CPython's), real-time algorithms, and more. One real-time algorithm implicitly divides memory into two halves. When one half is half-full, it moves all the live objects into the other half, freeing up the first half. The Mercury programming language even has a *compile time* garbage collector that can determine when an object can be freed during compilation -- no sweeps or reference counting required. It may be that *some* (possibly toy) GC algorithms behave as you say, only running when memory is completely full. But your belief that *all* GC algorithms behave this way is simply wrong. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list
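A minimal demonstration of the thresholds Steve mentions (the printed defaults are CPython's and may differ between versions):

```python
import gc

# The three values control when generations 0, 1 and 2 are collected.
print(gc.get_threshold())        # typically (700, 10, 10) in CPython

# A gen-0 collection is triggered once tracked allocations minus
# deallocations exceed the first threshold -- long before memory
# is exhausted.
gc.set_threshold(100, 10, 10)    # collect the youngest generation sooner
assert gc.get_threshold() == (100, 10, 10)

# Collections can also be forced explicitly at any time:
unreachable = gc.collect()       # number of unreachable objects found
```

Tuning the first threshold down trades CPU overhead for promptness, which is exactly the balance Paul Rubin describes elsewhere in the thread.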
Re: Progress on the Gilectomy
On Thu, 22 Jun 2017 08:23 am, breamore...@gmail.com wrote: > Don't you know that Lawrence D’Oliveiro has been banned from the mailing list > as he hasn't got a clue what he's talking about, That's not why he was given a ban. Being ignorant is not a crime -- if it were, a lot more of us would be banned, including all newbies. > just like the RUE? What is your obsession with wxjmfauth? You repeatedly mention him in unrelated discussions. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Lawrence D’Oliveiro writes: > The trouble with GC is, it doesn’t know when to kick in: it just keeps > on allocating memory until it runs out. That's not how GC works, geez. Typically it would run after every N bytes of memory allocated, for N chosen to balance memory footprint with cpu overhead. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Paul Rubin: > How it works (i.e. what the implementation does) is quite simple and > understandable. The amazing thing is that it doesn't leak memory > catastrophically. If I understand it correctly, the 32-bit Go language runtime implementation suffered "catastrophically" at one point. The reason was that modern programs can actually use 2GB of RAM. That being the case, there is a 50% chance for any random 4-byte combination to look like a valid pointer into the heap. Marko -- https://mail.python.org/mailman/listinfo/python-list
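Marko's 50% figure is easy to sanity-check: with a contiguous 2 GB heap in a 32-bit address space, half of all possible 4-byte values happen to fall inside it, so a conservative collector scanning raw data words will treat roughly half of them as potential pointers. A back-of-envelope sketch (the heap base address is an arbitrary assumption):

```python
import random

ADDRESS_SPACE = 2 ** 32          # all possible 32-bit words
HEAP_SIZE = 2 * 2 ** 30          # a 2 GB heap
HEAP_BASE = 0x20000000           # assumed contiguous heap placement

# Exact fraction of 32-bit words that land inside the heap:
exact = HEAP_SIZE / ADDRESS_SPACE
print(exact)                     # 0.5

# A seeded Monte Carlo over random data words agrees:
random.seed(42)
trials = 100_000
hits = sum(
    HEAP_BASE <= random.getrandbits(32) < HEAP_BASE + HEAP_SIZE
    for _ in range(trials)
)
rate = hits / trials
```

Each false positive pins whatever object the fake pointer happens to land on, which is why a conservative GC on a nearly-full 32-bit heap can leak "catastrophically", as the Go runtime reportedly did.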
Re: Progress on the Gilectomy
Cem Karan writes: > I'm not too sure how much of performance impact that will have. My > code generates a very large number of tiny, short-lived objects at a > fairly high rate of speed throughout its lifetime. At least in the > last iteration of the code, garbage collection consumed less than 1% > of the total runtime. Maybe this is something that needs to be done > and profiled to see how well it works? If the gc uses that little runtime and your app isn't suffering from the added memory fragmentation, then it sounds like you're doing fine. > I **still** can't figure out how they managed to do it, How it works (i.e. what the implementation does) is quite simple and understandable. The amazing thing is that it doesn't leak memory catastrophically. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Jun 20, 2017, at 1:19 AM, Paul Rubin wrote: > Cem Karan writes: >> Can you give examples of how it's not reliable? > > Basically there's a chance of it leaking memory by mistaking a data word > for a pointer. This is unlikely to happen by accident and usually > inconsequential if it does happen, but maybe there could be malicious > data that makes it happen Got it, thank you. My processes will run for 1-2 weeks at a time, so I can handle minor memory leaks over that time without too much trouble. > Also, it's a non-compacting gc that has to touch all the garbage as it > sweeps, not a reliability issue per se, but not great for performance > especially in large, long-running systems. I'm not too sure how much of performance impact that will have. My code generates a very large number of tiny, short-lived objects at a fairly high rate of speed throughout its lifetime. At least in the last iteration of the code, garbage collection consumed less than 1% of the total runtime. Maybe this is something that needs to be done and profiled to see how well it works? > It's brilliant though. It's one of those things that seemingly can't > possibly work, but it turns out to be quite effective. Agreed! I **still** can't figure out how they managed to do it, it really does look like it shouldn't work at all! Thanks, Cem Karan -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Paul Rubin: > The simplest way to start experimenting with GC in Python might be to > redefine the refcount macros to do nothing, connect the allocator to > the Boehm GC, and stop all the threads when GC time comes. I don't > know if Guile has threads at all, but I know it uses the Boehm GC and > it's quite effective. Guile requires careful programming practices in the C extension code: https://www.gnu.org/software/guile/manual/html_node/Foreign-Object-Memory-Management.html#Foreign-Object-Memory-Management Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Cem Karan writes: > Can you give examples of how it's not reliable? Basically there's a chance of it leaking memory by mistaking a data word for a pointer. This is unlikely to happen by accident and usually inconsequential if it does happen, but maybe there could be malicious data that makes it happen. Also, it's a non-compacting gc that has to touch all the garbage as it sweeps, not a reliability issue per se, but not great for performance especially in large, long-running systems. It's brilliant though. It's one of those things that seemingly can't possibly work, but it turns out to be quite effective. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Chris Angelico writes: > Or let's look at it a different way. Instead of using a PyObject* in C > code, you could write C++ code that uses a trivial wrapper class that > holds the pointer, increments its refcount on construction, and > decrements that refcount on destruction. That's the C++ STL shared_ptr template. Unfortunately it has the same problem as Python refcounts, i.e. it has to use locks to maintain thread safety, which slows it down significantly. The simplest way to start experimenting with GC in Python might be to redefine the refcount macros to do nothing, connect the allocator to the Boehm GC, and stop all the threads when GC time comes. I don't know if Guile has threads at all, but I know it uses the Boehm GC and it's quite effective. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Tue, Jun 20, 2017 at 1:52 PM, Rustom Mody wrote: > Saw this this morning > https://medium.com/@alexdixon/functional-programming-in-javascript-is-an-antipattern-58526819f21e > > May seem irrelevant to this, but if JS, FP is replaced by Python, GC it > becomes > more on-topic https://rhettinger.wordpress.com/2011/05/26/super-considered-super/ If super() is replaced with GC, it also becomes on-topic. I'm sure all this has some deep existential meaning about how easily blog posts can be transplanted into utterly unrelated conversations, but at the moment, it eludes me. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Tuesday, June 20, 2017 at 5:53:00 AM UTC+5:30, Cem Karan wrote: > On Jun 19, 2017, at 6:19 PM, Gregory Ewing wrote: > > > Ethan Furman wrote: > >> Let me ask a different question: How much effort is required at the C > >> level when using tracing garbage collection? > > > > That depends on the details of the GC implementation, but often > > you end up swapping one form of boilerplate (maintaining ref > > counts) for another (such as making sure the GC system knows > > about all the temporary references you're using). > > > > Some, such as the Boehm collector, try to figure it all out > > automagically, but they rely on non-portable tricks and aren't > > totally reliable. > > Can you give examples of how it's not reliable? I'm currently using it in > one of my projects, so if it has problems, I need to know about them. Saw this this morning https://medium.com/@alexdixon/functional-programming-in-javascript-is-an-antipattern-58526819f21e May seem irrelevant to this, but if JS, FP is replaced by Python, GC it becomes more on-topic -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Jun 19, 2017, at 6:19 PM, Gregory Ewing wrote: > Ethan Furman wrote: >> Let me ask a different question: How much effort is required at the C level >> when using tracing garbage collection? > > That depends on the details of the GC implementation, but often > you end up swapping one form of boilerplate (maintaining ref > counts) for another (such as making sure the GC system knows > about all the temporary references you're using). > > Some, such as the Boehm collector, try to figure it all out > automagically, but they rely on non-portable tricks and aren't > totally reliable. Can you give examples of how it's not reliable? I'm currently using it in one of my projects, so if it has problems, I need to know about them. On the main topic: I think that a good tracing garbage collector would probably be a good idea. I've been having a real headache binding python to my C library via ctypes, and a large part of that problem is that I've got two different garbage collectors (python and bdwgc). I think I've got it worked out at this point, but it would have been convenient to get memory allocated from python's garbage collected heap on the C-side. Lot fewer headaches. Thanks, Cem Karan -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Ethan Furman wrote: Let me ask a different question: How much effort is required at the C level when using tracing garbage collection? That depends on the details of the GC implementation, but often you end up swapping one form of boilerplate (maintaining ref counts) for another (such as making sure the GC system knows about all the temporary references you're using). Some, such as the Boehm collector, try to figure it all out automagically, but they rely on non-portable tricks and aren't totally reliable. -- Greg -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Tue, Jun 20, 2017 at 1:44 AM, Skip Montanaro wrote: > On Mon, Jun 19, 2017 at 10:20 AM, Ethan Furman wrote: > >> Programming at the C level is not working in Python, and many Python >> niceties simply don't exist there. > > > True, but a lot of functionality available to Python programmers exists at > the extension module level, whether delivered as part of the core > distribution or from third-party sources. (The core CPython test suite > spends a fair amount of effort on leak detection, one side effect of > incorrect reference counting.) While programming in Python you don't need > to worry about reference counting errors, when they slip through from the C > level, they affect you. High level languages mean that you don't have to write C code. Does the presence of core code and/or extension modules written in C mean that Python isn't a high level language? No. And nor does that code mean Python isn't garbage-collected. Everything has to have an implementation somewhere. Or let's look at it a different way. Instead of using a PyObject* in C code, you could write C++ code that uses a trivial wrapper class that holds the pointer, increments its refcount on construction, and decrements that refcount on destruction. That way, you can simply declare these PyObjectWrappers and let them expire. Does that mean that suddenly the refcounting isn't your responsibility, ergo it's now a garbage collector? Because the transformation is trivially easy. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 06/19/2017 08:44 AM, Skip Montanaro wrote: On Mon, Jun 19, 2017 at 10:20 AM, Ethan Furman wrote: Programming at the C level is not working in Python, and many Python niceties simply don't exist there. True, but a lot of functionality available to Python programmers exists at the extension module level, whether delivered as part of the core distribution or from third-party sources. (The core CPython test suite spends a fair amount of effort on leak detection, one side effect of incorrect reference counting.) While programming in Python you don't need to worry about reference counting errors, when they slip through from the C level, they affect you. Let me ask a different question: How much effort is required at the C level when using tracing garbage collection? -- ~Ethan~ -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Mon, Jun 19, 2017 at 10:20 AM, Ethan Furman wrote: > Programming at the C level is not working in Python, and many Python > niceties simply don't exist there. True, but a lot of functionality available to Python programmers exists at the extension module level, whether delivered as part of the core distribution or from third-party sources. (The core CPython test suite spends a fair amount of effort on leak detection, one side effect of incorrect reference counting.) While programming in Python you don't need to worry about reference counting errors, when they slip through from the C level, they affect you. Skip -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 06/19/2017 08:06 AM, Skip Montanaro wrote:
> On Mon, Jun 19, 2017 at 9:20 AM, Ethan Furman wrote:
>> Reference counting is a valid garbage collecting mechanism, therefore
>> Python is also a GC language.
>
> Garbage collection is usually thought of as a way to remove
> responsibility for tracking of live data from the user. Reference
> counting doesn't do that.

Caveat: I'm not a CS major.

Question: In the same way that Object Orientation is usually thought of as data hiding?

Comment: Except in rare cases (e.g. messing with __del__), the Python user does not have to think about nor manage live data, so reference counting seems to meet that requirement. Programming at the C level is not working in Python, and many Python niceties simply don't exist there.

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Mon, Jun 19, 2017 at 9:20 AM, Ethan Furman wrote:
> Reference counting is a valid garbage collecting mechanism, therefore
> Python is also a GC language.

Garbage collection is usually thought of as a way to remove responsibility for tracking of live data from the user. Reference counting doesn't do that.

Skip
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Monday, June 19, 2017 at 7:40:49 PM UTC+5:30, Robin Becker wrote:
> On 19/06/2017 01:20, Paul Rubin wrote:
> ...
>> the existing C API quite seriously. Reworking the C modules in the
>> stdlib would be a large but not impossible undertaking. The many
>> external C modules out there would be more of an issue.
>
> I have always found the management of reference counts to be one of the
> hardest things about the C api. I'm not sure exactly how C extensions
> would/should interact with a GC python. There seem to be different
> approaches eg lua & go are both GC languages but seem different in how
> C/GC memory should interact.

Worth reading for chances Python missed:

https://stackoverflow.com/questions/588958/what-are-the-drawbacks-of-stackless-python

To be fair, also this:

https://stackoverflow.com/questions/377254/stackless-python-and-multicores

--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 06/19/2017 07:10 AM, Robin Becker wrote:
> I have always found the management of reference counts to be one of the
> hardest things about the C api. I'm not sure exactly how C extensions
> would/should interact with a GC python. There seem to be different
> approaches eg lua & go are both GC languages but seem different in how
> C/GC memory should interact.

The conversation would be easier if the proper terms were used. Reference counting is a valid garbage collecting mechanism, therefore Python is also a GC language.

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 19/06/2017 01:20, Paul Rubin wrote:
...
> the existing C API quite seriously. Reworking the C modules in the
> stdlib would be a large but not impossible undertaking. The many
> external C modules out there would be more of an issue.

I have always found the management of reference counts to be one of the hardest things about the C API. I'm not sure exactly how C extensions would/should interact with a GC Python. There seem to be different approaches, e.g. Lua & Go are both GC languages but seem different in how C/GC memory should interact.

--
Robin Becker
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
I always thought the GIL removal obstacle was the need to put locks around every refcount adjustment, and the only real cure for that is to use a tracing GC. That is a good idea in many ways, but it would break the existing C API quite seriously. Reworking the C modules in the stdlib would be a large but not impossible undertaking. The many external C modules out there would be more of an issue. -- https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Tue, Jun 13, 2017 at 1:53 PM, Terry Reedy wrote:
> This was tried at least once, perhaps 15 years ago.

Yes, I believe Greg Stein implemented a proof-of-concept in about the Python 1.4 timeframe. The observation at the time was that it slowed down single-threaded programs too much to be accepted as it existed then. That remains the primary bugaboo as I understand it.

It seems Larry has pushed the envelope a fair bit farther, but there are still problems. I don't know if the Gilectomy code changes are too great to live along the mainline branches, but I wonder if having a bleeding-edge-gilectomy branch in Git (maintained alongside the regular stuff, but not formally released) would

a) help it stay in sync better with CPython

b) expose the changes to more people, especially extension module authors

Combined, the two might make it so the GIL-free branch isn't always playing catchup (because of 'a') and more extension modules get tweaked to work properly in a GIL-free world (because of 'b'). I imagine Larry Hastings has given the idea some consideration.

Skip
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 6/13/2017 12:09 PM, Robin Becker wrote:
> On 11/06/2017 07:27, Steve D'Aprano wrote:
>> I'm tired of people complaining about the GIL as a "mistake" without
>> acknowledging that it exists for a reason.
>
> I thought we were also consenting adults about problems arising from bad
> extensions. The GIL is a blocker for cpython's ability to use multi-core
> cpus.

When using threads, not when using multiple processes.

> The contention issues all arise from reference counting. Newer
> languages like go seem to prefer the garbage collection approach.
> Perhaps someone should try a reference-countectomy,

This was tried at least once, perhaps 15 years ago.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Tue, Jun 13, 2017 at 11:09 AM, Robin Becker wrote:
> I looked at Larry's talk with interest. The GIL is not a requirement as he
> pointed out at the end, both IronPython and Jython don't need it.

But they don't support CPython's extension module API either, I don't think. (I imagine that might have been the point of your reference.)

Skip
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 11/06/2017 07:27, Steve D'Aprano wrote:
> I'm tired of people complaining about the GIL as a "mistake" without
> acknowledging that it exists for a reason.

I thought we were also consenting adults about problems arising from bad extensions. The GIL is a blocker for CPython's ability to use multi-core CPUs.

I looked at Larry's talk with interest. The GIL is not a requirement, as he pointed out at the end: both IronPython and Jython don't need it. That said, I think the approach he outlined is probably wrong unless we attach a very high weight to preserving the current extension interface. C extensions are a real nuisance.

The contention issues all arise from reference counting. Newer languages like Go seem to prefer the garbage collection approach. Perhaps someone should try a reference-countectomy, but then they already have, with other Python implementations.

--
Robin Becker
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On Sun, 11 Jun 2017 04:21 pm, Stefan Behnel wrote:
> Serhiy Storchaka wrote on 11.06.2017 at 07:11:
>> And the GIL is also used to guarantee the atomicity of many operations
>> and the consistency of internal structures without using additional
>> locks. Many parts of the core and the stdlib would just not work
>> correctly in a multithreaded environment without the GIL.
>
> And the same applies to external extension modules. The GIL is really handy
> when it comes to reasoning about safety and correctness of algorithms under
> the threat of thread concurrency. Especially in native code, where the
> result of an unanticipated race condition is usually a crash rather than an
> exception.

Thank you Stefan and Serhiy! I'm tired of people complaining about the GIL as a "mistake" without acknowledging that it exists for a reason.

--
Steve

“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
Serhiy Storchaka wrote on 11.06.2017 at 07:11:
> On 10.06.17 15:54, Steve D'Aprano wrote:
>> Larry Hastings is working on removing the GIL from CPython:
>>
>> https://lwn.net/Articles/723949/
>>
>> For those who don't know the background:
>>
>> - The GIL (Global Interpreter Lock) is used to ensure that only one
>> piece of code can update references to an object at a time.
>>
>> - The downside of the GIL is that CPython cannot take advantage of
>> multiple CPU cores effectively. Hence multi-threaded code is not as
>> fast as it could be.
>>
>> - Past attempts to remove the GIL caused unacceptable slow-downs for
>> single-threaded programs and code run on single-core CPUs.
>>
>> - And also failed to show the expected performance gains for
>> multi-threaded programs on multi-core CPUs. (There was some gain, but
>> not much.)
>>
>> Thanks Larry for your experiments on this!
>
> And the GIL is also used to guarantee the atomicity of many operations
> and the consistency of internal structures without using additional
> locks. Many parts of the core and the stdlib would just not work
> correctly in a multithreaded environment without the GIL.

And the same applies to external extension modules. The GIL is really handy when it comes to reasoning about safety and correctness of algorithms under the threat of thread concurrency. Especially in native code, where the result of an unanticipated race condition is usually a crash rather than an exception.

Stefan
--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 10.06.17 15:54, Steve D'Aprano wrote:
> Larry Hastings is working on removing the GIL from CPython:
>
> https://lwn.net/Articles/723949/
>
> For those who don't know the background:
>
> - The GIL (Global Interpreter Lock) is used to ensure that only one
> piece of code can update references to an object at a time.
>
> - The downside of the GIL is that CPython cannot take advantage of
> multiple CPU cores effectively. Hence multi-threaded code is not as fast
> as it could be.
>
> - Past attempts to remove the GIL caused unacceptable slow-downs for
> single-threaded programs and code run on single-core CPUs.
>
> - And also failed to show the expected performance gains for
> multi-threaded programs on multi-core CPUs. (There was some gain, but
> not much.)
>
> Thanks Larry for your experiments on this!

And the GIL is also used to guarantee the atomicity of many operations and the consistency of internal structures without using additional locks. Many parts of the core and the stdlib would just not work correctly in a multithreaded environment without the GIL.

--
https://mail.python.org/mailman/listinfo/python-list
Re: Progress on the Gilectomy
On 10-6-2017 14:54, Steve D'Aprano wrote:
> Larry Hastings is working on removing the GIL from CPython:
>
> https://lwn.net/Articles/723949/

Here is Larry's "How's it going" presentation from PyCon 2017 on this subject:

https://www.youtube.com/watch?v=pLqv11ScGsQ

-irmen
--
https://mail.python.org/mailman/listinfo/python-list
Progress on the Gilectomy
Larry Hastings is working on removing the GIL from CPython:

https://lwn.net/Articles/723949/

For those who don't know the background:

- The GIL (Global Interpreter Lock) is used to ensure that only one piece of code can update references to an object at a time.

- The downside of the GIL is that CPython cannot take advantage of multiple CPU cores effectively. Hence multi-threaded code is not as fast as it could be.

- Past attempts to remove the GIL caused unacceptable slow-downs for single-threaded programs and code run on single-core CPUs.

- And also failed to show the expected performance gains for multi-threaded programs on multi-core CPUs. (There was some gain, but not much.)

Thanks Larry for your experiments on this!

--
Steve

“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

--
https://mail.python.org/mailman/listinfo/python-list