[Tim Delaney ]
> ...
> If I'm not mistaken, #3 would result in the optimiser changing str.format()
> into an f-string in-place. Is this correct? We're not talking here about
> people manually changing the code from str.format() to f-strings, right?
All correct. It's
[Kirill Balunov ]
> I apologize that I get into the discussion. Obviously in some situations it
> will be useful to check that a floating-point number is integral, but from
> the examples given it is clear that they are very rare. Why the variant with
> the inclusion of
[David Mertz]
> I've been using and teaching python for close to 20 years and I never
> noticed that x.is_integer() exists until this thread.
Except it was impossible to notice across most of those years, because
it didn't exist across most of those years ;-)
> I would say the "one obvious way"
And, yes, also a pain in the
ass ;-)
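For anyone else who never noticed it, a quick demo of what the method
does (my example; I believe float.is_integer() arrived in 2.6):

>>> (3.0).is_integer()
True
>>> (3.5).is_integer()
False
>>> (1e300).is_integer()
True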
--- nothing new below ---
[Devin Jeanpierre ]
> PyPy (5.8):
> >>> x = 1e300
> >>> x.is_integer()
> True
> >>> math.sqrt(x**2).is_integer()
> False
> >>> x**2
> inf
I think you missed that David said "even without reaching inf" (you
did reach inf), and that I said "such that x*x neither
[David Mertz ]
>> For example, this can be true (even without reaching inf):
>>
>> >>> x.is_integer()
>> True
>> >>> (math.sqrt(x**2)).is_integer()
>> False
[Mark Dickinson ]
> If you have a moment to share it, I'd be interested to know what value of
> `x`
[Chris Barker ]
> ...
> ... "is it the "right" thing to do in most cases, when deployed by folks
> that haven't thought deeply about floating point.
Gimme a break ;-) Even people who _believe_ they've thought about
floating point still litter the bug tracker with
>>> .1 + .2
0.30000000000000004
[Kirill Balunov ]
> ...
> In spite of the fact that the pronouncement has
> already been made, there may still be an opportunity to influence this
> decision.
That's not really how this works. Guido has been doing this for
decades, and when he Pronounces he's done
[Tim]
>> from trig functions doing argument reduction as if pi were represented
>> with infinite precision,
[Greg Ewing ]
> That sounds like an interesting trick! Can you provide
> pointers to any literature describing how it's done?
>
> Not doubting it's possible,
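Not the literature Greg asked for, but here's the idea in miniature (a
sketch of mine; production libms use the Payne-Hanek reduction): carry
pi to far more digits than a double holds, subtract the nearest
multiple exactly, then evaluate sin on the small remainder.

import math
from fractions import Fraction

# 50 decimal digits of pi; real implementations use a few hundred bits.
PI = Fraction("3.14159265358979323846264338327950288419716939937510")

def sin_reduced(x):
    q = round(Fraction(x) / PI)      # exact rational arithmetic
    r = float(Fraction(x) - q * PI)  # remainder in [-pi/2, pi/2]
    return math.sin(r) if q % 2 == 0 else -math.sin(r)

print(sin_reduced(1e22))  # -0.8522008497671888
print(math.sin(1e22))     # matches on platforms whose sin reduces well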
[Guido]
>> You don't seem to grasp the usability improvements this will give.
>> I hear you but at this point appeals to Python's "Zen" don't help you.
[Łukasz Langa ]
> This reads dismissive to me. I did read the PEP and followed the discussion on
> python-dev. I referred to PEP
[Tim]
>> To my eyes, this is genuinely harder to follow, despite its relative brevity:
>>
>> while total != (total := total + term):
[Antoine]
> Does it even work? Perhaps if the goal is to stop when total is NaN,
> but otherwise?
I don't follow you. You snipped all the text explaining
[Tim]
To my eyes, this is genuinely harder to follow, despite its relative
brevity:
while total != (total := total + term):
[Antoine]
>>> Does it even work? Perhaps if the goal is to stop when total is NaN,
>>> but otherwise?
[Chris]
>> Yes, it does, because the
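Filling in the context, here's the loop in runnable form (my
scaffolding around the quoted one-liner): it keeps summing until
adding the next term no longer changes the float total.

import math

def exp_taylor(x):
    total, term, n = 0.0, 1.0, 0
    # The old value of `total` is loaded before the walrus rebinds it,
    # so the comparison is old-total != new-total.
    while total != (total := total + term):
        n += 1
        term *= x / n
    return total

print(exp_taylor(1.0), math.exp(1.0))  # both ~2.718281828459045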
[Kirill Balunov]
> Not sure, but if additional motivating examples are required, there is a
> common pattern for dynamic attribute lookup (snippet from `copy.py`):
>
> reductor = dispatch_table.get(cls)
> if reductor:
>     rv = reductor(x)
> else:
>     reductor = getattr(x,
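For what it's worth, here's how that chain might read with inline
assignment (a sketch; the continuation after "getattr(x," is my
assumption, based on CPython's copy.py):

def find_reductor(x, dispatch_table):
    cls = type(x)
    if reductor := dispatch_table.get(cls):
        return reductor(x)
    if reductor := getattr(x, "__reduce_ex__", None):
        return reductor(4)
    raise TypeError(f"cannot reduce {cls.__name__!r} objects")

print(find_reductor([1, 2, 3], {}))  # falls back to list's __reduce_ex__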
[Tim]
>> So, to match your sarcasm, here's mine: try using a feature for what
>> it's good at instead of for what it's bad at ;-)
[Lukasz Langa ]
> Yes, this is the fundamental wisdom. Judging which is which is left as an
> exercise to the programmer.
>
> With this, I'm leaving
[Chris Barker]
>>> So what about:
>>>
>>> l = [x:=i for i in range(3)]
>>>
>>> vs
>>>
>>> g = (x:=i for i in range(3))
>>>
>>> Is there any way to keep these consistent if the "x" is in the regular
>>> local scope?
[Tim]
>> I'm not clear on what the question is. The list comprehension would
>> bind
[Chris]
> yes, it was a contrived example, but the simplest one I could think of off
> the top of my head that re-bound a name in the loop -- which was what I
> thought was the entire point of this discussion?
But why off the top of your head? There are literally hundreds & hundreds
of prior
[Guido]
> ...
> Given that definition of `__parentlocal`, in first approximation the
> scoping rule proposed by PEP 572 would then be: In comprehensions
> (which in my use in the PEP 572 discussion includes generator
> expressions) the targets of inline assignments are automatically
> endowed with
[Chris Barker]
> ...
> So what about:
>
> l = [x:=i for i in range(3)]
>
> vs
>
> g = (x:=i for i in range(3))
>
> Is there any way to keep these consistent if the "x" is in the regular
> local scope?
I'm not clear on what the question is. The list comprehension would
bind `l` to [0, 1, 2] and
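To make the difference concrete (my summary of the PEP 572 semantics
as accepted: the target escapes to the enclosing scope in both forms,
but a generator only binds it when consumed):

l = [x := i for i in range(3)]
print(l, x)        # [0, 1, 2] 2 -- bound eagerly

g = (y := i for i in range(3))
print(list(g), y)  # [0, 1, 2] 2 -- y only bound while g is consumed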
[MRAB]
>> If I want to cache some objects, I put them in a dict, using the id as
>> the key. If I wanted to locate an object in a cache and didn't have
>> id(), I'd have to do a linear search for it.
[Greg Ewing ]
> That sounds dangerous. An id() is only valid as long as the object
> it came
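A miniature of the hazard Greg describes (CPython frequently, though
not dependably, recycles the address and hence the id):

a = object()
key = id(a)
del a                # the cached key now refers to a dead object
b = object()
print(id(b) == key)  # often True: a brand-new object wearing a's old id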
[Gregory P. Smith ]
> Good point, I hadn't considered that it was regular common ref
> count 0 dealloc chaining.
It pretty much has to be whenever you see a chain of XXX_dealloc
routines in a stack trace. gcmodule.c never even looks at a
tp_dealloc slot directly, let alone directly invoke a
[Gregory P. Smith ]
> ...
> A situation came up the other day where I believe this could've helped.
>
> Scenario (admittedly not one most environments run into): A Python process
> with a C++ extension module implementing a threaded server (threads
> spawned by C++) that could call back into
[Tim]
> Key invariants:
> ...
> 2. nfp2lasta[pa->nfreepools] == pa if and only if pa is the only arena
> in usable_arenas with that many free pools.
Ack! Scratch that. I need a nap :-(
In fact if that equality holds, it means that nfp2lasta entry has to
change if pa is moved and pa->prevarena
[Tim]
> I'll note that the approach I very briefly sketched before
> (restructure the list of arenas to partition it into multiple lists
> partitioned by number of free pools) "should make" obmalloc
> competitive with malloc here ...
But it's also intrusive, breaking up a simple linked list into
[Larry Hastings ]
> I have a computer with two Xeon CPUs and 256GB of RAM. So, even
> though it's NUMA, I still have 128GB of memory per CPU. It's running a
> "spin" of Ubuntu 18.10.
>
> I compiled a fresh Python 3.7.3 --with-optimizations. I copied the sample
> program straight off the
I made a pull request for this that appears to work fine for my 10x
smaller test case (reduces tear-down time from over 45 seconds to over
7). It implements my second earlier sketch (add a vector of search
fingers, to eliminate searches):
https://github.com/python/cpython/pull/13612
It would be
The PR for this looks good to go:
https://github.com/python/cpython/pull/13612
But, I still have no idea how it works for the OP's original test
case. So, if you have at least 80 GB of RAM to try it, I added
`arena.py` to the BPO report:
https://bugs.python.org/issue37029
That adds code to
[Tim]
>> I'm keen to get feedback on this before merging the PR, because this
>> case is so very much larger than anything I've ever tried that I'm
>> wary that there may be more than one "surprise" lurking here. ...
[Inada Naoki ]
> I started r5a.4xlarge EC2 instance and started arena.py.
> I
To be clearer, while knowing the size of allocated objects may be of
some use to some other allocators, "not really" for obmalloc. That
one does small objects by itself in a uniform way, and punts
everything else to the system malloc family. The _only_ thing it
wants to know on a free/realloc is
[Tim]
>> But I don't know what you mean by "access memory in random order to
>> iterate over known objects". obmalloc never needs to iterate over
>> known objects - indeed, it contains no code capable of doing that.
>> Our cyclic gc does, but that's independent of obmalloc.
[Antoine]
> It's
[Antoine Pitrou ]
> But my response was under the assumption that we would want obmalloc to
> deal with all allocations.
I didn't know that. I personally have no interest in that: if we
want an all-purpose allocator, there are several already to choose
from. There's no reason to imagine we
[Antoine Pitrou ]
> The interesting thing here is that in many situations, the size is
> known up front when deallocating - it is simply not communicated to the
> deallocator because the traditional free() API takes a sole pointer,
> not a size. But CPython could communicate that size easily if
[Antoine Pitrou, replying to Thomas Wouters]
> Interesting that a 20-year simple allocator (obmalloc) is able to do
> better than the sophisticated TCMalloc.
It's very hard to beat obmalloc (O) at what it does. TCMalloc (T) is
actually very similar where they overlap, but has to be more complex
[Inada Naoki, to Neil S]
> Oh, do you mean your branch doesn't have headers in each page?
That's probably right ;-) Neil is using a new data structure, a radix
tree implementing a sparse set of arena addresses. Within obmalloc
pools, which can be of any multiple-of-4KiB (on a 64-bit box) size,
[Antoine]
> We moved from malloc() to mmap() for allocating arenas because of user
> requests to release memory more deterministically:
>
> https://bugs.python.org/issue11849
Which was a good change! As was using VirtualAlloc() on Windows.
None of that is being disputed. The change under
[Neil Schemenauer ]
> ...
> BTW, the current radix tree doesn't even require that pools are
> aligned to POOL_SIZE. We probably want to keep pools aligned
> because other parts of obmalloc rely on that.
obmalloc relies on it heavily. Another radix tree could map block
addresses to all the
[Inada Naoki ]
> obmalloc is very nice at allocating small (~224 bytes) memory blocks.
> But it seems current SMALL_REQUEST_THRESHOLD (512) is too large to me.
For the "unavoidable memory waste" reasons you spell out here,
Vladimir deliberately set the threshold to 256 at the start. As
things
[Neil Schemenauer ]
> I've done a little testing the pool overhead. I have an application
> that uses many small dicts as holders of data. The function:
>
> sys._debugmallocstats()
>
> is useful to get stats for the obmalloc pools. Total data allocated
> by obmalloc is 262 MB. At the
[Tim]
> For the current obmalloc, I have in mind a different way ...
> Not ideal, but ... captures the important part (more objects
> in a pool -> more times obmalloc can remain in its
> fastest "all within the pool" paths).
And now there's a PR that removes obmalloc's limit on pool sizes, and,
[Tim]
> ...
> Here are some stats from running [memcrunch.py] under
> my PR, but using 200 times the initial number of objects
> as the original script:
>
> n = 2000 #number of things
>
> At the end, with 1M arena and 16K pool:
>
> 3362 arenas * 1048576 bytes/arena = 3,525,312,512
> #
And one more random clue.
The memcrunch.py attached to the earlier-mentioned bug report does
benefit a lot from changing to a "use the arena with the smallest
address" heuristic, leaving 86.6% of allocated bytes in use by objects
at the end (this is with the arena-thrashing fix, and the current
[Tim]
> ...
> Now under 3.7.3. First when phase 10 is done building:
>
> phase 10 adding 9953410
> phase 10 has 16743920 objects
>
> # arenas allocated total = 14,485
> # arenas reclaimed = 2,020
> # arenas highwater mark =
Heh. I wasn't intending to be nasty, but this program makes our arena
recycling look _much_ worse than memcrunch.py does. It cycles through
phases. In each phase, it first creates a large randomish number of
objects, then deletes half of all objects in existence. Except that
every 10th phase,
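A rough reconstruction of that pattern (mine, not the actual program;
the every-10th-phase twist is cut off above, so it's omitted here):

import random

objs = []
for phase in range(1, 11):
    objs.extend([i] for i in range(random.randrange(100_000, 1_000_000)))
    random.shuffle(objs)           # deletions then hit arbitrary arenas
    del objs[len(objs) // 2:]      # delete half of everything alive
    print(f"phase {phase} has {len(objs)} objects")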
[Tim]
> - For truly effective RAM releasing, we would almost certainly need to
> make major changes, to release RAM at an OS page level. 256K arenas
> were already too fat a granularity.
We can approximate that closely right now by using 4K pools _and_ 4K
arenas: one pool per arena, and
[Antoine Pitrou ]
> For the record, there's another contender in the allocator
> competition now:
> https://github.com/microsoft/mimalloc/
Thanks! From a quick skim, most of it is addressing things obmalloc doesn't:
1) Efficient thread safety (we rely on the GIL).
2) Directly handling requests
[Inada Naoki]
>> Increasing pool size is one obvious way to fix these problems.
>> I think 16KiB pool size and 2MiB (huge page size of x86) arena size is
>> a sweet spot for recent web servers (typically, about 32 threads, and
>> 64GiB), but there is no evidence about it.
[Antoine]
> Note that
[Tim]
>> I don't think we need to cater anymore to careless code that mixes
>> system memory calls with O calls (e.g., if an extension gets memory
>> via `malloc()`, it's its responsibility to call `free()`), and if not
>> then `address_in_range()` isn't really necessary anymore either, and
>>
[Tim]
> The radix tree generally appears to be a little more memory-frugal
> than my PR (presumably because my need to break "big pools" into 4K
> chunks, while the tree branch doesn't, buys the tree more space to
> actually store objects than it costs for the new tree).
It depends a whole lot on
[Thomas]
>>> And what would be an efficient way of detecting allocations punted to
>>> malloc, if not address_in_range?
[Tim]
>> _The_ most efficient way is the one almost all allocators used long
>> ago: use some "hidden" bits right before the address returned to the
>> user to store info about
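A toy simulation of that trick (in Python purely for illustration; a
real allocator does this with raw pointer arithmetic in C):

import struct

HEADER = 8  # bytes reserved in front of each block for its size

def toy_malloc(heap, offset, size):
    struct.pack_into("<Q", heap, offset, size)  # hidden header
    return offset + HEADER                      # address the user sees

def toy_free(heap, addr):
    (size,) = struct.unpack_from("<Q", heap, addr - HEADER)
    return size  # a real free() would link the block into a free list

heap = bytearray(1024)
p = toy_malloc(heap, 0, 100)
print(toy_free(heap, p))  # 100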
[Tim]
>> At the start, obmalloc never returned arenas to the system. The vast
>> majority of users were fine with that.
[Neil]
> Yeah, I was totally fine with that back in the day. However, I
> wonder now if there is a stronger reason to try to free memory back
> to the OS. Years ago, people
[Tim, to Neil]
>> Moving to bigger pools and bigger arenas are pretty much no-brainers
>> for us, [...]
[Antoine]
> Why "no-brainers"?
We're running tests, benchmarks, the Python programs we always run,
Python programs that are important to us, staring at obmalloc stats
... and seeing nothing
There's a Stackoverflow report[1] I suspect is worth looking into, but
it requires far more RAM (over 80GB) than I have. The OP whittled it
down to a reasonably brief & straightforward pure Python 3 program.
It builds a ternary search tree, with perhaps a billion nodes. The
problem is that it
[Inada Naoki ]
> ...
> 2. This loop is clearly hot:
> https://github.com/python/cpython/blob/51aa35e9e17eef60d04add9619fe2a7eb938358c/Objects/obmalloc.c#L1816-L1819
Which is 3 lines of code plus a closing brace. The OP didn't build
their own Python, and the source from which it was compiled
It seems pretty clear now that the primary cause is keeping arenas
sorted by number of free pools. As deallocation goes on, the number of
distinct "# of free pools" values decreases, leaving large numbers of
arenas sharing the same value. Then, e.g., if there are 10,000 arenas
with 30 free pools
[Inada Naoki ]
> For the record, result for 10M nodes, Ubuntu 18.04 on AWS r5a.4xlarge:
I'm unclear on what "nodes" means. If you mean you changed 27M to 10M
in this line:
for token in random_strings(27_000_000):
that's fine, but there are about 40 times more than that `Node`
objects
[Larry Hastings ]
> Guido just stopped by--we're all at the PyCon 2019 dev sprints--and we had
> a chat about it. Guido likes it but wanted us to restore a little of the
> magical
> behavior we had in "!d": now, = in f-strings will default to repr (!r), unless
> you specify a format spec. If you
[Jordan Adler ]
> Through the course of work on the future polyfills that mimic the behavior
> of Py3 builtins across versions of Python, we've discovered that the
> equality check behavior of at least some builtin types do not match the
> documented core data model.
>
> Specifically, a comparison
[Antoine Pitrou ]
>> Ah, interesting. Were you able to measure the memory footprint as well?
[Inada Naoki ]
> Hmm, it is not good. mimalloc uses MADV_FREE so it may affect to some
> benchmarks. I will look it later.
>
> ```
> $ ./python -m pyperf compare_to pymalloc-mem.json mimalloc-mem.json
[Victor Stinner ]
> I guess that INADA-san used pyperformance --track-memory.
>
> pyperf --track-memory doc:
> "--track-memory: get the memory peak usage. it is less accurate than
> tracemalloc, but has a lower overhead. On Linux, compute the sum of
> Private_Clean and Private_Dirty memory
[Pablo Galindo Salgado ]
> Recently, we moved the optimization for the removal of dead code of the form
>
> if 0:
>
>
> to the ast so we use JUMP bytecodes instead (being completed in PR14116). The
> reason is that currently, any syntax error in the block will never be
> reported.
> For
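An illustration of the class of bug being fixed (my example; exact
behavior depends on the version): errors inside a constant-false block
could escape checking when the block was dropped early.

src = "if 0:\n    continue\n"
try:
    compile(src, "<test>", "exec")
    print("compiled: dead-code removal hid the misplaced 'continue'")
except SyntaxError as exc:
    print("SyntaxError:", exc)  # expected once the block is still checked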
[Inada Naoki]
>> So I tried to use LIKELY/UNLIKELY macro to teach compiler hot part.
>> But I need to use
>> "static inline" for pymalloc_alloc and pymalloc_free yet [1].
[Neil Schemenauer]
> I think LIKELY/UNLIKELY is not helpful if you compile with LTO/PGO
> enabled.
I like adding those
[Inada Naoki , trying mimalloc]
>>> Hmm, it is not good. mimalloc uses MADV_FREE so it may affect to some
>>> benchmarks. I will look it later.
>> ...
>> $ ./python -m pyperf compare_to pymalloc-mem.json mimalloc-mem.json -G
>> Slower (60):
>> - logging_format: 10.6 MB +- 384.2 kB -> 27.2 MB
[This is about the mailing list, not about Python development]
That python-dev-owner has gotten two complaints about this message so
far suggests I should explain what's going on ;-)
New list members are automatically moderated. Their posts sit in a
moderation queue waiting for moderator
[Tim]
> While python-dev has several "official" moderators, best I can tell
> I'm the only one who has reviewed these messages for years.
I should clarify that! That's not meant to be a dig at the other
moderators. I review everything because I'm retired and am near the
computer many hours
[Mariatta ]
> - Since this is a 1st-time contributor, does it need a change to the ACKS
> file?
>
> I think the change is trivial enough, the misc/acks is not necessary.
>
> - Anything else?
>
>
> 1. Does it need to be backported? If so, please add the "needs backport to
> .." label.
>
> 2. Add the
[Brett Cannon ]
> We probably need to update https://devguide.python.org/committing/ to
> have a step-by-step list of how to make a merge work and how to
> handle backports instead of the wall of text that we have. (It's already
> outdated anyway, e.g. `Misc/ACKS` really isn't important as git
https://github.com/python/cpython/pull/13482
is a simple doc change for difflib, which I approved some months ago.
But I don't know the current workflow well enough to finish it myself.
Like:
- Does something special need to be done for doc changes?
- Since this is a 1st-time contributor,
[Barry Warsaw ]
> bpo-37757: https://bugs.python.org/issue37757
Really couldn't care less whether it's TargetScopeError or
SyntaxError, but don't understand the only rationale given here for
preferring the latter:
> To me, “TargetScopeError” is pretty obscure and doesn’t give users an
> obvious
[Guido]
> I don't see how this debate can avoid a vote in the Steering Council.
FWIW, I found Nick's last post wholly persuasive: back off to
SyntaxError for now, and think about adding a more specific exception
later for _all_ cases (not just walrus) in which a scope conflict
isn't allowed
Short course: a replacement for malloc for use in contexts that can't
"move memory" after an address is passed out, but want/need the
benefits of compactification anyway.
Key idea: if the allocator dedicates each OS page to requests of a
specific class, then consider two pages devoted to the
[Inada Naoki, looking into why mimalloc did so much better on spectral_norm]
> I compared "perf" output of mimalloc and pymalloc, and I succeeded to
> optimize pymalloc!
>
> $ ./python bm_spectral_norm.py --compare-to ./python-master
> python-master: . 199 ms +- 1 ms
>
[Skip Montanaro ]
> ...
> I don't think stable code which uses macros should be changed (though
> I see the INCREF/DECREF macros just call private inline functions, so
> some conversion has clearly been done). Still, in new code, shouldn't
> the use of macros for more than trivial use cases
[Raymond]
> ...
> * The ordering we have for dicts uses a hash table that indexes into a
> sequence.
> That works reasonably well for typical dict operations but is unsuitable for
> set
> operations where some common use cases make interspersed additions
> and deletions (that is why the LRU
[Petr Viktorin ]
> ...
> Originally, making dicts ordered was all about performance (or rather
> memory efficiency, which falls in the same bucket.) It wasn't added
> because it's better semantics-wise.
As I tried to flesh out a bit in a recent message, the original
"compact dict" idea got all
[Tim]
> BTW, what should
>
> {1, 2} | {3, 4, 5, 6, 7}
>
> return as ordered sets? Beats me ;-)
[Larry]
> The obvious answer is {1, 2, 3, 4, 5, 6, 7}.
Why? An obvious implementation that doesn't ignore performance entirely is:
def union(smaller, larger):
    if len(larger) <
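My guess at where that sketch was headed, using dicts to stand in for
ordered sets: start from the larger operand for speed, which makes the
result's order depend on operand sizes rather than left-to-right order.

def union(smaller, larger):
    if len(larger) < len(smaller):
        smaller, larger = larger, smaller
    result = dict(larger)        # iterate the larger operand just once
    result.update(smaller)
    return result

a = dict.fromkeys([1, 2])
b = dict.fromkeys([3, 4, 5, 6, 7])
print(list(union(a, b)))  # [3, 4, 5, 6, 7, 1, 2], not [1, 2, 3, 4, 5, 6, 7]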
[Larry]
> Didn't some paths also get slightly slower as a result of maintaining
> insertion order when mixing insertions and deletions?
I paid no attention at the time. But in going from "compact dict" to
"ordered dict", deletion all by itself got marginally cheaper. The
downside was the
[Tim]
>> If it's desired that "insertion order" be consistent across runs,
>> platforms, and releases, then what "insertion order" _means_ needs to
>> be rigorously defined & specified for all set operations. This was
>> comparatively trivial for dicts, because there are, e.g., no
>> commutative
...
[Larry]
>> One prominent Python core developer** wanted this feature for years, and I
>> recall
>> them saying something like:
>>
>> Guido says, "When a programmer iterates over a dictionary and they see the
>> keys
>> shift around when the dictionary changes, they learn something!" To
[Larry]
> "I don't care about performance" is not because I'm aching for Python to
> run my code slowly. It's because I'm 100% confident that the Python
> community will lovingly optimize the implementation.
I'm not ;-)
> So when I have my language designer hat on, I really don't concern
[Larry Hastings ]
> As of 3.7, dict objects are guaranteed to maintain insertion order. But set
> objects make no such guarantee, and AFAIK in practice they don't maintain
> insertion order either.
If they ever appear to, it's an accident you shouldn't rely on.
> Should they?
From Raymond,
[Guido]
> ...
> the language should not disappoint them, optimization opportunities be damned.
I would like to distinguish between two kinds of "optimization
opportunities": theoretical ones that may or may not be exploited
some day, and those that CPython has _already_ exploited.
That is, we
[Nick Coghlan ]
> Starting with "collections.OrderedSet" seems like a reasonable idea,
> though - that way "like a built-in set, but insertion order preserving" will
> have an obvious and readily available answer, and it should also
> make performance comparisons easier.
Ya, I suggested starting
[Nick]
> I took Larry's request a slightly different way:
Sorry, I was unclear: by "use case" I had in mind what appeared to me
to be the overwhelming thrust of the _entirety_ of this thread so far,
not Larry's original request.
> he has a use case where he wants order preservation (so built in
[David Mertz ]
> It's not obvious to me that insertion order is even the most obvious or
> most commonly relevant sort order. I'm sure it is for Larry's program, but
> often a work queue might want some other order. Very often queues
> might instead, for example, have a priority number assigned
[Nick]
> I must admit that I was assuming without stating that a full OrderedSet
> implementation would support the MutableSequence interface.
Efficient access via index position too would be an enormous new
requirement. My bet: basic operations would need to change from O(1)
to O(log(N)).
[Wes Turner ]
>> How slow and space-inefficient would it be to just implement the set methods
>> on top of dict?
[Inada Naoki ]
> Speed: Dict doesn't cache the position of the first item. Calling
> next(iter(D)) repeatedly is O(N) in worst case.
> ...
See also Raymond's (only) message in this
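Inada-san's point is easy to see directly (my demo; CPython-specific):
deleting from the "front" of a dict leaves holes that every fresh
iterator must skip over again, so draining it this way is quadratic.

d = dict.fromkeys(range(100_000))
while d:
    first = next(iter(d))  # rescans all the holes left so far
    del d[first]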
[Inada Naoki ]
> I just meant the performance of the next(iter(D)) is the most critical part
> when you implement orderdset on top of the current dict and use it as a queue.
Which is a good point. I added a lot more, though, because Wes didn't
even mention queues in his question:
[Wes Turner ]
PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x
and y are the same object, then equality comparison returns True and
inequality False. No attempt is made to execute __eq__ or __ne__
methods in those cases.
This has visible consequences all over the place, but they don't
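One standard illustration (mine): NaN compares unequal to itself, yet
containment and list equality "work" because identical objects are
presumed equal before __eq__ is ever consulted.

import math

nan = math.nan
print(nan == nan)      # False
print(nan in [nan])    # True: the identity shortcut fires
print([nan] == [nan])  # True: same object at the same position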
[Tim]
>> PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x
>> and y are the same object, then equality comparison returns True
>> and inequality False. No attempt is made to execute __eq__ or
>> __ne__ methods in those cases.
>> ...
>> If it's intended that Python-the-language
[Inada Naoki ]
> FWIW, (list|tuple).__eq__ and (list|tuple).__contains__ uses it too.
> It is very important to compare recursive sequences.
>
> >>> x = []
> >>> x.append(x)
> >>> y = [x]
> >>> z = [x]
> >>> y == z
> True
That's a visible consequence, but I'm afraid this too must be
considered an
[Terry Reedy ]
> ...
> It is, in the section on how to understand and use value comparison
> *operators* ('==', etc.).
> https://docs.python.org/3/reference/expressions.html#value-comparisons
>
> First "The default behavior for equality comparison (== and !=) is based
> on the identity of the
[Tim]
>> I think it needs more words, though, to flesh out what about this is
>> allowed by the language (as opposed to what CPython happens to do),
>> and to get closer to what Guido is trying to get at with his
>> "*implicit* calls". For example, it's at work here, but there's not a
>> built-in
[Terry Reedy ]
[& skipping all the parts I agree with]
> ...
> Covered by "For user-defined classes which do not define __contains__()
> but do define __iter__(), x in y is True if some value z, for which the
> expression x is z or x == z is true, is produced while iterating over y.
> " in
>
>
[Guido]
> Honestly that looked like a spammer.
I approved the message, and it looked like "probably spam" to me too.
But it may have just been a low-quality message, and the new moderator
UI still doesn't support adding custom text to a rejection message.
Under the old system, I _would_ have
[Serhiy Storchaka]
> This is not the only difference between '.17g' and repr().
>
> >>> '%.17g' % 1.23456789
> '1.2345678899999999'
> >>> format(1.23456789, '.17g')
> '1.2345678899999999'
> >>> repr(1.23456789)
> '1.23456789'
More amazingly ;-), repr() isn't even always the same as a %g format
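The underlying reason (my example): since 3.1, repr() computes the
shortest string that round-trips, which no fixed-precision %g format
reproduces in general.

x = 0.1
print('%.17g' % x)          # 0.10000000000000001
print(repr(x))              # 0.1
print(float(repr(x)) == x)  # True: the short form still round-trips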
[Pau Freixes ]
> Recently I've been facing a really weird bug where a Python program
> was randomly segfaulting during the finalization, the program was
> using some C extensions via Cython.
There's nothing general that can be said that would help. These
things require excruciating details to
Sorry! A previous attempt to reply got sent before I typed anything :-(
Very briefly:
> >>> timeit.timeit("set(i for i in range(1000))", number=100_000)
[and other examples using a range of integers]
The collision resolution strategy for sets evolved to be fancier than
for dicts, to reduce
>>> data structure that clearly describes what it does based on the name alone,
>>> IMO that's a million times better for readability purposes.
>>>
>>> Also, this is mostly speculation since I haven't ran any benchmarks for an
>>> OrderedSet implementation, but
[Kyle]
> ...
> For some reason, I had assumed in the back of my head (without
> giving it much thought) that the average collision rate would be the
> same for set items and dict keys. Thanks for the useful information.
I know the theoretical number of probes for dicts, but not for sets
anymore.
>> Also, I believe that max "reasonable" integer range of no collision
>> is (-2305843009213693951, 2305843009213693951), ...
> Any range that does _not_ contain both -2 and -1 (-1 is an annoying
> special case, with hash(-1) == hash(-2) == -2), and spans no more than
> sys.hash_info.modulus
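Both facts are easy to check directly (CPython on a 64-bit build):

import sys

print(hash(-1), hash(-2))     # -2 -2 (-1 is reserved as a C error code)
print(sys.hash_info.modulus)  # 2305843009213693951 == 2**61 - 1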
[Tim]
>> - I don't have a theory for why dict build time is _so_ much higher
>> than dict lookup time for the nasty keys.
To be clearer, in context this was meant to be _compared to_ the
situation for sets. These were the numbers:
11184810 nasty keys
dict build 23.32
dict lookup