Ok, how about merging the two sub-threads :-)

On Mon, 18 Nov 2013 16:44:59 -0600
Tim Peters <tim.pet...@gmail.com> wrote:
> [Antoine]
> > You can't know how much space the pickle will take until the pickling
> > ends, though, which makes it difficult to decide whether you want to
> > emit a PREFETCH opcode or not.
>
> Ah, of course. Presumably the outgoing pickle stream is first stored
> in some memory buffer, right? If pickling completes before the buffer
> is first flushed, then you know exactly how large the entire pickle
> is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
> part. Else do.

That's true. We could also have a SMALLPREFETCH opcode with a one-byte
length to still get the benefits of prefetching.

> > Well, yes: much better memory usage for large pickles.
> > Some people use pickles to store huge data, which was the motivation to
> > add the 8-byte-size opcodes after all.
>
> We'd have the same advantage _if_ it were feasible to know the entire
> size up front. I understand now that it's not feasible.

AFAICT, it would only be possible by doing two-pass pickling, which
would also slow it down massively.

> A long-running process can legitimately put billions of items on work
> queues, far more than could ever fit in RAM simultaneously. Comparing
> this to PyObject overhead makes no sense to me. Neither does the line
> of argument "there are several kinds of overheads, so making this
> overhead worse too doesn't matter".

Well, it's a question of cost / benefit: does it make sense to optimize
something that will be dwarfed by other factors in real world
situations?

> When possible, we should strive not to add overheads that don't repay
> their costs. For small pickles, an 8-byte size field doesn't appear
> to buy anything. But I appreciate that it costs implementation effort
> to avoid producing it in these cases.

I share the concern, although I still don't think the "ocean of tiny
pickles" is a reasonable use case :-)

That said, assuming you think this is important (do you?), we're left
with the following constraints:
- it would be nice to have this PEP in 3.4
- 3.4 beta1 and feature freeze is in approximately one week
- switching to the PREFETCH scheme requires some non-trivial work on the
  current patch, work done by either Alexandre or me (but I already
  have pathlib (PEP 428) on my plate, so it'll have to be Alexandre) -
  unless you want to do it, of course?

What do you think?

Regards

Antoine.
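
For concreteness, here is a rough Python sketch of the "buffer first, then
decide" idea discussed above, with the SMALLPREFETCH variant as an option.
It is only an illustration under stated assumptions, not the code in the
current patch: the opcode byte values, the function names, and the
small_opcode flag are all invented here, the headers are simply prepended
to an ordinary protocol 3 pickle, and the reader assumes a seekable stream.

```python
import io
import pickle
import struct

# Hypothetical opcode bytes, invented for this sketch. They are not real
# pickle opcodes and the stdlib unpickler does not understand them.
PREFETCH = b"\xf0"       # followed by an 8-byte little-endian length
SMALLPREFETCH = b"\xf1"  # followed by a 1-byte length
SMALL_LIMIT = 100        # the "say, < 100 bytes" threshold from the thread


def frame_pickle(obj, small_opcode=False):
    """Pickle into memory first, then choose a header from the final size."""
    body = pickle.dumps(obj, protocol=3)  # size is known only once this ends
    n = len(body)
    if n < SMALL_LIMIT and not small_opcode:
        return body                                # tiny pickle, no header
    if n < 256 and small_opcode:
        return SMALLPREFETCH + bytes([n]) + body   # 2 bytes of overhead
    return PREFETCH + struct.pack("<Q", n) + body  # 9 bytes of overhead


def read_framed(stream):
    """Read one object written by frame_pickle from a seekable stream."""
    first = stream.read(1)
    if first == PREFETCH:
        (n,) = struct.unpack("<Q", stream.read(8))
        return pickle.loads(stream.read(n))        # one big read
    if first == SMALLPREFETCH:
        n = stream.read(1)[0]
        return pickle.loads(stream.read(n))
    # No header: protocol 3 pickles start with the PROTO opcode (0x80),
    # so step back one byte and let the unpickler read as it goes.
    stream.seek(-1, io.SEEK_CUR)
    return pickle.load(stream)
```

With this sketch, frame_pickle(1) returns the bare pickle,
frame_pickle(1, small_opcode=True) adds a two-byte header, and a large list
gets the full nine-byte PREFETCH header. The cost is that the whole pickle
sits in memory before the first byte is written, which is the trade-off
being discussed for huge pickles.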
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev