Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Nathaniel Smith
On Fri, Jan 20, 2017 at 7:39 PM, Nathaniel Smith  wrote:
> [...]
> Some strategies that you might find helpful (or not):

Oh right, and of course just after I hit send I realized I forgot one
of my favorites!

- come up with a real chunk of code from a real project that would
benefit from the change being proposed, and show what it looks like
before/after the feature is added. This can be incredibly persuasive
*but* it's *super important* that the code be as real as possible. The
ideal is for it to solve a *concrete* *real-world* problem that can be
described in a few sentences, and be drawn from a real code base that
faces that problem. One of the biggest challenges for maintainers is
figuring out how Python is actually used in the real world, because we
all have very little visibility outside our own little bubbles, so
people really appreciate this -- but at the same time, python-ideas is
absolutely awash with people coming up with weird hypothetical
situations where their pet idea would be just the ticket, so anything
that comes across as cherry-picked like that tends to be heavily
discounted. Sure, there *are* situations where the superpower of
breathing underwater can help you fight crime, but...
  http://strongfemaleprotagonist.com/issue-6/page-63-3/
  http://strongfemaleprotagonist.com/issue-6/page-64-3/

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Nathaniel Smith
On Fri, Jan 20, 2017 at 3:37 PM, Elizabeth Myers
 wrote:
[...]
>> Some of the responses on the bug are discouraging... mostly they seem to
>> boil down to people either not wanting to expand the struct module, or
>> wanting to discourage its use. Everyone is a critic. I didn't know adding
>> two format specifiers was going to be this controversial. You'd think I
>> proposed adding braces or something :/.
>>
>> I'm hesitant to go forward on this until the bug has a resolution.
>> ___
>> Python-ideas mailing list
>> Python-ideas@python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
> Also, btw, adding 128-bit length specifiers sounds like a good idea in
> theory, but the difficulty stems from the fact there's no real native
> 128-bit type that's portable. I don't know much about how python handles
> big ints internally, either, but I could learn.

The "b128" in "uleb128" is short for "base 128"; it refers to how each
byte contains one 7-bit "digit" of the integer being encoded -- so
just like decimal needs 1 digit for 0-9, 2 digits for 10 - 99 = (10**2
- 1), etc., uleb128 uses 1 byte for 0-127, 2 bytes for 128 - 16383 =
(128**2 - 1), etc. In practice most implementations are written in C
and use some kind of native fixed width integer as the in-memory
representation, and just error out if asked to decode a uleb128 that
doesn't fit. In Python I suppose we could support encoding and
decoding arbitrary size integers if we really wanted, but I also doubt
anyone would be bothered if we were restricted to "only" handling
numbers between 0 and 2**64 :-).
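
To make that concrete, here's a minimal pure-Python sketch of the
encoding itself (not a proposed struct API, just the base-128 rule
described above):

    def uleb128_encode(n):
        # Encode a non-negative int as unsigned LEB128 bytes.
        out = bytearray()
        while True:
            byte = n & 0x7f               # low 7 bits
            n >>= 7
            if n:
                out.append(byte | 0x80)   # high bit set: more bytes follow
            else:
                out.append(byte)          # final byte, high bit clear
                return bytes(out)

    def uleb128_decode(buf, offset=0):
        # Decode an unsigned LEB128 int; return (value, new_offset).
        value = shift = 0
        while True:
            byte = buf[offset]
            offset += 1
            value |= (byte & 0x7f) << shift
            if not byte & 0x80:
                return value, offset
            shift += 7

    assert uleb128_encode(127) == b'\x7f'
    assert uleb128_encode(128) == b'\x80\x01'
    assert uleb128_decode(b'\x80\x01') == (128, 2)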

> I was looking into implementing this already, and it appears it should
> be possible by teaching the module that "not all data is fixed length"
> and allowing functions to report back (via a Py_ssize_t *) how much data
> was actually unpacked/packed. But again, waiting on that bug to have a
> resolution before I do anything. I don't want to waste hours of effort
> on something the developers ultimately decide they don't want and will
> just reject.

That's not really how Python bugs work in practice. For better or
worse (and it's very often both), CPython development generally
follows a traditional open-source model in which new proposals are
only accepted if they have a champion who's willing to run the
gauntlet of first proposing them, and then keep pushing forward
through the hail of criticism and bike-shedding from random
kibitzers. This is at least in part a test to see how
dedicated/stubborn you are about this feature. If you stop posting,
then what will happen is that everyone else stops posting too, and the
bug will just sit there unresolved indefinitely until you get (more)
frustrated and give up.

On the one hand, this does tend to guarantee that accepted proposals
are very high quality and solve some important issue (because if the
champion didn't *really care* about the issue, they wouldn't put up with
this). On the other hand, it's often pretty hellish for the
individuals involved, and probably drives away all kinds of helpful
contributions. But maybe it helps to know it's not personal? Having
social capital definitely helps, but well-known/experienced
contributors get put through this wringer too; the main difference is
that we do it with eyes open and have strategies for navigating the
system (at least until we get burned out).

Some strategies that you might find helpful (or not):

- realize that it's really common for someone to be all like "this is
TERRIBLE and should definitely not be merged because of <X>, which
is a TOTAL SHOW-STOPPER", but then if you ignore the histrionics and
move forward anyway, it often turns out that all that person
*actually* wanted was to see a brief paragraph in your design summary
that acknowledges that you are aware of the existence of <X>, and
once they see this they're happy. (See also: [1])

- speaking of which, it is often very helpful to write up a short
document to summarize and organize the different ideas proposed,
critiques raised, and what you conclude based on them! That's
basically what a "PEP" is - just an email in a somewhat standard
format that reviews all the important issues that were raised and then
says what you conclude and why, and which eventually also ends up on
the website as a record. If you decide to try this then there are some
guidelines [2][3] and a sample PEP [4] to start with. (The guidelines
make it sound much more formal and scary than it really is, though --
e.g. when they say "your submission may be AUTOMATICALLY REJECTED"
then in my experience what they actually mean is you might get a reply
back saying "hey fyi the formatter script barfed on your document
because you were missing a quote so I fixed it for you".) This
particular proposal is really on the lower boundary of needing a PEP
and you might well be able to push it through without one, but it

Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Elizabeth Myers
On 20/01/17 17:26, Elizabeth Myers wrote:
> On 20/01/17 16:46, Cameron Simpson wrote:
>> On 20Jan2017 14:47, Elizabeth Myers  wrote:
>>> 1) struct.unpack and struct.unpack_from should remain
>>> backwards-compatible. I don't want to return extra values from it like
>>> (length unpacked, (data...)) for that reason.
>>
>> Fully agree with this.
>>
>>> If the calcsize solution
>>> feels a bit weird (it isn't much less efficient, because strings store
>>> their length with them, so it's constant-time), there could also be new
>>> functions that *do* return the length if you need it. To me though, this
>>> feels like a use case for struct.iter_unpack.
>>
>> Often, maybe, but there are still going to be protocols that the new
>> format doesn't support, where the performant thing to do (in pure
>> Python) is to scan what you can with struct and "hand scan" the special
>> bits with special code. 
>> Consider, for example, a format like MP4/ISO14496, where there's a
>> regular block structure (which is somewhat struct parsable) that can
>> contain embedded arbitrarily weird information. Or the flipside where
>> struct parsable data are embedded in a format not supported by struct.
>>
>> The mixed situation is where you need to know where the parse got up
>> to.  Calling calcsize or its variable size equivalent after a parse
>> seems needlessly repetitive of the parse work.
>>
>> For myself, I would want there to be some kind of call that returned the
>> parse and the length scanned, with the historic interface preserved for
>> the fixed size formats or for users not needing the length.
>>
>>> 2) I want to avoid making a weird incongruity, where only
>>> variable-length strings return the length actually parsed.
>>
>> Fully agree. Arguing for two API calls: the current one and one that
>> also returns the scan length.
>>
>> Cheers,
>> Cameron Simpson 
> 
> Some of the responses on the bug are discouraging... mostly they seem to
> boil down to people either not wanting to expand the struct module, or
> wanting to discourage its use. Everyone is a critic. I didn't know adding
> two format specifiers was going to be this controversial. You'd think I
> proposed adding braces or something :/.
> 
> I'm hesitant to go forward on this until the bug has a resolution.
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
> 

Also, btw, adding 128-bit length specifiers sounds like a good idea in
theory, but the difficulty stems from the fact there's no real native
128-bit type that's portable. I don't know much about how python handles
big ints internally, either, but I could learn.

I was looking into implementing this already, and it appears it should
be possible by teaching the module that "not all data is fixed length"
and allowing functions to report back (via a Py_ssize_t *) how much data
was actually unpacked/packed. But again, waiting on that bug to have a
resolution before I do anything. I don't want to waste hours of effort
on something the developers ultimately decide they don't want and will
just reject.

--
Elizabeth
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Elizabeth Myers
On 20/01/17 16:46, Cameron Simpson wrote:
> On 20Jan2017 14:47, Elizabeth Myers  wrote:
>> 1) struct.unpack and struct.unpack_from should remain
>> backwards-compatible. I don't want to return extra values from it like
>> (length unpacked, (data...)) for that reason.
> 
> Fully agree with this.
> 
>> If the calcsize solution
>> feels a bit weird (it isn't much less efficient, because strings store
>> their length with them, so it's constant-time), there could also be new
>> functions that *do* return the length if you need it. To me though, this
>> feels like a use case for struct.iter_unpack.
> 
> Often, maybe, but there are still going to be protocols that the new
> format doesn't support, where the performant thing to do (in pure
> Python) is to scan what you can with struct and "hand scan" the special
> bits with special code. 
> Consider, for example, a format like MP4/ISO14496, where there's a
> regular block structure (which is somewhat struct parsable) that can
> contain embedded arbitrarily weird information. Or the flipside where
> struct parsable data are embedded in a format not supported by struct.
> 
> The mixed situation is where you need to know where the parse got up
> to.  Calling calcsize or its variable size equivalent after a parse
> seems needlessly repetitive of the parse work.
> 
> For myself, I would want there to be some kind of call that returned the
> parse and the length scanned, with the historic interface preserved for
> the fixed size formats or for users not needing the length.
> 
>> 2) I want to avoid making a weird incongruity, where only
>> variable-length strings return the length actually parsed.
> 
> Fully agree. Arguing for two API calls: the current one and one that
> also returns the scan length.
> 
> Cheers,
> Cameron Simpson 

Some of the responses on the bug are discouraging... mostly they seem to
boil down to people either not wanting to expand the struct module, or
wanting to discourage its use. Everyone is a critic. I didn't know adding
two format specifiers was going to be this controversial. You'd think I
proposed adding braces or something :/.

I'm hesitant to go forward on this until the bug has a resolution.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Cameron Simpson

On 20Jan2017 14:47, Elizabeth Myers  wrote:

> 1) struct.unpack and struct.unpack_from should remain
> backwards-compatible. I don't want to return extra values from it like
> (length unpacked, (data...)) for that reason.


Fully agree with this.


> If the calcsize solution
> feels a bit weird (it isn't much less efficient, because strings store
> their length with them, so it's constant-time), there could also be new
> functions that *do* return the length if you need it. To me though, this
> feels like a use case for struct.iter_unpack.


Often, maybe, but there are still going to be protocols that the new format 
doesn't support, where the performant thing to do (in pure Python) is to scan 
what you can with struct and "hand scan" the special bits with special code.  

Consider, for example, a format like MP4/ISO14496, where there's a regular 
block structure (which is somewhat struct parsable) that can contain embedded 
arbitrarily weird information. Or the flipside where struct parsable data are 
embedded in a format not supported by struct.


The mixed situation is where you need to know where the parse got up to.  
Calling calcsize or its variable size equivalent after a parse seems needlessly 
repetitive of the parse work.


For myself, I would want there to be some kind of call that returned the parse 
and the length scanned, with the historic interface preserved for the fixed 
size formats or for users not needing the length.
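
Concretely, the kind of wrapper I mean can be sketched against the current
module (a sketch only; it works just for fixed size formats, since a
variable length aware struct would have to report the consumed length
itself):

    import struct

    def unpack_from_with_offset(fmt, buf, offset=0):
        # Like struct.unpack_from, but also return the new offset.
        values = struct.unpack_from(fmt, buf, offset)
        return values, offset + struct.calcsize(fmt)

    # e.g. two big-endian uint16 fields:
    values, offset = unpack_from_with_offset('!HH', b'\x00\x01\x00\x02rest')
    # values == (1, 2), offset == 4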



> 2) I want to avoid making a weird incongruity, where only
> variable-length strings return the length actually parsed.


Fully agree. Arguing for two API calls: the current one and one that also 
returns the scan length.


Cheers,
Cameron Simpson 
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Paul Moore
On 20 January 2017 at 20:47, Elizabeth Myers  wrote:
> Two things:
>
> 1) struct.unpack and struct.unpack_from should remain
> backwards-compatible. I don't want to return extra values from it like
> (length unpacked, (data...)) for that reason. If the calcsize solution
> feels a bit weird (it isn't much less efficient, because strings store
> their length with them, so it's constant-time), there could also be new
> functions that *do* return the length if you need it. To me though, this
> feels like a use case for struct.iter_unpack.
>
> 2) I want to avoid making a weird incongruity, where only
> variable-length strings return the length actually parsed. This also
> doesn't really help with length calculations unless you're doing
> calcsize without the variable-length specifiers, then adding it on. It's
> just more of an annoyance.

Fair points, both. And you've clearly thought the issues through, so
I'm +1 on your decision. You have the actual use case, and I'm just
theorising, so I'm happy to defer the decision to you.

Paul
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Paul Moore
On 20 January 2017 at 18:18, Guido van Rossum  wrote:
> I'd be wary of making a grab-bag of small improvements, it encourages
> bikeshedding.

Agreed. Plus the bikeshedding and debating risks draining Elizabeth's
motivation.

Paul
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Guido van Rossum
I'd be wary of making a grab-bag of small improvements, it encourages
bikeshedding.

--Guido (mobile)

On Jan 20, 2017 10:16 AM, "Ethan Furman"  wrote:

> On 01/20/2017 10:09 AM, Joao S. O. Bueno wrote:
>
>> On 20 January 2017 at 16:51, Elizabeth Myers wrote:
>>
>
>>> Should I write up a PEP about this? I am not sure if it's justified or
>>> not. It's 3 changes (calcsize and two format specifiers), but it might
>>> be useful to codify it.
>>>
>>
>> Yes - maybe a PEP.
>>
>
> I agree, especially if the change, simple as it is, requires a lot of
> rewrite.  In that case someone (Elizabeth?) should collect ideas for other
> improvements and shepherd it through the PEP process.
>
> --
> ~Ethan~
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Joao S. O. Bueno
On 20 January 2017 at 15:13, Nathaniel Smith  wrote:
> On Jan 20, 2017 09:00, "Paul Moore"  wrote:
>
> On 20 January 2017 at 16:51, Elizabeth Myers 
> wrote:
>> Should I write up a PEP about this? I am not sure if it's justified or
>> not. It's 3 changes (calcsize and two format specifiers), but it might
>> be useful to codify it.
>
> It feels a bit minor to need a PEP, but having said that did you pick
> up on the comment about needing to return the number of bytes
> consumed?
>
> str = struct.unpack('z', b'test\0xxx')
>
> How do we know where the unpack got to, so that we can continue
> parsing from there? It seems a bit wasteful to have to scan the string
> twice to use calcsize for this...
>
>
> unpack() is OK, because it already has the rule that it raises an error if
> it doesn't exactly consume the buffer. But I agree that if we do this then
> we'd really want versions of unpack_from and pack_into that return the new
> offset. (Further arguments that calcsize is insufficient: it doesn't work
> for potential other variable length items, e.g. if we added uleb128 support;
> it quickly becomes awkward if you have multiple strings; in practice I think
> everyone who needs this would just end up writing a wrapper that calls
> calcsize and returns the new offset anyway, so should just provide that up
> front.)
>
> For pack_into this is also easy, since currently it always returns None, so
> if it started returning an integer no one would notice (and it'd be kinda
> handy in its own right, honestly).
>
> unpack_from is the tricky one, because it already has a return value and
> this isn't it. Ideally it would have worked this way from the beginning, but
> too late for that now... I guess the obvious solution would be to come up
> with a new function that's otherwise identical to unpack_from but returns a
> (values, offset) tuple. What to call this, though, I don't know :-).
> unpack_at? unpack_next? (Hinting that this is the natural primitive you'd
> use to implement unpack_iter.)
>
Yes - maybe a PEP.

Then we could also, for example, add the suggested support for
whitespace in the struct format string - which is nice.

And we could think of things like: the unpack methods returning a
specialized object - not a plain tuple - which has attributes with the
extra information.

So, instead of

a, str = struct.unpack("IB$", data)

people who want the length can do:

tmp = struct.unpack("IB$", data)
do_things_with_len(tmp.tell)
a, str = tmp
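
A rough sketch of such a result object (the names "UnpackResult" and
".tell" are just made up for illustration, and this version only knows
about fixed size formats):

    import struct

    class UnpackResult(tuple):
        # Tuple of unpacked values that also records how many bytes
        # were consumed from the buffer.
        def __new__(cls, values, consumed):
            self = super().__new__(cls, values)
            self.tell = consumed
            return self

    def unpack_from2(fmt, buf, offset=0):
        values = struct.unpack_from(fmt, buf, offset)
        return UnpackResult(values, offset + struct.calcsize(fmt))

    tmp = unpack_from2("<IB", b'\x01\x00\x00\x00\x07tail')
    a, b = tmp         # unpacks like a normal tuple: a == 1, b == 7
    print(tmp.tell)    # 5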

The struct "object" could allow other things as well. Since we are at it,
maybe a 0 copy version, that would return items from their implace
buffer positions.

But, ok, maybe most of this should just go in a third party package -
anyway, a PEP could be open for more improvements than the
variable-length fields proposed.

(The idea of having attributes with extra information about size, for example -
I think that is better than having:

size, (a, str) = struct.unpack2(... )

)

   js
 -><-


> -n
>
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Elizabeth Myers
On 19/01/17 20:54, Cameron Simpson wrote:
> On 19Jan2017 16:04, Yury Selivanov  wrote:
>> This is a neat idea, but this will only work for parsing framed
>> binary protocols.  For example, if your protocol prefixes all packets
>> with a length field, you can write an efficient read buffer and
>> use your proposal to decode all of the message's fields in one shot.
>> Which is good.
>>
>> Not all protocols use framing though.  For instance, your proposal
>> won't help to write Thrift or Postgres protocols parsers.
> 
> Sure, but a lot of things fit the proposal. Seems a win: both simple and
> useful.
> 
>> Overall, I'm not sure that this is worth the hassle.  With proposal:
>>
>>   data, = struct.unpack('!H$', buf)
>>   buf = buf[2+len(data):]
>>
>> with the current struct module:
>>
>>   len, = struct.unpack('!H', buf)
>>   data = buf[2:2+len]
>>   buf = buf[2+len:]
>>
>> Another thing: struct.calcsize won't work with structs that use
>> variable length fields.
> 
> True, but it would be enough for it to raise an exception of some kind.
> It won't break any in play code, and it will prevent accidents for users
> of new variable sizes formats.
> 
> We've all got things we wish struct might cover (I have a few, but
> strangely the top of the list is nonsemantic: I wish it let me put
> meaningless whitespace inside the format for readability).
> 
> +1 on the proposal from me.
> 
> Oh: subject to one proviso: reading a struct will need to return how
> many bytes of input data were scanned, not merely returning the decoded
> values.

This is a little difficult without breaking backwards compatibility,
but it is not difficult to compute the lengths yourself. That said,
calcsize could require an extra parameter when given a format string with
variable-length specifiers in it, e.g.:

  struct.calcsize("z", b'test')

would return 5 (four bytes plus the zero terminator), so you don't have
to compute it yourself.
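
Very roughly, the semantics I have in mind (a hypothetical sketch; it
assumes "z" means a NUL-terminated byte string, and it ignores repeat
counts and alignment for brevity):

    import struct

    def calcsize_with_values(fmt, *values):
        # Values are consumed only for variable-length specifiers.
        size = 0
        vals = iter(values)
        for code in fmt:
            if code in '@=<>!':
                continue                      # byte-order prefixes
            if code == 'z':
                size += len(next(vals)) + 1   # payload plus NUL terminator
            else:
                size += struct.calcsize(code)
        return size

    assert calcsize_with_values('z', b'test') == 5
    assert calcsize_with_values('!Hz', b'hello') == 2 + 6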

Also, I filed a bug, and proposed use of Z and z.

> 
> Cheers,
> Cameron Simpson 
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Elizabeth Myers
On 19/01/17 20:40, Cameron Simpson wrote:
> On 19Jan2017 12:08, Elizabeth Myers  wrote:
>> I also didn't mention that when you are unpacking iteratively (e.g., you
>> have multiple strings), the code becomes a bit more hairy:
>>
>>   test_bytes = b'\x00\x05hello\x00\x07goodbye\x00\x04test'
>>   offset = 0
>>   while offset < len(test_bytes):
>>       length = struct.unpack_from('!H', test_bytes, offset)[0]
>>       offset += 2
>>       string = struct.unpack_from('{}s'.format(length),
>>                                   test_bytes, offset)[0]
>>       offset += length
>>
>> It actually gets a lot worse when you have to unpack a set of strings in
>> a context-sensitive manner. You have to be sure to update the offset
>> constantly so you can always unpack strings appropriately. Yuck!
> 
> Whenever I'm doing iterative stuff like this, either variable length
> binary or lexical stuff, I always end up with a bunch of functions which
> can be called like this:
> 
>  datalen, offset = get_bs(chunk, offset=offset)
> 
> The notable thing here is just that they return the data and the new
> offset, which makes updating the offset impossible to forget, and also
> makes the calling code more succinct, like the internal call to get_bs()
> below:
> 
> such as this decoder for a length encoded field:
> 
>  def get_bsdata(chunk, offset=0):
>    ''' Fetch a length-prefixed data chunk.
>        Decodes an unsigned value from a bytes at the specified `offset`
>        (default 0), and collects that many following bytes.
>        Return those following bytes and the new offset.
>    '''
>    ##is_bytes(chunk)
>    offset0 = offset
>    datalen, offset = get_bs(chunk, offset=offset)
>    data = chunk[offset:offset+datalen]
>    ##is_bytes(data)
>    if len(data) != datalen:
>      raise ValueError(
>          "bsdata(chunk, offset=%d): insufficient data: "
>          "expected %d bytes, got %d bytes"
>          % (offset0, datalen, len(data)))
>    offset += datalen
>    return data, offset

Gotta be honest, this seems less elegant than just adding something like
what netstruct does to the struct module. It's also way more verbose.

Perhaps some kind of higher-level module could be built on struct at
some point, maybe in stdlib, maybe not (construct, imo, is not that lib,
for previously raised objections).

> 
> Cheers,
> Cameron Simpson 
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Ideas for improving the struct module

2017-01-20 Thread Elizabeth Myers
On 19/01/17 15:04, Yury Selivanov wrote:
> This is a neat idea, but this will only work for parsing framed
> binary protocols.  For example, if your protocol prefixes all packets
> with a length field, you can write an efficient read buffer and
> use your proposal to decode all of the message's fields in one shot.
> Which is good.
> 
> Not all protocols use framing though.  For instance, your proposal
> won't help to write Thrift or Postgres protocols parsers.

It won't help them, no, but it will help others who have to do similar
tasks, or help people build things on top of the struct module.

> 
> Overall, I'm not sure that this is worth the hassle.  With proposal:
> 
>data, = struct.unpack('!H$', buf)
>buf = buf[2+len(data):]
> 
> with the current struct module:
> 
>len, = struct.unpack('!H', buf)
>data = buf[2:2+len]
>buf = buf[2+len:]

I find such a construction is not really needed most of the time if I'm
dealing with repeated frames. I could just use struct.iter_unpack. It's
not useful in all cases, but as it stands, neither is the present struct
module.
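
For example, with repeated fixed-size frames, iter_unpack already does the
offset bookkeeping for you:

    import struct

    # Four back-to-back big-endian uint16 values, i.e. two '!HH' records;
    # the buffer length must be an exact multiple of the record size.
    records = b'\x00\x01\x00\x02\x00\x03\x00\x04'
    for a, b in struct.iter_unpack('!HH', records):
        print(a, b)    # prints "1 2", then "3 4"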

Just because it is not useful to everyone does not mean it is not useful
to others, perhaps immensely so.

The existence of third party libraries that implement a portion of my
rather modest proposal I think already justifies its existence.

> 
> Another thing: struct.calcsize won't work with structs that use
> variable length fields.

Should probably raise an error if the format has a variable-length
string in it. If you're using variable-length strings, you probably
aren't a consumer of struct.calcsize anyway.

> 
> Yury
> 
> 
> On 2017-01-18 5:24 AM, Elizabeth Myers wrote:
>> Hello,
>>
>> I've noticed a lot of binary protocols require variable length
>> bytestrings (with or without a null terminator), but it is not easy to
>> unpack these in Python without first reading the desired length, or
>> reading bytes until a null terminator is reached.
>>
>> I've noticed the netstruct library
>> (https://github.com/stendec/netstruct) has a format specifier, $, which
>> assumes the previous type to pack/unpack is the string's length. This is
>> an interesting idea in and of itself, but doesn't handle the null-terminated
>> string case. I know $ is similar to pascal strings, but sometimes you
>> need more than 255 characters :p.
>>
>> For null-terminated strings, it may be simpler to have a specifier for
>> those. I propose 0, but this point can be bikeshedded over endlessly if
>> desired ;) (I thought about using n/N but they're already taken :P).
>>
>> It's worth noting that (maybe one of?) Perl's equivalent to the struct
>> module, whose name escapes me atm, has a module which can handle this
>> case. I can't remember if it handled variable length or zero-terminated
>> though; maybe it did both. Perl is more or less my 10th language. :p
>>
>> This pain point is an annoyance imo and would greatly simplify a lot of
>> code if implemented, or something like it. I'd be happy to take a look
>> at implementing it if the idea is received sufficiently warmly.
>>
>> -- 
>> Elizabeth
>> ___
>> Python-ideas mailing list
>> Python-ideas@python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
> 
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Context manager to temporarily set signal handlers

2017-01-20 Thread Thomas Kluyver
Not uncommonly, I want to do something like this in code:

import signal

# Install my own signal handler
prev_hup = signal.signal(signal.SIGHUP, my_handler)
prev_term = signal.signal(signal.SIGTERM, my_handler)
try:
    do_something_else()
finally:
    # Restore previous signal handlers
    signal.signal(signal.SIGHUP, prev_hup)
    signal.signal(signal.SIGTERM, prev_term)

This works if the existing signal handler is a Python function, or the
special values SIG_IGN (ignore) or SIG_DFL (default). However, it breaks
if code has set a signal handler in C: this is not returned, and there
is no way in Python to reinstate a C-level signal handler once we've
replaced it from Python.

I propose two possible solutions:

1. The high-level approach: a context manager which can temporarily set
one or more signal handlers. If this was implemented in C, it could
restore C-level as well as Python-level signal handlers.

2. A lower level approach: signal() and getsignal() would gain the
ability to return an opaque object which refers to a C-level signal
handler. The only use for this would be to pass it back to
signal.signal() to set it as a signal handler again. The context manager
from (1) could then be implemented in Python.
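
For what it's worth, the Python-level part of (1) is easy to sketch today;
it just inherits the C-level limitation described above:

import signal
from contextlib import contextmanager

@contextmanager
def temp_signal_handlers(handler, *signums):
    # Temporarily install `handler` for the given signals, restoring the
    # previous handlers on exit. Only handlers visible from Python
    # (callables, SIG_IGN, SIG_DFL) can be restored; a C-level handler
    # comes back as None and is lost, which is exactly the problem.
    previous = {num: signal.signal(num, handler) for num in signums}
    try:
        yield
    finally:
        for num, prev in previous.items():
            if prev is not None:
                signal.signal(num, prev)

# with temp_signal_handlers(my_handler, signal.SIGHUP, signal.SIGTERM):
#     do_something_else()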

Crosslinking http://bugs.python.org/issue13285

Thomas
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/