Re: [Python-ideas] Add function readbyte to asyncio.StreamReader

2018-07-27 Thread Jörn Heissler
On Mon, Jul 23, 2018 at 22:25:14 +0100, Gustavo Carneiro wrote:
> Well, even if it is worth, i.e. your use case is not rare enough,

Reading a single byte is certainly a special case, but I think it's generic
enough to warrant its own function.
I believe there should be many binary protocols out there that would benefit
from such a function.

For example, the first SOCKS5 (RFC 1928) message could be parsed like this:

async def read_methods(reader):
    if (version := await reader.readbyte()) != 5:
        raise Exception(f'Bad version: {version}')
    if (nmethods := await reader.readbyte()) == 0:
        raise Exception('Need at least one method')
    return await reader.readexactly(nmethods)

> I would
> suggest at least making it private, readexactly can call this specialised
> function if nbytes==1:
> 
> def _readbyte(self):
>
> 
> def readexactly(self, num):
>     if num == 1:
>         return self._readbyte()
>     ... the rest stays the same ...

Maybe I wasn't clear in my intent: readbyte would not return a bytes object but
an integer, i.e. the byte value.

My current approach is this:

value = (await reader.readexactly(1))[0]

I'd like to make it more readable (and faster at the same time):

value = await reader.readbyte()
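
Until such a method exists, the same behaviour can be approximated with a small wrapper around readexactly. The following is only a sketch: the module-level helper readbyte(), the ValueError choice and the hand-fed demo stream are assumptions made for illustration, not part of asyncio or of the proposal.

```python
import asyncio


async def readbyte(reader):
    """Hypothetical helper: return the next byte as an int (0..255)."""
    # readexactly(1) returns a bytes object of length 1; indexing it yields
    # an int. It raises asyncio.IncompleteReadError if the stream ends first.
    return (await reader.readexactly(1))[0]


async def read_methods(reader):
    """Parse the SOCKS5 greeting using the helper instead of a new method."""
    version = await readbyte(reader)
    if version != 5:
        raise ValueError(f"Bad version: {version}")
    nmethods = await readbyte(reader)
    if nmethods == 0:
        raise ValueError("Need at least one method")
    return await reader.readexactly(nmethods)


async def demo():
    # Create the reader inside a running event loop and feed it by hand,
    # so no real socket is needed.
    reader = asyncio.StreamReader()
    reader.feed_data(b"\x05\x02\x00\x02")  # version 5, two methods: 0 and 2
    reader.feed_eof()
    return await read_methods(reader)


methods = asyncio.run(demo())
```

The wrapper only helps readability; it still pays for the intermediate bytes object, which is the cost the proposed method would avoid.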

> But to be honest, you are probably better off managing the buffer yourself:
> Just call, e.g., stream.read(4096), it will return a buffer of up to 4k
> length, then you can iterate over the buffer byte by byte until the
> condition is met, repeat until the end of stream, or whatever.

StreamReader already manages a buffer. Managing a second buffer would
mean I'd need to copy all my data from one buffer to another.

But let's assume I went this way and iterated over my own buffer:

* I receive some bytes. Maybe it's exactly the amount I need, then I can parse
  it and discard the buffer.
* Or it's less than I need. I'd have to wait for more data and either restart my
  parser or remember the state from before.
* Or it's more than I need. I'd have to remove the parsed bytes from the buffer.
  Alternatively I could push back the unparsed bytes to my buffer.

This adds a lot of code complexity. And the code is likely slower than calling
readbyte() a couple of times; for my current use case, calling it once is
usually sufficient.
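
A rough sketch of what such a second buffer could look like, to make the three cases above concrete. The class name ByteBuffer, its pending attribute and the chunk size are hypothetical; this is not code from the thread.

```python
import asyncio


class ByteBuffer:
    """Hypothetical second buffer layered on top of a StreamReader."""

    def __init__(self, reader, chunk_size=4096):
        self.reader = reader
        self.chunk_size = chunk_size
        self.pending = bytearray()  # bytes received but not yet parsed

    async def next_byte(self):
        # "Less than I need": nothing buffered, so wait for more data.
        if not self.pending:
            chunk = await self.reader.read(self.chunk_size)
            if not chunk:
                raise EOFError("stream ended")
            # The data is copied out of StreamReader's own buffer here.
            self.pending += chunk
        # "More than I need": remove the parsed byte, keep the remainder.
        value = self.pending[0]
        del self.pending[0]
        return value


async def demo():
    reader = asyncio.StreamReader()
    reader.feed_data(b"\x05\x01\x00")
    reader.feed_eof()
    buf = ByteBuffer(reader)
    first = await buf.next_byte()
    second = await buf.next_byte()
    return first, second, bytes(buf.pending)


first, second, leftover = asyncio.run(demo())
```

Even this minimal version repeats the buffering, EOF handling and byte removal that StreamReader already performs, and any later call to reader.readexactly() would miss the bytes still sitting in pending.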

I like my current approach way better than managing my own buffer and thus
reinventing StreamReader.

Adding the new function as proposed would improve the code in both readability
and speed.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Python docs page: In what ways is None special

2018-07-23 Thread Jörn Heissler
On Mon, Jul 23, 2018 at 10:03:10 +0100, Jonathan Fine wrote:
> I thought, a page on how None is special would be nice.
> I've not found such a page on the web. We do have
> ===
> https://docs.python.org/3/library/constants.html

Hi,

there's also
https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy

None

This type has a single value. There is a single object with this
value. This object is accessed through the built-in name None. It is
used to signify the absence of a value in many situations, e.g., it
is returned from functions that don’t explicitly return anything.
Its truth value is false.


Cheers
Jörn Heissler


[Python-ideas] Add function readbyte to asyncio.StreamReader

2018-07-22 Thread Jörn Heissler
Hello,

I'm implementing a protocol where I need to read individual bytes until
a condition is met (value & 0x80 == 0).

My current approach is: value = (await reader.readexactly(1))[0]
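
A minimal sketch of that read loop with the current API. The function name read_until_high_bit_clear and the sample bytes are made up for illustration; the actual protocol is not shown here.

```python
import asyncio


async def read_until_high_bit_clear(reader):
    """Collect byte values up to and including the first one with bit 7 clear."""
    values = []
    while True:
        # One readexactly(1) call, and one temporary bytes object, per byte.
        value = (await reader.readexactly(1))[0]
        values.append(value)
        if value & 0x80 == 0:
            return values


async def demo():
    reader = asyncio.StreamReader()
    reader.feed_data(b"\x96\x01\x7f")
    reader.feed_eof()
    collected = await read_until_high_bit_clear(reader)
    rest = await reader.read()  # whatever the loop left in the stream
    return collected, rest


collected, rest = asyncio.run(demo())
```

The loop stops at 0x01, the first byte without the high bit, and the trailing 0x7f stays in the StreamReader buffer for the next read.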

To speed this up, I propose that a new function is added to
asyncio.StreamReader: value = await reader.readbyte()

I duplicated readexactly and stripped out some parts. The code below appears
to work:

async def readbyte(self):
    if self._exception is not None:
        raise self._exception

    while not self._buffer:
        if self._eof:
            raise EOFError()
        await self._wait_for_data('readbyte')

    data = self._buffer[0]
    del self._buffer[0]
    self._maybe_resume_transport()
    return data

To compare the speed, I receive a 50 MiB file byte by byte:

cpython-3.7.0:
readexactly: 42.43 seconds
readbyte   : 22.05 seconds
speedup: 92.4%

pypy3-v6.0.0:
readexactly: 3.21 seconds
readbyte   : 2.76 seconds
speedup: 16.3%
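
The speedup figures are consistent with (old time / new time - 1) * 100. A small check under that assumption; the function name speedup_percent is only illustrative:

```python
def speedup_percent(old_seconds, new_seconds):
    """Relative speedup: how much longer the old run took, in percent of the new run."""
    return (old_seconds / new_seconds - 1) * 100


# Timings copied from the measurements above.
cpython = round(speedup_percent(42.43, 22.05), 1)
pypy = round(speedup_percent(3.21, 2.76), 1)
print(f"cpython: {cpython}%  pypy: {pypy}%")
```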

Thanks