Re: [Python-ideas] Where should grouping() live (was: grouping / dict of lists)

2018-07-09 Thread Franklin? Lee
(Fixing quote and attribution.)

On Fri, Jul 6, 2018, 11:32 Chris Barker - NOAA Federal via
Python-ideas  wrote:
>
> On Jul 6, 2018, at 2:10 AM, Steven D'Aprano  wrote:
>
> > On Fri, Jul 06, 2018 at 09:49:37AM +0100, Cammil Taank wrote:
> > > I would consider statistics
>
> > > to have similarities - median, mean etc are aggregate functions.
>
>
> Not really, more like reduce, actually -- you get a single result.
>
> > > Histograms are also doing something similar to grouping.
>
> > .(Yes, a few statistics apply to nominal and ordinal data too,
>
>
> And for that, a generic grouping function could be used.
>
> In fact, allowing Counter to be used as the accumulator was one suggestion in
> this thread, and would build a histogram.
>
> Now that I think about it, you could write a key function that built a 
> histogram for continuous data as well.
>
> Though that might be a bit klunky.
>
> But if someone thinks that’s a good idea, a PR for an example would be 
> accepted:
>
> https://github.com/PythonCHB/grouper

+1 for `collections`, because it's where you look for something
similar to Counter.

-1 for `statistics`, because the need isn't specific to statistics.
It'd be like putting `commonprefix`, which is a general string
operation, into `os.path`. It's hacky to import a domain-specific
module to use one of its non-domain-specific helpers for a different
domain.

Someone can argue for functools, as that's the functional programming
module, containing `reduce`.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Fwd: grouping / dict of lists

2018-07-09 Thread Franklin? Lee
On Mon, Jul 9, 2018 at 12:22 PM, Chris Barker  wrote:
> On Fri, Jul 6, 2018 at 12:26 PM, Franklin? Lee wrote:
>> I use this kind of function

>> I added several options, such as:
>> - key function
>> - value function
>> - "ignore": Skip values with these keys.
>> - "postprocess": Apply a function to each group after completion.
>> - Pass in the container to store in. For example, create an
>> OrderedDict and pass it in. It may already hold items.
>> - Specify the container for each group.
>> - Specify how to add to the container for each group.
>
>
> interesting...
>
>>
>> Then I cut it down to two optional parameters:
>> - key function. If not provided, the iterable is considered to have
>> key-value pairs.
>
>
> OK -- seems we're all converging on that :-)
>
>>
>> - The storage container.
>
>
> so this means you're passing in a full set of storage containers? I'm a bit
> confused by that -- if they might be pre-populated, then they would need to
> be instances, and you'd need to have one for every key -- how would you know
> in advance what you needed???

No, I mean the mapping (outer) container. For example, I can pass in
an empty OrderedDict, or a dict that already contained some groups
from a previous call to the grouping function.

I took out the option for the per-group (inner) containers. I never
found it necessary to scrooge (scrooge on?) the memory, when I could
postprocess the lists after grouping. A mapvalues function will make
postprocessing more convenient, and lend weight to a dicttools
suggestion.

# Unfortunate double meaning of 'map' in the function signature:
def mapvalues(f, mapping):
    # Accept either a mapping or an iterable of (key, value) pairs.
    try:
        items = mapping.items()
    except AttributeError:
        items = mapping
    return {k: f(v) for k, v in items}

> I played around with passing in an optional storage object:
>
> https://github.com/PythonCHB/grouper/commit/d986816905406ec402724beaed2b88c96df64469
>
> but as we might want a list or a set, or a Counter, or ??? it got pretty
> ugly, as lists and sets and Counters all have different APIs for adding
> stuff. So I gave up and figured just saying "it's always a list" made the
> most sense.

My solution at the time was to add another parameter to specify how to
add to the container.

In fuller generality, the option for the per-group container may
consist of specifying a monad (if I remember monads correctly). You
need to at least specify a per-group container constructor and a
binary function that adds to it. In the case of `Counter`, the
constructor is `int` and the binary function is `int.__add__`, and the
Counter constructor effectively runs concurrent `reduce`.
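A minimal sketch of that idea, assuming the iterable already yields
(key, value) pairs. The names `grouping_with`, `new_group` and `add` are
made up for illustration and are not taken from any proposal in this thread:

```python
from collections import defaultdict


def grouping_with(pairs, new_group=list, add=lambda group, value: group + [value]):
    """Group (key, value) pairs.

    new_group: zero-argument constructor for an empty per-group container.
    add: binary function (group, value) -> updated group.
    Both parameter names are hypothetical.
    """
    groups = defaultdict(new_group)
    for key, value in pairs:
        groups[key] = add(groups[key], value)
    return dict(groups)


pairs = [("a", 1), ("b", 2), ("a", 3)]

lists = grouping_with(pairs)                                   # default: lists
sums = grouping_with(pairs, new_group=int, add=int.__add__)    # per-key reduce
counts = grouping_with(pairs, new_group=int, add=lambda n, _value: n + 1)
sets = grouping_with(pairs, new_group=set, add=lambda s, v: s | {v})
```

With `int` and `int.__add__` each group is a running sum, which is the
"concurrent reduce" reading; ignoring the value and adding 1 gives
Counter-like counts.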


Re: [Python-ideas] Add new `Symbol` type

2018-07-09 Thread Eric V. Smith

On 7/9/2018 5:01 PM, Brett Cannon wrote:
> On Fri, 6 Jul 2018 at 09:24 Eric V. Smith wrote:
>
>> On 7/6/2018 11:20 AM, Flavio Curella wrote:
>> > I think this thread can be resolved as 'used unittest.mock.sentinel'. It
>> > doesn't have 'global sentinels', but I'm not convinced they are actually
>> > necessary, since `mock.sentinel` objects with the same name compare as
>> > equal. Thanks to Nathaniel, I now understand that JS has global symbols
>> > for historical reasons that we don't have, and I'm not convinced of
>> > their usefulness.
>>
>> Do all Python distributions ship with unittest.mock? I seem to recall
>> that Debian and/or Ubuntu strips out part of the normal distribution.
>
> It's usually tkinter and such, not unittest stuff from my understanding.

Good to know. Thanks.

>> For example, dataclasses.py has a sentinel, and it includes some code to
>> get a more helpful repr. It would make sense to re-use the
>> unittest.mock.sentinel code, but not if that code isn't always
>> guaranteed to be present.
>
> Would it make sense to abstract this out to the 'types' module to have a
> single 'types.sentinel' object for those rare cases that Guido pointed out?

I think so. I'd hate to import unittest.mock just to get a sentinel 
object for dataclasses.
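For illustration only, here is a rough sketch of what a small standalone
sentinel with a readable repr might look like, next to the existing
unittest.mock.sentinel behaviour. The `Sentinel` class, `MISSING` and
`lookup` are hypothetical names; this is not the dataclasses code and not
a proposed `types.sentinel` API:

```python
from unittest import mock


class Sentinel:
    """Hypothetical minimal sentinel: compared by identity, readable repr."""

    def __init__(self, name):
        self._name = name

    def __repr__(self):
        return f"<{self._name}>"


MISSING = Sentinel("MISSING")


def lookup(mapping, key, default=MISSING):
    # Using a sentinel lets callers pass None as a real default value.
    if default is MISSING:
        return mapping[key]
    return mapping.get(key, default)


# unittest.mock.sentinel hands back the same object for the same name.
same_object = mock.sentinel.MISSING is mock.sentinel.MISSING
mock_repr = repr(mock.sentinel.MISSING)
```

The standalone class avoids the unittest.mock import, at the cost of each
module defining its own copy.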


Eric



Re: [Python-ideas] Add new `Symbol` type

2018-07-09 Thread Brett Cannon
On Fri, 6 Jul 2018 at 09:24 Eric V. Smith  wrote:

> On 7/6/2018 11:20 AM, Flavio Curella wrote:
> > I think this thread can be resolved as 'used unittest.mock.sentinel'. It
> > doesn't have 'global sentinels', but I'm not convinced they are actually
> > necessary, since `mock.sentinel` objects with the same name compare as
> > equal. Thanks to Nathaniel, I now understand that JS has global symbols
> > for historical reasons that we don't have, and I'm not convinced of
> > their usefulness.
>
> Do all Python distributions ship with unittest.mock? I seem to recall
> that Debian and/or Ubuntu strips out part of the normal distribution.
>

It's usually tkinter and such, not unittest stuff from my understanding.


>
> For example, dataclasses.py has a sentinel, and it includes some code to
> get a more helpful repr. It would make sense to re-use the
> unittest.mock.sentinel code, but not if that code isn't always
> guaranteed to be present.
>

Would it make sense to abstract this out to the 'types' module to have a single
'types.sentinel' object for those rare cases that Guido pointed out?


Re: [Python-ideas] Fwd: grouping / dict of lists

2018-07-09 Thread David Mertz
In my mind, I *rarely* (which is more than never) have my data in the form
of a sequence of key/value pairs.  The version of the API that assumes data
starts that way feels like either a niche case, or demands preprocessing
before it's ready to pass to grouping() or collections.Grouping().

That said, an identity key is rarely interesting either. So I think having
key=None mean "assume we get key/val pairs" is harmless to the more common
case where we give an explicit key function.

The uncommon need for grouping on equality can be handled with 'key=lambda
x: x'.
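A rough sketch of those semantics, assuming a plain function that returns
a dict of lists (the real grouping() / collections.Grouping() signature was
still under discussion, so treat this as an illustration only):

```python
def grouping(iterable, key=None):
    """Sketch: key=None means the items are already (key, value) pairs."""
    groups = {}
    for item in iterable:
        if key is None:
            k, v = item               # items are (key, value) pairs
        else:
            k, v = key(item), item    # derive the key, keep the whole item
        groups.setdefault(k, []).append(v)
    return groups


by_pair = grouping([("fruit", "apple"), ("veg", "daikon"), ("fruit", "fig")])
by_len = grouping(["a", "bb", "c"], key=len)
by_equality = grouping("abca", key=lambda x: x)
```

Grouping on equality stays available, it just has to be spelled out.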

On Mon, Jul 9, 2018, 12:22 PM Chris Barker  wrote:

> On Fri, Jul 6, 2018 at 12:26 PM, Franklin? Lee wrote:
>
>> I use this kind of function in several different projects over the
>> years, and I rewrote it many times as needed.
>>
>
>
>> I added several options, such as:
>> - key function
>> - value function
>> - "ignore": Skip values with these keys.
>> - "postprocess": Apply a function to each group after completion.
>> - Pass in the container to store in. For example, create an
>> OrderedDict and pass it in. It may already hold items.
>> - Specify the container for each group.
>> - Specify how to add to the container for each group.
>>
>
> interesting...
>
>
>> Then I cut it down to two optional parameters:
>> - key function. If not provided, the iterable is considered to have
>> key-value pairs.
>>
>
> OK -- seems we're all converging on that :-)
>
>
>> - The storage container.
>>
>
> so this means you're passing in a full set of storage containers? I'm a bit
> confused by that -- if they might be pre-populated, then they would need to
> be instances, and you'd need to have one for every key -- how would you know
> in advance what you needed???
>
> I played around with passing in an optional storage object:
>
>
> https://github.com/PythonCHB/grouper/commit/d986816905406ec402724beaed2b88c96df64469
>
> but as we might want a list or a set, or a Counter, or ??? it got pretty
> ugly, as lists and sets and Counters all have different APIs for adding
> stuff. So I gave up and figured just saying "it's always a list" made the
> most sense.
>
>
>> Finally, I removed the key function, and only took pairs and an
>> optional container. However, I don't remember why I removed the key
>> function. It may be that I was writing throwaway lambdas, and I
>> decided I might as well just write the transformation into the
>> comprehension.
>
>
> exactly -- but I suspect that may be because you were writing a
> comprehension anyway, as you needed to manipulate the values, also -- so if
> there were a value function, you could use either API.
>
>
>> I think a key function is worth having.
>>
>
> I think there's more or less consensus on that too.
>
> -CHB
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>


Re: [Python-ideas] Calling python from C completely statically

2018-07-09 Thread Alberto Garcia
Hi,

thank you for your response. I've downloaded the sources but I couldn't
find any documentation. In addition, I see that there is not a single C/C++
file. What I want to do is call Python from C.
Am I missing something?

Cheers


On Mon, Jul 9, 2018 at 12:20 PM Barry Scott  wrote:

> I think you might find Gordon McMillan's installer interesting to look
> at. It has a lot of the tech that I think you are looking for.
>
> Works up to python 2.7. I ended up taking over when Gordon stopped
> maintaining
> it and kept it going up to python 2.7. In principle the same ideas could
> be made to
> work in python 3 I believe.
>
> https://sourceforge.net/projects/meinc-installer/
>
> The zip file is appended to the end of the .EXE or unix ELF file.
> The boot strap knows how to import from the ZIP at the end of
> the binary.
>
> It also has a way to split out the .SO/.DLL files from the ZIP and
> allow them to be loaded. Single EXE mode.
>
> There are docs that explain how it works in the sources.
>
> Barry
>
>
>
> On 9 Jul 2018, at 19:40, Alberto Garcia  wrote:
>
> Does the zip need to reside in disk to be loaded. Or can it be loaded from
> memory? I don't want it to be loaded from disk but from Memory
>
>
>
> On Mon, Jul 9, 2018 at 9:59 AM Alberto Garcia 
> wrote:
>
>> O  I guess you mean this:
>>
>> https://github.com/anthony-tuininga/cx_Freeze/blob/master/source/bases/Common.c
>>
>> Right?
>>
>> On Mon, Jul 9, 2018 at 9:48 AM Alberto Garcia 
>> wrote:
>>
>>> Thank you for your response,
>>>
>>> I was thinking on creating that zip file with the content of the Lib
>>> folder and having my c code to download it over the network and have it in
>>> memory.
>>> I guess that the zip file should have no compression at all right?
>>>
>>> When you say that I need to use the cx_freeze approach what do you mean?
>>> Can you point me to where they do that?
>>>
>>> And why changing sys.path again to the executable again? Which part of
>>> the executable?
>>>
>>> I'll put my efforts in this.
>>>
>>> Thank you
>>>
>>> On Mon, Jul 9, 2018 at 7:16 AM Nick Coghlan  wrote:
>>>
>>>> On 9 July 2018 at 03:10, Alberto Garcia wrote:
>>>> > Hey there,
>>>> >
>>>> > Yes, the part of having the pyd modules built in in library is already done.
>>>> > I followed the instructions in the README. What I would like to know now is
>>>> > how to embed the non frozen python (py) modules. Can you guys please point
>>>> > me in the right direction.
>>>>
>>>> The gist is to:
>>>>
>>>> 1. take the entire Lib directory and put it in a zip archive
>>>> 2. use the approach demonstrated in cx_freeze to point sys.path in
>>>> your static executable at that zip archive
>>>> 3. adjust your C code to point sys.path back at the executable itself,
>>>> and then combine your executable and the zip archive into a single
>>>> contiguous file (similar to what zipapp does with its helper script
>>>> and app archive)
>>>>
>>>> There are likely to still be rough edges when doing that, since this
>>>> isn't a well tested configuration. When all else fails, find the part
>>>> of the source code responsible for any error messages you're seeing,
>>>> and try to work out if there's a setting you can tweak to avoid
>>>> hitting that code path.
>>>>
>>>> Cheers,
>>>> Nick.
>>>>
>>>> --
>>>> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

>>> --
>>> Alberto García Illera
>>>
>>> GPG Public Key: https://goo.gl/twKUUv
>>>
>> --
>> Alberto García Illera
>>
>> GPG Public Key: https://goo.gl/twKUUv
>>
>
>
> --
> Alberto García Illera
>
> GPG Public Key: https://goo.gl/twKUUv

-- 
Alberto García Illera

GPG Public Key: https://goo.gl/twKUUv


Re: [Python-ideas] Calling python from C completely statically

2018-07-09 Thread Barry Scott
I think you might find Gordon McMillan's installer interesting to look at. It
has a lot of the tech that I think you are looking for.

Works up to python 2.7. I ended up taking over when Gordon stopped maintaining
it and kept it going up to python 2.7. In principle the same ideas could be
made to work in python 3 I believe.

https://sourceforge.net/projects/meinc-installer/

The zip file is appended to the end of the .EXE or unix ELF file.
The boot strap knows how to import from the ZIP at the end of
the binary.
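The appended-ZIP trick can be tried from pure Python as a rough sketch.
This is not the installer's bootstrap code: the stub bytes below merely
stand in for a real executable, and the file and module names are invented
for the demonstration. It relies on the standard zipimport machinery
locating the archive from the end of the file, the same property zipapp
uses when it prepends a shebang line.

```python
import importlib
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
bundle = os.path.join(workdir, "bundle.bin")

with open(bundle, "wb") as fh:
    # Stand-in for the executable image (assumption: arbitrary bytes).
    fh.write(b"\x7fELF" + b"\0" * 64)
    # Writing the archive through the same handle places it after the stub.
    with zipfile.ZipFile(fh, "w") as zf:
        zf.writestr("hello_bundle.py", "GREETING = 'hello from the bundle'\n")

# A sys.path entry may name a file that ends in a ZIP archive.
sys.path.insert(0, bundle)
hello_bundle = importlib.import_module("hello_bundle")
print(hello_bundle.GREETING)
```

Extension modules (.so/.dll) cannot be imported this way, which is why
they have to be split out of the ZIP as described next.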

It also has a way to split out the .SO/.DLL files from the ZIP and
allow them to be loaded. Single EXE mode.

There are docs that explain how it works in the sources.

Barry



> On 9 Jul 2018, at 19:40, Alberto Garcia  wrote:
> 
> Does the zip need to reside in disk to be loaded. Or can it be loaded from 
> memory? I don't want it to be loaded from disk but from Memory
> 
> 
> 
> On Mon, Jul 9, 2018 at 9:59 AM Alberto Garcia  > wrote:
> O  I guess you mean this:
> https://github.com/anthony-tuininga/cx_Freeze/blob/master/source/bases/Common.c
>  
> 
> Right?
> 
> On Mon, Jul 9, 2018 at 9:48 AM Alberto Garcia  > wrote:
> Thank you for your response,
> 
> I was thinking on creating that zip file with the content of the Lib folder 
> and having my c code to download it over the network and have it in memory. 
> I guess that the zip file should have no compression at all right?
> 
> When you say that I need to use the cx_freeze approach what do you mean? Can 
> you point me to where they do that? 
> 
> And why changing sys.path again to the executable again? Which part of the 
> executable? 
> 
> I'll put my efforts in this.
> 
> Thank you 
> 
> On Mon, Jul 9, 2018 at 7:16 AM Nick Coghlan  > wrote:
> On 9 July 2018 at 03:10, Alberto Garcia  > wrote:
> > Hey there,
> >
> > Yes, the part of having the pyd modules built in in library is already done.
> > I followed the instructions in the README. What I would like to know now is
> > how to embed the non frozen python (py) modules. Can you guys please point
> > me in the right direction.
> 
> The gist is to:
> 
> 1. take the entire Lib directory and put it in a zip archive
> 2. use the approach demonstrated in cx_freeze to point sys.path in
> your static executable at that zip archive
> 3. adjust your C code to point sys.path back at the executable itself,
> and then combine your executable and the zip archive into a single
> contiguous file (similar to what zipapp does with it's helper script
> and app archive)
> 
> There are likely to still be rough edges when doing that, since this
> isn't a well tested configuration. When all else fails, find the part
> of the source code responsible for any error messages you're seeing,
> and try to work out if there's a setting you can tweak to avoid
> hitting that code path.
> 
> Cheers,
> Nick.
> 
> -- 
> Nick Coghlan   |   ncogh...@gmail.com    |   
> Brisbane, Australia
> -- 
> Alberto García Illera
> 
> GPG Public Key: https://goo.gl/twKUUv -- 
> Alberto García Illera
> 
> GPG Public Key: https://goo.gl/twKUUv 
> 
> -- 
> Alberto García Illera
> 
> GPG Public Key: https://goo.gl/twKUUv 


Re: [Python-ideas] Calling python from C completely statically

2018-07-09 Thread Alberto Garcia
Does the zip need to reside on disk to be loaded, or can it be loaded from
memory? I don't want it to be loaded from disk, but from memory.



On Mon, Jul 9, 2018 at 9:59 AM Alberto Garcia 
wrote:

> O  I guess you mean this:
>
> https://github.com/anthony-tuininga/cx_Freeze/blob/master/source/bases/Common.c
>
> Right?
>
> On Mon, Jul 9, 2018 at 9:48 AM Alberto Garcia 
> wrote:
>
>> Thank you for your response,
>>
>> I was thinking on creating that zip file with the content of the Lib
>> folder and having my c code to download it over the network and have it in
>> memory.
>> I guess that the zip file should have no compression at all right?
>>
>> When you say that I need to use the cx_freeze approach what do you mean?
>> Can you point me to where they do that?
>>
>> And why changing sys.path again to the executable again? Which part of
>> the executable?
>>
>> I'll put my efforts in this.
>>
>> Thank you
>>
>> On Mon, Jul 9, 2018 at 7:16 AM Nick Coghlan  wrote:
>>
>>> On 9 July 2018 at 03:10, Alberto Garcia  wrote:
>>> > Hey there,
>>> >
>>> > Yes, the part of having the pyd modules built in in library is already
>>> done.
>>> > I followed the instructions in the README. What I would like to know
>>> now is
>>> > how to embed the non frozen python (py) modules. Can you guys please
>>> point
>>> > me in the right direction.
>>>
>>> The gist is to:
>>>
>>> 1. take the entire Lib directory and put it in a zip archive
>>> 2. use the approach demonstrated in cx_freeze to point sys.path in
>>> your static executable at that zip archive
>>> 3. adjust your C code to point sys.path back at the executable itself,
>>> and then combine your executable and the zip archive into a single
>>> contiguous file (similar to what zipapp does with it's helper script
>>> and app archive)
>>>
>>> There are likely to still be rough edges when doing that, since this
>>> isn't a well tested configuration. When all else fails, find the part
>>> of the source code responsible for any error messages you're seeing,
>>> and try to work out if there's a setting you can tweak to avoid
>>> hitting that code path.
>>>
>>> Cheers,
>>> Nick.
>>>
>>> --
>>> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>>>
>> --
>> Alberto García Illera
>>
>> GPG Public Key: https://goo.gl/twKUUv
>>
> --
> Alberto García Illera
>
> GPG Public Key: https://goo.gl/twKUUv
>


-- 
Alberto García Illera

GPG Public Key: https://goo.gl/twKUUv


Re: [Python-ideas] Calling python from C completely statically

2018-07-09 Thread Alberto Garcia
Oh, I guess you mean this:
https://github.com/anthony-tuininga/cx_Freeze/blob/master/source/bases/Common.c

Right?

On Mon, Jul 9, 2018 at 9:48 AM Alberto Garcia 
wrote:

> Thank you for your response,
>
> I was thinking on creating that zip file with the content of the Lib
> folder and having my c code to download it over the network and have it in
> memory.
> I guess that the zip file should have no compression at all right?
>
> When you say that I need to use the cx_freeze approach what do you mean?
> Can you point me to where they do that?
>
> And why changing sys.path again to the executable again? Which part of the
> executable?
>
> I'll put my efforts in this.
>
> Thank you
>
> On Mon, Jul 9, 2018 at 7:16 AM Nick Coghlan  wrote:
>
>> On 9 July 2018 at 03:10, Alberto Garcia  wrote:
>> > Hey there,
>> >
>> > Yes, the part of having the pyd modules built in in library is already
>> done.
>> > I followed the instructions in the README. What I would like to know
>> now is
>> > how to embed the non frozen python (py) modules. Can you guys please
>> point
>> > me in the right direction.
>>
>> The gist is to:
>>
>> 1. take the entire Lib directory and put it in a zip archive
>> 2. use the approach demonstrated in cx_freeze to point sys.path in
>> your static executable at that zip archive
>> 3. adjust your C code to point sys.path back at the executable itself,
>> and then combine your executable and the zip archive into a single
>> contiguous file (similar to what zipapp does with it's helper script
>> and app archive)
>>
>> There are likely to still be rough edges when doing that, since this
>> isn't a well tested configuration. When all else fails, find the part
>> of the source code responsible for any error messages you're seeing,
>> and try to work out if there's a setting you can tweak to avoid
>> hitting that code path.
>>
>> Cheers,
>> Nick.
>>
>> --
>> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>>
> --
> Alberto García Illera
>
> GPG Public Key: https://goo.gl/twKUUv
>
-- 
Alberto García Illera

GPG Public Key: https://goo.gl/twKUUv


Re: [Python-ideas] Calling python from C completely statically

2018-07-09 Thread Alberto Garcia
Thank you for your response,

I was thinking of creating that zip file with the contents of the Lib folder
and having my C code download it over the network and keep it in memory.
I guess that the zip file should have no compression at all, right?

When you say that I need to use the cx_freeze approach what do you mean?
Can you point me to where they do that?

And why change sys.path back to the executable again? Which part of the
executable?

I'll put my efforts in this.

Thank you

On Mon, Jul 9, 2018 at 7:16 AM Nick Coghlan  wrote:

> On 9 July 2018 at 03:10, Alberto Garcia  wrote:
> > Hey there,
> >
> > Yes, the part of having the pyd modules built in in library is already
> done.
> > I followed the instructions in the README. What I would like to know now
> is
> > how to embed the non frozen python (py) modules. Can you guys please
> point
> > me in the right direction.
>
> The gist is to:
>
> 1. take the entire Lib directory and put it in a zip archive
> 2. use the approach demonstrated in cx_freeze to point sys.path in
> your static executable at that zip archive
> 3. adjust your C code to point sys.path back at the executable itself,
> and then combine your executable and the zip archive into a single
> contiguous file (similar to what zipapp does with it's helper script
> and app archive)
>
> There are likely to still be rough edges when doing that, since this
> isn't a well tested configuration. When all else fails, find the part
> of the source code responsible for any error messages you're seeing,
> and try to work out if there's a setting you can tweak to avoid
> hitting that code path.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>
-- 
Alberto García Illera

GPG Public Key: https://goo.gl/twKUUv


Re: [Python-ideas] Fwd: grouping / dict of lists

2018-07-09 Thread Chris Barker via Python-ideas
On Fri, Jul 6, 2018 at 12:26 PM, Franklin? Lee wrote:

> I use this kind of function in several different projects over the
> years, and I rewrote it many times as needed.
>


> I added several options, such as:
> - key function
> - value function
> - "ignore": Skip values with these keys.
> - "postprocess": Apply a function to each group after completion.
> - Pass in the container to store in. For example, create an
> OrderedDict and pass it in. It may already hold items.
> - Specify the container for each group.
> - Specify how to add to the container for each group.
>

interesting...


> Then I cut it down to two optional parameters:
> - key function. If not provided, the iterable is considered to have
> key-value pairs.
>

OK -- seems we're all converging on that :-)


> - The storage container.
>

so this means you're passing in a full set of storage containers? I'm a bit
confused by that -- if they might be pre-populated, then they would need to
be instances, and you'd need to have one for every key -- how would you know
in advance what you needed???

I played around with passing in an optional storage object:

https://github.com/PythonCHB/grouper/commit/d986816905406ec402724beaed2b88c96df64469

but as we might want a list or a set, or a Counter, or ??? it got pretty
ugly, as lists and sets and Counters all have different APIs for adding
stuff. So I gave up and figured just saying "it's always a list" made the
most sense.


> Finally, I removed the key function, and only took pairs and an
> optional container. However, I don't remember why I removed the key
> function. It may be that I was writing throwaway lambdas, and I
> decided I might as well just write the transformation into the
> comprehension.


exactly -- but I suspect that may be because you were writing a
comprehension anyway, as you needed to manipulate the values, also -- so if
there were a value function, you could use either API.


> I think a key function is worth having.
>

I think there's more or less consensus on that too.

-CHB



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-ideas] The future of Python parallelism. The GIL. Subinterpreters. Actors.

2018-07-09 Thread Trent Nelson
On Sun, Jul 08, 2018 at 11:27:08AM -0700, David Foster wrote:

> I'd like to solicit some feedback on what might be the most
> efficient way to make forward progress on efficient parallelization
> in Python inside the same OS process. The most promising areas
> appear to be:

You might find PyParallel interesting, at least from a "here's what was
tried, it worked, but we're not doing it like that" perspective.

http://pyparallel.org

https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores

I still think it was a pretty successful proof-of-concept regarding
removing the GIL without having to actually remove it.  Performance was
pretty good too, as you can see in those graphs.

> -- 
> David Foster | Seattle, WA, USA

Regards,

Trent.

--
https://trent.me


Re: [Python-ideas] Where should grouping() live

2018-07-09 Thread Chris Barker via Python-ideas
On Fri, Jul 6, 2018 at 5:13 PM, Michael Selik  wrote:

> On Tue, Jul 3, 2018 at 10:11 PM Chris Barker via Python-ideas <
> python-ideas@python.org> wrote:
>
>> * There are types of data well suited to the key function approach, and
>> other data not so well suited to it. If you want to support the not as well
>> suited use cases, you should have a value function as well and/or take a
>> (key, value) pair.
>>
>> * There are some nice advantages in flexibility to having a Grouping
>> class, rather than simply a function.
>>
>
> The tri-grams example is interesting and shows some clever things you can
> do. The bi-grams example I wrote in my draft PEP could be extended to
> handle tri-grams with just a key-function, no value-function.
>

hmm, I'll take a look -- 'cause I found that I was really limited to only a
certain class of problems without a way to get "custom" values.

Do you mean the "foods" example?

>>> foods = [
... ('fruit', 'apple'),
... ('vegetable', 'broccoli'),
... ('fruit', 'clementine'),
... ('vegetable', 'daikon')
... ]
>>> groups = grouping(foods, key=lambda pair: pair[0])
>>> {k: [v for _, v in g] for k, g in groups.items()}
{'fruit': ['apple', 'clementine'], 'vegetable': ['broccoli', 'daikon']}


Because that one, I think, makes my point well. To get what you want, you
have to post-process the Grouping with a (somewhat complex) comprehension.
If someone is that adept with comprehensions, and wants to do it that way,
the grouping function isn't really buying them much at all, over
setdefault, or defaultdict, or roll your own.
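For comparison, the "roll your own" versions with setdefault and
defaultdict look roughly like this (plain standard-library code, with the
same foods data repeated so the snippet runs on its own):

```python
from collections import defaultdict

foods = [
    ('fruit', 'apple'),
    ('vegetable', 'broccoli'),
    ('fruit', 'clementine'),
    ('vegetable', 'daikon'),
]

# dict.setdefault: create the list on first sight of a key, then append.
by_setdefault = {}
for kind, name in foods:
    by_setdefault.setdefault(kind, []).append(name)

# defaultdict(list): missing keys get an empty list automatically.
by_defaultdict = defaultdict(list)
for kind, name in foods:
    by_defaultdict[kind].append(name)
```

Both give the same dict of lists as the comprehension above.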

Contrast this with:

groups = grouping(foods,
                  key=lambda pair: pair[0],
                  value=lambda pair: pair[1])

and you're done.

or:

groups = grouping(foods,
                  key=itemgetter(0),
                  value=itemgetter(1))


Or even better:

groups = grouping(foods)

:-)
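To make the three spellings concrete, here is a rough sketch of a function
accepting both `key` and `value`. It is only an illustration, not the code
in the grouper repository, and it assumes `value` is consulted only when a
`key` function is given:

```python
from operator import itemgetter


def grouping(iterable, key=None, value=None):
    """Sketch: return a dict of lists built from the iterable."""
    groups = {}
    for item in iterable:
        if key is None:
            k, v = item                                  # (key, value) pairs
        else:
            k = key(item)
            v = item if value is None else value(item)   # optional value func
        groups.setdefault(k, []).append(v)
    return groups


foods = [
    ('fruit', 'apple'),
    ('vegetable', 'broccoli'),
    ('fruit', 'clementine'),
    ('vegetable', 'daikon'),
]

with_lambdas = grouping(foods, key=lambda pair: pair[0], value=lambda pair: pair[1])
with_getters = grouping(foods, key=itemgetter(0), value=itemgetter(1))
with_pairs = grouping(foods)
```

All three calls build the same dict of lists.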

> However, because this example is fun it may be distracting from the core
> value of ``grouped`` or ``grouping``.
>

Actually, I think it's the opposite -- it opens up the concept to be more
general purpose -- I guess I'm thinking of this as a "dict with lists as the
values" that has many purposes beyond strictly "groupby". Maybe that's
because I'm a general python programmer, and not a database guy, but if
something is going to be added to the stdlib, why not add a more general
purpose class?


> I don't think we need a nicer API for complex grouping tasks. As the tasks
> get increasingly sophisticated, any general-purpose API will be less nice
> than something built for that specific task.
>

I guess this is where we disagree -- I think we've found an API that is
general purpose, and cleanly supports multiple tasks.

> Instead, I want the easiest possible interface for making groups for
> every-day use cases. The wide range of situations that ``sorted`` covers
> with just a key-function suggests that ``grouped`` should follow the same
> pattern.
>

not at all -- sorted() is about, well, sorting -- which means rearranging
items. I certainly don't expect it to break up the items for me.

Again, this is a matter of perspective -- if you start with "groupby"
as a concept, then I can see how you see the parallel with sorted -- you
are rearranging the items, but this time into groups.

But if you start with "a dict of lists", then you take a wider perspective:

- It can naturally and easily be used to group things
- It can do other nifty things
- And as a "dict of something", it's natural to think of keys AND values,
and to want a dict-like API -- i.e. pass in (key, value) pairs.

> I do think that the default, key=None, could be set to handle (key, value)
> pairs.
>

OK, so for my part, if you provide the (key, value) pair API, then you
don't really need a value_func. But as the "pass in a function to process
the data" model IS well suited to some tasks, and some people simply like
the style, why not?

And it creates an asymmetry: if you have a (key, the_item) problem, you can
use either the key function API or the (key, value) API -- but if you have
a (key, value) problem, you can only use the (key, value) API

> But I'm still reluctant to break the standard of sorted, min, max, and
> groupby.
>

This is the power of Python's keyword parameters -- anyone coming to this
from a perspective of "I expect this to be like sorted, min, max, and
groupby" can simply ignore the value parameter :-)

One more argument :-)

There have been comments about how some of the classes in
collections are maybe not needed -- Counter, in particular. I tend to
agree, but I think the reason Counter is not-that-useful is because it
doesn't do enough -- not that it isn't useful -- it's just such a thin
wrapper around a dict, that I hardly see the point.

Example:

In [12]: c = Counter()

In [13]: c['f'] += 1

In [14]: c['g'] = "some random thing"

In [15]: c
Out[15]: Counter({'f': 1, 'g': 'some random thing'})

Is that really that useful? I need to do the counting by hand, 

Re: [Python-ideas] The future of Python parallelism. The GIL. Subinterpreters. Actors.

2018-07-09 Thread Nick Coghlan
On 9 July 2018 at 04:27, David Foster  wrote:
> I'd like to solicit some feedback on what might be the most efficient way to
> make forward progress on efficient parallelization in Python inside the same
> OS process. The most promising areas appear to be:
>
> 1. Make the current subinterpreter implementation in Python have more
> complete isolation, sharing almost no state between subinterpreters. In
> particular not sharing the GIL. The "Interpreter Isolation" section of PEP
> 554 enumerates areas that are currently shared, some of which probably
> shouldn't be.
>
> 2. Give up on making things work inside the same OS process and rather focus
> on implementing better abstractions on top of the existing multiprocessing
> API so that the actor model is easier to program against. For example,
> providing some notion of Channels to communicate between lines of execution,
> a way to monitor the number of Messages waiting in each channel for
> throughput profiling and diagnostics, Supervision, etc. In particular I
> could do this by using an existing library like Pykka or Thespian and
> extending it where necessary.

Yep, that's basically the way Eric and I and a few others have been
thinking. Eric started off this year's language summit with a
presentation on the topic: https://lwn.net/Articles/754162/

The intent behind PEP 554 is to eventually get to a point where each
subinterpreter has its own dedicated eval loop lock, and the GIL
either disappears entirely (replaced by smaller purpose specific
locks) or becomes a read/write lock (where write access is only needed
to adjust certain state that is shared across subinterpreters).

On the multiprocessing front, it could be quite interesting to attempt
to adapt the channel API from PEP 554 to the
https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.sharedctypes
data sharing capabilities in the modern multiprocessing module.

Also of relevance is Antoine Pitrou's work on a new version of the
pickle protocol that allows for out-of-band data sharing to avoid
redundant memory copies: https://www.python.org/dev/peps/pep-0574/
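
PEP 574 was later accepted and shipped as pickle protocol 5 in Python 3.8. As a rough illustration of what "out-of-band" means there, the sketch below pickles a large bytearray twice: once in-band (the bytes are copied into the pickle stream) and once with a `buffer_callback`, so that the stream only carries a placeholder and the buffer is handed over separately. The variable names are made up for the example, and a real transport between processes would still have to move the collected buffers by some other means.

```python
import pickle

big = bytearray(b"x" * 1_000_000)

# In-band: the payload itself is copied into the pickle stream.
in_band = pickle.dumps(big, protocol=5)

# Out-of-band: wrap the data in a PickleBuffer and collect the buffers
# via buffer_callback; list.append returns None, which tells pickle to
# keep the buffer out of the stream.
buffers = []
out_of_band = pickle.dumps(pickle.PickleBuffer(big), protocol=5,
                           buffer_callback=buffers.append)

# The consumer must supply the same buffers, in order, when loading.
restored = pickle.loads(out_of_band, buffers=buffers)
restored_bytes = memoryview(restored).tobytes()
```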

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Calling python from C completely statically

2018-07-09 Thread Nick Coghlan
On 9 July 2018 at 03:10, Alberto Garcia  wrote:
> Hey there,
>
> Yes, the part of having the pyd modules built into the library is already done.
> I followed the instructions in the README. What I would like to know now is
> how to embed the non frozen python (py) modules. Can you guys please point
> me in the right direction.

The gist is to:

1. take the entire Lib directory and put it in a zip archive
2. use the approach demonstrated in cx_freeze to point sys.path in
your static executable at that zip archive
3. adjust your C code to point sys.path back at the executable itself,
and then combine your executable and the zip archive into a single
contiguous file (similar to what zipapp does with its helper script
and app archive)
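
Steps 1 and 2 can be tried out from pure Python before touching any C code. This is only a minimal sketch of the mechanism: it builds a throwaway one-module archive rather than zipping the full Lib directory, and the names `stdlib_demo.zip` and `demo_embedded_module` are made up for the example. In the embedded case the equivalent sys.path adjustment would be made from the C side.

```python
import importlib
import os
import sys
import tempfile
import zipfile
import zipimport

# Stand-in for "the entire Lib directory in a zip archive".
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "stdlib_demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_embedded_module.py", "ANSWER = 42\n")

# Point sys.path at the archive; the import system's zipimport hook
# then resolves imports from inside it.
sys.path.insert(0, archive)
importlib.invalidate_caches()

module = importlib.import_module("demo_embedded_module")
loaded_from_zip = isinstance(module.__loader__, zipimport.zipimporter)
```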

There are likely to still be rough edges when doing that, since this
isn't a well tested configuration. When all else fails, find the part
of the source code responsible for any error messages you're seeing,
and try to work out if there's a setting you can tweak to avoid
hitting that code path.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia