Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-29 Thread Nicolas Rolin
A syntax that would work (which atm is a syntax error, and requires no new
keyword) would be

student_by_school = {school: [student] for school, student in
student_school_list, grouped=True}

with grouped=True being a modifier on the dict comprehension so that at
each iteration of the loop

current_dict[key] = value if key not in current_dict else current_dict[key] + value
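
For illustration only, the update rule quoted above can be emulated in today's Python with a plain helper. This is a sketch under assumptions: `grouped_dict` is a made-up name, not an existing function, and the input is assumed to be (student, school) tuples as in the original post.

```python
def grouped_dict(pairs):
    """Emulate the proposed grouped=True modifier (hypothetical helper)."""
    current_dict = {}
    for key, value in pairs:
        # Same rule as in the proposal: concatenate when the key repeats.
        current_dict[key] = (
            value if key not in current_dict else current_dict[key] + value
        )
    return current_dict


student_school_list = [("Fred", "SchoolA"), ("Bob", "SchoolB"), ("Mary", "SchoolA")]
student_by_school = grouped_dict(
    (school, [student]) for student, school in student_school_list
)
print(student_by_school)
```

Note that `+` builds a new list every time a key repeats, which is the cost of concatenation raised elsewhere in this thread.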

This is an extremely borderline syntax (as it is perfectly legal to put
**{'grouped': True} in a dict display), but it works.
It even keeps the extremely important "should look like a template of the
final object" property.

But it doesn't require me to define 2 lambda functions just to do the job
of a comprehension.

-- 
Nicolas Rolin


2018-06-29 4:57 GMT+02:00 Michael Selik :

> On Thu, Jun 28, 2018, 6:46 PM Nicolas Rolin 
> wrote:
>
>> The questions I should have asked in my original post were:
>> - Is splitting lists into sublists (by grouping elements) a high level
>> enough construction to be worthy of a nice integration in the comprehension
>> syntax ?
>>
>
> My intuition is no, it's not important enough to alter the syntax, despite
> being an important task.
>
>> - In which case, is there a way to find a simple syntax that is not too
>> confusing ?
>>
>
> If you'd like to give it a shot, try to find something which is currently
> invalid syntax, but does not break compatibility. The latter criterion means
> no new keywords. The syntax should look nice as a single line with
> reasonably verbose variable names.
>
> One issue is that Python code is mostly 1-dimensional, characters in a
> line, and you're trying to express something which is 2-dimensional, in a
> sense. There's only so much you can do without newlines and indentation.
>



___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Greg Ewing

Steven D'Aprano wrote:
> On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas wrote:
>
>> In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in*
>> groupby(sorted(student_school_list,
>> key=*lambda* t: t[1]), key=*lambda* t: t[
>> ...
>
> the rest ought to be legal Python but isn't

We should *make* it legal Python code! Then there would be no
difficulty with adding new keywords!

--
Greg



Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Michael Selik
On Thu, Jun 28, 2018, 6:46 PM Nicolas Rolin  wrote:

> The questions I should have asked in my original post were:
> - Is splitting lists into sublists (by grouping elements) a high level
> enough construction to be worthy of a nice integration in the comprehension
> syntax ?
>

My intuition is no, it's not important enough to alter the syntax, despite
being an important task.

> - In which case, is there a way to find a simple syntax that is not too
> confusing ?
>

If you'd like to give it a shot, try to find something which is currently
invalid syntax, but does not break compatibility. The latter criterion means
no new keywords. The syntax should look nice as a single line with
reasonably verbose variable names.

One issue is that Python code is mostly 1-dimensional, characters in a
line, and you're trying to express something which is 2-dimensional, in a
sense. There's only so much you can do without newlines and indentation.


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Nicolas Rolin
2018-06-28 22:34 GMT+02:00 David Mertz :

> I agree with these recommendations. There are excellent 3rd party tools
> that do what you want. This is way too much to try to shoehorn into a
> comprehension.
>

There are actually no 3rd party tools that can "do what I want", because if
I wanted to have a function to do a group by, I would have taken the 5
minutes and 7 lines necessary to do so (or don't use a function and do my 3
liner).

My main point is that comprehensions in python are very powerful and you
can do pretty much any basic data manipulation that you want with it EXCEPT
when you want to "split" a list in sublists, in which case you have either
to use functions or a for loop.
You can note that with list comprehension you can flatten an iterable (from
sublists to a single list) with the [a for b in c for a in b] syntax, but
doing the inverse operation is impossible.
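
A small sketch of the asymmetry described above, using invented sample data: flattening fits in one comprehension, while the inverse needs state carried across iterations, so it is written here as the explicit defaultdict loop from the original post.

```python
from collections import defaultdict

nested = [["Fred", "Mary"], ["Bob", "Jane"], ["Nancy"]]

# Flattening: one comprehension with two "for" clauses is enough.
flat = [a for b in nested for a in b]

# The inverse direction: each step must look at what was collected so far,
# which a plain comprehension cannot express, hence the loop.
student_school_list = [
    ("Fred", "SchoolA"), ("Bob", "SchoolB"), ("Mary", "SchoolA"),
    ("Jane", "SchoolB"), ("Nancy", "SchoolC"),
]
student_by_school = defaultdict(list)
for student, school in student_school_list:
    student_by_school[school].append(student)

print(flat)
print(dict(student_by_school))
```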

The questions I should have asked in my original post were:
- Is splitting lists into sublists (by grouping elements) a high level
enough construction to be worthy of a nice integration in the comprehension
syntax ?
- In which case, is there a way to find a simple syntax that is not too
confusing ?

My personal answer would be respectively "yes" and "maybe, I don't know".
I was hoping to have some views on the topic, and it seems to have been a bit
sidetracked :)

-- 
Nicolas Rolin


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Wes Turner
Ctrl-Shift-V pastes without HTML formatting.

On Thursday, June 28, 2018, Steven D'Aprano  wrote:

> Can I make a plea for people to not post code with source highlighting
> as HTML please? It is rendered like this for some of us:
>
> On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas
> wrote:
>
> In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in*
> groupby(sorted(student_school_list,
> key=*lambda* t: t[1]), key=*lambda* t: t[
> ...
>
> (Aside from the iPython prompt, the rest ought to be legal Python but
> isn't because of the extra asterisks added.)
>
> And in the archives:
>
> https://mail.python.org/pipermail/python-ideas/2018-June/051723.html
>
> Gmail, I believe, has a "Paste As Plain Text" command in the
> right-click menu. Or possibly find a way to copy the text without
> formatting in the first case.
>
>
> Thanks,
>
>
> --
> Steve


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread pylang
There are a few tools that can accomplish these map-reduce/transformation
tasks.
See Options A, B, C below.

# Given
>>> import itertools as it
>>> import collections as ct

>>> import more_itertools as mit


>>> student_school_list = [
...     ("Albert", "Prospectus"), ("Max", "Smallville"),
...     ("Nikola", "Shockley"), ("Maire", "Excelsior"),
...     ("Neils", "Smallville"), ("Ernest", "Tabbicage"),
...     ("Michael", "Shockley"), ("Stephen", "Prospectus"),
... ]


>>> kfunc = lambda x: x[1]
>>> vfunc = lambda x: x[0]
>>> sorted_iterable = sorted(student_school_list, key=kfunc)


# Example (see OP)
>>> student_by_school = ct.defaultdict(list)
>>> for student, school in student_school_list:
...     student_by_school[school].append(student)
>>> student_by_school
defaultdict(list,
{'Prospectus': ['Albert', 'Stephen'],
 'Smallville': ['Max', 'Neils'],
 'Shockley': ['Nikola', 'Michael'],
 'Excelsior': ['Maire'],
 'Tabbicage': ['Ernest']})

---

# Options

# A: itertools.groupby
>>> {k: [x[0] for x in v]
...  for k, v in it.groupby(sorted_iterable, key=kfunc)}
{'Excelsior': ['Maire'],
'Prospectus': ['Albert', 'Stephen'],
'Shockley': ['Nikola', 'Michael'],
'Smallville': ['Max', 'Neils'],
'Tabbicage': ['Ernest']}

# B: more_itertools.groupby_transform
>>> {k: list(v)
...  for k, v in mit.groupby_transform(sorted_iterable, keyfunc=kfunc, valuefunc=vfunc)}
{'Excelsior': ['Maire'],
 'Prospectus': ['Albert', 'Stephen'],
 'Shockley': ['Nikola', 'Michael'],
 'Smallville': ['Max', 'Neils'],
 'Tabbicage': ['Ernest']}

# C: more_itertools.map_reduce
>>> mit.map_reduce(student_school_list, keyfunc=kfunc, valuefunc=vfunc)
defaultdict(None,
{'Prospectus': ['Albert', 'Stephen'],
 'Smallville': ['Max', 'Neils'],
 'Shockley': ['Nikola', 'Michael'],
 'Excelsior': ['Maire'],
 'Tabbicage': ['Ernest']})

---

# Summary

- Option A: standard library, sorted iterable, some manual value
transformations (via list comprehension)
- Option B: third-party tool, sorted iterable, accepts a value
transformation function
- Option C: third-party tool, any iterable, accepts transformation
function(s)

I have grown to like `itertools.groupby`, but I understand it can be odd at
first. Perhaps something like the `map_reduce` tool (or approach) may help?
It's simple, does not require a sorted iterable as in A and B, and you have
control over how you want your keys, values and aggregated/reduced values to
be (see docs for more details).


# Documentation

- Option A:
https://docs.python.org/3/library/itertools.html#itertools.groupby
- Option B:
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.groupby_transform
- Option C:
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.map_reduce

On Thu, Jun 28, 2018 at 8:37 PM, Chris Barker - NOAA Federal via
Python-ideas  wrote:

> > On Jun 28, 2018, at 5:30 PM, Chris Barker - NOAA Federal <
> chris.bar...@noaa.gov> wrote:
> >
> > So maybe a solution is an accumulator special case of defaultdict -- it
> > uses a list by default and appends by default.
> >
> > Almost like counter...
>
> Which, of course, is pretty much what your proposal is.
>
> Which makes me think — a new classmethod on the builtin dict is a
> pretty heavy lift compared to a new type of dict in the collections
> module.
>
> -CHB


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker - NOAA Federal via Python-ideas
> On Jun 28, 2018, at 5:30 PM, Chris Barker - NOAA Federal 
>  wrote:
>
> So maybe a solution is an accumulator special case of defaultdict -- it uses a
> list by default and appends by default.
>
> Almost like counter...

Which, of course, is pretty much what your proposal is.

Which makes me think — a new classmethod on the builtin dict is a
pretty heavy lift compared to a new type of dict in the collections
module.

-CHB


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker - NOAA Federal via Python-ideas
> I think you accidentally swapped variables there:
> student_school_list
> vs student_by_school

Oops, yeah. That’s what I get for whipping out a message before catching a bus.

(And on a phone now)

But maybe you could wrap the defaultdict constructor around a
generator expression that transforms the list first.

That would get the keys right. Though still not call append for you.
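
A quick sketch of that idea, with sample data assumed from earlier in the thread: swapping each pair inside a generator expression does make the school the key, but the constructor only assigns, so later students overwrite earlier ones.

```python
from collections import defaultdict

student_school_list = [
    ("Fred", "SchoolA"), ("Bob", "SchoolB"), ("Mary", "SchoolA"),
]

# Swap each (student, school) pair so that the school becomes the key.
by_school = defaultdict(
    list, ((school, student) for student, school in student_school_list)
)

# The keys are right, but each value is just the last student seen:
# the constructor assigns values and never calls append.
print(dict(by_school))  # {'SchoolA': 'Mary', 'SchoolB': 'Bob'}
```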

So maybe a solution is an accumulator special case of defaultdict -- it
uses a list by default and appends by default.

Almost like counter...
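
One possible shape for such an accumulator, as a rough sketch rather than a worked-out design. `Grouping` is an invented name (the standard library has no such class), and the choice to override `update` so that it appends is an assumption made for this example.

```python
from collections import defaultdict


class Grouping(defaultdict):
    """Hypothetical accumulator dict: list values, fed with (key, value) pairs."""

    def __init__(self, pairs=()):
        super().__init__(list)
        self.update(pairs)

    def update(self, pairs=()):
        # Unlike dict.update, repeated keys accumulate instead of overwriting.
        for key, value in pairs:
            self[key].append(value)


student_school_list = [
    ("Fred", "SchoolA"), ("Bob", "SchoolB"), ("Mary", "SchoolA"),
    ("Jane", "SchoolB"), ("Nancy", "SchoolC"),
]
student_by_school = Grouping(
    (school, student) for student, school in student_school_list
)
print(dict(student_by_school))
```

Much as Counter counts what it is given, this collects the values themselves, so the whole grouping becomes a single expression.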

-CHB


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread David Mertz
I think you cheated a little in your cut-and-paste.  `student_by_school` is
not defined in the code you've shown.  What you **did** define,
`student_school_list`, doesn't give you what you want if you use
`defaultdict(list, student_school_list)`.

I thought for a moment I might just use:

[(b,a) for a,b in student_school_list]


But that's wrong for reasons that are probably obvious to everyone else.
I'm not really sure what `student_by_school` could possibly be to make this
work as shown.

On Thu, Jun 28, 2018 at 8:13 PM Chris Barker via Python-ideas <
python-ideas@python.org> wrote:

> In [97]: student_school_list
> Out[97]:
> [('Fred', 'SchoolA'),
>  ('Bob', 'SchoolB'),
>  ('Mary', 'SchoolA'),
>  ('Jane', 'SchoolB'),
>  ('Nancy', 'SchoolC')]
>
> In [98]: result = defaultdict(list, student_by_school)
>
> In [99]: result.items()
> Out[99]: dict_items([('SchoolA', ['Fred', 'Mary']), ('SchoolB', ['Bob',
> 'Jane']), ('SchoolC', ['Nancy'])])
>
> So:  never mind 
>
> -CHB
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov


-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Michael Selik
On Thu, Jun 28, 2018 at 5:12 PM Chris Barker via Python-ideas <
python-ideas@python.org> wrote:

> In [97]: student_school_list
> Out[97]:
> [('Fred', 'SchoolA'),
>  ('Bob', 'SchoolB'),
>  ('Mary', 'SchoolA'),
>  ('Jane', 'SchoolB'),
>  ('Nancy', 'SchoolC')]
>
> In [98]: result = defaultdict(list, student_by_school)
>
> In [99]: result.items()
> Out[99]: dict_items([('SchoolA', ['Fred', 'Mary']), ('SchoolB', ['Bob',
> 'Jane']), ('SchoolC', ['Nancy'])])
>

Wait, wha...

In [1]: from collections import defaultdict

In [2]: students = [('Fred', 'SchoolA'),
   ...:  ('Bob', 'SchoolB'),
   ...:  ('Mary', 'SchoolA'),
   ...:  ('Jane', 'SchoolB'),
   ...:  ('Nancy', 'SchoolC')]
   ...:

In [3]: defaultdict(list, students)
Out[3]:
defaultdict(list,
{'Fred': 'SchoolA',
 'Bob': 'SchoolB',
 'Mary': 'SchoolA',
 'Jane': 'SchoolB',
 'Nancy': 'SchoolC'})

In [4]: defaultdict(list, students).items()
Out[4]: dict_items([('Fred', 'SchoolA'), ('Bob', 'SchoolB'), ('Mary',
'SchoolA'), ('Jane', 'SchoolB'), ('Nancy', 'SchoolC')])


I think you accidentally swapped variables there:
student_school_list
vs student_by_school


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker via Python-ideas
Hold the phone!

On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin 
wrote:

> student_by_school = defaultdict(list)
> for student, school in student_school_list:
> student_by_school[school].append(student)
>
> What I would expect would be a syntax with comprehension allowing me to
> write something along the lines of:
>
> student_by_school = {group_by(school): student for school, student in
> student_school_list}
>

OK -- I agreed that this could/should be easier, and pretty much like using
setdefault, but did like the single expression thing, so went to "there
should be a way to make a defaultdict comprehension" -- and played with
itertools.groupby (which is really really awkward for this), but then light
dawned on Marblehead:

I've noticed (and taught) that dict comprehensions are kinda redundant with
the dict() constructor, and _think_, in fact, that they were added before
the current dict() constructor was added.

so, if you think "dict constructor" rather than dict comprehensions, you
realize that defaultdict takes the same arguments as the dict(), so the
above is:

defaultdict(list, student_by_school)

which really couldn't be any cleaner and neater.

Here it is in action:

In [97]: student_school_list
Out[97]:
[('Fred', 'SchoolA'),
 ('Bob', 'SchoolB'),
 ('Mary', 'SchoolA'),
 ('Jane', 'SchoolB'),
 ('Nancy', 'SchoolC')]

In [98]: result = defaultdict(list, student_by_school)

In [99]: result.items()
Out[99]: dict_items([('SchoolA', ['Fred', 'Mary']), ('SchoolB', ['Bob',
'Jane']), ('SchoolC', ['Nancy'])])

So:  never mind 

-CHB



Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker via Python-ideas
On Thu, Jun 28, 2018 at 4:59 PM, Steven D'Aprano 
wrote:

> Can I make a plea for people to not post code with source highlighting
> as HTML please? It is rendered like this for some of us:
>
> On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas
> wrote:
>
> In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in*
> groupby(sorted(student_school_list,
> key=*lambda* t: t[1]), key=*lambda* t: t[
>

Oh god -- yeach!! -- sorry about that -- that was copied and pasted from
iPython -- I was assuming it would strip out the formatting and give
reasonable plain text -- but apparently not.

I'll stop that.

-CHB


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Steven D'Aprano
On Thu, Jun 28, 2018 at 11:23:49AM -0700, Michael Selik wrote:

> The fact that you didn't use ``setdefault`` here, opting for repeatedly
> constructing new lists via concatenation, demonstrates the need for a
> built-in or standard library tool that is easier to use.

That would be setdefault :-)

What it indicates to me is the need for people to learn to use 
setdefault, rather than new syntax :-)
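
For readers who have not met it, a minimal sketch of the setdefault idiom applied to the grouping from the original post (the sample data is taken from elsewhere in the thread):

```python
student_school_list = [
    ("Fred", "SchoolA"), ("Bob", "SchoolB"), ("Mary", "SchoolA"),
    ("Jane", "SchoolB"), ("Nancy", "SchoolC"),
]

student_by_school = {}
for student, school in student_school_list:
    # setdefault returns the list already stored under the key, or stores
    # the given empty list first and returns it; append then mutates in place.
    student_by_school.setdefault(school, []).append(student)

print(student_by_school)
```

The result is a plain dict, and no list is rebuilt by concatenation along the way.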


-- 
Steve


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Steven D'Aprano
Can I make a plea for people to not post code with source highlighting 
as HTML please? It is rendered like this for some of us:

On Thu, Jun 28, 2018 at 10:01:00AM -0700, Chris Barker via Python-ideas wrote:

In [*46*]: {a:[t[0] *for* t *in* b] *for* a,b *in*
groupby(sorted(student_school_list,
key=*lambda* t: t[1]), key=*lambda* t: t[
...

(Aside from the iPython prompt, the rest ought to be legal Python but 
isn't because of the extra asterisks added.)

And in the archives:

https://mail.python.org/pipermail/python-ideas/2018-June/051723.html

Gmail, I believe, has a "Paste As Plain Text" command in the 
right-click menu. Or possibly find a way to copy the text without 
formatting in the first case. 


Thanks,


-- 
Steve


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Michael Selik
On Thu, Jun 28, 2018 at 4:34 PM Chris Barker via Python-ideas <
python-ideas@python.org> wrote:

> On Thu, Jun 28, 2018 at 4:23 PM, Greg Ewing 
> wrote:
>
>> Nicolas Rolin wrote:
>>
>>> student_by_school = {group_by(school): student for school, student
>>> in student_school_list}
>>>
>>
>> In the spirit of making the target expression look like
>> a template for the generated elements,
>>
>>{school: [student...] for school, student in student_school_list}
>
>
> hmm -- this seems a bit non-general -- would this only work for a list?
> maybe you would want a set, or???
>
> so could we get a defaultdict comprehension with something like:
>
> { school: (default_factory=list, student) for school, student in
> student_school_list }
>
> But I can't think of a reasonable syntax to make that work.
>

Many languages with a group-by or grouping function choose to return a
mapping of sequences, requiring any reduction, aggregation, or
transformation of those sequences to be performed after the grouping.
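
A sketch of that two-step style in Python, assuming a made-up `group_by` helper and invented score data: the grouping step only returns a mapping of lists, and any reduction is an ordinary comprehension applied afterwards.

```python
from collections import defaultdict


def group_by(pairs):
    """Hypothetical helper: map each key to the list of its values."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


# Invented (school, score) data, for illustration only.
scores = [("SchoolA", 10), ("SchoolB", 7), ("SchoolA", 5)]

groups = group_by(scores)

# Reductions happen after the grouping, one comprehension each.
totals = {school: sum(values) for school, values in groups.items()}
unique = {school: set(values) for school, values in groups.items()}

print(groups)
print(totals)
```

Keeping the grouping general (always lists) also covers the "maybe you would want a set" case quoted above: the caller converts afterwards.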


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker via Python-ideas
On Thu, Jun 28, 2018 at 4:23 PM, Greg Ewing 
wrote:

> Nicolas Rolin wrote:
>
>> student_by_school = {group_by(school): student for school, student in
>> student_school_list}
>>
>
> In the spirit of making the target expression look like
> a template for the generated elements,
>
>{school: [student...] for school, student in student_school_list}


hmm -- this seems a bit non-general -- would this only work for a list?
maybe you would want a set, or???

so could we get a defaultdict comprehension with something like:

{ school: (default_factory=list, student) for school, student in
student_school_list }

But I can't think of a reasonable syntax to make that work.

-CHB











Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Greg Ewing

Nicolas Rolin wrote:
student_by_school = {group_by(school): student for school, student 
in student_school_list}


In the spirit of making the target expression look like
a template for the generated elements,

   {school: [student...] for school, student in student_school_list}

--
Greg


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker via Python-ideas
On Thu, Jun 28, 2018 at 1:34 PM, David Mertz  wrote:

> I'd add one more option. You want something that behaves like SQL. Right
> in the standard library is sqlite3, and you can create an in-memory DB to
> hold the data you expect to group.
>

There are also packages designed to make DB-style queries easier.

Here's one I found with a quick google.

-CHB







Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker via Python-ideas
On Thu, Jun 28, 2018 at 3:17 PM, Chris Barker  wrote:

> There are also packages designed to make DB-style queries easier.
>
> Here's one I found with a quick google.
>

oops -- hit send too soon:

http://178.62.194.22/

https://github.com/pythonql/pythonql

-CHB



Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread David Mertz
I agree with these recommendations. There are excellent 3rd party tools
that do what you want. This is way too much to try to shoehorn into a
comprehension.

I'd add one more option. You want something that behaves like SQL. Right in
the standard library is sqlite3, and you can create an in-memory DB to hold
the data you expect to group.
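
A minimal sketch of the sqlite3 route, under stated assumptions: the table name `enrolment` and its column names are invented for this example, and the sample rows are the ones used elsewhere in the thread.

```python
import sqlite3

student_school_list = [
    ("Fred", "SchoolA"), ("Bob", "SchoolB"), ("Mary", "SchoolA"),
    ("Jane", "SchoolB"), ("Nancy", "SchoolC"),
]

# ":memory:" creates a throwaway database that lives only in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolment (student TEXT, school TEXT)")
conn.executemany("INSERT INTO enrolment VALUES (?, ?)", student_school_list)

rows = conn.execute(
    "SELECT school, group_concat(student) FROM enrolment GROUP BY school"
).fetchall()
conn.close()

# group_concat joins with "," by default and does not promise an order
# within a group, so the names are split and sorted here.
student_by_school = {school: sorted(names.split(",")) for school, names in rows}
print(student_by_school)
```

Splitting on "," breaks for names that contain a comma; selecting the plain rows with ORDER BY school and grouping them in Python avoids that.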

On Thu, Jun 28, 2018, 3:48 PM Wes Turner  wrote:

> PyToolz, Pandas, Dask .groupby()
>
> toolz.itertoolz.groupby does this succinctly without any
> new/magical/surprising syntax.
>
> https://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.groupby
>
> From https://github.com/pytoolz/toolz/blob/master/toolz/itertoolz.py :
>
> """
> def groupby(key, seq):
>     """ Group a collection by a key function
>     >>> names = ['Alice', 'Bob', 'Charlie', 'Dan', 'Edith', 'Frank']
>     >>> groupby(len, names)  # doctest: +SKIP
>     {3: ['Bob', 'Dan'], 5: ['Alice', 'Edith', 'Frank'], 7: ['Charlie']}
>     >>> iseven = lambda x: x % 2 == 0
>     >>> groupby(iseven, [1, 2, 3, 4, 5, 6, 7, 8])  # doctest: +SKIP
>     {False: [1, 3, 5, 7], True: [2, 4, 6, 8]}
>     Non-callable keys imply grouping on a member.
>     >>> groupby('gender', [{'name': 'Alice', 'gender': 'F'},
>     ...                    {'name': 'Bob', 'gender': 'M'},
>     ...                    {'name': 'Charlie', 'gender': 'M'}]) # doctest:+SKIP
>     {'F': [{'gender': 'F', 'name': 'Alice'}],
>      'M': [{'gender': 'M', 'name': 'Bob'},
>            {'gender': 'M', 'name': 'Charlie'}]}
>     See Also:
>         countby
>     """
>     if not callable(key):
>         key = getter(key)
>     d = collections.defaultdict(lambda: [].append)
>     for item in seq:
>         d[key(item)](item)
>     rv = {}
>     for k, v in iteritems(d):
>         rv[k] = v.__self__
>     return rv
> """
>
> If you're willing to install Pandas (and NumPy, and ...), there's
> pandas.DataFrame.groupby:
>
>
> https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html
>
>
> https://github.com/pandas-dev/pandas/blob/v0.23.1/pandas/core/generic.py#L6586-L6659
>
>
> Dask has a different groupby implementation:
>
> https://gist.github.com/darribas/41940dfe7bf4f987eeaa#file-pandas_dask_test-ipynb
>
>
> https://dask.pydata.org/en/latest/dataframe-api.html#dask.dataframe.DataFrame.groupby
>
>
> On Thursday, June 28, 2018, Chris Barker via Python-ideas <
> python-ideas@python.org> wrote:
>
>> On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin 
>> wrote:
>>>
>>> I use list and dict comprehension a lot, and a problem I often have is
>>> to do the equivalent of a group_by operation (to use sql terminology).
>>>
>>
>> I don't know from SQL, so "group by" doesn't mean anything to me, but
>> this:
>>
>>
>>> For example if I have a list of tuples (student, school) and I want to
>>> have the list of students by school the only option I'm left with is to
>>> write
>>>
>>> student_by_school = defaultdict(list)
>>> for student, school in student_school_list:
>>> student_by_school[school].append(student)
>>>
>>
>> seems to me that the issue here is that there is not way to have a
>> "defaultdict comprehension"
>>
>> I can't think of syntactically clean way to make that possible, though.
>>
>> Could itertools.groupby help here? It seems to work, but boy! it's ugly:
>>
>> In [45]: student_school_list
>> Out[45]:
>> [('Fred', 'SchoolA'),
>>  ('Bob', 'SchoolB'),
>>  ('Mary', 'SchoolA'),
>>  ('Jane', 'SchoolB'),
>>  ('Nancy', 'SchoolC')]
>>
>> In [46]: {a: [t[0] for t in b] for a, b in
>>     ...:  groupby(sorted(student_school_list, key=lambda t: t[1]),
>>     ...:          key=lambda t: t[1])}
>> Out[46]: {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'],
>> 'SchoolC': ['Nancy']}
>>
>>
>> -CHB
>>
>>
>> --
>>
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R(206) 526-6959   voice
>> 7600 Sand Point Way NE   (206) 526-6329   fax
>> Seattle, WA  98115   (206) 526-6317   main reception
>>
>> chris.bar...@noaa.gov
>>


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Rhodri James

On 28/06/18 16:25, Nicolas Rolin wrote:

Hi,

I use list and dict comprehension a lot, and a problem I often have is to
do the equivalent of a group_by operation (to use sql terminology).

For example if I have a list of tuples (student, school) and I want to have
the list of students by school the only option I'm left with is to write

    student_by_school = defaultdict(list)
    for student, school in student_school_list:
        student_by_school[school].append(student)

What I would expect would be a syntax with comprehension allowing me to
write something along the lines of:

 student_by_school = {group_by(school): student for school, student in
student_school_list}

or any other syntax that allows me to regroup items from an iterable.



Sorry, I don't like the extra load on comprehensions here.  You are 
doing something inherently somewhat complicated and then attempting to 
hide the magic.  Worse, you are hiding it by pretending to be something 
else (an ordinary comprehension), which will break people's intuition 
about what is being produced.


--
Rhodri James *-* Kynesim Ltd


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Michael Selik
On Thu, Jun 28, 2018 at 10:24 AM Rob Cliffe via Python-ideas <
python-ideas@python.org> wrote:

> def group_by(iterable, groupfunc, itemfunc=lambda x: x, sortfunc=lambda x: x):
>     # Python 2 & 3 compatible!
>     D = {}
>     for x in iterable:
>         group = groupfunc(x)
>         D[group] = D.get(group, []) + [itemfunc(x)]
>     if sortfunc is not None:
>         for group in D:
>             D[group] = sorted(D[group], key=sortfunc)
>     return D
>

The fact that you didn't use ``setdefault`` here, opting for repeatedly
constructing new lists via concatenation, demonstrates the need for a
built-in or standard library tool that is easier to use.
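
A minimal sketch of the ``setdefault`` variant alluded to above. This is an
illustration only, not the forthcoming proposal; the sorting step of the
quoted helper is left out for brevity, and the sample data is borrowed from
Rob's message.

```python
def group_by(iterable, groupfunc, itemfunc=lambda x: x):
    """Group items into lists keyed by groupfunc(item)."""
    groups = {}
    for x in iterable:
        # setdefault returns the existing list for this key, or stores and
        # returns a new empty one, so each item is appended in place instead
        # of being copied into a freshly concatenated list.
        groups.setdefault(groupfunc(x), []).append(itemfunc(x))
    return groups


student_list = [('james', 'Dublin'), ('jim', 'Cork'), ('mary', 'Cork'),
                ('fred', 'Dublin')]
student_by_school = group_by(student_list,
                             lambda stu_sch: stu_sch[1],
                             lambda stu_sch: stu_sch[0])
print(student_by_school)
```

Appending in place keeps the whole pass linear in the number of items,
whereas rebuilding the list on every iteration copies each group repeatedly.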

I'll submit a proposal for your review soon.


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Rob Cliffe via Python-ideas

Why not write a helper function?  Something like

def group_by(iterable, groupfunc, itemfunc=lambda x: x, sortfunc=lambda x: x):
    # Python 2 & 3 compatible!
    D = {}
    for x in iterable:
        group = groupfunc(x)
        D[group] = D.get(group, []) + [itemfunc(x)]
    if sortfunc is not None:
        for group in D:
            D[group] = sorted(D[group], key=sortfunc)
    return D

Then:

student_list = [ ('james', 'Dublin'), ('jim', 'Cork'), ('mary', 'Cork'), 
('fred', 'Dublin') ]
student_by_school = group_by(student_list, lambda stu_sch : stu_sch[1], 
lambda stu_sch : stu_sch[0])

print (student_by_school)

{'Dublin': ['fred', 'james'], 'Cork': ['jim', 'mary']}

Regards

Rob Cliffe


On 28/06/2018 16:25, Nicolas Rolin wrote:

Hi,

I use list and dict comprehension a lot, and a problem I often have is 
to do the equivalent of a group_by operation (to use sql terminology).


For example if I have a list of tuples (student, school) and I want to 
have the list of students by school the only option I'm left with is 
to write


    student_by_school = defaultdict(list)
    for student, school in student_school_list:
        student_by_school[school].append(student)

What I would expect would be a syntax with comprehension allowing me 
to write something along the lines of:


    student_by_school = {group_by(school): student for school, student 
in student_school_list}


or any other syntax that allows me to regroup items from an iterable.


Small FAQ:

Q: Why include something in comprehensions when you can do it in a 
small number of lines ?


A: A really valuable feature of list and dict comprehensions is that
they let the developer be explicit about what a given line is meant
to do.
If you see a comprehension, you know that the developer wanted to build
an iterable with no side effect other than depleting the iterator
(assuming reasonable code guidelines are followed).
Initializing an object and then running a for loop to fill it is both
too long and not explicit enough about what is intended.
That pattern should be reserved for intrinsically complex operations,
not for one of the basic operations one wants to do with lists and dicts.



Q: Why group by in particular ?

A: If we take SQL queries
(https://en.wikipedia.org/wiki/SQL_syntax#Queries) as a reasonable way
of seeing how people need to manipulate data on a day-to-day basis, we
can see that dict comprehensions already cover most of the basic
operations, the only missing operations being group by and having.


Q: Why not use it on list with syntax such as
    student_by_school = [
        school, student
        for school, student in student_school_list
        group by school
    ]
?

A: It would create either a discrepancy with iterators or a perhaps 
misleading semantic (the one from itertools.groupby, which requires 
the iterable to be sorted in order to be useful).
Having the option to do it with a dict removes any ambiguity and should
be enough to cover most "group by" applications.
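
To make the itertools.groupby caveat concrete, here is a small sketch (the
sample data is made up for illustration) of what happens when the input is
not sorted by the grouping key:

```python
from itertools import groupby

pairs = [('SchoolA', 'Fred'), ('SchoolB', 'Bob'), ('SchoolA', 'Mary')]

# Without sorting, groupby starts a new group every time the key changes,
# so 'SchoolA' appears as two separate runs.
runs = [(school, [student for _, student in group])
        for school, group in groupby(pairs, key=lambda pair: pair[0])]
print(runs)

# Building a dict from those runs silently keeps only the last run per key.
print(dict(runs))
```

The second print shows the silent data loss: Fred disappears because the
later 'SchoolA' run overwrites the earlier one.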



Examples:

    edible_list = [('fruit', 'orange'), ('meat', 'eggs'), ('meat', 'spam'),
                   ('fruit', 'apple'), ('vegetable', 'fennel'),
                   ('fruit', 'pineapple'), ('fruit', 'pineapple'),
                   ('vegetable', 'carrot')]
    edible_list_by_food_type = {group_by(food_type): edible
                                for food_type, edible in edible_list}


    print(edible_list_by_food_type)
    {'fruit': ['orange', 'apple', 'pineapple', 'pineapple'],
     'meat': ['eggs', 'spam'], 'vegetable': ['fennel', 'carrot']}



    bank_transactions = [200.0, -357.0, -9.99, -15.6, 4320.0, -12000]
    splited_bank_transactions = {group_by('credit' if amount > 0 else 'debit'): amount
                                 for amount in bank_transactions}


    print(splited_bank_transactions)
    {'credit': [200.0, 4320.0], 'debit': [-357.0, -9.99, -15.6, -12000]}
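
For comparison, a sketch of how the bank transaction grouping can be
computed with valid syntax today, using the defaultdict pattern from the
start of this message. The name split_bank_transactions is only
illustrative.

```python
from collections import defaultdict

bank_transactions = [200.0, -357.0, -9.99, -15.6, 4320.0, -12000]

split_bank_transactions = defaultdict(list)
for amount in bank_transactions:
    # The key expression is the one the proposed syntax would pass to group_by().
    split_bank_transactions['credit' if amount > 0 else 'debit'].append(amount)

# Convert back to a plain dict so that later lookups of missing keys raise
# KeyError instead of quietly creating empty lists.
print(dict(split_bank_transactions))
```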



--
Nicolas Rolin

 




Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Chris Barker via Python-ideas
On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin 
wrote:
>
> I use list and dict comprehension a lot, and a problem I often have is to
> do the equivalent of a group_by operation (to use sql terminology).
>

I don't know from SQL, so "group by" doesn't mean anything to me, but this:


> For example if I have a list of tuples (student, school) and I want to
> have the list of students by school the only option I'm left with is to
> write
>
> student_by_school = defaultdict(list)
> for student, school in student_school_list:
>     student_by_school[school].append(student)
>

seems to me that the issue here is that there is no way to have a
"defaultdict comprehension"

I can't think of a syntactically clean way to make that possible, though.

Could itertools.groupby help here? It seems to work, but boy! it's ugly:

In [45]: student_school_list
Out[45]:
[('Fred', 'SchoolA'),
 ('Bob', 'SchoolB'),
 ('Mary', 'SchoolA'),
 ('Jane', 'SchoolB'),
 ('Nancy', 'SchoolC')]

In [46]: {a: [t[0] for t in b] for a, b in
    ...:  groupby(sorted(student_school_list, key=lambda t: t[1]),
    ...:          key=lambda t: t[1])}
Out[46]: {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'],
'SchoolC': ['Nancy']}
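
The same sort-then-group approach reads a little better as a standalone
script with operator.itemgetter in place of the two lambdas. This is a
sketch only; the names by_school and pairs are illustrative.

```python
from itertools import groupby
from operator import itemgetter

student_school_list = [('Fred', 'SchoolA'), ('Bob', 'SchoolB'),
                       ('Mary', 'SchoolA'), ('Jane', 'SchoolB'),
                       ('Nancy', 'SchoolC')]

by_school = itemgetter(1)  # key function: the school is the second field

# sorted() is stable, so students keep their original relative order
# within each school.
student_by_school = {
    school: [student for student, _ in pairs]
    for school, pairs in groupby(sorted(student_school_list, key=by_school),
                                 key=by_school)
}
print(student_by_school)
```

The sort is still required, so this costs O(n log n) where the defaultdict
loop is a single linear pass.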


-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-ideas] Allow a group by operation for dict comprehension

2018-06-28 Thread Michael Selik
On Thu, Jun 28, 2018 at 8:25 AM Nicolas Rolin 
wrote:

> I use list and dict comprehension a lot, and a problem I often have is to
> do the equivalent of a group_by operation (to use sql terminology).
>
> For example if I have a list of tuples (student, school) and I want to
> have the list of students by school the only option I'm left with is to
> write
>
> student_by_school = defaultdict(list)
> for student, school in student_school_list:
>     student_by_school[school].append(student)
>

Thank you for bringing this up. I've been drafting a proposal for a better
grouping / group-by operation for a little while. I'm not quite ready to
share it, as I'm still researching use cases.

I'm +1 that this task needs improvement, but -1 on this particular solution.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/