Larry,

That explanation made more sense and provided context.

I fully agree with you that generating the cross product of multiple lists can 
be messy and large and best avoided.

As an example, someone on an R forum presented their version of a way to see 
what are potential solutions to the game WORDLE at any given time, given the 
current constraints. The details are not important except that their process 
makes multiple vectors of the characters that can be allowed for letters "one" 
through "five" and then generates a data.frame of all combinations. Early in 
the game you know little and the number of combinations can be as high as 
26*26*26*26*26 in the English version. Within a few moves, you may know more 
but even 15*18*... can be large. So you have data.frames with sometimes 
millions of rows that are then converted rowwise to five-letter words to make a 
long vector. You then query a huge dictionary for each word and produce a list 
of possible words. Now imagine the same game looking instead for 6-letter words 
and 7-letter words ...
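
To put rough numbers on that, here is a small Python sketch (mine, not the R 
poster's code) that counts the rows such a cross product would hold. The 
per-position counts for the later state are invented for illustration; only 
the 26-per-position opening comes from the description above.

```python
import math

# Start of the game: any of 26 letters in each of the 5 positions.
opening = [26] * 5
rows_opening = math.prod(opening)

# Hypothetical state a few moves later; these five counts are made up.
later = [15, 18, 20, 12, 22]
rows_later = math.prod(later)

print(rows_opening)  # 11881376
print(rows_later)    # 1425600
```

Even the narrowed-down case is well over a million rows before a single 
dictionary lookup happens.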

I looked at it and decided it was the wrong approach and, in brief, made a much 
smaller dictionary containing only the five-letter words, and made a regular 
expression that looked like:

"^[letters][^letters]S[more][^this]$"

The above has five parts, one per position. Each is either a specific letter 
you know is there (the S in position 3), a sequence of letters in square 
brackets saying any one of those matches, or the same with a leading caret 
saying anything except those. You then simply use the R grep() function to 
search the list of valid 5-letter words using that pattern and in one sweep get 
them all without creating humongous data structures.
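
Since this is a Python list, here is roughly the same idea sketched with 
Python's re module instead of R's grep(). The word list and the letter sets in 
the pattern are hypothetical stand-ins, not the ones from the R forum thread.

```python
import re

# Tiny stand-in for a dictionary of valid five-letter words (hypothetical).
WORDS = ["boson", "basic", "gusto", "misty", "pasta", "waste", "house"]

# Position 1: one of b, g, m, p, w
# Position 2: anything except e or i
# Position 3: the letter known to be there, s
# Position 4: one of o, t
# Position 5: anything except y
pattern = re.compile(r"^[bgmpw][^ei]s[ot][^y]$")

# One pass over the word list; no cross product is ever built.
candidates = [w for w in WORDS if pattern.match(w)]
print(candidates)  # ['boson', 'gusto', 'pasta', 'waste']
```

The work scales with the size of the word list, not with the number of 
letter combinations the constraints would allow.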

What you describe has some similarities: you searched for an alternate way to 
do something, and it is now clearer why you did not immediately say exactly 
what you anticipated. But your solution was not a solution to what anyone 
trying to help was working on. It was a solution to a different problem, and 
what people needed to know, namely that you were passing a dictionary to 
a function we had not seen, was not stated until now. I would have 
appreciated it if you had simply said you had decided to use a different way 
and, if anyone is curious, here it is.

For the rest of us, I think what we got from the exchange may vary. Some saw it 
as a natural fit with using something like a nested comprehension, albeit empty 
lists might need to be dealt with. Others saw a module designed to do such 
things as an answer. I saw other modules in numpy/pandas as reasonable. Some 
thought iterators were a part of a solution. The reality is that making 
permutations and combinations is a fairly common occurrence in computer science, 
and it can be expected that many implement one solution or another.
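
For reference, two of those approaches side by side, as a minimal sketch with 
made-up lists: a nested comprehension and itertools.product from the standard 
library. The last line shows the empty-list wrinkle mentioned above.

```python
from itertools import product

# Hypothetical example lists.
colors = ["red", "green"]
sizes = ["S", "M", "L"]

# Nested comprehension: fine when the number of lists is fixed and known.
pairs_comp = [(c, s) for c in colors for s in sizes]

# itertools.product: same pairs in the same order, for any number of lists.
pairs_prod = list(product(colors, sizes))

print(pairs_comp == pairs_prod)  # True
print(len(pairs_prod))           # 6

# The wrinkle: a single empty input makes the whole product empty.
empty_case = list(product(colors, []))
print(empty_case)                # []
```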

But looking at your code, I am amused that you seem to already have not 
individual lists but a dictionary of named lists. Code similar to what you show 
now could trivially have removed dictionary items that held only an empty list. 
And as I pointed out, some of the solutions we came up with generalize to any 
number of lists and would happily accept such a dictionary and generate all 
combinations.
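
A minimal sketch of what I mean, assuming every value in the dictionary is a 
list (your real dict apparently mixes in scalars, which this does not handle). 
The function name and the sample keys are hypothetical.

```python
from itertools import product

def all_combinations(named_lists):
    """Yield one dict per combination, ignoring keys whose list is empty."""
    # Drop entries holding an empty list: they mean "no filter on this key".
    kept = {k: v for k, v in named_lists.items() if v}
    keys = list(kept)
    for values in product(*(kept[k] for k in keys)):
        yield dict(zip(keys, values))

# Hypothetical input with one empty list.
query = {"color": ["red", "green"], "size": [], "shape": ["round"]}
combos = list(all_combinations(query))
print(combos)
# [{'color': 'red', 'shape': 'round'}, {'color': 'green', 'shape': 'round'}]
```

If every list is empty, this yields a single empty dict, which fits the "no 
filtering at all" reading.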

My frustration was not about you asking how to solve a very reasonable problem 
in Python. It was about the process and what was disclosed and then the 
expectation that we should have known about things not shared. Certainly 
sharing too much is a problem too. Your title alone was very concrete asking 
about 2 lists. It is clear that was not quite your real need.





-----Original Message-----
From: Larry Martell <larry.mart...@gmail.com>
To: Avi Gross <avigr...@verizon.net>
Cc: python-list@python.org <python-list@python.org>
Sent: Thu, Mar 3, 2022 9:07 am
Subject: Re: All permutations from 2 lists


On Wed, Mar 2, 2022 at 9:42 PM Avi Gross via Python-list
<python-list@python.org> wrote:
>
> Larry,
>
> I waited patiently to see what others will write and perhaps see if you 
> explain better what you need. You seem to gleefully swat down anything 
> offered. So I am not tempted to engage.

But then you gave in to the temptation.

> And it is hard to guess as it is not clear what you will do with this.

In the interests of presenting a minimal example I clearly
oversimplified. This is my use case: I get a dict from an outside
source. The dict contains key/value pairs that I need to use to query
a mongodb database. When the values in the dict are all scalar I can
pass the dict directly into the query, e.g.:
self._db_conn[collection_name].find(query). But if any of the values
are lists that does not work. I need to query with something like the
cross product of all the lists. It's not a true product since if a
list is empty it means no filtering on that field, not no filtering on
all the fields.  Originally I did not know I could generate a single
query that did that. So I was trying to come up with a way to generate
a list of all the permutations and was going to issue a query for each
individually.  Clearly that would become very inefficient if the lists
were long or there were a lot of lists. I then found that I could
specify a list with the "$in" clause, hence my solution.


> def query_lfixer(query):
>     # Wrap each list value in a "$in" clause so one query covers them all.
>     for k, v in query.items():
>         if isinstance(v, list):
>             query[k] = {"$in": v}
>     return query
>
> self._db_conn[collection_name].find(query_lfixer(query))
>
>
> So why did so many of us bother?


Indeed - so why did you bother?


-- 
https://mail.python.org/mailman/listinfo/python-list
