Il giorno sabato 8 gennaio 2022 alle 02:21:40 UTC+1 dn ha scritto: > Salaam Mahmood, > On 08/01/2022 12.07, Mahmood Naderan via Python-list wrote: > > I have a csv file like this > > V0,V1,V2,V3 > > 4,1,1,1 > > 6,4,5,2 > > 2,3,6,7 > > > > And I want to search two rows for a match and find the column. For > > example, I want to search row[0] for 1 and row[1] for 5. The corresponding > > column is V2 (which is the third column). Then I want to return the value > > at row[2] and the found column. The result should be 6 then. > Not quite: isn't the "found column" also required? > > I can manually extract the specified rows (with index 0 and 1 which are > > fixed) and manually iterate over them like arrays to find a match. Then I > Perhaps this idea has been influenced by a similar solution in another > programming language. May I suggest that the better-answer you seek lies > in using Python idioms (as well as Python's tools)... > > key1 = 1 > > key2 = 5 > Fine, so far - excepting that this 'problem' is likely to be a small > part of some larger system. Accordingly, consider writing it as a > function. In which case, these two "keys" will become > function-parameters (and the two 'results' become return-values). > > row1 = df.iloc[0] # row=[4,1,1,1] > > row2 = df.iloc[1] # row=[6,4,5,2] > This is likely not native-Python. Let's create lists for 'everything', > just-because: > > >>> headings = [ "V0","V1","V2","V3" ] > >>> row1 = [4,1,1,1] > >>> row2 = [6,4,5,2] > >>> results = [ 2,3,6,7 ] > > > Note how I'm using the Python REPL (in a "terminal", type "python" (as > appropriate to your OpSys) at the command-line). IMHO the REPL is a > grossly under-rated tool, and is a very good means towards > trial-and-error, and learning by example. Highly recommended! > > > > for i in range(len(row1)): > > This construction is very much a "code smell" for thinking that it is > not "pythonic". (and perhaps the motivation for this post) > > In Python (compared with many other languages) the "for" loop should > actually be pronounced "for-each". In other words when we pair the > code-construct with a list (for example): > > for each item in the list the computer should perform some suite of > commands. > > (the "suite" is everything 'inside' the for-each-loop - NB my > 'Python-betters' will quickly point-out that this feature is not limited > to Python-lists, but will work with any :iterable" - ref: > https://docs.python.org/3/tutorial/controlflow.html#for-statements) > > > Thus: > > > for item in headings: print( item ) > ... > V0 > V1 > V2 > V3 > > > The problem is that when working with matrices/matrixes, a math > background equips one with the idea of indices/indexes, eg the > ubiquitous subscript-i. Accordingly, when reading 'math' where a formula > uses the upper-case Greek "sigma" character, remember that it means "for > all" or "for each"! > > So, if Python doesn't use indexing or "pointers", how do we deal with > the problem? > > Unfortunately, at first glance, the pythonic approach may seem > more-complicated or even somewhat convoluted, but once the concepts > (and/or the Python idioms) are learned, it is quite manageable (and > applicable to many more applications than matrices/matrixes!)... > > if row1[i] == key1: > > for j in range(len(row2)): > > if row2[j] == key2: > > res = df.iloc[:,j] > > print(res) # 6 > > > > Is there any way to use built-in function for a more efficient code? > This is where your idea bears fruit! > > There is a Python "built-in function": zip(), which will 'join' lists. > NB do not become confused between zip() and zip archive/compressed files! > > Most of the time reference book and web-page examples show zip() being > used to zip-together two lists into a single data-construct (which is an > iterable(!)). However, zip() will actually zip-together multiple (more > than two) "iterables". As the manual says: > > «zip() returns an iterator of tuples, where the i-th tuple contains the > i-th element from each of the argument iterables.» > > Ah, so that's where the math-idea of subscript-i went! It has become > 'hidden' in Python's workings - or putting that another way: Python > looks after the subscripting for us (and given that 'out by one' errors > in pointers is a major source of coding-error in other languages, > thank-you very much Python!) > > First re-state the source-data as Python lists, (per above) - except > that I recommend the names be better-chosen to be more meaningful (to > your application)! > > > Now, (in the REPL) try using zip(): > > >>> zip( headings, row1, row2, results ) > <zip object at 0x7f655cca6bc0> > > Does that seem a very good illustration? Not really, but re-read the > quotation from the manual (above) where it says that zip returns an > iterator. If we want to see the values an iterator will produce, then > turn it into an iterable data-structure, eg: > > >>> list( zip( headings, row1, row2, results ) ) > [('V0', 4, 6, 2), ('V1', 1, 4, 3), ('V2', 1, 5, 6), ('V3', 1, 2, 7)] > > or, to see things more clearly, let me re-type it as: > > [ > ('V0', 4, 6, 2), > ('V1', 1, 4, 3), > ('V2', 1, 5, 6), > ('V3', 1, 2, 7) > ] > > > What we now see is actually a "transpose" of the original 'matrix' > presented in the post/question! > > (NB Python will perform this layout for us - read about the pprint library) > > > Another method which can also be employed (and which will illustrate the > loop required to code the eventual-solution(!)) is that Python's next() > will extract the first row of the transpose: > > >>> row = next( zip( headings, row1, row2, results ) ) > >>> row > ('V0', 4, 6, 2) > > > This is all-well-and-good, but that result is a tuple of four items > (corresponding to one column in the way the source-data was explained). > > If we need to consider the four individual data-items, that can be > improved using a Python feature called "tuple unpacking". Instead of the > above delivering a tuple which is then assigned to "row", the tuple can > be assigned to four "identifiers", eg > > >>> heading, row1_item, row2_item, result= next( zip( headings, row1, > row2, results ) ) > > (apologies about email word-wrapping - this is a single line of Python-code) > > > Which, to prove the case, could be printed: > > >>> heading, row1_item, row2_item, result > ('V0', 4, 6, 2) > > > (ref: > https://docs.python.org/3/tutorial/datastructures.html?highlight=tuple%20unpacking#tuples-and-sequences) > > > > Thus, if we repeatedly ask for the next() row from the zip-ped > transpose, eventually it will respond with the row starting 'V2' - which > is the desired-result, ie the row containing the 1, the 5, and the 6 - > and if you follow-through using the REPL, will be clearly visible. > > > Finally, 'all' that is required, is a for-each-loop which will iterate > across/down the zip object, one tuple (row of the transpose) at a time, > AND perform the "tuple-unpacking" all in one command, with an > if-statement to detect the correct row/column: > > >>> for *tuple-unpacking* in *zip() etc*: > ... if row1_item == *what?* and row2_item == *what?* > ... print( *which* and *which identifier* ) > ... > V2 6 > > Yes, three lines. It's as easy as that! > (when you know how) > > Worse: when you become more expert, you'll be able to compress all of > that down into a single-line solution - but it won't be as "readable" as > is this! > > > NB this question has a 'question-smell' of 'homework', so I'll not > complete the code for you - this is something *you* asked to learn and > the best way to learn is by 'doing' (not by 'reading'). > > However, please respond with your solution, or any further question > (with the next version of the code so-far, per this first-post - which > we appreciate!) > > Regardless, you asked 'the right question' (curiosity is the key to > learning) and in the right way/manner. Well done! > > > NBB the above code-outline does not consider the situation where the > search fails/the keys are not found! > > > For further information, please review: > https://docs.python.org/3/library/functions.html?highlight=zip#zip > > Also, further to the above discussion of combining lists and loops: > https://docs.python.org/3/tutorial/datastructures.html?highlight=zip#looping-techniques > > > and with a similar application (to this post): > https://docs.python.org/3/faq/programming.html?highlight=zip#how-can-i-sort-one-list-by-values-from-another-list > > > -- > Regards,
You may also transpose your dataset. Then the index will become your column name and the column name become your index: To read your dataset: import pandas as pd import io DN = """ V0,V1,V2,V3 4,1,1,1 6,4,5,2 2,3,6,7 """ df = pd.read_csv(io.StringIO(DN)) Transpose it: dft = df.T Find all the index with your condition: idt = (dft[0] == 1) & (dft[1] == 5) Print the columns that satisfy your condition: print(dft[idt]) As you see, without explicit loop. -- https://mail.python.org/mailman/listinfo/python-list