Hi all. I'm writing a Python script that will be used to compare two database tables. Currently, those two tables are dumped into .csv files, and my code goes through both files and makes comparisons. So far, I have only coded the comparison of the headers, which checks for similarities and differences. Here is the code for that functionality:

    similar_headers = 0
    different_headers = 0
    source_headers = sorted(source_mapping.headers)
    target_headers = sorted(target_mapping.headers)

    # Check if the headers between the two mappings are the same
    if set(source_headers) == set(target_headers):
        similar_headers = len(source_headers)
    else:
        # We're going to do two run-throughs of the tables, to find the
        # different and similar header names. Start with the source
        # headers...
        for source_header in source_headers:
            if source_header in target_headers:
                similar_headers += 1
            else:
                different_headers += 1
        # Now check target headers for any differences
        for target_header in target_headers:
            if target_header in source_headers:
                pass
            else:
                different_headers += 1

As you can probably tell, I make two iterations: one over the 'source_headers' list, and another over the 'target_headers' list. During the first iteration, if a specific header (bound to the variable 'source_header') exists in both lists, then the 'similar_headers' variable is incremented by one. If it exists only in the source list, 'different_headers' is incremented by one instead. The second iteration only checks for headers that exist in the target list but not in the source list.

My code works as expected and there are no bugs, but I get the feeling that I'm not doing this comparison in the most efficient way possible. Is there another way to make this same comparison that is more Pythonic and efficient? I would prefer not to have to install an external module, though if I have to then I will.

Thanks in advance for any and all answers!
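
For reference, the two-pass counting described above might be sketched with the built-in set operations, which need no external module. This is only a sketch under one assumption: that header names are unique within each file, so duplicates do not need to be counted separately. The function name 'compare_headers' and the sample header lists are made up for illustration and are not part of my script.

```python
def compare_headers(source_headers, target_headers):
    """Return (similar, different) counts for two collections of header names.

    Assumes header names are unique within each collection (an assumption,
    not something the original loops enforce).
    """
    source = set(source_headers)
    target = set(target_headers)
    # Headers present in both tables.
    similar = len(source & target)
    # Headers present in exactly one of the two tables.
    different = len(source ^ target)
    return similar, different


# Hypothetical sample data, for illustration only.
similar_headers, different_headers = compare_headers(
    ["id", "name", "email"],
    ["id", "name", "phone"],
)
print(similar_headers, different_headers)  # prints "2 2"
```

The intersection (&) plays the role of the first loop's "in both lists" branch, and the symmetric difference (^) covers both "only in source" and "only in target" in one step, so the separate equality check is no longer needed.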