Here's my task in plain language:

Compare two files of different sizes, each with two columns:
For each element in the first column of the larger file, find all matches to that element in the first column of the smaller file, and then print the matching row(s) of the larger file, followed by the matching row(s) of the smaller file.

Sometimes an element appears more than once in the first column of the larger file. For those, print all of the duplicated rows from the larger file before printing the matching rows of the smaller file.

The output file should be two columns, with the first column containing the pertinent (i.e., duplicated) elements, and the second column listing all the associated instances.

There aren't any duplications in the second column of either file;
and nothing in either second column matches anything in the other second column.

This seems easy enough to explain and to perform by visual inspection,
but only a script will work for the main files.

I'm attaching three files: LargerFile.txt; SmallerFile.txt; and OutputFile.txt. They've been obtained by visually matching the Real-LargerFile.txt (500,000 rows)
with the Real-SmallerFile.txt (200,000 rows) and then doing some obfuscation.

All files are sorted on their first columns.
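
The steps described above can be sketched in Python roughly as follows. This is only a sketch under stated assumptions: Python 3.7 or later, columns separated by whitespace, and both files small enough to hold in memory. The function names (read_rows, match_rows, write_matches) are made up for illustration, and the sketch follows the plain-language description literally rather than any particular existing script.

```python
from collections import defaultdict


def read_rows(lines):
    """Yield (first_column, second_column) from whitespace-separated lines."""
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            yield parts[0], parts[1]


def match_rows(larger_lines, smaller_lines):
    """Return matching rows: larger-file rows first, then smaller-file rows."""
    # Group the second-column values of the smaller file by first column.
    smaller = defaultdict(list)
    for key, value in read_rows(smaller_lines):
        smaller[key].append(value)

    # Group the larger file the same way; a dict keeps first-seen key order.
    larger = {}
    for key, value in read_rows(larger_lines):
        larger.setdefault(key, []).append(value)

    result = []
    for key, values in larger.items():
        if key in smaller:
            # All (possibly duplicated) larger-file rows for this key ...
            result.extend((key, v) for v in values)
            # ... followed by the matching smaller-file rows.
            result.extend((key, v) for v in smaller[key])
    return result


def write_matches(larger_path, smaller_path, output_path):
    """Read both files and write the matched rows as two tab-separated columns."""
    with open(larger_path) as big, open(smaller_path) as small:
        rows = match_rows(big, small)
    with open(output_path, "w") as out:
        for key, value in rows:
            out.write(f"{key}\t{value}\n")
```

Calling write_matches("LargerFile.txt", "SmallerFile.txt", "OutputFile.txt") would apply it to the attached samples. Grouping by key with dictionaries does not depend on the files being sorted; since they are sorted, a streaming merge would also be possible if memory became a concern.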

George Langford


pool.giga.net.ru        91.210.179.94
pool.giga.net.ru        91.210.179.95
pool.giga.net.ru        91.210.179.96
pool.giga.net.ru        91.210.179.97
pool.giga.net.ru        91.210.179.98
pool.giga.net.ru        91.210.179.99
pool.mininghub.cc       185.139.70.151
pool.mirgiga.net        78.158.193.1
pool.mirgiga.net        78.158.193.10
pool.mirgiga.net        78.158.193.104
pool.mirgiga.net        78.158.193.105
pool.mirgiga.net        78.158.193.106
pool.mirgiga.net        78.158.193.107
pool.mirgiga.net        78.158.193.11
pool.mirgiga.net        78.158.193.110
pool.mirgiga.net        78.158.193.111
pool.mirgiga.net        78.158.193.112
pool.mirgiga.net        78.158.193.113
pool.sevtele.com        46.172.203.8
pool.sevtele.com        46.172.203.80
pool.sevtele.com        46.172.203.83
pool.sevtele.com        46.172.203.85
pool.sevtele.com        46.172.203.87
pool.sevtele.com        46.172.203.88
pool-98-118-56.net      Ghbfght
pool-p13.46-149.tv      Ghbfght
pool.giga.net.ru        Ghbfght
pool.giga.net.ru        Kmnslet
pool.giga.net.ru        Evgbhan
pool.giga.net.ru        Wnhmahy
pool.giga.net.ru        Loasfrt
pool.megalink.lg        Ghbfght
pool.mirgiga.net        Uhnagty
pool.mirgiga.net        Yjnmase
pool.mirgiga.net        Bnhjyht
pool.sevtele.com        Ghbfght
pool0326.cvx27.net      Ghbfght
conceptcable.com        Ghbfght
pool148-168.a.net       Ghbfght
pool161-158-44.ua       Ghbfght
pool.giga.net.ru        91.210.179.94
pool.giga.net.ru        91.210.179.95
pool.giga.net.ru        91.210.179.96
pool.giga.net.ru        91.210.179.97
pool.giga.net.ru        91.210.179.98
pool.giga.net.ru        91.210.179.99
pool.giga.net.ru        Ghbfght
pool.giga.net.ru        Kmnslet
pool.giga.net.ru        Evgbhan
pool.giga.net.ru        Wnhmahy
pool.giga.net.ru        Loasfrt
pool.mirgiga.net        78.158.193.1
pool.mirgiga.net        78.158.193.10
pool.mirgiga.net        78.158.193.104
pool.mirgiga.net        78.158.193.105
pool.mirgiga.net        78.158.193.106
pool.mirgiga.net        78.158.193.107
pool.mirgiga.net        78.158.193.11
pool.mirgiga.net        78.158.193.110
pool.mirgiga.net        78.158.193.111
pool.mirgiga.net        78.158.193.112
pool.mirgiga.net        78.158.193.113
pool.sevtele.com        46.172.203.8
pool.sevtele.com        46.172.203.80
pool.sevtele.com        46.172.203.83
pool.sevtele.com        46.172.203.85
pool.sevtele.com        46.172.203.87
pool.sevtele.com        46.172.203.88
pool.sevtele.com        Ghbfght
