First of all:
SmallerFile_0.txt is not sorted (conceptcable.com should come first), so I
sort the files below;
I do not understand why OutputFile_0.txt does not associate pool.mirgiga.net
with Uhnagty, Yjnmase, and Bnhjyht, so I assume below that it should.
The format you use is redundant. Moreover, in the output, it becomes hard
(if not impossible) to tell apart what comes from the "larger file" and what
comes from the "smaller file". I suggest transforming the two input files so
that each has no duplicates in its first column and a comma-separated list of
values in its second column (if commas can appear in the files, pick another
separator), using the same command line twice:
$ sort -k 1,1 LargerFile_0.txt | awk '{ if ($1 == key) printf ",%s", $2; else {
printf "\n%s", $0; key = $1 } } END { printf "\n" }' | tail -n +2 > LargerFile_0.csv
$ sort -k 1,1 SmallerFile_0.txt | awk '{ if ($1 == key) printf ",%s", $2; else {
printf "\n%s", $0; key = $1 } } END { printf "\n" }' | tail -n +2 > SmallerFile_0.csv
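To make the transformation concrete, here is a minimal sketch that runs the same sort/awk/tail pipeline on a tiny made-up input. The host names, the addresses and the variable names `sample` and `grouped` are assumptions for illustration only; they do not come from your files. The awk program passes the data as printf arguments rather than as the format string, so a stray `%` in the input cannot be misread.

```shell
#!/bin/sh
# Hypothetical sample data: one "key value" pair per line, deliberately unsorted.
sample=$(mktemp)
cat > "$sample" <<'EOF'
b.example.net 10.0.0.2
a.example.net 10.0.0.9
b.example.net 10.0.0.1
EOF

# Sort on the key, then fold the values of consecutive equal keys into one
# comma-separated list. Every new key starts with a newline, so the first
# output line is empty and tail -n +2 drops it.
grouped=$(sort -k 1,1 "$sample" |
    awk '{ if ($1 == key) printf ",%s", $2; else { printf "\n%s", $0; key = $1 } }' |
    tail -n +2)
rm -f "$sample"

printf '%s\n' "$grouped"
```

On this input the two lines for b.example.net should collapse into a single line carrying both addresses, while a.example.net keeps its single value.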
You then only need to "join" the two files (see
https://en.wikipedia.org/wiki/Relational_algebra#Natural_join_(%E2%8B%88) for
the theory):
$ join LargerFile_0.csv SmallerFile_0.csv
pool.giga.net.ru 91.210.179.94,91.210.179.95,91.210.179.96,91.210.179.97,91.210.179.98,91.210.179.99 Evgbhan,Ghbfght,Kmnslet,Loasfrt,Wnhmahy
pool.mirgiga.net 78.158.193.1,78.158.193.10,78.158.193.104,78.158.193.105,78.158.193.106,78.158.193.107,78.158.193.11,78.158.193.110,78.158.193.111,78.158.193.112,78.158.193.113 Bnhjyht,Uhnagty,Yjnmase
pool.sevtele.com 46.172.203.8,46.172.203.80,46.172.203.83,46.172.203.85,46.172.203.87,46.172.203.88 Ghbfght
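The join step can be tried in isolation with a small sketch like the following. The two files and their contents are hypothetical and only stand in for the transformed LargerFile_0.csv and SmallerFile_0.csv. It illustrates that only keys present in both files are kept, and that the second field of each output line comes from the first file and the third field from the second file.

```shell
#!/bin/sh
# Hypothetical, already grouped and sorted inputs (names are illustrative).
dir=$(mktemp -d)
printf '%s\n' 'a.example.net 10.0.0.9' \
              'b.example.net 10.0.0.1,10.0.0.2' > "$dir/larger.csv"
printf '%s\n' 'b.example.net Alice,Bob' \
              'c.example.net Carol' > "$dir/smaller.csv"

# join matches lines on the first field; unmatched keys (a and c) are dropped.
joined=$(join "$dir/larger.csv" "$dir/smaller.csv")
rm -r "$dir"

printf '%s\n' "$joined"
```

Because each side is reduced to exactly one field per key, the origin of every value stays unambiguous in the joined line.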
As a script taking the two files as arguments and running everything in
parallel:
#!/bin/sh
if [ -z "$2" ]
then
    printf 'Usage: %s file1 file2\n' "$0"
    exit 1
fi
# Two named pipes let both transformations run while join reads them.
TMP=$(mktemp)
trap 'rm -f "$TMP"*' 0
mkfifo "$TMP.1" "$TMP.2"
sort -k 1,1 "$1" |
    awk '{ if ($1 == key) printf ",%s", $2; else { printf "\n%s", $0; key = $1 } } END { printf "\n" }' |
    tail -n +2 > "$TMP.1" &
sort -k 1,1 "$2" |
    awk '{ if ($1 == key) printf ",%s", $2; else { printf "\n%s", $0; key = $1 } } END { printf "\n" }' |
    tail -n +2 > "$TMP.2" &
join $TMP.1 $TMP.2 # | awk '{ for (i = 1; ++i