Taking Magic Banana's cue, I applied the join command in round-robin fashion:
> HNs.Ed.tropic.ssec.wisc.edu.txt join -1 2 -2 2 join -1 2 -2 2 join -1 2
-2 2 join -1 2 -2 2 join -1 2 -2 2 time cat `ls
/home/george/Desktop/June2019/DataSets/HNs-TwoColumn/HNusage/HNs.Ed.tropic.ssec.wisc.edu`
>
/home/george/Desktop/June2019/DataSets/HNs-TwoColumn/HNusage/HNs.Ed.tropic.ssec.wisc.edu.visitors.txt
The joined output is 1.4 MB, but it carries no filenames. That can be
remedied by adding the filename to each file in turn with a print statement
or (shudder) in Leafpad, one file at a time, 44 times over ... but there's
another way:
See this link:
https://unix.stackexchange.com/questions/117568/adding-a-column-of-values-in-a-tab-delimited-file
>> awk '{print $0, FILENAME}' file1 file2 file3 ...
> awk '{print FILENAME,"\t",$0}' 01.txt 02.txt 03.txt ...
This works because I renamed the files to fit this format.
I kept the roster list ...
Here's the whole awk script, which concatenates all 44 files and inserts the
file name associated with the data as desired:
> time awk '{print FILENAME,"\t",$0}' 01.txt 02.txt 03.txt 04.txt 05.txt
06.txt 07.txt 08.txt 09.txt 10.txt 11.txt 12.txt 13.txt 14.txt 15.txt 16.txt
17.txt 18.txt 19.txt 20.txt 21.txt 22.txt 23.txt 24.txt 25.txt 26.txt 27.txt
28.txt 29.txt 30.txt 31.txt 32.txt 33.txt 34.txt 35.txt 36.txt 37.txt 38.txt
39.txt 40.txt 41.txt 42.txt 43.txt 44.txt >
Backup/ProcessedVisitorLists/FILENAME.txt
It's 1.8 MB and in a pretty format, not yet processed as mentioned elsewhere
... a luncheon date beckons.
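
A shorter spelling of the same awk step, as a rough sketch only: since the
files are named 01.txt through 44.txt, a shell glob can stand in for the typed
list. Setting OFS to a tab also avoids the single space that the comma form
of print puts on either side of "\t". The scratch directory, the three
stand-in files, their contents and the name combined.txt are all made up for
illustration; they are not the real visitor lists.

```shell
# Hypothetical scratch directory with three tiny stand-in files
# (assumed names and contents, not the real data sets).
demo=$(mktemp -d)
printf 'a.example.org\t3\n' > "$demo/01.txt"
printf 'b.example.org\t5\nc.example.org\t1\n' > "$demo/02.txt"
printf 'd.example.org\t2\n' > "$demo/03.txt"

# [0-9][0-9].txt expands to every two-digit file in sorted order, so the
# names need not be typed out. OFS='\t' makes print join FILENAME and the
# original line with exactly one tab. combined.txt does not match the glob,
# so it is never read back in as input.
(
  cd "$demo" &&
  awk -v OFS='\t' '{ print FILENAME, $0 }' [0-9][0-9].txt > combined.txt
)

cat "$demo/combined.txt"
```

With the real files, the same glob run in their directory should pick up all
44 of them, provided no other two-digit .txt file lives there.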
Timing? The join commands took about an hour to set up and ran in ca. 1 or 2
seconds real time each; copying and pasting each one into the console took
about 15 seconds, so the 44 join commands came to roughly 11-1/2 minutes.
This last monstrosity took 0.05 second real time, not to mention all morning
struggling with a prettier method of reading what's been in the current
directory all along. Repeating it for the other 43 combinations should now be
a breeze, as I can switch the file names around with Leafpad.
All because I haven't yet spent those ten hours that've been mentioned every
so often ...
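
For reference, a minimal sketch of what a single join -1 2 -2 2 step does,
assuming two-column files with the hostname in the second column and both
files already sorted on that column (join needs sorted input). The file names
left.txt and right.txt and their contents are invented for the example.

```shell
# Hypothetical two-column inputs: a count, then a hostname.
# Both are sorted on column 2, which join requires.
demo_join=$(mktemp -d)
printf '3 a.example.org\n7 b.example.org\n' > "$demo_join/left.txt"
printf '1 a.example.org\n4 c.example.org\n' > "$demo_join/right.txt"

# -1 2 and -2 2 select field 2 of each file as the join key. Only keys
# present in both files are printed: the key first, then the remaining
# fields of the first file, then those of the second.
joined=$(LC_ALL=C join -1 2 -2 2 "$demo_join/left.txt" "$demo_join/right.txt")

printf '%s\n' "$joined"
# prints: a.example.org 3 1
```

Hostnames that appear in only one of the two files drop out, and the output
never names the files it came from, which is why the filename column had to
be added separately.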