Oops ... still not quite time for a victory lap ...
Some time ago, we worked out that I could run several nmap commands like this
concurrently in separate terminal windows:
> time sudo nmap -sS -p3389 -T4 --host-timeout 300 --min-hostgroup 25 -iL Input-IPv4s.txt > nMap-IPv4s-to-HNs.txt
and
Magic Banana wrote:
>> tee twixt - | dig -f twixt
> That should be 'tee twixt | dig -f -'
In the meantime I had been forced to change 'tee twixt -' to 'tee twixt' in
the process of troubleshooting ...
> ... (or only 'dig -f -' if you do not want to save "twixt")...
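The corrected pipeline works because 'tee' writes the stream to the file "twixt" AND passes it on, while 'dig -f -' reads the query list from standard input. A network-free sketch of the same tee pattern, with 'sed' standing in for 'dig -f -' (the hostnames here are placeholders):

```shell
# tee saves the hostname stream to "twixt" AND forwards it down the pipe;
# 'sed' stands in for 'dig -f -' so this sketch needs no network access.
printf '%s\n' example.com example.org |
tee twixt |
sed 's/^/query: /'
```

With real data, replacing the 'sed' stage with 'dig -f -' gives Magic Banana's corrected command while keeping a copy of the queries in "twixt".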
My dig scripts had been
Magic Banana, teaching, says:
> > cat CPV-ThreeNone-Output.txt | more [and remove the intervening rows &
trailing dots with LibreOffice Calc]
> I do not think you realize how much time of your life you could save by
seriously learning (say for ~10 hours) GNU's text-processing commands.
Magic Banana checked my homework:
> It works here (hostname_list contains the first column of your
spreadsheet):
... [selecting the last three rows]
>
abts-tn-dynamic-70.160.49.171.airtelbroadband.in 70.160.49.171 171.49.160.70 160.49.171.70
>
After composing yet another chapter in this treatise and then losing it when an
errant pinkie brushed across the freeze-the-system key, I'm starting over to
find a way of finishing the hostname-resolution process:
There are three methods of resolving hostnames:
1. Magic Banana's technique
Backtracking to find a less error-prone script-creation process:
Step (1): Apply Magic Banana's one-line processor to the single-column
Hostname_List containing all the Webalizer
records for a time period:
tr -sc 0-9\\n ' ' < hostname_list | awk '{ k = 0; for (i = 0; k < 4 && ++i
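The one-liner is cut off above, so here is a hedged reconstruction of what such a command could look like; the loop bounds and the print statement are my guesses, not Magic Banana's exact code. The 'tr' stage turns every non-digit into a space, and awk then scans for four consecutive fields that fit in an octet:

```shell
# Hypothetical reconstruction: replace every non-digit with a space, then
# look for four consecutive numeric fields that could be octets (<= 255).
printf '%s\n' 'abts-tn-dynamic-70.160.49.171.airtelbroadband.in' |
tr -sc '0-9\n' ' ' |
awk '{ k = 0
       # k counts consecutive octet-sized fields; reset on anything > 255
       for (i = 1; i <= NF && k < 4; ++i)
           if ($i <= 255) ++k; else k = 0
       if (k == 4) print $(i-4), $(i-3), $(i-2), $(i-1) }'
# → 70 160 49 171
```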
Magic Banana grades my homework constructively:
> ... redirect the standard output of 'grep '[0-9]$' to a file before a
pipe: that is wrong
Here's my mistake: ... | grep '[0-9]$' > intermediate.txt | awk ...
It should have been: ... | grep '[0-9]$' | awk ...
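The lesson: a file redirection in the middle of a pipeline starves the next command, because '>' sends everything to the file and nothing down the pipe. If the intermediate file is still wanted, 'tee' does both jobs. A sketch with made-up sample data:

```shell
# Keep only lines ending in a digit; 'tee' saves them to intermediate.txt
# AND forwards them to awk (a '>' redirection here would leave awk no input).
printf '%s\n' host7 gateway host42 |
grep '[0-9]$' |
tee intermediate.txt |
awk '{ print NR, $0 }'
# → 1 host7
# → 2 host42
```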
And Magic Banana's last version, starting from his firm suggestion:
tr -sc 0-9\\n ' ' < hostname_list | awk '{ k = 0; for (i = 0; k < 4 && ++i
intermediate.txt | awk '{ for (i = 2; ++i
This may be on the right track, but it throws a syntax error when I try to
process the three permuted IPv4 addresses:
time tr -sc 0-9\\n ' ' < CPV-GB-OneCol0-6192019.txt | awk '{ k = 0; for (i =
0; k < 4 && ++i CPV-GB-TwoCol-OutputY.txt
The awk command is intended to select the one
Correction to the output file:
time tr -sc 0-9\\n ' ' < CPV-GB-OneCol0-6192019.txt | awk '{ k = 0; for (i =
0; k < 4 && ++i
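The permutation step itself can be sketched from the sample output shown earlier (70.160.49.171, 171.49.160.70, 160.49.171.70): the octets as-is, fully reversed, and rotated left by one. The exact permutation set in the real command is cut off above, so this is inferred from that sample:

```shell
# Given four octets, emit the three candidate addresses seen in the sample
# output: original order, fully reversed, and rotated left by one.
echo '70 160 49 171' |
awk '{ print $1"."$2"."$3"."$4, $4"."$3"."$2"."$1, $2"."$3"."$4"."$1 }'
# → 70.160.49.171 171.49.160.70 160.49.171.70
```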
Magic Banana worries:
> I am not sure I understand what you mean by "It remains to be seen how to
manage that so the interlopers are discarded".
Once the [long] command is done, it's too late to compare the [two or] three
outputs from the nmap lookups of the candidate
IP addresses [that I
Magic Banana suggested:
> Just need to suppress that non-matching data ... which can be done with
LibreOffice Calc in post-processing, by sorting and then deleting the rows with
blank entries ...
> Are you referring to the lines where my command line does not detect any
IPv4 address? If
Magic Banana wrote:
> As you can see, I also chose AWK's default separator, the space. It is
indeed usually easier to work with files
> that do not have tabulations (especially with 'sed'). Well, except with
'cut' and 'paste' where option -d must
> then be set (as I did above). If you really
Magic Banana lamented:
>The third column of "output-2-column-file.txt" cannot be computed from the
input alone, as far as I understand.
If it were mine to do as well as you can ... I'd reverse the octet order in
the third column, look them both
up with nMap or nslookup, and compare to the
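Reversing the octet order for that comparison is a one-line awk call; a minimal sketch, assuming one dotted-quad address per line:

```shell
# Split on dots and print the four octets in reverse order.
printf '70.160.49.171\n' |
awk -F. '{ print $4 "." $3 "." $2 "." $1 }'
# → 171.49.160.70
```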
Magic Banana woke me up with: "Input excerpt and expected output..."
OK. The beginning-5-col-file shows all five tab-separated fields.
The output-2-column-file actually has an added third column listing the
proper IPv4 addresses that resolve to the hostnames in the first column.
You will see
Armed with my new knowledge, I processed all 35,000 lines of the source file,
separating out one file holding the original four-octet-containing hostnames
plus four additional columns, each containing one of those four octets. That
list of IPv4s is 3,500 lines long, out of 14,000 hosts that were
It turns out that nMap doesn't care which way the lookup process goes; I'm
currently processing the entire file (35,000 lines) for one four-month data
set, and the hostnames as well as the IPv4 addresses are being handled with
the same output schemes.
The actual separation process will
Here's a sample of one online Webalizer file (stripped of reams of frequency
and byte data)
which is trivial to separate. Now imagine about half a million lines of such
data in random
order.
I can strip out the hostnames by putting two identical target lists
side-by-side in
OpenOffice
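As an alternative to lining two lists up in a spreadsheet, the separation can be sketched with two grep calls, splitting lines that are nothing but a dotted quad from everything else (the file names here are assumptions, and the regex checks only the dotted-quad shape, not that each octet is 0-255):

```shell
# Assumed sample of Webalizer host entries (real file name unknown).
printf '%s\n' 192.0.2.7 host.example.net 198.51.100.9 > webalizer-hosts.txt

# Lines that are purely a dotted quad go to ips.txt, the rest to hostnames.txt.
grep -E  '^([0-9]{1,3}\.){3}[0-9]{1,3}$' webalizer-hosts.txt > ips.txt
grep -Ev '^([0-9]{1,3}\.){3}[0-9]{1,3}$' webalizer-hosts.txt > hostnames.txt
```

On half a million randomly ordered lines this runs in seconds and needs no manual sorting.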
I went through the file HNs-nMap-LixedList.txt and resolved all the
hostnames, mainly by inspection
(two were based on ARPA naming and so had their octets arranged in backwards
order), many with Google,
usually the first item in the list, and one by truncating the prefix from
q.jaso.ru to
Magic Banana wisely suggested:
> You can redirect the output of any command (including 'nmap') to a file.
> Append "&> file_name" (without the quotes and with the file name of your
choice) to the command line,
> to redirect what it writes on both the standard and the error outputs.
That
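Bash's "&>" is shorthand; the portable POSIX spelling, which any shell accepts, is "> file 2>&1". A minimal sketch with a stand-in command (a real run would put the nmap command line in place of the braces):

```shell
# Capture both the standard and the error output in scan.log.
# In bash, 'somecmd &> scan.log' is shorthand for this POSIX form.
{ echo "normal output"; echo "error output" >&2; } > scan.log 2>&1
```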
After removing the unconverted IPv4 addresses from the Webalizer statistics
of a domain,
there's a long list of PTR records and an occasional A record left behind.
nMap has a
useful script (--script asn-query) to perform lookups of the ASN, port
status, CIDR
block, country code, and even