Re: [Trisquel-users] Trisquel software update broke my WiFi connection
The "updated" laptop is still giving me cause for worry ... Today about noon, after letting the laptop run a big script grepping an even larger set of data files, I returned to find the wifi dongle uncooperative, but after considerable editing and swapping my two dongles back and forth, I managed to obtain a reliable wireless reconnection to the router. I then applied the Disconnect function, unplugged the dongle, and went about my business. An hour later, I looked at the "Edit Connections" function, and to my surprise, there were three SSIDs listed:

Wi-Fi      Last Used
SSID4 2    now
SSID4 1    1 hour ago
SSID4      1 month ago

The upper two of these are news to me. My active SSID is SSID4, not "SSID4 1" or "SSID4 2". Where should I look for information about this behavior?
Re: [Trisquel-users] wordpress icecat ssl connection failed
WordPress is an open-source webpage-creation package that is open to new apps from just about any source, but there is insufficient vetting of new material. Users must be exceptionally vigilant that their own WordPress installations use only current and trusted apps that come from official wordpress.com sites and not from third-party authors. I looked into the consequences by examining the access logs of my own website, which does not use WordPress: https://www.pinthetaleonthedonkey.com/StatisticsAllYears/May-2018-WordPress/WordPress-attacks-MiDomane.com-May-2018.htm Summarizing those results: a worldwide 365/24/7 attack on my [non-existent] WordPress installation. Currently, even though "dig wordpress.com" returns two IP addresses, there are actually many more IP addresses that return "wordpress.com" upon receiving a dig -x inquiry; my current list has 67 addresses. They are all owned by a single entity and are probably legitimate. That still causes a lot of trouble for their users:

dig -x 66.155.9.238 ==> PTR wordpress.com

https://otx.alienvault.com/indicator/ip/192.0.77.32

George Langford
Re: [Trisquel-users] A sed script to replace one HTML string with a different one
In my forays into the mysteries of Fortran fifty years ago, mistakes in the coding often generated error messages that had no discernible relation to the mistakes, but instead pointed to their consequences. That's what I suspect is happening as a consequence of attempting to use the command line to edit HTML. The attached file has five versions of the offending script and the bash responses:

(1) The first script that I tried, with escaped dots; ")" was flagged.
(2) All the dots are escaped, and also the single quotes; "(" was flagged.
(3) Target file was altered to eliminate the p's, and so was the script; ")" was flagged.
(4) Target file still altered to eliminate the p's; single quotes escaped; "(" was flagged.
(5) All the HTML-sensitive characters were translated; `/\' (That's "
Re: [Trisquel-users] A sed script to replace one HTML string with a different one
Trying another tack with awk ... see:

https://stackoverflow.com/questions/50244876/how-to-use-gsub-in-awk-to-find-and-replace-and-txt-characters-within

where it's said:

echo "./file_name.txt|1230" | awk '{gsub(/\.\/|\.txt/,"")}1'
file_name|1230

In the present task, taking just one exemplar PTR, see the attached file, which also shows bash's response. The character that's flagged is the end-parenthesis, but that's part of the standard gsub syntax. George Langford
Re: [Trisquel-users] A sed script to replace one HTML string with a different one
After learning how to count characters in bash: https://linuxhint.com/length_of_string_bash/ I found out that the offending character in the sed expression is the forward slash preceding the second "p" in the string to be replaced. The attached text file demonstrates this result. I have yet to see whether or not my substitution will work in the real world. George Langford
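An unescaped forward slash inside the pattern collides with sed's default s/.../.../ delimiter. One way around it, sketched here on made-up stand-in strings (not the actual HTML from the attachment), is to pick a different delimiter: whatever character follows the s becomes the delimiter, so the slashes need no escaping at all.

```shell
# The pattern contains forward slashes, so use '|' as the s-command
# delimiter instead of '/'; the slashes then need no escaping.
echo '<p>old/path/page.htm</p>' | sed 's|old/path/page.htm|new/page.htm|'
# The classic form works too, but every slash must be backslash-escaped:
echo '<p>old/path/page.htm</p>' | sed 's/old\/path\/page.htm/new\/page.htm/'
```

Both commands print <p>new/page.htm</p>; the first form is much easier to read when the strings are paths.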
Re: [Trisquel-users] A sed script to replace one HTML string with a different one
Regarding the Table-html-excerpt.txt file: after converting it back to HTML and trying out the links, I discovered an easily corrected error in about two-thirds of them. In Leafpad the correction is to search for the string "../../ScoreCards" and replace it with "../ScoreCards". That should fix all the broken links. By the way: the linked files are all plain text without any scripts and contain no further links to anywhere else. That said, you can test the names with "dig hostname" and the IP addresses with "dig -x IPaddress". Many of the hostnames come back as being on the server 92.242.140.21, which is a catchall address used by a fellow who maintains a site called "barefruit error handling" or "unallocated.barefruit.co.uk", but that's not where these oftentimes malicious servers are. "whois IPaddress" will tell you where they are located and what their autonomous system number (ASN) is. George Langford
Re: [Trisquel-users] A sed script to replace one HTML string with a different one
Ignacio Agullo inquired: "Was it necessary to attach dozens of files, about one megabyte in size?" Yes; they're all different, with differing goals, impacts, patterns, and the like. I'd also like to encourage others to attempt similar analyses. It's taking me a couple of months to gather the data and put it into an order which can be examined to find out why and how so many attacks are being made by servers located at addresses which cannot be traced. These results show that they can be examined for country of origin, degree of obfuscation, location of additional addresses, etc. There are other months in the year; one person cannot possibly keep up with the task. Yet there are hundreds of folks picking up the traces left behind in the headers of malicious messages. You can find out for yourselves by putting one of the PTR records (a.k.a. hostnames) in an Internet search engine, enclosed in quotation marks, and then gathering the IP addresses gleaned from malicious Internet traffic by the many folks who monitor such traffic. That's another webpage like this excerpt that can be generated. George Langford
Re: [Trisquel-users] Trisquel software update broke my WiFi connection
Magic Banana wrote, quoting my command:

"wget -drc http://archive.trisquel.info/trisquel/pool/main/l/linux-hwe/linux-image-4.15.0-112-generic_4.15.0-112.113~16.04.1+8.0trisquel3_amd64.deb > /media/george/IPv4/Destination/

None of those three options is useful here. The redirection is not either: the working directory is used. You always make things so complicated..."

Bear in mind that I was forced to use the working directory on the flash drive, because that flash drive was my only usable file-transfer medium at the time.

"If you really want to use Wget, just execute:

$ cd /media/george/IPv4/Destination/"

That part is correct.

"$ wget http://archive.trisquel.info/trisquel/pool/main/l/linux-hwe/linux-image-4.15.0-112-generic_4.15.0-112.113~16.04.1+8.0trisquel3_amd64.deb http://archive.trisquel.info/trisquel/pool/main/l/linux-hwe/linux-modules-4.15.0-112-generic_4.15.0-112.113~16.04.1+8.0trisquel3_amd64.deb"

There's a missing step in the above script: transfer the T420 clone to the bottom of the pile on my physical desk, restart the crippled T420, plug the flash drive into the USB port, and then invoke the installer from the working directory on the flash drive.

"So, you execute Wget instead of right clicking on two links on this Web page and choosing "Save Link As...", but you then take a graphical file manager to navigate down a deep branch of the file hierarchy (deep because you added options you do not understand to wget) instead of using cd and the auto-completion of the shell..."

Those steps were not available on the T420 clone, because that computer had no electronic connection to the WiFi-less T420 in spite of its close physical proximity. Nevertheless, read on:

"Does additionally installing http://archive.trisquel.info/trisquel/pool/main/l/linux-hwe/linux-modules-extra-4.15.0-112-generic_4.15.0-112.113~16.04.1+8.0trisquel3_amd64.deb help?"
Now, several hours after the above discourse, I bit the bullet, re-learned how to navigate the grub menu by consulting my library of spiral notebooks, activated an earlier version of Trisquel 8, and continued with my usual morning ritual of downloading emails, trashing spam, etc. with an un-buggy Trisquel operating system, which makes the WiFi connection seamless. Now, the installation of the linux-modules-extra from Abrowser from its address line was trivial, and the result was completely satisfactory after re-booting the current operating system. The condensed version of my reply is Yes! Thank you. George Langford
Re: [Trisquel-users] Trisquel software update broke my WiFi connection
OK ... sort of. I used wget -drc http://archive.trisquel.info/trisquel/pool/main/l/linux-hwe/linux-image-4.15.0-112-generic_4.15.0-112.113~16.04.1+8.0trisquel3_amd64.deb > /media/george/IPv4/Destination/ and http://archive.trisquel.info/trisquel/pool/main/l/linux-hwe/linux-modules-4.15.0-112-generic_4.15.0-112.113~16.04.1+8.0trisquel3_amd64.deb and then double-clicked down through a series of subdirectories /media/george/IPv4/Desktop/BLBK5/FirstFile ... to the final /media/george/IPv4/Desktop/BLBK5/FirstFile/archive.trisquel.info/trisquel/pool/main/l which then performed the install automatically and without complaint. Alas, still no WiFi activation.
Re: [Trisquel-users] Trisquel software update broke my WiFi connection
Easier said than done, especially with no working Internet connection on the affected ThinkPad T420. I do have some space on a USB flashdrive ... and my trusty T420 clone. George Langford
[Trisquel-users] Trisquel software update broke my WiFi connection
Thankfully there is a clone of my ThinkPad T420 handy ... Everything was fine this morning while I downloaded emails, but then the Trisquel software update popup appeared, and I proceeded with the 58+/- MB update. Afterwards, I proceeded with some LibreOffice Calc spreadsheet editing ... suddenly it was 12:43 PM EDT and I noticed that the WiFi wasn't connected ... dmesg | tail -100 says:

[  688.650872] usb 2-1.2: new high-speed USB device number 9 using ehci-pci
[  688.757565] usb 2-1.2: New USB device found, idVendor=0cf3, idProduct=9271
[  688.757577] usb 2-1.2: New USB device strings: Mfr=16, Product=32, SerialNumber=48
[  688.757583] usb 2-1.2: Product: UB93
[  688.757589] usb 2-1.2: Manufacturer: ATHEROS
[  688.757594] usb 2-1.2: SerialNumber: 12345
[  688.759155] usb 2-1.2: ath9k_htc: Firmware ath9k_htc/htc_9271-1.4.0.fw requested
[  688.759482] usb 2-1.2: Direct firmware load failed with error -2
[  688.759491] usb 2-1.2: ath9k_htc: Firmware htc_9271.fw requested
[  689.048834] usb 2-1.2: ath9k_htc: Transferred FW: htc_9271.fw, size: 50980
[  689.290394] ath9k_htc 2-1.2:1.0: ath9k_htc: HTC initialized with 33 credits

I have two of these WiFi adapters: extended-range and thumbnail, both USB-connected. They're sold by Think Penguin and have been 100% satisfactory in my experience for years. I tried editing the [empty] connection; when I disable it, the usual Wi-Fi disconnected popup appears in the expected place, but I get no response at all when I subsequently enable it, nor when I unplug either one and insert it in another USB port ... nor after a reboot with the adapter in place ... nor when I insert either one after a reboot. The same WiFi adapter[s] automatically connect when I re-insert one in a USB port on this T420 clone ... on which I have not yet updated the software. Upon re-editing the WiFi connection, I see that all the settings are the same on the T420 clone as on the broken one.
It appears as though the Think Penguin firmware, permanently installed on both the T420 clone and its broken twin, initially fails to load but subsequently loads successfully, followed by no further activity on the broken machine. The dmesg response is shown above; it's the same for either WiFi device. Here's what happens on the T420 clone when I re-insert the extended-range WiFi adapter, according to dmesg | tail -100:

[ 2409.060178] usb 2-1.1: ath9k_htc: USB layer deinitialized
[ 2409.848115] usb 2-1.1: new high-speed USB device number 8 using ehci-pci
[ 2409.957236] usb 2-1.1: New USB device found, idVendor=0cf3, idProduct=9271
[ 2409.957248] usb 2-1.1: New USB device strings: Mfr=16, Product=32, SerialNumber=48
[ 2409.957255] usb 2-1.1: Product: UB91C
[ 2409.957261] usb 2-1.1: Manufacturer: ATHEROS
[ 2409.957267] usb 2-1.1: SerialNumber: 12345
[ 2409.957984] usb 2-1.1: ath9k_htc: Firmware ath9k_htc/htc_9271-1.4.0.fw requested
[ 2409.958319] usb 2-1.1: Direct firmware load failed with error -2
[ 2409.958329] usb 2-1.1: ath9k_htc: Firmware htc_9271.fw requested
[ 2410.245114] usb 2-1.1: ath9k_htc: Transferred FW: htc_9271.fw, size: 50980
[ 2410.486568] ath9k_htc 2-1.1:1.0: ath9k_htc: HTC initialized with 33 credits
[ 2411.558453] ath9k_htc 2-1.1:1.0: ath9k_htc: FW Version: 1.3
[ 2411.558464] ath9k_htc 2-1.1:1.0: FW RMW support: Off
[ 2411.558470] ath: EEPROM regdomain: 0x833a
[ 2411.558474] ath: EEPROM indicates we should expect a country code
[ 2411.558479] ath: doing EEPROM country->regdmn map search
[ 2411.558483] ath: country maps to regdmn code: 0x37
[ 2411.558487] ath: Country alpha2 being used: GB
[ 2411.558491] ath: Regpair used: 0x37
[ 2411.567684] ieee80211 phy4: Atheros AR9271 Rev:1
[ 2411.581488] ath9k_htc 2-1.1:1.0 wlx00c0ca82ec36: renamed from wlan0

Thanks, George Langford
[Trisquel-users] Can Icedove's search email function be improved ?
Faced with an inquiry about an item whose name is a common three-letter word combined with a couple of additional characters, my Icedove search returns nearly a thousand responses, which is too burdensome to scan visually. Searching on a much longer phrase increased the number of responses proportionately more, not less. I tried enclosing the pattern in quotes, but Icedove ignores those, returning the same thousand responses, nearly every one containing the common three-letter word alone. Then I navigated to the .icedove application data and did a grep search from the terminal:

grep -B 10 "Saw-09" /home/george/.icedove/REDACTED.08202015/Mail/mail*/* > /home/george/Desktop/Saw-09.txt

which looked through all the mailbox files in about two seconds and returned all the data that I could hope for, including the inquiry that prompted my search in the first place. The options -A and -B are a big help, although the output organizes the responses by listing all the first lines of matching emails, then all the second lines, then all the third lines, and so on. George Langford
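The context options mentioned above can be demonstrated on a toy stand-in for a mailbox file (the file name and contents below are made up; the real search ran over the ~/.icedove Mail directories with the pattern "Saw-09"):

```shell
# Tiny made-up stand-in for one mbox file.
printf 'From: a\nSubject: x\nSaw-09 inquiry\nbody text\n' > /tmp/mbox-demo.txt
# -B 10 prints up to ten lines of leading context before each match;
# when grep is given several files, each output line is prefixed with
# the file name, and context blocks are separated by "--" lines.
grep -B 10 'Saw-09' /tmp/mbox-demo.txt
```

Here the match sits on line 3, so the command prints lines 1 through 3; -A works the same way for trailing context.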
Re: [Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
Magic Banana added responses to questions he didn't know I'd asked; our notes apparently crossed in the ether! My inefficient (but clear to me) scripts needed that extra collection of single-line PTR's because I use them in my one-at-a-time join command; afterwards, they're not needed any more, but take up a lot of inefficiently used disk storage. I've fixed the duplicates issue in my join command. Those unwieldy tables of PTR's and their multiple addresses will soon be scrolling down the screen so far that the filename may disappear from view; and the -Vk 2 option puts the IPv4's in visually searchable order. 2 GB / 60 kB times 0.3 seconds is seriously longer than times 0.01 second: 2.8 hours vs. 5.6 minutes. Touché. Magic Banana condensed his already slender script to the utmost: "Now, if you want the duplicates [No!], if you insist on the file name[s?] you chose and if you really want the repeated PTR in a first field (what only looks like a waste of disk space), it is trivial to adapt my solution:"

awk 'FILENAME == ARGV[1] { a[$1] = $2 } FILENAME == ARGV[2] && $1 in a { print $1, $2 >> "CountsFiles/" $1 "." a[$1] ".txt" }' PTRList.txt IPv4.May2020.37.nMapoG.txt

It works! Thanks for your script's astounding brevity, for your accurate scripting, and for being so diligent. In the final reckoning, IPv4.May2020.37.nMapoG.txt (60 kB) becomes IPv4.May2020.100.nMapoG.txt (2 GB). PTRList.txt can keep the same name. George Langford
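The two-file awk idiom above can be seen in miniature on made-up data (all file names and hosts below are invented for the demonstration): while awk reads the first file named on the command line, FILENAME equals ARGV[1] and the counts are memorized; while it reads the second, FILENAME equals ARGV[2] and matching lines are appended to per-PTR files.

```shell
# Toy inputs: one PTR with its count, and a small "scan" file.
mkdir -p /tmp/CountsFiles-demo
printf 'host.example.net 2\n' > /tmp/ptrlist-demo.txt
printf 'host.example.net 10.0.0.1\nhost.example.net 10.0.0.2\nother.example 10.0.0.3\n' > /tmp/scan-demo.txt
# Pass 1 (first file): remember each PTR's count in the array a[].
# Pass 2 (second file): for PTRs seen in pass 1, append "PTR address"
# to a per-PTR file named PTR.count.txt.
awk 'FILENAME == ARGV[1] { a[$1] = $2 }
     FILENAME == ARGV[2] && $1 in a { print $1, $2 >> "/tmp/CountsFiles-demo/" $1 "." a[$1] ".txt" }' \
    /tmp/ptrlist-demo.txt /tmp/scan-demo.txt
cat /tmp/CountsFiles-demo/host.example.net.2.txt
```

The other.example line is silently skipped because it never appeared in the first file.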
Re: [Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
Magic Banana said: "Maybe." In response to my concern ("There appears to be a misunderstanding") he then requested: "If only you would give an example of the expected output..." Here are a couple of the output files:

CountsFiles/yellowipsdirty.singlehop.com.3.txt

yellowipsdirty.singlehop.com 96.127.177.1
yellowipsdirty.singlehop.com 99.198.113.0
yellowipsdirty.singlehop.com 99.198.113.5
Re: [Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
There appears to be a misunderstanding: I'm looking to reconcile the list of PTR's and the number of each PTR's occurrences (PTRList.txt) with the (severely truncated) output of the nmap scan(s) of the randomly filled-out 3rd & 4th octets of a set of CIDR/16 prefixes (IPv4.May2020.37.nMapoG.txt). On the other hand, Magic Banana's script is looking at only one of the two files; which one? I've tried to save the scripts as MB.suggestion02.txt & MB.suggestion03.txt, cp them to *.bin and *.sh, respectively, then chmod +x MB.suggestion02.bin and chmod +x Mb.suggestioin03.sh before lastly executing them with:

./mb.suggestion02.bin filename01 -
or
./MB.suggestion03.sh filename02 -

The error messages repeatedly say, for either combination of filenames:

bash: ./MBsuggestion03.sh: No such file or directory
or
./MBsuggestion02.bin: No such file or directory

No new directories appear. In my alternative scripting, I created the two necessary directories beforehand. George Langford
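Two common causes of that "No such file or directory" are worth checking here: the invoked name not matching the file's actual name exactly (the messages above mix MB.suggestion03.sh, Mb.suggestioin03.sh, mb.suggestion02.bin, and MBsuggestion03.sh, and case matters), and a missing or wrong interpreter ("shebang") line inside the script. A minimal sketch with a made-up script name:

```shell
# demo.sh is a made-up name; a script needs an interpreter line and
# execute permission before ./demo.sh will run.
cat > /tmp/demo.sh <<'EOF'
#!/bin/sh
mkdir -p out
echo "ran with argument: $1"
EOF
chmod +x /tmp/demo.sh
/tmp/demo.sh filename01
```

If the shebang points at a nonexistent interpreter, bash reports "No such file or directory" even though the script file itself plainly exists.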
Re: [Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
With both apologies and thanks to Magic Banana, yesterday I set about finding an independent solution within which I can follow the logic. Here is the series of scripts that accomplishes the stated task. They are based on the previously attached PTRListSort.txt and IPv4.May2020.37.nMapoG.txt as illustrative data. The first script makes the 42 files that will hold the multi-address PTR's and their IPv4 addresses:

awk '{print $1,$2}' 'PTRList.txt' | awk '{print "touch CountsFiles/"$1"."$2".txt ;"}' '-' > Script.MakeFiles.txt

The second script makes 42 additional files, each one ready to contain just one PTR:

awk '{print $1,$2}' 'PTRList.txt' | awk '{print "touch PTR-files/"$1".txt ; "}' > Script.MakePTR-files.txt

The third script writes the PTR names into their just-created files:

awk '{print "echo "$1}' 'PTRList.txt' > Temp0718A.txt ; awk '{print " > PTR-files/"$1".txt ;"}' 'PTRList.txt' > Temp0718B.txt ; paste -d ' ' Temp0718A.txt Temp0718B.txt > Script.Fill.PTR-files.txt ; rm Temp0718A.txt Temp0718B.txt

The fourth script creates and lists 42 individual scripts, each one of which collects and writes the joined address data to its just-created file. Note that the first file in the join command contains only one line:

awk '{print "join -a 1
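The generate-then-run pattern used throughout the scripts above can be shown end to end on toy data (the file and directory names below are made up for the demonstration): awk writes one shell command per input line, and sh executes the generated script.

```shell
# Made-up two-column list: PTR and its count.
printf 'a.example 2\nb.example 3\n' > /tmp/PTRList-demo.txt
mkdir -p /tmp/CountsFiles-demo2
# Generate one 'touch' command per line of the list...
awk '{print "touch /tmp/CountsFiles-demo2/" $1 "." $2 ".txt"}' /tmp/PTRList-demo.txt > /tmp/Script.MakeFiles-demo.txt
# ...then execute the generated script.
sh /tmp/Script.MakeFiles-demo.txt
ls /tmp/CountsFiles-demo2
```

After the run, the directory holds a.example.2.txt and b.example.3.txt, ready to be filled by a later join or awk step.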
Re: [Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
Magic Banana, on the subject of the -a argument of join: "The real 'succinct one', as you write, would be without option -a. Neither -a 1 nor -a 2. That is what I meant." Which is absolutely correct; subconsciously I was using -a as an either/or choice, but it's also useful to make the choice in the present analysis because unmatched lines indicate errors. Magic Banana wondered about my inability to recognize those two arguments: "The two files? They are in the command line I gave." The two files in the script that I immediately recognize as my own are PTRList.txt and IPv4.May2020.37.nMapoG.txt. I now appreciate that the first argument is the (presumably first) PTR in PTRList.txt, but that second one still goes over my head. Let's look at Magic Banana's awk command:

awk 'FILENAME == ARGV[1] { a[$1] = $2 } FILENAME == ARGV[2] && $1 in a { print $2 >> "out/" a[$1] "," $1 }' PTRList.txt

Column $1 of PTRList.txt holds the multi-address PTR's; Column $2 holds the corresponding number of instances, so ARGV[1] has to be www.newsgeni.us and ARGV[2] ought to be 10. That makes the first trial command:

./MB.suggestion.bin www.newsgeni.us 10

Which elicits the following responses:

./MB.suggestion.bin: line 1: $: command not found
awk: cmd. line:1: (FILENAME=- FNR=2) fatal: can't redirect to `out/2,lo0-100.NYCMNY-VFTTP-421.verizon-gni.net' (No such file or directory)

I tried the two files instead, but I get the exact same responses as with the first two arguments.
Here's the text of MB.suggestion.txt (from which MB.suggestion.bin was made):

$ mkdir out; sort -u IPv4.May2020.37.nMapoG.txt | awk 'FILENAME == ARGV[1] { a[$1] = $2 } FILENAME == ARGV[2] && $1 in a { print $2 >> "out/" a[$1] "," $1 }' PTRList.txt -

Just in case, I'll sort PTRList.txt before running ./MB.suggestion.bin in the second trial command:

sort PTRList.txt > PTRListSort.txt ; ./MB.suggestion.sort.bin lo0-100.NYCMNY-VFTTP-421.verizon-gni.net 2

PTRList.txt was changed to PTRListSort.txt, and MB.suggestion.sort.bin was subsequently made executable beforehand. Terminal responses:

./MB.suggestion.sort.bin: line 1: $: command not found
awk: cmd. line:1: (FILENAME=- FNR=2) fatal: can't redirect to `out/2,lo0-100.NYCMNY-VFTTP-421.verizon-gni.net' (No such file or directory)

Now lo0-100.NYCMNY-VFTTP-421.verizon-gni.net in the error response is the same as the first argument of MB.suggestion.sort.bin. That's progress. George Langford
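Both error messages above have likely mundane explanations: "line 1: $: command not found" suggests the leading "$ " shell prompt was copied into the script file along with the command, and "can't redirect to `out/...'" suggests mkdir out never ran. A sketch of the script file without the prompt, demonstrated on made-up two-line stand-ins for the real PTRList.txt and IPv4.May2020.37.nMapoG.txt:

```shell
cd /tmp
# Made-up stand-ins for the two real data files:
printf 'host.a 2\n' > PTRList.txt
printf 'host.a 10.0.0.2\nhost.a 10.0.0.1\n' > IPv4.May2020.37.nMapoG.txt
# The script file must contain only the commands, never the "$ " prompt
# shown in a forum post; "-" tells awk to read the sorted stream as its
# second input file.
cat > MB.suggestion.sh <<'EOF'
#!/bin/sh
mkdir -p out
sort -u IPv4.May2020.37.nMapoG.txt |
awk 'FILENAME == ARGV[1] { a[$1] = $2 }
     FILENAME == ARGV[2] && $1 in a { print $2 >> "out/" a[$1] "," $1 }' PTRList.txt -
EOF
chmod +x MB.suggestion.sh
./MB.suggestion.sh
cat out/2,host.a
```

The two command-line arguments to the original one-liner are therefore file names (the second being "-" for standard input), not a PTR and a count.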
Re: [Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
As I've stated before, there are redundancies in my scripts, but they force me to check the order of the columns in the original files. Once the script functions as I expect it should, I relax and press on. Time flies when one's making progress.

The whole exercise in this posting is exactly to create a whole mess of files, each one listing the (sometimes millions of) IP addresses claiming the same PTR record. These text files will be linked to the Recent Visitor data spreadsheets (which Magic Banana has helpfully taught me to code in HTML), thereby keeping the basic presentation within reason. The entire set of illustrative address listings would make one webpage hopelessly immense, defeating the purpose of calling attention to the abuse of IPv4 and IPv6 address space. The address listings ought to remain in text form so that exploring the contents of those many like-named servers is done only by experienced and wary observers.

The number-of-instances column in PTRList.txt is retained only as a check on the accuracy of any script that creates individual text files for all the multi-address PTR records. Concatenated into one file, such a list makes a very slow-loading page. The final join between the listing of the PTR's in the overall PTRList.txt file and the Recent Visitor data will dictate which PTR's make the final cut. The counts can be restored later.

The initial step is to extract all the multi-address PTR's after concatenating the outputs of the nmap scripts that queried the 50,000 CIDR/16 blocks extracted from the Current Visitor data. Those scripts are already written & tested. Next, join the PTRList.txt file (no need for that superfluous Column $3) with the Recent Visitor data to discover which (of the multi-address PTR's resolved by the nmap scans) are found in the gratuitously looked-up hostnames cataloged by the Webalizer analyses of the reporting domains.
Those many Internet addresses can be readily checked with dig -x or nslookup; most of the ones that haven't been defensively changed already on the servers should resolve to their hostnames as listed in the Recent Visitor data. The last step is to publish these correlations to facilitate the defense of domains undergoing attack by hosts that cannot be resolved, except by tedious perusing of search engine data, and therefore are unblockable. Uploading the text files listing the known addresses of the reported multi-address PTR records will be a slow process, but the separate listings linked in the presentation webpage will be more accessible on a one-file-at-a-time basis than an all-encompassing master list that might quickly become obsolete. George Langford
Re: [Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
Were I to pick -a 2 in my join file, the output file would be twenty times as big as the file output with -a 1, and I would have a great deal of editing to do to clean it up. Yes, either -a 1 or -a 2 ultimately gives the same net output. I prefer the succinct one. I suspect that

$ mkdir out; sort -u IPv4.May2020.37.nMapoG.txt | awk 'FILENAME == ARGV[1] { a[$1] = $2 } FILENAME == ARGV[2] && $1 in a { print $2 >> "out/" a[$1] "," $1 }' PTRList.txt -

is meant to produce something similar to my preamble script:

join -a 1 -1 1 -2 1
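The effect of -a on join can be seen on toy inputs (the file names and hosts below are made up); join pairs lines on a common field and requires both inputs to be sorted on that field.

```shell
# Made-up inputs, both sorted on field 1.
printf 'host.a 2\nhost.b 3\n' > /tmp/counts-demo.txt       # PTR and count
printf 'host.a 10.0.0.1\nhost.a 10.0.0.2\n' > /tmp/addrs-demo.txt  # PTR and address
# -1 1 -2 1: join on field 1 of each file.
# -a 1: also print unpairable lines from file 1 (here, host.b).
join -a 1 -1 1 -2 1 /tmp/counts-demo.txt /tmp/addrs-demo.txt
```

Without -a 1 the unmatched host.b line would vanish silently; with -a 2 it is the unmatched lines of the second (much larger) file that are kept, which is why that choice inflates the output.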
Re: [Trisquel-users] Trisquel 9.0 Etiona Fully updated Gapcoin Core........CAN WE BUILD IT?
LpSkywalker has been pushing gapcoin on this forum for some time, and yet none of his posts are available for review. This thread belongs in the same wastebasket ... That said, see: Trisquel-users Digest, Vol 124, Issues 43 & 46 Trisquel-users Digest, Vol 125, Issues 15, 32 & 33 George Langford
[Trisquel-users] Find the instances of each of a list of strings and print each set in a separate file
Working towards a list of multi-address hostnames (PTR's) in a long file of PTR records (column $1) and their IPv4 addresses (column $2), I'll soon need to process the combined list of all those outputs so as to produce a relatively large number of separate files listing all the IPv4 addresses found for each PTR. Let's say that the list of multi-address hostnames is PTRList.txt (attached). The following join command provides a sorted list of all those hostnames and their IPv4 addresses:

join -a 1 -1 1 -2 1

www.newsgeni.us 10
zimbra.themicrobuttery.com 8
lo0-100.WASHDC-VFTTP-312.verizon-gni.net 3
lo0-100.PHLAPA-VFTTP-329.verizon-gni.net 3
yellowipsdirty.singlehop.com 3
luigi.nodeservers.net 2
mail.cafetech.biz 2
lo0-100.PHLAPA-VFTTP-330.verizon-gni.net 2
lo0-100.PHLAPA-VFTTP-328.verizon-gni.net 2
lo0-100.PHLAPA-VFTTP-313.verizon-gni.net 2
mail31.newsgeni.us 2
ymfb.uat.yesmail.com 2
mail.beaumont.ab.ca 2
mail1.yhti.net 2
mail6.marketing.csid.com 2
mail2.mail.mykroll.idmonitoringservice.com 2
localhost.localdomain 2
xplr-96-44-95-129.xplornet.com 2
xplr-96-44-87-230.xplornet.com 2
xplr-96-44-83-41.xplornet.com 2
xplr-96-44-72-226.xplornet.com 2
xplr-96-44-66-20.xplornet.com 2
xplr-96-44-65-185.xplornet.com 2
xplr-96-44-122-238.xplornet.com 2
xplr-96-44-121-154.xplornet.com 2
xplr-96-44-101-143.xplornet.com 2
xplr-96-44-100-62.xplornet.com 2
www.stuttgarthaus.com 2
lo0-100.WASHDC-VFTTP-318.verizon-gni.net 2
lo0-100.WASHDC-VFTTP-325.verizon-gni.net 2
lo0-100.WASHDC-VFTTP-371.verizon-gni.net 2
lo0-100.PHLAPA-VFTTP-359.verizon-gni.net 2
lo0-100.PHLAPA-VFTTP-319.verizon-gni.net 2
mail3.softerware.com 2
lo0-100.WASHDC-VFTTP-323.verizon-gni.net 2
lo0-100.NYCMNY-VFTTP-421.verizon-gni.net 2
lo0-100.WASHDC-VFTTP-338.verizon-gni.net 2
lo0-100.RCMDVA-VFTTP-304.verizon-gni.net 2
zircon.superdnssite.com 2
low.lowe001.net 2
mail.allta.com.ua 2
xgrin2-lo.cosmonova.net.ua 2
lo0-100.NYCMNY-VFTTP-415.verizon-gni.net 96.232.255.153
lo0-100.NYCMNY-VFTTP-421.verizon-gni.net 96.232.100.1
lo0-100.NYCMNY-VFTTP-421.verizon-gni.net 96.232.99.1
lo0-100.NYCMNY-VFTTP-422.verizon-gni.net 96.232.255.145
lo0-100.NYCMNY-VFTTP-452.verizon-gni.net 96.246.217.1
lo0-100.PGHKNY-VFTTP-307.verizon-gni.net 96.248.54.1
lo0-100.PHLAPA-VFTTP-302.verizon-gni.net 96.245.76.1
lo0-100.PHLAPA-VFTTP-304.verizon-gni.net 98.115.179.1
lo0-100.PHLAPA-VFTTP-307.verizon-gni.net 98.114.204.1
lo0-100.PHLAPA-VFTTP-310.verizon-gni.net 98.115.35.1
lo0-100.PHLAPA-VFTTP-312.verizon-gni.net 98.115.6.1
lo0-100.PHLAPA-VFTTP-313.verizon-gni.net 98.111.132.1
lo0-100.PHLAPA-VFTTP-313.verizon-gni.net 98.115.208.1
lo0-100.PHLAPA-VFTTP-315.verizon-gni.net 98.115.124.1
lo0-100.PHLAPA-VFTTP-319.verizon-gni.net 96.245.14.1
lo0-100.PHLAPA-VFTTP-319.verizon-gni.net 98.114.50.1
lo0-100.PHLAPA-VFTTP-328.verizon-gni.net 98.114.231.1
lo0-100.PHLAPA-VFTTP-328.verizon-gni.net 98.114.249.1
lo0-100.PHLAPA-VFTTP-329.verizon-gni.net 96.227.106.1
lo0-100.PHLAPA-VFTTP-329.verizon-gni.net 96.227.90.1
lo0-100.PHLAPA-VFTTP-329.verizon-gni.net 98.115.87.1
lo0-100.PHLAPA-VFTTP-330.verizon-gni.net 98.114.255.49
lo0-100.PHLAPA-VFTTP-330.verizon-gni.net 98.115.113.1
lo0-100.PHLAPA-VFTTP-333.verizon-gni.net 98.115.239.1
lo0-100.PHLAPA-VFTTP-336.verizon-gni.net 98.114.255.57
lo0-100.PHLAPA-VFTTP-337.verizon-gni.net 96.245.59.1
lo0-100.PHLAPA-VFTTP-339.verizon-gni.net 98.114.33.1
lo0-100.PHLAPA-VFTTP-351.verizon-gni.net 96.227.133.1
lo0-100.PHLAPA-VFTTP-359.verizon-gni.net 96.245.62.1
lo0-100.PHLAPA-VFTTP-359.verizon-gni.net 98.114.21.1
lo0-100.PHLAPA-VFTTP-368.verizon-gni.net 96.227.99.1
lo0-100.PHLAPA-VFTTP-385.verizon-gni.net 98.115.209.1
lo0-100.PITBPA-VFTTP-304.verizon-gni.net 96.236.217.1
lo0-100.PITBPA-VFTTP-310.verizon-gni.net 96.236.210.1
lo0-100.PITBPA-VFTTP-311.verizon-gni.net 96.236.130.1
lo0-100.PITBPA-VFTTP-312.verizon-gni.net 98.111.211.1
lo0-100.PITBPA-VFTTP-314.verizon-gni.net 96.235.28.1
lo0-100.PITBPA-VFTTP-326.verizon-gni.net 96.236.211.1
lo0-100.PRVDRI-VFTTP-305.verizon-gni.net 96.253.59.1
lo0-100.PRVDRI-VFTTP-308.verizon-gni.net 96.253.49.1
lo0-100.PRVDRI-VFTTP-315.verizon-gni.net 96.253.22.1
lo0-100.PRVDRI-VFTTP-317.verizon-gni.net 96.253.44.1
lo0-100.RCMDVA-VFTTP-303.verizon-gni.net 96.253.69.1
lo0-100.RCMDVA-VFTTP-304.verizon-gni.net 96.228.33.1
lo0-100.RCMDVA-VFTTP-304.verizon-gni.net 96.248.6.1
lo0-100.RCMDVA-VFTTP-307.verizon-gni.net 96.253.88.1
lo0-100.RCMDVA-VFTTP-308.verizon-gni.net 96.228.32.1
lo0-100.RCMDVA-VFTTP-310.verizon-gni.net 98.117.79.1
Re: [Trisquel-users] Syntax problems with nested loops
After sitting on Magic Banana's suggested executable script for the near-forever of four days, I found the time and concentration to try it. First, I stripped those trailing dots (which facilitated pasting nmap's random 3rd & 4th octets) with the sed script suggested by Magic Banana:

sed -i 's/\.$//' Prefixes.IPv4.May2020.Corrected00.txt

which doesn't need or use a renamed output. Then I copied the suggested script to a file I call MB.IPv4.urandom.txt, saved it as MB.IPv4.urandom.bin, and made it executable with:

sudo chmod +x MB.IPv4.urandom.bin

I then applied it to the first of the 38 divisions of the CIDR/16 prefixes from May's Current Visitor data:

./MB.IPv4.urandom.bin Prefixes.IPv4.May2020.Corrected00.txt 2560 > Addresses.IPv4.00.678x2560.txt

This script took about three seconds and produced a 25.5 MB file of 1,735,680 addresses: 2,560 for each of the 678 CIDR/16 prefixes. My nmap method of generating those 3rd & 4th octets takes considerably more scripting and about 45 seconds to produce about the same number of addresses. Many fewer places to enter new arguments, too. Nice work, Magic Banana! George Langford
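The text of MB.IPv4.urandom.bin isn't quoted in this post, but the task it performs can be sketched with awk (this is an assumption about the approach, not Magic Banana's actual script): for each /16 prefix "A.B" in the input, emit n addresses whose 3rd and 4th octets are pseudo-random.

```shell
# Made-up prefixes standing in for Prefixes.IPv4.May2020.Corrected00.txt.
printf '192.0\n198.51\n' > /tmp/prefixes-demo.txt
# For each prefix line, print n addresses with random last two octets;
# int(256 * rand()) yields an integer in 0..255.
awk -v n=4 'BEGIN { srand() }
     { for (i = 0; i < n; i++)
           print $1 "." int(256 * rand()) "." int(256 * rand()) }' /tmp/prefixes-demo.txt
```

With n=2560 and 678 prefixes this produces 678 x 2560 = 1,735,680 lines, matching the count reported above.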
Re: [Trisquel-users] How can awk as a substitute for paste
Here's one of the old scripts which was causing the output file to be updated:

awk '{print $1}' 'B-May2020-L.768-secondary.txt' | sudo nmap -6 -Pn -sn -T4 --max-retries 16 -iL '-' -oG - | grep "Host:" '-' | awk '{print $2,$3}' '-' | sed 's/()/(No_DNS)/g' | tr -d '()' | uniq -u > May2020-L.768-secondary.oGnMap.txt

There's no sort there, just uniq, which I was expecting only to remove adjacent duplicates. So we aren't at odds. There are sixteen instances of the script running right now, which keeps the network I/O nice and steady. Some dips happen when there's other traffic, but not because the responses that nmap receives are sporadic; those dips are ironed out while so many other nmap scans are ongoing. George Langford
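The difference between uniq on unsorted input and sort -u is easy to see on a four-line toy file; note also that uniq -u does not keep one copy of each duplicate but drops every adjacently repeated line entirely.

```shell
# Toy input: "b" appears three times, but only two copies are adjacent.
printf 'b\na\nb\nb\n' > /tmp/uniq-demo.txt
uniq /tmp/uniq-demo.txt     # collapses adjacent duplicates only: b a b
sort -u /tmp/uniq-demo.txt  # sorts, then removes all duplicates: a b
uniq -u /tmp/uniq-demo.txt  # keeps only lines not adjacently repeated: b a
```

This is why a pipeline that ends in uniq (or uniq -u) without a preceding sort can still emit duplicates that arrive non-adjacently.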
Re: [Trisquel-users] How can awk as a substitute for paste
Magic Banana astutely proposed:

(...) | tee SyncIPv4/Random.IPv4.May2020.37.nMap.primary.oG_unsorted.txt | sort > SyncIPv4/Random.IPv4.May2020.37.nMap.primary.oG.txt

which works right away and without modification, in concert with the preceding awk statement that I had not yet removed:

awk '{print $0}' 'Addresses.IPv4.May2020.35.txt' |

The ...unsorted.txt and nMap.primary.oG.txt files both appeared: first nMap.primary.oG.txt with zero bytes, followed (in a few seconds, not five minutes as in "./SyncIPv4.bin") by the ...unsorted.txt file, which began filling up frequently (as the main file used to do with that awk script, for reasons unbeknownst to either of us). Thank you! George Langford
Re: [Trisquel-users] How can awk as a substitute for paste
Magic Banana wrote: "The file contents are sync'ed, not the file names." Should I expect the time stamp to change ? amenex wrote: sudo .filename.bin ... alas, Terminal's response was "command not found." To which Magic Banana responded: "There is no executable named .filename.bin in any directory listed in $PATH." That reflects the truth: .filename.bin doesn't actually reside anywhere; I had invoked it from the command line in the working directory (where the files to be sync'ed are located). And then Magic Banana continued: if there is an executable named filename.bin in the working directory and if it accepts 5m as a single argument, [one] may execute it in this way: $ ./filename.bin 5m adding that there is no need here for administrator privileges, as granted by sudo. Amenex exclaimed: Progress ! Terminal responded as it would when a live script is in progress ... The output of my long & tortuous nmap script is meant to go into the directory from which filename.bin was executed, as recommended by Magic Banana; it appears with a time stamp corresponding to its starting time and has had zero bytes from the get-go (as in my usual experience), but several five-minute sleep periods have gone by and there's no sign of any changes in the output file. George Langford
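For readers hitting the same "command not found": the leading "./" is an explicit path to a file in the working directory, not part of the name, so "sudo .filename.bin" asks the shell to look up a command called ".filename.bin" in $PATH, which fails. A tiny demonstration (script name invented):

```shell
# Create a trivial executable script in the working directory ...
printf '#!/bin/sh\necho running\n' > demo.bin
chmod +x demo.bin     # setting the executable bit on one's own file needs no sudo

# ... and run it with an explicit path; "demo.bin" alone would fail
# unless the working directory were listed in $PATH.
./demo.bin            # prints: running
rm demo.bin
```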
Re: [Trisquel-users] How can awk as a substitute for paste
Regarding the non-scholarly: | sed 's/()/(No_DNS)/g' | tr -d '()' Nmap writes parentheses with nothing between them when it receives no PTR data for an IP address; it also writes parentheses around an IP address whose PTR has been received. My expression puts the IP addresses all in one column and the PTR responses (including No_DNS) in a different column. Yes, it is redundant; I really should have written: | sed 's/()/No_DNS/g' | tr -d '()' There are still two expressions, and only two fewer characters among them. Those sort -k 1's are indeed redundant, but writing the extra characters reminds me to check where the pertinent column actually is, and I don't perceive any harm in the result. Data buffers: I actually do understand, because I watch the System Monitor while the nmap scans are ongoing, and there's a dip in the network traffic about every 100 seconds, presumably while those buffers are being updated. The RAM usage climbs steadily as a number of simultaneously running scripts gather their multi-megabyte data. Also, losing network connectivity does not cause errors to appear in the accumulated data. "the counts that uniq -c outputs are removed immediately after" I have had trouble with sort -u and uniq -u, whereas uniq -c works every time; and discarding the counts doesn't tax the HDD because that write isn't otherwise saved. "Save the script I wrote in a file ... " Alas, the script puzzles me, partly because I can find no man page for "fi" but mainly because it is composed with shorthand expressions which I don't comprehend. As I rarely mix nmap script executions with other tasks, mainly because everything else is dramatically slowed down when there are a dozen scripts running, can I be assured that this sync script is a generalized one that I don't have to modify at all ? 
There won't be very frequent updates (think five minutes, i.e. 5m), because that would still mean a filename update every 25 seconds while a dozen scripts are running; greater frequency would have the list of files changing too often. After changing Magic Banana's sync program to increase the sleep interval to five minutes (period=5m), I saved it as filename.bin, made it executable with chmod +x filename.bin, and tried to start it with sudo .filename.bin ... alas, Terminal's response was "command not found." I couldn't find that file either ... ".filename.bin" is lost. I actually called it SyncIPv4.bin. George Langford
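On the puzzle of finding no man page for "fi": it is not a program but shell syntax, the keyword that closes an "if" block (just as "done" closes a loop), so it is documented inside man bash rather than on a page of its own. As a hedged illustration of the shape of such a script, not Magic Banana's actual one, a generic periodic-sync loop might read:

```shell
#!/bin/sh
# Periodically flush cached writes; stop when the watched directory is gone.
dir=${1:-.}        # first argument: directory to watch (default: current)
period=${2:-300}   # second argument: sleep interval in seconds (default 5 min)

while sleep "$period"; do
    if [ -d "$dir" ]; then   # "if ... fi" brackets the test, like "do ... done"
        sync                 # commit dirty buffers to persistent storage
    else
        break                # directory vanished: leave the loop
    fi
done
```

Because dir and period are read from the command line, the same file serves every scan without modification, e.g. ./syncloop.sh SyncIPv4 300.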
Re: [Trisquel-users] How can awk as a substitute for paste
Here's the result from my first thoughtful guess: while sleep 300; do sync -f /home/george/Desktop/May2020/nMapScans/ScoreCards-IPv4/SyncIPv4/; done | time sudo nmap -Pn -sn -T4 --max-retries 8 -iL Addresses.IPv4.May2020.37.txt -oG - | grep "Host:" '-' | awk '{print $2,$3}' '-' | sed 's/()/(No_DNS)/g' | tr -d '()' | uniq -c | awk '{print $3"\t"$2}' '-' | sort -k 1 > SyncIPv4/Random.IPv4.May2020.37.nMap.primary.oG.txt Nothing appeared until the script finished, whereupon time added the following: 195.00user 99.48system 4:15:16elapsed 1%CPU followed by the rest of time's resource report: (0avgtext+0avgdata 13032maxresident)k 0inputs+0outputs (0major+2486minor)pagefaults 0swaps without any actual conclusion of the script; the prompt hadn't reappeared, even though there was no network activity, so I used Ctrl+C to regain the prompt. Where was sync saving the cached output during the 4-1/4 hours the script was running ? There were many singular PTR records, a modest number of No_DNS responses, and few, if any, multi-address PTR's.
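Incidentally, a pipe only connects the loop's stdout to nmap's stdin; it does not make the two run independently, which may explain the stuck prompt. One conventional arrangement (a sketch, with the nmap stage elided as a comment) is to background the sync loop with '&' and kill it when the scan completes:

```shell
# Start the periodic sync in the background and remember its process ID.
while sleep 300; do sync; done &
sync_pid=$!

# Run the long foreground pipeline as usual, e.g.:
#   sudo nmap -Pn -sn -T4 --max-retries 8 -iL Addresses.IPv4.May2020.37.txt -oG - \
#     | ... > SyncIPv4/Random.IPv4.May2020.37.nMap.primary.oG.txt

# When the scan is done, stop the background loop.
kill "$sync_pid"
```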
Re: [Trisquel-users] How can awk as a substitute for paste
While all this coding was going on, I discovered a flaw in my logic, whereby my method of separating not-looked-up IPv4 addresses from the Recent Visitor data was extracting some lookalike IPv4 data from the hostnames, including impossible addresses. I used comm to select only those addresses that were in both the calculated list and the Recent Visitor data, reducing the list of extracted two-octet prefixes from over 50,000 to around 27,000. Naturally, now that Magic Banana tells us that awk won't do what I expect, sure enough, it stopped causing those user-friendly intermediate writes. The man sync and info sync pages are not helping me; as yet I haven't the slightest clue where sync stores the data in persistent storage; if I knew, I could watch it develop. There are now 38 files to be created with long-running nmap scripts; if I could write one sync script before I start and not stop it until the last of those 38 files finishes, and still be able to watch the occasional disk writes so I could evaluate the progress of the scripts, that would be ideal. If I could list the 38 output files in advance, that would save a lot of scrambling while the nmap searches are ongoing, about ten at a time. The good news is that when nmap started complaining about 942.103.*.* addresses, I used grep to search all the Recent Visitor files and found (in a few seconds) three instances of: as45942.103.28.157.43.lucknow.sikkanet.com dig -x reveals that the actual IP address is 103.28.157.43, not 942.103.28.157, neither of which ought to contribute to the two-octet prefix list (but 103.28 got in there anyway !) If I create a subfolder called Sync under the May2020/IPv4 folder and store a list of the expected output filenames to which sync is to be applied in a text file there, can sync be made to activate and save cached data to the appropriate filename as soon as the nmap script is started ? George Langford
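Impossible prefixes like 942.103 can be screened out mechanically before they ever reach nmap: split on the dot and require exactly two purely numeric fields, each at most 255. A sketch of such a filter (file names invented):

```shell
# Keep only lines that are plausible two-octet prefixes such as "103.28":
# exactly two dot-separated fields, each all digits and no greater than 255.
awk -F '.' 'NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ &&
            $1 <= 255 && $2 <= 255' Prefixes.txt > Prefixes.valid.txt
```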
[Trisquel-users] How can awk as a substitute for paste
Some time ago I noticed that an awk script piped into a long-running nmap script caused the overall script to make relatively frequent saves of the output file, providing some assurance that the script was making progress and providing a record of its accomplishment. That occasionally allowed me to restart the script (as in the case of a loss of mains power) and pick up where the script stopped, by truncating its source file and thereby avoiding having to run the entire source file from the beginning. Here's such a script: awk '{print $1}' 'B-May2020-L.768-secondary.txt' | sudo nmap -6 -Pn -sn -T4 --max-retries 16 -iL '-' -oG - | grep "Host:" '-' | awk '{print $2,$3}' '-' | sed 's/()/(No_DNS)/g' | tr -d '()' | uniq -u > May2020-L.768-secondary.oGnMap.txt There is no real need for the leading awk '{print $1}' 'B-May2020-L.768-secondary.txt', because it's not actually changing the file, but the result of its use is the extremely helpful & protective effect of causing the overall script to make frequent saves to HDD. Now I am using a paste script immediately before the nmap script, but this similar use of the pipe process is not causing any periodic saves to take place. With fifteen such scripts running at once, a lot of data can disappear if the mains power is lost in a windstorm. The source file in each example is simply a list of IPv6 or IPv4 addresses made up in parts of real and randomly generated blocks, with nothing that requires rearrangement or sorting. A crude way of accomplishing this could be to insert a similar redundant awk script between the paste and nmap sections of the main script, but is there a more geek-like way of forcing the script to make periodic saves ? That method need not be especially efficient, as my network's sending/receiving speeds appear to be the rate-limiting factors, presently about 30 kB/second. George Langford
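A more surgical way to get frequent writes than a do-nothing awk stage, assuming GNU coreutils is available: stdbuf can force the last command in the pipeline to line-buffer its stdout, so each result line lands in the output file as it is produced rather than when a multi-kilobyte stdio block fills.

```shell
# Sketch: line-buffer the final stage of the long nmap pipeline so the
# output file grows line by line, e.g.
#   ... | sed 's/()/No_DNS/g' | tr -d '()' | stdbuf -oL uniq -u > out.txt

# Self-contained check that the filtering behavior is unchanged under
# line buffering; only the flushing granularity differs.
printf 'a\na\nb\n' | stdbuf -oL uniq -u   # prints only: b
```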
Re: [Trisquel-users] Syntax problems with nested loops
Here's what I was trying to accomplish: Starting with Magic Banana's script: TMP=$(mktemp -u) trap "rm $TMP 2>/dev/null" 0 mkfifo $TMP awk '{ for (i = 0; ++i <= 512;) print }' Prefixes.May2020.Slash16.txt > $TMP & od -A n -t u1 -w2 -N $(expr 1024 \* $(wc -l < "Prefixes.May2020.Slash16.txt")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP - > TempMB42920.5160x512.txt Trial run, using the single prefix in Prefix.May2020.Slash16.txt ==> TMP=$(mktemp -u) trap "rm $TMP 2>/dev/null" 0 mkfifo $TMP awk '{ for (i = 0; ++i <= 512;) print }' Prefix.May2020.Slash16.txt > $TMP & [1] 31825 od -A n -t u1 -w2 -N $(expr 1024 \* $(wc -l < "Prefix.May2020.Slash16.txt")) /dev/urandom | tr -s ' ' . | paste -d '' $TMP - > Temp0710G.txt [1]+ Done awk '{ for (i = 0; ++i <= 512;) print }' Prefix.May2020.Slash16.txt > $TMP
Re: [Trisquel-users] Syntax problems with nested loops
Alas, I could not make Magic Banana's script for generating randomized IPv4 addresses work; most certainly through my own failure to comprehend sufficiently what's going on. I tried constructing an od script, but the best I could do was to generate 3rd & 4th octets which violate the 255 maximum of an octet, with no clear way of limiting the decimal-equivalent octets. So I involved that famous, but maligned, search engine, Google, where I found this gem: https://stackoverflow.com/questions/24059607/nmap-taking-a-random-sample-from-a-range-of-ip-ranges-is-combining-ir-and where it's said: nmap -n -sL -iL ip_ranges -oG - | awk '/^Host/{print $2}' | shuf -n 10 I tried that script out, but it looks up all the IP addresses in the entire CIDR block first and then shuffles the results before selecting a modest number of them ... after we have had to wait for the resolution of them all. But there's a better one leading the way: https://unix.stackexchange.com/questions/455081/how-does-this-script-generate-an-ip-address-randomly where it's said: nmap -n -iR 10 -sL | awk '/report for/ { print $NF }' but that's apparently all that any of the many nmap scripts are written to do. Then it dawned on me that I could pinch the 3rd and 4th octets and graft them onto the first two octets of my recent-visitor data's collected and not-looked-up IPv4 addresses: sudo nmap -n -iR 2286000 -sL | awk '/report for/ { print $NF }' - | sed 's/\./\t/g' '-' | awk '{print $3"."$4}' '-' > Octets.3and4.File01.txt
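On the 255 ceiling: od with '-t u1' decodes each byte as an unsigned 8-bit integer, so the values it prints are confined to 0..255 by construction and need no extra limiting. A small self-contained check with three known bytes in place of /dev/urandom:

```shell
# Feed od the bytes 0x00, 0x7f, 0xff; -t u1 prints them as unsigned
# decimals, which can never exceed 255.
printf '\000\177\377' | od -A n -t u1   # prints: 0 127 255
```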
Re: [Trisquel-users] Can a path statement be too long or the file too big?
Grep worries me because it selects a lot of names that aren't in the pattern file, even while the operation remains orderly and manages to select just a fraction of the target file's entries. I had thought that grep has the advantage of allowing me to identify long PTR records based on permutations of their IPv6 addresses, but such comparisons did not occur in the present set of patterns based on IPv4 addresses, where there weren't any examples of permuted IPv4 addresses in the target file's PTR's. Join selected just 141 matches, which were easy to recognize because those matches alone included the data in the pattern file's second column. Comm also selects those 141 matches; and I used join to restore their counts column. The join-, sort-, and comm-based scripts all executed orders of magnitude faster than the grep script. The original pattern file and a randomized (sort -R) as well as reduced-length (one million+ to 300,000 rows) target file are attached, for which join as well as comm find 35 matches in short order.
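The division of labor described above can be seen on toy data: comm -12 reports the keys common to both files, and join does the same while carrying along the counts column from the file that has one. Both tools require their inputs sorted on the key (file names invented):

```shell
# A pattern file with counts, and a plain target list; both sorted on column 1.
printf 'alpha 3\ncharlie 7\n'    > patterns.txt
printf 'alpha\nbravo\ncharlie\n' > targets.txt

# comm needs single-column inputs, so strip the counts for the comparison.
awk '{print $1}' patterns.txt > keys.txt
comm -12 targets.txt keys.txt    # alpha, charlie: the matching keys only
join patterns.txt targets.txt    # alpha 3, charlie 7: matches with counts restored
rm patterns.txt targets.txt keys.txt
```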
Re: [Trisquel-users] Can a path statement be too long or the file too big?
Pressing on to delve into another part of the project ... Again, another grep script in the same vein worked very hard for about twenty minutes and then disgorged the original target file in its output, using an increment of 4.7GB of RAM. Even with the target file moved into the working directory ... so that workaround failed. Faced with no alternative, this lazy semi-geek condensed the target file to a simple one-column list (which he ought to have done at the outset) and re-ran the grep script on the condensed target, with the happy response of a slowly rising output file (rising in kB increments rather than multi-MB), the same RAM usage (the additional 4.7GB), and usage of 520kB of swap, which it had not done before. Grep again took about twenty minutes to accomplish all this. Grep was clearly going through the same motions each time, but somehow its actual output got overwritten in the first instance and was protected in the second run. The target file was 63.5MB in the first run and 26.5MB after condensing, but now the output file has real grepped data without superfluous material: about 3MB and 71,000 rows of non-duplicated matching PTR's. One difference between yesterday and today is that the pattern file now has nearly 4,000 hostnames. What should I be watching in the terminal ? dmesg | tail or dmesg | more ?
Re: [Trisquel-users] Can a path statement be too long or the file too big?
After thinking about my observation last night, it dawned on me that the script has to reach down through the directory structure of the host computer before negotiating the directory structure of the flash drive. Let's articulate that structure in the scripts as though they were being run from arbitrary working directories: grep -f /home/george/Desktop/May2020/nMapScans/Pattern-File-A.txt /home/george/FlashDrive/...4 steps.../Target-File-A.txt > Output-File-A.txt versus the unsuccessful script: grep -f /home/george/Desktop/January2020/DataSets/MB/Multi-addressed-HNs/Counts/Pattern-File-C.txt /home/george/FlashDrive/...4 steps.../Target-File-B.txt > Output-File-B.txt and, after moving the target file: cd /home/george/Desktop/January2020/DataSets/MB/Multi-addressed-HNs/Counts ; grep -f Pattern-File-C.txt Target-File-B.txt > Output-File-C.txt Analogously, moving a multi-gigabyte file: mv hugefileA.txt /home/george/someplaceelse/folderB/folderC/hugefileA.txt takes just the blink of an eye so long as the move takes place within the file structure of the storage medium; all that really changes is the number of characters in the path statement. The multi-megabyte target file moved in just a few blinks of an eye and can be moved back to its flash drive just as quickly ... I actually copied the file, so I can just delete it and then empty the trash. In the present case it was painfully obvious that the unsuccessful script wasn't merely inaccurate or noisy but plainly wholly unsuccessful, as the output file was the exact same size as the target file. Perhaps this is not a bug but a feature, necessary for some historic or practical reason, whatever that may be.
[Trisquel-users] Can a path statement be too long or the file too big?
Faced with the task of using grep to find related strings in a target file located on a USB flash drive, with the strings located in a file in the local directory, I had success with the first two scripts, generalized to protect the innocent: grep -f Pattern-File-A.txt /home/george/FlashDrive/...4 steps.../Target-File-A.txt > Output-File-A.txt grep -f Pattern-File-B.txt /home/george/FlashDrive/...4 steps.../Target-File-B.txt > Output-File-B.txt These worked fine, with a satisfying number of matches. Continuing, I used a similar script with the same path length from a third pattern file to the second target file, but the output file turned out to be identical to the input target file, with zero matches. My non-geek workaround was simply to move the target file into the working directory, whereupon I met with success: grep -f Pattern-File-C.txt Target-File-B.txt > Output-File-C.txt The output was clean, with nothing irrelevant. Target file sizes were 32.2MB, 63.5MB, and 63.5MB (the last two being the same file). The Pattern files contained 385, 155, and 232 strings, respectively. The successful output files included about 5500, 4000, and 600 matches, respectively. There were no other scripts running during this exercise. The scripts each took a small fraction of a second CPU time, even the unsuccessful one. 8GB RAM, 18GB swap, T420 ThinkPad George Langford
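One mundane failure mode worth ruling out when grep -f suddenly returns the whole target file (a possibility to check, not a diagnosis of this case): a blank line in the pattern file is an empty pattern, and an empty pattern matches every line of the target.

```shell
# A one-pattern file behaves as expected ...
printf 'needle\n' > patterns.txt
printf 'needle\nhay\nstack\n' > target.txt
grep -f patterns.txt target.txt        # matches just "needle"

# ... but a single stray blank line makes every target line match.
printf '\n' >> patterns.txt
grep -f patterns.txt target.txt        # now the whole target comes back
rm patterns.txt target.txt
```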
Re: [Trisquel-users] Another uniq -u feature emerges
Sort has a problem, as illustrated with the attached pair of files, wherein the first application of sort below fails to separate the three sets of PTR's: sort -k 1 Temp0706AR.txt > Temp0706AS.txt My workaround does manage the separation OK without changing the file otherwise: awk '{print $1,$2}' 'Temp0706AR.txt' | sed 's/\./'\ '/g' | sort -k 3 | awk '{print $1,$2,$3,$4,$5"\t"$6}' '-' | sed 's/'\ '/\./g' | awk '{print $1"\t"$2}' '-' > Temp0706AS.txt The second script is working and follows my straightforward logic. The first script sometimes does work; why should that be so ? My source files contain IPv6 addresses for each PTR, but none of these PTR's can be otherwise resolved; even when dig returns an appropriate nameserver, that nameserver nearly always turns out to be unavailable. At times, a nonauthoritative nameserver will reply with an IPv4 address corresponding to ...barefruit.co.uk, a catchall site. George Langford
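One documented reason plain sort sometimes groups dotted names and sometimes doesn't (an assumption worth testing against the attached files, not a certain diagnosis): the default locale's collation largely ignores punctuation when comparing, whereas LC_ALL=C compares raw bytes, so dots sort consistently before letters and dotted forms stay grouped:

```shell
# Byte-wise (C locale) ordering keeps the dotted forms together, because
# '.' (byte 0x2e) always sorts before any letter.
printf 'hostb.example\nhost.b.example\nhost.a.example\n' | LC_ALL=C sort
# host.a.example
# host.b.example
# hostb.example
```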
Re: [Trisquel-users] Another uniq -u feature emerges
Within 'info sort' there are sections: Ordering options: -b through -V Other options: --batch-size=NMERGE through -u and on to --version The lesson for me: when advised to read, read all of it ... George Langford, suffering the consequences of failing to appreciate the subtleties of uniq -u and sort -u until very recently.
Re: [Trisquel-users] Another uniq -u feature emerges
Magic Banana quoted the magic words: "used by itself, -u causes 'uniq' to print unique lines, and nothing else" Put another way: Uniq -u skips duplicated lines altogether. Sort -u needs to be mentioned earlier on the man sort page, as it will handle unpredictable outputs reliably. Thanks are due to the always-conscientious teacher !
Re: [Trisquel-users] Another uniq -u feature emerges
Following up, I noticed a pattern among the outputs of | sort | uniq -u versus | sort -u: The three files that I evaluated had 26.1GB, 12GB, and 2.0GB, respectively, among 1. the original file, the result of grepping about 10GB of nMap output files, with many duplicates; 2. the | sort -u file; and 3. the | sort | uniq -u file, the smallest of the three. I applied comm (with no arguments): comm IPv6-uniq.lns01.v6.018.net.il.txt IPv6-uniqB.lns01.v6.018.net.il.txt > IPv6-commAll.lns01.v6.018.net.il.txt An excerpt from this last script's output is attached; it has no Column 2 (lines unique to the second, smaller file); Column 3 (the lines common to both files) has nothing obviously different from the entries above & below. Not to contradict man uniq's description of uniq -u, but I'm suspicious. I'll be using sort -u from now on.
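A reminder of what comm with no options emits, on toy data: column 1 holds lines only in the first file, column 2 (indented one tab) lines only in the second, and column 3 (indented two tabs) lines common to both; the flags -1, -2, -3 suppress the corresponding columns.

```shell
printf 'a\nb\nc\n' > first.txt
printf 'b\nc\nd\n' > second.txt

comm first.txt second.txt     # "a" | "d" | "b","c" in three tab-indented columns
comm -12 first.txt second.txt # only the shared lines: b, c
rm first.txt second.txt
```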
[Trisquel-users] Another uniq -u feature emerges
Over a year ago, I lamented that sort followed by uniq -u wasn't removing duplicates from a list: https://trisquel.info/en/forum/sort-and-uniq-fail-remove-all-duplicates-list-hostnames-and-their-ipv4-addresses Recently I've been faced with the results of grep searches in other files that overlap because they contain the same string on which grep was searching. After sorting the grep outputs, then cutting & pasting, I ended up with pairs of files that contain many duplicates because the strings were caught twice. grep -h lns03.v6.018.net.il *Rev.oGnMap.txt >> PTR.IPv6-Data/IPv6-lns03.v6.018.net.il.txt ; grep -h cable-lns03.v6.018.net.il *Rev.oGnMap.txt >> PTR.IPv6-Data/IPv6-cable-lns03.v6.018.net.il.txt The grep outputs were expected to list the PTR record in the first column and the corresponding IPv6 address in the second column, because I reversed the order of those columns in the outputs of the original nMap -oG searches as well as removing the parentheses enclosing the IPv6 addresses. In the sorting scripts below, $1 is the PTR and $2 is the IPv6 address, except for the uniq -c script, where I printed $2 and $3 to skip the counts column produced by uniq -c. Here are the three pairs of scripts intended to consolidate the files: sort IPv6-lns03.v6.018.net.il.txt | uniq -u > IPv6-uniq.lns03.v6.018.net.il.txt ; sort IPv6-cable-lns03.v6.018.net.il.txt | uniq -u > IPv6-uniq.cable-lns03.v6.018.net.il.txt sort -k 2 IPv6-lns03.v6.018.net.il.txt | uniq -c | awk '{print $2"\t"$3}' '-' > IPv6-uniq.lns03.v6.018.net.il.txt ; sort -k 2 IPv6-cable-lns03.v6.018.net.il.txt | uniq -c | awk '{print $2"\t"$3}' '-' > IPv6-uniq.cable-lns03.v6.018.net.il.txt sort -u IPv6-lns03.v6.018.net.il.txt > IPv6-uniqB.lns03.v6.018.net.il.txt sort -u IPv6-cable-lns03.v6.018.net.il.txt > IPv6-uniqB.cable-lns03.v6.018.net.il.txt The first pair produced zero bytes of output for both scripts; the original files were not zero. The second pair reduced both files by half, as expected.
Then I remembered to check this forum, wherein Magic Banana had suggested using sort -u instead of the first pair's combination of sort and uniq -u. This third pair produced the exact same halving of the original file sizes as my less efficient use of uniq -c and awk to eliminate the counts column. Thank you again, Magic Banana ! I had tried to "fix" the uniq -u debacle of the second pair of sorting scripts by copying the affected file names directly from the File manager into the script text, as that has been a useful workaround in the past, but this time the first pair of sorting scripts produced zero bytes output again, same as did my first attempt. What is it about uniq -u of which I should be wary ? George Langford
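The wariness can be settled on three lines of toy input: sort -u keeps one copy of every distinct line, while uniq -u discards any line that has a duplicate, keeping nothing of it. On a file where every address occurs exactly twice, that is precisely zero bytes of output, matching the first pair's result.

```shell
printf 'a\na\nb\n' | sort -u             # a, b : one copy of each distinct line
printf 'a\na\nb\n' | sort | uniq -u      # b    : duplicated lines vanish entirely
printf 'a\na\nb\nb\n' | sort | uniq -u   # (nothing: every line had a twin)
```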
Re: [Trisquel-users] Clientes do Banco do Brasil: reclamem do software proprietário de novo de novo!
For those who may wrongly suspect that the writers are up to something ... Google Translate claims that Magic Banana said: Another thread to invite Banco do Brasil customers to complain in the new BB survey of proprietary software for remote access to their accounts. See Link 1 and Link 2 for examples of texts I wrote in previous years. I did a mix of the two this year. To which augustoborin replied (according to Google Translate): I find these banks absurd. The apps of the Federal Savings Bank and Banco do Brasil are so invasive. They ask for access to location, contacts, storage and your soul and so much more. The alternative I use is to use a virtual machine, but it is not practical, besides being slow on older computers. I would like to know if anyone will ever discover a digital bank really concerned with these issues. Perhaps they are motivated by attempts to track money laundering and tax evasion or to prepare for escheatment in the light of the ongoing pandemic. The U.S. Congress is similarly unfriendly to Linux, although I did find one browser in the Trisquel apps that deals with the captcha anti-robot feature satisfactorily. George Langford
Re: [Trisquel-users] Permissions on a new USB flash drive
boba inquired about my experience with the full-up 256GB thumb drive, quoting what I wrote: "I have another 256GB thumb drive which was never reluctant to accept new files, but I don't remember what I did right with that one that I'm not doing today. It's full, too, however." "Could you check which file system its partition is using? If it is using fat, then the explanation is complete." GParted tells me the partition table is msdos (i.e., the same as the drive vendor used), formatted to ext4. Permissions: not determined; I could read & write right off. I'll apply Magic Banana's chown solution: sudo chown -R $USER:$george IPv4 Afterwards, its little red light blinked for a couple of minutes while it was doing its recursive thing. Properties: I've got the 755 permissions now; 77,474 items, totalling 226.1 GB. Thanks for looking into this. Your theory is correct, it seems. Linux is more secure ! George Langford
Re: [Trisquel-users] Permissions on a new USB flash drive
Magic Banana comes to the rescue yet again: "With such permissions, only the owner (root, apparently) can write. You can change the owner (and the group) with 'chown':" $ sudo chown -R $USER:$USER Thumb256A Which I successfully applied as follows: $ sudo chown -R $USER:$george Thumb256A I was previously unaware of that usage of chown; even Google didn't help. George Langford
Re: [Trisquel-users] Permissions on a new USB flash drive
boba provided some historical experience of success ... Actually, GParted performed the necessary operations to remove the original msdos/ext4 and replace it with gpt/ext4 in about five minutes; then I used sudo chmod -R 755 Thumb256B as before, but its permissions still stay the same. So there's another issue involved; except that I _really_ need to offload data. George Langford
[Trisquel-users] Permissions on a new USB flash drive
After years of accumulating scan results, my HDD is filled to within 2GB of its 1TB capacity, so operations of most kinds are excruciatingly slow. It really does not like editing 2GB text files in Leafpad. I acquired a new SanDisk 256GB drive, removed its partition table, and formatted the new partition ext4 with GParted. Using sudo, I created folders and even changed permissions using sudo chmod -R 755 Thumb256A ... but they don't change: still root, through & through. The thumb drive mounts itself when I place it in the USB port, but the permissions impasse won't let me cut & paste into it. I have another 256GB thumb drive which was never reluctant to accept new files, but I don't remember what I did right with that one that I'm not doing today. It's full, too, however. The 'puter is a Lenovo T420 ThinkPad running Flidas with 8GB of RAM and 18GB of swap space. I'm not used to devices that are more difficult to use with Trisquel than was the task of getting them out of the maker's packaging. George Langford
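Background on why chmod alone cannot help here: a freshly formatted ext4 filesystem's top directory belongs to root, and chmod changes only the permission bits, never the owner; transferring ownership is chown's job. A hedged sketch (the mount point below is illustrative, and the sudo line is left as a comment):

```shell
# On the real drive (illustrative path), the usual fix would be:
#   sudo chown -R "$USER:$USER" /media/$USER/Thumb256A

# Demonstration on a scratch directory: chmod changes the mode bits,
# but the owner stays exactly who it was.
mkdir -p scratchdir
chmod 755 scratchdir
stat -c '%a %U' scratchdir   # "755 <owner>" -- mode changed, owner unchanged
rmdir scratchdir
```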
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
Magic Banana grumbled: "Still no expected output..." Maybe our postings passed each other by like ships in the night ... I really did add several example files attached to their relevant postings or referenced to a preceding posting. Maybe a list of what appears to be missing would help. I was careful to test all the scripts with their source files. "If the so-called "painfully obvious IPv4 addresses" are those that my last sed's substitution extract, ..." Not so. Too much is being read into my hyperbolic remark. Those pesky IPv4 addresses are listed, thoroughly intermixed, with the looked-up PTR data. They're difficult to unmix, because sorting doesn't discriminate between numerical parts of PTR's and real IP addresses. IPv6 addresses are easy, because they can be found with grep by searching on the colon ':' separator. Repeating: those IPv4 and IPv6 addresses that weren't looked up by the website's apache server had no DNS service, and so are in classes by themselves. The addresses that Magic Banana's excellent scripts extract can often be tested by applying reverse DNS, but I'm finding that some IPv6 addresses that came to my attention by means of nMap scans can be rather refractory, sometimes because their server has been powered down or because new PTR records have been applied to them. I even found one set of addresses that came back online between a Sunday nMap scan and a Tuesday re-scan. Magic Banana's fourth script in his June 23 (17:00) posting works convincingly: sed 's/[^0-9]*\([0-9]\{1,3\}\)[^0-9]\{1,5\}\([0-9]\{1,3\}\)[^0-9]\{1,5\}\([0-9]\{1,3\}\)[^0-9]\{1,5\}\([0-9]\{1,3\}\).*/\1.\2.\3.\4/' IPv4-SourceList.txt | paste IPv4-SourceList.txt - > FourthScriptOutput.txt Bear in mind that various obfuscation schemes may have been applied to these PTR's, as Magic Banana addressed some time ago with more sophisticated scripts, so the IP addresses have to be checked by reverse DNS against their PTR records as recorded in the Recent Visitor data.
I'll check these with: awk '{print $2}' 'FourthScriptOutput.txt' > Temp0623E.txt ; dig -x -f Temp0623E.txt | grep -A 1 "ANSWER SECTION:" '-' | awk '{print $5}' '-' | more The output of this ad hoc script is simply 92.242.140.21, which is the catch-all for unresolvable addresses. That's the same as the apache server reported for those IPv4 addresses in May 2020. Their CIDR/24 scans may be more revealing. That said, my check does not discover any IPv4 addresses that inadvertently came from any of the PTR's. Incidentally, the IPv6-List.txt file gives the identical result, which is as expected, as there are no PTR's with colon ':' separators. In the meantime, collecting multi-addressed PTR's lurking in CIDR blocks has been proving very fruitful, especially when applied to the CIDR/32 portions (i.e., the left-most two hextets of the not-looked-up IPv6 addresses in the Recent Visitor data, as well as the right-most octets of the not-looked-up IPv4 addresses in the same Recent Visitor data). There remain a great many simple PTR records with no embedded IP data at all, or with inscrutable obfuscations that can only be resolved by the nMap searching scripts which are filling up my storage media. Thank you for your continuing constructive analyses. George Langford
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
After a good night's sleep, and while still waiting for some very slow nMap scans to finish, I found a roundabout way of separating those pesky IPv4 addresses from the source list without accidentally dismembering any PTR's that deserve to be treated differently later. Start here with the fourth script from my June 22 posting, applied globally, but with a larger original source file, one-tenth of the Recent Visitors collection: grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' SourceList.0pt1.txt > IPv4-ListB.txt Continue with my two-minute drill, adding a modified fourth script from June 22: awk '{print $1}' 'SourceList.0pt1.txt' | sed 's/\./\t/g' '-' | awk '{print $1"."$2"."$3"."$4"$"$5}' '-' > IPv4-ListC.txt ; grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\$' IPv4-ListC.txt | tr -d $ | sort -nrk 1 > IPv4-Only-List.txt Use comm to match IPv4-Only-List.txt with SourceList.0pt1.txt: sort -k 1 SourceList.0pt1.txt > Temp0623B.txt ; sort -k 1 IPv4-Only-List.txt > Temp0623C.txt ; comm -12 Temp0623B.txt Temp0623C.txt > Clean.IPv4-List.txt ; rm Temp0623B.txt Temp0623C.txt The IPv4 addresses found in Clean.IPv4-List.txt look OK to me; however, there should be an effort to come up with a verification script. Try grep: grep -hf Clean.IPv4-List.txt SourceList.0pt1.txt | more This script starts out OK, with a nice list of IPv4 addresses, but soon turns into a memory hog that has to be stopped. George Langford
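When the patterns are literal strings rather than regular expressions, adding -F tells grep to treat them as fixed strings; with thousands of dotted patterns this typically cuts both memory use and run time dramatically, and may tame the memory hog (a general grep property worth trying on the real files, not a tested fix for this exact case):

```shell
# grep -F treats each pattern as literal text (a dot is a dot, not "any
# character"), which is both stricter and far cheaper with large -f files.
printf '1.2.3.4\n1a2b3c4\n' > target.txt
printf '1.2.3.4\n'          > patterns.txt

grep -f  patterns.txt target.txt   # regex: matches both lines ('.' is a wildcard)
grep -Ff patterns.txt target.txt   # fixed strings: matches only 1.2.3.4
rm target.txt patterns.txt
```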
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
Magic Banana lamented, "You attached an input but still no expected output." ... and then jumped to the conclusion, "Assuming you again want to extract the IPv4 addresses..." While that's the long-term (end of next week ...) goal, the immediate concern is to find a reliable script to separate the painfully obvious IPv4 addresses, not from the bodies of the PTR's, but from the list of them. My subterfuge of switching from the dot '.' separator to the $ separator for the fifth column at least cleans up the visual aspect of the sorting problem. The output of that script flags the fourth octets of the proper IPv4 addresses with a trailing dollar sign. I assumed that the 2nd script would provide its own answer, as it's laden with 53 unintended IPv4 consequences. The 1st script's output is 100% IPv6 addresses, with no stowaways. The NoIPv6-List.txt file has PTR's with glaringly obvious IPv6 origins, but no actual IPv6 addresses that you could resolve with dig -x, plus one IPv4 leftover. Those IPv4 and IPv6 addresses that I'm trying to cut out of the herd are special, because they weren't looked up by that apache hostname-lookup option (which, to apache's credit, is deprecated in their instructions), because the associated server was unavailable or misconfigured. Now I want to see what else is going on at those servers, a few weeks or months later. Once the heifers have been separated from the bulls, the ensuing tasks are a little easier. Analyzing the contents of the near-infinite address spaces of IPv6 CIDR blocks is best addressed by Magic Banana's random selection of IPv6 addresses to be searched with nMap scripts, whereas the very cramped space of IPv4 CIDR blocks can be addressed by inquiring with more direct scripts.
Finding those multi-addressed hostnames in the outputs of scripts that provide answers on CPU time scales is a huge step forward compared to nosing around, one hostname at a time, for the finite data gathered by a few Internet watchdogs and agglomerated in the near-infinite data hoarded by Google. The geek is definitely on a better track than the uneducated plodder. George Langford
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
Magic Banana suggests that I cite a few ferinstances. I've changed a couple of the filenames to separate them in my tenuous logic.

First script:

grep ":" IPv6-SourceList.txt | sort | uniq -c | awk '{print $2}' '-' > IPv6-List.txt

Second script:

grep -v ":" IPv6-SourceList.txt | sort | uniq -c | awk '{print $2}' '-' > NoIPv6-List.txt
#note: IPv6-SourceList.txt originally had only one IPv4: 2.63.83.182

Fourth script:

grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' NoIPv6-List.txt > IPv4-List.txt

Ref: https://superuser.com/questions/202818/what-regular-expression-can-i-use-to-match-an-ip-address
#note: Now there are quite a few (53) additional IPv4's that have crept in because of the weakness of the script.

I also tried a "two-minute drill" by splitting all the addresses on the dots "." and reassembling the first four octets with a $ as the separator between the fourth octet and the hostname remnant in the fifth column:

awk '{print $1}' 'IPv4-SourceList.txt' | sed 's/\./\t/g' '-' | awk '{print $1"."$2"."$3"."$4"$"$5}' '-' > IPv4-List.txt

At this writing I'm stumped by the task of sorting the $-separated file to capture just the rows containing proper IPv4 addresses. This script at least shouldn't snatch any IPv4 prefixes from the dot-separated PTR's. Note: IPv6-SourceList.txt and IPv4-SourceList.txt were each extracted from the same original multi-megabyte source file. George Langford
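On the sorting problem itself: once candidate rows are isolated, sort can order dotted quads numerically, octet by octet, if the dot is declared as the field separator. A small sketch (the sample addresses are invented), showing why a plain lexicographic sort is not enough:

```shell
# Numeric sort on each of the four octets; a plain sort would put
# 10.10.1.1 before 10.2.1.1, because "1" < "2" as characters.
printf '10.2.1.1\n9.1.1.1\n10.10.1.1\n' |
sort -t . -k1,1n -k2,2n -k3,3n -k4,4n
```

The output comes back as 9.1.1.1, 10.2.1.1, 10.10.1.1, which is the numeric order one wants for CIDR work.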
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
In preparation for applying Horrible sed's in their various forms, I set about the task of extracting the IPv6 and IPv4 addresses from my list of gratuitously looked-up hostnames gleaned from nearly two hundred sets of publicly available recent-visitor data. It's a long list, way too big for my LibreOffice Calc crutches.

Separate the IPv6's from the GLU hostnames from which the IPv4's have just been removed:

grep ":" SourceFile.txt | sort | uniq -c | awk '{print $2}' '-' > IPv6-List.txt

It's difficult to use comm; I used the invert-match option in grep instead:

grep -v ":" SourceFile.txt | sort | uniq -c | awk '{print $2}' '-' > NoIPv6-List.txt

The NoIPv6-List.txt still has a lot of not-looked-up IPv4 addresses. Ref.: https://superuser.com/questions/202818/what-regular-expression-can-i-use-to-match-an-ip-address

grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /etc/hosts

Applied here:

grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' NoIPv6-List.txt > IPv4-List.txt

Alas, this IPv4-List file includes many addresses extracted from the hostnames where '.' is used as the separator. I'll be doing that with a sed script later, but now I want the IPv4's that were not gratuitously looked up (because they couldn't be ?). Reversing grep would erase those legitimate PTR's. Is there a sed way of doing this separation? George Langford
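One hedged alternative to a sed approach: grep's -x option restricts a pattern to whole lines, so rows that are nothing but an IPv4 address can be separated from PTR's that merely contain one. A toy sketch with invented sample lines standing in for NoIPv6-List.txt:

```shell
printf '1.2.3.4\n104.238.97.7.vultr.com\nhost.example.net\n' > NoIPv6-sample.txt
# Keep only lines that are exactly a dotted quad (the not-looked-up IPv4's) ...
grep -Ex '([0-9]{1,3}\.){3}[0-9]{1,3}' NoIPv6-sample.txt
# ... and the complement, the legitimate PTR's, untouched, via -v
grep -Evx '([0-9]{1,3}\.){3}[0-9]{1,3}' NoIPv6-sample.txt
```

Because the match must span the entire line, an IPv4-looking prefix inside a dot-separated PTR is never snatched out.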
[Trisquel-users] What limits the rate of data transfer on a local area network ?
For months my ThinkPad T420 has been chugging along, running nMap-based scripts to gather Internet address data. "Not quickly enough," I thought. So I dragged out another ThinkPad T420, updated & upgraded it to the latest flidas spec's, and started another batch of scripts to fill in the blanks. Whereupon a revelation appeared: the System Monitors started displaying a limitation of which I was utterly unaware. When the data transfer rate on one 'puter goes up, the rate on the other 'puter goes down, as though they're playing by the rules of a game that's not under the control of either 'puter. Also, the data transfer rate averaged between those two 'puters is about half what it was with one 'puter running those scripts by itself. The two 'puters are obviously on the same LAN, with each one connected wirelessly to the router. They have no remote access arrangements of one to the other. The router is on its own FiOS LAN, judging from the various locations where its server appears to be according to geolocation of its IP address. I'd been adding scripts, each one running from its own terminal, until all the ups and downs in the data transfer rate averaged out to a smooth, nearly flat wiggly line ... indicating that more scripts could not extract any more data than the limiting rate, which was about 90 kB per second. With two 'puters, the total limiting rate is still about 90 kB per second, so I'm no worse off than before. Is there a way of nudging that limiting rate upwards ? I can access my router's admin page, but there's no place I can find that suggests any control of the bit rate... George Langford
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
Actually, a few came along a little sooner, along with their addresses:

2a00:1110:253:910b:5ca1:eb1e:c7f9:9ae1 2A0011100253910B5CA1EB1EC7F99AE1.mobile.pool.telekom.hu
2a00:1110:656:55f8:f243:e02e:c1dc:21d1 2A001110065655F8F243E02EC1DC21D1.mobile.pool.telekom.hu
2a00:1110:22a:ff9b:5fc4:7808:faa2:51a3 2A001110022AFF9B5FC47808FAA251A3.mobile.pool.telekom.hu

awk '{print $2}' 'SetA.txt' | sed 's/.*\([[:xdigit:]]\{4\}\)\([^[:xdigit:]]\{0,1\}\)\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\).*/\1:\3:\4:\5:\6:\7:\8:\9/g' > SetA-output.txt

The answers are:

2A00:1110:0253:910B:5CA1:EB1E:C7F9:9AE1
2A00:1110:0656:55F8:F243:E02E:C1DC:21D1
2A00:1110:022A:FF9B:5FC4:7808:FAA2:51A3

And dig -x is happy now. George Langford
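For PTR's like those telekom.hu ones, where the first label is exactly 32 hexadecimal digits, a shorter pipeline may do the same job as the long sed: pull out the 32-digit run, then insert a colon after every group of four characters. A sketch, assuming that fixed label format:

```shell
printf '2A0011100253910B5CA1EB1EC7F99AE1.mobile.pool.telekom.hu\n' |
grep -oE '^[[:xdigit:]]{32}' |    # the 32-hex-digit leading label only
sed 's/..../&:/g; s/:$//'         # colon after each 4-digit group, drop the last
```

This yields 2A00:1110:0253:910B:5CA1:EB1E:C7F9:9AE1, ready for dig -x. It only works when the label really is 32 hex digits with no separator, so the sed version remains the general tool.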
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
Magic Banana's Horrible-Sed script has potential:

awk '{print $0}' 'MB-HorribleSed-Set01.txt' | sed 's/.*\([[:xdigit:]]\{4\}\)\([^[:xdigit:]]\{0,1\}\)\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\).*/\1:\3:\4:\5:\6:\7:\8:\9/g' > Set01-output.txt

Where MB-HorribleSed-Set01.txt is:

a7d74f79-4640-4158-b5c2-3097778fe363.fr-par-2.baremetal.scw.cloud
jobqueue-listener.jobqueue.netcraft.com-u840912b2930611eab47d156d838d6ab1u-digitalocean
jobqueue-listener.jobqueue.netcraft.com-u8af4d1e48e2711ea94c96760838d6ab1u-digitalocean-2gb
jobqueue-listener.jobqueue.netcraft.com-ubd544f468e2411ea94c96760838d6ab1u-digitalocean-2gb

The first hostname is a mess of hexadecimal characters, but the script deciphers the other three nicely. Alas, dig -x doesn't make any headway with the resulting pretty IPv6 addresses. Here they are:

8409:12b2:9306:11ea:b47d:156d:838d:6ab1
8af4:d1e4:8e27:11ea:94c9:6760:838d:6ab1
bd54:4f46:8e24:11ea:94c9:6760:838d:6ab1

Note that the first hextet is a bald-faced lie; IPv6 hasn't gotten there yet. There's further obfuscation at play. There's no 'gotcha' here. These hostnames were plucked from the wild; I may know in a week or so if my ongoing nMap scans come up with some similar PTR records ... George Langford
Re: [Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
Continuing this thread: Long pointer records like the following aren't easily resolvable by dig, but their embedded IPv6 addresses are plainly visible:

2001-1c05-0001-e6e5-99c1-3191-b0e7-aca8.cable.dynamic.v6.ziggo.nl
2806-1000-0001-ae95-7768-0cea-3f9e-7ba0.ipv6.infinitum.net.mx
2603-6000-0001-60a5-1f58-fd79-7344-53fc.res6.spectrum.com
2a02-8388-57ba-fb8e-f2f7-e78c-7468-2b2c.cable.dynamic.v6.surfer.at
2001-b011-0001-4a39-6db8-b711-8a27-3281.dynamic-ip6.hinet.net
dynamic-2a00-1028-7f41-b8de-5e49-2809-729d-0736.ipv6.broadband.iol.cz

These addresses can be recovered with an awk script and some obvious editing:

awk '{print $1}' 'LongNames.txt' | sed 's/\-/\:/g' > LongNamesIPv6.txt

Here's one from which the separators (:) have been removed:

2a01cb0c000110f46cbc7ba4c95fcb23.ipv6.abo.wanadoo.fr

That's a bit more challenging and can be solved by trial-and-error relocation of the separators until dig -x returns the original hostname:

dig -x 2a01:cb0c:0001:10f4:6cbc:7ba4:c95f:cb23 ==> IN PTR 2a01cb0c000110f46cbc7ba4c95fcb23.ipv6.abo.wanadoo.fr
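Selecting only the eight dash-separated hextets before translating would avoid the "obvious editing" of leftovers like the dynamic- prefix and the domain tail. A sketch, assuming the embedded address always appears as eight 4-hex-digit groups joined by dashes:

```shell
printf 'dynamic-2a00-1028-7f41-b8de-5e49-2809-729d-0736.ipv6.broadband.iol.cz\n' |
grep -oE '([0-9a-f]{4}-){7}[0-9a-f]{4}' |   # exactly eight 4-hex-digit groups
sed 's/-/:/g'                               # dashes become colons
```

The grep requirement of seven dash-joined groups followed by an eighth means that "dynamic-" and the ".ipv6.broadband.iol.cz" tail never make it into the output.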
[Trisquel-users] Are there better hostname resolvers than dig or nslookup ?
Here are three hostnames that do not have embedded addresses:

dbbvxj3yl0rc53l1vf8ct-4.rev.dnainternet.fi
node-1w7jr9wnqcus27otj0ldvrzd4.ipv6.telus.net
ptr-gooru9pqyo5unun4j40.18120a2.ip6.access.telenet.be

Dig will return authoritative nameservers for the first and third of the above hostnames, but further requests either time out or return failure responses. Are there more sophisticated name resolvers available ? George Langford
Re: [Trisquel-users] strange UFW blocks on startup
Magic Banana said: "By default, Trisquel does not have a root user. As a consequence, it cannot log in, even locally." When MATE's System Monitor is running, under Processes, the following programs appear with the user set to root: ksoftirqd/0, ksoftirqd/2, or ksoftirqd/3, and occasionally Xorg, as the need arises. I've even seen "nobody" running dnsmasq. It would be a relief to know that this is normal operation. The rest of us get root privileges with sudo. George Langford
Re: [Trisquel-users] How to turn off touch-pad, when mouse is plugged in?
On my flidas OS Desktop, under System/Preferences/Hardware, there's a selection called "Mouse" where there's a menu with "General" at the top (highlighted) and "Touchpad." You'll want to make sure that the Touchpad is _not_ "enabled." George Langford
Re: [Trisquel-users] The best way to block site in Trisquel
Reliable IPv4 or IPv6 addresses generally come from the headers of emails, which are revealed by pressing the Ctrl and (u.c.) U keys simultaneously. The validity of an IP address can be confirmed by reverse DNS lookup (dig -x). Browsers reveal only hostnames, not IP addresses. Internet searches on hostnames can reveal a bewildering number of IP addresses collected by anti-spam websites. Dig and nslookup are no help at all for these multi-addressed PTR's. Internet Service Providers generally perform gratuitous hostname lookups on the IP addresses that are required for reassembling the incoming data packets. Sadly for the present state of the Internet, some servers are packed with large numbers of identical pointer records (PTR's) that obfuscate the particular IP addresses of those PTR's. There is also a population of open proxies that perform a similar address-obfuscation "service." Internet Service Providers ought to provide both the IP address and the PTR record of each data request to a user's domain. Until they do, the only block that will work is on the entire CIDR address space of the offending server. George Langford
Re: [Trisquel-users] Syntax problems with nested loops
About my playing with the end-of-line character (\n) ... Applying nMap's grepable-output function, as in this script:

sort TestGrepOutnMapNine.txt | sudo nmap -Pn -sn -T4 --max-retries 16 --script asn-query -iL '-' -oG - | grep "Host:" '-' > TestGrepOutnMapNineGrep.txt

eliminates the blank lines without further ado. It skips the --script asn-query data, but for reconnaissance purposes, that's OK. Getting back to the end-of-line characters: the output of a similar nMap scan without the -oG argument is attached. Incidentally, with all the worldwide home schooling and online university classes, these scans are currently (April-May 2020) intolerably slow, 100 bytes/sec instead of the former 70 kbytes/sec. Nevertheless, my T420's CPU is working very hard to process what little data it's getting. George Langford
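For the record, blank lines in normal (non -oG) nmap output can also be dropped directly, without switching output formats. A toy sketch on invented input resembling a scan report:

```shell
# sed deletes every line that is empty from start (^) to end ($)
printf 'Nmap scan report for 1.2.3.4\n\nHost is up.\n' | sed '/^$/d'
```

Only the two non-empty lines survive; grep -v '^$' would do the same job.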
Re: [Trisquel-users] Syntax problems with nested loops
Alas, sed won't let me play with the closing-parenthesis character, even when I'm using it as a handle to select the immediately following newline character and to choose where to position the substitute tab: [)n] to be replaced by [)t]. (Presumably this is because sed processes its input one line at a time, with the trailing newline already stripped, so a pattern never gets to see the \n.) I haven't worked out a suitable adaptation of Magic Banana's suggested syntax:

$ sed 's/\\[nt]\(\\[nt]\)*/\t/g'

Another tack would be to use tr -d '()' to eliminate the pesky parentheses and then attach the \n's to the preceding digit (0 through 9) of the IPv4 address. I was hoping to remove the newline characters that could be associated with adjacent strings on the same line and replace them with tabs next to those strings, and at last remove the \n's that appear as the sole characters on the intervening lines. The goal is to remove all the newline characters except those that follow the two types of string sequences:

"Origin AS: 701" where the last digit is 0 through 9, and
"See the result for 100.19.104.32" where the last digit is 0 through 9.

Success in that endeavour would give one line for each IPv4 address in the nmap results. The excess tabs are easier to control. George Langford
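Since sed cannot see the newlines, one workaround is to fold them all into tabs first with tr, squeeze the resulting runs, and then restore a newline only after the record-ending string. A sketch on invented input resembling the nmap fragments (the "Origin AS" marker is taken from the description above; the addresses are just examples):

```shell
printf 'See the result for 100.19.104.32\nOrigin AS: 701\n\nSee the result for 100.19.104.33\nOrigin AS: 702\n' |
tr '\n' '\t' |                            # every newline becomes a tab
sed 's/\t\t*/\t/g' |                      # squeeze runs of tabs (the blank lines)
sed 's/\(Origin AS: [0-9]*\)\t/\1\n/g'    # newline only after "Origin AS: N"
```

Each IPv4 address ends up on one line with its AS number, tab-separated, which is the one-line-per-address shape sought above.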
Re: [Trisquel-users] Syntax problems with nested loops
Magic Banana's magnificent script to fill out the listing of prefixes with random octets looks like this for a specific set of two-octet prefixes: TMP=$(mktemp -u) trap "rm $TMP 2>/dev/null" 0 mkfifo $TMP awk '{ for (i = 0; ++i
Re: [Trisquel-users] Syntax problems with nested loops
Following Jaret's contribution, and also restructuring my script according to this link: https://www.linuxquestions.org/questions/programming-9/nested-while-loops-for-bash-novice-4175439318/

#!/bin/bash
retnuoc=1
counter=1
for (( retnuoc ; retnuoc < 3 ; retnuoc++ )); do
for (( counter ; counter < 41 ; counter++ )); do
# xiferp=(`expr tail -n+$retnuoc SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1`)
prefix=185.45; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/ >> Temp41922A.txt
done
done

This script generates one list of 40 IPv4 addresses while the expression "xiferp=(`expr tail..." is commented out. The first step in any solution must be to find out why bash doesn't like the syntax of:

xiferp=(`expr tail -n+$retnuoc SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1`)

which also works OK if $retnuoc is replaced by an integer like 1 or 2 (as it's limited in the actual script). The second step will be to discover how to pass the prefix extracted from the file's list of 26 prefixes to:

prefix=xiferp; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/

which computes OK if xiferp is replaced by a two-octet stand-in value such as 185.45. George Langford
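The syntax bash dislikes is presumably the backquoted expr: expr only evaluates arithmetic and string expressions, it cannot run a tail | head pipeline, and the surrounding parentheses create an array besides. Plain $( ) command substitution should be all that's needed. A sketch with a two-line stand-in for the 26-prefix file:

```shell
# Stand-in for SS.IPv4-NLU-January2020-26Prefixes.txt (invented contents)
printf '185.45\n185.39\n' > prefixes-sample.txt
retnuoc=2
# Command substitution runs the pipeline and captures its output; no expr needed
xiferp=$(tail -n+$retnuoc prefixes-sample.txt | head -n1)
echo "$xiferp"    # the 2nd prefix: 185.39
```

The same $( ) value can then be assigned straight to prefix for the dd | od | tr | sed pipeline.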
[Trisquel-users] Syntax problems with nested loops
Here is a script that is intended to generate a series of IPv4 addresses in CIDR/16 address space:

counter=1
while [ $counter -le 4096 ]
do
prefix=185.180; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/ >> Temp42120Works.txt
((counter++))
done

The script produces a list of 4,096 IPv4 addresses, each of which starts with 185.180, in about ten seconds. It works OK in the MATE terminal, but not when I invoke it with /bin/sh, which is not bash: it doesn't like ((counter++)) and fails to stop generating IPv4 addresses as a result. The following code is an effort to expand the scope of the task to encompass an entire list of differently prefixed addresses:

retnuoc=0
while [ "retnuoc" -le 26 ]
do
NUM=$(`expr $retnuoc + 1`)
counter=1
while [ "$counter" -le 4 ]
prefix=`expr tail -n+$NUM SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1`; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/ >> Temp41921F13.txt
counter=$(`expr $counter + 1`)
done
retnuoc=$(`expr $retnuocm+ 1`)
done

The first two octets are contained in the file SS.IPv4-NLU-January2020-26Prefixes.txt (attached) and extracted with the expression

tail -n+NUM SS.IPv4-NLU-January2020-26Prefixes.txt | head -n1

which works OK when NUM is replaced by a number between 1 and 26 (in the present example). The second pair of octets is randomly generated with this expression, wherein $NUM is replaced by an exemplar octet pair:

prefix=185.39; dd if=/dev/urandom bs=2 count=1 2>/dev/null | od -An -tu1 | tr -s ' ' . | sed s/^/$prefix/

Alas, progress has stalled now that the script no longer produces complaints of syntax errors. If it were to work, 104 IPv4 addresses would be forthcoming, four for each of the twenty-six octet pairs. George Langford

[Attached prefix list:]
99.79 99.192 98.25 98.187 98.162 97.107 96.9 96.44 96.30 96.2 96.114 95.90 95.86 95.85 95.84 95.80 95.73 95.72 95.71 95.70 95.67 95.66 95.64 95.58 95.47 95.46
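For what it's worth, a hedged sketch of what the nested loop might look like once the obvious traps are removed: $ before variable names inside [ ], $(( )) arithmetic instead of backquoted expr, a do after each while, and plain command substitution for the tail | head. The prefix file is a two-line invented stand-in and the counts are shrunk to keep the sketch fast:

```shell
#!/bin/sh
# Invented stand-in for SS.IPv4-NLU-January2020-26Prefixes.txt
printf '185.39\n185.45\n' > prefixes-sample.txt
retnuoc=1
while [ "$retnuoc" -le 2 ]; do                       # one pass per prefix line
  prefix=$(tail -n+$retnuoc prefixes-sample.txt | head -n1)
  counter=1
  while [ "$counter" -le 4 ]; do                     # four addresses per prefix
    # two random bytes become the third and fourth octets
    dd if=/dev/urandom bs=2 count=1 2>/dev/null |
      od -An -tu1 | tr -s ' ' . | sed "s/^/$prefix/"
    counter=$((counter + 1))
  done
  retnuoc=$((retnuoc + 1))
done > Temp-sample.txt
```

This POSIX-sh form avoids the ((counter++)) construct that /bin/sh rejects, and it emits eight addresses here (2 prefixes x 4), scaling to 104 with the real 26-line file.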
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
Some more progress: What with all the gracious and informed help of Magic Banana, the extent of multi-addressed PTR records is becoming ever clearer. I collected all the PTR records from the never-looked-up IPv6 addresses which I found for January 2020 that had two or more addresses resolving to the same PTR. Then I applied these randomization techniques to the upper-level fields of those IPv6 addresses, truncated thusly: field01:field02::/32, field01:field02:field03::/48, field01:field02:field03:field04::/64, or sometimes field01:field02::/112 or even field01:field02::field07:/112, and then adjusted the parameters of the randomization scripts to process a number of similar scripts as a batch, so as to limit the number of IPv6 addresses to be resolved to about 10 million and the address file to less than 150MB. The end result: all of the multi-addressed PTR records thus evaluated had more than a thousand additional addresses in the same CIDR blocks as the ones recorded for their IPv6 addresses, with some extending to a million or more addresses, all for the same-named PTR record. The next step in this procedure is to bring together the many additional singly addressed PTR records in the published recent-visitor data so as to find out which among them have been similarly obfuscated. One CIDR block which appeared to be essentially _all_ one PTR name last week was shut off and unavailable over the weekend, back in service on Monday, and further populated on Wednesday, indicating that it is being used dynamically to obfuscate sensitive payloads stored behind that unresolvably recorded PTR. George Langford P.S. You can do this analysis at home without setting foot outdoors. There are other months from which to choose, extending back into ancient history, before the 2016 U.S. election, or even up to the present, i.e., through March 2020.
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
Magic Banana continued our previous discussion:
>> which seems rather realistic
> I thought you were saying groups of four hexadecimal digits in real-world IPv6 addresses more often start
> with 0 than not. I extrapolated that interpretation: "yet another distribution obeying Benford's law", I
> thought: https://en.wikipedia.org/wiki/Benford%27s_law

Quite a few very tall buildings have heights (in feet) which start with the numeral 1. Far fewer 2's. Buildings with heights starting in 3's have physical limitations. On the other hand, IPv6 addresses very often start with 2's, rarely 3's ... the address space starts with zero within the governing base, but that was arbitrary and had (to my knowledge) no physical limitation such as that which occurs within the radio spectrum. An address range such as 2a02:2788::/32 starts with zero: therefore, 2a02:2788::0001 is the second address. 2a02:2789::/32 starts the same way, at 2a02:2789::0001, and may belong to another party. 2a02:2788::/31 has two cycles that look exactly like 2a02:2788::/32 and 2a02:2789::/32, even though they are in consecutive positions along the unbounded IPv6 address line that started at zero only once. There is no overlap and no sale of any one address on the infinite IPv6 line more than once. The zeros in the hexadecimal version of the decimal 79,228,162,514,264,337,593,543,950,336 are fewer in number but are all in the same places in the six fields of four hexadecimal digits in 2a02:2788::/32 as they are in 2a02:2789::/32. Think of the successive rings on a dart board, the one labelled :000h being the smallest and the one labelled :0hhh the largest. Our randomized looks at the ::/32 dart board ought to have similar relative numbers of :, :000h, :00hh, and :0hhh every time.
Magic Banana also said:
> Writing the prefixes with simple commands (rather than sed), may save some CPU cycles:
$ prefix=2a02:aa1; sample_size=1048576; od -A n -N $(expr 12 \* $sample_size) -xw12 /dev/urandom | tr ' ' : | paste -d '' SS.IPv6-NLU-January2020-mobile.tre.se.txt

Applied to another prefix, it finishes in three seconds; the grep's come out :0 ==> 336,650, :00 ==> 24,393, :000 ==> 1,537 & : ==> 98. Those steps come out at roughly 15 to one. The advantage of this script is that I can scale it readily. The nmap script remains the rate-limiting step in this exercise. I'm gathering prefixes for a marathon nmap session, but, with the randomized method of condensing, the impossible-to-scan verbatim prefix=2a02:aa1::/32 can be visualized in an hour. My laptop was formerly enduring CIDR/12's that took on the order of a thousand hours to return a result. Internet searches on PTR addresses gleaned from email databases remain a tedious roadblock to the evaluation of the gratuitously resolved addresses in the other two-thirds of published recent-visitor data.

Magic Banana, grading my homework, said:
[quoting] awk 'NR >= 1 { print $5, $6 }' | awk 'NR >= 1 { print $2, $1 }'
> Read again my previous post.

No need to; that expedient script helped me navigate my logic, but it could also have been written more efficiently as awk '{print $6,$5}' if I had proofread it before publishing.
I corrected it and am randomizing the 2a02:2788::/32 block, which has come back to life after being closed over the weekend:

prefix=2a02:2788; sample_size=1048576; od -A n -N $(expr 12 \* $sample_size) -xw12 /dev/urandom | tr ' ' : | paste -d '' SS.IPv6-NLU-January2020-host.dynamic.voo.beB.txt

nmap -6 -sn -T4 -sL -iL SS.IPv6-NLU-January2020-host.dynamic.voo.beS.txt | grep "Nmap scan report for " - | tr -d '()' | sort -k5 | awk '{ print $6, $5 }' | uniq -Df 1 | sed '/^\s\s*/d' | awk '{ print $2 "\t" $1 }' >> Multi-SS.IPv6-NLU-January2020-host.dynamic.voo.beS.txt

awk '{print $2,$1}' 'Multi-SS.IPv6-NLU-January2020-host.dynamic.voo.beS.txt' | sort -k 2 | uniq -cdf 1 | awk '{print $3"\t"$1}' '-' > Multi-SS.IPv6-NLU-January2020-host.dynamic.voo.beS.Tally.txt

The count of "host.dynamic.voo.be" came to 983,391 out of a possible 1,048,576 (93.8%). George Langford
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
Magic Banana asserted: " ... I believe groups of four hexadecimal digits chosen by local network administrators ..." Waitaminnit. The IPv6 addresses start at 1 and then climb one-by-one to an astronomical number like the 79,228,162,514,264,337,593,543,950,336 addresses in 2a02:2788::/32, which are boiled down to the more workable hexadecimal notation 2a02:2788::. Zeroes are just part of the originating, consecutively realized decimal number, but they come along not every 10th count but every 16th, as in ...,7,8,9,a,b,c,d,e,f,10,11,12,13,14,15,16,17,18,19,1a,1b,1c,1d,1e,1f,20,... In a dynamic addressing scheme, the local network administrator might just be creating the appearance of that astronomical number. Just patch together an array of four-digit hexadecimal numbers with scripts like the ones with which we have been experimenting. That could be a large but not astronomical number of arbitrary addresses. One of your scripts generates these arbitrary addresses immeasurably quickly. Those addresses need only conform to the astronomical address space which he owns (rents ? ... bought at an ill-advertised public auction ? ... got for free ?). A script can be written that would call up a series of text-file addresses, create new addresses, populate each one with .html code or a script, upload the data to the new addresses on the appropriate server, delete the former addresses' code, store the new addresses on the local HDD, delete the replaced addresses, and move on without ever closing the cover of his laptop or running out of storage. Magic Banana's efficient script:

prefix=2a02.2788; od -A n -N 12582912 -xw12 /dev/urandom | tr ' ' : | sed s/^/$prefix/ > SS.IPv6-NLU-MB4520A.txt

produces 1,048,576 IPv6 addresses in three (a thousand one, a thousand two, a thousand three ...) seconds real time.
Grep does the :0, :00 & :000 counting for us:

grep -c :0 SS.IPv6-NLU-MB4520A.txt = 336,443
grep -c :00 SS.IPv6-NLU-MB4520A.txt = 24,358
grep -c :000 SS.IPv6-NLU-MB4520A.txt = 1,500
grep -c : SS.IPv6-NLU-MB4520A.txt = 93

which seems rather realistic, despite the original script's neglecting actually to count from the beginning. Evaluating our progress ...

nmap -6 -sn -T4 -sL -iL SS.IPv6-NLU-MB4520A.txt | grep "Nmap scan report for " - | tr -d '()' | sort -k5 | awk 'NR >= 1 { print $5, $6 }' | awk 'NR >= 1 { print $2, $1 }' | uniq -Df 1 | sed '/^\s\s*/d' | awk '{ print $2 "\t" $1 }' >> Multi-SS.IPv6-NLU-MB4520A.txt

Alas, 2a02.2788::/32 is gone; WhoIs returns "no one found at this address" ... "Failed to resolve ..." is all that nmap gets. Try mobile.tre.se's 2a02:aa1::/32 instead:

prefix=2a02:aa1; od -A n -N 12582912 -xw12 /dev/urandom | tr ' ' : | sed s/^/$prefix/ > SS.IPv6-NLU-January2020-mobile.tre.se.txt

Forgot to count ... I thought something was wrong ... but there it was: 41MB with 1,048,576 addresses.

nmap -6 -sn -T4 -sL -iL SS.IPv6-NLU-January2020-mobile.tre.se.txt | grep "Nmap scan report for " - | tr -d '()' | sort -k5 | awk 'NR >= 1 { print $5, $6 }' | awk 'NR >= 1 { print $2, $1 }' | uniq -Df 1 | sed '/^\s\s*/d' | awk '{ print $2 "\t" $1 }' >> Multi-SS.IPv6-NLU-January2020-mobile.tre.se.txt

The network initially ran ten times as fast as for the first nmap script ... it took 63 minutes (55.2MB) ==> 1,048,534 mobile.tre.se's.
Enumerating:

awk '{print $2,$1}' 'Multi-SS.IPv6-NLU-January2020-mobile.tre.se.txt' | sort -k 2 | uniq -cdf 1 | awk '{print $3"\t"$1}' '-' > Multi-SS.IPv6-NLU-January2020-mobile.tre.se.Tally.txt

Result: mobile.tre.se 1048534

Apply grep again:

grep -c :0 SS.IPv6-NLU-January2020-mobile.tre.se.txt = 336,811
grep -c :00 SS.IPv6-NLU-January2020-mobile.tre.se.txt = 24,452
grep -c :000 SS.IPv6-NLU-January2020-mobile.tre.se.txt = 1,553
grep -c : SS.IPv6-NLU-January2020-mobile.tre.se.txt = 84

A million makes a pretty good sample size, and grep's actual counts step down by roughly a factor of 16 at each stage, as expected. Those randomly generated four-character fields appear not to exclude the normal numbers of zeros as I first thought. In graduate school, I was at first trying to attain my Sc.D. in Materials Engineering, but my advisor wanted me to take one more course: Statistical Mechanics, which does thermodynamics from the beginning. I contacted the department's registration officer, and he said that I had spent enough time in graduate school (six years) and had sufficient course credits for a degree in Metallurgy. That's how I escaped mathematical statistics. I got through my metallurgical consulting career with plain old experimentally based thermodynamics, which is what the elements do when left to their own devices. Never missed those statistical calculations until now. George Langford
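Those grep -c figures can be sanity-checked with a little arithmetic: grep -c counts matching lines, each line carries six random four-hex-digit fields, and a random field starts with 0 one time in 16, so the fraction of lines containing at least one ":0" should be about 1 - (15/16)^6. A quick check with plain awk:

```shell
# Expected fraction of lines with at least one field beginning ":0"
awk 'BEGIN { printf "%.3f\n", 1 - (15/16)^6 }'
# Observed fraction in the mobile.tre.se sample: 336,811 of 1,048,576 lines
awk 'BEGIN { printf "%.3f\n", 336811 / 1048576 }'
```

Both print 0.321, so the ":0" count above is exactly what a uniform random generator should produce.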
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
Magic Banana suggested a useful script to provide IPv6 addresses with :000x, :00xx, and :0xxx fields:

$ prefix=0123:4567; sample_size=10; od -A n -N $(expr $sample_size \* 48) -dw8 /dev/urandom | awk -Mv prefix=$prefix -v PREC=64 'NR % 6 == 1 { printf prefix } { n = 0; for (p = 0; p != 4; ++p) n += $(p + 1) * 65536^p; n *= 6.3250068069543573221e-19; cdf = 1; for (i = 2; n > cdf; ++i) cdf += 1 / i; printf ":%04x", i - 2 } NR % 6 == 0 { print "" }'

As a test, I applied the Magic Banana script to a specific CIDR block's prefix:

prefix=2a02:2788 ; sample_size=4096; od -A n -N 196608 -dw8 /dev/urandom | awk -Mv prefix=$prefix -v PREC=64 'NR % 6 == 1 { printf prefix } { n = 0; for (p = 0; p != 4; ++p) n += $(p + 1) * 65536^p; n *= 6.3250068069543573221e-19; cdf = 1; for (i = 2; n > cdf; ++i) cdf += 1 / i; printf ":%04x", i - 2 } NR % 6 == 0 { print "" }' > IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.txt

That script generates a 164KB file with 4096 entries in about five minutes real time. Let's count the :0xxx, :00xx and :000x occurrences. See: https://www.tecmint.com/count-word-occurrences-in-linux-text-file/ where it's said:

grep -o -i mauris example.txt | wc -l

grep -c -o -i :0 IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.txt ==> 4095
grep -c -o -i :00 IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.txt ==> 4053
grep -c -o -i :000 IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.txt ==> 3599

(When -c is given, grep counts matching lines and the -o has no effect.) Extending Magic Banana's reasoning about the relative frequency of occurrences of :0001, :0002 and :0003, the relative frequencies of the occurrences of :0xxx, :00xx, and :000x in a 4096-row list of IPv6 addresses ought to be 256/4096, 16/4096, and 1/4096, respectively. In a 65,536-address list, prefix::0/128 may happen just once.
Then I used nmap to evaluate those addresses:

nmap -6 -sn -T4 -sL -iL IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.txt | grep "Nmap scan report for " - | tr -d '()' | sort -k5 | awk 'NR >= 1 { print $5, $6 }' | awk 'NR >= 1 { print $2, $1 }' | uniq -Df 1 | sed '/^\s\s*/d' | awk '{ print $2 "\t" $1 }' >> Multi-IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.txt

This script resolves 4064 of the 4096 addresses as host.dynamic.voo.be in fifteen seconds real time. Enumerating the output file from the nmap script:

awk '{print $2,$1}' 'Multi-IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.txt' | sort -k 2 | uniq -cdf 1 | awk '{print $3"\t"$1}' '-' > Multi-IPv6-SS.IPv6-NLU-2a02.2788.MB4420-4096.Tally.txt

The output file reads: "host.dynamic.voo.be 4064". That's because the first 32 of the 4096 addresses return NXDOMAIN. CIDR blocks with less intensely multi-addressed PTR's will reveal lists of all the different multi-addressed PTR's with these scripts. However, the more addresses that are included in the randomized search, the more (and different !) multi-addressed PTR's will be found. It would appear that one needs to concatenate the variously randomized lists of addresses, eliminate duplicates, and then apply the last pair of scripts to achieve a relatively accurate evaluation of the target CIDR block. Could it be that the 79,228,162,514,264,337,593,543,950,336 addresses in 2a02:2788::/32 are dynamically generated on demand ? George Langford
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
Magic Banana suggested: > ... you should abandon simple commands and find an implementation in your favorite programming language. < Alas, over a sixty-year career I have had to acquire a rudimentary knowledge of MAD (Michigan Algorithm Decoder), Fortran (with decks of cards punched on an 029 console), and Basic & True Basic (a Kemeny product, my favorite). I reached a brick wall with True Basic when I could not get any form of scrolling to work on the PC-AT that I was using at the time (ca. 1998). The PC-AT had none of the graphics capabilities of an Apple product. There is a slim chance I can retrieve my original media for True Basic (intended for MS-DOS) and install that with the aid of Wine. Trisquel flidas has Basic-256, an educational form of Basic, so I'll start with that. I'm more than fifty years rusty in any form of programming. What currently stumps me is that I'd like to post-process the output of your all-in-one-step script by using its output file of IPv6 addresses (a list cut down to just 256 lines by substituting "-N 3072" for "-N 786432"), but I cannot get past that echo command with something like "cat IPv6-file.txt ...". That approach appears to put two- and three-digit hexadecimals into the fields. George Langford
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
After doing the homework suggested by Magic Banana, I tried out his well thought out one-line solution, substituting the prefix for a representative multi-addressed PTR record (2a02:120b):

$ prefix=2a02:120b ; od -A n -N 786432 -xw12 /dev/urandom | tr ' ' : | sed s/^/$prefix/

The time to run this script, which produces 65,536 complete IPv6 addresses ready for analysis, is 0.000 second CPU time & 0.000 second real time, even when run four times at once. It does not produce any abbreviated fields like "x", "xx" or "xxx", which do occur in the usual written notation of real IPv6 addresses; od always prints all four digits of each field. "-A n" suppresses the file-offset column that od would otherwise prepend to every line; "-N 786432" is the number of bytes to read; and "-xw12" means six four-character hexadecimal fields per line. Together, these arguments gather sufficient bytes of random data from /dev/urandom to fill the 3rd through 8th fields of 65,536 IPv6 addresses.

Putting it all together with our nmap script and an enumeration script:

prefix=2a02:120b ; od -A n -N 786432 -xw12 /dev/urandom | tr ' ' : | sed s/^/$prefix/ > TempMB4320A.txt ; nmap -6 -sn -T4 -sL -iL TempMB4320A.txt | grep "Nmap scan report for " - | tr -d '()' | sort -k5 | awk 'NR >= 1 { print $5, $6 }' | awk 'NR >= 1 { print $2, $1 }' | uniq -Df 1 | sed '/^\s\s*/d' | awk '{ print $2 "\t" $1 }' > TempMB4320B.txt ; awk '{print $2,$1}' 'TempMB4320B.txt' | sort -k 2 | uniq -cdf 1 | awk '{print $3"\t"$1}' '-' > Multi-IPv6-MB-dynamic.wline.6rd.res.cust.swisscom.ch-Tally.txt ; rm TempMB4320A.txt TempMB4320B.txt

Final output file:

dynamic.wline.6rd.res.cust.swisscom.ch 65536

Giving, in about six minutes, a similar answer to the one that my other scripts took 58 hours to produce when run on 508 IPv6 addresses: All the IPv6 addresses have the same dig -x result, even when randomly generated rather than hunted down with Internet searches.
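[Editor's sketch] A quick way to sanity-check the od recipe is to ask for just 24 bytes, i.e. two addresses, and inspect the format. A hedged sketch assuming GNU od; the -v flag is my addition:

```shell
# 24 bytes = two lines of six 4-digit hex groups. -A n suppresses od's
# offset column; -v keeps od from collapsing repeated lines into "*"; the
# leading space od prints becomes the colon joining prefix to first group.
prefix=2a02:120b
od -v -A n -N 24 -x -w12 /dev/urandom | tr ' ' : | sed "s/^/$prefix/"
```

Each output line should match the shape 2a02:120b followed by exactly six colon-separated four-digit hex groups, i.e. a full eight-group IPv6 address.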
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
Some small amount of progress to report: Reference: https://subscription.packtpub.com/book/virtualization_and_cloud/9781788990554/6/ch06lvl1sec71/nested-loops Suitably combined with my simplified Magic Banana script: for (( sample = 1; sample
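[Editor's sketch] The truncated loop above probably continued along these lines. A hedged guess at the intended nesting; the bounds (3 samples, fields 4 through 7) are illustrative assumptions, kept small, and seq is used in place of the (( )) form for portability:

```shell
# Regenerate four of the address's colon-groups once per sample.
# Caveat: sed numbers only four-digit matches, so the abbreviated "0"
# group is skipped in the count and "field 7" is the eighth colon-group.
addr=2a02:2788:1000:0:6037:fc9a:27ac:10c7
for sample in $(seq 1 3); do
  out=$addr
  for field in $(seq 4 7); do
    out=$(echo "$out" | sed "s/[0-9a-f]\{4\}/$(openssl rand -hex 2)/$field")
  done
  echo "$out"
done
```

Each iteration of the outer loop emits one complete, randomized eight-group address; scaling the outer bound to 65,536 gives the sample sizes discussed in this thread.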
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
While investigating Magic Banana's elegant & correctly interpreted solution:

echo 2a02:2788:1000:0:6037:fc9a:27ac:10c7 | sed s/'[0-9a-f]\{4\}'/$(openssl rand -hex 2)/$(shuf -i 4-7 -n 1)

I discovered that this script occasionally places the newly generated four-digit hex number in the eighth field. Examples: 2a02:2788:1000:0:6037:fc9a:27ac:5467 and 2a02:2788:1000:0:6037:fc9a:27ac:5558 However, {shuf -i 4-7 -n 1} stays within its allotted boundaries, producing 4's, 5's, 6's & 7's randomly; never 8's. Also, {openssl rand -hex 2} consistently cranks out four-digit hex numbers. The explanation is that the lone "0" in the fourth field never matches [0-9a-f]\{4\}, so sed counts only the four-digit fields; its seventh match is therefore the eighth field of this address.

What I'm actually trying to do is to replace all of the last six fields of the initiating IPv6 address with new four-digit hex numbers, run the script 65,536 times, and then use a simplified version of my nmap script to capture the registered PTR for each new IPv6 address, a far quicker approach than my earlier method of capturing the PTR records of a hundred 64K groups of IPv6 addresses, which does the hostname lookups 6,553,600 times. Much more probative and a hundred times quicker. So far, I've tried replacing (shuf -i 4-7 -n 1) with $i in a six-step "do" loop and nesting that inside a 65,536-step outer "do" loop (as in Basic), but I'm wrestling with the syntax of that approach.

Back to the prospect that nslookup (and dig -x) are getting hijacked: Doesn't matter, as the server's gratuitous hostname lookups would be hijacked the same way. And none of them can be resolved to any usable numerical address. George Langford
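[Editor's sketch] The skipped-field behavior can be shown deterministically by using a fixed marker in place of the random group (XXXX is just an illustrative stand-in):

```shell
# Ask sed for the 7th match of four hex digits: the lone "0" group is not
# a match, so the "7th field" lands in the eighth colon-group (10c7).
echo 2a02:2788:1000:0:6037:fc9a:27ac:10c7 | sed s/'[0-9a-f]\{4\}'/XXXX/7
# → 2a02:2788:1000:0:6037:fc9a:27ac:XXXX
```

Counting only the four-digit groups (2a02, 2788, 1000, 6037, fc9a, 27ac, 10c7) shows why match number 7 is the final group of the address.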
Re: [Trisquel-users] Scripting the random replacement of fields in an IPv6 address
Magic Banana translated my wordy English: > Summing up (could you try to be brief and clear?), you want to replace, in an IPv6 address, > the n-th group of four hexadecimal digits (with n a random integer from 4 to 7) with a random one. Right? Restated: ... replacing the [randomly selected] n-th field [of eight, skipping the first three, which likely define the CIDR block of the IPv6 address] with another [randomly generated] group of four hexadecimal digits ...

Example: Original IPv6 address: 2a02:2788:1000:0:6037:fc9a:27ac:10c7 Select the field number with shuf -i 4-7 -n 1 ==> 5 Generate a new field with [suitably simplified] echo "$(openssl rand -hex 2)" ==> 83bb Place the new field: 2a02:2788:1000:0:83bb:fc9a:27ac:10c7 (83bb is the new 5th field)

Demonstrating Magic Banana's elegant & correctly interpreted solution: [code] echo 2a02:2788:1000:0:6037:fc9a:27ac:10c7 | sed s/'[0-9a-f]\{4\}'/$(openssl rand -hex 2)/$(shuf -i 4-7 -n 1) [/code] with the result: 2a02:2788:1000:0:6037:fc9a:2b1e:10c7 (2b1e is the new 7th field)

Imagine the task of writing the PTR's of the 79,228,162,514,264,337,593,543,950,336 addresses in 2a02:2788::/32 Ref: https://www.ultratools.com/tools/netMaskResult?ipAddress=2a02%3A2788%3A%3A%2F32 Now imagine the task of looking up all those 79,228,162,514,264,337,593,543,950,336 addresses ... For those of you at home: 2a02:2788:5d3a:f8e2:83bb:198c:4a68:b1be gives the same nslookup result as all the original and modified IPv6 addresses here. Could it be that nslookup is being hijacked ? George Langford
[Trisquel-users] Scripting the random replacement of fields in an IPv6 address
In an effort to estimate the degree to which a block of Internet addresses have been assigned the same PTR record, I'm attempting to reassign the contents of randomly selected fields in the retrieved addresses of the block. I've found a script which generates a random number among the numerals 4 through 7:

shuf -i 4-7 -n 1

Reference: https://stackoverflow.com/questions/2556190/random-number-from-a-range-in-a-bash-script

Also another script to create a random four-digit hexadecimal number, suitably modified:

echo "#$(openssl rand -hex 2)" | tr -d '\#'

Reference: https://stackoverflow.com/questions/40277918/shell-script-to-generate-random-hex-numbers/40278205

These both produce the desired outputs, but I have been unable to write a script in which the randomly generated field number from the first function selects the field that gets replaced by the output of the second function. This technique is based on my training in metallurgy, where averaging of randomly selected fields in a microscopic view can be proven mathematically to represent the property of the entire view.

Why I want to do this: The number of addresses in a block such as field:field::/32 is too large to look up over several lifetimes. I've written a script which replaces the last field in the IPv6 address with :0/112 so that the script which looks up the PTR records has just 64K addresses & PTR's in its output. Repeating the script for a hundred or so found IPv6 addresses takes several hours, which is tolerably quick for my purposes. Repeating that task for my suggested random changes in the source IPv6 addresses within just the 4th through 7th fields will not usually cause the search to stray outside the original CIDR blocks of the source addresses. That would randomly sample the originating CIDR block, all the more so, the more times the proposed script is run.
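[Editor's sketch] The missing glue between the two snippets can be written in a few lines; a hedged sketch in which $addr, $n and $new are illustrative names, and sed's numeric flag does the field selection:

```shell
# Pick the field number first, generate the new group, then let sed
# replace the n-th run of four hex digits. (Caveat: sed numbers only
# four-digit groups, so an abbreviated group such as a lone "0" is
# skipped in the count.)
addr=2a02:2788:1000:0:6037:fc9a:27ac:10c7
n=$(shuf -i 4-7 -n 1)
new=$(openssl rand -hex 2)
echo "$addr" | sed "s/[0-9a-f]\{4\}/$new/$n"
```

Capturing both random values into variables before the sed call is what makes the two generators cooperate; substituting them directly inside one command line also works, as the later posts in this thread show.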
I've done something like this by running my basic nmap search script on two data sets for the same PTR record, one gleaned from the Internet with a search on the hostname/PTR record, and the other from a database of publicly available recent-visitor data gathered without first applying hostname-lookup to the original visitor addresses. Each address set was different from the other, both had around a hundred addresses, and the outputs of the two nmap search scripts each list over six million identical PTR records, making twelve million ... how many more are there ? George Langford
Re: [Trisquel-users] Removing unwanted carriage returns
Previously, I had attempted a join script: > Step (5) I attempted to join the present spreadsheet with the domains-visited and visits-per-domain data: > join -a 2 -1 1 -2 1 But the results look incomplete: only 13,000 rows of fully filled-in data with correct & complete counts, > yet there are 330,000 rows of the uncombined data ... adding up to 343,000 rows. Needs some work ... You bet it needs some work ... I had made a couple of irreparable errors, so I restarted the construction of the useless spreadsheet, which is now ready to be filled in per a previous posting. More about this later. George Langford
Re: [Trisquel-users] Removing unwanted carriage returns
Remember the Delta Process from Calculus 1.01 ? https://www.brighthubeducation.com/homework-math-help/108376-finding-the-derivative-of-a-function-in-calculus/ That's where I am in Scripting 1.01 ... Back to the problem at hand.

Step (1) selects the IPv6 addresses of the Type A & Type B rows in the cleansed File01.txt:

awk '{print $2}' 'File01.txt' | sort | uniq -c > TempTQ01.txt ; awk '{print $2, $1}' 'TempTQ01.txt' | sort -nrk 2 > TempTQ02.txt

Step (2) selects and lists all the Type B entries in File01.txt (SS.IPv6-HN-GLU-MB-Domains-January2020-All.ods.txt):

awk '{print $1}' 'TempTQ02.txt' | sort > TempTQ10.txt ; awk '{print $1}' 'TempTQ10.txt' - | grep - File01.txt | awk '{print $1,$2,$3,$4}' '-' > TempTQ11.txt

Never mind simplicity or efficiency; it took 0.006 second CPU time and 0.032 second real time. It did reveal a number of Type C rows that I had missed in my visual inspection ==> TempTQ13.txt

Next step: For each row in TempTQ11.txt, print $2,$3,$4 to cache, find $1 in File10.txt's $2 column, and print that $2 to Column $1 along with cache's $2,$3,$4 into a new file ...

Step (3) matches the Keys in Col.$2 of the Type A rows with the data in Col's $2,$3 & $4 of Type B rows: join -a 2 -1 1 -2 2
Re: [Trisquel-users] Removing unwanted carriage returns
These "trivial" AWK programs are presently beyond my ken. Way too compact for me at this hour. In the meantime I started with this script:

awk '{print $2}' 'File01.txt' | sort | uniq -c > TempTQ01.txt ; awk '{print $2, $1}' 'TempTQ01.txt' | sort -nrk 2 > TempTQ02.txt

where File01.txt is the original 370,000 row file, albeit truncated to exclude the 38,000 rows that I've already filled in and also all the five-column resolved-hostname rows, leaving only Type A and Type B IPv6 address data in 347,000 rows. Type A are the keyed IPv6 addresses and the number of occurrences; Type B are CIDR blocks distinguishable by their '/' characters. We don't need to use the Type B data except as a reality check.

What remains is to print all the $1 columns of the IPv6 rows that match the first IPv6 key in the TempTQ02 list, plus the $2, $3, and $4 columns of the corresponding Type B row, to make C rows (Column $2 of TempTQ02.txt) of filled-in data, then move on to the next IPv6 key in the TempTQ02.txt file. The largest number of occurrences (55,000) exists in one contiguous group of 55,000 rows, one of which contains the IPv6 key address and its three columns of asn-query data. The occurrences data (C) are also needed only as a reality check.

I also meddled with the asn-query source code (https://svn.nmap.org/nmap/scripts/asn-query.nse) and learned how to store & retrieve it as a program file which returns the same data for those eight IPv6 addresses given above, plus the asn-query data. Alas, further meddling (beyond just putting something else between the quotes of "See the result for %s") has been unproductive. George Langford
Re: [Trisquel-users] Removing unwanted carriage returns
I'll restate the problem, unencumbered by distracting arrays of colons and hexadecimals. All 387,000 rows fall into one of three types, each IP address appearing only once in the first column:

Type A: $1("key" IP address), $2(CIDR block), $3(country code), $4(AS number)
Type B: $1(IP address falling within the $2 CIDR block of Type A), $2(Type A's "key" IP address, repeated many times in successive rows)
Type C: $1(hostname), $2(IP address from which $1 hostname can be resolved), $3(CIDR block), $4(country code), $5(AS number)

(Type C is not very populous and can be handled with Leafpad)

The desired script: awk should locate Type A's $1 Key and find all the Type B rows whose $2 Key matches $1's Key, and then copy Type A's columns $2, $3 & $4 in place of Type B's column $2 in every instance of a match with Type A's $1 Key.

I have found a small number of Type A rows with no data, but those I can look up with whois and fix easily. The already looked-up hostnames are the only non-IP data in the $1 columns of Types A & B, so awk can safely concentrate on all the Columns $1. Also, all the IP addresses of looked-up hostnames will not reappear as not-looked-up IP addresses. If awk can do everything described above with the first Type A $1 Key before proceeding, even if that involves searching the entire 370,000 rows once for each Type A $1 Key, then we're on the right track. George Langford
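[Editor's sketch] The desired script can be written as a two-pass awk. A hedged sketch, with /tmp/sample.txt and its two toy rows standing in for the real 387,000-row file; pass one (NR == FNR) memorizes each four-column Type A row, pass two rewrites each two-column Type B row whose $2 matches a memorized key:

```shell
# Toy data: one Type B row, then its Type A key row. Key rows may come
# after their Type B rows, which is why the file is read twice.
printf '2401:4900:1888:c07f:1:2:4283:5767 2401:4900:1888:fcb4:1:2:4282:aab3\n2401:4900:1888:fcb4:1:2:4282:aab3 2401:4900:1888::/48 IN AS45609\n' > /tmp/sample.txt
awk 'NR == FNR { if (NF == 4) a[$1] = $2 "\t" $3 "\t" $4; next }
     NF == 2 && ($2 in a) { print $1 "\t" a[$2]; next }
     { print }' /tmp/sample.txt /tmp/sample.txt
```

Type A (four-column) and Type C (five-column) rows pass through unchanged; each pass is one sequential scan, so even 387,000 rows take seconds rather than one search per key.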
Re: [Trisquel-users] Removing unwanted carriage returns
See: https://svn.nmap.org/nmap/scripts/asn-query.nse where the applicable (?) script reads, noting especially "( "See the result for %s" ):format( last_ip )":

-- ... begin snip ---
-- Checks whether the target IP address is within any BGP prefixes for which a query has
-- already been performed and returns a pointer to the HOST SCRIPT RESULT displaying the applicable answers.
-- @param ip String representing the target IP address.
-- @return Boolean true if there are cached answers for the supplied target, otherwise
-- false.
-- @return Table containing a string for each answer or nil if there are none.
function check_cache( ip )
  local ret = {}

  -- collect any applicable answers
  for _, cache_entry in ipairs( nmap.registry.asn.cache ) do
    if ipOps.ip_in_range( ip, cache_entry.cache_bgp ) then
      ret[#ret+1] = cache_entry
    end
  end
  if #ret < 1 then return false, nil end

  -- /0 signals that we want to kill this thread (all threads in fact)
  if #ret == 1 and type( ret[1].cache_bgp ) == "string" and ret[1].cache_bgp:match( "/0" ) then
    return true, nil
  end

  -- should return pointer unless there are more than one unique pointer
  local dirty, last_ip = false
  for _, entry in ipairs( ret ) do
    if last_ip and last_ip ~= entry.pointer then dirty = true; break end
    last_ip = entry.pointer
  end

  if not dirty then
    return true, ( "See the result for %s" ):format( last_ip )
  else
    return true, ret
  end

  return false, nil
end
-- ... end snip --

Where we should _print_ the result for %s instead of just pointing to it ... George Langford
Re: [Trisquel-users] Removing unwanted carriage returns
Here's my present dilemma, exemplified by a snippet from the spreadsheet:

2401:4900:1888:c07f:1:2:4283:5767 2401:4900:1888:fcb4:1:2:4282:aab3
2401:4900:1888:cd70:1:1:4a58:fc0c 2401:4900:1888:fcb4:1:2:4282:aab3
2401:4900:1888:d068:fce8:8739:a7a0:4c60 2401:4900:1888:fcb4:1:2:4282:aab3
2401:4900:1888:e8f5:1:2:4cde:e7ca 2401:4900:1888:fcb4:1:2:4282:aab3
2401:4900:1888:ee55:23c5:e0ec:79fb:59dd 2401:4900:1888:fcb4:1:2:4282:aab3
2401:4900:1888:fcb4:1:2:4282:aab3 2401:4900:1888::/48 IN AS45609
2401:4900:1889:9396:5693:8b98:3a70:da67 2401:4900:1889:ec73:92c5:3a0a:76d5:10c0
2401:4900:1889:a2d9:382e:b73:73dd:8693 2401:4900:1889:ec73:92c5:3a0a:76d5:10c0
2401:4900:1889:aa8c:730c:fa94:8c27:7bf9 2401:4900:1889:ec73:92c5:3a0a:76d5:10c0
2401:4900:1889:aad7:1:1:7b54:1e4c 2401:4900:1889:ec73:92c5:3a0a:76d5:10c0
2401:4900:1889:c648:2161:968a:1c9e:b1c1 2401:4900:1889:ec73:92c5:3a0a:76d5:10c0
2401:4900:1889:c7c0:f461:a726:a208:3ccb 2401:4900:1889:ec73:92c5:3a0a:76d5:10c0
2401:4900:1889:cd44:e950:74db:8fd2:c134 2401:4900:1889:ec73:92c5:3a0a:76d5:10c0
2401:4900:1889:ec73:92c5:3a0a:76d5:10c0 2401:4900:1889::/48 IN AS45609

The positions (i.e. $2) ending in ...:aab3 have to be replaced with 2401:4900:1888::/48 IN AS45609 and the positions ending in ...:10c0 ($2) have to be replaced with 2401:4900:1889::/48 IN AS45609 (i.e., $2,$3,$4) Those key rows, returned by nmap but not repeated by nmap, could have been anywhere in the preceding rows. Of course nmap should not have to repeat the look-ups, but merely repeating the stating of them would be helpful. It is open source ...

The entire text file has 387,000 rows, so even an inefficient script would be plenty fast enough. I can fill in about five thousand rows an hour ... leading to a time estimate of 387,000/5,000*1 hour = 77 hours ... not impossible while I'm housebound.
It may look silly when the spreadsheet is sorted by IPv6 address, but it's all very necessary when it's sorted by the number of domains visited and/or the number of visits per domain. George Langford
Re: [Trisquel-users] Removing unwanted carriage returns
janet admonished me: > Did you even bother to read the regular expression I provided to use in vim? I stopped reading after the word, "windows."
Re: [Trisquel-users] Removing unwanted carriage returns
Magic Banana constructively added: > It looks like you could have nmap format its output ... Oh ! Gee ! That's a welcome suggestion. I have two more sets of IPv6 data already nmap'ed over quite a few hours that are in the old grep-unfriendly format. Fortunately, my brute-force workarounds are less time-consuming than the original nmap scans, from which there is no escape. Unfortunately, nmap's avoidance of excessive repetition runs afoul of my use of the simple-to-use LibreOffice Calc in that I'm faced with multiple days of filling in the empty cells between the infrequent asn-query results, which nmap limits to one lookup per CIDR block. Another roadblock is Google's aversion to robots, so my search for "other" IP addresses of multi-addressed PTR's is necessarily a manual task, what with the scores of CIDR blocks filled with identically named PTR's. Try chasing down hn.kd.ny.adsl, motorspb.fvds.ru or hosted-by.leaseweb.com. George Langford
Re: [Trisquel-users] Removing unwanted carriage returns
jaret remarked: > I believe you are referencing to a new line character, ... Not just _any_ new line character: A combination of the new line character on the end of one row, plus the phrase at the beginning of the following row. Removing the new line characters willy-nilly will leave a one-row file with all 750,000 lines all concatenated together ... I've done that inadvertently. What I did do was to divide those 750,000 rows into twenty 50,000 row files and then apply search & replace in Leafpad, which took a couple of minutes for each file. It took longer to subdivide the original file by hand ... George Langford
[Trisquel-users] Removing unwanted carriage returns
The output from my nmap script for gleaning hostname, ASN, CIDR and country code from a list of IP addresses generally looks like this:

Nmap scan report for 2a00:1298:8011:212::165
Host is up.

Host script results:
| asn-query:
| BGP: 2a00:1298::/32 | Country: SK
|_ Origin AS: 5578 - AS-BENESTRA Bratislava, Slovak Republic, SK

Nmap scan report for 2a00:1370:8110:3eea:ddea:8b70:415a:f33e
Host is up.

Host script results:
|_asn-query: See the result for 2a00:1370:8114:b2d1:45ee:f77e:facb:d2e8

Nmap scan report for 2a00:1370:8110:79d7:2821:a9b2:9315:cb0f
Host is up.

Host script results:
|_asn-query: See the result for 2a00:1370:8114:b2d1:45ee:f77e:facb:d2e8

I'm using the following grep script to separate the desired data:

grep -e "Nmap scan report for" -e "BGP:" -e "Origin AS:" -e "asn-query: See the result for" SS.IPv6-HN-GLU-MB-Domains-January2020-Uniq-nMap.txt > SS.IPv6-HN-GLU-MB-Domains-January2020-Resolve.txt

Which [nearly instantly] produces results that look like this (after stripping a few (9000+) carriage returns with Leafpad):

Nmap scan report for 2a00:1298:8011:212::165
2a00:1298::/32 | Country: SK
AS5578 - AS-BENESTRA Bratislava, Slovak Republic, SK
Nmap scan report for 2a00:1370:8110:3eea:ddea:8b70:415a:f33e
|_asn-query: See the result for 2a00:1370:8114:b2d1:45ee:f77e:facb:d2e8
Nmap scan report for 2a00:1370:8110:79d7:2821:a9b2:9315:cb0f
|_asn-query: See the result for 2a00:1370:8114:b2d1:45ee:f77e:facb:d2e8

I can remove "|_asn-query:" with sed:

sed 's/|_asn-query://g' SS.IPv6-HN-GLU-MB-Domains-January2020-ResolvePart.txt > SS.IPv6-HN-GLU-MB-Domains-January2020-ResolveStep01.txt

With the following general result:

Nmap scan report for 2a00:1298:8011:212::165
2a00:1298::/32 | Country: SK
AS5578 - AS-BENESTRA Bratislava, Slovak Republic, SK
Nmap scan report for 2a00:1370:8110:3eea:ddea:8b70:415a:f33e
See the result for 2a00:1370:8114:b2d1:45ee:f77e:facb:d2e8
Nmap scan report for 2a00:1370:8110:79d7:2821:a9b2:9315:cb0f
See the result for 2a00:1370:8114:b2d1:45ee:f77e:facb:d2e8

Replacing the carriage return in the string "f33e [C.R.] See the result for" with a tab and just "See" is proving problematic. In Leafpad, it will take way too long (days ...) so I'm forced to learn some more scripting tricks ... I need to do this without inadvertently stripping all 400,000 carriage returns. George Langford
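[Editor's sketch] Joining only those particular line breaks is a one-liner for GNU sed: read the whole file into the pattern space, then rewrite each newline that immediately precedes "See the result for" as a tab. A hedged sketch, with three toy lines standing in for the 400,000-row file:

```shell
# The ":a; N; $!ba" loop slurps the entire stream into sed's pattern
# space; the substitution then touches only newlines followed by
# "See the result for", leaving every other line ending intact.
printf 'Nmap scan report for 2a00::f33e\nSee the result for 2a00::1\nNmap scan report for 2a00::2\n' |
sed ':a; N; $!ba; s/\nSee the result for/\tSee the result for/g'
```

Slurping a file of a few tens of megabytes into memory is harmless; for the real files, replace the printf with the Resolve file name as sed's argument.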
Re: [Trisquel-users] Unforeseen feature of the join command
Magic Banana is perplexed by my experience with grep's memory usage: > Your experience with grep is weird. As far as I understand, grep's memory requirement > do not depend on the number of lines (which could be infinite). It processes one single > line, outputs it if and only if it matches the regular expression, then forgets about > it to process the subsequent line and so on. My example had a pattern file of 7700 rows and a target of 2200 rows; the grep script searching that combination saturated both 7.7 GB of RAM and 18 GB of swap. I divided the 7700 rows into eight pieces of 1000 rows each (except the last one of 700 rows). Those eight grep scripts took about 20 seconds each to finish without taxing either form of memory. George Langford
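[Editor's sketch] The manual eight-way division can be automated with split. A hedged sketch with toy pattern and target files standing in for the real 7700-row and 2200-row ones; grep -F is my assumption that the patterns are plain strings such as IP addresses, which also shrinks grep's memory appetite compared with treating them as regular expressions:

```shell
# Split the pattern file into fixed-size chunks (-l 1 here only because
# the toy file has two rows; -l 1000 for the real one), grep with each
# chunk in turn, then merge. sort -u drops any target line matched by
# patterns living in two different chunks.
printf 'abc\ndef\n' > /tmp/patterns.txt
printf 'xxabcxx\nzzz\nxdefx\n' > /tmp/target.txt
split -l 1 /tmp/patterns.txt /tmp/chunk.
for f in /tmp/chunk.*; do grep -F -f "$f" /tmp/target.txt; done | sort -u
rm /tmp/chunk.*
```

With a 1000-line chunk size this reproduces the eight-piece workaround above in one command.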
Re: [Trisquel-users] Unforeseen feature of the join command
What I wrote yesterday was tongue-in-cheek: >It's great that sort can be "trained" to do the same kind of multi-stage sorting task that appears to be built into LibreOffice Calc. What I meant was that its user [me] can be trained ... George Langford
Re: [Trisquel-users] Unforeseen feature of the join command
We wrote: > I realized that LibreOffice Calc. cannot handle more than a million rows ... > Spreadsheets are only meant to do computation on little data. > To store many data, use text files or a database management system. Starting with 134 sets of recent visitor data, the spreadsheet comes to 1.7 million rows, eventually expanding to about fifteen columns. It's not unwieldy yet at that point, even with an adults' table and a kids' table. Your interest & expertise in data mining can best be brought to bear after the spreadsheet is populated to the extent that I envision. The multi-address PTR records can have millions of cells for each such name. There will be patterns in the visits to the domains and their varied subject matter. Some CIDR address spaces are filled with large numbers of different multi-address PTR records, which demand the database treatment. Not to mention the IPv6 data, which have ballooned to about a third of the entire data set. The number of ISP's that choose to publish the recent visitor data will grow exponentially (I hope) and that will make my approach burdensome; it's already doubled between June 2019 and January 2020. It's great that sort can be "trained" to do the same kind of multi-stage sorting task that appears to be built into LibreOffice Calc. By June I'll be forced to face up to that homework assignment. George Langford
Re: [Trisquel-users] Unforeseen feature of the join command
Regarding "sort" ... After going back to my collection of gratuitously looked-up hostnames, and after organizing the potential spreadsheet so as to contain the essential information: hostname, visits-per-domain, domains visited, and target domain, I realized that LibreOffice Calc. cannot handle more than a million rows ... my file containing 1.7 million ... I decided to sort the file, first on Column 3, then Column 1, and then on Column 2. Accordingly, I rearranged the columns thusly: $3, $1, $2, $4 and sorted with: "sort -nrk 1,4" where "nr" puts the biggest numbers at the top of the column, but sort evidently did not reach the third column, resulting in an ordering of only hostname and visits-per-domain. That was sufficient, because it allowed me to split the file at the 1,000,000 mark, whereupon I completed the sort with Calc. Rows from 1,000,001 onward (about 700,000 of them) have about one domain visit per hostname; the actual numbering of the secondary portion is 1 through 700,001, of course. Good thing I'm doing it this way: sorting the secondary list thusly: $2, $1, $3, $4 reveals some eye-popping numbers: The uppermost ~1500 of those rows have visitors making one request per hour up to over three visits per second (upwards of 900,000 in a month) ... the uppermost visits-per-domain were to a meteorological domain, but there were other visitors with less understandable motives making about a hundred visits per hour. George Langford
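[Editor's sketch] The reason "sort -nrk 1,4" only ordered the first column: -k 1,4 defines a single key running from field 1 through field 4, and -n reads just the leading number of that key, so the later fields never get compared. Giving sort one key per column reproduces Calc's multi-stage sort; a toy sketch:

```shell
# Primary key: column 1, numeric descending; second key: column 2, text
# ascending; third key: column 3, numeric descending. Each -kM,M ends its
# key at column M, so the next key actually takes effect on ties.
printf '2 b 9\n2 a 5\n10 c 1\n' | sort -k1,1nr -k2,2 -k3,3nr
```

This prints "10 c 1", then "2 a 5", then "2 b 9": numerically descending on column 1, with the tie between the two "2" rows broken alphabetically on column 2.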
Re: [Trisquel-users] Unforeseen feature of the join command
Yesterday in some despair, I wrote: > I note on looking at the resulting files that my earlier generated domain visit > counts are inaccurate, but that it is not the result of an inaccurate join. With a better grasp of the scripting & join process, I re-started from the collected original data, cleaned out the invalid material and (a great many) IPv6 data and repeated the appropriate steps. Now, the domains-visited counts are correct and the visits-per-domain data are carried through intact; there are no empty cells. I had been guilty of not reading (i.e., unaware of !) the "info sort" material; even if I had, Magic Banana would still have had to interpret it for me. "cut" is not yet in my scripting vocabulary. I looked into that so as to learn what "cut -d ' ' -f -2" means in the present context. I've used "tr ':' '\t'" to separate the IPv6 data into fields so that I could use LibreOffice Calc to sort the fourth column so as to collect all the IPv6 data into one series of lines in the spreadsheet. Now I'll use it again to reconstruct the IPv6 files in their own domain.txt, IPv6, Visits-per-domain spreadsheet file. As long as I'm not making entries one-at-a-time, I'm happy when my inefficient script does the same task in less than an hour. I can look up the PTR records that go with the IPv4 data, but in my parallel study of the gratuitously resolved IP address data (now about 90% PTR's or hostnames) the reverse is extraordinarily tedious. I've been using Google to gather the server CIDR data (with nmap's asn-query function) for the multi-addressed PTR's, and Google does not like scripts ... I am challenged every few minutes while at that task. I adjusted Magic Banana's suggested script for my actual data: From: join
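[Editor's note] For the record, "cut -d ' ' -f -2" sets the field delimiter to a space and keeps fields up to and including the second:

```shell
# -d ' ' chooses the delimiter; -f -2 means "fields 1 through 2"
echo 'a b c d' | cut -d ' ' -f -2
# → a b
```

The open-ended form "-f 3-" would do the opposite, keeping field 3 and everything after it.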
Re: [Trisquel-users] Unforeseen feature of the join command
Magic Banana comes to the rescue yet again! I tried the two suggested scripts: The second takes half as much CPU time as the first. I applied MB's first suggestion to a couple of multi-megabyte files, obfuscated to protect the innocent:

awk '{print $1, $3, $2}' 'DataX-VpDCts.txt' > Temp01.txt ; awk '{print $1, $3, $2}' 'DataY-DVCts.txt' > Temp02.txt ; join -2 1 DataZ-VpDCts-DVCts-Keyed.txt ; rm Temp01.txt Temp02.txt Temp03.txt

Forgive me for using crutches ... but the result has ~300,000 rows, four columns, no missing cells and the script(s) took less than a second real time for processing. Here's the story behind the two subject files: Each has three columns: IPv4, Count 1 or Count 2, Textfile name. The join command is to produce IPv4, Count 1, Count 2, Textfile name in four columns and does so OK. A few of the IPv4's visited more than one Textfile name; and some IPv4's visited a single Textfile name multiple times. Therefore, some IPv4's appear up to twenty or more times in successive rows when the resulting file is sorted on the IPv4 column. A total of thirty-two sets of domain visitor data were examined; some IPv4's visited twenty-nine of the thirty-two domains. Over a thousand IPv4's visited a single domain over a thousand times. The maximum was over 100,000 visits in the month examined. I note on looking at the resulting files that my earlier generated domain visit counts are inaccurate, but that it is not the result of an inaccurate join. More to come, it appears. George Langford
Re: [Trisquel-users] Unforeseen feature of the join command
Here's what I found that works; I simply append a letter to each "proxy-IP address" for the duration of the processing and then remove it afterwards. Workaround:

awk '{print$1"\t""w"}' 'ListA.txt' | tr -d "\t" > ListA-w.txt ... somewhat redundant
awk '{print $1"w",$2, $3, $4}' ListB.txt > ListB-w.txt ... this works also

Then:

awk '{print $1}' 'ListA-w.txt' | sort -k1 | uniq -c > ListA-w-Counts.txt

Followed by:

join -a 2 -1 1 -2 2 Test-DV-w.txt ; rm TempX-w.txt

The end result is attached; the other files can be generated from the original ListA.txt and ListB.txt Watch out for any other entries with "w" in the name ! Purely IPv4 data are safe, of course. George Langford
Re: [Trisquel-users] Unforeseen feature of the join command
This has occurred before; see: https://www.researchgate.net/post/The_join_command_seems_to_be_messing_up_when_using_numbers_as_the_comparing_fields_Any_ideas Alas, my actual data are IPv4 addresses, numbered in the millions; that workaround may be unworkable.
[Trisquel-users] Unforeseen feature of the join command
Here's a task that has been defeating me for a couple of weeks: Start with a list of items in a database file: ListA.txt Combine with the original database: ListB.txt

The steps that I've been using:

awk '{print $1}' 'ListA.txt' | sort -k1 | uniq -c > ListA-Counts.txt

Note: there are no duplicates, for sake of simplicity. Then:

join -a 2 -1 1 -2 2 Test-DV.txt ; rm TempX.txt

The output file has twenty-five incomplete rows whose second column is missing, leaving Columns $1, $3, and $4 as $1, $2, $3. See Test-DVS.txt (sorted on $4). George Langford, surrounded by Covid19 in SE PA

abc.148 49 1 32.txt
abc.155 64 1 32.txt
abc.211 75 1 32.txt
abc.234 20 1 32.txt
abc.90 32 1 32.txt
abc.94 99 1 32.txt
abc.130 39 1 31.txt
abc.186 65 1 31.txt
abc.43 91 1 31.txt
abc.49 66 1 31.txt
abc.72 8 1 31.txt
abc.74 37 1 31.txt
abc.138 87 1 30.txt
abc.174 79 1 30.txt
abc.230 19 1 30.txt
abc.243 47 1 30.txt
abc.27 31 1 30.txt
abc.69 67 1 30.txt
abc.82 17 1 30.txt
abc.99 98 1 30.txt
abc.102 39 1 29.txt
abc.124 48 1 29.txt
abc.126 19 1 29.txt
abc.131 44 1 29.txt
abc.182 39 1 29.txt
abc.222 4 1 29.txt
abc.223 12 1 29.txt
abc.195 17 1 28.txt
abc.208 84 1 28.txt
abc.241 26 1 28.txt
abc.36 62 1 28.txt
abc.47 95 1 28.txt
abc.64 35 1 28.txt
abc.66 58 1 28.txt
abc.92 6 1 28.txt
abc.178 48 1 26.txt
abc.217 70 1 26.txt
abc.218 67 1 26.txt
abc.238 33 1 26.txt
abc.244 52 1 26.txt
abc.251 40 1 26.txt
abc.63 97 1 26.txt
abc.115 4 1 25.txt
abc.149 20 1 25.txt
abc.171 93 1 25.txt
abc.219 22 1 25.txt
abc.247 28 1 25.txt
abc.32 54 1 25.txt
abc.119 32 1 24.txt
abc.166 5 1 24.txt
abc.170 84 1 24.txt
abc.179 59 1 24.txt
abc.180 93 1 24.txt
abc.190 65 1 24.txt
abc.210 25 1 24.txt
abc.237 13 1 24.txt
abc.246 2 1 24.txt
abc.250 11 1 24.txt
abc.42 9 1 24.txt
abc.51 83 1 24.txt
abc.153 35 1 23.txt
abc.215 70 1 23.txt
abc.34 56 1 23.txt
abc.45 33 1 23.txt
abc.107 10 1 22.txt
abc.114 48 1 22.txt
abc.121 70 1 22.txt
abc.127 62 1 22.txt
abc.128 46 1 22.txt
abc.141 66 1 22.txt
abc.37 75 1 22.txt
abc.38 18 1 22.txt
abc.68 19 1 22.txt
abc.110 54 1 21.txt
abc.122 79 1 21.txt
abc.164 83 1 21.txt
abc.176 87 1 21.txt
abc.191 59 1 21.txt
abc.201 24 1 21.txt
abc.202 68 1 21.txt
abc.212 19 1 21.txt
abc.44 53 1 21.txt
abc.108 32 1 20.txt
abc.134 88 1 20.txt
abc.142 28 1 20.txt
abc.216 51 1 20.txt
abc.220 25 1 20.txt
abc.239 98 1 20.txt
abc.58 69 1 20.txt
abc.123 33 1 19.txt
abc.154 15 1 19.txt
abc.167 8 1 19.txt
abc.181 61 1 19.txt
abc.209 82 1 19.txt
abc.78 36 1 19.txt
abc.88 72 1 19.txt
abc.116 76 1 18.txt
abc.177 57 1 18.txt
abc.203 52 1 18.txt
abc.249 74 1 18.txt
abc.252 85 1 18.txt
abc.89 31 1 18.txt
abc.132 42 1 17.txt
abc.192 26 1 17.txt
abc.253 58 1 17.txt
abc.33 79 1 17.txt
abc.35 82 1 17.txt
abc.40 32 1 17.txt
abc.54 79 1 17.txt
abc.57 15 1 17.txt
abc.60 61 1 17.txt
abc.62 61 1 17.txt
abc.71 22 1 17.txt
abc.97 62 1 17.txt
abc.101 3 1 16.txt
abc.120 10 1 16.txt
abc.173 70 1 16.txt
abc.206 11 1 16.txt
abc.41 82 1 16.txt
abc.135 19 1 15.txt
abc.156 71 1 15.txt
abc.193 52 1 15.txt
abc.225 89 1 15.txt
abc.233 45 1 15.txt
abc.55 1 1 15.txt
abc.61 52 1 15.txt
abc.77 1 1 15.txt
abc.118 15 1 14.txt
abc.125 20 1 14.txt
abc.137 34 1 14.txt
abc.139 57 1 14.txt
abc.143 43 1 14.txt
abc.187 8 1 14.txt
abc.189 87 1 14.txt
abc.39 84 1 14.txt
abc.52 6 1 14.txt
abc.140 14 1 13.txt
abc.147 34 1 13.txt
abc.157 4 1 13.txt
abc.169 9 1 13.txt
abc.197 52 1 13.txt
abc.199 37 1
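[Editor's sketch] The unforeseen feature, as the ResearchGate thread cited in this discussion also concludes, is that join compares keys as text and expects both inputs in the same lexicographic order; a numerically sorted input makes join silently drop or misalign lines. A hedged sketch with toy stand-ins for ListA-Counts.txt and ListB.txt:

```shell
# join needs both files sorted the same lexicographic (LC_ALL=C) way on
# the join field; numeric order is not acceptable ("abc.10" sorts before
# "abc.9" as text, after it numerically).
printf 'abc.9 2\nabc.10 1\n' > /tmp/A.txt
printf 'abc.9 bar\nabc.10 foo\n' > /tmp/B.txt
LC_ALL=C sort -k1,1 /tmp/A.txt > /tmp/A.sorted
LC_ALL=C sort -k1,1 /tmp/B.txt > /tmp/B.sorted
join -a 2 /tmp/A.sorted /tmp/B.sorted
```

This yields "abc.10 1 foo" and "abc.9 2 bar" with no dropped columns; the same sort applied before the real join avoids both the missing-second-column rows and the appended-letter workaround.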
Re: [Trisquel-users] ThinkPenguin WiFi adapter disruption
To find out more about nmap, type "man nmap" in a terminal; to find out even more, type "info nmap". To improve your understanding of the syntax, explore others' experience through online searches about nmap.

This webpage illustrates my use of nmap to find out which of the attackers on my website were operating from servers with open port 3389, the Windows Remote Desktop port, commonly left open by naive users of the so-called Cloud, which is a marketing scheme to put unused server hard-drive capacity to work generating income for Internet Service Providers: https://www.pinthetaleonthedonkey.com/Timeline-RussianInterference.htm

About your Indicator Applet: it could be that your connection to the router is OK, but that the router is being overwhelmed by external traffic. I would also be suspicious of the USB port into which your ThinkPenguin WiFi dongle is plugged; try a different USB port. I had a scan script running for almost two weeks, during which the WiFi connection was broken several times; the script kept right on from the point of interruption when the connection was re-made. I've sometimes had to try another USB port when the connection didn't wake up right away on reconnecting.

ThinkPenguin does provide a firmware update for their WiFi dongle; don't hesitate to ask. My (free) update came with clear instructions, which made applying it easy and quick.
Re: [Trisquel-users] ThinkPenguin WiFi adapter disruption
Better to view this as a security feature than as a sign of failure. My own ThinkPenguin dongle experiences broken connections now & then, more so when I've got multiple Nmap scans going. No data is ever lost, and the scans pick up right where they left off when the connection is re-made. Unplugging and then reconnecting the dongle usually re-makes the connection OK ... if it doesn't, use a different USB port to wake it up.

I also have a ThinkPenguin extended-coverage device, which has an accessory antenna for boosting coverage. If I'm using it while running the Nmap scans, the connection breaks more frequently, confirming the cause: too many competing WiFi installations. Disconnecting the antenna helps a lot; the coverage drops dramatically, but the dongle connects to my own router reliably.

That said, my connection indicator never lies to me: if the fan-shaped icon is blank, there's no connection; if there are multiple "waves", it's connected. I'm using Trisquel flidas on a ThinkPad T420 with "Indicator Applet Complete 1.12.1"; see http://www.mate-desktop.org
Re: [Trisquel-users] CPU frequency stuck at minimum speed
Magic Banana's 'puter is putzing along in low gear ... Thinking inside the box:

A few years ago I was working on a project involving the storage of grapes in a produce warehouse, where the grapes were kept in small crates protected against mold by sulfur dioxide generated from packets of a sulfate that releases the gas by reacting with moisture in the air. Sulfur dioxide dissolves in water to form sulfurous acid, which is rather corrosive to copper alloys. I was given one of those packets to find out how they work ... whereupon the USB ports on my laptop started to misbehave. Oxides of transition metals are semiconductors, and the diode drops across the transition-metal contacts were introducing electrical noise that greatly slowed the USB transmissions.

It might be that your 'puter's internal connections are being affected by sulfur dioxide in urban air pollution, so re-making the 'puter's internal connections might just work a cure, at least for a while. Sulfur dioxide is also formed by the combustion of fossil fuels containing sulfur compounds; the odorant in methane-containing gas is also a sulfur compound.
[Trisquel-users] Is trisquel using OpenSMTPD ?
One of my security news feeds just happened to mention a bug in OpenSMTPD: https://nakedsecurity.sophos.com/2020/01/31/serious-security-how-special-case-code-blew-a-hole-in-opensmtpd/ As I'm getting all manner of unsolicited, nasty-looking emails because of past anti-spam activity, I wonder whether I should be extraordinarily concerned about the state of OpenSMTPD. George Langford, stunned in SE PA
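One quick way to answer the question for your own machine (my suggestion, not from the advisory) is to ask the package database whether OpenSMTPD is installed at all; on a Debian-derived system like Trisquel it is typically only present if you installed it yourself:

```shell
# Hypothetical check: is the opensmtpd package installed on this machine?
# dpkg -s queries the Debian/Trisquel package database for install status.
if dpkg -s opensmtpd >/dev/null 2>&1; then
    echo "opensmtpd is installed"
else
    echo "opensmtpd is not installed"
fi
```

If it is not installed, the OpenSMTPD bug cannot affect the machine directly, whatever mail it receives.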
Re: [Trisquel-users] Is there an application that will resolve obfuscated hostnames ?
martinh suggests: "... the --resolve-all option"

Alas, the Nmap in my Trisquel flidas is version 7.01, and Nmap didn't gain "--resolve-all" until version 7.70. Here's the script that I tried, based on https://nmap.org/nsedoc/scripts/resolveall.html :

time nmap -Pn -sn --script=resolveall --script-args=newtargets,resolveall.hosts="126.64.uzpak.uz","129.mtsnet.ru","139.188.94-binat-smaug.in-addr.arpa","14.mtsnet.ru","154-70-132.static.xpressgt.co.za","173-232-44.static.rdns.serverhub.com","177.97.223.dynamic.adsl.gvt.net.br" > HostsAll02.txt

Output: Error: segmentation fault (core dumped)

Later, I returned to my usual scan, but on some hostnames known to have duplicate IP addresses, some of which have IPv6 alternatives (the -6 option in Nmap):

time nmap -Pn -sn -6 --script asn-query -iL Hosts03UAll.txt > HostsAll03U6.txt

and

time nmap -Pn -sn --script asn-query -iL Hosts03UAll.txt > HostsAll03U.txt

Both scans produce alternative IPv4 or IPv6 addresses, which I hadn't noticed in any prior scans before today. That's a step in the right direction, but the explicit method in the first of the three scripts above appears to trip some sort of bug. These are all real hostnames, but one shouldn't try to visit them, as their owners either don't know how to configure their servers, or do know and are hiding something. I think I need to upgrade my version of Nmap, but I'm unsure about the approved Trisquel method of doing so.
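Until a newer Nmap is available, a rough stand-in for what --resolve-all does (my own workaround sketch, not something from the Nmap documentation) is to ask the system resolver for every address a name maps to, using getent; "localhost" serves here only as a safe demonstration target in place of the suspicious hostnames above:

```shell
# List all unique IP addresses (IPv4 and IPv6) that a hostname resolves to.
# These are the same addresses --resolve-all would feed to Nmap as targets.
getent ahosts localhost | awk '{print $1}' | sort -u
```

The resulting address list could then be passed to the older Nmap with -iL, one address per line.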
[Trisquel-users] Is there an application that will resolve obfuscated hostnames ?
After several months of learning text scripting at Magic Banana's insistence, there's one last gap in my project. While Nmap is excellent for getting WhoIs-like data (CIDR blocks, autonomous system numbers, and country codes) from IP addresses, there are still some multiply-addressed hostnames that can only be discovered with online searches like Google. True, Nmap can do the same sorts of searches for hostnames as for IP addresses, but it doesn't get 'em all. Google does well because many helpful folks keep track of the email scams and malevolence frequently used to attack the Internet infrastructure, and those emails have headers which reveal the originating IP addresses of their senders as well as the hostnames that go with those IP addresses.

I have been finishing off my spreadsheets by using Google to find the IP addresses that the scripts cannot find. While entertaining - because of all the multiply-addressed hostnames that the searches reveal - it's tedious and time-consuming. Are there applications that can do this? My manual searches just put the target hostname in quotes ... and then I scan the results for the IP address(es) that the reporters associate with that hostname. Any suspicious combination of IP address and nonspecific hostname I can also search in Hurricane Electric's "BGP" application, which returns all the hostnames (actually, pointer (PTR) records) used to identify the IP addresses in the associated CIDR block. I sometimes find that address space populated with many identical PTRs.

Thanks, George Langford
Re: [Trisquel-users] Turn off auto complete feature in LibreOffice Calc
nadebula.1984 is using a different version of LibreOffice. My standard-issue Trisquel has LibreOffice version 5.1.6.2, in which AutoInput is nowhere to be seen.