Re: Merging Related Information from 2 Tables
On Thursday 29 October 2009 20:44:12 Martin McCormick wrote: Giorgos Keramidas writes: You should use a Perl or Python script, and a hash... If you show us a few sample lines from the input file and how you want the output to look, it shouldn't be too hard to quickly hack one of those together. The alternative is to use join(1). A records look like: hydrogen.cis.osu. 43200 IN A 192.168.2.123 Text or TXT records look similar [...] hydrogen.cis.osu. 5 IN TXT cordell-north,009,192.168.2.123 This will work well since the default join field is the first field in the line. Jonathan ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Merging Related Information from 2 Tables
This is probably going to be a hashing exercise but I am checking to see if any of the building blocks needed are already out there. The problem is simple to describe in that there are 2 tables. One is a DNS zone transfer table of all the A or Address records in a given zone or from several zones for that matter. the other table is from the same zones and consists of text or TXT records. The only thing the 2 tables have in common is that some of the TXT records share the exact same name field as the A records so we should be able to display the important contents of the A and TXT records on the same line if their names match. The challenge is to do this quickly so some sort of hash function is needed to locate A and TXT records having the same name. Grep does this beautifully for single entries across multiple files, but I need to merge the text part of the TXT record with the IP address and host name from the A record with the same name. The only hard part is finding the quickest way to match the roughly 25,000 host names in the A records with around half as many TXT records. This is basically a bucket list problem in which we can either have an A record name in a bucket by itself or an A record in a given bucket and a TXT record in another bucket with the same name as the A record. In the interest of standing on the shoulders of giants, I am checking to see how much tried and tested tools already exist and how much needs to be home-grown. It is also possible to use egrep to search for A and TXT records in 1 pass through a file in which case one would search from the same file for both record types but the problem is the same. In case anybody wonders: egrep '([[:space:]]IN([[:space:]]TXT[[:space:]]|[[:space:]]A[[:space:]]))' okstate.zone ATXT.txt The line break here is for Email consideration. The above command should all be on one line. Thanks for any suggestions. Martin McCormick WB5AGZ Stillwater, OK Systems Engineer OSU Information Technology Department Telecommunications Services Group ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Merging Related Information from 2 Tables
On Thu, 29 Oct 2009 10:38:56 -0500, Martin McCormick mar...@dc.cis.okstate.edu wrote: This is probably going to be a hashing exercise but I am checking to see if any of the building blocks needed are already out there. The problem is simple to describe in that there are 2 tables. One is a DNS zone transfer table of all the A or Address records in a given zone or from several zones for that matter. the other table is from the same zones and consists of text or TXT records. The only thing the 2 tables have in common is that some of the TXT records share the exact same name field as the A records so we should be able to display the important contents of the A and TXT records on the same line if their names match. The challenge is to do this quickly so some sort of hash function is needed to locate A and TXT records having the same name. Hi Martin, You should use a Perl or Python script, and a hash... If you show us a few sample lines from the input file and how you want the output to look, it shouldn't be too hard to quickly hack one of those together. With a short input file like this: : keram...@kobe:/tmp$ cat input-file : localhost IN A 127.0.0.1 : kobeIN A 127.0.0.1 : kobeIN TXT This is a test You can construct a hash map of hostname - list of records in Python with a relatively short script: : #!/usr/bin/env python : : import re : import sys : : are = None # a regexp for matching 'A' records : txtre = None# a regexp for matching 'TXT' records : : try: : are = re.compile(r'^\s*(\S+)\s+[iI][nN]\s+[aA]\s+(((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)).*$') : txtre = re.compile(r'^\s*(\S+)\s+[iI][nN]\s+[tT][xX][tT]\s+(.*)$') : except Exception, inst: : sys.stderr.write('regexp error: %s' % str(inst)) : sys.exit(1) : : hosts = {} : : for l in sys.stdin.readlines(): : l = l.rstrip('\n\r') : # Is this an A record? : m = are.match(l) : if m: : (name, addr) = (m.group(1), m.group(2)) : rec = ('A', addr) : if not name in hosts: : hosts[name] = [rec] : else: : hosts[name].append(rec) : # Is this a TXT record? : m = txtre.match(l) : if m: : (name, text) = (m.group(1), m.group(2)) : rec = ('TXT', text) : if not name in hosts: : hosts[name] = [rec] : else: : hosts[name].append(rec) : : print hosts Running this script should produce something like: : keram...@kobe:/tmp$ python martin.py input-file : {'kobe': [('A', '127.0.0.1'), ('TXT', 'This is a test')], : 'localhost': [('A', '127.0.0.1')]} When you have the hash map of hostname to record-list for each host, you can select and print any combination of host=record from this hash. ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Merging Related Information from 2 Tables
Giorgos Keramidas writes: You should use a Perl or Python script, and a hash... If you show us a few sample lines from the input file and how you want the output to look, it shouldn't be too hard to quickly hack one of those together. Perl and python-- I wasn't even thinking of that! Thank you. I have installed python now on the FreeBSD system and will start learning it. A records look like: hydrogen.cis.osu. 43200 IN A 192.168.2.123 Text or TXT records look similar except that the data they convey are ASCII text strings of various information that are either read by people or maybe tell servers how to behave toward that particular client. hydrogen.cis.osu. 5 IN TXT cordell-north,009,192.168.2.123 Our hope is to have an output line looking like: 192.168.2.123 hydrogen.cis.osu cordell-north,009,192.168.2.123 We will actually run that output through sed to convert the 's to blanks and also the ,'s to blanks but that is trivial. Thanks for the examples. Martin McCormick ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Merging Related Information from 2 Tables
On Thu, 29 Oct 2009 18:37:09 +0200, Giorgos Keramidas keram...@ceid.upatras.gr wrote: You should use a Perl or Python script, and a hash... ... Running this script should produce something like: : keram...@kobe:/tmp$ python martin.py input-file : {'kobe': [('A', '127.0.0.1'), ('TXT', 'This is a test')], : 'localhost': [('A', '127.0.0.1')]} When you have the hash map of hostname to record-list for each host, you can select and print any combination of host=record from this hash. On Thu, 29 Oct 2009 13:44:12 -0500, Martin McCormick mar...@dc.cis.okstate.edu wrote: Perl and python-- I wasn't even thinking of that! Thank you. I have installed python now on the FreeBSD system and will start learning it. A records look like: hydrogen.cis.osu. 43200 IN A 192.168.2.123 Text or TXT records look similar except that the data they convey are ASCII text strings of various information that are either read by people or maybe tell servers how to behave toward that particular client. hydrogen.cis.osu. 5 IN TXT cordell-north,009,192.168.2.123 Once you slurp all the A and TXT records in a hash-map or another data structure of your own with Python, you can iterate over the hash and print parts or all of it. For example, if you have the hash I printed in my previous reply, you can print all addresses and text records with a small bit of code: : keram...@kobe:/home/keramida$ cat hello.py : #!/usr/bin/env python : : hosts = {'kobe': [('A', '127.0.0.1'), : ('TXT', 'This is a test')], : 'localhost': [('A', '127.0.0.1')]} : : for h in sorted(hosts): : addrs = [x[1] for x in hosts[h] if x[0] == 'A'] : txts = [x[1] for x in hosts[h] if x[0] == 'TXT'] : for a in addrs: : if len(txts) == 0: : txts = [] : for t in txts: : print %-20s %-30s %s % (a, h, t) : keram...@kobe:/home/keramida$ python hello.py : 127.0.0.1kobe This is a test : 127.0.0.1localhost : keram...@kobe:/home/keramida$ Add or remove formatting as you see fit :-) ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org