Sure, thanks for looking at it. I'll attach it as a txt.

Thanks,
Adrian

On Thu, Dec 2, 2010 at 7:46 PM, Grymoire <[email protected]> wrote:
>
> >Uhm, that's the one I'm already using.
>
> Yes Adrian, I know. And the code I posted worked fine for me.
> When I use it, I get
>
> ['74.125.113.27', '127.0.0.1', '127.0.0.1', '127.0.0.1', '89.216.227.38',
> [snip]
>
> So either we have different inputs, or your code is doing something
> different than mine. But the problem does NOT seem to be the fault of
> the regular expression. Something else is happening.
>
> Can you post more of your code so we can see why it does not work?
>
> BTW - if you want to clean up the output in shell, and make sure all of the
> IP addresses are valid, I'd use something like
>
> program | tr ",]['" "\n " | sort -un |\
> awk -F. '{$1 < 256 && $2 < 256 && $3 < 256 && $4 < 256 }'
>
>
> and sorry for misspelling your name.
>
>
>
> _______________________________________________
> Pauldotcom mailing list
> [email protected]
> http://mail.pauldotcom.com/cgi-bin/mailman/listinfo/pauldotcom
> Main Web Site: http://pauldotcom.com
>
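
For comparison, the cleanup that the quoted shell pipeline aims at (split the list output into tokens, keep only valid addresses, remove duplicates, sort numerically) could also be done in Python. This is only a minimal sketch: it assumes the input is the printed list of strings shown above, it is written for Python 3, and `clean_ips` is a hypothetical helper name that does not appear anywhere in the thread.

```python
import re

def clean_ips(text):
    """Return the unique, valid dotted-quad addresses in text, sorted numerically."""
    # Treat commas, brackets, quotes and whitespace as separators,
    # roughly what the tr step in the shell version does.
    tokens = re.split(r"[,\[\]'\s]+", text)
    valid = set()
    for token in tokens:
        parts = token.split(".")
        # Keep only tokens with exactly four numeric fields, each below 256.
        if len(parts) == 4 and all(
            re.fullmatch(r"[0-9]{1,3}", p) and int(p) < 256 for p in parts
        ):
            valid.add(tuple(int(p) for p in parts))
    # Tuples of ints sort octet by octet, so 89.x.x.x comes before 127.x.x.x.
    return [".".join(str(n) for n in quad) for quad in sorted(valid)]
```

Storing each address as a tuple of integers is a design choice for this sketch: it gives numeric ordering for free, and it also normalizes fields with leading zeros.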
"""
NetDB scraping code
This is used to obtain a list of IPs from our local NetDB cache. The RegEX needs some work.
Coded by Adrian Crenshaw
Please excuse the sloppiness of the code, I've only recently begun to learn Python
http://irongeek.com
"""
# import sys
import urllib2
import threading
from threading import Thread
import time
from datetime import datetime
import string
import os
import re

#ScanStartedTime = string.replace(string.replace(str(datetime.now())," ", "-"),":","-")
#OutputFile=open('temp-router-ips-' + ScanStartedTime + '.txt', 'wb',1)
# Append every IP found to the output file; duplicates are removed at the end.
OutputFile=open('all-sorted-uniq.txt', 'ab',1)
netdbdir= 'C:\\Windows\\SysWOW64\\config\\systemprofile\\AppData\\Roaming\\I2P\\netDb\\'
for root, dirs, files in os.walk(netdbdir):
    for afile in files:
        #print "-----"+ afile +"-----"
        # Join with root (not netdbdir) so files found in subdirectories open correctly.
        InputFile = open(os.path.join(root, afile),'rb')
        TextBlob= str(InputFile.read())
        #help from http://www.regular-expressions.info/examples.html
        # Raw string so the backslashes reach the regex engine unchanged.
        IPsInFile = re.findall(r'(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)',TextBlob)
        for RouterIP in IPsInFile:
            OutputFile.write(RouterIP + '\n')
        InputFile.close()
OutputFile.close()
# Re-read the whole file, drop duplicate lines, and write it back sorted.
TheInput=open("all-sorted-uniq.txt",'r').readlines()
Output= sorted(list(set(TheInput)))
open("all-sorted-uniq.txt",'wb').writelines(Output)
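
Since the docstring says the RegEX needs some work, here is one possible tightening, offered only as a sketch. The pattern in the script has no boundaries, so it can match part of a longer run of digits and dots. Whether that is behind the odd results discussed in this thread is an assumption, not something the thread confirms. The sketch is written for Python 3 and works on bytes, as read from a file opened in 'rb' mode; `OCTET`, `IP_RE` and `extract_ips` are hypothetical names.

```python
import re

# One octet in the range 0-255, same alternation as the script's pattern.
OCTET = rb"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# The lookbehind and lookaheads reject candidates that are embedded in a
# longer sequence of digits or dotted numbers (e.g. 1234.5.6.7890).
IP_RE = re.compile(
    rb"(?<![0-9.])(?:" + OCTET + rb"\.){3}" + OCTET + rb"(?![0-9])(?!\.[0-9])"
)

def extract_ips(blob):
    """Return every standalone dotted-quad address found in a bytes blob."""
    # There are no capturing groups, so findall returns the full matches.
    return [match.decode("ascii") for match in IP_RE.findall(blob)]
```

Called on the contents of one netDb file, e.g. `extract_ips(open(path, 'rb').read())`, this returns a list of strings that can be written out the same way the script does.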
