Add a field in both files that is the uint32 version of the IP.
Sort both files ascending by the new field.
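Roughly (an untested sketch; file names taken from your mail below, and I'm assuming the IP column is named "IP" in both files):

# Load each file, add a numeric sort key, and sort ascending by it.
$list = Import-Csv C:\IP_Addresses.csv |
    Select-Object *, @{ n = 'IpNum'; e = { ([IpAddress]$_.IP).Address } } |
    Sort-Object IpNum
$db = Import-Csv C:\DB_Export.csv |
    Select-Object *, @{ n = 'IpNum'; e = { ([IpAddress]$_.IP).Address } } |
    Sort-Object IpNum

You can Export-Csv these back out if you want the files themselves updated, but for the comparison it's fine to keep both in memory.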
Since both files are then sorted, Compare-Object will be very fast. If you
want it even faster, you can write a couple of while() loops yourself to do
the comparisons.
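The hand-rolled version boils down to one merge pass over the two sorted arrays, something like this (untested sketch; $list and $db as above, and assuming each IP appears at most once per file):

$i = 0
$j = 0
$found = while ($i -lt $list.Count -and $j -lt $db.Count) {
    if ($list[$i].IpNum -lt $db[$j].IpNum) { $i++ }
    elseif ($list[$i].IpNum -gt $db[$j].IpNum) { $j++ }
    else {
        $db[$j]   # matched: emit the full DB row (IP, Name, FirstSeen, LastSeen)
        $i++
        $j++
    }
}

That's a single pass over 9k + 260k rows, so it stays linear instead of doing 9k scans of the big list.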
There are a variety of mechanisms for converting the text IP to a uint32 IP.
The easiest is not the fastest. :) Whether that matters depends on how often
you will need to do this.
Probably the easiest:
( '192.168.231.10' -as [IpAddress] ).Address
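One caveat: .Address returns the IPv4 as a long in network byte order, so the numeric value has the octets reversed and won't sort in natural dotted-quad order. It's still a perfectly good join key as long as both files use it. If you want a key in natural order, or just want it faster, packing the octets yourself is one option (untested; the function name is mine):

function ConvertTo-UInt32Ip([string]$Ip) {
    $o = $Ip.Split('.')
    # Pack big-endian so numeric order matches natural IP order.
    [uint32](([int64]$o[0] -shl 24) -bor ([int64]$o[1] -shl 16) -bor ([int64]$o[2] -shl 8) -bor [int64]$o[3])
}

ConvertTo-UInt32Ip '192.168.231.10'   # 3232294666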
From: [email protected] [mailto:[email protected]] On
Behalf Of Orlebeck, Geoffrey
Sent: Tuesday, May 24, 2016 6:03 PM
To: '[email protected]'
Subject: [powershell] CSV Comparisons:
I have two CSV files. One is a list of IP addresses and DNS hostnames. The
other has IP, DNS name (if it resolves) as well as "firstseen" and "lastseen"
on the network. I am attempting to compare the list of IP addresses to the
second list that contains additional fields.
The dataset has the following format:
[inline screenshot of sample rows; columns are IP, Name, FirstSeen, LastSeen]
The goal is to loop through the list of IP addresses and check for matches in
the CSV file. If found, I want to note the entire row with all fields
(IP,Name,FirstSeen,LastSeen). I'm not sure of the most efficient way to process
this. The smaller list is ~9k entries. The larger list has over 260k entries.
On a small scale I am able to see this logic will at least give me a list of IP
addresses found in both files:
$List = Import-Csv C:\IP_Addresses.csv
$CSV = Import-Csv C:\DB_Export.csv
# $Matches is an automatic variable in PowerShell, so use a different name.
$Found = $List | % { $CSV.ip -like $_.ip }
The efficiency of this method is where I'm concerned. I'm currently working
with a limited subset of the data (about 500 entries each), and it takes a few
minutes to parse. However, with the full list being 9,000 IP addresses against
260,000+ rows in $CSV, I have a feeling it will take quite some time to process.
I'm wondering if anyone has any examples of such data comparisons. My end goal
is to parse the 9k IP addresses, compare them against our 260k entries, and
eventually filter based on the 'LastSeen' field. My gut says using
Compare-Object or the above example is a bad process, but I'm not sure how
else to go about processing such large files. I read about the differences
between Get-Content and StreamReader here
(https://foxdeploy.com/2016/03/23/coding-for-speed/) and thought there is
probably a better way than the built-in cmdlets I am attempting to use.
Thank you.
-Geoff