I realized that identical Icon/Unicon strings are implemented so that they
share the same actual storage of a sequence of bytes in memory (since they
are immutable, this is okay).
Thus, I could simply use a concatenation of the configuration-item versions
to represent each configuration; as long as I get or sort the configuration
items into the same order, equal configurations yield identical strings, and
this does not have a severe impact on physical storage.
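A minimal sketch of that idea, separate from the attached program: the host
names and the "name, version" item strings below are made up for
illustration, and config_key is a hypothetical helper. It relies only on the
fact that Icon tables compare string keys by value, so two hosts whose items
sort into the same order end up under the same key.

```icon
# Sketch only: hosts and "name, version" strings are invented examples.
procedure config_key( Litems )
  # Sort so that the same items always produce the same string,
  # then join them as tab-terminated values.
  local result
  result := ""
  every result ||:= ( ! sort( Litems ) ) || "\t"
  return result
end

procedure main( )
  local T_scnfg_Shost, Lscan, Lhost, cnfg
  T_scnfg_Shost := table( )   # maps config string to set of host names
  Lscan := [ [ "hostA", [ "b, 2.0", "a, 1.0" ] ],
             [ "hostB", [ "a, 1.0", "b, 2.0" ] ],
             [ "hostC", [ "a, 1.1", "b, 2.0" ] ] ]
  every Lhost := ! Lscan do {
    cnfg := config_key( Lhost[2] )
    # create the host set for this config string if it does not exist
    /T_scnfg_Shost[ cnfg ] := set( )
    insert( T_scnfg_Shost[ cnfg ], Lhost[1] )
  }
  # hostA and hostB share one key even though their items arrived
  # in a different order; hostC gets a key of its own
  every cnfg := key( T_scnfg_Shost ) do
    write( *T_scnfg_Shost[ cnfg ], " host(s): ", cnfg )
end
```

The order in which key( ) generates the table's keys is not defined, so the
two output lines may appear in either order.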
Attached is my solution in case anyone is curious - it is less readable
than I would like because there were too many kinds of variables and
structures to make "plain" names meaningful. It took less than a week of my
spare time, once I had the plan figured out. However, I believe that
writing a program to do something this complicated in any other language
would have taken MUCH longer (as would the debugging).
Now I have something that I can extend to process an arbitrary number of
configuration items for an arbitrary number of hosts. Right now, it
processes six configuration items for several thousand hosts in under a
second (I give it an input file that has been gathered over the network;
thus, the network-response time for each host does not affect the time
that it takes the program to run).
On Sun, September 14, 2008 1:21 pm, Art Eschenlauer said:
> Clint,
>
> You have understood my question and concisely phrased my concern ("Icon's
> built-in hash function is useless for sets, because sets are mutable").
> Choice B, or a variant of it (defining my own hash for a set, and
> maintaining a table with hashes as keys and sets as values), was all that
> I could come up with as an alternative to using seteq( ).
>
>
>
> Hugh,
>
> Here is a brief concrete summary of the problem:
> - There are hundreds of computers.
> - Each computer has a configuration consisting of a combination of
> versions of several dozen configuration items.
> - Each configuration item is defined by a unique id and a path to where a
> file may be located on a computer.
> - Each configuration item may be absent or may be one of up to a dozen
> versions.
> - Each version of a configuration item is distinguished (from other
> versions of the same configuration item) by its file size and last
> modification date.
> - I want to know all of the configurations, how many there are of each,
> and for each configuration, which computers have it.
>
> Right now, I am fetching the configurations, reading them into tables of a
> relational database, and performing SQL queries to get the information
> that I want. But this approach takes a lot of storage, and every time
> that I want to add a new file, I have to rewrite a bunch of queries. So,
> I was trying for a more elegant solution in Icon.
>
>
############################################################################
#
# File: vrsncmbntns.icn
#
# Subject: Summarizes combinations of configuration items on several hosts
#
# Author: Arthur C. Eschenlauer
#
# Date: October 1, 2008
#
# Version: 0.5
#
############################################################################
#
# This file is in the public domain.
# The freedom of its content is protected by the Lesser GNU public
# license, version 2.1, February 1999,
# http://www.gnu.org/licenses/lgpl.html
# which means you are granted permission to use this in any way that
# does not limit others' freedom to use it.
#
############################################################################
# Using the definitions below, this program
# 1. Reads from a tab-separated-value "config-item-instance-file"
# the versions of configuration items for several hosts
# 2. Reads from a tab-terminated-value "config-item-product-version-file"
# the product versions associated with versions of configuration items
# 3. Writes a summary of the number of hosts in each configuration
# 4. Writes the configuration for each host
# 5. If any configuration-item versions are not in the "config-item-
# product-version-file", a meaningful dummy entry is added; this
# value may then be used to retrieve the name of the affected host
# using the host-configuration file
#
# The file formats are as follows:
# 1. config-item-instance-file (tab-separated-values):
# a. Location name
# b. Host name
# c. Last modification date of config item
# d. File size of config item, in bytes
# e. File name of config item, without path
# f. Name of config item
# 2. config-item-product-version file (mostly tab-terminated-values)
# There are actually two types of lines in this file
# a. Tab-terminated values
# i. File name of config item, without path
# ii. Last modification date of config item
# iii. File size of config item, in bytes
# iv. Version number of package to which config item belongs
# v. Optional package code (unique ID for a version of a config item)
# vi. Name of config item
# vii. Optional Product code (unique ID for a config item, common among
# all versions)
# b. Aliases mapping config item names to labels for output files
# (no tabs required), match the pattern
# ="item" || tab(many(' \t')) ||
# (item_name := tab(upto(' \t'))) || tab(many(' \t')) ||
# ="label" || tab(many(' \t')) ||
# (item_label := tab(0))
# 3. configuration-summary file (comma-separated-values)
# a. First column, number of locations with a combination of config items
# b. Second column, number of hosts with a combination of config items
# c. Remaining columns, the combination of config-item-versions, i.e.,
# the version numbers of the config items in the combination
# 4. host-configuration file (comma-separated-values)
# a. First column, name of location
# b. Second column, name of host
# c. Remaining columns, the combination of config-item-versions, i.e.,
# the version numbers of the config items in the combination
############################################################################
$define USAGE " config-item-instance-file config-item-product-version-file stats-csv hosts-csv"
$define PRD_VER_SEP ", "
$define TXT_STRT "\""
# use the first definition of TXT_END for Microsoft Excel
# use the second definition for programs that are CSV-conformant
$define TXT_END "\"" # Excel may mess up things that look like numbers
# $define TXT_END " \xA0\"" # Excel plays well with numbers - others suffer
# define UNORDERED_INPUT if order of config items for a host is non-deterministic
# $define UNORDERED_INPUT 1
# record representing a scan of configuration of each host at all locations
record Rscan(
S_sloc, # set of location-name strings
S_shost, # set of host-name strings
T_sloc_Shost, # table mapping location-name string to set of host-names
T_shost_scnfg, # table mapping host-name string to config string
T_scnfg_Shost, # table mapping config string to set of host-names
T_scnfg_Sloc, # table mapping config string to set of location-names
T_scnfg_sprdvr,# table mapping config string to product version string
S_sprdnm, # set of product-name strings
T_scinm_scilbl,# table mapping config item name to label for output
T_shost_sloc # table mapping host-name string to location-name string
)
# (temporary) record representing configuration item instance
record Rciitmp(
sloc, # string - name of location of host
shost, # string - name of host
smodified, # string - modification date of file
ssize, # string - size of file
sfilename, # string - file name
scnfgitem # string - configuration item
)
# record representing a configuration item's product version
record RciiPrdVrsn(
sFilename, # string - matches Rciitmp.sfilename
sModified, # string - matches Rciitmp.smodified
sSize, # string - matches Rciitmp.ssize
sVersion, # string - may not be empty; package version
sPackageCode, # string - may be empty; MSI Package code GUID
sConfigItemName, # string - matches Rciitmp.scnfgitem
sProductId # string - may be empty; MSI Product Code GUID
)
procedure main( argv )
local scan # an Rscan record
local cnfg # a string-representation of a configuration
local loc # a location-name
local host # a host-name
local vers # a product version
local flocscsv # locations CSV output file
local fhostscsv # locations CSV output file
write( &errout, &progname
, "\nCopyright (c) 2008 Arthur Eschenlauer ([EMAIL PROTECTED])\n"
, "This program was not developed using Target's time or resources.\n")
# check arguments
( *argv = 4 ) | stop("usage: " || &progname || USAGE )
every writes( &errout, &storage, "\t" ) ; write(&errout, "storage before")
# open first argument and read configuration Rscan data structure
scan := do_scan( argv[1], argv[2] ) |
stop("cannot scan " || argv[1] || "\nusage: " || &progname || USAGE )
every writes( &errout, &storage, "\t" ) ; write(&errout, "storage between")
# write csv of statistics for each configuration
# and write csv of configuration for each host
write_config_stats( scan, open(argv[3],"w"), open(argv[4],"w") ) |
stop("cannot write output files\nusage: " || &progname || USAGE )
every writes( &errout, &storage, "\t" ) ; write(&errout, "storage after")
# print hosts grouped by configuration
# every cnfg := key( scan.T_scnfg_Shost )
# do {
# write( "\nConfiguration:\n", cnfg )
# write( "Hosts:" )
# every write( !sort(scan.T_scnfg_Shost[cnfg]) )
# }
end
procedure write_config_stats( scan, fcnfgcsv, fhostcsv )
# write csv of statistics for each configuration
local sprdnms # a CSV string of statistics and product-names
local sprdvals # a CSV string of statistics and product-values
local shostcfg # a CSV string of location, host, and product-versions
local shosthdr # header for fhostcsv
local Lprdnm # a list of product-names
local cnfg # a string-representation of a configuration
local Tprdnv # a table mapping product-name to product-version
local sprdkey # a product-name
local sloc # a location name
local Sslocs # a set of location names
local shost # a host name
local LLhostcfgs # a list of [host-name,cnfg] lists
local Lhostcfg # a [host-name,cnfg] list
local Lslocs # a sorted list of loc-names
local Sscnfg # a set of configuration strings
# 1. get a sorted list of product names
Lprdnm := sort( scan.S_sprdnm )
# 2. make a CSV of column headers - "locations","hosts",product-names
sprdnms := ""
every sprdkey := ! ( ["locations","hosts"] ||| Lprdnm ) do {
sprdkey := \scan.T_scinm_scilbl[sprdkey]
sprdnms := extend_csv( sprdnms, sprdkey )
}
# 3. output the headers for the configuration-statistics file
write( fcnfgcsv, sprdnms ) | fail
# 4. make a CSV of column headers - "location","host",product-names
sprdnms := ""
every sprdkey := ! ( ["location","host"] ||| Lprdnm ) do {
sprdkey := \scan.T_scinm_scilbl[sprdkey]
sprdnms := extend_csv( sprdnms, sprdkey )
}
shosthdr := sprdnms # save for later ...
# 6. for each configuration row
every cnfg := key( scan.T_scnfg_Shost )
do {
# 7. populate the first two columns
sprdnms := *(scan.T_scnfg_Sloc[cnfg]) || "," ||
*(scan.T_scnfg_Shost[cnfg]) || ","
# 8. split cnfg string into a table mapping product-name to product-version
Tprdnv := ttv2table( cnfg )
# 9. populate the rest of the columns with values from the table
sprdvals := ""
every sprdkey := ! Lprdnm do sprdvals := extend_csv( sprdvals,
Tprdnv[sprdkey] )
# 10. output the result
write( fcnfgcsv, sprdnms, sprdvals ) | fail
}
# 11. output the headers for the host-configurations file
write( fhostcsv, shosthdr ) | fail
LLhostcfgs := sort( scan.T_shost_scnfg, 1 )
# 12. for each host row
every Lhostcfg := ! LLhostcfgs
do {
shost := Lhostcfg[1] ; cnfg := Lhostcfg[2]
# 13. populate the first two columns
shostcfg := "\"" || scan.T_shost_sloc[ shost ] || "\",\"" || shost ||
"\","
# 14. split cnfg string into a table mapping product-name to product-version
Tprdnv := ttv2table( cnfg )
# 15. populate the rest of the columns with values from the table
sprdvals := ""
every sprdkey := ! Lprdnm do sprdvals := extend_csv( sprdvals,
Tprdnv[sprdkey] )
# 16. output the result
write( fhostcsv, shostcfg, sprdvals ) | fail
}
# check for locs with more than one cnfg
Sslocs := set( )
every shost := key( scan.T_shost_sloc )
do insert( Sslocs, scan.T_shost_sloc[shost] )
Lslocs := sort( Sslocs )
every sloc := ! Lslocs do {
Sscnfg := set( )
every insert( Sscnfg, scan.T_shost_scnfg[ ! scan.T_sloc_Shost[sloc] ] )
if *Sscnfg > 1 then {
write(&output, "Location ", sloc, " has ", *Sscnfg, " configurations.")
}
}
return # produce &null
end
procedure ttv2table(s)
# split a string in the format ( key PRD_VER_SEP value TAB )* into a table
local T
T := table( "" )
s ? {
while insert( T
, tab(find(PRD_VER_SEP))
, 2( =PRD_VER_SEP, 1( tab(upto('\t')), move(1) ) | tab(0) )
)
}
return T
end
procedure table2ttv(T)
# sort a table and join the results into a tab-terminated value string
# with PRD_VER_SEP as the separator between the product name and the version
local sorted, result, Lkvp
# first, sort into list of [key,value] lists, ordered by key
sorted := sort( T, 1 )
# next, concatenate values onto result string
result := ""
every Lkvp := ! sorted
do ( result ||:= Lkvp[1], result ||:= PRD_VER_SEP
, result ||:= Lkvp[2], result ||:= "\t" )
return result
end
procedure sort_ttv(s)
return table2ttv( ttv2table(s) )
end
procedure extend_csv( csv, str )
# append to (a CSV string) a quoted string that starts with a non-breaking space
local esc_quot ; esc_quot := "" ; str ? {
while esc_quot ||:= tab(upto('"')+1)||"\""
esc_quot ||:= tab(0)
}
if * csv = 0
then csv := TXT_STRT || esc_quot || TXT_END
else csv ||:= "," || TXT_STRT || esc_quot || TXT_END
return csv
end
procedure do_scan( cii_fn, ciipv_fn )
# procedure do_scan builds an Rscan data structure
local cii_file, ciipv_file, cii, ciipv, pv_key, scan, scnfg, scii
local cii_count, loop_time
# validate arguments
\cii_fn | stop("missing argument 1 (configuration-item instance file)")
\ciipv_fn | stop("missing argument 2 (config-item product-version file)")
# open file of configuration-item instances
cii_file := open( cii_fn, "r" ) |
stop("could not open file " || cii_fn || " for reading" )
# open file relating configuration-item versions to product versions
ciipv_file := open( ciipv_fn, "r" ) |
stop("could not open file " || ciipv_fn || " for reading" )
# construct the Rscan instance
scan := Rscan(
set( ), # S_sloc - set of location-name strings
set( ), # S_shost - set of host-name strings
table( ), # T_sloc_Shost - table maps location-name to set of host-names
table(""), # T_shost_scnfg - table maps host-name to config string
table( ), # T_scnfg_Shost - table maps config str to set of host-names
table( ), # T_scnfg_Sloc - table maps config str to set of location-names
table( ), # T_scnfg_sprdvr - table maps config str to product-version-str
set( ), # S_sprdnm - set of product-name strings
table( ), # T_scinm_scilbl - table mapping config item name to label for output
table( ) # T_shost_sloc - table mapping host-name string to location-name
)
# read the relationships of configuration-item versions to product versions
cii_count := 0; loop_time := &time
while ciipv := read_ciiPrdVrsn(ciipv_file, scan) do {
cii_count +:= 1
pv_key := ciipv.sSize || "\t" || ciipv.sModified || "\t" ||
ciipv.sFilename || "\t" || ciipv.sConfigItemName
insert( scan.T_scnfg_sprdvr
, pv_key
, ciipv.sConfigItemName || "\t" ||
ciipv.sVersion || "\t" ||
ciipv.sPackageCode || "\t" ||
ciipv.sProductId
)
}
close( ciipv_file ) # done reading, so close, then
write( &errout, "Read ", cii_count, " product versions in ", loop_time :=
&time - loop_time, " ms")
write( &errout, "Read ", cii_count / (if loop_time > 0 then loop_time else
1), " product versions per ms")
# reopen relationships file for appending any newly discovered cfg itm vrsns
ciipv_file := open( ciipv_fn, "a" ) |
stop("could not open file " || ciipv_fn || " for appending" )
# process the configuration item instance data into the Rscan structure
cii_count := 0; loop_time := &time
while cii := read_ciitmp( cii_file ) do {
cii_count +:= 1
# insert location-name into set of location-names
insert( scan.S_sloc, cii.sloc )
# insert host-name into set of host-names
insert( scan.S_shost, cii.shost )
# create host-name set for location-name if the set does not exist
/scan.T_sloc_Shost[ cii.sloc ] := set( )
# insert host-name into set mapped from location-name
insert( scan.T_sloc_Shost[ cii.sloc ], cii.shost )
# get config string (empty string by default) mapped from host-name
scnfg := scan.T_shost_scnfg[ cii.shost ]
$ifdef UNORDERED_INPUT
# sort the result
scnfg := sort_ttv( scnfg )
$endif
# delete location-name from set (if it exists) for old host config string
delete( \scan.T_scnfg_Sloc[ scnfg ], cii.sloc )
# delete host-name from set (if it exists) for old host config string
delete( \scan.T_scnfg_Shost[ scnfg ], cii.shost )
# construct config string
scii := cii.ssize || "\t" || cii.smodified || "\t" ||
cii.sfilename || "\t" || cii.scnfgitem
# transform config string to product-version info
if /scan.T_scnfg_sprdvr[ scii ]
then {
ciipv := RciiPrdVrsn( "", "", "", "", "", "", "" )
ciipv.sFilename := cii.sfilename
ciipv.sModified := cii.smodified
ciipv.sSize := cii.ssize
ciipv.sVersion := cii.smodified[-4:0] || "." || cii.smodified[1:3] ||
"." || cii.smodified[4:6] || "-" || strip_commas(cii.ssize)
ciipv.sConfigItemName := cii.scnfgitem
write( ciipv_file, ciipv.sFilename || "\t" || ciipv.sModified || "\t"
|| ciipv.sSize ||
"\t" || ciipv.sVersion || "\t\t" || ciipv.sConfigItemName || "\t" )
insert( scan.T_scnfg_sprdvr
, scii
, ciipv.sConfigItemName || "\t" ||
ciipv.sVersion || "\t" ||
ciipv.sPackageCode || "\t" ||
ciipv.sProductId
)
}
\scan.T_scnfg_sprdvr[ scii ] | stop( "T_scnfg_sprdvr[" || scii || "] is null" )
# modify config-string for host, saving a reference in scnfg
scnfg :=
( scan.T_shost_scnfg[ cii.shost ] ||:=
( scan.T_scnfg_sprdvr[ scii ] ?
tab(upto('\t')) || (move(1),PRD_VER_SEP) || tab(upto('\t'))
) || "\t"
)
$ifdef UNORDERED_INPUT
# sort the result
scnfg := sort_ttv( scnfg )
$endif
# create location-name set for new config string if the set does not exist
/scan.T_scnfg_Sloc[ scnfg ] := set( )
# create host-name set for new config string if the set does not exist
/scan.T_scnfg_Shost[ scnfg ] := set( )
# insert location-name into set for new host config string
insert( scan.T_scnfg_Sloc[ scnfg ], cii.sloc )
# insert host-name into set for new host config string
insert( scan.T_scnfg_Shost[ scnfg ], cii.shost )
# map host-name to location-name
scan.T_shost_sloc[ cii.shost ] := cii.sloc
}
write( &errout, "Read ", cii_count, " config item instances in ", loop_time
:= &time - loop_time, " ms")
write( &errout, "Read ", cii_count / (if loop_time > 0 then loop_time else
1), " config item instances per ms")
# for each configuration string
every scnfg := key( scan.T_scnfg_Shost ) do {
# if a configuration string has no corresponding host...
if * scan.T_scnfg_Shost[scnfg] = 0
then { # ... then discard it
delete( scan.T_scnfg_Shost, scnfg )
}
else { # ... else put its product-name into scan's product-name set
scnfg ? while insert( scan.S_sprdnm
, 1( tab(find(PRD_VER_SEP)), tab(upto('\t')),
move(1) ) )
}
}
# return data structure describing configuration of all hosts and locs
return scan
end
procedure strip_commas( s )
local result
result := ""
s ? while not pos(0) do result ||:= 1( tab( upto(',') ), move(1) ) | tab( 0 )
return result
end
procedure read_ciiPrdVrsn( ciipv_file, scan )
local ciipv, line, item_name, item_label
ciipv := RciiPrdVrsn( "", "", "", "", "", "", "" )
while line := read(ciipv_file) do line ? {
# if this is an item ... label line, process it and skip it
( ="item" || tab(many(' \t')) ||
(item_name := tab(upto(' \t'))) || tab(many(' \t')) ||
="label" || tab(many(' \t')) || (item_label := tab(0))
, insert( scan.T_scinm_scilbl, item_name, item_label )
, next
)
# Okay, see if this is a data line ...
ciipv.sFilename := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sFilename", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sFilename", "'", line, "'"), next )
ciipv.sModified := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sModified", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sModified", "'", line, "'"), next )
ciipv.sSize := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sSize", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sSize", "'", line, "'"), next )
ciipv.sVersion := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sVersion", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sVersion", "'", line, "'"), next )
ciipv.sPackageCode := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sPackageCode", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sPackageCode", "'", line, "'"),
next )
ciipv.sConfigItemName := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sConfigItemName", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sConfigItemName", "'", line, "'"),
next )
ciipv.sProductId := ( tab( 0 ) ) | ""
return ciipv
}
end
procedure read_ciitmp( cii_file )
local cii, line
static nottab
initial nottab := &cset -- '\t'
cii := Rciitmp( "", "", "", "", "", "" )
while line := read(cii_file) do line ? {
cii.sloc := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sloc", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sloc", "'", line, "'"), next )
cii.shost := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read shost", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after shost", "'", line, "'"), next )
cii.smodified := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read smodified", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after smodified", "'", line, "'"), next )
cii.ssize := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read ssize", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after ssize", "'", line, "'"), next )
cii.sfilename := ( tab( upto('\t' ) ) ) |
( write(&errout, "could not read sfilename", "'", line, "'"), next )
move(1) |
( write(&errout, "could not move after sfilename", "'", line, "'"), next )
cii.scnfgitem := ( move(1) || tab( many( nottab ) ) ) |
( write(&errout, "could not read scnfgitem", "'", line, "'"), next )
pos(0) |
( write(&errout, "\nnot at end of line after scnfgitem:\n ", "'", line,
"'\n")
, next )
return cii
}
end
# vim: et sw=2 ts=2 ai :