I realized that identical Icon/Unicon strings are implemented to share the
same actual storage of a sequence of bytes in memory (since they are
immutable, this is okay).
Thus, I could simply use a concatenation of the configuration-item versions
to represent each configuration; as long as I get or sort the configuration
items into the same order, this does not have a severe impact on physical
storage.
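
For anyone who wants the idea without reading the whole attachment, here is
a minimal sketch of it. It is not taken from the attached program: the host
names, item names, and versions are made up, and it relies only on the fact
that Icon tables compare string keys by value.

```icon
# Sketch only: group hosts by a configuration string built from sorted
# "item, version" entries.  All data below is hypothetical.
procedure main()
  local host_items, by_config, host, item, cfg

  # hypothetical input: host name -> list of "item, version" strings
  host_items := table()
  host_items["hostA"] := ["agent, 2.1", "driver, 1.0"]
  host_items["hostB"] := ["driver, 1.0", "agent, 2.1"]  # same items, other order
  host_items["hostC"] := ["agent, 2.2", "driver, 1.0"]

  by_config := table()   # configuration string -> set of host names
  every host := key(host_items) do {
    # sort first, so that equal configurations yield equal strings
    cfg := ""
    every item := !sort(host_items[host]) do cfg ||:= item || "\t"
    # create the host set on first use, then record the host
    /by_config[cfg] := set()
    insert(by_config[cfg], host)
  }

  # report how many hosts share each configuration (order is unspecified)
  every cfg := key(by_config) do
    write(*(by_config[cfg]), " host(s): ", cfg)
end
```

In this sketch hostA and hostB end up in the same set because their sorted
entries concatenate to the same string, so no set comparison is needed.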

Attached is my solution in case anyone is curious - it is less readable
than I would like because there were too many kinds of variables and
structures to make "plain" names meaningful.  It took less than a week of my
spare time, once I had the plan figured out.  However, I believe that
writing a program to do something this complicated in any other language
would have taken MUCH longer (as would the debugging).

Now I have something that I can extend to process an arbitrary number of
configuration items for an arbitrary number of hosts.  Right now, it
processes six configuration items for several thousand hosts in under a
second (I give it an input file that has been gathered over the network;
thus, the network-response time for each host does not affect the time
that it takes the program to run).

On Sun, September 14, 2008 1:21 pm, Art Eschenlauer said:
> Clint,
>
> You have understood my question and concisely phrased my concern ("Icon's
> built-in hash function is useless for sets, because sets are mutable").
> Choice B, or a variant of it (defining my own hash for a set, and
> maintaining a table with hashes as keys and sets as values), was all that
> I could come up with as an alternative to using seteq( ).
>
>
>
> Hugh,
>
> Here is a brief concrete summary of the problem:
> - There are hundreds of computers.
> - Each computer has a configuration consisting of a combination of
> versions of several dozen configuration items.
> - Each configuration item is defined by a unique id and a path to where a
> file may be located on a computer.
> - Each configuration item may be absent or may be one of up to a dozen
> versions.
> - Each version of a configuration item is distinguished (from other
> versions of the same configuration item) by its file size and last
> modification date.
> - I want to know all of the configurations, how many there are of each,
> and for each configuration, which computers have it.
>
> Right now, I am fetching the configurations, reading them into tables of a
> relational database, and performing SQL queries to get the information
> that I want.  But this approach takes a lot of storage, and every time
> that I want to add a new file, I have to rewrite a bunch of queries.  So,
> I was trying for a more elegant solution in Icon.
>
>
############################################################################
#
# File:     vrsncmbntns.icn
#
# Subject:  Summarizes combinations of configuration items on several hosts
#
# Author:   Arthur C. Eschenlauer
#
# Date:     October 1, 2008
#
# Version:  0.5
#
############################################################################
#
#   This file is in the public domain.
#   The freedom of its content is protected by the Lesser GNU public
#     license, version 2.1, February 1999,
#     http://www.gnu.org/licenses/lgpl.html
#   which means you are granted permission to use this in any way that 
#   does not limit others' freedom to use it.
#
############################################################################
# Using the definitions below, this program 
#   1. Reads from a tab-separated-value "config-item-instance-file" 
#      the versions of configuration items for several hosts
#   2. Reads from a tab-terminated-value "config-item-product-version-file"
#      the product versions associated with versions of configuration items
#   3. Writes a summary of the number of hosts in each configuration
#   4. Writes the configuration for each host
#   5. If any configuration-item versions are not in the "config-item-
#      product-version-file", a meaningful dummy entry is added; this 
#      value may then be used to retrieve the name of the affected host
#      using the host-configuration file
#
# The file formats are as follows:
#   1. config-item-instance-file (tab-separated-values):
#      a. Location name
#      b. Host name
#      c. Last modification date of config item 
#      d. File size of config item, in bytes
#      e. File name of config item, without path
#      f. Name of config item
#   2. config-item-product-version file (mostly tab-terminated-values)
#      There are actually two types of lines in this file
#      a. Tab-terminated values
#         i.   File name of config item, without path
#         ii.  Last modification date of config item 
#         iii. File size of config item, in bytes
#         iv.  Version number of package to which config item belongs
#         v.   Optional package code (unique ID for a version of a config item)
#         vi.  Name of config item
#         vii. Optional Product code (unique ID for a config item, common among
#              all versions)
#      b. Aliases mapping config item names to labels for output files 
#         (no tabs required), match the pattern
#           ="item"                         || tab(many(' \t')) || 
#           (item_name := tab(upto(' \t'))) || tab(many(' \t')) ||
#           ="label"                        || tab(many(' \t')) || 
#           (item_label := tab(0))
#   3. configuration-summary file (comma-separated-values)
#      a. First column, number of locations with a combination of config items
#      b. Second column, number of hosts with a combination of config items
#      c. Remaining columns, the combination of config-item-versions, i.e.,
#         the version numbers of the config items in the combination
#   4. host-configuration file (comma-separated-values)
#      a. First column, name of location
#      b. Second column, name of host
#      c. Remaining columns, the combination of config-item-versions, i.e.,
#         the version numbers of the config items in the combination
############################################################################

$define USAGE " config-item-instance-file config-item-product-version-file stats-csv hosts-csv"

$define PRD_VER_SEP ", "

$define TXT_STRT "\""

# use the first definition of TXT_END for Microsoft Excel
# use the second definition for programs that are CSV-conformant
$define TXT_END "\"" # Excel may mess up things that look like numbers
# $define TXT_END " \xA0\"" # Excel plays well with numbers - others suffer

# define UNORDERED_INPUT if order of config items for a host is
# non-deterministic
# $define UNORDERED_INPUT 1

# record representing a scan of configuration of each host at all locations
record Rscan( 
  S_sloc,        # set of location-name strings
  S_shost,       # set of host-name strings
  T_sloc_Shost,  # table mapping location-name string to set of host-names
  T_shost_scnfg, # table mapping host-name string to config string
  T_scnfg_Shost, # table mapping config string to set of host-names
  T_scnfg_Sloc,  # table mapping config string to set of location-names
  T_scnfg_sprdvr,# table mapping config string to product version string
  S_sprdnm,      # set of product-name strings
  T_scinm_scilbl,# table mapping config item name to label for output
  T_shost_sloc   # table mapping host-name string to location-name string
)
# (temporary) record representing configuration item instance
record Rciitmp(
  sloc,          # string - name of location of host
  shost,         # string - name of host
  smodified,     # string - modification date of file
  ssize,         # string - size of file
  sfilename,     # string - file name
  scnfgitem      # string - configuration item
)
# record representing a configuration item's product version
record RciiPrdVrsn(
  sFilename,       # string - matches Rciitmp.sfilename
  sModified,       # string - matches Rciitmp.smodified
  sSize,           # string - matches Rciitmp.ssize
  sVersion,        # string - may not be empty; package version
  sPackageCode,    # string - may be empty; MSI Package code GUID
  sConfigItemName, # string - matches Rciitmp.scnfgitem
  sProductId       # string - may be empty; MSI Product Code GUID
)
procedure main( argv )
  local scan      # an Rscan record
  local cnfg      # a string-representation of a configuration
  local loc       # a location-name
  local host      # a host-name
  local vers      # a product version 
  local flocscsv  # locations CSV output file
  local fhostscsv # hosts CSV output file

  write( &errout, &progname
       , "\nCopyright (c) 2008 Arthur Eschenlauer ([EMAIL PROTECTED])\n"
       , "This program was not developed using Target's time or resources.\n")
  # check arguments
  ( *argv = 4 ) | stop("usage: " || &progname || USAGE )

  every writes( &errout, &storage, "\t" ) ; write(&errout, "storage before")
  
  # open first argument and read configuration Rscan data structure
  scan := do_scan( argv[1], argv[2] ) | 
    stop("cannot scan " || argv[1] || "\nusage: " || &progname || USAGE )

  every writes( &errout, &storage, "\t" ) ; write(&errout, "storage between")
  
  # write csv of statistics for each configuration
  #   and write csv of configuration for each host
  write_config_stats( scan, open(argv[3],"w"), open(argv[4],"w") ) |
    stop("cannot write output files\nusage: " || &progname || USAGE )

  every writes( &errout, &storage, "\t" ) ; write(&errout, "storage after")
  
  # print hosts grouped by configuration
  # every cnfg := key( scan.T_scnfg_Shost ) 
  #   do {
  #     write( "\nConfiguration:\n", cnfg )
  #     write( "Hosts:" )
  #     every write( !sort(scan.T_scnfg_Shost[cnfg]) )
  #   }
end

procedure write_config_stats( scan, fcnfgcsv, fhostcsv )
  # write csv of statistics for each configuration
  local sprdnms   # a CSV string of statistics and product-names
  local sprdvals  # a CSV string of statistics and product-values
  local shostcfg  # a CSV string of location, host, and product-versions
  local shosthdr  # header for fhostcsv
  local Lprdnm    # a list of product-names
  local cnfg      # a string-representation of a configuration
  local Tprdnv    # a table mapping product-name to product-version
  local sprdkey   # a product-name
  local sloc      # a location name
  local Sslocs    # a set of location names
  local shost     # a host name
  local LLhostcfgs # a list of [host-name,cnfg] lists
  local Lhostcfg  # a [host-name,cnfg] list
  local Lslocs    # a sorted list of loc-names
  local Sscnfg    # a set of configuration strings
  #   1. get a sorted list of product names
  Lprdnm := sort( scan.S_sprdnm )
  #   2. make a CSV of column headers - "locations","hosts",product-names 
  sprdnms := ""
  every sprdkey := ! ( ["locations","hosts"] ||| Lprdnm ) do {
    sprdkey := \scan.T_scinm_scilbl[sprdkey]
    sprdnms := extend_csv( sprdnms, sprdkey )
  }
  #   3. output the headers for the configuration-statistics file
  write( fcnfgcsv, sprdnms ) | fail
  #   4. make a CSV of column headers - "location","host",product-names 
  sprdnms := ""
  every sprdkey := ! ( ["location","host"] ||| Lprdnm ) do {
    sprdkey := \scan.T_scinm_scilbl[sprdkey]
    sprdnms := extend_csv( sprdnms, sprdkey )
  }
  shosthdr := sprdnms # save for later ...
  #   6. for each configuration row
  every cnfg := key( scan.T_scnfg_Shost ) 
    do {
      #   7. populate the first two columns
      sprdnms := *(scan.T_scnfg_Sloc[cnfg]) || "," || 
                 *(scan.T_scnfg_Shost[cnfg]) || ","
      #   8. split cnfg string into a table mapping product-name to
      #      product-version
      Tprdnv := ttv2table( cnfg )
      #   9. populate the rest of the columns with values from the table
      sprdvals := ""
      every sprdkey := ! Lprdnm
        do sprdvals := extend_csv( sprdvals, Tprdnv[sprdkey] )
      #  10. output the result
      write( fcnfgcsv, sprdnms, sprdvals ) | fail
    }
  #  11. output the headers for the host-configurations file
  write( fhostcsv, shosthdr ) | fail
  LLhostcfgs := sort( scan.T_shost_scnfg, 1 )
  #  12. for each host row
  every Lhostcfg := ! LLhostcfgs 
    do {
      shost := Lhostcfg[1] ; cnfg := Lhostcfg[2]
      #  13. populate the first two columns
      shostcfg := "\"" || scan.T_shost_sloc[ shost ] || "\",\"" || shost ||
                  "\","
      #  14. split cnfg string into a table mapping product-name to
      #      product-version
      Tprdnv := ttv2table( cnfg )
      #  15. populate the rest of the columns with values from the table
      sprdvals := ""
      every sprdkey := ! Lprdnm
        do sprdvals := extend_csv( sprdvals, Tprdnv[sprdkey] )
      #  16. output the result
      write( fhostcsv, shostcfg, sprdvals ) | fail
    }
  # check for locs with more than one cnfg
  Sslocs := set( )
  every shost := key( scan.T_shost_sloc ) 
    do insert( Sslocs, scan.T_shost_sloc[shost] )
  Lslocs := sort( Sslocs )
  every sloc := ! Lslocs do {
    Sscnfg := set( )
    every insert( Sscnfg, scan.T_shost_scnfg[ ! scan.T_sloc_Shost[sloc] ] )
    if *Sscnfg > 1 then {
      write(&output, "Location ", sloc, " has ", *Sscnfg, " configurations.")
    }
  }
  return # produce &null
end

procedure ttv2table(s)
  # split a list in the format ( key PRD_VER_SEP value TAB )* into a table
  local T
  T := table( "" )
  s ? {
    while insert( T
                , tab(find(PRD_VER_SEP))
                , 2( =PRD_VER_SEP, 1( tab(upto('\t')), move(1) ) | tab(0) )
                )
    }
  return T
end

procedure table2ttv(T)
  # sort a table and join the results into a tab-terminated value string
  #   with PRD_VER_SEP as the separator between the product name and the version
  local sorted, result, Lkvp
  # first, sort into list of [key,value] lists, ordered by key
  sorted := sort( T, 1 )
  # next, concatenate values onto result string
  result := ""
  every Lkvp := ! sorted 
    do ( result ||:= Lkvp[1], result ||:= PRD_VER_SEP
       , result ||:= Lkvp[2], result ||:= "\t"        )
  return result
end

procedure sort_ttv(s)
  return table2ttv( ttv2table(s) )
end

procedure extend_csv( csv, str )
  # append to (a CSV string) a quoted string, doubling any embedded quotes;
  #   TXT_END may add a trailing non-breaking space (see its definition)
  local esc_quot ; esc_quot := "" ; str ? {
    while esc_quot ||:= tab(upto('"')+1)||"\""
    esc_quot ||:= tab(0)
  }
  if * csv = 0
    then csv := TXT_STRT || esc_quot || TXT_END
    else csv ||:= "," || TXT_STRT || esc_quot || TXT_END
  return csv
end

procedure do_scan( cii_fn, ciipv_fn )
  # procedure do_scan builds an Rscan data structure
  local cii_file, ciipv_file, cii, ciipv, pv_key, scan, scnfg, scii
  local cii_count, loop_time
  # validate arguments
  \cii_fn | stop("missing argument 1 (configuration-item instance file)")
  \ciipv_fn | stop("missing argument 2 (config-item product-version file)")
  # open file of configuration-item instances
  cii_file := open( cii_fn, "r" ) | 
    stop("could not open file " || cii_fn || " for reading" )
  # open file relating configuration-item versions to product versions
  ciipv_file := open( ciipv_fn, "r" ) | 
    stop("could not open file " || ciipv_fn || " for reading" )
  # construct the Rscan instance
  scan := Rscan( 
    set( ),    # S_sloc         - set of location-name strings
    set( ),    # S_shost        - set of host-name strings
    table( ),  # T_sloc_Shost   - table maps location-name to set of host-names
    table(""), # T_shost_scnfg  - table maps host-name to config string
    table( ),  # T_scnfg_Shost  - table maps config str to set of host-names
    table( ),  # T_scnfg_Sloc   - table maps config str to set of location-names
    table( ),  # T_scnfg_sprdvr - table maps config str to product-version-str
    set( ),    # S_sprdnm       - set of product-name strings
    table( ),  # T_scinm_scilbl - table maps config item name to output label
    table( )   # T_shost_sloc   - table maps host-name to location-name
  )
  # read the relationships of configuration-item versions to product versions
  cii_count := 0; loop_time := &time 
  while ciipv := read_ciiPrdVrsn(ciipv_file, scan) do {
    cii_count +:= 1
    pv_key := ciipv.sSize || "\t" || ciipv.sModified || "\t" || 
      ciipv.sFilename || "\t" || ciipv.sConfigItemName
    insert( scan.T_scnfg_sprdvr
          , pv_key
          , ciipv.sConfigItemName || "\t" ||
            ciipv.sVersion || "\t" || 
            ciipv.sPackageCode || "\t" || 
            ciipv.sProductId
          )
  }
  close( ciipv_file ) # done reading, so close, then
  write( &errout, "Read ", cii_count, " product versions in "
       , loop_time := &time - loop_time, " ms")
  write( &errout, "Read "
       , cii_count / (if loop_time > 0 then loop_time else 1)
       , " product versions per ms")
  # reopen relationships file for appending any newly discovered cfg itm vrsns
  ciipv_file := open( ciipv_fn, "a" ) | 
    stop("could not open file " || ciipv_fn || " for appending" )
  # process the configuration item instance data into the Rscan structure
  cii_count := 0; loop_time := &time 
  while cii := read_ciitmp( cii_file ) do {
    cii_count +:= 1
    # insert location-name into set of location-names
    insert( scan.S_sloc, cii.sloc ) 
    # insert host-name into set of host-names
    insert( scan.S_shost, cii.shost )
    # create host-name set for location-name if the set does not exist
    /scan.T_sloc_Shost[ cii.sloc ] := set( )
    # insert host-name into set mapped from location-name
    insert( scan.T_sloc_Shost[ cii.sloc ], cii.shost )
    # get config string (empty string by default) mapped from host-name
    scnfg := scan.T_shost_scnfg[ cii.shost ]
    $ifdef UNORDERED_INPUT
      # sort the result
      scnfg := sort_ttv( scnfg )
    $endif
    # delete location-name from set (if it exists) for old host config string
    delete( \scan.T_scnfg_Sloc[  scnfg ], cii.sloc  )
    # delete host-name from set (if it exists) for old host config string
    delete( \scan.T_scnfg_Shost[ scnfg ], cii.shost )
    # construct config string
    scii := cii.ssize || "\t" || cii.smodified || "\t" || 
            cii.sfilename || "\t" || cii.scnfgitem
    # transform config string to product-version info
    if /scan.T_scnfg_sprdvr[ scii ]
      then  {
        ciipv := RciiPrdVrsn( "", "", "", "", "", "", "" )
        ciipv.sFilename := cii.sfilename
        ciipv.sModified := cii.smodified
        ciipv.sSize     := cii.ssize
        ciipv.sVersion  := cii.smodified[-4:0] || "." || cii.smodified[1:3] ||
           "." || cii.smodified[4:6] || "-" || strip_commas(cii.ssize)
        ciipv.sConfigItemName := cii.scnfgitem
        write( ciipv_file, ciipv.sFilename || "\t" || ciipv.sModified || "\t" ||
          ciipv.sSize || "\t" || ciipv.sVersion || "\t\t" ||
          ciipv.sConfigItemName || "\t" )
        insert( scan.T_scnfg_sprdvr
              , scii
              , ciipv.sConfigItemName || "\t" ||
                ciipv.sVersion || "\t" || 
                ciipv.sPackageCode || "\t" || 
                ciipv.sProductId
              )
      }
    \scan.T_scnfg_sprdvr[ scii ] |
      stop( "T_scnfg_sprdvr[" || scii || "] is null" )
    # modify config-string for host, saving a reference in scnfg
    scnfg := 
      ( scan.T_shost_scnfg[ cii.shost ] ||:= 
        ( scan.T_scnfg_sprdvr[ scii ] ? 
          tab(upto('\t')) || (move(1),PRD_VER_SEP) || tab(upto('\t')) 
        ) || "\t" 
      )
    $ifdef UNORDERED_INPUT
      # sort the result
      scnfg := sort_ttv( scnfg )
    $endif
    # create location-name set for new config string if the set does not exist
    /scan.T_scnfg_Sloc[  scnfg ] := set( )
    # create host-name set for new config string if the set does not exist
    /scan.T_scnfg_Shost[ scnfg ] := set( )
    # insert location-name into set for new host config string
    insert( scan.T_scnfg_Sloc[  scnfg ], cii.sloc  )
    # insert host-name into set for new host config string
    insert( scan.T_scnfg_Shost[ scnfg ], cii.shost )
    # map host-name to location-name
    scan.T_shost_sloc[ cii.shost ] := cii.sloc
  }
  write( &errout, "Read ", cii_count, " config item instances in "
       , loop_time := &time - loop_time, " ms")
  write( &errout, "Read "
       , cii_count / (if loop_time > 0 then loop_time else 1)
       , " config item instances per ms")
  # for each configuration string
  every scnfg := key( scan.T_scnfg_Shost ) do {
    # if a configuration string has no corresponding host...
    if * scan.T_scnfg_Shost[scnfg] = 0 
      then { # ... then discard it
        delete( scan.T_scnfg_Shost, scnfg )
      }
      else { # ... else put its product-name into scan's product-name set
        scnfg ? while insert( scan.S_sprdnm
                            , 1( tab(find(PRD_VER_SEP)), tab(upto('\t')),
                                 move(1) ) )
      }
  }
  # return data structure describing configuration of all hosts and locs
  return scan
end

procedure strip_commas( s )
  local result
  result := ""
  s ? while not pos(0) do result ||:= 1( tab( upto(',') ), move(1) ) | tab( 0 )
  return result
end

procedure read_ciiPrdVrsn( ciipv_file, scan )
  local ciipv, line, item_name, item_label
  ciipv := RciiPrdVrsn( "", "", "", "", "", "", "" )
  while line := read(ciipv_file) do line ? {
    # if this is an item ... label line, process it and skip it
    ( ="item" || tab(many(' \t')) || 
      (item_name := tab(upto(' \t'))) || tab(many(' \t')) ||
      ="label" || tab(many(' \t')) || (item_label := tab(0))
    , insert( scan.T_scinm_scilbl, item_name, item_label )
    , next
    )
    # Okay, see if this is a data line ...
    ciipv.sFilename      := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read sFilename", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sFilename", "'", line, "'"), next )
    ciipv.sModified      := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read sModified", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sModified", "'", line, "'"), next )
    ciipv.sSize      := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read sSize", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sSize", "'", line, "'"), next )
    ciipv.sVersion      := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read sVersion", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sVersion", "'", line, "'"), next )
    ciipv.sPackageCode      := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read sPackageCode", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sPackageCode", "'", line, "'")
      , next )
    ciipv.sConfigItemName      := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read sConfigItemName", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sConfigItemName", "'", line, "'")
      , next )
    ciipv.sProductId      := ( tab( 0 ) ) | ""
    return ciipv
  }
end

procedure read_ciitmp( cii_file )
  local cii, line
  static nottab
  initial nottab := &cset -- '\t' 
  cii := Rciitmp( "", "", "", "", "", "" )
  while line := read(cii_file) do line ? {
    cii.sloc      := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read sloc", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sloc", "'", line, "'"), next )
    cii.shost     := ( tab( upto('\t' ) ) ) | 
      ( write(&errout, "could not read shost", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after shost", "'", line, "'"), next )
    cii.smodified := ( tab( upto('\t' ) ) ) |
      ( write(&errout, "could not read smodified", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after smodified", "'", line, "'"), next )
    cii.ssize     := ( tab( upto('\t' ) ) ) |
      ( write(&errout, "could not read ssize", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after ssize", "'", line, "'"), next )
    cii.sfilename := ( tab( upto('\t' ) ) ) |
      ( write(&errout, "could not read sfilename", "'", line, "'"), next )
    move(1) |
      ( write(&errout, "could not move after sfilename", "'", line, "'"), next )
    cii.scnfgitem := ( move(1) || tab( many( nottab ) ) ) |
      ( write(&errout, "could not read scnfgitem", "'", line, "'"), next )
    pos(0) |
      ( write(&errout, "\nnot at end of line after scnfgitem:\n  ", "'", line,
              "'\n")
      , next )
    return cii
  }
end
# vim: et sw=2 ts=2 ai :