Send Beginners mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."
Today's Topics:
1. Re: Noobie attempt to process log output into dependency
graph (John Lusk)
----------------------------------------------------------------------
Message: 1
Date: Wed, 14 Dec 2016 18:28:07 -0500
From: John Lusk <[email protected]>
To: The Haskell-Beginners Mailing List - Discussion of primarily
beginner-level topics related to Haskell <[email protected]>
Subject: Re: [Haskell-beginners] Noobie attempt to process log output
into dependency graph
Message-ID:
<cajqkmby+qd+jpkuicnoh6d6vrtjqao7j0tvakk1fj-ehgzt...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
(Or you could find it here: https://github.com/JohnL4/DependencyGraph)
On Wed, Dec 14, 2016 at 6:26 PM, John Lusk <[email protected]> wrote:
> Hi, all,
>
> Here's my question:
>
> I thought, for grins, I'd try to turn some log output into a dependency
> graph (using GraphViz's dot(1)). I'm having difficulty forcing my
> stateful paradigm into a functional one, so I need some help.
>
> If I was to do this with an imperative (stateful) language, I'd build a
> set of edges (or a map to a frequency count, really, since I'll use freq
> > 1 to add some output text noting the repeated occurrences), and then
> dump out the set elements to a text file that would look something like
> this fragment:
>
> a -> q
> q -> d
> d -> e [color=red]
> d -> f [color=red
>
> My big problem now is that if I process a subtree that looks like:
>
> a
> b
> c
> d
> b
> d
> e
>
> my current plan is to proces the first b-c-d subtree and then process the
> b-d-e subtree, *BUT* I need to pass the updated edge set to the second
> processing call, which is pretty stateful.
>
> Do I need to just bite the bullet and find some succinct way to do that,
> or is my entire approach just wrong, stuck in my stateful mindset?
>
> My (awful) code looks like this:
>
> -- Emit to stdout a series of dot(1) edges specifying dependencies.-- "A ->
> B" means "A depends on B".---- Build with 'ghc dependency-graph.hs'-- --
> Input is a text file containing lines as follows:-- (some indentation)
> (some extraneous text) (file-A) in (some directory)-- (some extra
> indentation) (some extraneous text) (file-B) in (some directory)-- (some
> indentation matching the first line above) (some extraneous text) (file-C) in
> (some directory)---- This means that file-A depends on file-B, but neither
> file-A nor file-B depend on file-C.---- Sample:--
> Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 :
> Processing SXA.Compass.Config.ViewModel.dll in C:\Program Files
> (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\--
> Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 :
> Adding C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager
> Client\7.2.5575.0\SXA.Compass.Config.ViewModel.dll (IsPresent=true) to
> assemblyList at beginning of GetAssemblyListEx()--
> Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 :
> Processing SXA.Compass.Config.Utils.dll in C:\Program Files
> (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\---- (Need to
> skip the line containing "Adding", and only process the ones containing
> "Processing".)-- -- Algorithm:-- Read first line, parse, remember
> indentation-- Repeat for other lines, but if indentation increases,
> store pair A -> B in hashset.-- At end, dump out hashset.
> -- import Debug.Trace-- import System.Environment-- import
> System.Console.GetOpt-- import Data.Maybe (fromMaybe)-- import
> Data.List.Splitimport Prelude -- hiding (readFile) -- Because we want the
> System.IO.Strict version-- import System.IO (hPutStr, hPutStrLn, stderr)--
> import System.IO.Strict-- import Control.Monad-- import System.Directory--
> import System.FilePathimport Text.Regex.TDFA-- import
> Text.Regex.TDFA.String-- import Text.Printf
> -- import qualified Data.Map.Lazy as Mapimport qualified Data.Map.Strict as
> Map
> ---------------------------------------------------------------- Test Datal1
> = " Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList()
> Information: 0 : Processing SXA.Compass.Config.ViewModel.dll\tin C:\\Program
> Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\"l2 = "
> Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0
> : Adding C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager
> Client\\7.2.5575.0\\SXA.Compass.Config.ViewModel.dll\t(IsPresent=true)\tto
> assemblyList at beginning of GetAssemblyListEx()"l3 = "
> Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 :
> Processing SXA.Compass.Config.Utils.dll\tin C:\\Program Files
> (x86)\\Allscripts Sunrise\\Clinical Manager
> Client\\7.2.5575.0\\"----------------------------------------------------------------
> Test Data Ends-- See http://stackoverflow.com/q/32149354/370611-- toRegex =
> makeRegexOpts defaultCompOpt{multiline=False} defaultExecOpt
> -- Escape parens?-- initialFillerRegex :: String-- initialFillerRegex =
> "Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList\\(\\) Information:
> 0 : Processing"
> -- Regex matching (marking) a line to be processed-- valuableLineRegex ::
> String-- valuableLineRegex = "\\bProcessing\\b"
> -- |Regex matching line to be parsedparseLineRegex :: StringparseLineRegex =
> "^(.* Information: 0 : Processing )([^ ]*)[ \t]+in (.*)" -- 3subexpressions
> main :: IO()main = do
> logContents <- getContents
> putStrLn $ unlines $ fst $ edges (parseIndent $ lines logContents) Map.empty
> ------------------------------------------------------------------ |Parses
> out the leading indentation of the given String into a string of spaces and
> the rest of the lineparseIndent :: String -> (String,String)parseIndent s =
> ((fourth $ (s =~ "^( *)(.*)" :: (String,String,String,[String]))) !! 0,
> (fourth $ (s =~ "^( *)(.*)" ::
> (String,String,String,[String]))) !! 1)
> ------------------------------------------------------------------ |Returns a
> list of strings describing edges in the form "a -> b /* comment */"edges ::
> [(String,String)] -- ^ Input tuples: (indent, restOfString)
> -> Map.Map String Int -- ^ Map of edges in form "a -> b" with a count of
> the number of times that edge occurs
> -> [String] -- ^ Output list of edge descriptions in form "a -> b
> optionalExtraText"
> edges [] edgeSet =
> (edgeDump $ Map.assocs edgeSet, 0)
> edges (lastLine:[]) edgeSet =
> (edgeDump $ Map.assocs edgeSet, 1)
> edges (fstLogLine:sndLogLine:[]) edgeSet =
> let fstFields = (snd fstLogLine) =~ parseLineRegex ::
> (String,String,String,[String])
> sndFields = (snd sndLogLine) =~ parseLineRegex ::
> (String,String,String,[String])
> in
> if length (fourth fstFields) == 0
> then error ("Unmatched: " ++ (first fstFields)) -- First line must always
> match
> else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
> then edges (fstLogLine:[]) edgeSet -- Skip useless line
> else if indentLength fstLogLine >= indentLength sndLogLine
> then edges (sndLogLine:[]) edgeSet -- Can't be an edge from first to
> second line; drop first line and keep going.
> else edges (sndLogLine:[])
> (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)
> edges (fstLogLine:sndLogLine:thdLogLine:logLines) edgeSet =
> let fstFields = (snd fstLogLine) =~ parseLineRegex ::
> (String,String,String,[String])
> sndFields = (snd sndLogLine) =~ parseLineRegex ::
> (String,String,String,[String])
> thdFields = (snd thdLogLine) =~ parseLineRegex ::
> (String,String,String,[String])
> in
> if length (fourth fstFields) == 0
> then error ("Unmatched: " ++ (first fstFields)) -- First line must always
> match
>
> else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
> then edges (fstLogLine:thdLogLine:logLines) edgeSet -- Skip useless line
>
> else if indentLength fstLogLine >= indentLength sndLogLine
> then [] -- Stop processing at outdent
>
> else
> -- Looking one of:
> -- 1
> -- 2 -- process 1 -> 2, then process 2.. as subtree
> -- 3 -- Need to process as subtree rooted at 2, then drop
> subtree (zero or more lines at same level as 3)
> -- or
> -- 1
> -- 2 -- processs, then drop this line (process 2.. as empty
> subtree?)
> -- 3
> -- or
> -- 1
> -- 2 -- process, then drop this line (drop entire subtree
> rooted at 1) (same as above, drop empty subtree? (2))
> -- 3
> -- or
> -- 1
> -- 2 -- same as above? Drop empty subtree rooted at 2
> -- 3
> edges (sndLogLine:thdLogLine:logLines) (Map.insertWith (+) ((fullName
> fstFields) ++ (fullName sndFields)) 1) -- now what? I need to pass the
> UPDATED edgeSet on to the next call, after the subtree rooted at 2 is dropped.
>
>
>
> then edges (sndLogLine:logLines) edgeSet -- Can't be an edge from first
> to second line; drop first line and keep going.
> else edges (sndLogLine:(takeWhile (increasingIndent $ length $ fst
> fstLogLine) logLines))
> (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)
> else ((fst $ edges (sndLogLine:logLines) edgeSet)
> ++ (fst $ edges (fstLogLine:(drop
> (snd $ edges (sndLogLine:logLines)
> edgeSet) -- # of lines processed
> logLines)) edgeSet),
> (snd $ edges (sndLogLine:logLines) edgeSet)
> + (snd $ edges (fstLogLine:(drop
> (snd $ edges (sndLogLine:logLines)
> edgeSet) -- # of lines processed
> logLines)) edgeSet)
> )
> ----------------------------------------------------------------fullname ::
> (String,String,String,[String]) -> Stringfullname
> (_,_,_,[_,fileName,directoryName]) = directoryName ++ fileName
> ------------------------------------------------------------------ |Edges
> from the first line to all following linesedgesFrom :: String --
> ^ First line
> -> [String] -- ^ Following lines
> -> Map.Map String Int -- ^ Set of edges built so far
> -> [String]edgesFrom a b c = []
> ------------------------------------------------------------------ |Return
> length of indent or errorindentLength :: (String,String,String,[String]) -- ^
> Regex match context
> -> Int -- ^ Length of
> indentindentLength (prefix,_,_,[]) = error $ "Not matched: " ++
> prefixindentLength (_,_,_,subexprs) =
> length $ subexprs !! 0
> ------------------------------------------------------------------ |Returns a
> list of edges, possibly with comments indicating occurrence counts >
> 1edgeDump :: [(String,Int)] -- ^ List of (edge,count) tuples
> -> [String] -- ^ List of edges, possibly
> w/commentsedgeDump [] = []edgeDump ((edge,count):rest)
> | count <= 1 = edge:(edgeDump rest)
> | otherwise = (edge ++ " /* " ++ (show count) ++ " occurrences
> */"):(edgeDump rest)
> ----------------------------------------------------------------first ::
> (a,b,c,d) -> afirst (x,_,_,_) = x
> fourth :: (a,b,c,d) -> dfourth (_,_,_,x) = x
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://mail.haskell.org/pipermail/beginners/attachments/20161214/d71c1195/attachment.html>
------------------------------
Subject: Digest Footer
_______________________________________________
Beginners mailing list
[email protected]
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
------------------------------
End of Beginners Digest, Vol 102, Issue 6
*****************************************