Beginners Digest, Vol 102, Issue 6

beginners-request Wed, 14 Dec 2016 15:29:17 -0800

Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org


You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  Noobie attempt to process log output into    dependency
      graph (John Lusk)


----------------------------------------------------------------------

Message: 1
Date: Wed, 14 Dec 2016 18:28:07 -0500
From: John Lusk <johnlu...@gmail.com>
To: The Haskell-Beginners Mailing List - Discussion of primarily
        beginner-level topics related to Haskell <beginners@haskell.org>
Subject: Re: [Haskell-beginners] Noobie attempt to process log output
        into    dependency graph
Message-ID:
        <cajqkmby+qd+jpkuicnoh6d6vrtjqao7j0tvakk1fj-ehgzt...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

(Or you could find it here: https://github.com/JohnL4/DependencyGraph)

On Wed, Dec 14, 2016 at 6:26 PM, John Lusk <johnlu...@gmail.com> wrote:

> Hi, all,
>
> Here's my question:
>
> I thought, for grins, I'd try to turn some log output into a dependency
> graph (using GraphViz's dot(1)). I'm having difficulty forcing my
> stateful paradigm into a functional one, so I need some help.
>
> If I were to do this with an imperative (stateful) language, I'd build a
> set of edges (or a map to a frequency count, really, since I'll use freq
> > 1 to add some output text noting the repeated occurrences), and then
> dump out the set elements to a text file that would look something like
> this fragment:
>
> a -> q
> q -> d
> d -> e [color=red]
> d -> f [color=red]
>
> My big problem now is that if I process a subtree that looks like:
>
> a
>   b
>     c
>     d
>   b
>     d
>     e
>
> my current plan is to process the first b-c-d subtree and then process the
> b-d-e subtree, *BUT* I need to pass the updated edge set to the second
> processing call, which is pretty stateful.
>
> Do I need to just bite the bullet and find some succinct way to do that,
> or is my entire approach just wrong, stuck in my stateful mindset?
>
> My (awful) code looks like this:
>
> -- Emit to stdout a series of dot(1) edges specifying dependencies.
> -- "A -> B" means "A depends on B".
> --
> -- Build with 'ghc dependency-graph.hs'
> --
> -- Input is a text file containing lines as follows:
> --      (some indentation) (some extraneous text) (file-A) in (some directory)
> --          (some extra indentation) (some extraneous text) (file-B) in (some directory)
> --      (some indentation matching the first line above) (some extraneous text) (file-C) in (some directory)
> --
> -- This means that file-A depends on file-B, but neither file-A nor file-B depend on file-C.
> --
> -- Sample:
> --    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.ViewModel.dll  in C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\
> --    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Adding C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\SXA.Compass.Config.ViewModel.dll (IsPresent=true)        to assemblyList at beginning of GetAssemblyListEx()
> --      Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.Utils.dll    in C:\Program Files (x86)\Allscripts Sunrise\Clinical Manager Client\7.2.5575.0\
> --
> -- (Need to skip the line containing "Adding", and only process the ones containing "Processing".)
> --
> -- Algorithm:
> --      Read first line, parse, remember indentation
> --      Repeat for other lines, but if indentation increases, store pair A -> B in hashset.
> --      At end, dump out hashset.
> -- import Debug.Trace
> -- import System.Environment
> -- import System.Console.GetOpt
> -- import Data.Maybe (fromMaybe)
> -- import Data.List.Split
> import Prelude -- hiding (readFile) -- Because we want the System.IO.Strict version
> -- import System.IO (hPutStr, hPutStrLn, stderr)
> -- import System.IO.Strict
> -- import Control.Monad
> -- import System.Directory
> -- import System.FilePath
> import Text.Regex.TDFA
> -- import Text.Regex.TDFA.String
> -- import Text.Printf
>
> -- import qualified Data.Map.Lazy as Map
> import qualified Data.Map.Strict as Map
> ----------------------------------------------------------------
> -- Test Data
> l1 = "    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.ViewModel.dll\tin C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\"
> l2 = "    Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Adding C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\SXA.Compass.Config.ViewModel.dll\t(IsPresent=true)\tto assemblyList at beginning of GetAssemblyListEx()"
> l3 = "      Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList() Information: 0 : Processing SXA.Compass.Config.Utils.dll\tin C:\\Program Files (x86)\\Allscripts Sunrise\\Clinical Manager Client\\7.2.5575.0\\"
> ----------------------------------------------------------------
> -- Test Data Ends
> -- See http://stackoverflow.com/q/32149354/370611
> -- toRegex = makeRegexOpts defaultCompOpt{multiline=False} defaultExecOpt
> -- Escape parens?
> -- initialFillerRegex :: String
> -- initialFillerRegex = "Helios.MigrationTool.Common.AssemblyUtils.GetAssemblyList\\(\\) Information: 0 : Processing"
>
> -- Regex matching (marking) a line to be processed
> -- valuableLineRegex :: String
> -- valuableLineRegex = "\\bProcessing\\b"
>
> -- |Regex matching line to be parsed
> parseLineRegex :: String
> parseLineRegex = "^(.* Information: 0 : Processing )([^ ]*)[ \t]+in (.*)" -- 3 subexpressions
> main :: IO()
> main = do
>   logContents <- getContents
>   putStrLn $ unlines $ fst $ edges (parseIndent $ lines logContents) Map.empty
> ----------------------------------------------------------------
> -- |Parses out the leading indentation of the given String into a string of spaces and the rest of the line
> parseIndent :: String -> (String,String)
> parseIndent s = ((fourth $ (s =~ "^( *)(.*)" :: (String,String,String,[String]))) !! 0,
>                  (fourth $ (s =~ "^( *)(.*)" :: (String,String,String,[String]))) !! 1)
> ----------------------------------------------------------------
> -- |Returns a list of strings describing edges in the form "a -> b /* comment */"
> edges ::
>   [(String,String)]             -- ^ Input tuples: (indent, restOfString)
>   -> Map.Map String Int -- ^ Map of edges in form "a -> b" with a count of the number of times that edge occurs
>   -> [String]           -- ^ Output list of edge descriptions in form "a -> b optionalExtraText"
> edges [] edgeSet =
>   (edgeDump $ Map.assocs edgeSet, 0)
> edges (lastLine:[]) edgeSet =
>   (edgeDump $ Map.assocs edgeSet, 1)
> edges (fstLogLine:sndLogLine:[]) edgeSet =
>   let fstFields = (snd fstLogLine) =~ parseLineRegex :: (String,String,String,[String])
>       sndFields = (snd sndLogLine) =~ parseLineRegex :: (String,String,String,[String])
>   in
>     if length (fourth fstFields) == 0
>     then error ("Unmatched: " ++ (first fstFields)) -- First line must always match
>     else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
>     then edges (fstLogLine:[]) edgeSet -- Skip useless line
>     else if indentLength fstLogLine >= indentLength sndLogLine
>     then edges (sndLogLine:[]) edgeSet -- Can't be an edge from first to second line; drop first line and keep going.
>     else edges (sndLogLine:[])
>          (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)
> edges (fstLogLine:sndLogLine:thdLogLine:logLines) edgeSet =
>   let fstFields = (snd fstLogLine) =~ parseLineRegex :: (String,String,String,[String])
>       sndFields = (snd sndLogLine) =~ parseLineRegex :: (String,String,String,[String])
>       thdFields = (snd thdLogLine) =~ parseLineRegex :: (String,String,String,[String])
>   in
>     if length (fourth fstFields) == 0
>     then error ("Unmatched: " ++ (first fstFields)) -- First line must always match
>
>     else if length (fourth sndFields) == 0 -- "Adding", not "Processing"
>     then edges (fstLogLine:thdLogLine:logLines) edgeSet -- Skip useless line
>
>     else if indentLength fstLogLine >= indentLength sndLogLine
>     then []                     -- Stop processing at outdent
>
>     else
>       -- Looking at one of:
>       --       1
>       --          2 -- process 1 -> 2, then process 2.. as subtree
>       --             3 -- Need to process as subtree rooted at 2, then drop subtree (zero or more lines at same level as 3)
>       -- or
>       --       1
>       --          2 -- process, then drop this line (process 2.. as empty subtree?)
>       --          3
>       -- or
>       --       1
>       --          2 -- process, then drop this line (drop entire subtree rooted at 1) (same as above, drop empty subtree? (2))
>       --       3
>       -- or
>       --       1
>       --          2 -- same as above? Drop empty subtree rooted at 2
>       --    3
>       edges (sndLogLine:thdLogLine:logLines) (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1) -- now what? I need to pass the UPDATED edgeSet on to the next call, after the subtree rooted at 2 is dropped.
>
>
>
>     then edges (sndLogLine:logLines) edgeSet -- Can't be an edge from first to second line; drop first line and keep going.
>     else edges (sndLogLine:(takeWhile (increasingIndent $ length $ fst fstLogLine) logLines))
>          (Map.insertWith (+) ((fullName fstFields) ++ (fullName sndFields)) 1)
>     else ((fst $ edges (sndLogLine:logLines) edgeSet)
>            ++ (fst $ edges (fstLogLine:(drop
>                                         (snd $ edges (sndLogLine:logLines) edgeSet) -- # of lines processed
>                                         logLines)) edgeSet),
>           (snd $ edges (sndLogLine:logLines) edgeSet)
>           + (snd $ edges (fstLogLine:(drop
>                                       (snd $ edges (sndLogLine:logLines) edgeSet) -- # of lines processed
>                                       logLines)) edgeSet)
>          )
> ----------------------------------------------------------------
> fullname :: (String,String,String,[String]) -> String
> fullname (_,_,_,[_,fileName,directoryName]) = directoryName ++ fileName
> ----------------------------------------------------------------
> -- |Edges from the first line to all following lines
> edgesFrom :: String             -- ^ First line
>   -> [String]                   -- ^ Following lines
>   -> Map.Map String Int         -- ^ Set of edges built so far
>   -> [String]
> edgesFrom a b c = []
> ----------------------------------------------------------------
> -- |Return length of indent or error
> indentLength :: (String,String,String,[String]) -- ^ Regex match context
>   -> Int                                        -- ^ Length of indent
> indentLength (prefix,_,_,[]) = error $ "Not matched: " ++ prefix
> indentLength (_,_,_,subexprs) =
>   length $ subexprs !! 0
> ----------------------------------------------------------------
> -- |Returns a list of edges, possibly with comments indicating occurrence counts > 1
> edgeDump :: [(String,Int)]     -- ^ List of (edge,count) tuples
>   -> [String]                  -- ^ List of edges, possibly w/comments
> edgeDump [] = []
> edgeDump ((edge,count):rest)
>   | count <= 1  = edge:(edgeDump rest)
>   | otherwise   = (edge ++ " /* " ++ (show count) ++ " occurrences */"):(edgeDump rest)
> ----------------------------------------------------------------
> first :: (a,b,c,d) -> a
> first (x,_,_,_) = x
> fourth :: (a,b,c,d) -> d
> fourth (_,_,_,x) = x
>
>
>
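
One possible way to thread the "updated edge set" from one subtree to the next, without explicit recursion over subtrees, is a left fold whose accumulator carries both a stack of currently open ancestors and the edge counts built so far. The sketch below is only an illustration under stated assumptions, not the code from the repository above: it assumes each log line has already been reduced to an (indent, name) pair, and the names Line, EdgeCounts, step, edgeCounts, render and sample are all hypothetical.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- | One already-parsed log line: indentation depth and node name.
--   (Assumption: the regex parsing has been done elsewhere.)
type Line = (Int, String)

-- | Edge counts keyed by (parent, child).
type EdgeCounts = Map.Map (String, String) Int

-- | Fold state: open ancestors (innermost first) and the edges seen so far.
type St = ([Line], EdgeCounts)

-- | Process one line: pop every ancestor that is not strictly less indented,
--   record an edge from the nearest remaining ancestor (if any), then push
--   the current line as the new innermost ancestor.
step :: St -> Line -> St
step (stack, counts) line@(indent, name) =
  let ancestors = dropWhile (\(i, _) -> i >= indent) stack
      counts'   = case ancestors of
        (_, parent) : _ -> Map.insertWith (+) (parent, name) 1 counts
        []              -> counts
  in (line : ancestors, counts')

-- | Run the fold over all lines and keep only the edge counts.
edgeCounts :: [Line] -> EdgeCounts
edgeCounts = snd . foldl' step ([], Map.empty)

-- | Render edges as dot(1) lines, noting edges that occur more than once.
render :: EdgeCounts -> [String]
render counts =
  [ parent ++ " -> " ++ child ++ note n
  | ((parent, child), n) <- Map.toList counts ]
  where
    note n
      | n > 1     = " /* " ++ show n ++ " occurrences */"
      | otherwise = ""

-- | The a / b-c-d / b-d-e subtree from the question, as (indent, name) pairs.
sample :: [Line]
sample = [(0,"a"),(2,"b"),(4,"c"),(4,"d"),(2,"b"),(4,"d"),(4,"e")]
```

In GHCi, `mapM_ putStrLn (render (edgeCounts sample))` prints four edges: `a -> b` and `b -> d` each with a "2 occurrences" note, plus `b -> c` and `b -> e`. The state is still passed along from line to line, but the fold does the passing, so no call has to hand its result to a sibling call by hand.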

------------------------------

Subject: Digest Footer

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners


------------------------------

End of Beginners Digest, Vol 102, Issue 6
*****************************************
