NoQ created this revision.
NoQ added reviewers: dcoughlin, xazax.hun, a_sidorin, rnkovacs, Szelethus, 
baloghadamsoftware, Charusso, george.karpenkov.
Herald added subscribers: cfe-commits, dkrupp, donat.nagy, mikhail.ramalho, 
a.sidorin, szepet.
Herald added a project: clang.

This is my attempt at D62553 <https://reviews.llvm.org/D62553>, which I (just) 
promised.

Here's roughly how it looks:

F8965939: Screen Shot 2019-05-29 at 6.12.07 PM.png 
<https://reviews.llvm.org/F8965939>

Full dump of some random test file:

F8965929: graph.svg <https://reviews.llvm.org/F8965929>

The rough idea behind this script is that it deserializes JSON dumps into 
familiar Python objects. Serializing them back into a pretty graph dump is only 
one of the possible use cases; we can also manipulate the deserialized graph 
by, say, hiding nodes that we aren't interested in.
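
To illustrate the "hiding nodes" use case, here is a minimal sketch that is 
not part of this patch. It assumes a node table keyed by node name, with 
parents/children lists like the ones ExplodedNode keeps; the Node class and 
hide_nodes() are hypothetical names made up for the example, and the graph is 
assumed to have no self-loops.

```python
import collections


class Node(object):
    """Hypothetical stand-in for ExplodedNode: keeps only the edges."""
    def __init__(self):
        self.parents = []
        self.children = []


def hide_nodes(nodes, should_hide):
    """Drop every node whose name satisfies should_hide() and connect
    its parents directly to its children. Assumes no self-loops."""
    for node_id in [n for n in nodes if should_hide(n)]:
        node = nodes.pop(node_id)
        for p in node.parents:
            siblings = nodes[p].children
            siblings.remove(node_id)
            siblings.extend([c for c in node.children if c not in siblings])
        for c in node.children:
            folks = nodes[c].parents
            folks.remove(node_id)
            folks.extend([p for p in node.parents if p not in folks])
    return nodes


# Build the chain Node0x1 -> Node0x2 -> Node0x3, then hide the middle node.
nodes = collections.defaultdict(Node)
for src, dst in [('Node0x1', 'Node0x2'), ('Node0x2', 'Node0x3')]:
    nodes[src].children.append(dst)
    nodes[dst].parents.append(src)

hide_nodes(nodes, lambda name: name == 'Node0x2')
```

After the call, Node0x1 points straight at Node0x3, so a dump of the result 
would show a single edge between them.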

This initial commit only supports Environment and Store and silently drops 
unsupported information; I'll hopefully follow up with more this week, most 
importantly ranges. As of today it's only a dot-to-dot converter, so 
@Charusso's SVG cleanup is not supported yet. Diffs are not supported yet 
either; tomorrow I'll think a bit more about how exactly I want to implement 
them.
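
For readers who want to see what "silently drops unsupported information" 
means in practice, here is a rough standalone sketch of the node-parsing step. 
The regular expression is the same as node_re in the script; the input line 
is written unescaped, the way the tests write it, and "future_key" is a 
made-up field standing in for anything the script does not know about yet.

```python
import json
import re

# Same pattern as ExplodedGraph.node_re in the script.
node_re = re.compile(
    '^(Node0x[0-9a-f]*) \\[shape=record,.*label="{(.*)\\\\l}"\\];$')

# A node line in the relaxed, unescaped form used by the tests.
# "future_key" is hypothetical; it represents an unsupported field.
line = ('Node0x1 [shape=record,label="{{ "node_id": "0x1", '
        '"program_state": null, "program_points": [], '
        '"future_key": 42 }\\l}"];')

match = node_re.match(line)
node_name, raw_json = match.group(1), match.group(2)
data = json.loads(raw_json)

# Only the keys that the deserializer looks at survive; the rest is
# never read, so it disappears from the rewritten graph.
SUPPORTED = ('node_id', 'program_points', 'program_state')
kept = {k: data[k] for k in SUPPORTED}
dropped = sorted(set(data) - set(SUPPORTED))
```

Here dropped ends up holding just the one unknown key, and nothing is 
reported to the user about it.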

I wrote some tests but I'm not really sure they're worth it. I guess it would 
be nice to document some corner cases this way.

@Charusso, could you see if you can:

- Bring back stable IDs for nodes into JSON (not only pointers). They're very 
useful for debugging.
- See if your trick for reducing SVG size with text joining is even applicable 
to the dumps that I produce.
- Understand my code here :)


Repository:
  rC Clang

https://reviews.llvm.org/D62638

Files:
  clang/test/Analysis/exploded-graph-rewriter/edge.dot
  clang/test/Analysis/exploded-graph-rewriter/empty.dot
  clang/test/Analysis/exploded-graph-rewriter/environment.dot
  clang/test/Analysis/exploded-graph-rewriter/lit.local.cfg
  clang/test/Analysis/exploded-graph-rewriter/program_points.dot
  clang/test/Analysis/exploded-graph-rewriter/store.dot
  clang/utils/analyzer/exploded-graph-rewriter.py

Index: clang/utils/analyzer/exploded-graph-rewriter.py
===================================================================
--- /dev/null
+++ clang/utils/analyzer/exploded-graph-rewriter.py
@@ -0,0 +1,386 @@
+#!/usr/bin/env python3
+
+import argparse
+import collections
+import json as json_module  # 'json' is a great name for a variable
+import logging
+import re
+
+
+# A deserialized source location. Lack of filename represents the main file.
+class SourceLocation(object):
+    def __init__(self, json):
+        super(SourceLocation, self).__init__()
+        self.line = json['line']
+        self.col = json['column']
+        self.filename = json['filename'] \
+            if 'filename' in json else '(main file)'
+
+
+# A deserialized program point.
+class ProgramPoint(object):
+    def __init__(self, json):
+        super(ProgramPoint, self).__init__()
+        self.kind = json['kind']
+        self.tag = json['tag']
+        if self.kind == 'Edge':
+            self.src_id = json['src_id']
+            self.dst_id = json['dst_id']
+        elif self.kind == 'Statement':
+            self.stmt_kind = json['stmt_kind']
+            self.pointer = json['pointer']
+            self.pretty = json['pretty']
+            self.sloc = SourceLocation(json['location']) \
+                if json['location'] is not None else None
+        elif self.kind == 'BlockEntrance':
+            self.block_id = json['block_id']
+
+
+# A value of a single expression in a deserialized Environment.
+class EnvironmentBinding(object):
+    def __init__(self, json):
+        super(EnvironmentBinding, self).__init__()
+        self.lctx_id = json['lctx_id']
+        self.stmt_id = json['stmt_id']
+        self.pretty = json['pretty']
+        self.value = json['value']
+
+
+# Deserialized description of a location context.
+class LocationContext(object):
+    def __init__(self, caption, decl, line):
+        super(LocationContext, self).__init__()
+        self.caption = caption
+        self.decl = decl
+        self.line = line
+
+
+# A group of deserialized Environment bindings that correspond to a specific
+# location context.
+class EnvironmentFrame(object):
+    def __init__(self, json):
+        super(EnvironmentFrame, self).__init__()
+        self.location_context = LocationContext(json['location_context'],
+                                                json['calling'],
+                                                json['call_line'])
+        self.bindings = [EnvironmentBinding(b) for b in json['items']]
+
+
+# A deserialized Environment.
+class Environment(object):
+    def __init__(self, json):
+        super(Environment, self).__init__()
+        self.frames = [EnvironmentFrame(f) for f in json]
+
+
+# A single binding in a deserialized RegionStore cluster.
+class StoreBinding(object):
+    def __init__(self, json):
+        super(StoreBinding, self).__init__()
+        self.kind = json['kind']
+        self.offset = json['offset']
+        self.value = json['value']
+
+
+# A single cluster of the deserialized RegionStore.
+class StoreCluster(object):
+    def __init__(self, json):
+        super(StoreCluster, self).__init__()
+        self.base_region = json['cluster']
+        self.bindings = [StoreBinding(b) for b in json['items']]
+
+
+# A deserialized RegionStore.
+class Store(object):
+    def __init__(self, json):
+        super(Store, self).__init__()
+        self.clusters = [StoreCluster(c) for c in json]
+
+
+# A deserialized program state.
+class ProgramState(object):
+    def __init__(self, state_id, json):
+        super(ProgramState, self).__init__()
+        logging.debug('Adding ProgramState ' + str(state_id))
+
+        self.state_id = state_id
+        self.store = Store(json['store']) \
+            if json['store'] is not None else None
+        self.environment = Environment(json['environment']) \
+            if json['environment'] is not None else None
+        # TODO: Objects under construction.
+        # TODO: Constraint ranges.
+        # TODO: Dynamic types of objects.
+        # TODO: Checker messages.
+
+
+# A deserialized exploded graph node. Has a default constructor because it
+# may be referenced as part of an edge before its contents are deserialized,
+# and at that point we already need room for its parents and children.
+class ExplodedNode(object):
+    def __init__(self):
+        super(ExplodedNode, self).__init__()
+        self.parents = []
+        self.children = []
+
+    def set_json(self, node_id, json):
+        logging.debug('Adding ' + node_id)
+
+        self.ptr = json['node_id']
+        self.points = [ProgramPoint(p) for p in json['program_points']]
+        self.state = ProgramState(json['state_id'], json['program_state']) \
+            if json['program_state'] is not None else None
+
+        assert 'Node' + self.ptr == node_id
+
+    def node_name(self):
+        return 'Node' + self.ptr
+
+
+# A deserialized ExplodedGraph. Constructed by consuming a .dot file
+# line-by-line.
+class ExplodedGraph(object):
+    # Parse .dot files with regular expressions.
+    node_re = re.compile(
+        '^(Node0x[0-9a-f]*) \\[shape=record,.*label="{(.*)\\\\l}"\\];$')
+    edge_re = re.compile(
+        '^(Node0x[0-9a-f]*) -> (Node0x[0-9a-f]*);$')
+
+    def __init__(self):
+        super(ExplodedGraph, self).__init__()
+        self.nodes = collections.defaultdict(ExplodedNode)
+        self.root_id = None
+        self.incomplete_line = ''
+
+    def add_raw_line(self, raw_line):
+        if raw_line.startswith('//'):
+            return
+
+        # Allow line breaks by waiting for ';'. This is not valid in
+        # a .dot file, but it is useful for writing tests.
+        if len(raw_line) > 0 and raw_line[-1] != ';':
+            self.incomplete_line += raw_line
+            return
+        raw_line = self.incomplete_line + raw_line
+        self.incomplete_line = ''
+
+        # Apply regexps one by one to see if it's a node or an edge
+        # and extract contents if necessary.
+        logging.debug('Line: ' + raw_line)
+        result = self.edge_re.match(raw_line)
+        if result is not None:
+            logging.debug('Classified as edge line.')
+            self.nodes[result[1]].children.append(result[2])
+            self.nodes[result[2]].parents.append(result[1])
+            return
+        result = self.node_re.match(raw_line)
+        if result is not None:
+            logging.debug('Classified as node line.')
+            node_id = result[1]
+            if len(self.nodes) == 0:
+                self.root_id = node_id
+            # Note: when writing tests you don't need to escape everything,
+            # even though in a valid dot file everything is escaped.
+            raw_json = result[2].replace('\\l', '') \
+                                .replace('&nbsp;', '') \
+                                .replace('\\"', '"') \
+                                .replace('\\{', '{') \
+                                .replace('\\}', '}') \
+                                .replace('\\<', '\\\\<') \
+                                .replace('\\>', '\\\\>') \
+                                .rstrip(',')
+            logging.debug(raw_json)
+            json = json_module.loads(raw_json)
+            self.nodes[node_id].set_json(node_id, json)
+            return
+        logging.debug('Skipping.')
+
+
+# A visitor that dumps the ExplodedGraph into a DOT file with fancy HTML-based
+# syntax highlighting.
+class DotDumpVisitor(object):
+    def __init__(self):
+        super(DotDumpVisitor, self).__init__()
+
+    @staticmethod
+    def _raw(s):
+        print(s, end='')
+
+    @staticmethod
+    def _dump(s):
+        print(s.replace('&', '&amp;')
+               .replace('{', '\\{')
+               .replace('}', '\\}')
+               .replace('\\<', '&lt;')
+               .replace('\\>', '&gt;')
+               .replace('\\l', '<br />')
+               .replace('|', ''), end='')
+
+    def visit_begin_graph(self, graph):
+        self._graph = graph
+        print('digraph "ExplodedGraph" {')
+        print('label="";')
+
+    def visit_program_point(self, p):
+        if p.kind in ['Edge', 'BlockEntrance', 'BlockExit']:
+            color = 'gold3'
+        elif p.kind in ['PreStmtPurgeDeadSymbols',
+                        'PostStmtPurgeDeadSymbols']:
+            color = 'red'
+        elif p.kind in ['CallEnter', 'CallExitBegin', 'CallExitEnd']:
+            color = 'blue'
+        else:
+            color = 'cyan3'
+
+        if p.kind == 'Statement':
+            if p.sloc is not None:
+                self._dump('<tr><td align="left" width="0">'
+                           '%s:<b>%s</b>:<b>%s</b>:</td>'
+                           '<td align="left" width="0"><font color="%s">'
+                           '%s</font></td><td>%s</td></tr>'
+                           % (p.sloc.filename, p.sloc.line,
+                              p.sloc.col, color, p.stmt_kind, p.pretty))
+            else:
+                self._dump('<tr><td align="left" width="0">'
+                           '<i>Invalid Source Location</i>:</td>'
+                           '<td align="left" width="0">'
+                           '<font color="%s">%s</font></td><td>%s</td></tr>'
+                           % (color, p.stmt_kind, p.pretty))
+        else:
+            # TODO: Print more stuff for other kinds of points.
+            self._dump('<tr><td align="left" width="0">'
+                       '<i>Invalid Source Location</i>:</td>'
+                       '<td align="left" width="0" colspan="2">'
+                       '<font color="%s">%s</font></td></tr>'
+                       % (color, p.kind))
+
+    def visit_environment(self, e):
+        self._dump('<table border="0">')
+
+        for f in e.frames:
+            self._dump('<tr><td align="left" balign="left" colspan="3">'
+                       'Frame %s%s - '
+                       '<font color="grey60">%s</font></td></tr>'
+                       % (f.location_context.caption,
+                          ('line %s - ' % f.location_context.line)
+                          if f.location_context.line is not None else '',
+                          f.location_context.decl))
+            for b in f.bindings:
+                self._dump('<tr><td align="left"><i>S%s</i></td>'
+                           '<td align="left">%s</td>'
+                           '<td align="left">%s</td></tr>'
+                           % (b.stmt_id, b.pretty, b.value))
+
+        self._dump('</table>')
+
+    def visit_store(self, s):
+        self._dump('<table border="0">')
+
+        for c in s.clusters:
+            for b in c.bindings:
+                self._dump('<tr><td align="left">%s</td>'
+                           '<td align="left">%s</td>'
+                           '<td align="left">%s</td>'
+                           '<td align="left">%s</td></tr>'
+                           % (c.base_region, b.offset,
+                              '(<i>Default</i>)' if b.kind == 'Default'
+                              else '',
+                              b.value))
+
+        self._dump('</table>')
+
+    def visit_state(self, s):
+        self._dump('<tr><td align="left">'
+                   '<b>Store:</b> ')
+        if s.store is None:
+            self._dump(' <i>Nothing Yet!</i>')
+        else:
+            self._dump('</td></tr>'
+                       '<tr><td align="left">')
+            self.visit_store(s.store)
+
+        self._dump('</td></tr><hr />'
+                   '<tr><td align="left">'
+                   '<b>Environment:</b> ')
+        if s.environment is None:
+            self._dump(' <i>Nothing Yet!</i>')
+        else:
+            self._dump('</td></tr>'
+                       '<tr><td align="left">')
+            self.visit_environment(s.environment)
+
+        self._dump('</td></tr>')
+
+    def visit_node(self, node):
+        self._dump('%s [shape=record,label=<<table border="0">'
+                   % (node.node_name()))
+
+        self._dump('<tr><td bgcolor="grey"><b>Node %s - State %s</b></td></tr>'
+                   % (node.node_name(), node.state.state_id
+                      if node.state is not None else 'Unspecified'))
+        self._dump('<tr><td align="left" balign="left" width="0">')
+        if len(node.points) > 1:
+            self._dump('<b>Program points:</b></td></tr>')
+        else:
+            self._dump('<b>Program point:</b></td></tr>')
+        self._dump('<tr><td align="left" balign="left" width="0">'
+                   '<table border="0" align="left" width="0">')
+        for p in node.points:
+            self.visit_program_point(p)
+        self._dump('</table></td></tr>')
+
+        if node.state is not None:
+            self._dump('<hr />')
+            self.visit_state(node.state)
+        print('</table>>];')
+
+    def visit_edge(self, parent, child):
+        print('%s -> %s;' % (parent.node_name(), child.node_name()))
+
+    def visit_end_of_graph(self):
+        print('}')
+
+
+# A class that encapsulates traversal of the ExplodedGraph. Different explorer
+# kinds could potentially traverse specific sub-graphs.
+class Explorer(object):
+    def __init__(self):
+        super(Explorer, self).__init__()
+
+    def explore(self, graph, visitor):
+        visitor.visit_begin_graph(graph)
+        for node in graph.nodes:
+            logging.debug('Visiting ' + node)
+            visitor.visit_node(graph.nodes[node])
+            for child in graph.nodes[node].children:
+                logging.debug('Visiting edge: %s -> %s ' % (node, child))
+                visitor.visit_edge(graph.nodes[node], graph.nodes[child])
+        visitor.visit_end_of_graph()
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('filename', type=str)
+    parser.add_argument('-d', '--debug', action='store_const', dest='loglevel',
+                        const=logging.DEBUG, default=logging.WARNING,
+                        help='enable debug prints')
+    parser.add_argument('-v', '--verbose', action='store_const',
+                        dest='loglevel', const=logging.INFO,
+                        default=logging.WARNING,
+                        help='enable info prints')
+    args = parser.parse_args()
+    logging.basicConfig(level=args.loglevel)
+
+    graph = ExplodedGraph()
+    with open(args.filename) as fd:
+        for raw_line in fd:
+            raw_line = raw_line.strip()
+            graph.add_raw_line(raw_line)
+
+    explorer = Explorer()
+    visitor = DotDumpVisitor()
+    explorer.explore(graph, visitor)
+
+
+if __name__ == '__main__':
+    main()
Index: clang/test/Analysis/exploded-graph-rewriter/store.dot
===================================================================
--- /dev/null
+++ clang/test/Analysis/exploded-graph-rewriter/store.dot
@@ -0,0 +1,41 @@
+// RUN: %exploded_graph_rewriter %s | FileCheck %s
+
+// CHECK: <b>Store:</b>
+// CHECK-SAME: <table border="0">
+// CHECK-SAME:   <tr>
+// CHECK-SAME:     <td align="left">
+// CHECK-SAME:       x
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td align="left">
+// CHECK-SAME:       0
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td align="left">
+// CHECK-SAME:       (<i>Default</i>)
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td align="left">
+// CHECK-SAME:       Undefined
+// CHECK-SAME:     </td>
+// CHECK-SAME:   </tr>
+// CHECK-SAME: </table>
+Node0x1 [shape=record,label=
+ "{
+    { "node_id": "0x1",
+      "state_id": 2,
+      "program_points": [],
+      "program_state": {
+        "environment": null,
+        "store": [
+          {
+            "cluster": "x",
+            "items": [
+              {
+                "kind": "Default",
+                "offset": 0,
+                "value": "Undefined"
+              }
+            ]
+          }
+        ]
+      }
+    }
+\l}"];
Index: clang/test/Analysis/exploded-graph-rewriter/program_points.dot
===================================================================
--- /dev/null
+++ clang/test/Analysis/exploded-graph-rewriter/program_points.dot
@@ -0,0 +1,55 @@
+// RUN: %exploded_graph_rewriter %s | FileCheck %s
+
+// CHECK: <b>Program point:</b>
+// CHECK-SAME: <table border="0" align="left" width="0">
+// CHECK-SAME:   <tr>
+// CHECK-SAME:     <td align="left" width="0">
+// CHECK-SAME:       <i>Invalid Source Location</i>:
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td align="left" width="0" colspan="2">
+// CHECK-SAME:       <font color="gold3">Edge</font>
+// CHECK-SAME:     </td>
+// CHECK-SAME:   </tr>
+// CHECK-SAME: </table>
+Node0x1 [shape=record,label=
+ "{
+    { "node_id": "0x1", "program_state": null, "program_points": [
+      {
+        "kind": "Edge",
+        "src_id": 0,
+        "dst_id": 1,
+        "terminator": null,
+        "term_kind": null,
+        "tag": null }
+    ]}
+\l}"];
+
+// CHECK-NEXT: <b>Program point:</b>
+// CHECK-SAME: <table border="0" align="left" width="0">
+// CHECK-SAME:   <tr>
+// CHECK-SAME:     <td align="left" width="0">
+// CHECK-SAME:       (main file):<b>4</b>:<b>5</b>:
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td align="left" width="0">
+// CHECK-SAME:       <font color="cyan3">DeclRefExpr</font>
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td>x</td>
+// CHECK-SAME:   </tr>
+// CHECK-SAME:   </table>
+Node0x2 [shape=record,label=
+ "{
+    { "node_id": "0x2", "program_state": null, "program_points": [
+      {
+        "kind": "Statement",
+        "stmt_kind": "DeclRefExpr",
+        "stmt_id": 3,
+        "pointer": "0x3",
+        "pretty": "x",
+        "location": {
+          "line": 4,
+          "column": 5
+        },
+        "tag": null
+      }
+    ]}
+\l}"];
Index: clang/test/Analysis/exploded-graph-rewriter/lit.local.cfg
===================================================================
--- /dev/null
+++ clang/test/Analysis/exploded-graph-rewriter/lit.local.cfg
@@ -0,0 +1,13 @@
+import lit.util
+import lit.formats
+import os
+
+use_lit_shell = os.environ.get("LIT_USE_INTERNAL_SHELL")
+config.test_format = lit.formats.ShTest(use_lit_shell == "0")
+
+config.substitutions.append(('%exploded_graph_rewriter',
+                             lit.util.which('exploded-graph-rewriter.py',
+                                            config.clang_src_dir +
+                                            '/utils/analyzer/')))
+
+config.suffixes = ['.dot']
Index: clang/test/Analysis/exploded-graph-rewriter/environment.dot
===================================================================
--- /dev/null
+++ clang/test/Analysis/exploded-graph-rewriter/environment.dot
@@ -0,0 +1,46 @@
+// RUN: %exploded_graph_rewriter %s | FileCheck %s
+
+// CHECK: <b>Environment:</b>
+// CHECK-SAME: <table border="0">
+// CHECK-SAME:   <tr>
+// CHECK-SAME:     <td align="left" balign="left" colspan="3">
+// CHECK-SAME:       Frame #0 Callline 3 -  - <font color="grey60">foo</font>
+// CHECK-SAME:     </td>
+// CHECK-SAME:   </tr>
+// CHECK-SAME:   <tr>
+// CHECK-SAME:     <td align="left">
+// CHECK-SAME:       <i>S5</i>
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td align="left">
+// CHECK-SAME:       bar()
+// CHECK-SAME:     </td>
+// CHECK-SAME:     <td align="left">
+// CHECK-SAME:       Unknown
+// CHECK-SAME:     </td>
+// CHECK-SAME:   </tr>
+// CHECK-SAME: </table>
+Node0x1 [shape=record,label=
+ "{
+    { "node_id": "0x1",
+      "state_id": 2,
+      "program_points": [],
+      "program_state": {
+        "store": null,
+        "environment": [
+          {
+            "location_context": "#0 Call",
+            "calling": "foo",
+            "call_line": 3,
+            "items": [
+              {
+                "lctx_id": 4,
+                "stmt_id": 5,
+                "pretty": "bar()",
+                "value": "Unknown"
+              }
+            ]
+          }
+        ]
+      }
+    }
+\l}"];
Index: clang/test/Analysis/exploded-graph-rewriter/empty.dot
===================================================================
--- /dev/null
+++ clang/test/Analysis/exploded-graph-rewriter/empty.dot
@@ -0,0 +1,9 @@
+// RUN: %exploded_graph_rewriter %s | FileCheck %s
+
+digraph "Exploded Graph" {
+  label="Exploded Graph";
+}
+
+// CHECK:      digraph "ExplodedGraph" {
+// CHECK-NEXT:   label="";
+// CHECK-NEXT: }
Index: clang/test/Analysis/exploded-graph-rewriter/edge.dot
===================================================================
--- /dev/null
+++ clang/test/Analysis/exploded-graph-rewriter/edge.dot
@@ -0,0 +1,10 @@
+// RUN: %exploded_graph_rewriter %s | FileCheck %s
+
+Node0x1 [shape=record,label=
+ "{{ "node_id": "0x1", "program_state": null, "program_points": []}\l}"];
+
+// CHECK: Node0x1 -> Node0x2;
+Node0x1 -> Node0x2;
+
+Node0x2 [shape=record,label=
+ "{{ "node_id": "0x2", "program_state": null, "program_points": []}\l}"];
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
