[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis

Zoltán Gera via Phabricator via cfe-commits Wed, 06 Dec 2017 08:00:13 -0800

gerazo added a comment.

The code modifications are coming soon (after doing some extensive testing) for 
the scan-build part.




================
Comment at: tools/scan-build-py/libscanbuild/analyze.py:223
+    ctu_config = get_ctu_config(args)
+    if ctu_config.collect:
+        shutil.rmtree(ctu_config.dir, ignore_errors=True)
----------------
danielmarjamaki wrote:
> danielmarjamaki wrote:
> > not a big deal but I would use early exits in this function
> with "not a big deal" I mean; feel free to ignore my comment if you want to 
> have it this way.
I've checked it through. The only place for an early exit now would be before 
the else. The 1st and 2nd ifs are in fact non-orthogonal.


================
Comment at: tools/scan-build-py/libscanbuild/analyze.py:145
+            mangled_ast_pairs.append((mangled_name, ast_files.pop()))
+
+    return mangled_ast_pairs
----------------
george.karpenkov wrote:
> Overall, instead of creating a dictionary with multiple elements, and then 
> converting to a list, it's much simper to only add an element to 
> `mangled_to_asts` when it is not already mapped to something (probably 
> logging a message otherwise), and then just return `mangled_to_asts.items()`
The reason for the previous is that we need to count the occurence number 
ofdifferent mappings only let those pass through which don't have multiple 
variations.


================
Comment at: tools/scan-build-py/libscanbuild/analyze.py:189
+            # Remove all temporary files
+            shutil.rmtree(fnmap_dir, ignore_errors=True)
+
----------------
gerazo wrote:
> george.karpenkov wrote:
> > Having an analysis tool remove files is scary, what if (maybe not in this 
> > revision, but in a future iteration) a bug is introduced, and the tool 
> > removes user code instead?
> > Why not just create a temporary directory with `tempfile.mkdtemp`, put all 
> > temporary files there, and then simply iterate through them?
> > Then you would be able to get rid of the constant `CPU_TEMP_FNMAP_FOLDER` 
> > entirely, and OS would be responsible for cleanup.
> Yes, you are right. We are essentially using a temp dir. Because of the size 
> we first had to put it next to the project (not on tmp drive for instance) 
> and for debugging purposes we gave a name to it. Still it can be done with 
> mkdtemp as well.
Finally, I came to the conclusion that mkdtemp would not be better than the 
current solution. In order to find our created dir by other threads, we need a 
designated name. Suffixing it by generated name would further complicate things 
as we need not to allow multiple concurrent runs here. The current solution is 
more robust from this point of view.


================
Comment at: tools/scan-build-py/libscanbuild/analyze.py:241
+        run_analyzer_parallel(args)
+        shutil.rmtree(ctu_config.dir, ignore_errors=True)
+    else:
----------------
george.karpenkov wrote:
> Same as the comment above about removing folders. Also it seems like there 
> should be a way to remove redundancy in `if collect / remove tree` block 
> repeated twice.
Th previous call for data removal happens because the user asked for a collect 
run, so we clean data to do a recollection. This second one happens because the 
user asked for a full recollection and anaylsis run all in one. So here we 
destroy the temp data for user's convenience. This happens after, not before 
like previously. The default behavior is to do this when the user uses the tool 
the easy way (collect and analyze all in one) and we intentionally keep 
collection data if the user only asks for a collect or an analyze run. So with 
this, the user can use a collect run's results for multiple analyze runs. This 
is written in the command line help. I will definitely put comments here to 
explain.


================
Comment at: tools/scan-build-py/libscanbuild/analyze.py:296
+        'command': [execution.cmd[0], '-c'] + compilation.flags,
+        'ctu': json.loads(os.getenv('ANALYZE_BUILD_CTU'))
     }
----------------
george.karpenkov wrote:
> Again, is it possible to avoid JSON-over-environment-variables?
There is an other thing against changing this. Currently the interface here 
using env variables is used by intercept-build, analyze-build and scan-build 
tool as well. In order to drop json, we need to change those tools too. It 
would be a separate patch definitely.


================
Comment at: tools/scan-build-py/libscanbuild/analyze.py:561
+            except OSError:
+                pass
+        ast_command = [opts['clang'], '-emit-ast']
----------------
gerazo wrote:
> george.karpenkov wrote:
> > `try/except/pass` is almost always bad.
> > When can the error occur? Why are we ignoring it?
> I think this code is redundant with the if above.
Here the folders are created on demand. Because these are created parallel by 
multiple processes, there is small chance that an other process already created 
the folder between the isdir check and the makedirs call. This is why the the 
pass is needed to make it always run correctly. I will add a comment.


================
Comment at: tools/scan-build-py/libscanbuild/analyze.py:567
+        ast_command.append('-o')
+        ast_command.append(ast_path)
+        logging.debug("Generating AST using '%s'", ast_command)
----------------
george.karpenkov wrote:
> The above can be written more succinctly as:
> `ast_command = [opts['clang'], ...] + args + ['-w', ...]`
After several iterations of the code, I find it easier to version control such 
multiline constructs. If someone changes a data source, it is clear which one 
(which line) was modified. The succint notation does not allow clean VCS 
annotations.


================
Comment at: tools/scan-build-py/tests/unit/test_analyze.py:338
+
+class PrefixWithTest(unittest.TestCase):
+
----------------
george.karpenkov wrote:
> Probably more tests are required for almost 400 lines of functional Python 
> code in this PR.
> Would it be hard to have a full LIT-style integration test? E.g. have a dummy 
> script emulating Clang with a dummy directory structure, which would show how 
> all pieces are meant to fit together?
You are right. The testing infra in scan-build-py is not right anyway (uses 
nosetests). However, this should be a new patch as you've mentioned earlier.


https://reviews.llvm.org/D30691



_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis

Reply via email to