rjmccall added a comment.

In D80344#2286666 <https://reviews.llvm.org/D80344#2286666>, @tentzen wrote:

> Thank you for the prompt reply again.
>
>> [rjmccall] And I agree with him that the potential benefits are substantial. 
>> I'm just saying that one barely-trafficked RFC thread is not evidence of a 
>> community consensus.
>
> OK, thanks.  It's good to know you are also supportive of adding this feature.
>
>> [rjmccall] As I understand it, you are creating implicit edges within the 
>> function to the active SEH handler (potentially part of the same function) 
>> from arbitrary places within covered basic blocks, turning basic blocks from 
>> a single-entry single-(internal-)exit representation to a single-entry 
>> multiple-exit representation. That is a huge change. How do you expect LLVM 
>> transformations to preserve or merge this information across blocks? How do 
>> you expect transformations to not move code between blocks in problematic 
>> ways, or reorder code within blocks in ways that will suddenly be visible to 
>> the optimizer? These are the standard objections to supporting features like 
>> this, that they have basically unspecified behavior and appear to work by 
>> good intentions and happy thoughts.
>
> There is absolutely NO explicit edge added in this design.

I didn't say there was.  I said that there was an *implicit* edge: from the 
memory access to the handler.  It's precisely because that edge is implicit 
that it's problematic for analysis and optimization.
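
For concreteness, here is a minimal sketch of the kind of implicit edge I mean,
assuming MSVC-style SEH semantics under -EHa (the function and names are purely
illustrative, not taken from the patch):

  #include <excpt.h>

  int touch(int *p) {
    __try {
      *p = 1;   // a hardware fault here must transfer control to the handler
                // below, yet the store is an ordinary instruction, not a
                // call/invoke, so no explicit CFG edge records that possibility
    } __except (EXCEPTION_EXECUTE_HANDLER) {
      return -1;
    }
    return 0;
  }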

> For LLVM transformations/optimizations, the only constraint added is the 
> **volatile** attribute on memory operations within an SEH region.

When is this attribute applied?  Because you can't just apply it at a single 
point; you also need to mark operations when they're copied/inlined into the 
region.  We once had a similar attempt at handling exceptions with a 
setjmp-based ABI (not the "SJLJ" exceptions supported by the backends, but 
something frontend-centric) that had persistent problems because of poor 
modeling in the IR, and it feels like this approach is doomed to repeat them.
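
As a hypothetical illustration of the inlining hazard (the helper name is made
up): if the load below is emitted as a plain, non-volatile load at its original
position and only ends up in the guarded region via inlining, something has to
reapply the volatile marking to the inlined copy, or the -EHa semantics are
silently lost:

  #include <excpt.h>

  static inline int load_field(int *p) {
    return *p;   // emitted as a plain load, outside any SEH scope
  }

  int caller(int *p) {
    __try {
      return load_field(p);   // after inlining, that plain load now sits
                              // inside the __try region
    } __except (EXCEPTION_EXECUTE_HANDLER) {
      return -1;
    }
  }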

>> [rjmccall] Does it pass under all combinations of optimization and 
>> code-generation settings? How do LLVM contributors ensure that this test 
>> suite continues to pass?
>
> Yes. This is just the first patch.  There will be a second patch to land the 
> LLVM code-gen part, then a third patch to add some test cases (e.g., xcpt4u.c 
> from MSVC). Without the -EHa option, the patch is a zero-impact change for now.

Do you expect to maintain a CI bot that runs the full test suite continuously 
as changes go into LLVM?  Because this sort of thing is extremely easy to break.

>> [rjmccall] There is a major difference between HW exceptions and non-HW 
>> exceptions, which is that non-HW exceptions are observed only at call sites. 
>> It is quite easy to write code in IRGen that works because the code emitted 
>> at a certain point isn't totally arbitrary
>
> I'm still not completely sure what you are referring to; let me guess. Are you 
> concerned that the Cleanup mechanism of an SEH region may not be set up 
> properly if there is no call instruction in the region? This is actually a 
> good question. The answer is that the presence of the seh.try.begin() intrinsic 
> will ensure that a Cleanup stack is established properly, because there is an 
> EH edge built for seh.try.begin() automatically. Hope this answers your 
> concern here.

Does this happen whenever there are changes to the active cleanups, and not 
just when entering the scope?  Since you're not modeling the control edge from 
the access to the handler, IRGen isn't naturally going to create a cleanup just 
because there's a memory access in a scope, and even if it did, the cleanup 
would appear to be unreachable.

That's really a major part of my concern: the test suite might be passing only 
because of some common structure in the tests that minimizes the impact of the 
poor modeling in IR.  IR modeling inadequacies are not going to cause obvious 
problems if, in practice, the SEH scope is always entered in a function that 
doesn't do much inside the scope besides call a function, and the trap then 
happens inside the callee.  In fact, I would be a lot more confident in this 
feature if you actually forced the IR to model it that way, so that there's 
nothing but a by-fiat unoptimizable invoke within the scope.  You could extract 
the body of the scope into a separate function in the frontend and then, if 
necessary, inline it in a very late pass after any IR-level optimizations, 
although that would still leave me concerned about LLVM CodeGen's 
transformations.
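
A rough sketch of the shape I have in mind (the helper here is hypothetical,
standing in for whatever the frontend would synthesize):

  #include <excpt.h>

  // Hypothetical outlined body; the frontend would synthesize something like
  // this and keep it from being inlined until very late, if at all.
  static int guarded_body(int *p) {
    *p = 0;
    return 0;
  }

  int outlined(int *p) {
    __try {
      return guarded_body(p);   // the only operation in the scope is a call,
                                // which optimizers already treat as a point
                                // where an exception can be raised
    } __except (EXCEPTION_EXECUTE_HANDLER) {
      return -1;
    }
  }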

But even doing that wouldn't help with code like the following, where the only 
operation within the scope of s2 is a memory access:

  void test() {
    std::string s1 = <blah>;
    int *ip = get_invalid_ptr();
    std::string s2 = <blah>;
    *ip = 0;
  }



>> [rjmccall] Yes, but it's not known well to all the compiler developers who 
>> are reading and maintaining the Clang codebase. I'm not saying you should 
>> rename the command-line flag, I'm asking that you not just say "EHa" in 
>> comments and identifiers.
>
> OK, no problem. More comments will be added.

Comments and identifiers.  So, for example, the intrinsic should not be called 
`eha_scope_begin`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80344/new/

https://reviews.llvm.org/D80344
