Re: r178105 - [modules] Re-enable the "ambiguous expansion of macro" warning.

Argyrios Kyrtzidis Wed, 13 Nov 2013 09:23:38 -0800

On Nov 12, 2013, at 6:03 PM, Richard Smith <[email protected]> wrote:


> On Tue, Nov 12, 2013 at 12:29 PM, Argyrios Kyrtzidis <[email protected]> 
> wrote:
> 
> On Oct 30, 2013, at 6:41 PM, Richard Smith <[email protected]> wrote:
> 
>> On Wed, Oct 30, 2013 at 5:32 PM, Argyrios Kyrtzidis <[email protected]> 
>> wrote:
>> 
>> On Oct 28, 2013, at 1:13 PM, Richard Smith <[email protected]> wrote:
>> 
>>> On Mon, Oct 28, 2013 at 12:15 PM, Richard Smith <[email protected]> 
>>> wrote:
>>> On Sat, Oct 26, 2013 at 10:14 AM, Argyrios Kyrtzidis <[email protected]> 
>>> wrote:
>>> As I said above, I think there's a misunderstanding here; to reiterate: 
>>> when you say "I want to use the interface provided by M1. If I've not 
>>> imported M2" this is case 1) I mentioned and I agree with what you said 
>>> should happen. If this does not happen currently then it should be fixed.
>>> 
>>> Should we focus on what should happen when importing both M1 and M2 ?
>>> 
>>> Yes, I think refocusing will help. In particular, I think this is the key 
>>> question on which we have different views:
>>>  
>>> How do I know that two modules are *related* ? One module may have imported 
>>> the other but if it doesn't re-export it it might as well be an *unrelated* 
>>> one from my perspective; why do I need to care what modules it imported 
>>> internally ?
>>> 
>>> There are (I think) two different ways that two modules can have a conflict 
>>> on what a macro name means:
>>> 
>>> 1) Both modules want the name to mean something (which be mean "it's a 
>>> macro expanding to X" or might be "it's not a macro").
>>> 2) One module wants to *override* what another module meant the name to 
>>> mean (M2 says "it expands to FOO BAR", and M1 says "actually, I override M2 
>>> and say it expands to BAZ")
>>> 
>>> There are also two different ways that a module can express its intent for 
>>> a macro name to mean something:
>>> 
>>> A) It can #define the name to something
>>> B) It can #undef the name
>>> 
>>> I think we agree that:
>>> 
>>> 1A: If two modules want the same macro name to be defined to different 
>>> things, and that macro name is used, that is clearly ambiguous and should 
>>> be diagnosed.
>>> 
>>> I'm on the fence about this one:
>>> 
>>> 1B: If one module wants the macro name to be #defined to something, and 
>>> another module wants it to be undefined, currently Clang silently picks the 
>>> definition. I am not entirely convinced that's appropriate -- if my module 
>>> explicitly #undef's a macro, that's a statement of intent: in my interface, 
>>> I'm using that name to mean something that's not a macro. I don't think 
>>> it's obvious that:
>>> 
>>>   #undef X
>>>   #define X X
>>> 
>>> and 
>>> 
>>>   #undef X
>>> 
>>> should mean fundamentally different things here. Either way, I'm saying "I 
>>> own the macro name X and it produces the token sequence X."
>>> 
>>> Hmm, I think it's actually rather important that we diagnose this: on the 
>>> upgrade path from #includes to modules, we'll get a silent change of 
>>> behavior without this.
>>> 
>>> If we do diagnose this case (where two imported modules disagree on a 
>>> macro: one says it should be defined and the other says it should not), 
>>> then I care a lot less about case 2. (It's enough for me that we diagnose 
>>> this; we don't need to preserve the existing behavior.)
>>> 
>>> That said, I think we should take as a guiding rule that modules and 
>>> submodules should behave the same, and we currently violate that in the 
>>> case of macros.
>>>  
>>> Case 2 is where we really seem to disagree. It's also a weird case for 
>>> modules, since we rarely have cases in which the interface of one module 
>>> includes "modify the interface of another module". Which brings me to your 
>>> next point:
>>>  
>>> I'm suggesting that If I explicitly import 2 modules, and the modules' 
>>> interfaces disagree on what an identifier means, then there is ambiguity.
>>> One module may or may not have imported the other module internally, but 
>>> that doesn't change the fact that the modules' interfaces disagree on the 
>>> meaning of an identifier.
>>> 
>>> The key thing for me is, if one module imports another module, undefines 
>>> some part of it, then re-exports it, it seems pretty clear that it intends 
>>> to *modify* the interface of that first module. This happens in other 
>>> guises too: extending overload sets, specializing templates, shadowing a 
>>> tag name with a non-tag name, and so on. In all other cases, importing both 
>>> modules gives you the modified interface *with no ambiguity*. The ambiguity 
>>> has already been resolved by the dominating module saying how to resolve 
>>> it. And if I import a module that says that it modifies another module's 
>>> interface, I expect to get that modified interface.
>>> 
>>> So: 2A) if M2 defines our macro, and M1 imports M2 and undefines and 
>>> redefines that macro, and either I import M1 or I import *both* M1 and M2, 
>>> I've expressed my intent to use M1's resolution to the conflict, and I 
>>> think I should get M1's definition.
>>> 2B) if M2 defines our macro, and M1 imports M2 and undefines the macro, and 
>>> either I import M1 or I import both M1 and M2, I think the macro should not 
>>> be defined.
>>> 
>>> By exporting a macro undefinition that undefines M2's macro, M1 is saying 
>>> "importing me undefines M2's macro definition". There's no ambiguity in 
>>> that, and if I import both modules, that's what I expect to happen. This 
>>> would behave in (largely) the same way we behave across submodules in the 
>>> same module, which makes the intuitive model simpler. If M1 didn't want 
>>> this behavior, it should not export the macro.
>>> 
>>> 
>>> Let me make my proposal a bit more concrete, since I think I've been vague 
>>> up until now.
>>> 
>>> Instead of storing a single flat list of macro definitions and undefines, 
>>> we store a DAG. The predecessors of each directive are the set of 
>>> directives that were visible at the point of the directive. (For a #define, 
>>> these must all be either #undefs or #defines to the same value.) When we 
>>> see a macro name being used, we take the set of visible leaves of this 
>>> graph, check they are consistent[1], and use that consistent value as the 
>>> value of the macro. Otherwise, we diagnose.
>>> 
>>> [1] I'm not certain whether consistency should allow a mixture of #undefs 
>>> and #defines, but I'm inclined to think it should not.
>>> 
>> 
>> Richard, thanks for the detailed explanation, I'm convinced that if a module 
>> undefines a macro of another module and re-exports it, the macro should be 
>> considered undefined even if I import both modules.
>> I have another question, what if the module doesn't re-export the imported 
>> module ? Specifically, M1 imports M2, undefines a macro from M2, but then it 
>> doesn't re-export module M2 ? That seems like a sufficiently different case, 
>> to be considered separately.
>> 
>> Yes, I agree that that case needs separate consideration.
>> 
>> I think it's clear that:
>> - If you import only M1, the macro should not be defined
>> - If you import only M2, the macro should be defined
>> 
>> More interesting questions:
>>  - If you import M1 and M3 (where M3 defines the same macro), either the 
>> macro should be defined or any use of it should be ambiguous. Which?
>>  - If you import M1 and M2, then either (1) macro is not defined, (2) macro 
>> is defined, or (3) uses of macro are ambiguous might be reasonable. Which?
>> 
>> 
>> The user-facing model I'm currently thinking about is:
>>  * Preprocessor directives in modules are entities in their own right (every 
>> definition and undefinition of a macro is a distinct entity).
>>  * These entities are visible if they are not private and the corresponding 
>> (sub)module has been imported.
>>  * A #define X or #undef X directive overrides all definitions of X that are 
>> visible at the point of the directive.
>>  * A #define or #undef directive is active if it is visible and no visible 
>> directive overrides it.
>>  * If a macro name is used, the set of active directives for that macro name 
>> must be consistent (either all #undefs, or all #defines defining the macro 
>> name to the same value).
> 
> Hi Richard, apologies for taking so long to get back to you.
> 
> I'd like to expand a bit on the user-facing model, particularly regarding a 
> module that imports another module but does not re-export it (M1 imports M2 
> but does not re-export M2).
> What are the semantics here, is it:
>       - User imports M1
>       - M2 is not visible at all (hidden due to non getting exported)
>       - The fact that M1 imports M2 is strictly an implementation detail, M1 
> can remove the import of M2 or change to importing M3 and there will be no 
> noticeable change in the user's code (regardless of what modules the user 
> imported).
> ?
> 
> To start off with, we're only talking about the case where M1 undefines M2's 
> macro, and *exports* that undefine. I separately want to move us to a module 
> where preprocessor directives are not exported by default, so (slightly 
> longer-term) I want M1 to need to take explicit action for this to happen. In 
> this case, M1 has exported that (1) the identifier in question is not a 
> macro, and (2) in a conflict between this intent and M2's intent, M1's intent 
> should win.
> 
> So: if M1 re-exports its undefine of the macro, and user code imports both M1 
> and M2, it will not see the macro. If M1 stops importing M2 before undefining 
> the macro, then it's no longer undefining M2's macro, and the user code will 
> now see an ambiguity between M1's exported undef and M2's exported define.
> 
> If M1 didn't re-export its undefine of the macro, then it's not part of M1's 
> interface, and the preprocessing of the user code is entirely unaffected by 
> it.
> 
> But in either case, if M1 imports M2, and user code imports M1, then M2 is 
> part of the user's program, and the user's program can break if M1 switches 
> from using M2 to using M3. This problem is not specific to macros: imagine 
> both the user program and M3 define a global function named 'foo'. If you 
> transitively include any part of a module, that module is part of your 
> program.
> 
> Also, if M1 importing M2 is an implementation detail that doesn't "leak out" 
> to the user of M1, what about if M1 exports a declaration using a type from 
> M2 ?
> 
> There are a number of issues here, and I'm not sure which one concerns you.

I wanted to see how the macro questions fits with the rest of the user-model.

> The way this works at the moment is that you can implicitly use the 
> declarations of things from M2, but you can't name them or use their 
> definitions. So:
> 
> // M2
> #define s_n S_FIELD_NAME
> struct S { int s_n; };
> 
> // M1
> import M2; // and don't re-export
> 
> S *make();
> void take(S *p);
> void take_val(S s);
> 
> // user code
> import M1;
> 
> int s_n, foo; // ok, not a (visible) macro
> #define s_n foo // ok, no conflict
> int bar = s_n; // ok, assigns from foo
> #undef s_n // ok, undefines our s_n, not M2's
> 
> int main() {
>   take(make()); // ok
>   auto *p = make(); // ok
>   S *p = make(); // error, 'S' is not visible
>   take_val(*make()); // error, 'S::S' is not visible
>   return make()->s_n; // error, definition of 'S' is not visible
> }
> 
> import M2;
> int k = make()->s_n; // ok, S's definition is visible and M2's s_n macro is 
> active now
> 
> It doesn't matter that S's field is not actually called 's_n', because S's 
> definition is visible only when S is visible. The existence of the type 'S' 
> leaks out of M1, but its name *mostly* does not. This will fail, though:
> 
> import M1;
> struct S {}; // error, multiple definitions of S.

Thanks again for the detailed explanation. I think that your proposed model for 
macros is consistent with the general model, and I have no concerns, thanks!

>> That gives the following answers to the above:
>> 
>> If M2 contained a #__private_macro directive for the macro name:
>>  - If you import M1 and M3, where M3 defines the same macro that M1 
>> undefines, the macro is defined to M3's value (that is the only visible 
>> directive).
>>  - If you import M1 and M2, the macro is defined to M2's value (that is the 
>> only visible directive).
>>  - (For completeness) if you import M1, M2, and M3, any use of the macro is 
>> ambiguous.
>> (That is, a #__private_macro undefinition has no effect.)
>> 
>> If M2 did not contain a #__private_macro directive for the macro name:
>>  - If you import M1 and M3, where M3 defines the same macro that M1 
>> undefines, then any use of the macro is ambiguous (because M1 said "it's not 
>> a macro", and M3 said "it's this macro").
>>  - If you import M1 and M2, then the macro is not defined (M2's definition 
>> is not active, because M1's undefinition is visible and overrides it).
>>  - If you import M1, M2, and M3, this is the same as the first case. Any use 
>> of the macro is ambiguous: the active directives are M1's undefinition and 
>> M3's definition, and those conflict.
>> 
>> I think I'm comfortable with those answers. For case of importing both M1 
>> and M2, another option would be to treat it like importing M1 and M3. 
>> However, that leads to a situation where (M1 re-exports M2 and I import M1) 
>> behaves differently from (M1 does not re-export M2 and I import both M1 and 
>> M2), which seems surprising.
>> 
>> 
>> What do you think? Are you comfortable with the above answers, or do you 
>> think we need to go in a different direction for this case?
> 
>

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: r178105 - [modules] Re-enable the "ambiguous expansion of macro" warning.

Reply via email to