RE: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-04-12 Thread Thibault Raffaillac
Quite lengthy but very interesting mail! It took me a while to formulate a 
proper reply :)

 Feedback can be scarce, but don't let that stop you from submitting a
 proposal.
 Either way, can you keep me informed about any progress? I might wish to help,
 though that would probably be later in the cycle (got a lot queued up for
 the coming months).

Submitted :) The reviews are not too positive yet; most of my effort is going into
making my plan clear. If there is any progress, help will be much appreciated indeed.

 Great that's exactly what I'm aiming at:) It's not just presenting the
 results of static analysis in real-time, as I actually dislike most
 kinds of it like finding memory leaks, to me that seems like an attempt
 to make the computer do what it's really bad at (understanding the
 code). I just want to give the programmer the fullest picture of the
 situation but at the same time make it so it doesn't become noise that
 interferes. More or less you can say the goal is to provide feedback
 that allows the user to extend his understanding of the program. That
 mostly means giving access to all the information that can be
 unambiguously concluded from the code by the computer. To what degree
 we carry it and how much the compiler is involved is only a question of
 practicality and performance.

I quite agree for the most part, still there is a subtle nuance on which I want
to argue: Do we really help the programmer by offering all the valuable
information that is possible to infer? Ten years from now, would he/she be a
better programmer if we had not let him/her strive to simulate the program in
mind, or code a portion in assembly and finally learn about machine
architecture?

My point is to avoid creating an interface that assists or helps the
programmer, as he/she might become dependent on it. That only helps in the
short term, and the only person who ever learns something is the one who
actually creates the compiler. If one statement could sum up my view, it would be
that the user improves through his/her use of the interface (here the
feedback messages).

How does it make a difference in practice? I want to minimize the information 
given :)
The reason I want to introduce feedback messages is that this particular
information (the inner workings of compilers) is very hard to find in practice.
I want to give a slight help to put the user on the rails, nothing more.

 Perfect! However, how to do that so that it actually works seems a bit
 complex. The first (practically unsolvable) issue is what actually
 constitutes better code, as given two pieces one may be faster in some
 circumstances and the other in different ones. But as I understand, that's
 not really what we're trying to tell the user, rather we want him to
 explore for himself what's possible and what are the results and why
 they are the way they are? I'm guessing this will unfortunately (or
 fortunately) require him to actually see and understand the intermediate
 code, see how it changes after different optimizations, and see the
 output assembly. Personally I really need/want that;) Though my end
 target is a bit more to broaden the abstraction when programming
 (both up and down), so not to just show what's happening with the code
 but also allow the programmer to interact with it on that lower level.
 LLVM seems like the perfect fit for that but I've got some gripes with
 it, and that is still far away in the future.

Excellent! Letting the user explore by himself sounds great, and seeing the
output assembly/IR alongside is indeed a must. I like the idea that compilation
is a cooperation between programmer and machine (as far as the programmer is
inclined to help, of course). It would also be nice to see compilation split
at value range propagation, so that one could verify it is properly computed before
proceeding to the optimizations.
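
For readers who want to try this kind of exploration with stock GCC, here is a small sketch. The function names and the expression that yields 2560 are made up for illustration; they are not taken from the proposal. Compiling the file with `gcc -O2 -S` writes the assembly to a `.s` file, and adding `-fdump-tree-optimized` writes the optimized intermediate representation to a dump file, so the source can be compared against both.

```c
#include <assert.h>

/* Constant folding candidate: both operands are literals, so the
   compiler can compute 2560 at compile time. */
int folded_constant(void)
{
    int a = 10 * 256;
    return a;
}

/* Strength reduction candidate: multiplying by 8 can be lowered to a
   left shift by 3. */
unsigned times_eight(unsigned x)
{
    return x * 8;
}

/* Value range candidate: after the first test, i is known to lie in
   [0, 9], so a range-aware compiler may treat the second test as dead. */
int scaled_index(int i)
{
    if (i < 0 || i > 9)
        return -1;
    if (i > 100)
        return -2;
    return i * 8;
}

/* Sanity checks: the source-level results must stay the same whatever
   the optimizer decides to do. */
void check_examples(void)
{
    assert(folded_constant() == 2560);
    assert(times_eight(5u) == (5u << 3));
    assert(scaled_index(3) == 24);
    assert(scaled_index(200) == -1);
}
```

Whether each transformation actually shows up in the output depends on the GCC version, target and optimization level, which is exactly the kind of thing one has to find out by looking.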

 Unfortunately I only saw 36m of it as it broke and seeking doesn't work
 on vimeo for me, so I'll watch the rest later. To me it touches on some
 of the right issues/concepts but in slightly the wrong way, and it
 completely ignores some issues.

Agreed. (Only the first half of the video is relevant for the programming
prototype)

Thibault


Re: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-04-04 Thread Tomasz Borowik
On Mon, 2 Apr 2012 19:57:20 +
Thibault Raffaillac t...@kth.se wrote:

 Bump!
 
 Let me renew my interest in contributing through GSoC with post-compilation
 feedback (This was not an early April joke). Do you think it could lead to an
 acceptable GSoC proposal? (mentor interested?)

Feedback can be scarce, but don't let that stop you from submitting a
proposal.
Either way, can you keep me informed about any progress? I might wish to help,
though that would probably be later in the cycle (got a lot queued up for
the coming months).

 @Tomasz:
 On the interaction side I totally agree that communication between compiler
 and programmer is scarce (and there is room for improvement). Focusing too soon
 on the editor would overlook the wide range of users' needs though, as:
 _ some users do not use an IDE (and will kindly refuse);
 _ some users do not need more communication, as they already know what GCC can
   and cannot do;
 _ some users do not want more communication, as they have other business to
   focus on;

Sure, I'm one of the people who don't use an IDE as it causes more
issues than it solves for me. This isn't meant for everyone, the same
way nothing else is, it just can't be ;p Still, looking at it, other
languages, and different IDEs, I'd say my way of tackling the issues is
more usable and useful than most others, and could easily see wider
adoption. Btw my experience is mostly in low-level kernel/driver
programming, 2D/3D graphics, and games.

 I think the editor being split from the compiler is a good thing. There still
 exist tools to expose static analysis data from the compiler (and choose the
 editor to visualize it with), but fundamentally they are assisting him/her
 rather than helping him/her improve. Instead of gathering loads of data on the
 optimizations/analysis performed, and filtering it for visualization by the
 user, we could relate the optimization technique used so that the user truly
 knows what GCC is capable of (instead of guessing by observation).

Great that's exactly what I'm aiming at:) It's not just presenting the
results of static analysis in real-time, as I actually dislike most
kinds of it like finding memory leaks, to me that seems like an attempt
to make the computer do what it's really bad at (understanding the
code). I just want to give the programmer the fullest picture of the
situation but at the same time make it so it doesn't become noise that
interferes. More or less you can say the goal is to provide feedback
that allows the user to extend his understanding of the program. That
mostly means giving access to all the information that can be
unambiguously concluded from the code by the computer. To what degree
we carry it and how much the compiler is involved is only a question of
practicality and performance.

 My proposal is thus not to be confused with a static analysis visualization:
 the programmer learns what techniques are implemented in GCC (or in compilers
 in general), how to write code that is more easily compiled, and can further
 browse the Internet for detailed theory on the techniques involved.

Perfect! However, how to do that so that it actually works seems a bit
complex. The first (practically unsolvable) issue is what actually
constitutes better code, as given two pieces one may be faster in some
circumstances and the other in different ones. But as I understand, that's
not really what we're trying to tell the user, rather we want him to
explore for himself what's possible and what are the results and why
they are the way they are? I'm guessing this will unfortunately (or
fortunately) require him to actually see and understand the intermediate
code, see how it changes after different optimizations, and see the
output assembly. Personally I really need/want that;) Though my end
target is a bit more to broaden the abstraction when programming
(both up and down), so not to just show what's happening with the code
but also allow the programmer to interact with it on that lower level.
LLVM seems like the perfect fit for that but I've got some gripes with
it, and that is still far away in the future.

 The point on the possible-optimizations-which-could-be-enabled-if-specific-
 -constraint-is-lifted is particularly interesting, but is also extremely risky
 if the compiler makes a stupid remark on a constraint which can obviously
 (for the programmer) not be lifted. If ever, I would introduce it with a LOT of
 care.

Yes and no. First of all I don't necessarily mean for the
compiler/editor to suggest anything to the programmer, rather if the
programmer asks just say what's physically possible, and not what's
right, since if the compiler could do that it would just perform the
optimization. Furthermore the situation with my source code is that I
can probably make all this in such a form that it is actually usable
and useful which seems to me close to impossible with normal languages.
I can also with almost no effort store within the source code the
dialogue between 

Re: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-04-02 Thread Thibault Raffaillac
Bump!

Let me renew my interest in contributing through GSoC with post-compilation
feedback (This was not an early April joke). Do you think it could lead to an
acceptable GSoC proposal? (mentor interested?)

@Tomasz:
On the interaction side I totally agree that communication between compiler and
programmer is scarce (and there is room for improvement). Focusing too soon on
the editor would overlook the wide range of users' needs though, as:
_ some users do not use an IDE (and will kindly refuse);
_ some users do not need more communication, as they already know what GCC can
  and cannot do;
_ some users do not want more communication, as they have other business to
  focus on;

I think the editor being split from the compiler is a good thing. There still
exist tools to expose static analysis data from the compiler (and choose the
editor to visualize it with), but fundamentally they are assisting him/her
rather than helping him/her improve. Instead of gathering loads of data on the
optimizations/analysis performed, and filtering it for visualization by the
user, we could relate the optimization technique used so that the user truly
knows what GCC is capable of (instead of guessing by observation).

My proposal is thus not to be confused with a static analysis visualization:
the programmer learns what techniques are implemented in GCC (or in compilers
in general), how to write code that is more easily compiled, and can further
browse the Internet for detailed theory on the techniques involved.

The point on the possible-optimizations-which-could-be-enabled-if-specific-
-constraint-is-lifted is particularly interesting, but is also extremely risky
if the compiler makes a stupid remark on a constraint which can obviously
(for the programmer) not be lifted. If ever, I would introduce it with a LOT of
care.

Thibault
ps: As for an editor with real-time feedback on static analysis and more, I am
100% with you :) (and there are some promising prototypes, like in this talk:
http://vimeo.com/36579366)

 Hello all,
 
 My name is Thibault Raffaillac, CS degree student at Kungliga Tekniska Högskolan,
 Stockholm, Sweden (in double-degree partnership with Ecole Centrale Marseille,
 France).
 GCC currently provides no concise way to inform the user whether it applied an
 expected optimization (i.e., it understood the code). As a result, some will do
 premature optimizations when they do not trust the compiler, and some others
 will create overly convoluted code with blind belief in the compiler. This is
 especially relevant for users not initiated into the internals of GCC.
 The project I would like to propose is a feedback mechanism for the optimizations
 performed by GCC. To avoid binding users to the compiler, I would focus on some
 very standard optimizations across vendors, or for some specific yet nice
 features I would indicate their specificity to GCC/an architecture.
 
 The feedback would be triggered when compilation is successful, and display a
 couple of different messages each time it is run:
 gcc --feedback test.c
 test.c:xx:x: info: All operands being constant, constant folding was applied to assign '2560' to 'a'
 test.c:xx:x: info: GCC could not fold constants here because...
 test.c:xx:x: info: As integers are stored in binary format, strength reduction was applied to replace '* 8' by '<< 3'
 test.c:xx:x: info: Basic block vectorization was applied to pack the 3 independent additions into a single SIMD instruction
 test.c:xx:x: info: GCC implements unordered_map as open-addressed hash tables, with double hashing probing
 
 Unlike the internal verbose messages, these would form a set,
 and the system would remember those already displayed and decrease their
 frequency of occurrence between compilations. All messages would explain what
 triggered them, cite the optimization name, and describe the consequence.
 
 As for the work plan, it would consist in:
 _ Enumerating all possible messages in the messages set.
 _ Implementing a function receiving feedback from each optimization unit and
   choosing whether to display it: info_printf(enum INFO_INDEX, const char*, ...);
 _ Writing a formatting guide for adding messages to the set.
 
 My academic background includes compiler construction, C programming and
 Human-Computer Interaction. I am very much interested in the usability of compilers
 (on which I am currently writing my degree thesis -
 http://www.csc.kth.se/~traf/traf-sketch.pdf) and thus would be glad to
 contribute to GCC.
 
 If this can be of interest, suggestions are welcome!
 
 Best regards,
 Thibault (http://www.csc.kth.se/~traf/)
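
The info_printf entry point from the work plan quoted above could look roughly like the sketch below. Only the parameter list info_printf(enum INFO_INDEX, const char*, ...) comes from the proposal. The enum members, the int return value, the in-memory counters and the "show on the 1st, 2nd, 4th, 8th, ... trigger" policy are assumptions made for illustration; a real implementation would have to persist the counters between compilations.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message indices; the proposal only names the enum type. */
enum INFO_INDEX {
    INFO_CONSTANT_FOLDING,
    INFO_STRENGTH_REDUCTION,
    INFO_BB_VECTORIZATION,
    INFO_COUNT
};

/* Number of times each message has been triggered so far. Kept in
   memory here; a real tool would store this between compilations. */
static unsigned trigger_count[INFO_COUNT];

/* Assumed policy: display a message only when its trigger count is a
   power of two (1, 2, 4, 8, ...), so it becomes rarer over time. */
static int info_should_display(enum INFO_INDEX idx)
{
    assert((unsigned)idx < (unsigned)INFO_COUNT);
    unsigned n = ++trigger_count[idx];
    return (n & (n - 1)) == 0;
}

/* Prints the message to stderr if the policy allows it. Returns the
   number of characters written, or 0 when the message is suppressed. */
int info_printf(enum INFO_INDEX idx, const char *fmt, ...)
{
    if (!info_should_display(idx))
        return 0;

    va_list ap;
    va_start(ap, fmt);
    int written = fprintf(stderr, "info: ");
    written += vfprintf(stderr, fmt, ap);
    va_end(ap);
    return written;
}
```

With this policy, four consecutive calls for INFO_CONSTANT_FOLDING print on the first, second and fourth call and stay silent on the third, while other message kinds keep their own independent counters.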


Re: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-03-29 Thread Tomasz Borowik
On Tue, 27 Mar 2012 22:33:39 +
Thibault Raffaillac t...@kth.se wrote:

 Hello all,
 
 My name is Thibault Raffaillac, CS degree student at Kungliga Tekniska Högskolan,
 Stockholm, Sweden (in double-degree partnership with Ecole Centrale Marseille,
 France).
 GCC currently provides no concise way to inform the user whether it applied an
 expected optimization (i.e., it understood the code). As a result, some will do
 premature optimizations when they do not trust the compiler, and some others
 will create overly convoluted code with blind belief in the compiler. This is
 especially relevant for users not initiated into the internals of GCC.
 The project I would like to propose is a feedback mechanism for the optimizations
 performed by GCC. To avoid binding users to the compiler, I would focus on some
 very standard optimizations across vendors, or for some specific yet nice
 features I would indicate their specificity to GCC/an architecture.
 
 The feedback would be triggered when compilation is successful, and display a
 couple of different messages each time it is run:
 gcc --feedback test.c
 test.c:xx:x: info: All operands being constant, constant folding was applied to assign '2560' to 'a'
 test.c:xx:x: info: GCC could not fold constants here because...
 test.c:xx:x: info: As integers are stored in binary format, strength reduction was applied to replace '* 8' by '<< 3'
 test.c:xx:x: info: Basic block vectorization was applied to pack the 3 independent additions into a single SIMD instruction
 test.c:xx:x: info: GCC implements unordered_map as open-addressed hash tables, with double hashing probing
 
 Unlike the internal verbose messages, these would form a set,
 and the system would remember those already displayed and decrease their
 frequency of occurrence between compilations. All messages would explain what
 triggered them, cite the optimization name, and describe the consequence.
 
 As for the work plan, it would consist in:
 _ Enumerating all possible messages in the messages set.
 _ Implementing a function receiving feedback from each optimization unit and
   choosing whether to display it: info_printf(enum INFO_INDEX, const char*, ...);
 _ Writing a formatting guide for adding messages to the set.
 
 My academic background includes compiler construction, C programming and
 Human-Computer Interaction. I am very much interested in the usability of compilers
 (on which I am currently writing my degree thesis -
 http://www.csc.kth.se/~traf/traf-sketch.pdf) and thus would be glad to
 contribute to GCC.
 
 If this can be of interest, suggestions are welcome!
 
 Best regards,
 Thibault (http://www.csc.kth.se/~traf/)
 

Hi Thibault,

I completely agree, and it's actually a part of what I'm targeting in the long
term, so I think we might be able to join forces. I'm also thinking of a GSoC
project though in different areas (there's an email in the list about them on
19.03), so maybe we could do separate parts that combine into something even
more awesome;)

I think a huge part of the issue is in the medium of communication between the 
programmer and compiler. I'm targeting an environment where the source code 
editor practically becomes the compiler's front-end. My project allows 
extremely dynamic presentation of the source code, so I can e.g.
 - easily inform the programmer about anything in an unobtrusive manner within 
the code, 
 - give him different perspectives of the same code,
 - allow him to give precise and detailed information to the compiler about 
possible code optimizations without making the code unreadable.

The first two points may seem already solved by Eclipse, Xcode or whatever
other gigantic IDE, but I'm talking about a much larger scale of feedback
presented instantly, like: explicit/implicit and inferred typing info, constant folds,
dead code, unrolled loops, data flow, vector operations, tree view of
expressions.

The first issue is that for any non-trivial amount of code you'll end up with
thousands of messages, 90% of which are probably not very interesting (similarly
to warnings in a certain style of objective programming in C). As long as the
output is not interleaved with the code at the right place and the delay from
writing to getting feedback is too long, the feature will lose much of its
usefulness. Though don't misunderstand me, I think it's still better to have
the info in any form than not.

The last point is probably the most important, as there is often a large number
of optimizations that cannot be done due to, for example, pointer aliasing rules,
although the programmer knows that the optimization is safe. I can easily add
literally hundreds of markers like "this expression is volatile", "the result
of this function call will not change within this loop", "these two pointers
don't alias", and it wouldn't obfuscate the code as much as with normal
languages. Furthermore my editor can easily list only the meaningful options
for a given