Re: [patch 0/3] Header file reduction.
Hi, On Wed, 7 Oct 2015, Richard Biener wrote: > > I'm probably the last person in the world that still generally prefers > > -cp :-) I'm getting to the point where I can tolerate -u. > > No, I prefer -cp too - diff just too easily makes a mess out of diffs > with -u, esp. if you have re-indenting going on as well. Actually -c was the recommended form of sending patches for many years even in our own guidelines. It only got changed to -up or -cp when moving instructions from the texinfo files to the website in 2001. From gcc 3.0 (https://gcc.gnu.org/onlinedocs/gcc-3.0/gcc_10.html): Use `diff -c' to make your diffs. Diffs without context are hard for us to install reliably. More than that, they make it hard for us to study the diffs to decide whether we want to install them. Unidiff format is better than contextless diffs, but not as easy to read as `-c' format. If you have GNU diff, use `diff -cp', which shows the name of the function that each change occurs in. ;-) (IMHO it depends on what the patch does if -c or -u is better, if the _change_ is important -u might be better, if the new state is the more interesting thing, -c is) Ciao, Michael.
Re: [patch 0/3] Header file reduction.
On Tue, Oct 6, 2015 at 11:43 PM, Jeff Lawwrote: > On 10/05/2015 02:10 PM, Andrew MacLeod wrote: >>> >>> >>> Is the bitmap/obstack example really one of a change that is >>> desirable? I think if a file uses obstacks then an include of >>> obstack.h is perfectly fine, giving us freedom to e.g. change bitmaps >>> not to use obstacks. Given that multiple headers include obstack.h, >>> and pretty much everything seems to indirectly include bitmap.h >>> anyway, maybe a better change would be to just include it always in >>> system.h. >> >> >> Its just an example of the sort of redundant includes the tool removes. > > It may not be the best example. The tools don't treat obstack specially > (nor should they IMHO). So let's pretend it's not obstack.h which has been > arguably a core part of GCC for a long time. > >> >>I don't see the point in leaving redundant #includes in the source >> code because of direct uses from that header in the source. I'm not >> even sure how I could automate detecting that accurately.. Going >> forward, If anyone ever makes a change which removes a header from an >> include file, they just have to correct the fallout. heh. Thats kinda >> all I've done for 4 months :-) At least we'll have grasp of the >> ramifications.. > > And the last sentence is the key here. We're trying to get to a point where > we can make certain kinds of changes, then have the compiler spit out > errors, fix the errors and have a high degree of confidence that the final > change is correct and that we've found all the places that need to change. > > The change could be as simple as moving a function declaration to its > natural place, collecting interfaces & data into classes, or something more > ambitious like removing trees from the backend. Folks will note that these > are all refactorings that we don't want to change any observable behaviour. > > > > > >>> * diff -c is somewhat unusual and I find diff -u much more readable. >> >> >> unsual? I've been using -cp for the past 2 decades and no one has ever >> mentioned it before... poking around the wiki I see it mentions you >> can use either -up or -cp. >> >> I guess I could repackage things using -up... I don't even know where >> my script is to change it :-). is -u what everyone uses now? no one >> has mentioned it before that I am aware of. > > I'm probably the last person in the world that still generally prefers -cp > :-) I'm getting to the point where I can tolerate -u. No, I prefer -cp too - diff just too easily makes a mess out of diffs with -u, esp. if you have re-indenting going on as well. Richard. > >> >> >>> * Maybe the patches for reordering and removing should be split, also >>>for readability and for easier future identification of problems. >>> >> I was trying to avoid too much churn on 550ish files... I didn't think >> each one needed 2 sets of check-ins.It could be done, but it will >> take a while. The reordering patch can be quickly generated, but the >> reduction on all those files will take the better part of a week. >> >> My theory is it perfectly safe to back out any single file from the >> patch set if we discover it has an issue and then examine what the root >> of the problem is.. >> >> tool patch coming shortly... probably tomorrow now. > > I haven't looked at the 3 patches in detail yet. Given my familiarity with > the overall process/goal, I can probably handle them as-is. They're just big > mechanical changes. > > jeff
Re: [patch 0/3] Header file reduction.
On 10/05/2015 03:11 PM, Andrew MacLeod wrote: In any case, a direct include of obstack.h in coretypes.h was considered earlier in the aggregation process and it didn't show up as something that would be a win. It is included a couple of common places that we have no control over.. in particular libcpp/include/symtab.h includes obstack.h and is included by tree-core.h. A very significant number of files bring that in. If we included obstack.h in coretypes.h then those files would be including it again for a second time for no particularly good reason. So I made the judgement call to not put it in coretypes.h. And just as important, we can revisit the aggregators and when we do so, we ought to be able to answer the question, "if obstack.h is put into coretypes.h" does that clean things up elsewhere and re-run the tools to clean things up. And it's one example, but it does point out a problem with this sort of automated approach: realistically no one is going to check the whole patch, and it may contain changes that could be done better. The point being that the aggregation *wasn't* automated... and has nothing to do with this patch set.I analyzed and preformed all that sort of thing earlier. Sure judgment calls were made, but it wasn't automated in the slightest. There are certainly further aggregation improvements that could be made... and either I or someone else could do more down the road., The heavy lifting has all been done now. Agreed. So the *only* thing that is automated is removing include files which are not needed so that we can get an idea of what the true dependencies in the source base are. Also agreed. the reduction on all those files will take the better part of a week. That's a little concerning due to the possibility of intervening commits. I'd like to make one requirement for checkin, that you take the revision at which you're committing and then run the script again, verifying that the process produces the same changes as the patch you committed. (Or do things in smaller chunks.). Well, sure there are intervening commits.. the only ones that actually matter are the ones which fail to compile because someone made a code change which now requires a header that wasn't needed before. which is really a state we're looking for I think. I fix those up before committing. Its *possible* a conditional compilation issue could creep in, but highly unlikely. More likely is conditional compilation will be removed :-) We're trying to get away from conditional compilation as a general direction. Intervening commits are always a problem with this kind of large patch that hits many places. But IMHO, they're an easily managed problem. jeff
Re: [patch 0/3] Header file reduction.
On 10/05/2015 02:10 PM, Andrew MacLeod wrote: Is the bitmap/obstack example really one of a change that is desirable? I think if a file uses obstacks then an include of obstack.h is perfectly fine, giving us freedom to e.g. change bitmaps not to use obstacks. Given that multiple headers include obstack.h, and pretty much everything seems to indirectly include bitmap.h anyway, maybe a better change would be to just include it always in system.h. Its just an example of the sort of redundant includes the tool removes. It may not be the best example. The tools don't treat obstack specially (nor should they IMHO). So let's pretend it's not obstack.h which has been arguably a core part of GCC for a long time. I don't see the point in leaving redundant #includes in the source code because of direct uses from that header in the source. I'm not even sure how I could automate detecting that accurately.. Going forward, If anyone ever makes a change which removes a header from an include file, they just have to correct the fallout. heh. Thats kinda all I've done for 4 months :-) At least we'll have grasp of the ramifications.. And the last sentence is the key here. We're trying to get to a point where we can make certain kinds of changes, then have the compiler spit out errors, fix the errors and have a high degree of confidence that the final change is correct and that we've found all the places that need to change. The change could be as simple as moving a function declaration to its natural place, collecting interfaces & data into classes, or something more ambitious like removing trees from the backend. Folks will note that these are all refactorings that we don't want to change any observable behaviour. * diff -c is somewhat unusual and I find diff -u much more readable. unsual? I've been using -cp for the past 2 decades and no one has ever mentioned it before... poking around the wiki I see it mentions you can use either -up or -cp. I guess I could repackage things using -up... I don't even know where my script is to change it :-). is -u what everyone uses now? no one has mentioned it before that I am aware of. I'm probably the last person in the world that still generally prefers -cp :-) I'm getting to the point where I can tolerate -u. * Maybe the patches for reordering and removing should be split, also for readability and for easier future identification of problems. I was trying to avoid too much churn on 550ish files... I didn't think each one needed 2 sets of check-ins.It could be done, but it will take a while. The reordering patch can be quickly generated, but the reduction on all those files will take the better part of a week. My theory is it perfectly safe to back out any single file from the patch set if we discover it has an issue and then examine what the root of the problem is.. tool patch coming shortly... probably tomorrow now. I haven't looked at the 3 patches in detail yet. Given my familiarity with the overall process/goal, I can probably handle them as-is. They're just big mechanical changes. jeff
Re: [patch 0/3] Header file reduction.
On 10/05/2015 09:27 AM, Bernd Schmidt wrote: On 10/02/2015 04:22 AM, Andrew MacLeod wrote: The patches are generated by a pair of tools. * gcc-order-includes goes through the headers and canonically reorders some of our more common/troublesome headers and removes any duplicates. This includes headers which are included by other headers. (ie, obstack.h can be removed as a duplicate if bitmap.h is included already.) * remove-includes is the tool which tries to remove each non-conditional header file and does the real work. Is the bitmap/obstack example really one of a change that is desirable? I think if a file uses obstacks then an include of obstack.h is perfectly fine, giving us freedom to e.g. change bitmaps not to use obstacks. Given that multiple headers include obstack.h, and pretty much everything seems to indirectly include bitmap.h anyway, maybe a better change would be to just include it always in system.h. Its just an example of the sort of redundant includes the tool removes. And your assertion turns out to be incorrect... bitmap.h is barely used outside the backend, thus it is included in the backend.h aggregator (This is the only header now which includes bitmap.h... Most of this many-month effort was to untangle all those indirect includes.) There are only 6 remaining uses of bitmap.h in all the front end files. ( Most files can get obstack.h from tree.h. (it comes from symtab.h in tree-core.) If they don't get it there, it often comes from diagnostics-core.h.) I don't see the point in leaving redundant #includes in the source code because of direct uses from that header in the source. I'm not even sure how I could automate detecting that accurately.. Going forward, If anyone ever makes a change which removes a header from an include file, they just have to correct the fallout. heh. Thats kinda all I've done for 4 months :-) At least we'll have grasp of the ramifications.. I'll have a patch shortly to add these and some other useful tools to a header-tools directory in contrib. How soon? It's difficult to meaningfully comment on these patches without looking at how they were generated. Two points: within the next day, im just cobbling together some minimal documentation. Dunno how much that will help reviewing the patches tho. the include reduction process was described in more detail earlier in this project. * diff -c is somewhat unusual and I find diff -u much more readable. unsual? I've been using -cp for the past 2 decades and no one has ever mentioned it before... poking around the wiki I see it mentions you can use either -up or -cp. I guess I could repackage things using -up... I don't even know where my script is to change it :-). is -u what everyone uses now? no one has mentioned it before that I am aware of. * Maybe the patches for reordering and removing should be split, also for readability and for easier future identification of problems. I was trying to avoid too much churn on 550ish files... I didn't think each one needed 2 sets of check-ins.It could be done, but it will take a while. The reordering patch can be quickly generated, but the reduction on all those files will take the better part of a week. My theory is it perfectly safe to back out any single file from the patch set if we discover it has an issue and then examine what the root of the problem is.. tool patch coming shortly... probably tomorrow now. Andrew
Re: [patch 0/3] Header file reduction.
On 10/05/2015 10:10 PM, Andrew MacLeod wrote: Its just an example of the sort of redundant includes the tool removes. And your assertion turns out to be incorrect... bitmap.h is barely used outside the backend, thus it is included in the backend.h aggregator (This is the only header now which includes bitmap.h... Most of this many-month effort was to untangle all those indirect includes.) I said a few headers include obstack.h, not bitmap.h, and that's true in my (maybe a week old) checkout. My suggestion was to move the include of the former to (as Richi corrected) coretypes.h. And it's one example, but it does point out a problem with this sort of automated approach: realistically no one is going to check the whole patch, and it may contain changes that could be done better. * diff -c is somewhat unusual and I find diff -u much more readable. unsual? I've been using -cp for the past 2 decades and no one has ever mentioned it before... poking around the wiki I see it mentions you can use either -up or -cp. I guess I could repackage things using -up... I don't even know where my script is to change it :-). is -u what everyone uses now? no one has mentioned it before that I am aware of. I'm pretty much used to seeing diff -u, whenever I get a -c diff things become harder to work out, because the region in the diff you're looking at never tells you the full story. In this case in particular, the existence of both reordering and removing changes makes it very hard to mentally keep track of what's going on. Let's take this example: Index: attribs.c === *** attribs.c (revision 228331) --- attribs.c (working copy) *** along with GCC; see the file COPYING3. *** 20,36 #include "config.h" #include "system.h" #include "coretypes.h" ! #include "tm.h" #include "tree.h" - #include "alias.h" #include "stringpool.h" #include "attribs.h" #include "stor-layout.h" - #include "flags.h" - #include "diagnostic-core.h" - #include "tm_p.h" - #include "cpplib.h" - #include "target.h" #include "langhooks.h" #include "plugin.h" --- 20,31 #include "config.h" #include "system.h" #include "coretypes.h" ! #include "target.h" #include "tree.h" #include "stringpool.h" + #include "diagnostic-core.h" #include "attribs.h" #include "stor-layout.h" #include "langhooks.h" #include "plugin.h" You could be misled into thinking that diagnostic-core.h and target.h are removed, and the algorithm confuses the issue by showing that lines "changed" rather than getting removed or added. With a unified diff, the size is cut in half which makes things more readable to begin with: Index: attribs.c === --- attribs.c (revision 228331) +++ attribs.c (working copy) @@ -20,17 +20,12 @@ #include "config.h" #include "system.h" #include "coretypes.h" -#include "tm.h" +#include "target.h" #include "tree.h" -#include "alias.h" #include "stringpool.h" +#include "diagnostic-core.h" #include "attribs.h" #include "stor-layout.h" -#include "flags.h" -#include "diagnostic-core.h" -#include "tm_p.h" -#include "cpplib.h" -#include "target.h" #include "langhooks.h" #include "plugin.h" and with things closer together it's easier to follow what's actually going on. This is a smaller example, many instances in your patch are actually about a page long and you have to scroll back and forth to work things out, getting confused because everything in the 410k of text looks the same. This was actually one of the reasons I proposed splitting the patch into reordering and removal phases, it would alleviate the diff -c disadvantages. the reduction on all those files will take the better part of a week. That's a little concerning due to the possibility of intervening commits. I'd like to make one requirement for checkin, that you take the revision at which you're committing and then run the script again, verifying that the process produces the same changes as the patch you committed. (Or do things in smaller chunks.). Bernd
Re: [patch 0/3] Header file reduction.
On 10/05/2015 04:37 PM, Bernd Schmidt wrote: On 10/05/2015 10:10 PM, Andrew MacLeod wrote: Its just an example of the sort of redundant includes the tool removes. And your assertion turns out to be incorrect... bitmap.h is barely used outside the backend, thus it is included in the backend.h aggregator (This is the only header now which includes bitmap.h... Most of this many-month effort was to untangle all those indirect includes.) I said a few headers include obstack.h, not bitmap.h, and that's true in my (maybe a week old) checkout. My suggestion was to move the include of the former to (as Richi corrected) coretypes.h. Ah, sorry the parsing was non-deterministic and I parsed it the other way. My comments refer to the "true" dependencies in the code after all the un-needed headers have been trimmed out... i've little doubt mainline shows obstack.h and bitmap.h being included everywhere. In any case, a direct include of obstack.h in coretypes.h was considered earlier in the aggregation process and it didn't show up as something that would be a win. It is included a couple of common places that we have no control over.. in particular libcpp/include/symtab.h includes obstack.h and is included by tree-core.h. A very significant number of files bring that in. If we included obstack.h in coretypes.h then those files would be including it again for a second time for no particularly good reason. So I made the judgement call to not put it in coretypes.h. And it's one example, but it does point out a problem with this sort of automated approach: realistically no one is going to check the whole patch, and it may contain changes that could be done better. The point being that the aggregation *wasn't* automated... and has nothing to do with this patch set.I analyzed and preformed all that sort of thing earlier. Sure judgment calls were made, but it wasn't automated in the slightest. There are certainly further aggregation improvements that could be made... and either I or someone else could do more down the road., The heavy lifting has all been done now. So the *only* thing that is automated is removing include files which are not needed so that we can get an idea of what the true dependencies in the source base are. * diff -c is somewhat unusual and I find diff -u much more readable. unsual? I've been using -cp for the past 2 decades and no one has ever mentioned it before... poking around the wiki I see it mentions you can use either -up or -cp. I guess I could repackage things using -up... I don't even know where my script is to change it :-). is -u what everyone uses now? no one has mentioned it before that I am aware of. I'm pretty much used to seeing diff -u, whenever I get a -c diff things become harder to work out, because the region in the diff you're looking at never tells you the full story. In this case in particular, the existence of both reordering and removing changes makes it very hard to mentally keep track of what's going on. I can switch to -u.. I've just never seen the request before. I can regenerate the patches with -u if you want. the reduction on all those files will take the better part of a week. That's a little concerning due to the possibility of intervening commits. I'd like to make one requirement for checkin, that you take the revision at which you're committing and then run the script again, verifying that the process produces the same changes as the patch you committed. (Or do things in smaller chunks.). Well, sure there are intervening commits.. the only ones that actually matter are the ones which fail to compile because someone made a code change which now requires a header that wasn't needed before. which is really a state we're looking for I think. I fix those up before committing. Its *possible* a conditional compilation issue could creep in, but highly unlikely. I can rerun everything on that revision from just before I committed and see if everything is the same. It'll take a week to find out :-) but that seems like a reasonable sanity check. Andrew
Re: [patch 0/3] Header file reduction. - unified patches
On 10/05/2015 05:11 PM, Andrew MacLeod wrote: I can switch to -u.. I've just never seen the request before. I can regenerate the patches with -u if you want. You are right, the patches are significantly easier to read with -u.. I've changed my svn diff script.here's all 3 patches: Andrew backend.patch.bz2 Description: application/bzip FE.patch.bz2 Description: application/bzip config.patch.bz2 Description: application/bzip
Re: [patch 0/3] Header file reduction.
On 10/02/2015 04:22 AM, Andrew MacLeod wrote: The patches are generated by a pair of tools. * gcc-order-includes goes through the headers and canonically reorders some of our more common/troublesome headers and removes any duplicates. This includes headers which are included by other headers. (ie, obstack.h can be removed as a duplicate if bitmap.h is included already.) * remove-includes is the tool which tries to remove each non-conditional header file and does the real work. Is the bitmap/obstack example really one of a change that is desirable? I think if a file uses obstacks then an include of obstack.h is perfectly fine, giving us freedom to e.g. change bitmaps not to use obstacks. Given that multiple headers include obstack.h, and pretty much everything seems to indirectly include bitmap.h anyway, maybe a better change would be to just include it always in system.h. I'll have a patch shortly to add these and some other useful tools to a header-tools directory in contrib. How soon? It's difficult to meaningfully comment on these patches without looking at how they were generated. Two points: * diff -c is somewhat unusual and I find diff -u much more readable. * Maybe the patches for reordering and removing should be split, also for readability and for easier future identification of problems. Bernd
Re: [patch 0/3] Header file reduction.
On Mon, Oct 5, 2015 at 3:27 PM, Bernd Schmidtwrote: > On 10/02/2015 04:22 AM, Andrew MacLeod wrote: >> >> The patches are generated by a pair of tools. >> * gcc-order-includes goes through the headers and canonically reorders >> some of our more common/troublesome headers and removes any duplicates. >> This includes headers which are included by other headers. (ie, obstack.h >> can be removed as a duplicate if bitmap.h is included already.) >> * remove-includes is the tool which tries to remove each non-conditional >> header file and does the real work. > > > Is the bitmap/obstack example really one of a change that is desirable? I > think if a file uses obstacks then an include of obstack.h is perfectly > fine, giving us freedom to e.g. change bitmaps not to use obstacks. Given > that multiple headers include obstack.h, and pretty much everything seems to > indirectly include bitmap.h anyway, maybe a better change would be to just > include it always in system.h. Not system.h please - use coretypes.h if really necessary. Richard. >> I'll have a patch shortly to add these and some other useful tools to a >> header-tools directory in contrib. > > > How soon? It's difficult to meaningfully comment on these patches without > looking at how they were generated. Two points: > * diff -c is somewhat unusual and I find diff -u much more readable. > * Maybe the patches for reordering and removing should be split, also >for readability and for easier future identification of problems. > > > Bernd >
[patch 0/3] Header file reduction.
OK, newly regenerated patches to remove header files from the latest version of the tools. The patches are generated by a pair of tools. * gcc-order-includes goes through the headers and canonically reorders some of our more common/troublesome headers and removes any duplicates. This includes headers which are included by other headers. (ie, obstack.h can be removed as a duplicate if bitmap.h is included already.) * remove-includes is the tool which tries to remove each non-conditional header file and does the real work. I'll have a patch shortly to add these and some other useful tools to a header-tools directory in contrib. There are 3 patches which follow: backend.a files, fe files, and config files.. Effecting 547 files total These were generated from a trunk snapshot taken about 2 weeks ago. The tools ran, and once I finished some minor tweaking, I reapplied them to a 9/28 branch. Anything which didn't apply cleanly due to intervening changes, was simply re-reduced. I then reapplied them to a snapshot from this morning for these patches. The tool also monitor what macros are defined and conditionally consumed, and wont remove a header which still compiles if it may define a macro which is used in conditional compilation (tm.h is frequently affected by this). The tool spits out messages like these: * Passed host and target builds, but must keep target.h because it provides ASM_OUTPUT_DEF Possibly required by ipa-icf.c * Passed host and target builds, but must keep insn-attr.h because it provides DELAY_SLOTS Possibly required by toplev.c note my host arch doesn't define DELAY_SLOTS in insn-attr-common.h (included by insn-attr.h) , but some other target archs do, and this is caught this during some of the target builds. (toplev.c has the following lines:) #ifndef DELAY_SLOTS if (flag_delayed_branch) warning (0, "this target machine does not have delayed branches"); #endif So as far as I can tell I'm catching all those conditional compilation cases. The only ones I might have missed would be macros which some host build defines on the command line and which doesn't show up in a config-list.mk target build.. I guess. Everything bootstraps on x86_64-pc-linux-gnu and powerpc64le-unknown-linux-gnu. All targets in config-list.mk still build. Regressions tests also came up clean. OK for trunk? I will make minor tweaks as needed when applying to trunk for check-in.. I know that the new requirement for cgraph.h in builtins.c is no longer required in trunk. (https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00102.html) Going forward, we'll know when we are adding new dependencies, as this thread so timely shows :-). Andrew PS. Then keep an eye for anything funny. I'm not expecting much, but if a file causes an issue, simple reverting the change for that one file should be sufficient until we figure out why it is an issue,. No change in any of these files is dependent on any other.. it simple include reduction, and there should be no functional change in code.