Fw: GCC interpretation of C11 atomics (DR 459)

2018-02-25 Thread Ruslan Nikolaev via gcc
Alexander,
Thank you for your comments. Please see my response below. I definitely do not 
want to fight for or against this change in gcc, but there are legitimate 
concerns to consider. I think it would really be good to consider this change 
to make things more compatible (i.e., at least between clang/llvm and gcc, 
which can both be used within the same ecosystem). There are real practical 
benefits to having true lock-free double-width operations when implementing 
algorithms that rely on ABA tagging for pointers, and C11 at last gives an 
opportunity to do that without resorting to assembly or platform-specific 
implementations.


> Note that there are more issues here than just the behavior on read-only
> memory: you need to ensure that the whole program, including all static and
> shared libraries, is compiled with -mcx16 (and currently there's no
> ld.so/ld-level support to ensure that), or you'd need to be sure that it's
> safe to mix code compiled with different -mcx16 settings because it never
> happens to interoperate on wide atomic objects.

Well, if libatomic already does this when the corresponding CPU feature is 
available (i.e., effectively implements the operations using cmpxchg16b), I 
do not see any problem here. -mcx16 implies that you *have* cmpxchg16b; 
therefore, other code compiled without the -mcx16 flag will go to libatomic. 
Inside libatomic, it will detect that cmpxchg16b *is* available, making code 
compiled with and without the -mcx16 flag completely compatible on a given 
system. Or am I missing something here?

If you do not have cmpxchg16b but the program is compiled with the flag, it 
will simply not run (as expected).

So, in other words, libatomic should still decide whether you have 
cmpxchg16b for the cases where -mcx16 is not specified. But if the flag is 
specified, cmpxchg16b can be generated unconditionally. If you want better 
compatibility, you will not specify the flag. A mix of -mcx16 and -mno-cx16 
code will thus be binary compatible.
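
To make the dispatch concrete, a minimal sketch (__atomic_fetch_add_16 is 
the symbol libatomic exports for 16-byte fetch-add; the codegen comments 
describe typical behavior and may vary by compiler and version):

/* The same C11 source under the two compilation modes. */
#include <stdatomic.h>

static _Atomic __int128 counter;

void bump (void)
{
  /* With -mcx16 (clang, and GCC before this change): an inline
     cmpxchg16b loop, no library call.
     Without -mcx16: a call to __atomic_fetch_add_16 in libatomic,
     which itself detects cmpxchg16b at runtime, so both binaries
     operate compatibly on the same 128-bit object. */
  atomic_fetch_add (&counter, 1);
}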

> Note that there's no "load" function in the __sync family, so the original
> concern about operations on read-only memory does not apply.
Yes, but per the clarification from WG14/C11, read-only memory should not be 
a concern at all, as this behavior is not specified anyway (regardless of 
the const specifier). Read-modify-write is allowed for atomic_load as long 
as there is no 'visible' change to the value being loaded. In this sense, 
the bug that was filed previously regarding read-only memory accesses and 
the const specifier does not seem to be valid.
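
Concretely, the usual trick is a compare-exchange that never visibly 
changes the value; a sketch using GCC's generic builtins (illustrative 
only, not libatomic's actual source):

typedef unsigned __int128 u128;

/* On failure, the CAS deposits the current value in 'expected'; on
   "success" (the value happened to be 0) it stores 0 over 0.  Either
   way the loaded value is returned and never visibly modified -- but
   the memory is written to, hence the read-only memory question. */
u128 load128 (u128 *p)
{
  u128 expected = 0;
  __atomic_compare_exchange_n (p, &expected, 0, /* weak */ 0,
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
}
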
Additionally, it is really odd and counterintuitive to still provide this 
support for the (almost) deprecated __sync builtins while not giving the 
same opportunity to the newer and more advanced __atomic functions.

> You don't mention it directly, so just to make it clear for readers: on 
> systems
> where GNU IFUNC extension is available (i.e. on Glibc), libatomic tries to do
> exactly that: test for cmpxchg16b availability and redirect 128-bit atomics to
> lock-free RMW implementations if so.  (I don't like this solution)

Yes, but libatomic makes things slower due to the indirection. Also, it is 
much harder to track what is going on, as there is no guarantee of 
lock-freedom in this case. BTW, the fact that libatomic currently uses 
cmpxchg16b when available may actually be helpful for switching to the 
suggested behavior without breaking binary compatibility (if I understand 
everything correctly).

-- Ruslan
   

gcc-8-20180225: recipe for target 'configure-target-libbacktrace' failed

2018-02-25 Thread Siegmar Gross

Hi,

today I tried to install gcc-8-20180225 with accelerator support
on my "SUSE Linux Enterprise Server 12.3 (x86_64)" with gcc-6.4.0.
I used the following commands to download and build everything.

setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/usr/local/cuda/lib64
setenv CUDA_INC_PATH /usr/local/cuda/include
setenv CUDA_LIB_PATH /usr/local/cuda/lib64
set path = ( ${path} /usr/local/cuda/bin )

git clone https://github.com/MentorEmbedded/nvptx-tools
git clone https://github.com/MentorEmbedded/nvptx-newlib

wget ftp://gcc.gnu.org/pub/gcc/snapshots/LATEST-8/gcc-8-20180225.tar.xz
tar xf gcc-8-20180225.tar.xz
ln -s gcc-8-20180225 gcc-8.0.0
cd gcc-8.0.0
ln -s ../nvptx-newlib/newlib newlib
cd ..

mkdir make_nvptx-tools
cd make_nvptx-tools
../nvptx-tools/configure --prefix=/usr/local/gcc-8.0.0 \
  |& tee log.configure
make |& tee log.make
make install |& tee log.make-install
cd ..

mkdir gcc-8.0.0_build
cd gcc-8.0.0_build
../gcc-8.0.0/configure --prefix=/usr/local/gcc-8.0.0 \
  --target=nvptx-none \
  --enable-as-accelerator-for=x86_64-pc-linux-gnu \
  --with-build-time-tools=/usr/local/gcc-8.0.0/nvptx-none/bin \
  --disable-sjlj-exceptions \
  --with-newlib \
  --enable-newlib-io-long-long \
  --enable-languages=c,c++,fortran,lto \
  |& tee log.configure
make |& tee log.make


Unfortunately, "make" breaks with the following error.

loki gcc-8.0.0_build 137 tail -19 log.make
make[3]: Leaving directory 
'/export2/src/gcc-8.0.0/gcc-8.0.0_build/nvptx-none/libgcc'
make[2]: Leaving directory 
'/export2/src/gcc-8.0.0/gcc-8.0.0_build/nvptx-none/libgcc'
Checking multilib configuration for libbacktrace...
mkdir -p -- nvptx-none/libbacktrace
Configuring in nvptx-none/libbacktrace
configure: creating cache ./config.cache
checking build system type... x86_64-pc-linux-gnu
checking host system type... nvptx-unknown-none
checking target system type... nvptx-unknown-none
checking for nvptx-none-gcc... /export2/src/gcc-8.0.0/gcc-8.0.0_build/./gcc/xgcc -B/export2/src/gcc-8.0.0/gcc-8.0.0_build/./gcc/ 
-B/usr/local/gcc-8.0.0/nvptx-none/bin/ -B/usr/local/gcc-8.0.0/nvptx-none/lib/ -isystem /usr/local/gcc-8.0.0/nvptx-none/include -isystem 
/usr/local/gcc-8.0.0/nvptx-none/sys-include

checking for C compiler default output file name...
configure: error: in 
`/export2/src/gcc-8.0.0/gcc-8.0.0_build/nvptx-none/libbacktrace':
configure: error: C compiler cannot create executables
See `config.log' for more details.
Makefile:11774: recipe for target 'configure-target-libbacktrace' failed
make[1]: *** [configure-target-libbacktrace] Error 1
make[1]: Leaving directory '/export2/src/gcc-8.0.0/gcc-8.0.0_build'
Makefile:883: recipe for target 'all' failed
make: *** [all] Error 2
loki gcc-8.0.0_build 138


I was able to build that part for gcc-7.3.0 with a patched nvptx.c
file using the same commands. I would be grateful if somebody knows
a solution or can fix the problem. Do you need anything else? Thank
you very much in advance for any help.


Kind regards

Siegmar
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.

It was created by package-unused configure version-unused, which was
generated by GNU Autoconf 2.64.  Invocation command line was

  $ /export2/src/gcc-8.0.0/gcc-8.0.0/libbacktrace/configure 
--srcdir=../../../gcc-8.0.0/libbacktrace --cache-file=./config.cache 
--enable-multilib --with-cross-host=x86_64-pc-linux-gnu 
--prefix=/usr/local/gcc-8.0.0 --enable-as-accelerator-for=x86_64-pc-linux-gnu 
--with-build-time-tools=/usr/local/gcc-8.0.0/nvptx-none/bin 
--disable-sjlj-exceptions --with-newlib --enable-newlib-io-long-long 
--enable-languages=c,c++,fortran,lto --program-transform-name=s&^&nvptx-none-& 
--disable-option-checking --with-target-subdir=nvptx-none 
--build=x86_64-pc-linux-gnu --host=nvptx-none --target=nvptx-none

## --------- ##
## Platform. ##
## --------- ##

hostname = loki
uname -m = x86_64
uname -r = 4.4.114-94.11-default
uname -s = Linux
uname -v = #1 SMP Thu Feb 1 19:28:26 UTC 2018 (4309ff9)

/usr/bin/uname -p = x86_64
/bin/uname -X = unknown

/bin/arch  = x86_64
/usr/bin/arch -k   = unknown
/usr/convex/getsysinfo = unknown
/usr/bin/hostinfo  = unknown
/bin/machine   = unknown
/usr/bin/oslevel   = unknown
/bin/universe  = unknown

PATH: /usr/local/valgrind-3.12.0/bin
PATH: /usr/local/jdk-9/bin
PATH: /usr/local/jdk-9/db/bin
PATH: /usr/local/llvm-5.0/bin
PATH: /usr/local/pgi-2017/linux86-64/2017/bin
PATH: /usr/local/intel_xe_2018/compilers_and_libraries_2018.1.163/linux/bin/intel64
PATH: /usr/local/intel_xe_2018/compilers_and_libraries_2018.1.163/linux/mpi/intel64/bin
PATH: /opt/solstudio12.6/bin
PATH: /usr/local/gcc-6.4.0/bin
PATH: /usr/local/sbin
PATH: /usr/local/bin
PATH: /sbin
PATH: /usr/sbin
PATH: /bin
PATH: /usr/bin
PATH: /usr/local/hwloc-1.11.5/bin
PATH: /root/Linux/x86_64/bin
PAT

Re: GCC interpretation of C11 atomics (DR 459)

2018-02-25 Thread Alexander Monakov
Hello,

Although I don't want to fight to defend GCC's design change here, let me
offer a couple of corrections/additions so everyone is on the same page:

On Mon, 26 Feb 2018, Ruslan Nikolaev via gcc wrote:
> 
> 1. Not consistent with clang/llvm, which completely supports double-width
> atomics for arm32, arm64, x86 and x86-64, making it possible to write
> portable code (without specific extensions or assembly code) across all
> these architectures (which is finally possible with C11!). The behavior of
> clang: if -mcx16 is specified, cmpxchg16b is generated for x86-64 (without
> any calls to libatomic); otherwise, there is a redirection to libatomic.
> For arm64, ldaxp/stlxp are always generated. In my opinion, this is very
> logical and non-confusing.

Note that there are more issues here than just the behavior on read-only
memory: you need to ensure that the whole program, including all static and
shared libraries, is compiled with -mcx16 (and currently there's no
ld.so/ld-level support to ensure that), or you'd need to be sure that it's
safe to mix code compiled with different -mcx16 settings because it never
happens to interoperate on wide atomic objects.

(if you mix -mcx16 and -mno-cx16 code operating on the same 128-bit object,
you get wrong code that will appear to work >99% of the time)

> 3. The behavior is inconsistent even within GCC. The older (and more
> limited, less portable, etc.) __sync builtins still use cmpxchg16b
> directly; the newer __atomic and C11 ones do not. Moreover, the __sync
> builtins are probably less suitable for arm/arm64.

Note that there's no "load" function in the __sync family, so the original
concern about operations on read-only memory does not apply.

> For these reasons, it may be a good idea if the GCC folks reconsider the
> past decision. And just to clarify: if -mcx16 (x86-64) is not specified
> during compilation, it is totally OK to redirect to libatomic and make the
> final decision there about whether the target CPU supports the given
> instruction. But if it is specified, it makes sense, for performance
> reasons and lock-freedom guarantees, to always generate the instruction
> directly.

You don't mention it directly, so just to make it clear for readers: on systems
where GNU IFUNC extension is available (i.e. on Glibc), libatomic tries to do
exactly that: test for cmpxchg16b availability and redirect 128-bit atomics to
lock-free RMW implementations if so.  (I don't like this solution)
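
Roughly like this (a simplified sketch of the mechanism, not libatomic's
actual source; function names are made up, and the lock-free variant's
translation unit would itself be built with -mcx16):

#include <pthread.h>
#include <cpuid.h>

typedef unsigned __int128 u128;

static u128 load_16_cx16 (u128 *p)       /* lock-free path */
{
  u128 expected = 0;
  __atomic_compare_exchange_n (p, &expected, 0, 0,
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
}

static pthread_mutex_t lock_16 = PTHREAD_MUTEX_INITIALIZER;

static u128 load_16_locked (u128 *p)     /* fallback path */
{
  pthread_mutex_lock (&lock_16);
  u128 v = *p;
  pthread_mutex_unlock (&lock_16);
  return v;
}

/* The IFUNC resolver runs once, at dynamic-link time, and selects an
   implementation based on the CPUID CMPXCHG16B feature bit. */
static __typeof__ (load_16_cx16) *resolve_load_16 (void)
{
  unsigned eax, ebx, ecx, edx;
  __get_cpuid (1, &eax, &ebx, &ecx, &edx);
  return (ecx & bit_CMPXCHG16B) ? load_16_cx16 : load_16_locked;
}

u128 my_load_16 (u128 *p) __attribute__ ((ifunc ("resolve_load_16")));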

Thanks.
Alexander


GCC interpretation of C11 atomics (DR 459)

2018-02-25 Thread Ruslan Nikolaev via gcc
Hi,

I have read multiple bug reports (84522, 80878, 70490) and the past 
discussion regarding the GCC change to redirect double-width (128-bit) 
atomics for x86-64 and arm64 to libatomic. Below I mention the major 
concerns, as well as the response from WG14 (C11) regarding DR 459 which, 
most likely, triggered this change in more recent GCC releases in the 
first place.
If I understand correctly, the redirection to libatomic was made for two 
reasons:

1. cmpxchg16b is not available on early amd64 processors. (However, the 
-mcx16 flag already specifies that you are targeting CPUs that have this 
instruction, so this should not be a concern when the flag is specified.)

2. atomic_load on read-only memory. DR 459 now requires 'const' qualifiers 
for atomic_load, which probably resulted in the interpretation that 
read-only memory must be supported. However, per the response from WG14 
(see below), that does not seem to be the case at all. Therefore, the 
previously filed bug 70490 does not seem to be valid.

There are several concerns with the current GCC behavior:

1. Not consistent with clang/llvm, which completely supports double-width 
atomics for arm32, arm64, x86 and x86-64, making it possible to write 
portable code (without specific extensions or assembly code) across all 
these architectures (which is finally possible with C11!). The behavior of 
clang: if -mcx16 is specified, cmpxchg16b is generated for x86-64 (without 
any calls to libatomic); otherwise, there is a redirection to libatomic. 
For arm64, ldaxp/stlxp are always generated. In my opinion, this is very 
logical and non-confusing.

2. Oftentimes you want a strict guarantee (by specifying the -mcx16 flag 
for x86-64) that the generated code is lock-free; otherwise it is useless. 
Double-width atomics are often used in lock-free algorithms that use tags 
(stamps) on pointers to resolve the ABA problem, so it is very useful to 
have corresponding support in the compiler.

3. The behavior is inconsistent even within GCC. The older (and more 
limited, less portable, etc.) __sync builtins still use cmpxchg16b 
directly; the newer __atomic and C11 ones do not. Moreover, the __sync 
builtins are probably less suitable for arm/arm64.

4. atomic_load can be implemented using read-modify-write, as that is the 
only option for x86-64 and arm64 anyway (see below).

For these reasons, it may be a good idea if the GCC folks reconsider the 
past decision. And just to clarify: if -mcx16 (x86-64) is not specified 
during compilation, it is totally OK to redirect to libatomic and make the 
final decision there about whether the target CPU supports the given 
instruction. But if it is specified, it makes sense, for performance 
reasons and lock-freedom guarantees, to always generate the instruction 
directly.

-- Ruslan

Response from the WG14 (C11) convener regarding DR 459 (I asked for 
permission to publish this response here):
Ruslan,

     Thank you for your comments.  There is no normative requirement that const 
objects be suitable for read-only memory.  An example and a footnote refer to 
read-only memory as a way to illustrate a point, but examples and footnotes are 
not normative.  The actual nature of read-only memory and how it can be used 
are outside the scope of the standard, so there is nothing to prevent 
atomic_load from being implemented as a read-modify-write operation.

                                        David
My original email:

Dear David Keaton,

After reviewing the proposed change in DR 459 for C11 
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_459), I 
identified that adding a const qualifier to atomic_load (C11 originally 
specified it without one) may actually be harmful in some cases.
In particular, for the double-width (128-bit) atomics available on x86-64 
(the cmpxchg16b instruction) and arm64 (the ldaxp/stlxp instructions), it 
is currently only possible to implement a 128-bit atomic_load using the 
corresponding read-modify-write instructions (i.e., potentially rewriting 
memory with the same value but, in essence, not changing it). Such 
implementations will not work on read-only memory. Similar concerns apply, 
to some extent, to x86 and arm32 for double-width (64-bit) atomics. 
Otherwise, there is no obstacle to implementing all C11 atomics for the 
corresponding types on these architectures. Moreover, the well-known 
clang/llvm compiler already implements all double-width operations for x86, 
x86-64, arm32 and arm64 (atomic_load is implemented using the corresponding 
read-modify-write instructions). Double-width atomics are often used in 
data structures that need tagging of pointers to avoid the ABA problem 
(e.g., in lock-free stacks and queues).
It is my understanding that C11 aimed to make atomics more or less portable 
across different microarchitectures, while at the same time giving a 
compiler the ability to optimize code well and utilize the full potential 
of the corresponding microarchitecture.
If it is now required to support read-only memory (i.e., the const 
qualifier) for atomic_load, 128-bit atomics will likely be impossible to 
implement

gcc-8-20180225 is now available

2018-02-25 Thread gccadmin
Snapshot gcc-8-20180225 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20180225/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 257975

You'll find:

 gcc-8-20180225.tar.xz   Complete GCC

  SHA256=faaa9656b627a05180a57e25a544293115b20c334efe16b4f8ba5600a3ec330a
  SHA1=22692132364fc256746189cb3cd811cd2baf9103

Diffs from 8-20180218 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: $target.h vs $target-protos.h

2018-02-25 Thread Georg-Johann Lay

Sandra Loosemore wrote:
The internals manual says that a backend for $target should have 
$target.h and $target-protos.h files, but doesn't say what the 
difference between them is or what belongs in which file.  Current 
practice seems to be a mix of


(1) $target.h contains macro definitions and $target-protos.h contains 
extern declarations.


(2) $target.h defines the external interface to the backend (the macros 
documented in the internals manual) and $target-protos.h contains things 
shared between $target.c and the various .md files.


But some generic source files include $target.h only and some source 
files include both, which wouldn't be true if either of the above models 
applied.  So is there some other logical way to explain the difference 
and what goes where?


IIRC, one difference is the scanning for GTY markers used to tag ggc roots: 
$target.h is scanned, whereas $target-protos.h is not (and AFAIR adding 
$target-protos.h in config.gcc to the files being scanned pops up other 
problems). Hence, when you have a target-specific GTYed structure that is 
shared by several back-end modules, you'd add the struct definition to 
$target.h. (If only one module needs such objects, then you'd add the type 
definition to, say, $target.c, which is scanned, or which can be turned 
into a scanned file by adjusting config.gcc.)


The bulk of the code is not GTYed, of course, and from my experience the 
usage of the mentioned files is as you summarized: $target-protos.h is 
usually a blob of prototypes used internally for communication within a 
back end, whereas $target.h "defines" the part of the backend that is not 
yet hookized, e.g. the TARGET_ macros that define (initializers for) 
register classes etc.
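
A minimal sketch of that split for a hypothetical port "foo" (all names 
invented for illustration):

/* foo.h: the macro interface from the internals manual; included via
   tm.h by the compiler proper and also by libgcc, and scanned for
   GTY(()) markers. */
#define FUNCTION_BOUNDARY 32
#define TARGET_CPU_CPP_BUILTINS() builtin_define ("__foo__")

/* foo-protos.h: prototypes foo.c shares with the .md expanders and the
   rest of the back end; reached via tm_p.h, never seen by libgcc. */
extern void foo_expand_prologue (void);
extern bool foo_legitimate_address_p (machine_mode, rtx, bool);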


And the usage in libgcc might be different: $target.h is used in libgcc 
(which is the reason why $target.h might need the runtime library 
exception, cf. PR61152: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61152#c0). Ideally, all 
information needed by libgcc and other target libraries would be conveyed 
by built-in macros, but to the best of my knowledge such a complete 
separation of compiler sources from libgcc sources has not yet been 
achieved. $target-protos.h (via tm_p.h), however, is not used by libgcc.


Johann

The thing that got me thinking about this is looking at a new port 
originally written by a customer, where it seems like there is too much 
stuff in $target.h.  Then I started chasing it down:


- FUNCTION_BOUNDARY depends on which arch variant is being compiled for

- Its expansion references some internal target data structures and 
enumerators to look up that info


- The default definition of TARGET_PTRMEMFUNC_VBIT_LOCATION uses 
FUNCTION_BOUNDARY


- ipa-prop.c uses TARGET_PTRMEMFUNC_VBIT_LOCATION but doesn't include 
$target-protos.h


- tree.c also uses FUNCTION_BOUNDARY directly without including 
$target-protos.h


- Probably there are many other target macros that potentially have 
similar issues


- Presumably this means everything referenced in the expansion of any 
target macro in $target.h also needs to be declared in $target.h, and must 
not depend on $target-protos.h also being included at the point of use. 
(A sketch of the problematic pattern is below.)
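
Concretely, the problematic pattern looks like this (hypothetical port):

/* In foo.h: */
#define FUNCTION_BOUNDARY foo_function_boundary ()  /* varies per arch */

/* In foo-protos.h: */
extern int foo_function_boundary (void);

/* Generic files such as tree.c and ipa-prop.c expand FUNCTION_BOUNDARY
   but include only tm.h ($target.h), not tm_p.h ($target-protos.h), so
   the call is compiled against an implicit declaration -- or not at
   all. */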


So what is the purpose of having a separate $target-protos.h?

-Sandra the confused





Re: gcc-7.3.0: ptxas lib_a-hash_func.o, line 11; fatal : Invalid initial value expression

2018-02-25 Thread Siegmar Gross

Hi Thomas,

thank you very much for your help. I applied the patch to nvptx.c
and was able to build everything. The compiler works for my small
accelerator programs.

loki nvptx 147 diff nvptx.c nvptx.c.orig
1878,1881c1878
<   bool function = SYMBOL_REF_DECL (sym)
<   && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL);
<   if (!function)
<   fprintf (asm_out_file, "generic(");
---
>   fprintf (asm_out_file, "generic(");
1883,1886c1880
<   if (!function)
<   fprintf (asm_out_file, val ? ") + " : ")");
<   else if (val)
<   fprintf (asm_out_file, " + ");
---
>   fprintf (asm_out_file, val ? ") + " : ")");
loki nvptx 148


I'm not sure if I used CUDA 8 or CUDA 9 to build gcc-7.2.0, but I
assume it was already CUDA 9.

loki nvptx 148 ls -ld /usr/local/gcc-7.2.0/ /usr/local/cuda
lrwxrwxrwx 1 root root8 Sep 27 09:15 /usr/local/cuda -> cuda-9.0
drwxr-xr-x 9 root root 4096 Nov  2 12:07 /usr/local/gcc-7.2.0/
loki nvptx 149


Thank you very much once more

Siegmar


On 24.02.2018 at 23:18, Thomas Schwinge wrote:

Hi!

On Sat, 24 Feb 2018 17:20:13 +0100, Siegmar Gross wrote:

today I tried to install gcc-7.3.0 with accelerator support


Thanks for giving that a try and reporting back!


on my "SUSE Linux Enterprise Server 12.3 (x86_64)" with
gcc-6.4.0. I used the following commands to download and build
everything.


Thanks for providing these (but I have not yet reviewed them in detail,
because the problem might be solved already, see below).


'../../../../../../gcc-7.3.0/newlib/libc/search/'`hash_func.c
ptxas lib_a-hash_func.o, line 11; fatal   : Invalid initial value expression
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status
Makefile:413: recipe for target 'lib_a-hash_func.o' failed
make[8]: *** [lib_a-hash_func.o] Error 1


Am I right in guessing that you're using CUDA 9?  Then it'd most likely be
the issue discussed in "Update nvptx target to work with cuda 9".  By now,
a fix has been committed to GCC trunk, and I assume the same would also
work on earlier GCC releases/branches.


I was able to build that part for gcc-7.2.0 with the same commands.


Interesting.  Or, maybe you've not been using CUDA 9 in that build?


Regards,
  Thomas


Re: GSOC 2018 - Textual LTO dump tool project

2018-02-25 Thread Martin Jambor
Hello Hrishikesh,

I apologize for replying to you this late; it has been a busy week and
now I am traveling.

On Mon, Feb 19 2018, Hrishikesh Kulkarni wrote:
> Hi,
>
> I am Hrishikesh Kulkarni currently studying as an undergrad student in
> Computer Engineering at Pune University, India. I find compilers quite
> interesting as a subject,  and would like to apply to GSoC to gain some
> understanding of how real-world compilers work. So far, I have managed to
> build gcc and perform some simple tweaks to the codebase. In particular, I
> would like to apply to the Textual LTO dump tool project.
>

I must say I am impressed by the research you have already done.
Nevertheless, please note that Ray Kim has also expressed interest in
the project.  Martin Liska will be the mentor, so I will let him drive
the selection process.  On the other hand, Ray also liked another
project, so maybe he will pick that and everyone will be happy.

> As far as I understand, the motivation for LTO framework was to enable
> cross file interprocedural optimizations, and for this purpose an ipa pass
> is divided into following three stages:
>
>1.
>
>LGEN: The pass does a local analysis of the function, generates a
>“summary” (i.e., the information relevant to the pass), and writes it to
>the LTO object file.

A pass might do that, but the output of the whole stage is not just the
pass summaries; it also writes the function IL (above all, the function's
gimple statements) to the object file.

>2.
>
>WPA: The LTO object files are given as input to the linker, which then
>invokes the lto1 frontend to perform global IPA analysis over the
>call-graph and to write optimized summaries to LTO object files
>(partitioning). The global IPA analysis is done over the summaries and
>not the actual function bodies.

Well... note that partitioning actually means dividing the whole
compiled program/library into chunks that are then compiled
independently in the LTRANS stage.  But you are basically right that WPA
also does whole-program analysis based on summaries and then writes its
decisions to optimization summaries, yes.

>3.
>
>LTRANS: The partitions are read back, and the function bodies are
>reconstructed from the summaries and are then compiled to produce real
>object files.

Function bodies and the summaries are distinct things.  The body consists
of gimple statements and all the associated stuff (such as types, so a
lot of stuff), whereas when we refer to summaries, we mean the small
chunks of data that interprocedural optimizations such as inlining or
IPA-CP squirrel away because they cannot feasibly work on the bodies of
the entire program.

But apart from this terminology issue, you are basically correct: at the
LTRANS stage, IPA passes apply transformations to the bodies according to
the optimization summaries generated by the WPA phase.  And then all the
normal, intra-procedural passes and code generation run.
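
As a rough map from the three stages to the code, the hooks an IPA pass
provides look like this (paraphrased and simplified from struct
ipa_opt_pass_d in GCC's tree-pass.h; the real signatures differ):

struct cgraph_node;

struct ipa_pass_lto_hooks
{
  void (*generate_summary) (void);           /* LGEN: analyze bodies      */
  void (*write_summary) (void);              /* LGEN: stream summary out  */
  void (*read_summary) (void);               /* WPA: read summaries back  */
  void (*write_optimization_summary) (void); /* WPA: stream out decisions */
  void (*read_optimization_summary) (void);  /* LTRANS: read decisions    */
  unsigned (*function_transform) (struct cgraph_node *);
                                             /* LTRANS: apply to bodies   */
};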

>
>
> If I understand correctly, the motivation for the textual LTO dump tool
> is to easily analyze the contents of an LTO object file, similar to
> readelf or objdump?

That is how I understand it too, but Martin may have some further uses
in mind.

>
> Assume that LTO object file contains in pureconst section: 0b0110 (0b for
> binary prefix) corresponding to values of fs->pure_const_state and
> fs->state_previously_known.
>
> If I understand correctly, the output of dump tool should then be:
>
> pure_const pass:
>
> pure_const_state = IPA_PURE (enum value of pure_const_state_e corresponding
> to 0b01)
>
> state_previously_known = IPA_NEITHER (enum value of pure_const_state_e
> corresponding to 0b10)
>
> Is this the expected output of the dump tool ?

I think the tool would have to do a bit more than just dump the summaries
of IPA passes.  I tend to think that the task should also include dumping
gimple bodies (but we already do that in GCC, so it should be mostly easy)
and also types (which are merged as one of the first steps of WPA, and
interesting things happen when merging does something "interesting").  And
perhaps quite a bit more.  Martin?

>
> I am reasonably familiar working with C, C++ and python. My prior
> experience includes opportunities to work in areas of NLP. Some of my
> accomplishments in the area include presenting project VicharDhara- A
> thought Mapper that was selected among top five ideas in Accenture
> Innovation Challenge among 7000 nationwide entries. My paper on this topic
> won the best paper award in IEEE Conference ICCUBEA-2017. My previous work
> was focused on simple parsers, student psychology, thought process
> detection for team selection.

Interesting, congratulations.

>
> In the interim, I have been through a few docs on GCC and LTO [1][2][3] and
> am trying to write a toy ipa pass to better understand LTO/IPA
> infrastructure. 

Great, I believe that's exactly what my advice would be.

> I would be grateful for feedback on the textual LTO dump
> tool.

I hope that Martin