Re: GSoC proposal for extending static analyzer

2022-04-16 Thread David Malcolm via Gcc
On Fri, 2022-04-15 at 22:36 +0530, Mir Immad wrote:
> I've updated the link on the repo --  
> https://mirimmad.github.io/zeta-lang.
> 
> > You don't give many specifics in your personal description.  One thing
> > I'm not seeing is a sense of how proficient you are in various
> > programming languages.  In particular, how is your C and C++?  How
> > familiar are you with the debugger?  Looking at your github, you seem
> > to have relevant experience in compilers, which is great, but all
> > your code appears to be with "managed" languages such as Ruby, Java,
> > and Python  [and Zeta :)].
> 
> I'm pretty comfortable with both C and C++ and with manual memory
> management. Unfortunately, I don't have any project in C/C++ to show.
> Maybe I can use the time before May 20 to write something in C++?

Perhaps write some low-level code in C and/or C++ that uses file-
descriptors?  That would help you gain familiarity with both C/C++, and
with the problem domain.  (in terms of what does good code look like,
and also, what are the ways in which a programmer can make mistakes
when using a particular API).
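As a rough illustration of the kind of exercise suggested above, here is a minimal sketch of a file copy built on open(), read(), write() and close(). The helper names write_all and copy_file are made up for this example, and the comments mark the spots where the typical mistakes (unchecked open(), a leaked descriptor on an early return, use after close()) would creep in.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write all of buf[0..len) to fd, retrying on short writes and EINTR.
   Returns 0 on success, -1 on error.  (Hypothetical helper.)  */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(fd >= 0);            /* never pass on a failed open() result */
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        buf += n;               /* short write: advance and go again */
        len -= (size_t)n;
    }
    return 0;
}

/* Copy src to dst.  Returns 0 on success, -1 on error.
   Every path closes each descriptor exactly once.  (Hypothetical helper.)  */
static int copy_file(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)                 /* mistake 1: using the fd without this check */
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);              /* mistake 2: leaking 'in' on this early return */
        return -1;
    }

    char buf[4096];
    int rc = 0;
    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n == 0)
            break;              /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        if (write_all(out, buf, (size_t)n) < 0) {
            rc = -1;
            break;
        }
    }

    close(in);
    if (close(out) < 0)         /* close() can report a deferred write error */
        rc = -1;
    /* mistake 3: any read()/write()/close() on 'in' or 'out' from here on */
    return rc;
}
```

Deliberately breaking one of the marked spots is a quick way to see which mistakes a checker would have to recognise.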


> About the debugger, I'm okay-ish with it. In fact, it was due to the
> debugger (and your blog on it) that I was initially able to walk
> through the codebase.

(nods)

> 
> 
> > That said, I got the sense from your previous emails that you're not
> > very familiar with the APIs, and that you chose them because that was
> > the suggestion I had made on the wiki page.
> 
> That's right.
> 
> 
> > Obviously it's something you can learn on the way, but it would be
> > better to accurately identify which areas you're going to need to
> > learn along the way, and the timetable and scope should reflect that.
> 
> If I understand the statement correctly: currently I'm thinking of
> extending the support for open() (for creating the fd), write/read (for
> working on the fd) and close(). This is quite analogous to what we have
> in sm-file. Please let me know how you want the analyzer to be extended,
> and whether you expect support for any other FD APIs too, as I
> understand there are many other APIs for creating and working with FDs.
> 
> Thank you.

There are a lot of them; see e.g.:
  https://en.wikipedia.org/wiki/File_descriptor
Perhaps it's worth considering implementing some kind of attribute for
function-decls to specify what their behavior is with regards to file
descriptors, rather than hardcoding the expected behaviors
individually?  (not sure, perhaps one part of the project will be to
catalog the different expected preconditions and behaviors of API
entrypoints that work with file descriptors, in that I expect that they
will fall into patterns).
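To make the attribute idea above a little more concrete, here is a minimal sketch of how a function declaration could state what it does with a descriptor argument. The attribute names fd_arg_read and fd_arg_write and the macros FD_READS / FD_WRITES are assumptions chosen for illustration; nothing in this thread fixes their spelling. The guards make the macros expand to nothing on a compiler that does not know the attributes, so the code builds either way.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#ifndef __has_attribute
# define __has_attribute(x) 0   /* compilers without the feature test */
#endif

/* Assumed attribute names; the argument index n is 1-based.  */
#if __has_attribute(fd_arg_read)
# define FD_READS(n) __attribute__((fd_arg_read(n)))
#else
# define FD_READS(n)
#endif

#if __has_attribute(fd_arg_write)
# define FD_WRITES(n) __attribute__((fd_arg_write(n)))
#else
# define FD_WRITES(n)
#endif

/* The declarations carry the contract: argument 1 must be an open,
   valid descriptor that is readable (resp. writable).  */
static ssize_t read_some(int fd, char *buf, size_t len) FD_READS(1);
static ssize_t write_some(int fd, const char *buf, size_t len) FD_WRITES(1);

static ssize_t read_some(int fd, char *buf, size_t len)
{
    assert(buf != NULL);
    return read(fd, buf, len);
}

static ssize_t write_some(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    return write(fd, buf, len);
}
```

The point of the design is that a wrapper such as read_some would be checked like read() itself, without the checker needing a hardcoded entry for every such function.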

BTW, I'm about to go on a week-long trip, and will be away from the
computer during that time, so I probably won't be able to reply further
before the application deadline.

Hope this is helpful
Dave

> 
> 
> 
> On Fri, Apr 15, 2022 at 9:35 PM David Malcolm 
> wrote:
> 
> > On Fri, 2022-04-15 at 19:58 +0530, Mir Immad wrote:
> > > I've submitted a proposal for extending the static analyzer to
> > > support
> > > posix fd APIs on GSoC website. Here is the Google docs link (gdocs
> > > <
> > > 
> > https://docs.google.com/document/d/188zxPUsuYcF-uGVYL_G1s2RVtHhJSZeQ4sha40H7374/edit?usp=sharing
> > > > ).
> > > 
> > > 
> > > Please take a look and let me know what you think.
> > > 
> > > Thank you.
> > 
> > Thanks.
> > 
> > FWIW, I'm getting an error when trying the URL given in your github
> > repo: http://mirimmad.me/
> > but https://mirimmad.github.io/ seems to work  - but it's almost
> > empty.
> > 
> > You don't give many specifics in your personal description.  One thing
> > I'm not seeing is a sense of how proficient you are in various
> > programming languages.  In particular, how is your C and C++?  How
> > familiar are you with the debugger?  Looking at your github, you seem
> > to have relevant experience in compilers, which is great, but all
> > your code appears to be with "managed" languages such as Ruby, Java,
> > and Python  [and Zeta :)].
> > 
> > Also, the proposal is to extend the analyzer to cover a specific
> > domain: various POSIX APIs.  Can you please give a sense of your
> > level of expertise with these APIs?  I was pleased at your initiative
> > in trying to reuse the existing code to work with them.  That said, I
> > got the sense from your previous emails that you're not very familiar
> > with the APIs, and that you chose them because that was the suggestion
> > I had made on the wiki page.  Obviously it's something you can learn
> > on the way, but it would be better to accurately identify which areas
> > you're going to need to learn along the way, and the timetable and
> > scope should reflect that.
> > 
> > Hope this is constructive
> > Dave
> > 
> > 
> > 




Re: GSoC proposal for extending static analyzer

2022-04-15 Thread Mir Immad via Gcc
I've updated the link on the repo -- https://mirimmad.github.io/zeta-lang.

> You don't give many specifics in your personal description.  One thing
> I'm not seeing is a sense of how proficient you are in various
> programming languages.  In particular, how is your C and C++?  How
> familiar are you with the debugger?  Looking at your github, you seem
> to have relevant experience in compilers, which is great, but all your
> code appears to be with "managed" languages such as Ruby, Java, and
> Python  [and Zeta :)].

I'm pretty comfortable with both C and C++ and with manual memory
management. Unfortunately, I don't have any project in C/C++ to show.
Maybe I can use the time before May 20 to write something in C++?
About the debugger, I'm okay-ish with it. In fact, it was due to the
debugger (and your blog on it) that I was initially able to walk through
the codebase.


> That said, I got
> the sense from your previous emails that you're not very familiar with
> the APIs, and that you chose them because that was the suggestion I had
> made on the wiki page.

That's right.


>  Obviously it's something you can learn on the
>  way, but it would be better to accurately identify which areas you're
> going to need to learn along the way, and the timetable and scope should
> reflect that.

If I understand the statement correctly: currently I'm thinking of
extending the support for open() (for creating the fd), write/read (for
working on the fd) and close(). This is quite analogous to what we have in
sm-file. Please let me know how you want the analyzer to be extended, and
whether you expect support for any other FD APIs too, as I understand there
are many other APIs for creating and working with FDs.

Thank you.



On Fri, Apr 15, 2022 at 9:35 PM David Malcolm  wrote:

> On Fri, 2022-04-15 at 19:58 +0530, Mir Immad wrote:
> > I've submitted a proposal for extending the static analyzer to support
> > posix fd APIs on GSoC website. Here is the Google docs link (gdocs
> > <
> >
> https://docs.google.com/document/d/188zxPUsuYcF-uGVYL_G1s2RVtHhJSZeQ4sha40H7374/edit?usp=sharing
> > >).
> >
> >
> > Please take a look and let me know what you think.
> >
> > Thank you.
>
> Thanks.
>
> FWIW, I'm getting an error when trying the URL given in your github
> repo: http://mirimmad.me/
> but https://mirimmad.github.io/ seems to work  - but it's almost empty.
>
> You don't give many specifics in your personal description.  One thing
> I'm not seeing is a sense of how proficient you are in various
> programming languages.  In particular, how is your C and C++?  How
> familiar are you with the debugger?  Looking at your github, you seem
> to have relevant experience in compilers, which is great, but all your
> code appears to be with "managed" languages such as Ruby, Java, and
> Python  [and Zeta :)].
>
> Also, the proposal is to extend the analyzer to cover a specific
> domain: various POSIX APIs.  Can you please give a sense of your level
> of expertise with these APIs?  I was pleased at your initiative in
> trying to reuse the existing code to work with them.  That said, I got
> the sense from your previous emails that you're not very familiar with
> the APIs, and that you chose them because that was the suggestion I had
> made on the wiki page.  Obviously it's something you can learn on the
> way, but it would be better to accurately identify which areas you're
> going to need to learn along the way, and the timetable and scope should
> reflect that.
>
> Hope this is constructive
> Dave
>
>
>


Re: GSoC proposal for extending static analyzer

2022-04-15 Thread David Malcolm via Gcc
On Fri, 2022-04-15 at 19:58 +0530, Mir Immad wrote:
> I've submitted a proposal for extending the static analyzer to support
> posix fd APIs on GSoC website. Here is the Google docs link (gdocs
> <
> https://docs.google.com/document/d/188zxPUsuYcF-uGVYL_G1s2RVtHhJSZeQ4sha40H7374/edit?usp=sharing
> >).
> 
> 
> Please take a look and let me know what you think.
> 
> Thank you.

Thanks.

FWIW, I'm getting an error when trying the URL given in your github
repo: http://mirimmad.me/
but https://mirimmad.github.io/ seems to work  - but it's almost empty.

You don't give many specifics in your personal description.  One thing
I'm not seeing is a sense of how proficient you are in various
programming languages.  In particular, how is your C and C++?  How
familiar are you with the debugger?  Looking at your github, you seem
to have relevant experience in compilers, which is great, but all your
code appears to be with "managed" languages such as Ruby, Java, and
Python  [and Zeta :)].

Also, the proposal is to extend the analyzer to cover a specific
domain: various POSIX APIs.  Can you please give a sense of your level
of expertise with these APIs?  I was pleased at your initiative in
trying to reuse the existing code to work with them.  That said, I got
the sense from your previous emails that you're not very familiar with
the APIs, and that you chose them because that was the suggestion I had
made on the wiki page.  Obviously it's something you can learn on the
way, but it would be better to accurately identify which areas you're
going to need to learn along the way, and the timetable and scope should
reflect that.

Hope this is constructive
Dave




Re: GSOC Proposal

2019-04-08 Thread nick



On 2019-04-08 9:42 a.m., Richard Biener wrote:
> On Mon, 8 Apr 2019, nick wrote:
> 
>>
>>
>> On 2019-04-08 3:29 a.m., Richard Biener wrote:
>>> On Sun, 7 Apr 2019, nick wrote:
>>>


 On 2019-04-07 5:31 a.m., Richard Biener wrote:
> On April 5, 2019 6:11:15 PM GMT+02:00, nick  wrote:
>>
>>
>> On 2019-04-05 6:25 a.m., Richard Biener wrote:
>>> On Wed, 3 Apr 2019, nick wrote:
>>>


 On 2019-04-03 7:30 a.m., Richard Biener wrote:
> On Mon, 1 Apr 2019, nick wrote:
>
>>
>>
>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>>> On Mon, 1 Apr 2019, nick wrote:
>>>
 Well I'm talking about the shared roots of this garbage
>> collector core state 
 data structure or just struct ggc_root_tab.

 But also this seems that this to be no longer shared globally if
>> I'm not mistaken 
 or this:
 static vec extra_root_vec;

 Not sure after reading the code which is a bigger deal through
>> so I wrote
 my proposal not just asking which is a better issue for not
>> being thread
 safe. Sorry about that.

 As for the second question injection seems to not be the issue
>> or outside
 callers but just internal so phase 3 or step 3 would now be:
 Find internal callers or users of x where x is one of the above
>> rather
 than injecting outside callers. Which answers my second question
>> about
 external callers being a issue still.

 Let me know which  of the two is a better issue:
 1. struct ggc_root_tabs being shared
 2.static vec extra_root_vec; as a shared
>> heap or
 vector of root nodes for each type of allocation

 and I will gladly rewrite my proposal sections for that
 as needs to be reedited.
>>>
>>> I don't think working on the garbage collector as a separate
>>> GSoC project is useful at this point.  Doing locking around
>>> allocation seems like a good short-term solution and if that
>>> turns out to be a performance issue for the threaded part
>>> using per-thread freelists is likely an easy to deploy
>>> solution.
>>>
>>> Richard.
>>>
>> I agree but we were discussing this:
>> Or maybe a project to be more
>> explicit about regions of the code that assume that the garbage-
>> collector can't run within them?[3] (since the GC is state that
>> would
>> be shared by the threads).
>
> The process of collecting garbage is not the only issue (and that
> very issue is easiest mitigated by collecting only at specific
> points - which is what we do - and have those be serializing
>> points).
> The main issue is the underlying memory allocator (GCC uses memory
> that is garbage collected plus regular heap memory).
>
>> In addition I moved my paper back to our discussion about garbage
>> collector
>> state with outside callers.Seems we really need to do something
>> about
>> my wording as the idea of my project in a nutshell was to figure
>> out how to mark shared state by callers and inject it into the
>> garbage collector letting it known that the state was not shared
>> between
>> threads or shared. Seems that was on the GSoc page and in our
>> discussions the issue
>> is marking outside code for shared state. If that's correct then
>> my
>> wording of outside callers is incorrect it should have been shared
>> state between threads on outside callers to the garbage collector.
>> If the state is that in your wording above then great as I
>> understand
>> where we are going and will gladly change my wording.
>
> I'm still not sure what you are shooting at, the above sentences do
> not make any sense to me.
>
>> Also freelists don't work here as the state is shared at the
>> caller's 
>> end which would need two major issues:
>> 1. Locking on nodes of the 
>> freelists when two threads allocate at the same thing which can be
>> a 
>> problem if the shared state is shared a lot
>> 2. Locking allocation with 
>> large numbers of callers can starve threads
>
> First of all allocating memory from the GC pool is not the main
> work of GIMPLE passes so simply serializing at allocation time
>> might
> work out.  Second free lists of course do work.  What you'd do is
> have a fast path in allocation using a thread-local "free list"
> which you can allocate from without 

Re: GSOC Proposal

2019-04-08 Thread Richard Biener
On Mon, 8 Apr 2019, nick wrote:

> 
> 
> On 2019-04-08 3:29 a.m., Richard Biener wrote:
> > On Sun, 7 Apr 2019, nick wrote:
> > 
> >>
> >>
> >> On 2019-04-07 5:31 a.m., Richard Biener wrote:
> >>> On April 5, 2019 6:11:15 PM GMT+02:00, nick  wrote:
> 
> 
>  On 2019-04-05 6:25 a.m., Richard Biener wrote:
> > On Wed, 3 Apr 2019, nick wrote:
> >
> >>
> >>
> >> On 2019-04-03 7:30 a.m., Richard Biener wrote:
> >>> On Mon, 1 Apr 2019, nick wrote:
> >>>
> 
> 
>  On 2019-04-01 9:47 a.m., Richard Biener wrote:
> > On Mon, 1 Apr 2019, nick wrote:
> >
> >> Well I'm talking about the shared roots of this garbage
>  collector core state 
> >> data structure or just struct ggc_root_tab.
> >>
> >> But also this seems that this to be no longer shared globally if
>  I'm not mistaken 
> >> or this:
> >> static vec extra_root_vec;
> >>
> >> Not sure after reading the code which is a bigger deal through
>  so I wrote
> >> my proposal not just asking which is a better issue for not
>  being thread
> >> safe. Sorry about that.
> >>
> >> As for the second question injection seems to not be the issue
>  or outside
> >> callers but just internal so phase 3 or step 3 would now be:
> >> Find internal callers or users of x where x is one of the above
>  rather
> >> than injecting outside callers. Which answers my second question
>  about
> >> external callers being a issue still.
> >>
> >> Let me know which  of the two is a better issue:
> >> 1. struct ggc_root_tabs being shared
> >> 2.static vec extra_root_vec; as a shared
>  heap or
> >> vector of root nodes for each type of allocation
> >>
> >> and I will gladly rewrite my proposal sections for that
> >> as needs to be reedited.
> >
> > I don't think working on the garbage collector as a separate
> > GSoC project is useful at this point.  Doing locking around
> > allocation seems like a good short-term solution and if that
> > turns out to be a performance issue for the threaded part
> > using per-thread freelists is likely an easy to deploy
> > solution.
> >
> > Richard.
> >
>  I agree but we were discussing this:
>  Or maybe a project to be more
>  explicit about regions of the code that assume that the garbage-
>  collector can't run within them?[3] (since the GC is state that
>  would
>  be shared by the threads).
> >>>
> >>> The process of collecting garbage is not the only issue (and that
> >>> very issue is easiest mitigated by collecting only at specific
> >>> points - which is what we do - and have those be serializing
>  points).
> >>> The main issue is the underlying memory allocator (GCC uses memory
> >>> that is garbage collected plus regular heap memory).
> >>>
>  In addition I moved my paper back to our discussion about garbage
>  collector
>  state with outside callers.Seems we really need to do something
>  about
>  my wording as the idea of my project in a nutshell was to figure
>  out how to mark shared state by callers and inject it into the
>  garbage collector letting it known that the state was not shared
>  between
>  threads or shared. Seems that was on the GSoc page and in our
>  discussions the issue
>  is marking outside code for shared state. If that's correct then
>  my
>  wording of outside callers is incorrect it should have been shared
>  state between threads on outside callers to the garbage collector.
>  If the state is that in your wording above then great as I
>  understand
>  where we are going and will gladly change my wording.
> >>>
> >>> I'm still not sure what you are shooting at, the above sentences do
> >>> not make any sense to me.
> >>>
>  Also freelists don't work here as the state is shared at the
>  caller's 
>  end which would need two major issues:
>  1. Locking on nodes of the 
>  freelists when two threads allocate at the same thing which can be
>  a 
>  problem if the shared state is shared a lot
>  2. Locking allocation with 
>  large numbers of callers can starve threads
> >>>
> >>> First of all allocating memory from the GC pool is not the main
> >>> work of GIMPLE passes so simply serializing at allocation time
>  might
> >>> work out.  Second free lists of course do work.  What you'd do is
> >>> have a fast path in allocation using a thread-local "free list"
> >>> which you can allocate from without taking any lock.  Maybe I
>  should
> >>> 

Re: GSOC Proposal

2019-04-08 Thread nick



On 2019-04-08 3:29 a.m., Richard Biener wrote:
> On Sun, 7 Apr 2019, nick wrote:
> 
>>
>>
>> On 2019-04-07 5:31 a.m., Richard Biener wrote:
>>> On April 5, 2019 6:11:15 PM GMT+02:00, nick  wrote:


 On 2019-04-05 6:25 a.m., Richard Biener wrote:
> On Wed, 3 Apr 2019, nick wrote:
>
>>
>>
>> On 2019-04-03 7:30 a.m., Richard Biener wrote:
>>> On Mon, 1 Apr 2019, nick wrote:
>>>


 On 2019-04-01 9:47 a.m., Richard Biener wrote:
> On Mon, 1 Apr 2019, nick wrote:
>
>> Well I'm talking about the shared roots of this garbage
 collector core state 
>> data structure or just struct ggc_root_tab.
>>
>> But also this seems that this to be no longer shared globally if
 I'm not mistaken 
>> or this:
>> static vec extra_root_vec;
>>
>> Not sure after reading the code which is a bigger deal through
 so I wrote
>> my proposal not just asking which is a better issue for not
 being thread
>> safe. Sorry about that.
>>
>> As for the second question injection seems to not be the issue
 or outside
>> callers but just internal so phase 3 or step 3 would now be:
>> Find internal callers or users of x where x is one of the above
 rather
>> than injecting outside callers. Which answers my second question
 about
>> external callers being a issue still.
>>
>> Let me know which  of the two is a better issue:
>> 1. struct ggc_root_tabs being shared
>> 2.static vec extra_root_vec; as a shared
 heap or
>> vector of root nodes for each type of allocation
>>
>> and I will gladly rewrite my proposal sections for that
>> as needs to be reedited.
>
> I don't think working on the garbage collector as a separate
> GSoC project is useful at this point.  Doing locking around
> allocation seems like a good short-term solution and if that
> turns out to be a performance issue for the threaded part
> using per-thread freelists is likely an easy to deploy
> solution.
>
> Richard.
>
 I agree but we were discussing this:
 Or maybe a project to be more
 explicit about regions of the code that assume that the garbage-
 collector can't run within them?[3] (since the GC is state that
 would
 be shared by the threads).
>>>
>>> The process of collecting garbage is not the only issue (and that
>>> very issue is easiest mitigated by collecting only at specific
>>> points - which is what we do - and have those be serializing
 points).
>>> The main issue is the underlying memory allocator (GCC uses memory
>>> that is garbage collected plus regular heap memory).
>>>
 In addition I moved my paper back to our discussion about garbage
 collector
 state with outside callers.Seems we really need to do something
 about
 my wording as the idea of my project in a nutshell was to figure
 out how to mark shared state by callers and inject it into the
 garbage collector letting it known that the state was not shared
 between
 threads or shared. Seems that was on the GSoc page and in our
 discussions the issue
 is marking outside code for shared state. If that's correct then
 my
 wording of outside callers is incorrect it should have been shared
 state between threads on outside callers to the garbage collector.
 If the state is that in your wording above then great as I
 understand
 where we are going and will gladly change my wording.
>>>
>>> I'm still not sure what you are shooting at, the above sentences do
>>> not make any sense to me.
>>>
 Also freelists don't work here as the state is shared at the
 caller's 
 end which would need two major issues:
 1. Locking on nodes of the 
 freelists when two threads allocate at the same thing which can be
 a 
 problem if the shared state is shared a lot
 2. Locking allocation with 
 large numbers of callers can starve threads
>>>
>>> First of all allocating memory from the GC pool is not the main
>>> work of GIMPLE passes so simply serializing at allocation time
 might
>>> work out.  Second free lists of course do work.  What you'd do is
>>> have a fast path in allocation using a thread-local "free list"
>>> which you can allocate from without taking any lock.  Maybe I
 should
>>> explain "free list" since that term doesn't make too much sense in
>>> a garbage collector world.  What I'd do is when a client thread
>>> asks for memory of size N allocate M objects of that size but put
>>> M - 1 on the client thread local "free list" to be 

Re: GSOC Proposal

2019-04-08 Thread Richard Biener
On Sun, 7 Apr 2019, nick wrote:

> 
> 
> On 2019-04-07 5:31 a.m., Richard Biener wrote:
> > On April 5, 2019 6:11:15 PM GMT+02:00, nick  wrote:
> >>
> >>
> >> On 2019-04-05 6:25 a.m., Richard Biener wrote:
> >>> On Wed, 3 Apr 2019, nick wrote:
> >>>
> 
> 
>  On 2019-04-03 7:30 a.m., Richard Biener wrote:
> > On Mon, 1 Apr 2019, nick wrote:
> >
> >>
> >>
> >> On 2019-04-01 9:47 a.m., Richard Biener wrote:
> >>> On Mon, 1 Apr 2019, nick wrote:
> >>>
>  Well I'm talking about the shared roots of this garbage
> >> collector core state 
>  data structure or just struct ggc_root_tab.
> 
>  But also this seems that this to be no longer shared globally if
> >> I'm not mistaken 
>  or this:
>  static vec extra_root_vec;
> 
>  Not sure after reading the code which is a bigger deal through
> >> so I wrote
>  my proposal not just asking which is a better issue for not
> >> being thread
>  safe. Sorry about that.
> 
>  As for the second question injection seems to not be the issue
> >> or outside
>  callers but just internal so phase 3 or step 3 would now be:
>  Find internal callers or users of x where x is one of the above
> >> rather
>  than injecting outside callers. Which answers my second question
> >> about
>  external callers being a issue still.
> 
>  Let me know which  of the two is a better issue:
>  1. struct ggc_root_tabs being shared
>  2.static vec extra_root_vec; as a shared
> >> heap or
>  vector of root nodes for each type of allocation
> 
>  and I will gladly rewrite my proposal sections for that
>  as needs to be reedited.
> >>>
> >>> I don't think working on the garbage collector as a separate
> >>> GSoC project is useful at this point.  Doing locking around
> >>> allocation seems like a good short-term solution and if that
> >>> turns out to be a performance issue for the threaded part
> >>> using per-thread freelists is likely an easy to deploy
> >>> solution.
> >>>
> >>> Richard.
> >>>
> >> I agree but we were discussing this:
> >> Or maybe a project to be more
> >> explicit about regions of the code that assume that the garbage-
> >> collector can't run within them?[3] (since the GC is state that
> >> would
> >> be shared by the threads).
> >
> > The process of collecting garbage is not the only issue (and that
> > very issue is easiest mitigated by collecting only at specific
> > points - which is what we do - and have those be serializing
> >> points).
> > The main issue is the underlying memory allocator (GCC uses memory
> > that is garbage collected plus regular heap memory).
> >
> >> In addition I moved my paper back to our discussion about garbage
> >> collector
> >> state with outside callers.Seems we really need to do something
> >> about
> >> my wording as the idea of my project in a nutshell was to figure
> >> out how to mark shared state by callers and inject it into the
> >> garbage collector letting it known that the state was not shared
> >> between
> >> threads or shared. Seems that was on the GSoc page and in our
> >> discussions the issue
> >> is marking outside code for shared state. If that's correct then
> >> my
> >> wording of outside callers is incorrect it should have been shared
> >> state between threads on outside callers to the garbage collector.
> >> If the state is that in your wording above then great as I
> >> understand
> >> where we are going and will gladly change my wording.
> >
> > I'm still not sure what you are shooting at, the above sentences do
> > not make any sense to me.
> >
> >> Also freelists don't work here as the state is shared at the
> >> caller's 
> >> end which would need two major issues:
> >> 1. Locking on nodes of the 
> >> freelists when two threads allocate at the same thing which can be
> >> a 
> >> problem if the shared state is shared a lot
> >> 2. Locking allocation with 
> >> large numbers of callers can starve threads
> >
> > First of all allocating memory from the GC pool is not the main
> > work of GIMPLE passes so simply serializing at allocation time
> >> might
> > work out.  Second free lists of course do work.  What you'd do is
> > have a fast path in allocation using a thread-local "free list"
> > which you can allocate from without taking any lock.  Maybe I
> >> should
> > explain "free list" since that term doesn't make too much sense in
> > a garbage collector world.  What I'd do is when a client thread
> > asks for memory of size N allocate M objects of that size but put
> > M - 1 on the client thread local "free list" to be allocated
> >> lock-free
> > from for the next M - 1 

Re: GSOC Proposal

2019-04-07 Thread nick



On 2019-04-07 5:31 a.m., Richard Biener wrote:
> On April 5, 2019 6:11:15 PM GMT+02:00, nick  wrote:
>>
>>
>> On 2019-04-05 6:25 a.m., Richard Biener wrote:
>>> On Wed, 3 Apr 2019, nick wrote:
>>>


 On 2019-04-03 7:30 a.m., Richard Biener wrote:
> On Mon, 1 Apr 2019, nick wrote:
>
>>
>>
>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>>> On Mon, 1 Apr 2019, nick wrote:
>>>
 Well I'm talking about the shared roots of this garbage
>> collector core state 
 data structure or just struct ggc_root_tab.

 But also this seems that this to be no longer shared globally if
>> I'm not mistaken 
 or this:
 static vec extra_root_vec;

 Not sure after reading the code which is a bigger deal through
>> so I wrote
 my proposal not just asking which is a better issue for not
>> being thread
 safe. Sorry about that.

 As for the second question injection seems to not be the issue
>> or outside
 callers but just internal so phase 3 or step 3 would now be:
 Find internal callers or users of x where x is one of the above
>> rather
 than injecting outside callers. Which answers my second question
>> about
 external callers being a issue still.

 Let me know which  of the two is a better issue:
 1. struct ggc_root_tabs being shared
 2.static vec extra_root_vec; as a shared
>> heap or
 vector of root nodes for each type of allocation

 and I will gladly rewrite my proposal sections for that
 as needs to be reedited.
>>>
>>> I don't think working on the garbage collector as a separate
>>> GSoC project is useful at this point.  Doing locking around
>>> allocation seems like a good short-term solution and if that
>>> turns out to be a performance issue for the threaded part
>>> using per-thread freelists is likely an easy to deploy
>>> solution.
>>>
>>> Richard.
>>>
>> I agree but we were discussing this:
>> Or maybe a project to be more
>> explicit about regions of the code that assume that the garbage-
>> collector can't run within them?[3] (since the GC is state that
>> would
>> be shared by the threads).
>
> The process of collecting garbage is not the only issue (and that
> very issue is easiest mitigated by collecting only at specific
> points - which is what we do - and have those be serializing
>> points).
> The main issue is the underlying memory allocator (GCC uses memory
> that is garbage collected plus regular heap memory).
>
>> In addition I moved my paper back to our discussion about garbage
>> collector
>> state with outside callers.Seems we really need to do something
>> about
>> my wording as the idea of my project in a nutshell was to figure
>> out how to mark shared state by callers and inject it into the
>> garbage collector letting it known that the state was not shared
>> between
>> threads or shared. Seems that was on the GSoc page and in our
>> discussions the issue
>> is marking outside code for shared state. If that's correct then
>> my
>> wording of outside callers is incorrect it should have been shared
>> state between threads on outside callers to the garbage collector.
>> If the state is that in your wording above then great as I
>> understand
>> where we are going and will gladly change my wording.
>
> I'm still not sure what you are shooting at, the above sentences do
> not make any sense to me.
>
>> Also freelists don't work here as the state is shared at the
>> caller's 
>> end which would need two major issues:
>> 1. Locking on nodes of the 
>> freelists when two threads allocate at the same thing which can be
>> a 
>> problem if the shared state is shared a lot
>> 2. Locking allocation with 
>> large numbers of callers can starve threads
>
> First of all allocating memory from the GC pool is not the main
> work of GIMPLE passes so simply serializing at allocation time might
> work out.  Second free lists of course do work.  What you'd do is
> have a fast path in allocation using a thread-local "free list"
> which you can allocate from without taking any lock.  Maybe I should
> explain "free list" since that term doesn't make too much sense in
> a garbage collector world.  What I'd do is when a client thread
> asks for memory of size N allocate M objects of that size but put
> M - 1 on the client thread local "free list" to be allocated lock-free
> from for the next M - 1 calls.  Note that garbage collected memory
> objects are only handed out in fixed chunks (powers of two plus
> a few special sizes) so you'd have one "free list" per chunk size
> per thread.
>
> The collection itself (mark & sweep) would be fully 

Re: GSOC Proposal

2019-04-07 Thread Richard Biener
On April 5, 2019 6:11:15 PM GMT+02:00, nick  wrote:
>
>
>On 2019-04-05 6:25 a.m., Richard Biener wrote:
>> On Wed, 3 Apr 2019, nick wrote:
>> 
>>>
>>>
>>> On 2019-04-03 7:30 a.m., Richard Biener wrote:
 On Mon, 1 Apr 2019, nick wrote:

>
>
> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>> On Mon, 1 Apr 2019, nick wrote:
>>
>>> Well I'm talking about the shared roots of this garbage
>collector core state 
>>> data structure or just struct ggc_root_tab.
>>>
>>> But also this seems that this to be no longer shared globally if
>I'm not mistaken 
>>> or this:
>>> static vec extra_root_vec;
>>>
>>> Not sure after reading the code which is a bigger deal through
>so I wrote
>>> my proposal not just asking which is a better issue for not
>being thread
>>> safe. Sorry about that.
>>>
>>> As for the second question injection seems to not be the issue
>or outside
>>> callers but just internal so phase 3 or step 3 would now be:
>>> Find internal callers or users of x where x is one of the above
>rather
>>> than injecting outside callers. Which answers my second question
>about
>>> external callers being a issue still.
>>>
>>> Let me know which  of the two is a better issue:
>>> 1. struct ggc_root_tabs being shared
>>> 2.static vec extra_root_vec; as a shared
>heap or
>>> vector of root nodes for each type of allocation
>>>
>>> and I will gladly rewrite my proposal sections for that
>>> as needs to be reedited.
>>
>> I don't think working on the garbage collector as a separate
>> GSoC project is useful at this point.  Doing locking around
>> allocation seems like a good short-term solution and if that
>> turns out to be a performance issue for the threaded part
>> using per-thread freelists is likely an easy to deploy
>> solution.
>>
>> Richard.
>>
> I agree but we were discussing this:
> Or maybe a project to be more
> explicit about regions of the code that assume that the garbage-
> collector can't run within them?[3] (since the GC is state that
>would
> be shared by the threads).

>>>> The process of collecting garbage is not the only issue (and that
>>>> very issue is easiest mitigated by collecting only at specific
>>>> points - which is what we do - and have those be serializing points).
>>>> The main issue is the underlying memory allocator (GCC uses memory
>>>> that is garbage collected plus regular heap memory).

> In addition I moved my paper back to our discussion about garbage
>collector
> state with outside callers.Seems we really need to do something
>about
> my wording as the idea of my project in a nutshell was to figure
> out how to mark shared state by callers and inject it into the
> garbage collector letting it known that the state was not shared
>between
> threads or shared. Seems that was on the GSoc page and in our
>discussions the issue
> is marking outside code for shared state. If that's correct then
>my
> wording of outside callers is incorrect it should have been shared
> state between threads on outside callers to the garbage collector.
> If the state is that in your wording above then great as I
>understand
> where we are going and will gladly change my wording.

>>>> I'm still not sure what you are shooting at, the above sentences do
>>>> not make any sense to me.

> Also freelists don't work here as the state is shared at the
>caller's 
> end which would need two major issues:
> 1. Locking on nodes of the 
> freelists when two threads allocate at the same thing which can be
>a 
> problem if the shared state is shared a lot
> 2. Locking allocation with 
> large numbers of callers can starve threads

>>>> First of all allocating memory from the GC pool is not the main
>>>> work of GIMPLE passes so simply serializing at allocation time might
>>>> work out.  Second free lists of course do work.  What you'd do is
>>>> have a fast path in allocation using a thread-local "free list"
>>>> which you can allocate from without taking any lock.  Maybe I should
>>>> explain "free list" since that term doesn't make too much sense in
>>>> a garbage collector world.  What I'd do is when a client thread
>>>> asks for memory of size N allocate M objects of that size but put
>>>> M - 1 on the client thread local "free list" to be allocated lock-free
>>>> from for the next M - 1 calls.  Note that garbage collected memory
>>>> objects are only handed out in fixed chunks (powers of two plus
>>>> a few special sizes) so you'd have one "free list" per chunk size
>>>> per thread.
>>>>
>>>> The collection itself (mark & sweep) would be fully serialized still
>>>> (and not return to any threads local "free list").
>>>>
>>>> ggc_free'd objects _might_ go to the threads "free list"s (yeah, we
>>>> _do_ have ggc_free ...).

 As 

Re: GSOC Proposal

2019-04-05 Thread nick



On 2019-04-05 6:25 a.m., Richard Biener wrote:
> On Wed, 3 Apr 2019, nick wrote:
> 
>>
>>
>> On 2019-04-03 7:30 a.m., Richard Biener wrote:
>>> On Mon, 1 Apr 2019, nick wrote:
>>>


>>>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>>>>> On Mon, 1 Apr 2019, nick wrote:
>>>>>
>>>>>> Well I'm talking about the shared roots of this garbage collector core
>>>>>> state
>>>>>> data structure or just struct ggc_root_tab.
>>>>>>
>>>>>> But also this seems that this to be no longer shared globally if I'm not
>>>>>> mistaken
>>>>>> or this:
>>>>>> static vec extra_root_vec;
>>>>>>
>>>>>> Not sure after reading the code which is a bigger deal through so I wrote
>>>>>> my proposal not just asking which is a better issue for not being thread
>>>>>> safe. Sorry about that.
>>>>>>
>>>>>> As for the second question injection seems to not be the issue or outside
>>>>>> callers but just internal so phase 3 or step 3 would now be:
>>>>>> Find internal callers or users of x where x is one of the above rather
>>>>>> than injecting outside callers. Which answers my second question about
>>>>>> external callers being a issue still.
>>>>>>
>>>>>> Let me know which  of the two is a better issue:
>>>>>> 1. struct ggc_root_tabs being shared
>>>>>> 2.static vec extra_root_vec; as a shared heap or
>>>>>> vector of root nodes for each type of allocation
>>>>>>
>>>>>> and I will gladly rewrite my proposal sections for that
>>>>>> as needs to be reedited.
>>>>>
>>>>> I don't think working on the garbage collector as a separate
>>>>> GSoC project is useful at this point.  Doing locking around
>>>>> allocation seems like a good short-term solution and if that
>>>>> turns out to be a performance issue for the threaded part
>>>>> using per-thread freelists is likely an easy to deploy
>>>>> solution.
>>>>>
>>>>> Richard.
>>>>>
>>>> I agree but we were discussing this:
>>>> Or maybe a project to be more
>>>> explicit about regions of the code that assume that the garbage-
>>>> collector can't run within them?[3] (since the GC is state that would
>>>> be shared by the threads).
>>>
>>> The process of collecting garbage is not the only issue (and that
>>> very issue is easiest mitigated by collecting only at specific
>>> points - which is what we do - and have those be serializing points).
>>> The main issue is the underlying memory allocator (GCC uses memory
>>> that is garbage collected plus regular heap memory).
>>>
>>>> In addition I moved my paper back to our discussion about garbage collector
>>>> state with outside callers.Seems we really need to do something about
>>>> my wording as the idea of my project in a nutshell was to figure
>>>> out how to mark shared state by callers and inject it into the
>>>> garbage collector letting it known that the state was not shared between
>>>> threads or shared. Seems that was on the GSoc page and in our discussions
>>>> the issue
>>>> is marking outside code for shared state. If that's correct then my
>>>> wording of outside callers is incorrect it should have been shared
>>>> state between threads on outside callers to the garbage collector.
>>>> If the state is that in your wording above then great as I understand
>>>> where we are going and will gladly change my wording.
>>>
>>> I'm still not sure what you are shooting at, the above sentences do
>>> not make any sense to me.
>>>
>>>> Also freelists don't work here as the state is shared at the caller's
>>>> end which would need two major issues:
>>>> 1. Locking on nodes of the
>>>> freelists when two threads allocate at the same thing which can be a
>>>> problem if the shared state is shared a lot
>>>> 2. Locking allocation with
>>>> large numbers of callers can starve threads
>>>
>>> First of all allocating memory from the GC pool is not the main
>>> work of GIMPLE passes so simply serializing at allocation time might
>>> work out.  Second free lists of course do work.  What you'd do is
>>> have a fast path in allocation using a thread-local "free list"
>>> which you can allocate from without taking any lock.  Maybe I should
>>> explain "free list" since that term doesn't make too much sense in
>>> a garbage collector world.  What I'd do is when a client thread
>>> asks for memory of size N allocate M objects of that size but put
>>> M - 1 on the client thread local "free list" to be allocated lock-free
>>> from for the next M - 1 calls.  Note that garbage collected memory
>>> objects are only handed out in fixed chunks (powers of two plus
>>> a few special sizes) so you'd have one "free list" per chunk size
>>> per thread.
>>>
>>> The collection itself (mark & sweep) would be fully serialized still
>>> (and not return to any threads local "free list").
>>>
>>> ggc_free'd objects _might_ go to the threads "free list"s (yeah, we
>>> _do_ have ggc_free ...).
>>>
>>> As said, I don't see GC or the memory allocator as sth interesting
>>> to work on for parallelization until the basic setup works and it
>>> proves to be a 

Re: GSOC Proposal

2019-04-05 Thread Richard Biener
On Wed, 3 Apr 2019, nick wrote:

> 
> 
> On 2019-04-03 7:30 a.m., Richard Biener wrote:
> > On Mon, 1 Apr 2019, nick wrote:
> > 
> >>
> >>
> >> On 2019-04-01 9:47 a.m., Richard Biener wrote:
> >>> On Mon, 1 Apr 2019, nick wrote:
> >>>
> >>>> Well I'm talking about the shared roots of this garbage collector core
> >>>> state
> >>>> data structure or just struct ggc_root_tab.
> >>>>
> >>>> But also this seems that this to be no longer shared globally if I'm not
> >>>> mistaken
> >>>> or this:
> >>>> static vec extra_root_vec;
> >>>>
> >>>> Not sure after reading the code which is a bigger deal through so I wrote
> >>>> my proposal not just asking which is a better issue for not being thread
> >>>> safe. Sorry about that.
> >>>>
> >>>> As for the second question injection seems to not be the issue or outside
> >>>> callers but just internal so phase 3 or step 3 would now be:
> >>>> Find internal callers or users of x where x is one of the above rather
> >>>> than injecting outside callers. Which answers my second question about
> >>>> external callers being a issue still.
> >>>>
> >>>> Let me know which  of the two is a better issue:
> >>>> 1. struct ggc_root_tabs being shared
> >>>> 2.static vec extra_root_vec; as a shared heap or
> >>>> vector of root nodes for each type of allocation
> >>>>
> >>>> and I will gladly rewrite my proposal sections for that
> >>>> as needs to be reedited.
> >>>
> >>> I don't think working on the garbage collector as a separate
> >>> GSoC project is useful at this point.  Doing locking around
> >>> allocation seems like a good short-term solution and if that
> >>> turns out to be a performance issue for the threaded part
> >>> using per-thread freelists is likely an easy to deploy
> >>> solution.
> >>>
> >>> Richard.
> >>>
> >> I agree but we were discussing this:
> >> Or maybe a project to be more
> >> explicit about regions of the code that assume that the garbage-
> >> collector can't run within them?[3] (since the GC is state that would
> >> be shared by the threads).
> > 
> > The process of collecting garbage is not the only issue (and that
> > very issue is easiest mitigated by collecting only at specific
> > points - which is what we do - and have those be serializing points).
> > The main issue is the underlying memory allocator (GCC uses memory
> > that is garbage collected plus regular heap memory).
> > 
> >> In addition I moved my paper back to our discussion about garbage collector
> >> state with outside callers.Seems we really need to do something about
> >> my wording as the idea of my project in a nutshell was to figure
> >> out how to mark shared state by callers and inject it into the
> >> garbage collector letting it known that the state was not shared between
> >> threads or shared. Seems that was on the GSoc page and in our discussions 
> >> the issue
> >> is marking outside code for shared state. If that's correct then my
> >> wording of outside callers is incorrect it should have been shared
> >> state between threads on outside callers to the garbage collector.
> >> If the state is that in your wording above then great as I understand
> >> where we are going and will gladly change my wording.
> > 
> > I'm still not sure what you are shooting at, the above sentences do
> > not make any sense to me.
> > 
> >> Also freelists don't work here as the state is shared at the caller's 
> >> end which would need two major issues:
> >> 1. Locking on nodes of the 
> >> freelists when two threads allocate at the same thing which can be a 
> >> problem if the shared state is shared a lot
> >> 2. Locking allocation with 
> >> large numbers of callers can starve threads
> > 
> > First of all allocating memory from the GC pool is not the main
> > work of GIMPLE passes so simply serializing at allocation time might
> > work out.  Second free lists of course do work.  What you'd do is
> > have a fast path in allocation using a thread-local "free list"
> > which you can allocate from without taking any lock.  Maybe I should
> > explain "free list" since that term doesn't make too much sense in
> > a garbage collector world.  What I'd do is when a client thread
> > asks for memory of size N allocate M objects of that size but put
> > M - 1 on the client thread local "free list" to be allocated lock-free
> > from for the next M - 1 calls.  Note that garbage collected memory
> > objects are only handed out in fixed chunks (powers of two plus
> > a few special sizes) so you'd have one "free list" per chunk size
> > per thread.
> > 
> > The collection itself (mark & sweep) would be fully serialized still
> > (and not return to any threads local "free list").
> > 
> > ggc_free'd objects _might_ go to the threads "free list"s (yeah, we
> > _do_ have ggc_free ...).
> > 
> > As said, I don't see GC or the memory allocator as sth interesting
> > to work on for parallelization until the basic setup works and it
> > proves to be a bottleneck.
> > 
> >> Seems that working on the 

Re: GSOC Proposal

2019-04-03 Thread nick



On 2019-04-03 7:30 a.m., Richard Biener wrote:
> On Mon, 1 Apr 2019, nick wrote:
> 
>>
>>
>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>>> On Mon, 1 Apr 2019, nick wrote:
>>>
>>>> Well I'm talking about the shared roots of this garbage collector core
>>>> state
>>>> data structure or just struct ggc_root_tab.
>>>>
>>>> But also this seems that this to be no longer shared globally if I'm not
>>>> mistaken
>>>> or this:
>>>> static vec extra_root_vec;
>>>>
>>>> Not sure after reading the code which is a bigger deal through so I wrote
>>>> my proposal not just asking which is a better issue for not being thread
>>>> safe. Sorry about that.
>>>>
>>>> As for the second question injection seems to not be the issue or outside
>>>> callers but just internal so phase 3 or step 3 would now be:
>>>> Find internal callers or users of x where x is one of the above rather
>>>> than injecting outside callers. Which answers my second question about
>>>> external callers being a issue still.
>>>>
>>>> Let me know which  of the two is a better issue:
>>>> 1. struct ggc_root_tabs being shared
>>>> 2.static vec extra_root_vec; as a shared heap or
>>>> vector of root nodes for each type of allocation
>>>>
>>>> and I will gladly rewrite my proposal sections for that
>>>> as needs to be reedited.
>>>
>>> I don't think working on the garbage collector as a separate
>>> GSoC project is useful at this point.  Doing locking around
>>> allocation seems like a good short-term solution and if that
>>> turns out to be a performance issue for the threaded part
>>> using per-thread freelists is likely an easy to deploy
>>> solution.
>>>
>>> Richard.
>>>
>> I agree but we were discussing this:
>> Or maybe a project to be more
>> explicit about regions of the code that assume that the garbage-
>> collector can't run within them?[3] (since the GC is state that would
>> be shared by the threads).
> 
> The process of collecting garbage is not the only issue (and that
> very issue is easiest mitigated by collecting only at specific
> points - which is what we do - and have those be serializing points).
> The main issue is the underlying memory allocator (GCC uses memory
> that is garbage collected plus regular heap memory).
> 
>> In addition I moved my paper back to our discussion about garbage collector
>> state with outside callers.Seems we really need to do something about
>> my wording as the idea of my project in a nutshell was to figure
>> out how to mark shared state by callers and inject it into the
>> garbage collector letting it known that the state was not shared between
>> threads or shared. Seems that was on the GSoc page and in our discussions 
>> the issue
>> is marking outside code for shared state. If that's correct then my
>> wording of outside callers is incorrect it should have been shared
>> state between threads on outside callers to the garbage collector.
>> If the state is that in your wording above then great as I understand
>> where we are going and will gladly change my wording.
> 
> I'm still not sure what you are shooting at, the above sentences do
> not make any sense to me.
> 
>> Also freelists don't work here as the state is shared at the caller's 
>> end which would need two major issues:
>> 1. Locking on nodes of the 
>> freelists when two threads allocate at the same thing which can be a 
>> problem if the shared state is shared a lot
>> 2. Locking allocation with 
>> large numbers of callers can starve threads
> 
> First of all allocating memory from the GC pool is not the main
> work of GIMPLE passes so simply serializing at allocation time might
> work out.  Second free lists of course do work.  What you'd do is
> have a fast path in allocation using a thread-local "free list"
> which you can allocate from without taking any lock.  Maybe I should
> explain "free list" since that term doesn't make too much sense in
> a garbage collector world.  What I'd do is when a client thread
> asks for memory of size N allocate M objects of that size but put
> M - 1 on the client thread local "free list" to be allocated lock-free
> from for the next M - 1 calls.  Note that garbage collected memory
> objects are only handed out in fixed chunks (powers of two plus
> a few special sizes) so you'd have one "free list" per chunk size
> per thread.
> 
> The collection itself (mark & sweep) would be fully serialized still
> (and not return to any threads local "free list").
> 
> ggc_free'd objects _might_ go to the threads "free list"s (yeah, we
> _do_ have ggc_free ...).
> 
> As said, I don't see GC or the memory allocator as sth interesting
> to work on for parallelization until the basic setup works and it
> proves to be a bottleneck.
> 
>> Seems that working on the garbage collector itself isn't the issue but 
>> the callers as I just figured out as related to your state idea. Let me 
>> know if that's correct and if the wording change I mentioned is fine 
>> with you as that's the state it seems that needs to 

Re: GSOC Proposal

2019-04-03 Thread Richard Biener
On Mon, 1 Apr 2019, nick wrote:

> 
> 
> On 2019-04-01 9:47 a.m., Richard Biener wrote:
> > On Mon, 1 Apr 2019, nick wrote:
> > 
> >> Well I'm talking about the shared roots of this garbage collector core 
> >> state 
> >> data structure or just struct ggc_root_tab.
> >>
> >> But also this seems that this to be no longer shared globally if I'm not 
> >> mistaken 
> >> or this:
> >> static vec extra_root_vec;
> >>
> >> Not sure after reading the code which is a bigger deal through so I wrote
> >> my proposal not just asking which is a better issue for not being thread
> >> safe. Sorry about that.
> >>
> >> As for the second question injection seems to not be the issue or outside
> >> callers but just internal so phase 3 or step 3 would now be:
> >> Find internal callers or users of x where x is one of the above rather
> >> than injecting outside callers. Which answers my second question about
> >> external callers being a issue still.
> >>
> >> Let me know which  of the two is a better issue:
> >> 1. struct ggc_root_tabs being shared
> >> 2.static vec extra_root_vec; as a shared heap or
> >> vector of root nodes for each type of allocation
> >>
> >> and I will gladly rewrite my proposal sections for that
> >> as needs to be reedited.
> > 
> > I don't think working on the garbage collector as a separate
> > GSoC project is useful at this point.  Doing locking around
> > allocation seems like a good short-term solution and if that
> > turns out to be a performance issue for the threaded part
> > using per-thread freelists is likely an easy to deploy
> > solution.
> > 
> > Richard.
> > 
> I agree but we were discussing this:
> Or maybe a project to be more
> explicit about regions of the code that assume that the garbage-
> collector can't run within them?[3] (since the GC is state that would
> be shared by the threads).

The process of collecting garbage is not the only issue (and that
very issue is most easily mitigated by collecting only at specific
points, which is what we do, and having those be serializing points).
The main issue is the underlying memory allocator (GCC uses
garbage-collected memory plus regular heap memory).
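
[A minimal C++ sketch of the two ideas above, a lock around allocation
and collection only at explicit, serializing points.  This is not GCC
code: ToyObject, ToyHeap and collect_at_safe_point are hypothetical
names, and the mark & sweep is a toy stand-in for the real collector.]

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Toy object; real GC objects are typed and traced by generated markers.
struct ToyObject {
  std::vector<ToyObject *> children;
  bool marked = false;
};

class ToyHeap {
public:
  // Allocation is serialized by one lock and never triggers a collection.
  ToyObject *allocate() {
    std::lock_guard<std::mutex> guard(lock_);
    objects_.push_back(std::make_unique<ToyObject>());
    return objects_.back().get();
  }

  // Called only at explicit points (for example between passes).  Holding
  // the lock for the whole mark & sweep makes the point serializing.
  // Assumes every root pointer is non-null.  Returns the number freed.
  std::size_t collect_at_safe_point(const std::vector<ToyObject *> &roots) {
    std::lock_guard<std::mutex> guard(lock_);
    for (auto &obj : objects_) obj->marked = false;

    std::vector<ToyObject *> work(roots.begin(), roots.end());
    while (!work.empty()) {  // mark everything reachable from the roots
      ToyObject *obj = work.back();
      work.pop_back();
      if (obj->marked) continue;
      obj->marked = true;
      for (ToyObject *child : obj->children) work.push_back(child);
    }

    const std::size_t before = objects_.size();
    objects_.erase(  // sweep everything left unmarked
        std::remove_if(objects_.begin(), objects_.end(),
                       [](const std::unique_ptr<ToyObject> &o) {
                         return !o->marked;
                       }),
        objects_.end());
    return before - objects_.size();
  }

  std::size_t live_objects() {
    std::lock_guard<std::mutex> guard(lock_);
    return objects_.size();
  }

private:
  std::mutex lock_;
  std::vector<std::unique_ptr<ToyObject>> objects_;
};
```

[Because allocate() never collects, a thread can hold unrooted pointers
between two safe points without the collector pulling them away.]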

> In addition I moved my paper back to our discussion about garbage collector
> state with outside callers.Seems we really need to do something about
> my wording as the idea of my project in a nutshell was to figure
> out how to mark shared state by callers and inject it into the
> garbage collector letting it known that the state was not shared between
> threads or shared. Seems that was on the GSoc page and in our discussions the 
> issue
> is marking outside code for shared state. If that's correct then my
> wording of outside callers is incorrect it should have been shared
> state between threads on outside callers to the garbage collector.
> If the state is that in your wording above then great as I understand
> where we are going and will gladly change my wording.

I'm still not sure what you are shooting at; the above sentences do
not make any sense to me.

> Also freelists don't work here as the state is shared at the caller's 
> end which would need two major issues:
> 1. Locking on nodes of the 
> freelists when two threads allocate at the same thing which can be a 
> problem if the shared state is shared a lot
> 2. Locking allocation with 
> large numbers of callers can starve threads

First of all, allocating memory from the GC pool is not the main
work of GIMPLE passes, so simply serializing at allocation time might
work out.  Second, free lists of course do work.  What you'd do is
have a fast path in allocation using a thread-local "free list"
which you can allocate from without taking any lock.  Maybe I should
explain "free list", since that term doesn't make too much sense in
a garbage-collector world.  What I'd do is, when a client thread
asks for memory of size N, allocate M objects of that size but put
M - 1 of them on the client thread's local "free list", to be handed
out lock-free for the next M - 1 calls.  Note that garbage-collected
memory objects are only handed out in fixed chunk sizes (powers of two
plus a few special sizes), so you'd have one "free list" per chunk size
per thread.

The collection itself (mark & sweep) would still be fully serialized
(and would not return objects to any thread's local "free list").

ggc_free'd objects _might_ go to the thread's "free list"s (yeah, we
_do_ have ggc_free ...).
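
[The thread-local "free list" scheme described above could be sketched
in C++ roughly as follows.  This is an illustration under assumptions,
not GCC's allocator: the sketch namespace, SharedPool, kBatch (the "M"
above) and kNumClasses are hypothetical, only power-of-two size classes
are modelled (the "few special sizes" are left out), and malloc stands
in for carving chunks out of GC pages.]

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

namespace sketch {

constexpr std::size_t kNumClasses = 8;  // 8, 16, ..., 1024 bytes (assumed)
constexpr std::size_t kBatch = 16;      // "M" in the text (assumed value)

// Index of the smallest power-of-two size class that fits n bytes.
inline std::size_t size_class(std::size_t n) {
  std::size_t cls = 0, bytes = 8;
  while (bytes < n) { bytes <<= 1; ++cls; }
  return cls;
}

// Shared pool: the only place where a lock is taken.
class SharedPool {
public:
  // Slow path: hand out `count` chunks of class `cls` under the lock.
  void refill(std::size_t cls, std::size_t count, std::vector<void *> &out) {
    std::lock_guard<std::mutex> guard(lock_);
    ++locked_refills_;
    const std::size_t bytes = std::size_t{8} << cls;
    for (std::size_t i = 0; i < count; ++i)
      out.push_back(std::malloc(bytes));  // no failure handling in this sketch
  }
  std::size_t locked_refills() {
    std::lock_guard<std::mutex> guard(lock_);
    return locked_refills_;
  }

private:
  std::mutex lock_;
  std::size_t locked_refills_ = 0;
};

inline SharedPool &shared_pool() {
  static SharedPool pool;
  return pool;
}

// One "free list" per size class per thread.
inline std::array<std::vector<void *>, kNumClasses> &local_lists() {
  thread_local std::array<std::vector<void *>, kNumClasses> lists;
  return lists;
}

inline void *allocate(std::size_t n) {
  const std::size_t cls = size_class(n);
  if (cls >= kNumClasses) return nullptr;  // oversize requests not modelled
  auto &list = local_lists()[cls];
  if (list.empty())
    shared_pool().refill(cls, kBatch, list);  // lock taken once per M calls
  void *chunk = list.back();  // fast path: no lock
  list.pop_back();
  return chunk;
}

// Analogue of a ggc_free'd object going back to the caller's own list.
inline void release(void *chunk, std::size_t n) {
  const std::size_t cls = size_class(n);
  if (chunk != nullptr && cls < kNumClasses)
    local_lists()[cls].push_back(chunk);
}

}  // namespace sketch
```

[In this sketch one refill serves the following M - 1 allocations of
the same size class without locking.  Chunks still on a thread's list
when the thread exits are simply leaked here; a real design would have
to return them to the shared pool.]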

As said, I don't see GC or the memory allocator as something interesting
to work on for parallelization until the basic setup works and it
proves to be a bottleneck.

> Seems that working on the garbage collector itself isn't the issue but 
> the callers as I just figured out as related to your state idea. Let me 
> know if that's correct and if the wording change I mentioned is fine 
> with you as that's the state it seems that needs to be changed.
> Nick 

Richard.

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, 

Re: GSOC Proposal

2019-04-01 Thread nick



On 2019-04-01 9:47 a.m., Richard Biener wrote:
> On Mon, 1 Apr 2019, nick wrote:
> 
>>
>>
>> On 2019-04-01 5:56 a.m., Richard Biener wrote:
>>> On Fri, 29 Mar 2019, nick wrote:
>>>


>>>> On 2019-03-29 10:28 a.m., nick wrote:
>>>>>
>>>>>
>>>>> On 2019-03-29 5:08 a.m., Richard Biener wrote:
>>>>>> On Thu, 28 Mar 2019, nick wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2019-03-28 4:59 a.m., Richard Biener wrote:
>>>>>>>> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
>>>>>>>>>
>>>>>>>>> Greetings All,
>>>>>>>>>
>>>>>>>>> I've already done most of the work required for signing up for GSoC
>>>>>>>>> as of last year i.e. reading getting started, being signed up legally
>>>>>>>>> for contributions.
>>>>>>>>>
>>>>>>>>> My only real concern would be the proposal which I started writing
>>>>>>>>> here:
>>>>>>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
>>>>>>>>>
>>>>>>>>> The biography and success section I'm fine with my bigger concern
>>>>>>>>> would be the project and roadmap
>>>>>>>>> section. The roadmap is there and I will go into more detail about it
>>>>>>>>> in the projects section as
>>>>>>>>> need be. Just wanted to known if the roadmap is detailed enough or
>>>>>>>>> can I just write out a few
>>>>>>>>> paragraphs discussing it in the Projects Section.
>>>>>>>>
>>>>>>>> I'm not sure I understand either the problem analysis nor the project
>>>>>>>> goal parts.  What
>>>>>>>> shared state with respect to garbage collection are you talking about?
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>> I just fixed it. Seems we were discussing RTL itself. I edited it to
>>>>>>> reflect those changes. Let me know if it's unclear or you would
>>>>>>> actually
>>>>>>> like me to discuss some changes that may occur in the RTL layer itself.
>>>>>>>
>>>>>>>
>>>>>>> I'm glad to be more exact if that's better but seems your confusion was
>>>>>>> just what layer we were touching.
>>>>>>
>>>>>> Let me just throw in some knowledge here.  The issue with RTL
>>>>>> is that we currently can only have a single function in this
>>>>>> intermediate language state since a function in RTL has some
>>>>>> state in global variables that would differ if it were another
>>>>>> function.  We can have multiple functions in GIMPLE intermediate
>>>>>> language state since all such state is in a function-specific
>>>>>> data structure (struct function).  The hard thing about moving
>>>>>> all this "global" state of RTL into the same place is that
>>>>>> there's global state in the various backends (and there's
>>>>>> already a struct function 'machine' part for such state, so there's
>>>>>> hope the issue isn't as big as it could be) and that some of
>>>>>> the global state is big and only changes very rarely.
>>>>>> That said, I'm not sure if anybody knows the full details here.
>>>>>>
>>>>>> So as far as I understand you'd like to tackle this as project
>>>>>> with the goal to be able to have multiple functions in RTL
>>>>>> state.
>>>>>>
>>>>>> That's laudable but IMHO also quite ambitious for a GSoC
>>>>>> project.  It's also an area I am not very familiar with so
>>>>>> I opt out of being a mentor for this project.
>>>>>>
>>>>> While I'm aware of three areas where the shared state is an issue
>>>>> currently:
>>>>> 1, Compiler's Proper
>>>>> 2. The expand_functions
>>>>> 3. RTL
>>>>> 4.Garbage Collector
>>>>>
>>>>> Or maybe a project to be more
>>>>> explicit about regions of the code that assume that the garbage-
>>>>> collector can't run within them?[3] (since the GC is state that would
>>>>> be shared by the threads).
>>>>>
>>>>> This is what we were discussing previously and I wrote my proposal for
>>>>> that. You however seem confused about what parts of the garbage collector
>>>>> would be touched. That's fine with me, however seems you want be to
>>>>> be more exact about which part  is touched.
>>>>>
>>>>> My questions would be as it's changed back to the garbage collector
>>>>> project:
>>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
>>>>>
>>>>> 1. Your confusion about which part of the garbage collector is touched
>>>>> doesn't
>>>>> really make sense s it's for the whole garbage collector as related to
>>>>> shared
>>>>> state?
>>>>> 2. Injection was my code here in phase 3 for the callers of the new
>>>>> functions or
>>>>> macros, perhaps this is not needed as the work with the garbage collector
>>>>> is enough?
>>>>> 3. Am I not understanding this project as I thought I was in the proposal
>>>>> I wrote?
>>>>>
>>>>> Seems your more confusing my wording probably so I'm going to suggest one
>>>>> of
>>>>> two things here:
>>>>> a) I'm going to allow you to make comments with what's confusing you and
>>>>> it needs that's the issue here more than anything else so I sent you
>>>>> a link and please comment where you are having issues with this not
>>>>> be clear for you:
>>>>> Or maybe a project to be more
>>>>>

Re: GSOC Proposal

2019-04-01 Thread Richard Biener
On Mon, 1 Apr 2019, nick wrote:

> 
> 
> On 2019-04-01 5:56 a.m., Richard Biener wrote:
> > On Fri, 29 Mar 2019, nick wrote:
> > 
> >>
> >>
> >> On 2019-03-29 10:28 a.m., nick wrote:
> >>>
> >>>
> >>> On 2019-03-29 5:08 a.m., Richard Biener wrote:
> >>>> On Thu, 28 Mar 2019, nick wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>> On 2019-03-28 4:59 a.m., Richard Biener wrote:
> >>>>>> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
> >>>>>>>
> >>>>>>> Greetings All,
> >>>>>>>
> >>>>>>> I've already done most of the work required for signing up for GSoC
> >>>>>>> as of last year i.e. reading getting started, being signed up legally
> >>>>>>> for contributions.
> >>>>>>>
> >>>>>>> My only real concern would be the proposal which I started writing
> >>>>>>> here:
> >>>>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
> >>>>>>>
> >>>>>>> The biography and success section I'm fine with my bigger concern
> >>>>>>> would be the project and roadmap
> >>>>>>> section. The roadmap is there and I will go into more detail about it
> >>>>>>> in the projects section as
> >>>>>>> need be. Just wanted to known if the roadmap is detailed enough or
> >>>>>>> can I just write out a few
> >>>>>>> paragraphs discussing it in the Projects Section.
> >>>>>>
> >>>>>> I'm not sure I understand either the problem analysis nor the project
> >>>>>> goal parts.  What
> >>>>>> shared state with respect to garbage collection are you talking about?
> >>>>>>
> >>>>>> Richard.
> >>>>>>
> >>>>> I just fixed it. Seems we were discussing RTL itself. I edited it to
> >>>>> reflect those changes. Let me know if it's unclear or you would
> >>>>> actually
> >>>>> like me to discuss some changes that may occur in the RTL layer itself.
> >>>>>
> >>>>>
> >>>>> I'm glad to be more exact if that's better but seems your confusion was
> >>>>> just what layer we were touching.
> >>>>
> >>>> Let me just throw in some knowledge here.  The issue with RTL
> >>>> is that we currently can only have a single function in this
> >>>> intermediate language state since a function in RTL has some
> >>>> state in global variables that would differ if it were another
> >>>> function.  We can have multiple functions in GIMPLE intermediate
> >>>> language state since all such state is in a function-specific
> >>>> data structure (struct function).  The hard thing about moving
> >>>> all this "global" state of RTL into the same place is that
> >>>> there's global state in the various backends (and there's
> >>>> already a struct function 'machine' part for such state, so there's
> >>>> hope the issue isn't as big as it could be) and that some of
> >>>> the global state is big and only changes very rarely.
> >>>> That said, I'm not sure if anybody knows the full details here.
> >>>>
> >>>> So as far as I understand you'd like to tackle this as project
> >>>> with the goal to be able to have multiple functions in RTL
> >>>> state.
> >>>>
> >>>> That's laudable but IMHO also quite ambitious for a GSoC
> >>>> project.  It's also an area I am not very familiar with so
> >>>> I opt out of being a mentor for this project.
> >>>>
> >>> While I'm aware of three areas where the shared state is an issue
> >>> currently:
> >>> 1, Compiler's Proper
> >>> 2. The expand_functions 
> >>> 3. RTL
> >>> 4.Garbage Collector
> >>>
> >>> Or maybe a project to be more
> >>> explicit about regions of the code that assume that the garbage-
> >>> collector can't run within them?[3] (since the GC is state that would
> >>> be shared by the threads).
> >>>
> >>> This is what we were discussing previously and I wrote my proposal for
> >>> that. You however seem confused about what parts of the garbage collector
> >>> would be touched. That's fine with me, however seems you want be to
> >>> be more exact about which part  is touched.
> >>>
> >>> My questions would be as it's changed back to the garbage collector 
> >>> project:
> >>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
> >>>
> >>> 1. Your confusion about which part of the garbage collector is touched 
> >>> doesn't
> >>> really make sense s it's for the whole garbage collector as related to 
> >>> shared
> >>> state?
> >>> 2. Injection was my code here in phase 3 for the callers of the new 
> >>> functions or
> >>> macros, perhaps this is not needed as the work with the garbage collector 
> >>> is enough?
> >>> 3. Am I not understanding this project as I thought I was in the proposal 
> >>> I wrote?
> >>>
> >>> Seems your more confusing my wording probably so I'm going to suggest one 
> >>> of 
> >>> two things here:
> >>> a) I'm going to allow you to make comments with what's confusing you and
> >>> it needs that's the issue here more than anything else so I sent you 
> >>> a link and please comment where you are having issues with this not
> >>> be clear for you:
> >>> Or maybe a project to be more
> >>> explicit about regions of the code that assume that 

Re: GSOC Proposal

2019-04-01 Thread nick



On 2019-04-01 5:56 a.m., Richard Biener wrote:
> On Fri, 29 Mar 2019, nick wrote:
> 
>>
>>
>> On 2019-03-29 10:28 a.m., nick wrote:
>>>
>>>
>>> On 2019-03-29 5:08 a.m., Richard Biener wrote:
>>>> On Thu, 28 Mar 2019, nick wrote:
>>>>
>>>>>
>>>>>
>>>>> On 2019-03-28 4:59 a.m., Richard Biener wrote:
>>>>>> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
>>>>>>>
>>>>>>> Greetings All,
>>>>>>>
>>>>>>> I've already done most of the work required for signing up for GSoC
>>>>>>> as of last year i.e. reading getting started, being signed up legally
>>>>>>> for contributions.
>>>>>>>
>>>>>>> My only real concern would be the proposal which I started writing here:
>>>>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
>>>>>>>
>>>>>>> The biography and success section I'm fine with my bigger concern would
>>>>>>> be the project and roadmap
>>>>>>> section. The roadmap is there and I will go into more detail about it
>>>>>>> in the projects section as
>>>>>>> need be. Just wanted to known if the roadmap is detailed enough or can
>>>>>>> I just write out a few
>>>>>>> paragraphs discussing it in the Projects Section.
>>>>>>
>>>>>> I'm not sure I understand either the problem analysis nor the project
>>>>>> goal parts.  What
>>>>>> shared state with respect to garbage collection are you talking about?
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>> I just fixed it. Seems we were discussing RTL itself. I edited it to
>>>>> reflect those changes. Let me know if it's unclear or you would actually
>>>>> like me to discuss some changes that may occur in the RTL layer itself.
>>>>>
>>>>>
>>>>> I'm glad to be more exact if that's better but seems your confusion was
>>>>> just what layer we were touching.
>>>>
>>>> Let me just throw in some knowledge here.  The issue with RTL
>>>> is that we currently can only have a single function in this
>>>> intermediate language state since a function in RTL has some
>>>> state in global variables that would differ if it were another
>>>> function.  We can have multiple functions in GIMPLE intermediate
>>>> language state since all such state is in a function-specific
>>>> data structure (struct function).  The hard thing about moving
>>>> all this "global" state of RTL into the same place is that
>>>> there's global state in the various backends (and there's
>>>> already a struct funtion 'machine' part for such state, so there's
>>>> hope the issue isn't as big as it could be) and that some of
>>>> the global state is big and only changes very rarely.
>>>> That said, I'm not sure if anybody knows the full details here.
>>>>
>>>> So as far as I understand you'd like to tackle this as project
>>>> with the goal to be able to have multiple functions in RTL
>>>> state.
>>>>
>>>> That's laudable but IMHO also quite ambitious for a GSoC
>>>> project.  It's also an area I am not very familiar with so
>>>> I opt out of being a mentor for this project.
>>>>
>>> While I'm aware of three areas where the shared state is an issue
>>> currently:
>>> 1, Compiler's Proper
>>> 2. The expand_functions 
>>> 3. RTL
>>> 4.Garbage Collector
>>>
>>> Or maybe a project to be more
>>> explicit about regions of the code that assume that the garbage-
>>> collector can't run within them?[3] (since the GC is state that would
>>> be shared by the threads).
>>>
>>> This is what we were discussing previously and I wrote my proposal for
>>> that. You however seem confused about what parts of the garbage collector
>>> would be touched. That's fine with me, however seems you want be to
>>> be more exact about which part  is touched.
>>>
>>> My questions would be as it's changed back to the garbage collector project:
>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
>>>
>>> 1. Your confusion about which part of the garbage collector is touched 
>>> doesn't
>>> really make sense s it's for the whole garbage collector as related to 
>>> shared
>>> state?
>>> 2. Injection was my code here in phase 3 for the callers of the new 
>>> functions or
>>> macros, perhaps this is not needed as the work with the garbage collector 
>>> is enough?
>>> 3. Am I not understanding this project as I thought I was in the proposal I 
>>> wrote?
>>>
>>> Seems your more confusing my wording probably so I'm going to suggest one 
>>> of 
>>> two things here:
>>> a) I'm going to allow you to make comments with what's confusing you and
>>> it needs that's the issue here more than anything else so I sent you 
>>> a link and please comment where you are having issues with this not
>>> be clear for you:
>>> Or maybe a project to be more
>>> explicit about regions of the code that assume that the garbage-
>>> collector can't run within them?[3] (since the GC is state that would
>>> be shared by the threads).
>>> as that's the actual project
>>>
>>> b) Just comment here about the wording that's an issue for you or
>>> where you want more exact wording
>>>
>>> Sorry and 

Re: GSOC Proposal

2019-04-01 Thread Nathan Sidwell

On 4/1/19 1:24 AM, Eric Gallager wrote:

> On 3/29/19, nick  wrote:
>> Seems your right touching the complete garbage collector is too much. I'm
>> just looking at the users of the garbage collector and it seems one of the
>> major ones is pre compiled headers.
>
> The thing about pre-compiled headers is that I seem to remember some
> GCC devs saying they wanted to rip out pre-compiled headers completely
> once the C++ modules branch is merged, so I'm not sure if it's worth
> putting that much work into something that might be removed soon,
> anyways... I'm pretty sure Nathan Sidwell is the main person working
> on the C++ modules branch, so I'm cc-ing him.


The use of the GC machinery for PCH is needed by the front ends and is 
orthogonal to that for GC during code generation.


nathan

--
Nathan Sidwell


Re: GSOC Proposal

2019-04-01 Thread Richard Biener
On Fri, 29 Mar 2019, nick wrote:

> 
> 
> On 2019-03-29 10:28 a.m., nick wrote:
> > 
> > 
> > On 2019-03-29 5:08 a.m., Richard Biener wrote:
> >> On Thu, 28 Mar 2019, nick wrote:
> >>
> >>>
> >>>
> >>> On 2019-03-28 4:59 a.m., Richard Biener wrote:
> >>>> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
> >>>>>
> >>>>> Greetings All,
> >>>>>
> >>>>> I've already done most of the work required for signing up for GSoC
> >>>>> as of last year i.e. reading getting started, being signed up legally
> >>>>> for contributions.
> >>>>>
> >>>>> My only real concern would be the proposal which I started writing here:
> >>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
> >>>>>
> >>>>> The biography and success section I'm fine with my bigger concern would
> >>>>> be the project and roadmap
> >>>>> section. The roadmap is there and I will go into more detail about it
> >>>>> in the projects section as
> >>>>> need be. Just wanted to known if the roadmap is detailed enough or can
> >>>>> I just write out a few
> >>>>> paragraphs discussing it in the Projects Section.
> >>>>
> >>>> I'm not sure I understand either the problem analysis nor the project
> >>>> goal parts.  What
> >>>> shared state with respect to garbage collection are you talking about?
> >>>>
> >>>> Richard.
> >>>>
> >>> I just fixed it. Seems we were discussing RTL itself. I edited it to 
> >>> reflect those changes. Let me know if it's unclear or you would actually 
> >>> like me to discuss some changes that may occur in the RTL layer itself.
> >>>
> >>>
> >>> I'm glad to be more exact if that's better but seems your confusion was 
> >>> just what layer we were touching.
> >>
> >> Let me just throw in some knowledge here.  The issue with RTL
> >> is that we currently can only have a single function in this
> >> intermediate language state since a function in RTL has some
> >> state in global variables that would differ if it were another
> >> function.  We can have multiple functions in GIMPLE intermediate
> >> language state since all such state is in a function-specific
> >> data structure (struct function).  The hard thing about moving
> >> all this "global" state of RTL into the same place is that
> >> there's global state in the various backends (and there's
> >> already a struct funtion 'machine' part for such state, so there's
> >> hope the issue isn't as big as it could be) and that some of
> >> the global state is big and only changes very rarely.
> >> That said, I'm not sure if anybody knows the full details here.
> >>
> >> So as far as I understand you'd like to tackle this as project
> >> with the goal to be able to have multiple functions in RTL
> >> state.
> >>
> >> That's laudable but IMHO also quite ambitious for a GSoC
> >> project.  It's also an area I am not very familiar with so
> >> I opt out of being a mentor for this project.
> >>
> > While I'm aware of three areas where the shared state is an issue
> > currently:
> > 1, Compiler's Proper
> > 2. The expand_functions 
> > 3. RTL
> > 4.Garbage Collector
> > 
> > Or maybe a project to be more
> > explicit about regions of the code that assume that the garbage-
> > collector can't run within them?[3] (since the GC is state that would
> > be shared by the threads).
> > 
> > This is what we were discussing previously and I wrote my proposal for
> > that. You however seem confused about what parts of the garbage collector
> > would be touched. That's fine with me, however seems you want be to
> > be more exact about which part  is touched.
> > 
> > My questions would be as it's changed back to the garbage collector project:
> > https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
> > 
> > 1. Your confusion about which part of the garbage collector is touched 
> > doesn't
> > really make sense s it's for the whole garbage collector as related to 
> > shared
> > state?
> > 2. Injection was my code here in phase 3 for the callers of the new 
> > functions or
> > macros, perhaps this is not needed as the work with the garbage collector 
> > is enough?
> > 3. Am I not understanding this project as I thought I was in the proposal I 
> > wrote?
> > 
> > Seems your more confusing my wording probably so I'm going to suggest one 
> > of 
> > two things here:
> > a) I'm going to allow you to make comments with what's confusing you and
> > it needs that's the issue here more than anything else so I sent you 
> > a link and please comment where you are having issues with this not
> > be clear for you:
> > Or maybe a project to be more
> > explicit about regions of the code that assume that the garbage-
> > collector can't run within them?[3] (since the GC is state that would
> > be shared by the threads).
> > as that's the actual project
> > 
> > b) Just comment here about the wording that's an issue for you or
> > where you want more exact wording
> > 
> > Sorry and hopefully this is helps you understand where I'm 

Re: GSOC Proposal

2019-03-31 Thread Eric Gallager
On 3/29/19, nick  wrote:
>
>
> On 2019-03-29 10:28 a.m., nick wrote:
>>
>>
>> On 2019-03-29 5:08 a.m., Richard Biener wrote:
>>> On Thu, 28 Mar 2019, nick wrote:
>>>


>>>> On 2019-03-28 4:59 a.m., Richard Biener wrote:
>>>>> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
>>>>>>
>>>>>> Greetings All,
>>>>>>
>>>>>> I've already done most of the work required for signing up for GSoC
>>>>>> as of last year i.e. reading getting started, being signed up legally
>>>>>> for contributions.
>>>>>>
>>>>>> My only real concern would be the proposal which I started writing
>>>>>> here:
>>>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
>>>>>>
>>>>>> The biography and success section I'm fine with my bigger concern
>>>>>> would be the project and roadmap
>>>>>> section. The roadmap is there and I will go into more detail about it
>>>>>> in the projects section as
>>>>>> need be. Just wanted to known if the roadmap is detailed enough or can
>>>>>> I just write out a few
>>>>>> paragraphs discussing it in the Projects Section.
>>>>>
>>>>> I'm not sure I understand either the problem analysis nor the project
>>>>> goal parts.  What
>>>>> shared state with respect to garbage collection are you talking about?
>>>>>
>>>>> Richard.
>>>>>
>>>> I just fixed it. Seems we were discussing RTL itself. I edited it to
>>>> reflect those changes. Let me know if it's unclear or you would actually
>>>>
>>>> like me to discuss some changes that may occur in the RTL layer itself.
>>>>
>>>>
>>>> I'm glad to be more exact if that's better but seems your confusion was
>>>>
>>>> just what layer we were touching.
>>>
>>> Let me just throw in some knowledge here.  The issue with RTL
>>> is that we currently can only have a single function in this
>>> intermediate language state since a function in RTL has some
>>> state in global variables that would differ if it were another
>>> function.  We can have multiple functions in GIMPLE intermediate
>>> language state since all such state is in a function-specific
>>> data structure (struct function).  The hard thing about moving
>>> all this "global" state of RTL into the same place is that
>>> there's global state in the various backends (and there's
>>> already a struct funtion 'machine' part for such state, so there's
>>> hope the issue isn't as big as it could be) and that some of
>>> the global state is big and only changes very rarely.
>>> That said, I'm not sure if anybody knows the full details here.
>>>
>>> So as far as I understand you'd like to tackle this as project
>>> with the goal to be able to have multiple functions in RTL
>>> state.
>>>
>>> That's laudable but IMHO also quite ambitious for a GSoC
>>> project.  It's also an area I am not very familiar with so
>>> I opt out of being a mentor for this project.
>>>
>> While I'm aware of three areas where the shared state is an issue
>> currently:
>> 1, Compiler's Proper
>> 2. The expand_functions
>> 3. RTL
>> 4.Garbage Collector
>>
>> Or maybe a project to be more
>> explicit about regions of the code that assume that the garbage-
>> collector can't run within them?[3] (since the GC is state that would
>> be shared by the threads).
>>
>> This is what we were discussing previously and I wrote my proposal for
>> that. You however seem confused about what parts of the garbage collector
>> would be touched. That's fine with me, however seems you want be to
>> be more exact about which part  is touched.
>>
>> My questions would be as it's changed back to the garbage collector
>> project:
>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
>>
>> 1. Your confusion about which part of the garbage collector is touched
>> doesn't
>> really make sense s it's for the whole garbage collector as related to
>> shared
>> state?
>> 2. Injection was my code here in phase 3 for the callers of the new
>> functions or
>> macros, perhaps this is not needed as the work with the garbage collector
>> is enough?
>> 3. Am I not understanding this project as I thought I was in the proposal
>> I wrote?
>>
>> Seems your more confusing my wording probably so I'm going to suggest one
>> of
>> two things here:
>> a) I'm going to allow you to make comments with what's confusing you and
>> it needs that's the issue here more than anything else so I sent you
>> a link and please comment where you are having issues with this not
>> be clear for you:
>> Or maybe a project to be more
>> explicit about regions of the code that assume that the garbage-
>> collector can't run within them?[3] (since the GC is state that would
>> be shared by the threads).
>> as that's the actual project
>>
>> b) Just comment here about the wording that's an issue for you or
>> where you want more exact wording
>>
>> Sorry and hopefully this is helps you understand where I'm going,
>> Nick
>>
>>> Richard.
>>>
>>>> Nick
>>>>>> Any other comments are welcome as well as I write it there,
>>>>>> Nick


Re: GSOC Proposal

2019-03-29 Thread nick



On 2019-03-29 10:28 a.m., nick wrote:
> 
> 
> On 2019-03-29 5:08 a.m., Richard Biener wrote:
>> On Thu, 28 Mar 2019, nick wrote:
>>
>>>
>>>
>>> On 2019-03-28 4:59 a.m., Richard Biener wrote:
>>>> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
>>>>>
>>>>> Greetings All,
>>>>>
>>>>> I've already done most of the work required for signing up for GSoC
>>>>> as of last year i.e. reading getting started, being signed up legally
>>>>> for contributions.
>>>>>
>>>>> My only real concern would be the proposal which I started writing here:
>>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
>>>>>
>>>>> The biography and success section I'm fine with my bigger concern would
>>>>> be the project and roadmap
>>>>> section. The roadmap is there and I will go into more detail about it in
>>>>> the projects section as
>>>>> need be. Just wanted to known if the roadmap is detailed enough or can I
>>>>> just write out a few
>>>>> paragraphs discussing it in the Projects Section.
>>>>
>>>> I'm not sure I understand either the problem analysis nor the project
>>>> goal parts.  What
>>>> shared state with respect to garbage collection are you talking about?
>>>>
>>>> Richard.

>>> I just fixed it. Seems we were discussing RTL itself. I edited it to 
>>> reflect those changes. Let me know if it's unclear or you would actually 
>>> like me to discuss some changes that may occur in the RTL layer itself.
>>>
>>>
>>> I'm glad to be more exact if that's better but seems your confusion was 
>>> just what layer we were touching.
>>
>> Let me just throw in some knowledge here.  The issue with RTL
>> is that we currently can only have a single function in this
>> intermediate language state since a function in RTL has some
>> state in global variables that would differ if it were another
>> function.  We can have multiple functions in GIMPLE intermediate
>> language state since all such state is in a function-specific
>> data structure (struct function).  The hard thing about moving
>> all this "global" state of RTL into the same place is that
>> there's global state in the various backends (and there's
>> already a struct funtion 'machine' part for such state, so there's
>> hope the issue isn't as big as it could be) and that some of
>> the global state is big and only changes very rarely.
>> That said, I'm not sure if anybody knows the full details here.
>>
>> So as far as I understand you'd like to tackle this as project
>> with the goal to be able to have multiple functions in RTL
>> state.
>>
>> That's laudable but IMHO also quite ambitious for a GSoC
>> project.  It's also an area I am not very familiar with so
>> I opt out of being a mentor for this project.
>>
> While I'm aware of three areas where the shared state is an issue
> currently:
> 1, Compiler's Proper
> 2. The expand_functions 
> 3. RTL
> 4.Garbage Collector
> 
> Or maybe a project to be more
> explicit about regions of the code that assume that the garbage-
> collector can't run within them?[3] (since the GC is state that would
> be shared by the threads).
> 
> This is what we were discussing previously and I wrote my proposal for
> that. You however seem confused about what parts of the garbage collector
> would be touched. That's fine with me, however seems you want be to
> be more exact about which part  is touched.
> 
> My questions would be as it's changed back to the garbage collector project:
> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
> 
> 1. Your confusion about which part of the garbage collector is touched doesn't
> really make sense s it's for the whole garbage collector as related to shared
> state?
> 2. Injection was my code here in phase 3 for the callers of the new functions 
> or
> macros, perhaps this is not needed as the work with the garbage collector is 
> enough?
> 3. Am I not understanding this project as I thought I was in the proposal I 
> wrote?
> 
> Seems your more confusing my wording probably so I'm going to suggest one of 
> two things here:
> a) I'm going to allow you to make comments with what's confusing you and
> it needs that's the issue here more than anything else so I sent you 
> a link and please comment where you are having issues with this not
> be clear for you:
> Or maybe a project to be more
> explicit about regions of the code that assume that the garbage-
> collector can't run within them?[3] (since the GC is state that would
> be shared by the threads).
> as that's the actual project
> 
> b) Just comment here about the wording that's an issue for you or
> where you want more exact wording
> 
> Sorry and hopefully this is helps you understand where I'm going,
> Nick
> 
>> Richard.
>>
>>> Nick
>>>>> Any other comments are welcome as well as I write it there,
>>>>> Nick
>>>

Richard,

It seems you're right: touching the complete garbage collector is too much. I'm
just looking at the users of the garbage collector and it seems 

Re: GSOC Proposal

2019-03-29 Thread nick



On 2019-03-29 5:08 a.m., Richard Biener wrote:
> On Thu, 28 Mar 2019, nick wrote:
> 
>>
>>
>> On 2019-03-28 4:59 a.m., Richard Biener wrote:
>>> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:

>>>> Greetings All,
>>>>
>>>> I've already done most of the work required for signing up for GSoC
>>>> as of last year i.e. reading getting started, being signed up legally
>>>> for contributions.
>>>>
>>>> My only real concern would be the proposal which I started writing here:
>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
>>>>
>>>> The biography and success section I'm fine with my bigger concern would be
>>>> the project and roadmap
>>>> section. The roadmap is there and I will go into more detail about it in
>>>> the projects section as
>>>> need be. Just wanted to known if the roadmap is detailed enough or can I
>>>> just write out a few
>>>> paragraphs discussing it in the Projects Section.
>>>
>>> I'm not sure I understand either the problem analysis nor the project
>>> goal parts.  What
>>> shared state with respect to garbage collection are you talking about?
>>>
>>> Richard.
>>>
>> I just fixed it. Seems we were discussing RTL itself. I edited it to 
>> reflect those changes. Let me know if it's unclear or you would actually 
>> like me to discuss some changes that may occur in the RTL layer itself.
>>
>>
>> I'm glad to be more exact if that's better but seems your confusion was 
>> just what layer we were touching.
> 
> Let me just throw in some knowledge here.  The issue with RTL
> is that we currently can only have a single function in this
> intermediate language state since a function in RTL has some
> state in global variables that would differ if it were another
> function.  We can have multiple functions in GIMPLE intermediate
> language state since all such state is in a function-specific
> data structure (struct function).  The hard thing about moving
> all this "global" state of RTL into the same place is that
> there's global state in the various backends (and there's
> already a struct funtion 'machine' part for such state, so there's
> hope the issue isn't as big as it could be) and that some of
> the global state is big and only changes very rarely.
> That said, I'm not sure if anybody knows the full details here.
> 
> So as far as I understand you'd like to tackle this as project
> with the goal to be able to have multiple functions in RTL
> state.
> 
> That's laudable but IMHO also quite ambitious for a GSoC
> project.  It's also an area I am not very familiar with so
> I opt out of being a mentor for this project.
> 
I'm aware of four areas where the shared state is currently an issue:
1. The compiler proper
2. The expand_functions
3. RTL
4. Garbage collector

Or maybe a project to be more
explicit about regions of the code that assume that the garbage-
collector can't run within them?[3] (since the GC is state that would
be shared by the threads).

This is what we were discussing previously, and I wrote my proposal for
that. You, however, seem confused about which parts of the garbage collector
would be touched. That's fine with me; however, it seems you want me to
be more exact about which part is touched.

Now that it's changed back to the garbage collector project, my questions would be:
https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit

1. Your confusion about which part of the garbage collector is touched doesn't
really make sense, as it's for the whole garbage collector as related to shared
state?
2. Injection was my code here in phase 3 for the callers of the new functions or
macros; perhaps this is not needed, as the work with the garbage collector is
enough?
3. Am I not understanding this project as I thought I was in the proposal I
wrote?

It seems you're probably more confused by my wording, so I'm going to suggest
one of two things here:
a) I'm going to allow you to make comments on what's confusing you, since
that's the issue here more than anything else. I sent you a link, so
please comment where you are having issues with this not being clear
for you:
Or maybe a project to be more
explicit about regions of the code that assume that the garbage-
collector can't run within them?[3] (since the GC is state that would
be shared by the threads).
as that's the actual project

b) Just comment here about the wording that's an issue for you or
where you want more exact wording

Sorry, and hopefully this helps you understand where I'm going,
Nick

> Richard.
> 
>> Nick
>>>> Any other comments are welcome as well as I write it there,
>>>> Nick
>>
> 


Re: GSOC Proposal

2019-03-29 Thread Richard Biener
On Thu, 28 Mar 2019, nick wrote:

> 
> 
> On 2019-03-28 4:59 a.m., Richard Biener wrote:
> > On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
> >>
> >> Greetings All,
> >>
> >> I've already done most of the work required for signing up for GSoC
> >> as of last year i.e. reading getting started, being signed up legally
> >> for contributions.
> >>
> >> My only real concern would be the proposal which I started writing here:
> >> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
> >>
> >> The biography and success section I'm fine with my bigger concern would be 
> >> the project and roadmap
> >> section. The roadmap is there and I will go into more detail about it in 
> >> the projects section as
> >> need be. Just wanted to known if the roadmap is detailed enough or can I 
> >> just write out a few
> >> paragraphs discussing it in the Projects Section.
> > 
> > I'm not sure I understand either the problem analysis nor the project
> > goal parts.  What
> > shared state with respect to garbage collection are you talking about?
> > 
> > Richard.
> > 
> I just fixed it. Seems we were discussing RTL itself. I edited it to 
> reflect those changes. Let me know if it's unclear or you would actually 
> like me to discuss some changes that may occur in the RTL layer itself.
> 
> 
> I'm glad to be more exact if that's better but seems your confusion was 
> just what layer we were touching.

Let me just throw in some knowledge here.  The issue with RTL
is that we currently can only have a single function in this
intermediate language state since a function in RTL has some
state in global variables that would differ if it were another
function.  We can have multiple functions in GIMPLE intermediate
language state since all such state is in a function-specific
data structure (struct function).  The hard thing about moving
all this "global" state of RTL into the same place is that
there's global state in the various backends (and there's
already a struct function 'machine' part for such state, so there's
hope the issue isn't as big as it could be) and that some of
the global state is big and only changes very rarely.
That said, I'm not sure if anybody knows the full details here.
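
To make the distinction above concrete, here is a minimal C++ sketch. It is
not GCC code: every name in it (g_insn_count, function_state, emit_insn and
so on) is hypothetical and chosen only for illustration. It contrasts state
kept in a global variable, which can describe only one function at a time,
with state kept in a per-function structure in the spirit of struct function.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical global state: it can hold the value for only one function
// at a time, so starting a second function overwrites the first.
static int g_insn_count = 0;

// Hypothetical per-function structure that owns its own state.
struct function_state {
  explicit function_state(std::string n) : name(std::move(n)) {}
  std::string name;
  int insn_count = 0;
};

// Global style: beginning a function resets the single shared counter.
void begin_function_global() { g_insn_count = 0; }
void emit_insn_global() { ++g_insn_count; }

// Per-function style: each call touches only the structure passed in.
void emit_insn(function_state &fn) { ++fn.insn_count; }

// Interleave work on two functions; each keeps its own count.
std::vector<function_state> interleave_per_function() {
  function_state a("a"), b("b");
  emit_insn(a);
  emit_insn(b);
  emit_insn(a);
  return {a, b};
}

// The same kind of interleaving with the global counter: once "b" begins,
// the count recorded for "a" is gone.
int interleave_global() {
  begin_function_global();  // start "a"
  emit_insn_global();
  begin_function_global();  // start "b": a's count is overwritten
  emit_insn_global();
  return g_insn_count;      // only b's count survives
}
```

In this sketch, interleave_per_function() keeps both functions alive side by
side, while interleave_global() can only ever report the most recently
started one, which is the limitation described above in miniature.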

So as far as I understand you'd like to tackle this as a project
with the goal to be able to have multiple functions in RTL
state.

That's laudable but IMHO also quite ambitious for a GSoC
project.  It's also an area I am not very familiar with so
I opt out of being a mentor for this project.

Richard.

> Nick
> >> Any other comments are welcome as well as I write it there,
> >> Nick
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: GSOC Proposal

2019-03-28 Thread nick



On 2019-03-28 4:59 a.m., Richard Biener wrote:
> On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
>>
>> Greetings All,
>>
>> I've already done most of the work required for signing up for GSoC
>> as of last year i.e. reading getting started, being signed up legally
>> for contributions.
>>
>> My only real concern would be the proposal which I started writing here:
>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
>>
>> The biography and success section I'm fine with my bigger concern would be 
>> the project and roadmap
>> section. The roadmap is there and I will go into more detail about it in the 
>> projects section as
>> need be. Just wanted to known if the roadmap is detailed enough or can I 
>> just write out a few
>> paragraphs discussing it in the Projects Section.
> 
> I'm not sure I understand either the problem analysis nor the project
> goal parts.  What
> shared state with respect to garbage collection are you talking about?
> 
> Richard.
> 
I just fixed it. It seems we were discussing RTL itself, so I edited it to
reflect those changes. Let me know if it's unclear, or if you would actually
like me to discuss some changes that may occur in the RTL layer itself.

I'm glad to be more exact if that's better, but it seems your confusion was
just about which layer we were touching.

Nick
>> Any other comments are welcome as well as I write it there,
>> Nick


Re: GSOC Proposal

2019-03-28 Thread Richard Biener
On Wed, Mar 27, 2019 at 6:31 PM nick  wrote:
>
> Greetings All,
>
> I've already done most of the work required for signing up for GSoC
> as of last year i.e. reading getting started, being signed up legally
> for contributions.
>
> My only real concern would be the proposal which I started writing here:
> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit?usp=sharing
>
> The biography and success sections I'm fine with; my bigger concern is
> the project and roadmap section. The roadmap is there, and I will go into
> more detail about it in the projects section as need be. I just wanted to
> know if the roadmap is detailed enough, or can I just write out a few
> paragraphs discussing it in the Projects section?

I'm not sure I understand either the problem analysis or the project
goal parts.  What shared state with respect to garbage collection are
you talking about?

Richard.

> Any other comments are welcome as well as I write it there,
> Nick


Re: GSOC proposal

2018-03-26 Thread Martin Jambor
Hello Ismael,

On Wed, Mar 21 2018, Ismael El Houas Ghouddana wrote:
> Dear Mr./Mrs,
>
> First of all, I really appreciate your time and attention. I am Ismael El
> Houas, an aerospace engineering student with knowledge of Google Cloud
> Platform, and I want to express my interest in working on your project.

I am sorry to reply only now; mainly because of traveling, I was not
reading my email in the second half of last week.

>
> Secondly, I want to ask if I am still in time to apply to this project;
> unfortunately, I was not aware of GSoC until yesterday. If I am still
> able to apply for it, I will make the proposal as soon as possible.

Strictly speaking, the deadline is tomorrow, as decided by the GSoC
organizers.  If you have been working on a proposal despite not hearing
from us, we would still like to see it and encourage you to submit it
before the deadline.  If you haven't, it is really getting rather late,
unless you have a very clear idea of what you want to do (in that case
we should still try!).

My apologies again for missing your message, I hope GSoC works out for
you one way or another.

Martin



Re: GSoC Proposal

2013-03-21 Thread Benjamin De Kosnik

 I have been told that the Project - Implement regular expressions in
 C++ mentored by Sir Benjamin De Kosnik is not completed and is
 available as a GSoC project this year also by the Mentor.

Sorry, there still appears to be some confusion here.

I am not mentoring GSOC this year. Here is some helpful information for
people who want to propose projects:
http://gcc.gnu.org/ml/gcc/2013-03/msg00082.html

-benjamin


RE: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-04-12 Thread Thibault Raffaillac
Quite lengthy but very interesting mail! It took me a while to formulate a
proper reply :)

 Feedback can be scarce, but don't let that stop you from submitting a
 proposal.
 Either way, can you keep me informed about any progress? I might wish to help
 though that would probably be later in the cycle (got a lot queued up for
 the coming months).

Submitted :) The reviews are not too positive yet; my biggest efforts go into
making my plan clear. If there is any progress, help will be much appreciated.

 Great that's exactly what I'm aiming at:) It's not just presenting the
 results of static analysis in real-time, as I actually dislike most
 kinds of it like finding memory leaks, to me that seems like an attempt
 to make the computer do what it's really bad at (understanding the
 code). I just want to give the programmer the fullest picture of the
 situation but at the same time make it so it doesn't become noise that
 interferes. More or less you can say the goal is To provide feedback
 that allows the user to extend his understanding of the program. That
 mostly means giving access to all the information that can be
 unambiguously concluded from the code by the computer. To what degree
 we carry it and how much the compiler is involved is only a question of
 practicality and performance.

I quite agree for the most part, still there is a subtle nuance on which I want
to argue: Do we really help the programmer by offering all the valuable
information that is possible to infer? Ten years from now, would he/she be a
better programmer if we had not let him/her strive to simulate the program in
mind, or code a portion in assembly and finally learn about machine
architecture?

My point is to avoid creating an interface that assists rather than helps the
programmer, as he/she might become dependent on it. This is just helping in the
short term, and the only person who ever learns something is the one who
actually creates the compiler. If a statement could sum my view, it would be
that the user improves through his/her use of the interface (here the
feedback messages).

How does it make a difference in practice? I want to minimize the information
given :)
The reason I want to introduce feedback messages is that this particular
information (the inner workings of compilers) is very hard to find in practice.
I want to give a slight help to put the user on the rails, nothing more.

 Perfect! However, how to do that so that it actually works seems a bit
 complex. The first (practically unsolvable) issue is what actually
 constitutes better code, as given two pieces one may be faster in some
 circumstances while the other in different. But as I understand that's
 not really what we're trying to tell the user, rather we want him to
 explore for himself what's possible and what are the results and why
 they are the way they are? I'm guessing this will unfortunately (or
 fortunately) require him to actually see and understand the intermediate
 code, see how it changes after different optimizations, and see the
 output assembly. Personally I really need/want that;) Though my end
 target is a bit more to broaden the abstraction when programming
 (both up and down), so not to just show what's happening with the code
 but also allow the programmer to interact with it on that lower level.
 LLVM seems like the perfect fit for that but I've got some gripes with
 it, and that is still far away in the future.

Excellent! Letting the user explore by himself sounds great, and seeing the
output assembly/IR besides is indeed a must. I like the idea that compilation
is a cooperation between programmer and machine (as far as the programmer is
inclined to help of course). It would also be nice to see compilation be split
at Value range propagation, as one could verify it is properly computed, before
proceeding into optimizations.

 Unfortunately I only saw 36m of it as it broke and seeking doesn't work
 on vimeo for me, so I'll watch the rest later. To me it touches on some
 of the right issues/concepts but in slightly the wrong way, and it
 completely ignores some issues.

Agreed. (Only the first half of the video is relevant for the programming
prototype)

Thibault


Re: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-04-04 Thread Tomasz Borowik
On Mon, 2 Apr 2012 19:57:20 +
Thibault Raffaillac t...@kth.se wrote:

 Bump!
 
 Let me renew my interest in contributing through GSoC with post-compilation
 feedback (This was not an early april joke). Do you think it could lead to an
 acceptable GSoC proposal? (mentor interested?)

Feedback can be scarce, but don't let that stop you from submitting a
proposal.
Either way, can you keep me informed about any progress? I might wish to help
though that would probably be later in the cycle (got a lot queued up for
the coming months).

 @Tomasz:
 On the interaction side I totally agree that communication between compiler
 and programmer is scarce (and there is room for improvement). Focusing too
 soon on the editor would overlook the wide range of user needs though, as:
 _ some users do not use an IDE (and will kindly refuse);
 _ some users do not need more communication, as they already know what GCC can
   and cannot do;
 _ some users do not want more communication, as they have other business to
   focus on;

Sure, I'm one of the people who don't use an IDE as it causes more
issues than it solves for me. This isn't meant for everyone the same
way anything else isn't, it just can't;p Still looking at it, other
languages, different IDEs, I'd say my way of tackling the issues is
more usable and useful than most other, and could easily see wider
adoption. Btw my experience is mostly in low-level kernel/driver
programming, 2/3d graphics, games.

 I think the editor being split from the compiler is a good thing. There still
 exist tools to expose static analysis data from the compiler (and choose the
 editor to visualize it with), but fundamentally they are assisting him/her
 rather than helping him/her improve. Instead of gathering loads of data on the
 optimizations/analysis performed, and filtering it for visualization by the
 user, we could relate the optimization technique used so that the user truly
 knows what GCC is capable of (instead of guessing by observation).

Great that's exactly what I'm aiming at:) It's not just presenting the
results of static analysis in real-time, as I actually dislike most
kinds of it like finding memory leaks, to me that seems like an attempt
to make the computer do what it's really bad at (understanding the
code). I just want to give the programmer the fullest picture of the
situation but at the same time make it so it doesn't become noise that
interferes. More or less you can say the goal is To provide feedback
that allows the user to extend his understanding of the program. That
mostly means giving access to all the information that can be
unambiguously concluded from the code by the computer. To what degree
we carry it and how much the compiler is involved is only a question of
practicality and performance.

 My proposal is thus not to be confused with a static analysis visualization:
 the programmer learns what techniques are implemented in GCC (or in compilers
 in general), how to write code that is more easily compiled, and can further
 browse the Internet for detailed theory on the techniques involved.

Perfect! However, how to do that so that it actually works seems a bit
complex. The first (practically unsolvable) issue is what actually
constitutes better code, as given two pieces one may be faster in some
circumstances while the other in different. But as I understand that's
not really what we're trying to tell the user, rather we want him to
explore for himself what's possible and what are the results and why
they are the way they are? I'm guessing this will unfortunately (or
fortunately) require him to actually see and understand the intermediate
code, see how it changes after different optimizations, and see the
output assembly. Personally I really need/want that;) Though my end
target is a bit more to broaden the abstraction when programming
(both up and down), so not to just show what's happening with the code
but also allow the programmer to interact with it on that lower level.
LLVM seems like the perfect fit for that but I've got some gripes with
it, and that is still far away in the future.

 The point on the possible-optimizations-which-could-be-enabled-if-specific-
 -constraint-is-lifted is particularly interesting, but is also extremely risky
 if the compiler makes a stupid remark on a constraint which can obviously
 (for the programmer) not be lifted. If ever, I would introduce it with a LOT
 of care.

Yes and no. First of all I don't necessarily mean for the
compiler/editor to suggest anything to the programmer, rather if the
programmer asks just say what's physically possible, and not what's
right, since if the compiler could do that it would just perform the
optimization. Furthermore the situation with my source code is that I
can probably make all this in such a form that it is actually usable
and useful which seems to me close to impossible with normal languages.
I can also with almost no effort store within the source code the
dialogue between 

Re: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-04-02 Thread Thibault Raffaillac
Bump!

Let me renew my interest in contributing through GSoC with post-compilation
feedback (This was not an early april joke). Do you think it could lead to an
acceptable GSoC proposal? (mentor interested?)

@Tomasz:
On the interaction side I totally agree that communication between compiler and
programmer is scarce (and there is room for improvement). Focusing too soon on
the editor would overlook the wide range of user needs though, as:
_ some users do not use an IDE (and will kindly refuse);
_ some users do not need more communication, as they already know what GCC can
  and cannot do;
_ some users do not want more communication, as they have other business to
  focus on;

I think the editor being split from the compiler is a good thing. There still
exist tools to expose static analysis data from the compiler (and choose the
editor to visualize it with), but fundamentally they are assisting him/her
rather than helping him/her improve. Instead of gathering loads of data on the
optimizations/analysis performed, and filtering it for visualization by the
user, we could relate the optimization technique used so that the user truly
knows what GCC is capable of (instead of guessing by observation).

My proposal is thus not to be confused with a static analysis visualization:
the programmer learns what techniques are implemented in GCC (or in compilers
in general), how to write code that is more easily compiled, and can further
browse the Internet for detailed theory on the techniques involved.

The point on the possible-optimizations-which-could-be-enabled-if-specific-
-constraint-is-lifted is particularly interesting, but is also extremely risky
if the compiler makes a stupid remark on a constraint which can obviously
(for the programmer) not be lifted. If ever, I would introduce it with a LOT of
care.

Thibault
ps: As for an editor with real-time feedback on static analysis and more, I am
100% with you :) (and there are some promising prototypes, like in this talk:
http://vimeo.com/36579366)

 Hello all,
 
 My name is Thibault Raffaillac, CS degree student at Kungliga Tekniska
 Högskolan, Stockholm, Sweden (in double-degree partnership with Ecole
 Centrale Marseille, France).
 GCC currently provides no concise way to inform the user whether it applied an
 expected optimization (i.e., whether it understood the code). As a result,
 some will do premature optimizations when they do not trust the compiler, and
 others will create overly convoluted code with blind belief in the compiler.
 This is especially relevant for users not initiated into the internals of GCC.
 The project I would like to propose is feedback on the optimizations
 performed by GCC. To avoid binding users to the compiler, I would focus on
 some very standard optimizations across vendors, or, for some specific yet
 nice features, I would indicate their specificity to GCC or an architecture.
 
 The feedback would be triggered when compilation is successful, and display a
 couple of different messages each time it is run:
 gcc --feedback test.c
 test.c:xx:x: info: All operands being constant, constant folding was applied to assign '2560' to 'a'
 test.c:xx:x: info: GCC could not fold constants here because...
 test.c:xx:x: info: As integers are stored in binary format, strength reduction was applied to replace '* 8' by '<< 3'
 test.c:xx:x: info: Basic block vectorization was applied to pack the 3 independent additions into a single SIMD instruction
 test.c:xx:x: info: GCC implements unordered_map as open-addressed hash tables, with double hashing probing
 
 Unlike the internal verbose messages, these would form a set, and the system
 would remember those already displayed and decrease their frequency of
 occurrence between compilations. All messages would explain what triggered
 them, cite the optimization name, and describe the consequence.
 
 As for the work plan, it would consist of:
 _ Enumerating all possible messages in the message set.
 _ Implementing a function receiving feedback from each optimization unit and
   choosing whether to display it:
   info_printf(enum INFO_INDEX, const char*, ...);
 _ Writing a formatting guide for adding messages to the set.
 
 My academic background includes compiler construction, C programming and
 Human-Computer Interaction. I am very much interested in the usability of
 compilers (on which I am currently carrying out my degree thesis -
 http://www.csc.kth.se/~traf/traf-sketch.pdf) and thus would be glad to
 contribute to GCC.
 
 If this can be of interest, suggestions are welcome!
 
 Best regards,
 Thibault (http://www.csc.kth.se/~traf/)
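
For illustration only, the info_printf idea from the work plan above could be
sketched roughly as follows. This is a minimal sketch under stated
assumptions, not code from the thread or from GCC: the enumerator names, the
int return value, the in-memory counters and the "show on the 1st, 2nd, 4th,
8th, ... request" policy are all hypothetical. The proposal only gives the
signature info_printf(enum INFO_INDEX, const char*, ...) and says that
already-displayed messages should appear less often.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message identifiers; the real set would come from the
   "enumerate all possible messages" step of the work plan. */
enum INFO_INDEX {
  INFO_CONSTANT_FOLDING,
  INFO_STRENGTH_REDUCTION,
  INFO_BB_VECTORIZATION,
  INFO_COUNT
};

/* Number of times each message has been requested so far.  The proposal
   wants this remembered between compilations; this sketch keeps it in
   memory only. */
static unsigned info_seen[INFO_COUNT];

/* Assumed policy: display a message on its 1st, 2nd, 4th, 8th, ...
   request, so that it gradually becomes rarer. */
static int info_should_display(enum INFO_INDEX idx)
{
  unsigned n;

  assert(idx < INFO_COUNT);
  n = ++info_seen[idx];
  return (n & (n - 1)) == 0;   /* true exactly when n is a power of two */
}

/* Prints the formatted message to stderr if the policy allows it.
   Returns 1 if the message was printed and 0 if it was suppressed
   (the return value is an addition made for this sketch). */
int info_printf(enum INFO_INDEX idx, const char *fmt, ...)
{
  va_list ap;

  if (!info_should_display(idx))
    return 0;

  fputs("info: ", stderr);
  va_start(ap, fmt);
  vfprintf(stderr, fmt, ap);
  va_end(ap);
  fputc('\n', stderr);
  return 1;
}
```

An optimization pass would then call something like
info_printf(INFO_CONSTANT_FOLDING, "constant folding was applied to assign
'%d' to '%s'", 2560, "a"), and the decision whether to show the message stays
in one place.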


Re: GSoC proposal: Provide optimizations feedback through post-compilation messages

2012-03-29 Thread Tomasz Borowik
On Tue, 27 Mar 2012 22:33:39 +
Thibault Raffaillac t...@kth.se wrote:

 Hello all,
 
 My name is Thibault Raffaillac, CS degree student at Kungliga Tekniska
 Högskolan, Stockholm, Sweden (in double-degree partnership with Ecole
 Centrale Marseille, France).
 GCC currently provides no concise way to inform the user whether it applied an
 expected optimization (i.e., whether it understood the code). As a result,
 some will do premature optimizations when they do not trust the compiler, and
 others will create overly convoluted code with blind belief in the compiler.
 This is especially relevant for users not initiated into the internals of GCC.
 The project I would like to propose is feedback on the optimizations
 performed by GCC. To avoid binding users to the compiler, I would focus on
 some very standard optimizations across vendors, or, for some specific yet
 nice features, I would indicate their specificity to GCC or an architecture.
 
 The feedback would be triggered when compilation is successful, and display a
 couple of different messages each time it is run:
 gcc --feedback test.c
 test.c:xx:x: info: All operands being constant, constant folding was applied to assign '2560' to 'a'
 test.c:xx:x: info: GCC could not fold constants here because...
 test.c:xx:x: info: As integers are stored in binary format, strength reduction was applied to replace '* 8' by '<< 3'
 test.c:xx:x: info: Basic block vectorization was applied to pack the 3 independent additions into a single SIMD instruction
 test.c:xx:x: info: GCC implements unordered_map as open-addressed hash tables, with double hashing probing
 
 Unlike the internal verbose messages, these would form a set, and the system
 would remember those already displayed and decrease their frequency of
 occurrence between compilations. All messages would explain what triggered
 them, cite the optimization name, and describe the consequence.
 
 As for the work plan, it would consist of:
 _ Enumerating all possible messages in the message set.
 _ Implementing a function receiving feedback from each optimization unit and
   choosing whether to display it:
   info_printf(enum INFO_INDEX, const char*, ...);
 _ Writing a formatting guide for adding messages to the set.
 
 My academic background includes compiler construction, C programming and
 Human-Computer Interaction. I am very much interested in the usability of
 compilers (on which I am currently carrying out my degree thesis -
 http://www.csc.kth.se/~traf/traf-sketch.pdf) and thus would be glad to
 contribute to GCC.
 
 If this can be of interest, suggestions are welcome!
 
 Best regards,
 Thibault (http://www.csc.kth.se/~traf/)
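
To make the sample messages in the quoted proposal more concrete, here is a
small hypothetical test.c with the kind of code they describe. The function
names and the expression 10 * 256 are made up for this sketch (the proposal
only quotes the folded value '2560' and the '* 8' replacement), and whether a
given compiler actually applies each optimization depends on its version,
target and flags.

```c
#include <assert.h>
#include <stdint.h>

/* All operands are compile-time constants, so a compiler may evaluate
   10 * 256 itself and store 2560 directly (constant folding). */
int folded_constant(void)
{
  int a = 10 * 256;
  return a;
}

/* Multiplying an unsigned integer by 8 ... */
uint32_t times_eight(uint32_t x)
{
  return x * 8u;
}

/* ... gives the same result as shifting it left by 3 bits, which is the
   cheaper form that strength reduction may substitute. */
uint32_t shift_three(uint32_t x)
{
  return x << 3;
}

/* Three independent additions: a candidate for basic block
   vectorization.  The restrict qualifiers (C99) promise that the arrays
   do not overlap. */
void add3(float *restrict out, const float *restrict a,
          const float *restrict b)
{
  out[0] = a[0] + b[0];
  out[1] = a[1] + b[1];
  out[2] = a[2] + b[2];
}

/* Runtime check that the original and the reduced forms agree,
   including when the multiplication wraps around. */
void check_equivalence(void)
{
  assert(folded_constant() == 2560);
  assert(times_eight(5u) == shift_three(5u));
  assert(times_eight(0xFFFFFFFFu) == shift_three(0xFFFFFFFFu));
}
```

The point of the proposed feedback is that the programmer would be told which
of these transformations happened, instead of having to compare the generated
assembly by hand.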
 

Hi Thibault,

I completely agree, and it's actually a part of what I'm targeting in the long
term, so I think we might be able to join forces. I'm also thinking of a GSoC
project, though in different areas (there's an email on the list about them
from 19.03), so maybe we could do separate parts that combine into something
even more awesome ;)

I think a huge part of the issue is in the medium of communication between the 
programmer and compiler. I'm targeting an environment where the source code 
editor practically becomes the compiler's front-end. My project allows 
extremely dynamic presentation of the source code, so I can e.g.
 - easily inform the programmer about anything in an unobtrusive manner within 
the code, 
 - give him different perspectives of the same code,
 - allow him to give precise and detailed information to the compiler about 
possible code optimizations without making the code unreadable.

The first two points may seem already solved by Eclipse, Xcode or whatever
other gigantic IDE, but I'm talking about a much larger scale of feedback
presented instantly, like: explicit/implicit and inferred typing info, constant
folds, dead code, unfolded loops, data flow, vector operations, tree view of
expressions.

The first issue is that for any non-trivial amount of code you'll end up with
thousands of messages, 90% of which are probably not very interesting
(similarly to warnings in a certain style of objective programming in C). As
long as the output is not interleaved with the code at the right place and the
delay from writing to getting feedback is too long, the feature will lose much
of its usefulness. Though don't misunderstand me, I think it's still better to
have the info in any form than not.

The last point is probably the most important, as there is often a large number
of optimizations that cannot be done due to, for example, pointer aliasing
rules, even though the programmer knows that the optimization is safe. I can
easily add literally hundreds of markers like "this expression is volatile",
"the result of this function call will not change within this loop", or "these
two pointers don't alias", and it wouldn't obfuscate the code as much as with
normal languages. Furthermore my editor can easily list only the meaningful options
for a given