avoid producing empty set_pcpu section [Was: elf obj load: skip zero-sized sections early]

2010-07-15 Thread Andriy Gapon
on 11/07/2010 15:23 Andriy Gapon said the following:
 on 11/07/2010 14:54 Andriy Gapon said the following:
 For completeness, here is a patch that simply drops the inline assembly and 
 the
 comment about it, and GCC-generated assembly and its diff:
 http://people.freebsd.org/~avg/dpcpu/pcpu.new.patch
 http://people.freebsd.org/~avg/dpcpu/dpcpu.new.s
 http://people.freebsd.org/~avg/dpcpu/dpcpu.new.diff

 As was speculated above, the only thing really changed is section alignment
 (from 128 to 4).
 
 After making the above analysis I wondered why we require set_pcpu section
 alignment at all.  After all, it's not used as loaded, data from the section
 gets copied into special per-cpu memory areas.  So, logically, it's those 
 areas
 that need to be aligned, not the section.
 
 svn log and google quickly pointed me to this excellent analysis and 
 explanation
 by bz (thanks again!):
 http://people.freebsd.org/~bz/20090809-02-pcpu-start-align-fix.diff
 
 Summary: this alignment is needed to work around a bug in GNU binutils ld for
 __start_SECNAME placement.
 
 As explained by bz, ld internally generates an equivalent of the following
 linker script:
 ...
 __start_pcpu_set = ALIGN(NN);
 pcpu_set : {
 ...
 }
 __stop_pcpu_set = .;
 
 Where NN is an alignment of the first _input_ pcpu_set section found in
 whichever .o file happens to be first.  Not the resulting alignment of 
 pcpu_set
 _output_ section.
 Alignment requirement of input sections is based on largest alignment
 requirement of section's members.  So if section is empty then the required
 alignment is 1.  Alignment of output section, if not explicitly overridden 
 e.g.
 via linker script, is the largest alignment of the corresponding input 
 sections.
 
 I think that the problem can be fixed by making ld define __start_SECNAME like
 follows:
 ...
 pcpu_set : {
 __start_pcpu_set = ABSOLUTE(.);
 ...
 }
 __stop_pcpu_set = .;
 
 This way __start_SECNAME would always point to the actual start of the output
 section.
 
 Here's a patch that implements the idea:
 http://people.freebsd.org/~avg/dpcpu/ld.start_sec-alignment.patch
 
 This is similar to what was done upstream:
 http://sourceware.org/cgi-bin/cvsweb.cgi/src/ld/ldlang.c.diff?r1=1.306r2=1.307cvsroot=srcf=h
 The code is quite different there, and approach is somewhat different, but the
 idea is the same - place __start_SECNAME inside the section, not outside it.

Does anybody see any obvious or non-obvious flaw in the above analysis and the
proposed patches?
Does anyone object against me committing the ld patch and some time later the
pcpu.h patch?

My plan is to commit the ld patch tomorrow and the pcpu.h patch early next week.

P.S.
Pro-active testing is welcome!  Especially if you have an unusual architecture
or use epair or both.

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: avoid producing empty set_pcpu section [Was: elf obj load: skip zero-sized sections early]

2010-07-15 Thread Kostik Belousov
On Thu, Jul 15, 2010 at 02:25:26PM +0300, Andriy Gapon wrote:
 on 11/07/2010 15:23 Andriy Gapon said the following:
  on 11/07/2010 14:54 Andriy Gapon said the following:
  For completeness, here is a patch that simply drops the inline assembly 
  and the
  comment about it, and GCC-generated assembly and its diff:
  http://people.freebsd.org/~avg/dpcpu/pcpu.new.patch
  http://people.freebsd.org/~avg/dpcpu/dpcpu.new.s
  http://people.freebsd.org/~avg/dpcpu/dpcpu.new.diff
 
  As was speculated above, the only thing really changed is section alignment
  (from 128 to 4).
  
  After making the above analysis I wondered why we require set_pcpu section
  alignment at all.  After all, it's not used as loaded, data from the section
  gets copied into special per-cpu memory areas.  So, logically, it's those 
  areas
  that need to be aligned, not the section.
  
  svn log and google quickly pointed me to this excellent analysis and 
  explanation
  by bz (thanks again!):
  http://people.freebsd.org/~bz/20090809-02-pcpu-start-align-fix.diff
  
  Summary: this alignment is needed to work around a bug in GNU binutils ld 
  for
  __start_SECNAME placement.
  
  As explained by bz, ld internally generates an equivalent of the following
  linker script:
  ...
  __start_pcpu_set = ALIGN(NN);
  pcpu_set : {
  ...
  }
  __stop_pcpu_set = .;
  
  Where NN is an alignment of the first _input_ pcpu_set section found in
  whichever .o file happens to be first.  Not the resulting alignment of 
  pcpu_set
  _output_ section.
  Alignment requirement of input sections is based on largest alignment
  requirement of section's members.  So if section is empty then the required
  alignment is 1.  Alignment of output section, if not explicitly overridden 
  e.g.
  via linker script, is the largest alignment of the corresponding input 
  sections.
  
  I think that the problem can be fixed by making ld define __start_SECNAME 
  like
  follows:
  ...
  pcpu_set : {
  __start_pcpu_set = ABSOLUTE(.);
  ...
  }
  __stop_pcpu_set = .;
  
  This way __start_SECNAME would always point to the actual start of the 
  output
  section.
  
  Here's a patch that implements the idea:
  http://people.freebsd.org/~avg/dpcpu/ld.start_sec-alignment.patch
  
  This is similar to what was done upstream:
  http://sourceware.org/cgi-bin/cvsweb.cgi/src/ld/ldlang.c.diff?r1=1.306r2=1.307cvsroot=srcf=h
  The code is quite different there, and approach is somewhat different, but 
  the
  idea is the same - place __start_SECNAME inside the section, not outside it.
 
 Does anybody see any obvious or non-obvious flaw in the above analysis and the
 proposed patches?
 Does anyone object against me committing the ld patch and some time later the
 pcpu.h patch?
 
 My plan is to commit the ld patch tomorrow and the pcpu.h patch early next 
 week.
 
 P.S.
 Pro-active testing is welcome!  Especially if you have an unusual 
 architecture
 or use epair or both.
 

Is new behaviour completely identical to the behaviour of the newer
ld ? Even if yes, I think that such changes make potential import of
newer binutils harder.


pgpTCQwf8PSpI.pgp
Description: PGP signature


Re: avoid producing empty set_pcpu section [Was: elf obj load: skip zero-sized sections early]

2010-07-15 Thread Andriy Gapon
on 15/07/2010 14:39 Kostik Belousov said the following:
 
 Is new behaviour completely identical to the behaviour of the newer
 ld ? 

No, it's not completely identical.
__start_SECNAME placement would be identical, but our ld would still assign the
symbol while latest upstream binutils PROVIDES it.

 Even if yes, I think that such changes make potential import of
 newer binutils harder.

How?

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org