Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-17 Thread Carmelo AMOROSO
On 17/01/2012 2.59, Khem Raj wrote:
 On Mon, Jan 16, 2012 at 1:36 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 16/01/2012 9.09, Carmelo Amoroso wrote:
 On 16/01/2012 8.53, Khem Raj wrote:
 On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
 carmelo.amor...@st.com wrote:
 On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO 
 carmelo.amor...@st.com wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes 
 are
 being handled


 we have reviewed several times this change before committing. Anyway 
 we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as 
 glibc
 actually does) and I don't really think this could have some 
 impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any 
 simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules  '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


 I might have a test case here 
 http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh

 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.

 whereas without the scopes patch I get

 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#


 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.

 let me know if you can reproduce it with this testcase

 Thanks
 -Khem


 Thanks khem for your effort in reproducing.
 I-ll let you know asap.

 We will focus on this 100% since now.

 Carmelo

 I have a patch (sort of) which fixes this issue have a look at it.
 Problem is that its trying to unload sub scopes after it has been
 removed from global scope so I just delayed the removal of dlopened
 library


 what is triggering the problem is the use of atexit()


 I'd  ask.. is it correct that a dlopen-ed shared library install
 a function via atexit() to be called at program exit, if the shared
 library could be un-loaded at any time during the program's life ?

 
 does library know if it will be dlopened all the time ?
 

No, obviously it doesn't.

I've read the atexit man page again. Initially it refers only to the use
of atexit in binaries (hence my doubts), but later, in the Notes, there
is a reference to the use of atexit in shared libraries, where the
registered function acts as a destructor. So my concerns are invalid.

 I'd say that with the old it was just working fortunately !

 The shared library image is actually un-mapped from the system, why we
 should expect to have some of its symbols still alive ?

 
 how about the dependencies that it loaded
 

Again, I was wrong. Looking at the code more carefully, indeed in the loop
where 

Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-17 Thread Carmelo AMOROSO
On 17/01/2012 9.41, Carmelo AMOROSO wrote:
 On 17/01/2012 2.59, Khem Raj wrote:
 On Mon, Jan 16, 2012 at 1:36 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 16/01/2012 9.09, Carmelo Amoroso wrote:
 On 16/01/2012 8.53, Khem Raj wrote:
 On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
 carmelo.amor...@st.com wrote:
 On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO 
 carmelo.amor...@st.com wrote:
 and since I see the same issue on all architectures probably its 
 not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes 
 are
 being handled


 we have reviewed several times this change before committing. 
 Anyway we
 will review it again. We have not ever seen any failure in the 
 lookup
 with all of our tests. The only change in the way the symbol scope 
 is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, 
 while
 with the new structure in place it was added as soon as found (as 
 glibc
 actually does) and I don't really think this could have some 
 impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any 
 simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules  '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


 I might have a test case here 
 http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh

 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.

 whereas without the scopes patch I get

 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#


 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.

 let me know if you can reproduce it with this testcase

 Thanks
 -Khem


 Thanks khem for your effort in reproducing.
 I-ll let you know asap.

 We will focus on this 100% since now.

 Carmelo

 I have a patch (sort of) which fixes this issue have a look at it.
 Problem is that its trying to unload sub scopes after it has been
 removed from global scope so I just delayed the removal of dlopened
 library


 what is triggering the problem is the use of atexit()


 I'd  ask.. is it correct that a dlopen-ed shared library install
 a function via atexit() to be called at program exit, if the shared
 library could be un-loaded at any time during the program's life ?


 does library know if it will be dlopened all the time ?

 
 no it doesn't obviously.
 
 I've read again atexit man pages, initially it simply refers to the use
 of atexit in binaries (so the reason of my doubts), later in the Note
 I've read a reference to the use of atexit in shared libraries acting as
 a destructor so my concerns are invalid.
 
 I'd say that with the old it was just working fortunately !

 The shared library image is actually un-mapped from the system, why we
 should expect to have some of its symbols still alive ?


 how about the dependencies that it loaded

 
 again I was wrong. 

Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-16 Thread Carmelo AMOROSO
On 16/01/2012 8.53, Khem Raj wrote:
 On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
 carmelo.amor...@st.com wrote:
 On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO 
 carmelo.amor...@st.com wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules  '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


 I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh

 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.

 whereas without the scopes patch I get

 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#


 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.

 let me know if you can reproduce it with this testcase

 Thanks
 -Khem


 Thanks khem for your effort in reproducing.
 I-ll let you know asap.

 We will focus on this 100% since now.

 Carmelo
 
 I have a patch (sort of) which fixes this issue have a look at it.
 Problem is that its trying to unload sub scopes after it has been
 removed from global scope so I just delayed the removal of dlopened
 library
 

What is triggering the problem is the use of atexit().

 http://www.uclibc.org/~kraj/fix_libdl.patch
 

looking at it
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-16 Thread Carmelo AMOROSO
On 16/01/2012 9.09, Carmelo Amoroso wrote:
 On 16/01/2012 8.53, Khem Raj wrote:
 On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
 carmelo.amor...@st.com wrote:
 On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO 
 carmelo.amor...@st.com wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some 
 impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any 
 simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules  '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


 I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh

 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.

 whereas without the scopes patch I get

 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#


 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.

 let me know if you can reproduce it with this testcase

 Thanks
 -Khem


 Thanks khem for your effort in reproducing.
 I-ll let you know asap.

 We will focus on this 100% since now.

 Carmelo

 I have a patch (sort of) which fixes this issue have a look at it.
 Problem is that its trying to unload sub scopes after it has been
 removed from global scope so I just delayed the removal of dlopened
 library

 
 what is triggering the problem is the use of atexit()
 

I'd ask: is it correct for a dlopen-ed shared library to install
a function via atexit() to be called at program exit, if the shared
library could be unloaded at any time during the program's life?

I'd say that with the old code it was working just by luck!

The shared library image is actually unmapped from the system, so why
should we expect some of its symbols to still be alive?

 http://www.uclibc.org/~kraj/fix_libdl.patch

 
 looking at it

Setting aside the concerns about the use of atexit, this patch is
correct. Could we avoid using the unlink_local_scope guard and test the
stored_ls pointer directly?

carmelo
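
The ordering problem described in this thread (a dlopen-ed library is
removed from the global scope before its finalizers have run, so a lazily
bound symbol such as '_libC_fini' can no longer be resolved) can be
illustrated with a toy model. This is only a sketch under stated
assumptions: the toy_lib structure and the toy_* functions are
hypothetical names invented for illustration, they are not uClibc's real
data structures, and this is not the code from fix_libdl.patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy type: NOT uClibc's real scope structure. */
struct toy_lib {
    const char *name;      /* library name, e.g. "libC.so" */
    const char *symbol;    /* the single symbol this toy library exports */
    struct toy_lib *next;  /* next library in the toy global scope */
};

/* Walk the scope list; return the library defining `symbol`, or NULL. */
static struct toy_lib *toy_lookup(struct toy_lib *scope, const char *symbol)
{
    for (; scope != NULL; scope = scope->next)
        if (strcmp(scope->symbol, symbol) == 0)
            return scope;
    return NULL;
}

/* Unlink `lib` from the scope list and return the (possibly new) head. */
static struct toy_lib *toy_unlink(struct toy_lib *scope, struct toy_lib *lib)
{
    struct toy_lib **pp = &scope;
    while (*pp != NULL) {
        if (*pp == lib) {
            *pp = lib->next;
            lib->next = NULL;
            break;
        }
        pp = &(*pp)->next;
    }
    return scope;
}

/*
 * Simulate closing `lib` whose finalizer needs `fini_symbol` resolved lazily.
 * unlink_first != 0 models the failing order (scope removal, then lookup);
 * unlink_first == 0 models the delayed removal (lookup, then scope removal).
 * Returns 1 if the finalizer symbol could be resolved, 0 otherwise.
 */
static int toy_close(struct toy_lib **scope, struct toy_lib *lib,
                     const char *fini_symbol, int unlink_first)
{
    int resolved;

    assert(scope != NULL && lib != NULL);
    if (unlink_first) {
        *scope = toy_unlink(*scope, lib);
        resolved = toy_lookup(*scope, fini_symbol) != NULL;
    } else {
        resolved = toy_lookup(*scope, fini_symbol) != NULL;
        *scope = toy_unlink(*scope, lib);
    }
    return resolved;
}

/* Build the toy scope main -> libA.so -> libC.so, then close libC.so. */
int toy_demo(int unlink_first)
{
    struct toy_lib libC = { "libC.so", "_libC_fini", NULL };
    struct toy_lib libA = { "libA.so", "libA_func", &libC };
    struct toy_lib prog = { "main", "main", &libA };
    struct toy_lib *scope = &prog;

    return toy_close(&scope, &libC, "_libC_fini", unlink_first);
}
```

Called from a main function, toy_demo(1) returns 0 (the symbol is looked
up after the library has left the scope) and toy_demo(0) returns 1 (the
removal is delayed until after the lookup). The model has no counterpart
for eager binding, but the BIND_NOW data point would be consistent with
it: with eager binding no lookup would be needed at close time.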


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-16 Thread Carmelo AMOROSO
On 16/01/2012 10.36, Carmelo Amoroso wrote:
 On 16/01/2012 9.09, Carmelo Amoroso wrote:
 On 16/01/2012 8.53, Khem Raj wrote:
 On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
 carmelo.amor...@st.com wrote:
 On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO 
 carmelo.amor...@st.com wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes 
 are
 being handled


 we have reviewed several times this change before committing. Anyway 
 we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as 
 glibc
 actually does) and I don't really think this could have some 
 impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any 
 simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules  '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


 I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh

 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.

 whereas without the scopes patch I get

 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#


 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.

 let me know if you can reproduce it with this testcase

 Thanks
 -Khem


 Thanks khem for your effort in reproducing.
 I-ll let you know asap.

 We will focus on this 100% since now.

 Carmelo

 I have a patch (sort of) which fixes this issue have a look at it.
 Problem is that its trying to unload sub scopes after it has been
 removed from global scope so I just delayed the removal of dlopened
 library


 what is triggering the problem is the use of atexit()

 
 I'd  ask.. is it correct that a dlopen-ed shared library install
 a function via atexit() to be called at program exit, if the shared
 library could be un-loaded at any time during the program's life ?
 
 I'd say that with the old it was just working fortunately !
 
 The shared library image is actually un-mapped from the system, why we
 should expect to have some of its symbols still alive ?
 
 http://www.uclibc.org/~kraj/fix_libdl.patch


 looking at it
 
 not considering the concerns on the use of atexit, this patch is
 correct. Could we avoid to use the unlink_local_scope guard and test the
 stored_ls pointer directly ?
 
 carmelo

Hmm, I'm still wondering how the original lookup mechanism worked!?



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-16 Thread Khem Raj
On Mon, Jan 16, 2012 at 1:36 AM, Carmelo AMOROSO carmelo.amor...@st.com wrote:
 On 16/01/2012 9.09, Carmelo Amoroso wrote:
 On 16/01/2012 8.53, Khem Raj wrote:
 On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
 carmelo.amor...@st.com wrote:
 On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO 
 carmelo.amor...@st.com wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes 
 are
 being handled


 we have reviewed several times this change before committing. Anyway 
 we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as 
 glibc
 actually does) and I don't really think this could have some 
 impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any 
 simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules  '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


 I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh

 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.

 whereas without the scopes patch I get

 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#


 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.

 let me know if you can reproduce it with this testcase

 Thanks
 -Khem


 Thanks khem for your effort in reproducing.
 I-ll let you know asap.

 We will focus on this 100% since now.

 Carmelo

 I have a patch (sort of) which fixes this issue have a look at it.
 Problem is that its trying to unload sub scopes after it has been
 removed from global scope so I just delayed the removal of dlopened
 library


 what is triggering the problem is the use of atexit()


 I'd  ask.. is it correct that a dlopen-ed shared library install
 a function via atexit() to be called at program exit, if the shared
 library could be un-loaded at any time during the program's life ?


Does the library know whether it will always be dlopened?

 I'd say that with the old it was just working fortunately !

 The shared library image is actually un-mapped from the system, why we
 should expect to have some of its symbols still alive ?


How about the dependencies that it loaded?

 http://www.uclibc.org/~kraj/fix_libdl.patch


 looking at it

 not considering the concerns on the use of atexit, this patch is
 correct. Could we avoid to use the unlink_local_scope guard and test the
 stored_ls pointer directly ?

 carmelo

Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-15 Thread Carmelo AMOROSO
On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules  '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress
 
 
 I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh
 
 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.
 
 whereas without the scopes patch I get
 
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#
 
 
 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.
 
 let me know if you can reproduce it with this testcase
 
 Thanks
 -Khem
 

Thanks, Khem, for your effort in reproducing this.
I'll let you know ASAP.

We will focus on this 100% from now on.

Carmelo
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-15 Thread Khem Raj
On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
carmelo.amor...@st.com wrote:
 On 15/01/2012 7.22, Khem Raj wrote:
 On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules > '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


 I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
 untar it on target and run make and the ./run.sh

 with buggy libraries i get
 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
 in lib './/libC.so'.

 whereas without the scopes patch I get

 root@qemux86:~/rep/reproducer_v2# ./run.sh
 1)main:dlopen  libA.so
 4)libC:dlopen  libB.so
 5)libC:atexit(libC_fini)
 6)main:dlclose libA.so
 7)libC:finish - atexit()
 8)main:finish main
 root@qemux86:~/rep/reproducer_v2#


 I think thats the problem that I am facing in pango-querymodules as well
 another data point is if I use BIND_NOW then it works too.

 let me know if you can reproduce it with this testcase

 Thanks
 -Khem


 Thanks khem for your effort in reproducing.
 I-ll let you know asap.

 We will focus on this 100% since now.

 Carmelo

I have a patch (sort of) which fixes this issue; please have a look at it.
The problem is that it's trying to unload sub-scopes after the library has
been removed from the global scope, so I just delayed the removal of the
dlopened library.

http://www.uclibc.org/~kraj/fix_libdl.patch
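
The ordering issue described above can be pictured with a toy model. This is only a sketch under stated assumptions: it is not uClibc's ldso/libdl code and not the contents of fix_libdl.patch. The names toy_scope, toy_add, toy_remove, toy_lookup, toy_close_early and toy_close_delayed are hypothetical, and the "scope" is just an ordered array of library names. It assumes that running a library's finalizer needs a symbol lookup through the global scope, so removing the library first makes that lookup fail, while delaying the removal lets it succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_LIBS 8

/* Toy global scope: an ordered list of loaded library names. */
struct toy_scope {
    const char *libs[TOY_MAX_LIBS];
    size_t count;
};

/* Append a library to the end of the scope. */
void toy_add(struct toy_scope *s, const char *lib)
{
    assert(s->count < TOY_MAX_LIBS);
    s->libs[s->count++] = lib;
}

/* Remove a library from the scope, keeping the order of the rest. */
void toy_remove(struct toy_scope *s, const char *lib)
{
    for (size_t i = 0; i < s->count; i++) {
        if (strcmp(s->libs[i], lib) == 0) {
            memmove(&s->libs[i], &s->libs[i + 1],
                    (s->count - i - 1) * sizeof(s->libs[0]));
            s->count--;
            return;
        }
    }
}

/* Returns 1 if a symbol owned by `lib` is still reachable via the scope. */
int toy_lookup(const struct toy_scope *s, const char *lib)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->libs[i], lib) == 0)
            return 1;
    return 0;
}

/* Failing order: drop the library from the scope, then do the lookup
 * that its finalizer would need. */
int toy_close_early(struct toy_scope *s, const char *lib)
{
    toy_remove(s, lib);
    return toy_lookup(s, lib);
}

/* Delayed order: do the finalizer's lookup first, then drop the library. */
int toy_close_delayed(struct toy_scope *s, const char *lib)
{
    int resolved = toy_lookup(s, lib);
    toy_remove(s, lib);
    return resolved;
}
```

With a scope holding "main" and "libC.so", toy_close_early returns 0 for "libC.so" (the lookup comes too late), while toy_close_delayed returns 1 and still leaves the scope without the library afterwards.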

Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-14 Thread Khem Raj
On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules > '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

Here is the error:

/usr/bin/pango-querymodules: can't resolve symbol
'_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

This does not happen without the scope patch.

pango-querymodules loads a shared library,
/usr/lib/pango/1.6.0/modules/pango-basic-fc.so, using dlopen, and this
library has libstdc++.so.6 in its DT_NEEDED entries.

I have been trying to create a small testcase: a small binary that
dlopens another .so which has libstdc++ in its DT_NEEDED entries. So far
I have not been able to reproduce the problem that way, but I am making
some progress.

Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-14 Thread Khem Raj
On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

 This is the problem that happens with the new scoping and does not
 happen without it

 Error reading Pango modules file

 (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
 No builtin or dynamically loaded modules were found.
 PangoFc will not work correctly.
 This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
 You should create this file by running:
  pango-querymodules > '/etc/pango/pango.modules'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='latin'

 (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
 expect ugly output. engine-type='PangoRenderFc', script='common'

 here is the error

 /usr/bin/pango-querymodules: can't resolve symbol
 '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.

 this does not happen without scope patch

 pango-querymodules loads a shared library
 /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
 library had libstdc++.so.6 in its DT_NEEDED entries

 I was trying to create a small testcase where I created a small binary
 which would dlopen another .so which has libstdc++ in DT_NEEDED in its
 header so not able to reproduce a small testcase but making some
 progress


I might have a test case here: http://uclibc.org/~kraj/reproducer_v2.tar.gz
Untar it on the target, run make, and then run ./run.sh.

With the buggy libraries I get:
root@qemux86:~/rep/reproducer_v2# ./run.sh
1)main:dlopen  libA.so
4)libC:dlopen  libB.so
5)libC:atexit(libC_fini)
6)main:dlclose libA.so
/home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
in lib './/libC.so'.

Without the scopes patch, I get:

root@qemux86:~/rep/reproducer_v2# ./run.sh
1)main:dlopen  libA.so
4)libC:dlopen  libB.so
5)libC:atexit(libC_fini)
6)main:dlclose libA.so
7)libC:finish - atexit()
8)main:finish main
root@qemux86:~/rep/reproducer_v2#


I think that's the same problem I am facing in pango-querymodules.
Another data point: if I use BIND_NOW, it works too.

Let me know if you can reproduce it with this testcase.

Thanks
-Khem
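
One way to read the BIND_NOW data point is sketched below as a toy model. It is an illustration, not a diagnosis confirmed in this thread, and it is not uClibc code: toy_lib, toy_resolve, toy_load, toy_call, toy_run and toy_fini are hypothetical names. The model assumes that the failing lookup is a lazy, first-call binding that happens only after the library has left the lookup scope, whereas eager binding fills the slot at load time, while the library is still visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a finalizer such as the log's libC_fini. */
int toy_fini(void) { return 7; }

typedef int (*toy_fn)(void);

struct toy_lib {
    int in_scope;      /* 1 while the library is visible to symbol lookup */
    toy_fn plt_slot;   /* NULL until the symbol has been bound */
};

/* Symbol lookup succeeds only while the library is still in scope. */
toy_fn toy_resolve(const struct toy_lib *lib)
{
    return lib->in_scope ? toy_fini : NULL;
}

/* Load the library; with bind_now the slot is filled immediately. */
void toy_load(struct toy_lib *lib, int bind_now)
{
    assert(lib != NULL);
    lib->in_scope = 1;
    lib->plt_slot = bind_now ? toy_resolve(lib) : NULL;
}

/* Call through the slot, binding lazily on first use.
 * Returns -1 where a real loader would report an unresolved symbol. */
int toy_call(struct toy_lib *lib)
{
    if (lib->plt_slot == NULL)
        lib->plt_slot = toy_resolve(lib);
    if (lib->plt_slot == NULL)
        return -1;
    return lib->plt_slot();
}

/* Sequence modelled on the log: load, leave the scope, run the finalizer. */
int toy_run(int bind_now)
{
    struct toy_lib lib;
    toy_load(&lib, bind_now);
    lib.in_scope = 0;
    return toy_call(&lib);
}
```

In this model toy_run(0) (lazy binding) returns -1 because the first lookup happens after the library left the scope, while toy_run(1) (eager binding) returns the finalizer's value.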

Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-13 Thread Carmelo AMOROSO
On 11/01/2012 8.44, Khem Raj wrote:
 khem ? any news ?

 no unfortunately, had no time to delve further. once I have turned a
 merge into patch which was causing the regression, let me go down that
 path.
 let
 
 hi Carmelo
 

Hi khem

 Attached patch is causing the trouble. and I have
 

ok

 # LDSO_STANDALONE_SUPPORT is not set
 # LDSO_PRELINK_SUPPORT is not set
 

ok

 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled
 

We reviewed this change several times before committing it. Anyway, we
will review it again. We have never seen any lookup failure in any of
our tests. The only change in the way the symbol scope is created is
where ld.so is added.
In the original code it was the last entry of the global scope, while
with the new structure in place it is added as soon as it is found (as
glibc actually does), and I don't really think this could have any impact.
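
A toy illustration of why the position of ld.so in the scope would normally matter only for symbols defined by more than one object, assuming the usual ELF rule that the first definition found in search order wins. The object and symbol names are made up, and toy_object, toy_defines and toy_first_match are hypothetical helpers; this is not the uClibc lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SYMS 4

/* One loaded object and the (hypothetical) symbols it defines. */
struct toy_object {
    const char *name;
    const char *syms[TOY_MAX_SYMS];   /* NULL-terminated unless full */
};

/* Returns 1 if `obj` defines `sym`. */
int toy_defines(const struct toy_object *obj, const char *sym)
{
    for (size_t i = 0; i < TOY_MAX_SYMS && obj->syms[i] != NULL; i++)
        if (strcmp(obj->syms[i], sym) == 0)
            return 1;
    return 0;
}

/* Walk the scope in order; the first object defining `sym` wins.
 * Returns that object's name, or NULL if nothing defines the symbol. */
const char *toy_first_match(const struct toy_object *scope, size_t n,
                            const char *sym)
{
    assert(scope != NULL);
    for (size_t i = 0; i < n; i++)
        if (toy_defines(&scope[i], sym))
            return scope[i].name;
    return NULL;
}
```

In this model, moving the ld.so entry from the end of the array to an earlier slot leaves the result unchanged for any symbol that only one object defines; the answer differs only for a symbol that both ld.so and a later object define.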

We are trying to start up an X system on our platform. Is there any simple
X app we can run to show the failure?

Is some .so failing to be dlopened due to an unresolved symbol?

 I can reproduce it by exchanging ld.so and libdl.so
 

That is reasonable, considering the code affected by the offending patch.

 while I keep looking more can you see anything visually in this patch would 
 help
 

I'll look at it yet again.

 i tried with latest master and problem happens there too.
 

Yes, no further changes have been applied to the symbol scope logic.

 Thanks
 -Khem

Thanks to you.
Carmelo



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-13 Thread Khem Raj
On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO carmelo.amor...@st.com wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some impact.

I tried reverting that as well, but the problem remained.


 We are trying to startup a X system on our platform. Is there any simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

That is possible. I will try to debug it.


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2012-01-13 Thread Khem Raj
On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj raj.k...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 and since I see the same issue on all architectures probably its not
 elfinterp changes
 too. Mostly it seems likely that it could be in the way the scopes are
 being handled


 we have reviewed several times this change before committing. Anyway we
 will review it again. We have not ever seen any failure in the lookup
 with all of our tests. The only change in the way the symbol scope is
 created is in where the ld.so is added.
 In the original code it was the last entry of the global scope, while
 with the new structure in place it was added as soon as found (as glibc
 actually does) and I don't really think this could have some impact.

 I tried to reverse it as well but the problem remained.


 We are trying to startup a X system on our platform. Is there any simple
 X app we can run to show the failure ?

 Is some .so failing to be dl-opened due to unresolved symbol ?

 this is potentially possible. I will try to debug it through

This is the problem that happens with the new scoping and does not
happen without it:

Error reading Pango modules file

(matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
No builtin or dynamically loaded modules were found.
PangoFc will not work correctly.
This probably means there was an error in the creation of:
  '/etc/pango/pango.modules'
You should create this file by running:
  pango-querymodules > '/etc/pango/pango.modules'

(matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
expect ugly output. engine-type='PangoRenderFc', script='latin'

(matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
expect ugly output. engine-type='PangoRenderFc', script='common'


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-12 Thread Carmelo AMOROSO
On 06/12/2011 11.25, Carmelo Amoroso wrote:
 On 06/12/2011 1.54, Khem Raj wrote:
 On Mon, Dec 5, 2011 at 7:09 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 05/12/2011 13.04, Carmelo AMOROSO wrote:
 On 01/12/2011 20.40, Khem Raj wrote:
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?


 yes


 Hi guys,
 starting from SHA1 7682323a3a798d6f15708f228f859a64cb869aa3, and
 resetting back to HEAD~1, we ends in SHA1

 commit 3004ce0c9619f89bf8e64931edd696bf4df8d2e1
 Merge: 3b3285b 4916fd8
 Author: Carmelo Amoroso carmelo.amor...@st.com
 Date:   Wed May 4 08:31:16 2011 +0200

 Merge remote-tracking branch 'origin/master' into prelink

 * origin/master: (32 commits)
   libubacktrace: fix backtrace support on arm-eabi, which needs
 libgcc_eh linked too
   getaddrinfo.c: fix incorrect check for ERANGE from gethostbyaddr_r
   getaddrinfo.c: improve code readability. No functional changes
   string: remove unused variable
   x86_64: silence warning if !TLS
   buildsys: prettify ssp.c handling
   madvise is LINUX_SPECIFIC
   test_nptl: fix expected result for tst-cputimer[123]
   test_nptl: fix expected result for tst-clock2 test
   buildsys: make $(LOCAL_INSTALL_PATH) phony
   ether_aton: reject invalid input
   tests: disable ether tests if !HAS_SOCKET
   inet: add ether_aton testcase
   sysconf: clock_getres depends on HAS_REALTIME
   __rt_sigwaitinfo: depends on HAS_REALTIME
   buildsys: minor fixes in Makefile.arch for C6X
   buildsys: minor fixes in Makefile.arch for microblaze
   libubacktrace: enabled for all archs indeed.
   sparc: don't access fp registers when configured for no fpu
   libubacktrace: generic implementation based dwarf
   ...

 Conflicts:
 ldso/ldso/dl-elf.c
 ldso/ldso/mips/elfinterp.c
 ldso/ldso/x86_64/elfinterp.c

 Signed-off-by: Carmelo Amoroso carmelo.amor...@st.com

 That already includes both STANDALONE, PRELINK and symbol lookup re-design.

 Filippo and my self have thoroughly looked again at
 prelink/stand-alone/global scope work included from the prelink branch,
 and we can't see any issue (except for the fdpic archs under discussion
 with Mike).

 My suspect is in the merge process itself (badly handled conflicts), or
 in the commits (in master) between
 3004ce0c9619f89bf8e64931edd696bf4df8d2e1 and
 7682323a3a798d6f15708f228f859a64cb869aa3

 So I'll look again at all merge commits I've done focusing on conflicts.

 I'd kindly ask Khem to confirm if he is seeing or not problem with
 master @3004ce0c9.

 just finished trying, the same problem exists on master @3004ce0c9

 
 Khem,
 this conflicts with what you said previously, that a reset at HEAD~1
 starting 7682323a3a798 worked fine.
 
 Please could you clarify ?
 
 Which is the top most commits that works for you ?
 Any traces available to help in debugging... currently we don't know
 where to look.
 
 Thanks,
 Carmelo
 

Khem, any news?

Carmelo


 Cheers,
 Carmelo



 a follow up...

 merge commits looks fine to me, also the conflicts seems to have been
 properly handled.

 If Khem will confirm that master @3004ce0c9 is fine as I hope, then I'd
 suggest to try reverting the commit

 204c7849029d90e5e3486670a6a07a76f949afd6
 libc: make common longjmp usable with NPTL

 it's the only change with a wide impacts on all archs with NPTL enabled.

 Carmelo




 



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-12 Thread Khem Raj
On Mon, Dec 12, 2011 at 3:04 AM, Carmelo AMOROSO carmelo.amor...@st.com wrote:
 On 06/12/2011 11.25, Carmelo Amoroso wrote:
 On 06/12/2011 1.54, Khem Raj wrote:
 On Mon, Dec 5, 2011 at 7:09 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 05/12/2011 13.04, Carmelo AMOROSO wrote:
 On 01/12/2011 20.40, Khem Raj wrote:
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?


 yes


 Hi guys,
 starting from SHA1 7682323a3a798d6f15708f228f859a64cb869aa3, and
 resetting back to HEAD~1, we ends in SHA1

 commit 3004ce0c9619f89bf8e64931edd696bf4df8d2e1
 Merge: 3b3285b 4916fd8
 Author: Carmelo Amoroso carmelo.amor...@st.com
 Date:   Wed May 4 08:31:16 2011 +0200

     Merge remote-tracking branch 'origin/master' into prelink

     * origin/master: (32 commits)
       libubacktrace: fix backtrace support on arm-eabi, which needs
 libgcc_eh linked too
       getaddrinfo.c: fix incorrect check for ERANGE from gethostbyaddr_r
       getaddrinfo.c: improve code readability. No functional changes
       string: remove unused variable
       x86_64: silence warning if !TLS
       buildsys: prettify ssp.c handling
       madvise is LINUX_SPECIFIC
       test_nptl: fix expected result for tst-cputimer[123]
       test_nptl: fix expected result for tst-clock2 test
       buildsys: make $(LOCAL_INSTALL_PATH) phony
       ether_aton: reject invalid input
       tests: disable ether tests if !HAS_SOCKET
       inet: add ether_aton testcase
       sysconf: clock_getres depends on HAS_REALTIME
       __rt_sigwaitinfo: depends on HAS_REALTIME
       buildsys: minor fixes in Makefile.arch for C6X
       buildsys: minor fixes in Makefile.arch for microblaze
       libubacktrace: enabled for all archs indeed.
       sparc: don't access fp registers when configured for no fpu
       libubacktrace: generic implementation based dwarf
       ...

     Conflicts:
         ldso/ldso/dl-elf.c
         ldso/ldso/mips/elfinterp.c
         ldso/ldso/x86_64/elfinterp.c

     Signed-off-by: Carmelo Amoroso carmelo.amor...@st.com

 That already includes both STANDALONE, PRELINK and symbol lookup 
 re-design.

 Filippo and my self have thoroughly looked again at
 prelink/stand-alone/global scope work included from the prelink branch,
 and we can't see any issue (except for the fdpic archs under discussion
 with Mike).

 My suspect is in the merge process itself (badly handled conflicts), or
 in the commits (in master) between
 3004ce0c9619f89bf8e64931edd696bf4df8d2e1 and
 7682323a3a798d6f15708f228f859a64cb869aa3

 So I'll look again at all merge commits I've done focusing on conflicts.

 I'd kindly ask Khem to confirm if he is seeing or not problem with
 master @3004ce0c9.

 just finished trying, the same problem exists on master @3004ce0c9


 Khem,
 this conflicts with what you said previously, that a reset at HEAD~1
 starting 7682323a3a798 worked fine.

 Please could you clarify ?

 Which is the top most commits that works for you ?
 Any traces available to help in debugging... currently we don't know
 where to look.

 Thanks,
 Carmelo


 khem ? any news ?

No, unfortunately; I have had no time to delve further. Once I have turned
the merge that was causing the regression into a patch, I will go down that
path.

Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-06 Thread Carmelo AMOROSO
On 06/12/2011 1.54, Khem Raj wrote:
 On Mon, Dec 5, 2011 at 7:09 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 05/12/2011 13.04, Carmelo AMOROSO wrote:
 On 01/12/2011 20.40, Khem Raj wrote:
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?


 yes


 Hi guys,
 starting from SHA1 7682323a3a798d6f15708f228f859a64cb869aa3, and
 resetting back to HEAD~1, we ends in SHA1

 commit 3004ce0c9619f89bf8e64931edd696bf4df8d2e1
 Merge: 3b3285b 4916fd8
 Author: Carmelo Amoroso carmelo.amor...@st.com
 Date:   Wed May 4 08:31:16 2011 +0200

 Merge remote-tracking branch 'origin/master' into prelink

 * origin/master: (32 commits)
   libubacktrace: fix backtrace support on arm-eabi, which needs
 libgcc_eh linked too
   getaddrinfo.c: fix incorrect check for ERANGE from gethostbyaddr_r
   getaddrinfo.c: improve code readability. No functional changes
   string: remove unused variable
   x86_64: silence warning if !TLS
   buildsys: prettify ssp.c handling
   madvise is LINUX_SPECIFIC
   test_nptl: fix expected result for tst-cputimer[123]
   test_nptl: fix expected result for tst-clock2 test
   buildsys: make $(LOCAL_INSTALL_PATH) phony
   ether_aton: reject invalid input
   tests: disable ether tests if !HAS_SOCKET
   inet: add ether_aton testcase
   sysconf: clock_getres depends on HAS_REALTIME
   __rt_sigwaitinfo: depends on HAS_REALTIME
   buildsys: minor fixes in Makefile.arch for C6X
   buildsys: minor fixes in Makefile.arch for microblaze
   libubacktrace: enabled for all archs indeed.
   sparc: don't access fp registers when configured for no fpu
   libubacktrace: generic implementation based dwarf
   ...

 Conflicts:
 ldso/ldso/dl-elf.c
 ldso/ldso/mips/elfinterp.c
 ldso/ldso/x86_64/elfinterp.c

 Signed-off-by: Carmelo Amoroso carmelo.amor...@st.com

 That already includes both STANDALONE, PRELINK and symbol lookup re-design.

 Filippo and my self have thoroughly looked again at
 prelink/stand-alone/global scope work included from the prelink branch,
 and we can't see any issue (except for the fdpic archs under discussion
 with Mike).

 My suspect is in the merge process itself (badly handled conflicts), or
 in the commits (in master) between
 3004ce0c9619f89bf8e64931edd696bf4df8d2e1 and
 7682323a3a798d6f15708f228f859a64cb869aa3

 So I'll look again at all merge commits I've done focusing on conflicts.

 I'd kindly ask Khem to confirm if he is seeing or not problem with
 master @3004ce0c9.
 
 just finished trying, the same problem exists on master @3004ce0c9
 

Khem,
this conflicts with what you said previously, that resetting to HEAD~1
starting from 7682323a3a798 worked fine.

Could you please clarify?

Which is the topmost commit that works for you?
Are any traces available to help with debugging? Currently we don't know
where to look.

Thanks,
Carmelo


 Cheers,
 Carmelo



 a follow up...

 merge commits looks fine to me, also the conflicts seems to have been
 properly handled.

 If Khem will confirm that master @3004ce0c9 is fine as I hope, then I'd
 suggest to try reverting the commit

 204c7849029d90e5e3486670a6a07a76f949afd6
 libc: make common longjmp usable with NPTL

 it's the only change with a wide impacts on all archs with NPTL enabled.

 Carmelo



 



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-05 Thread Carmelo AMOROSO
On 01/12/2011 20.40, Khem Raj wrote:
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?

 
 yes
 

Hi guys,
starting from SHA1 7682323a3a798d6f15708f228f859a64cb869aa3 and
resetting back to HEAD~1, we end up at SHA1

commit 3004ce0c9619f89bf8e64931edd696bf4df8d2e1
Merge: 3b3285b 4916fd8
Author: Carmelo Amoroso carmelo.amor...@st.com
Date:   Wed May 4 08:31:16 2011 +0200

    Merge remote-tracking branch 'origin/master' into prelink

    * origin/master: (32 commits)
      libubacktrace: fix backtrace support on arm-eabi, which needs libgcc_eh linked too
      getaddrinfo.c: fix incorrect check for ERANGE from gethostbyaddr_r
      getaddrinfo.c: improve code readability. No functional changes
      string: remove unused variable
      x86_64: silence warning if !TLS
      buildsys: prettify ssp.c handling
      madvise is LINUX_SPECIFIC
      test_nptl: fix expected result for tst-cputimer[123]
      test_nptl: fix expected result for tst-clock2 test
      buildsys: make $(LOCAL_INSTALL_PATH) phony
      ether_aton: reject invalid input
      tests: disable ether tests if !HAS_SOCKET
      inet: add ether_aton testcase
      sysconf: clock_getres depends on HAS_REALTIME
      __rt_sigwaitinfo: depends on HAS_REALTIME
      buildsys: minor fixes in Makefile.arch for C6X
      buildsys: minor fixes in Makefile.arch for microblaze
      libubacktrace: enabled for all archs indeed.
      sparc: don't access fp registers when configured for no fpu
      libubacktrace: generic implementation based dwarf
      ...

    Conflicts:
        ldso/ldso/dl-elf.c
        ldso/ldso/mips/elfinterp.c
        ldso/ldso/x86_64/elfinterp.c

    Signed-off-by: Carmelo Amoroso carmelo.amor...@st.com

That already includes the STANDALONE, PRELINK and symbol lookup redesign
work.

Filippo and I have again looked thoroughly at the
prelink/stand-alone/global scope work included from the prelink branch,
and we can't see any issue (except for the fdpic archs under discussion
with Mike).

My suspicion is that the problem lies in the merge process itself (badly
handled conflicts), or in the commits (in master) between
3004ce0c9619f89bf8e64931edd696bf4df8d2e1 and
7682323a3a798d6f15708f228f859a64cb869aa3

So I'll look again at all the merge commits I've done, focusing on conflicts.

I'd kindly ask Khem to confirm whether or not he sees the problem with
master @3004ce0c9.

Cheers,
Carmelo






Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-05 Thread Carmelo AMOROSO
On 05/12/2011 13.04, Carmelo AMOROSO wrote:
 On 01/12/2011 20.40, Khem Raj wrote:
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?


 yes

 
 Hi guys,
 starting from SHA1 7682323a3a798d6f15708f228f859a64cb869aa3, and
 resetting back to HEAD~1, we ends in SHA1
 
 commit 3004ce0c9619f89bf8e64931edd696bf4df8d2e1
 Merge: 3b3285b 4916fd8
 Author: Carmelo Amoroso carmelo.amor...@st.com
 Date:   Wed May 4 08:31:16 2011 +0200
 
 Merge remote-tracking branch 'origin/master' into prelink
 
 * origin/master: (32 commits)
   libubacktrace: fix backtrace support on arm-eabi, which needs
 libgcc_eh linked too
   getaddrinfo.c: fix incorrect check for ERANGE from gethostbyaddr_r
   getaddrinfo.c: improve code readability. No functional changes
   string: remove unused variable
   x86_64: silence warning if !TLS
   buildsys: prettify ssp.c handling
   madvise is LINUX_SPECIFIC
   test_nptl: fix expected result for tst-cputimer[123]
   test_nptl: fix expected result for tst-clock2 test
   buildsys: make $(LOCAL_INSTALL_PATH) phony
   ether_aton: reject invalid input
   tests: disable ether tests if !HAS_SOCKET
   inet: add ether_aton testcase
   sysconf: clock_getres depends on HAS_REALTIME
   __rt_sigwaitinfo: depends on HAS_REALTIME
   buildsys: minor fixes in Makefile.arch for C6X
   buildsys: minor fixes in Makefile.arch for microblaze
   libubacktrace: enabled for all archs indeed.
   sparc: don't access fp registers when configured for no fpu
   libubacktrace: generic implementation based dwarf
   ...
 
 Conflicts:
 ldso/ldso/dl-elf.c
 ldso/ldso/mips/elfinterp.c
 ldso/ldso/x86_64/elfinterp.c
 
 Signed-off-by: Carmelo Amoroso carmelo.amor...@st.com
 
 That already includes STANDALONE, PRELINK and the symbol lookup re-design.
 
 Filippo and I have thoroughly looked again at the
 prelink/stand-alone/global scope work included from the prelink branch,
 and we can't see any issue (except for the fdpic archs under discussion
 with Mike).
 
 My suspicion is that the problem lies in the merge process itself (badly
 handled conflicts), or in the commits (in master) between
 3004ce0c9619f89bf8e64931edd696bf4df8d2e1 and
 7682323a3a798d6f15708f228f859a64cb869aa3
 
 So I'll look again at all the merge commits I've done, focusing on conflicts.
 
 I'd kindly ask Khem to confirm whether or not he is seeing the problem with
 master @3004ce0c9.
 
 Cheers,
 Carmelo
 
 

A follow-up...

The merge commits look fine to me, and the conflicts seem to have been
properly handled.

If Khem confirms that master @3004ce0c9 is fine, as I hope, then I'd
suggest trying to revert the commit

204c7849029d90e5e3486670a6a07a76f949afd6
libc: make common longjmp usable with NPTL

It's the only change with a wide impact on all archs with NPTL enabled.

Carmelo
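
As a rough illustration of the revert suggested above, the sketch below
builds a throwaway repository (every file name and commit message in it is
made up for the example, none comes from the uClibc tree) and undoes one
suspect commit with git revert while later commits stay applied. In the
real tree the argument would be 204c7849029d90e5e3486670a6a07a76f949afd6;
whether that reverts cleanly on top of master is not something this sketch
can show.

```shell
#!/bin/sh
# Sketch only: throwaway repository with hypothetical file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.org"

# Three commits; the middle one plays the role of the suspect change.
echo base > base.txt
git add base.txt
git commit -q -m "base"

echo suspect > suspect.txt
git add suspect.txt
git commit -q -m "suspect change"
suspect=$(git rev-parse HEAD)

echo later > later.txt
git add later.txt
git commit -q -m "later work"

# git revert adds a new commit that undoes only the suspect change;
# history is kept and the later commit stays applied.
git revert --no-edit "$suspect" >/dev/null

commit_count=$(git rev-list --count HEAD)
echo "commits after revert: $commit_count"
```

Unlike a reset, this keeps every other commit on the branch, so a rebuild
tests only the effect of the one reverted change.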

 
 
 

___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-05 Thread Khem Raj
On Mon, Dec 5, 2011 at 7:09 AM, Carmelo AMOROSO carmelo.amor...@st.com wrote:
 On 05/12/2011 13.04, Carmelo AMOROSO wrote:
 On 01/12/2011 20.40, Khem Raj wrote:
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?


 yes


 Hi guys,
 starting from SHA1 7682323a3a798d6f15708f228f859a64cb869aa3, and
 resetting back to HEAD~1, we ends in SHA1

 commit 3004ce0c9619f89bf8e64931edd696bf4df8d2e1
 Merge: 3b3285b 4916fd8
 Author: Carmelo Amoroso carmelo.amor...@st.com
 Date:   Wed May 4 08:31:16 2011 +0200

     Merge remote-tracking branch 'origin/master' into prelink

     * origin/master: (32 commits)
       libubacktrace: fix backtrace support on arm-eabi, which needs
 libgcc_eh linked too
       getaddrinfo.c: fix incorrect check for ERANGE from gethostbyaddr_r
       getaddrinfo.c: improve code readability. No functional changes
       string: remove unused variable
       x86_64: silence warning if !TLS
       buildsys: prettify ssp.c handling
       madvise is LINUX_SPECIFIC
       test_nptl: fix expected result for tst-cputimer[123]
       test_nptl: fix expected result for tst-clock2 test
       buildsys: make $(LOCAL_INSTALL_PATH) phony
       ether_aton: reject invalid input
       tests: disable ether tests if !HAS_SOCKET
       inet: add ether_aton testcase
       sysconf: clock_getres depends on HAS_REALTIME
       __rt_sigwaitinfo: depends on HAS_REALTIME
       buildsys: minor fixes in Makefile.arch for C6X
       buildsys: minor fixes in Makefile.arch for microblaze
       libubacktrace: enabled for all archs indeed.
       sparc: don't access fp registers when configured for no fpu
       libubacktrace: generic implementation based dwarf
       ...

     Conflicts:
         ldso/ldso/dl-elf.c
         ldso/ldso/mips/elfinterp.c
         ldso/ldso/x86_64/elfinterp.c

     Signed-off-by: Carmelo Amoroso carmelo.amor...@st.com

 That already includes both STANDALONE, PRELINK and symbol lookup re-design.

 Filippo and my self have thoroughly looked again at
 prelink/stand-alone/global scope work included from the prelink branch,
 and we can't see any issue (except for the fdpic archs under discussion
 with Mike).

 My suspect is in the merge process itself (badly handled conflicts), or
 in the commits (in master) between
 3004ce0c9619f89bf8e64931edd696bf4df8d2e1 and
 7682323a3a798d6f15708f228f859a64cb869aa3

 So I'll look again at all merge commits I've done focusing on conflicts.

 I'd kindly ask Khem to confirm if he is seeing or not problem with
 master @3004ce0c9.

just finished trying, the same problem exists on master @3004ce0c9


 Cheers,
 Carmelo



 a follow up...

 merge commits looks fine to me, also the conflicts seems to have been
 properly handled.

 If Khem will confirm that master @3004ce0c9 is fine as I hope, then I'd
 suggest to try reverting the commit

 204c7849029d90e5e3486670a6a07a76f949afd6
 libc: make common longjmp usable with NPTL

 it's the only change with a wide impacts on all archs with NPTL enabled.

 Carmelo




Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-02 Thread Bernhard Reutner-Fischer
On Dec 1, 2011 8:41 PM, Khem Raj raj.k...@gmail.com wrote:

  yes I tried the elf_machine_relocations patch and it did not help
  I am at 7682323a3a798d6f15708f228f859a64cb869aa3
  which is merge commit.
 
 
  so if you do reset --hard HEAD~1 you have something working again ?
 

 yes

I would like to have this fixed for the upcoming release. This is the only
real blocker, we can take care of fixing fdpic etc. in the .1

Thanks,


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-02 Thread Carmelo AMOROSO
On 02/12/2011 14.45, Bernhard Reutner-Fischer wrote:
 On Dec 1, 2011 8:41 PM, Khem Raj raj.k...@gmail.com wrote:

 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?


 yes
 
 I would like to have this fixed for the upcoming release. This is the only
 real blocker, we can take care of fixing fdpic etc. in the .1
 
 Thanks,
 

Bernhard,
that's what we are trying to do. Unfortunately we do not yet have a test
case showing the problem.

Right now we are running a uClibc suite/LTP on an SH4 platform (master
branch, with and without prelink) without problems.

I've asked Khem to provide some logs/strace output to help.

Carmelo




Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-01 Thread Carmelo AMOROSO
On 30/11/2011 22.44, Khem Raj wrote:
 On Wed, Nov 30, 2011 at 5:57 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 29/11/2011 17.18, Khem Raj wrote:
 On Tue, Nov 29, 2011 at 1:53 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 18.33, Khem Raj wrote:
 On Mon, Nov 28, 2011 at 5:21 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org 
 wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a 
 fast
 forward.  it isn't an actual commit ...

 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.


 khem,
 are you able to reproduce the failure with a simpler test case ? it
 would be the best.


 I am trying to get to it. but its a full X system that comes up but
 fonts are wrong.
 it will definitely take some time since it has to be debugged all way 
 through
 I am not an X expert either :)


 are you running with prelink disabled ?

 yes LDSO_PRELINK_SUPPORT and even LDSO_STANDALONE_SUPPORT are not set
 same .config works if I revert back the prelink changes.


 Hi Khem
 have you tried with the patch
 [PATCH] ldso: invoke elf_machine_relocations in any case

 could you list the commits you have reverted ? it will help me a lot in
 debugging.
 
 yes I tried the elf_machine_relocations patch and it did not help

:-(

 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.
 

OK. Filippo already has some fixes to make in the !prelink path. We are
working on this. I'll keep you posted.

Carmelo


 Thanks,
 Carmelo

 w/o prelink, the only part impacted is in the symbol lookup process, but
 it were broken, nothing would work.

 carmelo


 -mike









 



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-01 Thread Carmelo AMOROSO
On 30/11/2011 22.44, Khem Raj wrote:
 On Wed, Nov 30, 2011 at 5:57 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 29/11/2011 17.18, Khem Raj wrote:
 On Tue, Nov 29, 2011 at 1:53 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 18.33, Khem Raj wrote:
 On Mon, Nov 28, 2011 at 5:21 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org 
 wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a 
 fast
 forward.  it isn't an actual commit ...

 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.


 khem,
 are you able to reproduce the failure with a simpler test case ? it
 would be the best.


 I am trying to get to it. but its a full X system that comes up but
 fonts are wrong.
 it will definitely take some time since it has to be debugged all way 
 through
 I am not an X expert either :)


 are you running with prelink disabled ?

 yes LDSO_PRELINK_SUPPORT and even LDSO_STANDALONE_SUPPORT are not set
 same .config works if I revert back the prelink changes.


 Hi Khem
 have you tried with the patch
 [PATCH] ldso: invoke elf_machine_relocations in any case

 could you list the commits you have reverted ? it will help me a lot in
 debugging.
 
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.
 

So if you do git reset --hard HEAD~1, do you have something working again?


 Thanks,
 Carmelo

 w/o prelink, the only part impacted is in the symbol lookup process, but
 it were broken, nothing would work.

 carmelo


 -mike









 



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-12-01 Thread Khem Raj
 yes I tried the elf_machine_relocations patch and it did not help
 I am at 7682323a3a798d6f15708f228f859a64cb869aa3
 which is merge commit.


 so if you do reset --hard HEAD~1 you have something working again ?


yes
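
For readers following the reset discussed above: on a merge commit, HEAD~1
follows the first parent, that is, the branch that was checked out when the
merge was made. Going by the "Merge: 3004ce0 74da7a8" header quoted
elsewhere in this thread, that is 3004ce0 for the commit in question. The
sketch below shows the behaviour in a throwaway repository; the branch and
file names are hypothetical and nothing in it touches the uClibc tree.

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical branch and file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.org"
mainline=$(git symbolic-ref --short HEAD)   # default branch name varies

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Branch standing in for the history that gets merged in.
git checkout -q -b upstream
echo upstream > upstream.txt
git add upstream.txt
git commit -q -m "upstream work"
merged_tip=$(git rev-parse HEAD)

# The checked-out branch moves on too, so the merge needs a merge commit.
git checkout -q "$mainline"
echo local > local.txt
git add local.txt
git commit -q -m "local work"
before_merge=$(git rev-parse HEAD)

git merge -q --no-edit upstream

first_parent=$(git rev-parse HEAD~1)    # same commit as HEAD^1
second_parent=$(git rev-parse HEAD^2)   # tip of the merged-in branch

# Resetting to HEAD~1 drops everything that came in through the merge.
git reset -q --hard HEAD~1
after_reset=$(git rev-parse HEAD)
echo "HEAD~1 was $first_parent, HEAD^2 was $second_parent"
```

After the reset the working tree contains local.txt but not upstream.txt,
so a build from that state excludes the merged-in commits but keeps
everything already on the checked-out branch.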


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-30 Thread Carmelo AMOROSO
On 29/11/2011 17.18, Khem Raj wrote:
 On Tue, Nov 29, 2011 at 1:53 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 18.33, Khem Raj wrote:
 On Mon, Nov 28, 2011 at 5:21 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a 
 fast
 forward.  it isn't an actual commit ...

 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.


 khem,
 are you able to reproduce the failure with a simpler test case ? it
 would be the best.


 I am trying to get to it. but its a full X system that comes up but
 fonts are wrong.
 it will definitely take some time since it has to be debugged all way 
 through
 I am not an X expert either :)


 are you running with prelink disabled ?
 
 yes LDSO_PRELINK_SUPPORT and even LDSO_STANDALONE_SUPPORT are not set
 same .config works if I revert back the prelink changes.
 

Hi Khem,
have you tried the patch
[PATCH] ldso: invoke elf_machine_relocations in any case?

Could you list the commits you have reverted? It would help me a lot in
debugging.

Thanks,
Carmelo

 w/o prelink, the only part impacted is in the symbol lookup process, but
 it were broken, nothing would work.

 carmelo


 -mike







 



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-30 Thread Khem Raj
On Wed, Nov 30, 2011 at 5:57 AM, Carmelo AMOROSO carmelo.amor...@st.com wrote:
 On 29/11/2011 17.18, Khem Raj wrote:
 On Tue, Nov 29, 2011 at 1:53 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 18.33, Khem Raj wrote:
 On Mon, Nov 28, 2011 at 5:21 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org 
 wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a 
 fast
 forward.  it isn't an actual commit ...

 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.


 khem,
 are you able to reproduce the failure with a simpler test case ? it
 would be the best.


 I am trying to get to it. but its a full X system that comes up but
 fonts are wrong.
 it will definitely take some time since it has to be debugged all way 
 through
 I am not an X expert either :)


 are you running with prelink disabled ?

 yes LDSO_PRELINK_SUPPORT and even LDSO_STANDALONE_SUPPORT are not set
 same .config works if I revert back the prelink changes.


 Hi Khem
 have you tried with the patch
 [PATCH] ldso: invoke elf_machine_relocations in any case

 could you list the commits you have reverted ? it will help me a lot in
 debugging.

Yes, I tried the elf_machine_relocations patch and it did not help.
I am at 7682323a3a798d6f15708f228f859a64cb869aa3,
which is a merge commit.


 Thanks,
 Carmelo

 w/o prelink, the only part impacted is in the symbol lookup process, but
 it were broken, nothing would work.

 carmelo


 -mike










Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-29 Thread Carmelo AMOROSO
On 28/11/2011 18.33, Khem Raj wrote:
 On Mon, Nov 28, 2011 at 5:21 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a fast
 forward.  it isn't an actual commit ...

 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.


 khem,
 are you able to reproduce the failure with a simpler test case ? it
 would be the best.

 
 I am trying to get to it. but its a full X system that comes up but
 fonts are wrong.
 it will definitely take some time since it has to be debugged all way through
 I am not an X expert either :)
 

Are you running with prelink disabled?
Without prelink, the only part impacted is the symbol lookup process, but
if it were broken, nothing would work.

carmelo


 -mike




 



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-29 Thread Khem Raj
On Tue, Nov 29, 2011 at 1:53 AM, Carmelo AMOROSO carmelo.amor...@st.com wrote:
 On 28/11/2011 18.33, Khem Raj wrote:
 On Mon, Nov 28, 2011 at 5:21 AM, Carmelo AMOROSO carmelo.amor...@st.com 
 wrote:
 On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a fast
 forward.  it isn't an actual commit ...

 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.


 khem,
 are you able to reproduce the failure with a simpler test case ? it
 would be the best.


 I am trying to get to it. but its a full X system that comes up but
 fonts are wrong.
 it will definitely take some time since it has to be debugged all way through
 I am not an X expert either :)


 are you running with prelink disabled ?

Yes, LDSO_PRELINK_SUPPORT and even LDSO_STANDALONE_SUPPORT are not set.
The same .config works if I revert the prelink changes.

 w/o prelink, the only part impacted is in the symbol lookup process, but
 it were broken, nothing would work.

 carmelo


 -mike








Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-28 Thread Carmelo AMOROSO
On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a fast
 forward.  it isn't an actual commit ...
 
 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.
 
 -mike

 
I'll try to investigate this tomorrow... I'll ask you for some
further info.

carmelo


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-28 Thread Carmelo AMOROSO
On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a fast
 forward.  it isn't an actual commit ...
 
 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.
 

Khem,
are you able to reproduce the failure with a simpler test case? That
would be best.


 -mike

 



Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-28 Thread Khem Raj
On Mon, Nov 28, 2011 at 5:21 AM, Carmelo AMOROSO carmelo.amor...@st.com wrote:
 On 28/11/2011 3.15, Khem Raj wrote:
 On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a fast
 forward.  it isn't an actual commit ...

 right. I think its set of prelink changes that I am looking at. That
 merge commit is the data point I have
 I was looking for Carmelo has seen something similar. I can provide
 the two root file systems
 built with and without that merge commit.


 khem,
 are you able to reproduce the failure with a simpler test case ? it
 would be the best.


I am trying to get to it, but it's a full X system that comes up with
the wrong fonts.
It will definitely take some time since it has to be debugged all the way through.
I am not an X expert either :)


 -mike





Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-27 Thread Khem Raj
On Sat, Nov 26, 2011 at 4:33 PM, Mike Frysinger vap...@gentoo.org wrote:
 On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

 it's a merge commit that happens when you merge a branch that isn't a fast
 forward.  it isn't an actual commit ...

Right. I think it's the set of prelink changes that I am looking at. That
merge commit is the data point I have.
I was wondering whether Carmelo has seen something similar. I can provide
the two root file systems built with and without that merge commit.

 -mike


Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-26 Thread Khem Raj
Hi Carmelo,

While trying to regression-test the future branch on top of master,
I have been getting a regression on master on all architectures
(arm/ppc/mips/x86/x86_64): my clutter-based image does not
render fonts properly after boot-up (it draws a square for
each character).

It has been working all right with the 0.9.32 branch, so I decided to
bisect master, and after days I pinned the problematic commit down to the
one below.

If I build the root file system without this patch everything works
as expected. Could you explain why this commit is needed ?

commit 7682323a3a798d6f15708f228f859a64cb869aa3
Merge: 3004ce0 74da7a8
Author: Carmelo Amoroso carmelo.amor...@st.com
Date:   Fri Jun 24 16:24:25 2011 +0200

Merge remote-tracking branch 'origin/master' into prelink

* origin/master: (61 commits)
  fts: fix warning due to old-style function definition
  ldso_tls: fix compiler warning due to missing cast
  resolv: fix bug in res_init with ipv6 nameservers
  config: Fix passing defconfig args
  buildsys: pt-initfini.s depends on uClibc_config.h
  libdl: search for ELF_RTYPE_CLASS_DLSYM in dlsym()
  resolv: try next server on SERVFAIL
  getaddrinfo: allow numeric service without any hints
  bump version to 0.9.33-git
  nptl/pthread: Correct path for machine specific pt-initfini.c
  ctor/dtor nptl: Fix init and fini function compilation
  Rules.mak: Rearrange appending UCLIBC_EXTRA_CFLAGS to CFLAGS
  ARM: remove EABI/OABI selection
  ARM: detect BX availibility at build time
  ARM: #include bits/arm_asm.h where __USE_BX__ is used
  ARM: transform the EABI/OABI choice into a boolean
  ARM: remove sub-arch/variants selection from menuconfig
  ARM: introduce blind options to select  force THUMB mode
  ARM: reorder Use BX option
  Fix __libc_epoll_pwait compile failure on x86
  ...

Conflicts:
ldso/libdl/libdl.c

Signed-off-by: Carmelo Amoroso carmelo.amor...@st.com


-- 
-Khem
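
The bisection described in this message was done by rebuilding and booting
a root file system at each step. For cases where the check can be
scripted, git bisect run automates the search. The sketch below is a
minimal, assumption-laden example: it uses a throwaway linear history of
ten commits, and a hypothetical marker file stands in for "fonts render as
squares". Real uClibc history contains merges, so bisect there can also
land on a merge commit, as it did here.

```shell
#!/bin/sh
# Sketch only: throwaway repository; regression.marker is a made-up
# stand-in for the real failure.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.org"

# Ten linear commits; commit 7 introduces the "regression".
i=1
while [ "$i" -le 10 ]; do
    echo "$i" > counter.txt
    if [ "$i" -eq 7 ]; then
        echo bug > regression.marker
    fi
    git add -A
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then
        good=$(git rev-parse HEAD)
    fi
    if [ "$i" -eq 7 ]; then
        culprit=$(git rev-parse HEAD)
    fi
    i=$((i + 1))
done

# Bad revision first, then a known-good one.
git bisect start HEAD "$good" >/dev/null

# The test command exits 0 for a good tree and non-zero for a bad one.
git bisect run sh -c 'test ! -e regression.marker' >/dev/null

# When bisect finishes, refs/bisect/bad names the first bad commit.
found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $found"
```

With ten commits this takes only a handful of test runs, which is why
bisecting stays practical even when each step means a full image rebuild.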


Re: Regression caused by commit 7682323a3a798d6f15708f228f859a64cb869aa3

2011-11-26 Thread Mike Frysinger
On Saturday 26 November 2011 19:07:44 Khem Raj wrote:
 If I build the root file system without this patch everything works
 as expected. Could you explain why this commit is needed ?

it's a merge commit that happens when you merge a branch that isn't a fast 
forward.  it isn't an actual commit ...
-mike
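
To make the fast-forward distinction above concrete, here is a small
sketch in a throwaway repository (branch and file names are hypothetical).
It merges one branch that can be fast-forwarded and one that cannot, then
counts the parents of the resulting HEAD. Only the second merge produces a
commit with two parents. Such a commit usually carries no change of its
own beyond conflict resolutions, which is why bisect pointing at it says
little by itself about which merged change is at fault.

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical branch and file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.org"
mainline=$(git symbolic-ref --short HEAD)   # default branch name varies

count_parents() {
    # rev-list --parents prints the commit followed by its parents.
    set -- $(git rev-list --parents -n 1 "$1")
    echo $(($# - 1))
}

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Case 1: mainline has not moved, so the merge is a fast-forward.
git checkout -q -b topic-a
echo a > a.txt
git add a.txt
git commit -q -m "topic a"
topic_a_tip=$(git rev-parse HEAD)
git checkout -q "$mainline"
git merge -q --no-edit topic-a
ff_head=$(git rev-parse HEAD)
ff_parents=$(count_parents HEAD)

# Case 2: both sides have new commits, so git records a merge commit.
git checkout -q -b topic-b
echo b > b.txt
git add b.txt
git commit -q -m "topic b"
git checkout -q "$mainline"
echo m > m.txt
git add m.txt
git commit -q -m "mainline work"
git merge -q --no-edit topic-b
merge_parents=$(count_parents HEAD)

echo "fast-forward HEAD parents: $ff_parents"
echo "merge commit parents:      $merge_parents"
```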

