[PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Timo Teräs
It can happen under certain cases that the DSO had refcount 0,
but was already loaded. (NODELETE flag is set, or it is pulled
in via both NEEDED dependency and explicit dlopen()).

In these cases we must not re-add the DSO to the global symbol
scope as it is already there. Or we end up corrupting the linked
list and further dlsym lookups will hang.

Signed-off-by: Timo Teräs timo.te...@iki.fi
---
 ldso/libdl/libdl.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/ldso/libdl/libdl.c b/ldso/libdl/libdl.c
index c451b63..c901512 100644
--- a/ldso/libdl/libdl.c
+++ b/ldso/libdl/libdl.c
@@ -543,15 +543,23 @@ static void *do_dlopen(const char *libname, int flag, 
ElfW(Addr) from)
 
 #ifdef SHARED
/*
-* Get the tail of the list.
 * In the static case doesn't need to extend the global scope, it is
 * ready to be used as it is, because _dl_loaded_modules already points
 * to the dlopened library.
+*
+* Extend the global scope by adding the local scope of the dlopened 
DSO.
+* But only if it's not there. It can happen under certain cases that 
the
+* DSO had refcount = 0, but was already loaded. (NODELETE flag is set, 
or
+* it is pulled in via both NEEDED dependency and explicit dlopen()).
 */
-   for (ls = _dl_loaded_modules-symbol_scope; ls  ls-next; ls = 
ls-next);
-
-   /* Extend the global scope by adding the local scope of the dlopened 
DSO. */
-   ls-next = dyn_chain-dyn-symbol_scope;
+   for (ls = _dl_loaded_modules-symbol_scope; ; ls = ls-next) {
+   if (ls == dyn_chain-dyn-symbol_scope)
+   break;
+   if (ls-next == NULL) {
+   ls-next = dyn_chain-dyn-symbol_scope;
+   break;
+   }
+   }
 #endif
 #ifdef __mips__
/*
-- 
1.8.3.1

___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Carmelo AMOROSO
On 27/06/2013 8.34, Timo Teräs wrote:
 It can happen under certain cases that the DSO had refcount 0,
 but was already loaded. (NODELETE flag is set, or it is pulled
 in via both NEEDED dependency and explicit dlopen()).
 
 In these cases we must not re-add the DSO to the global symbol
 scope as it is already there. Or we end up corrupting the linked
 list and further dlsym lookups will hang.
 
 Signed-off-by: Timo Teräs timo.te...@iki.fi

Acked-by: Carmelo Amoroso carmelo.amor...@st.com

Cheers,
Carmelo

 ---
  ldso/libdl/libdl.c | 18 +-
  1 file changed, 13 insertions(+), 5 deletions(-)
 
 diff --git a/ldso/libdl/libdl.c b/ldso/libdl/libdl.c
 index c451b63..c901512 100644
 --- a/ldso/libdl/libdl.c
 +++ b/ldso/libdl/libdl.c
 @@ -543,15 +543,23 @@ static void *do_dlopen(const char *libname, int flag, 
 ElfW(Addr) from)
  
  #ifdef SHARED
   /*
 -  * Get the tail of the list.
* In the static case doesn't need to extend the global scope, it is
* ready to be used as it is, because _dl_loaded_modules already points
* to the dlopened library.
 +  *
 +  * Extend the global scope by adding the local scope of the dlopened 
 DSO.
 +  * But only if it's not there. It can happen under certain cases that 
 the
 +  * DSO had refcount = 0, but was already loaded. (NODELETE flag is set, 
 or
 +  * it is pulled in via both NEEDED dependency and explicit dlopen()).
*/
 - for (ls = _dl_loaded_modules-symbol_scope; ls  ls-next; ls = 
 ls-next);
 -
 - /* Extend the global scope by adding the local scope of the dlopened 
 DSO. */
 - ls-next = dyn_chain-dyn-symbol_scope;
 + for (ls = _dl_loaded_modules-symbol_scope; ; ls = ls-next) {
 + if (ls == dyn_chain-dyn-symbol_scope)
 + break;
 + if (ls-next == NULL) {
 + ls-next = dyn_chain-dyn-symbol_scope;
 + break;
 + }
 + }
  #endif
  #ifdef __mips__
   /*
 
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Rich Felker
On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
 It can happen under certain cases that the DSO had refcount 0,
 but was already loaded. (NODELETE flag is set, or it is pulled
 in via both NEEDED dependency and explicit dlopen()).

Wouldn't it be more logical to prevent this from happening by not
using a refcount of zero? For example, NODELETE could just perform an
extra increment on the refcnt so that it never reaches 0. In essence,
the NODELETE flag is a permanent reference to the DSO.

Rich
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Bernhard Reutner-Fischer

On 27 June 2013 20:16:26 Rich Felker dal...@aerifal.cx wrote:

On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
 It can happen under certain cases that the DSO had refcount 0,
 but was already loaded. (NODELETE flag is set, or it is pulled
 in via both NEEDED dependency and explicit dlopen()).

Wouldn't it be more logical to prevent this from happening by not
using a refcount of zero? For example, NODELETE could just perform an
extra increment on the refcnt so that it never reaches 0. In essence,
the NODELETE flag is a permanent reference to the DSO.


I was thinking the same, without having looked yet. This does not sound 
like the correct fix.



Sent with AquaMail for Android
http://www.aqua-mail.com


___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Timo Teras
On Thu, 27 Jun 2013 14:16:26 -0400
Rich Felker dal...@aerifal.cx wrote:

 On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
  It can happen under certain cases that the DSO had refcount 0,
  but was already loaded. (NODELETE flag is set, or it is pulled
  in via both NEEDED dependency and explicit dlopen()).
 
 Wouldn't it be more logical to prevent this from happening by not
 using a refcount of zero? For example, NODELETE could just perform an
 extra increment on the refcnt so that it never reaches 0. In essence,
 the NODELETE flag is a permanent reference to the DSO.

It seems this is not the only case. As mentioned, NEEDED dependencies
together with dlopen()/dlclose() can result in this too in some cases.
Apparently usage_count refers to active handles from dlopen() - and can
be zero if the lib is pulled in via NEEDED dependency only.

Thus, this is the least intrusive fix. While restructuring dlopen()
and dlclose() could fix this better, I did not have the time or
interest to rewrite most of the libdl code.

- Timo
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Timo Teras
On Thu, 27 Jun 2013 21:36:14 +0300
Timo Teras timo.te...@iki.fi wrote:

 On Thu, 27 Jun 2013 14:16:26 -0400
 Rich Felker dal...@aerifal.cx wrote:
 
  On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
   It can happen under certain cases that the DSO had refcount 0,
   but was already loaded. (NODELETE flag is set, or it is pulled
   in via both NEEDED dependency and explicit dlopen()).
  
  Wouldn't it be more logical to prevent this from happening by not
  using a refcount of zero? For example, NODELETE could just perform
  an extra increment on the refcnt so that it never reaches 0. In
  essence, the NODELETE flag is a permanent reference to the DSO.
 
 It seems this is not the only case. As mentioned, NEEDED dependencies
 together with dlopen()/dlclose() can result in this too in some cases.
 Apparently usage_count refers to active handles from dlopen() - and
 can be zero if the lib is pulled in via NEEDED dependency only.
 
 Thus, this is the least intrusive fix. While restructuring dlopen()
 and dlclose() could fix this better, I did not have the time or
 interest to rewrite most of the libdl code.

Hum. Actually it might be that this happens with NODELETE flag only.
Seems my test case had that instead of the weird combo.

I'll test with NODELETE flag just adding usage_count by extra
reference. Will post new patch if it works (or report back if it
didn't).

- Timo
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Rich Felker
On Thu, Jun 27, 2013 at 09:36:14PM +0300, Timo Teras wrote:
 On Thu, 27 Jun 2013 14:16:26 -0400
 Rich Felker dal...@aerifal.cx wrote:
 
  On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
   It can happen under certain cases that the DSO had refcount 0,
   but was already loaded. (NODELETE flag is set, or it is pulled
   in via both NEEDED dependency and explicit dlopen()).
  
  Wouldn't it be more logical to prevent this from happening by not
  using a refcount of zero? For example, NODELETE could just perform an
  extra increment on the refcnt so that it never reaches 0. In essence,
  the NODELETE flag is a permanent reference to the DSO.
 
 It seems this is not the only case. As mentioned, NEEDED dependencies
 together with dlopen()/dlclose() can result in this too in some cases.
 Apparently usage_count refers to active handles from dlopen() - and can
 be zero if the lib is pulled in via NEEDED dependency only.

This is also a bug in the reference counting. DT_NEEDED and dlopen
should both be considered references for counting purposes. Otherwise
you can end up with a reference count of zero when the library is
still in use.

Rich
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Timo Teras
On Thu, 27 Jun 2013 20:36:44 +0200
Bernhard Reutner-Fischer rep.dot@gmail.com wrote:

 On 27 June 2013 20:16:26 Rich Felker dal...@aerifal.cx wrote:
  On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
   It can happen under certain cases that the DSO had refcount 0,
   but was already loaded. (NODELETE flag is set, or it is pulled
   in via both NEEDED dependency and explicit dlopen()).
 
  Wouldn't it be more logical to prevent this from happening by not
  using a refcount of zero? For example, NODELETE could just perform
  an extra increment on the refcnt so that it never reaches 0. In
  essence, the NODELETE flag is a permanent reference to the DSO.
 
 I was thinking the same, without having looked yet. This does not
 sound like the correct fix.

This would be the alternate patch - but somehow I lost my original test
case. And qemu seems to now depend on GTK making the other known issue
also go away.

Paolo, could you test if this fixes your problem also?

diff --git a/ldso/ldso/dl-elf.c b/ldso/ldso/dl-elf.c
index 96459f3..71133dd 100644
--- a/ldso/ldso/dl-elf.c
+++ b/ldso/ldso/dl-elf.c
@@ -831,6 +831,8 @@ struct elf_resolve *_dl_load_elf_shared_library(unsigned 
rflags,
 #endif
(*rpnt)-dyn = tpnt;
tpnt-usage_count++;
+   if (rtld_flags  RTLD_NODELETE)
+   tpnt-usage_count++;
 #ifdef __LDSO_STANDALONE_SUPPORT__
tpnt-libtype = (epnt-e_type == ET_DYN) ? elf_lib : elf_executable;
 #else
diff --git a/ldso/libdl/libdl.c b/ldso/libdl/libdl.c
index c451b63..035b56d 100644
--- a/ldso/libdl/libdl.c
+++ b/ldso/libdl/libdl.c
@@ -822,7 +822,7 @@ static int do_dlclose(void *vhandle, int need_fini)
_dl_handles = rpnt-next_handle;
_dl_if_debug_print(%s: usage count: %d\n,
handle-dyn-libname, handle-dyn-usage_count);
-   if (handle-dyn-usage_count != 1 || (handle-dyn-rtld_flags  
RTLD_NODELETE)) {
+   if (handle-dyn-usage_count != 1) {
handle-dyn-usage_count--;
free(handle);
return 0;
@@ -844,7 +844,7 @@ static int do_dlclose(void *vhandle, int need_fini)
for (j = 0; j  handle-init_fini.nlist; ++j) {
tpnt = handle-init_fini.init_fini[j];
tpnt-usage_count--;
-   if (tpnt-usage_count == 0  !(tpnt-rtld_flags  
RTLD_NODELETE)) {
+   if (tpnt-usage_count == 0) {
if ((tpnt-dynamic_info[DT_FINI]
 || tpnt-dynamic_info[DT_FINI_ARRAY])
  need_fini
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Bernhard Reutner-Fischer

On 27 June 2013 21:07:38 Timo Teras timo.te...@iki.fi wrote:

On Thu, 27 Jun 2013 20:36:44 +0200
Bernhard Reutner-Fischer rep.dot@gmail.com wrote:

 On 27 June 2013 20:16:26 Rich Felker dal...@aerifal.cx wrote:
  On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
   It can happen under certain cases that the DSO had refcount 0,
   but was already loaded. (NODELETE flag is set, or it is pulled
   in via both NEEDED dependency and explicit dlopen()).
 
  Wouldn't it be more logical to prevent this from happening by not
  using a refcount of zero? For example, NODELETE could just perform
  an extra increment on the refcnt so that it never reaches 0. In
  essence, the NODELETE flag is a permanent reference to the DSO.
 I was thinking the same, without having looked yet. This does not
 sound like the correct fix.

This would be the alternate patch - but somehow I lost my original test
case. And qemu seems to now depend on GTK making the other known issue
also go away.

Paolo, could you test if this fixes your problem also?

diff --git a/ldso/ldso/dl-elf.c b/ldso/ldso/dl-elf.c
index 96459f3..71133dd 100644
--- a/ldso/ldso/dl-elf.c
+++ b/ldso/ldso/dl-elf.c
@@ -831,6 +831,8 @@ struct elf_resolve 
*_dl_load_elf_shared_library(unsigned rflags,

 #endif
(*rpnt)-dyn = tpnt;
tpnt-usage_count++;
+   if (rtld_flags  RTLD_NODELETE)
+   tpnt-usage_count++;
 #ifdef __LDSO_STANDALONE_SUPPORT__
tpnt-libtype = (epnt-e_type == ET_DYN) ? elf_lib : elf_executable;
 #else
diff --git a/ldso/libdl/libdl.c b/ldso/libdl/libdl.c
index c451b63..035b56d 100644
--- a/ldso/libdl/libdl.c
+++ b/ldso/libdl/libdl.c
@@ -822,7 +822,7 @@ static int do_dlclose(void *vhandle, int need_fini)
_dl_handles = rpnt-next_handle;
_dl_if_debug_print(%s: usage count: %d\n,
handle-dyn-libname, handle-dyn-usage_count);
-	if (handle-dyn-usage_count != 1 || (handle-dyn-rtld_flags  
RTLD_NODELETE)) {

+   if (handle-dyn-usage_count != 1) {
handle-dyn-usage_count--;
free(handle);
return 0;
@@ -844,7 +844,7 @@ static int do_dlclose(void *vhandle, int need_fini)
for (j = 0; j  handle-init_fini.nlist; ++j) {
tpnt = handle-init_fini.init_fini[j];
tpnt-usage_count--;


We want to keep usage_count at at least 1 if NODELETE here in dlclose:
tpnt-usage_count -= !(tpnt-rtld_flags  RTLD_NODELETE);
Instead, or, better -- and then
usage_count = MAX(usage_count, usage_count  RTLD_NODELETE);
or something to that effect.

Thanks,

-   if (tpnt-usage_count == 0  !(tpnt-rtld_flags  
RTLD_NODELETE)) {
+   if (tpnt-usage_count == 0) {
if ((tpnt-dynamic_info[DT_FINI]
 || tpnt-dynamic_info[DT_FINI_ARRAY])
  need_fini



Sent with AquaMail for Android
http://www.aqua-mail.com


___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc

Re: [PATCH] ldso: fix dlsym hang when reloading DSOs

2013-06-27 Thread Timo Teras
On Thu, 27 Jun 2013 21:25:12 +0200
Bernhard Reutner-Fischer rep.dot@gmail.com wrote:

 On 27 June 2013 21:07:38 Timo Teras timo.te...@iki.fi wrote:
  On Thu, 27 Jun 2013 20:36:44 +0200
  Bernhard Reutner-Fischer rep.dot@gmail.com wrote:
 
   On 27 June 2013 20:16:26 Rich Felker dal...@aerifal.cx wrote:
On Thu, Jun 27, 2013 at 09:34:32AM +0300, Timo Teräs wrote:
 It can happen under certain cases that the DSO had refcount 0,
 but was already loaded. (NODELETE flag is set, or it is pulled
 in via both NEEDED dependency and explicit dlopen()).
   
Wouldn't it be more logical to prevent this from happening by
not using a refcount of zero? For example, NODELETE could just
perform an extra increment on the refcnt so that it never
reaches 0. In essence, the NODELETE flag is a permanent
reference to the DSO.
   I was thinking the same, without having looked yet. This does not
   sound like the correct fix.
 
  This would be the alternate patch - but somehow I lost my original
  test case. And qemu seems to now depend on GTK making the other
  known issue also go away.
 
  Paolo, could you test if this fixes your problem also?
 
  diff --git a/ldso/ldso/dl-elf.c b/ldso/ldso/dl-elf.c
  index 96459f3..71133dd 100644
  --- a/ldso/ldso/dl-elf.c
  +++ b/ldso/ldso/dl-elf.c
  @@ -831,6 +831,8 @@ struct elf_resolve 
  *_dl_load_elf_shared_library(unsigned rflags,
   #endif
  (*rpnt)-dyn = tpnt;
  tpnt-usage_count++;
  +   if (rtld_flags  RTLD_NODELETE)
  +   tpnt-usage_count++;
   #ifdef __LDSO_STANDALONE_SUPPORT__
  tpnt-libtype = (epnt-e_type == ET_DYN) ? elf_lib :
  elf_executable; #else
  diff --git a/ldso/libdl/libdl.c b/ldso/libdl/libdl.c
  index c451b63..035b56d 100644
  --- a/ldso/libdl/libdl.c
  +++ b/ldso/libdl/libdl.c
  @@ -822,7 +822,7 @@ static int do_dlclose(void *vhandle, int
  need_fini) _dl_handles = rpnt-next_handle;
  _dl_if_debug_print(%s: usage count: %d\n,
  handle-dyn-libname,
  handle-dyn-usage_count);
  -   if (handle-dyn-usage_count != 1 ||
  (handle-dyn-rtld_flags  RTLD_NODELETE)) {
  +   if (handle-dyn-usage_count != 1) {
  handle-dyn-usage_count--;
  free(handle);
  return 0;
  @@ -844,7 +844,7 @@ static int do_dlclose(void *vhandle, int
  need_fini) for (j = 0; j  handle-init_fini.nlist; ++j) {
  tpnt = handle-init_fini.init_fini[j];
  tpnt-usage_count--;
 
 We want to keep usage_count at at least 1 if NODELETE here in dlclose:
 tpnt-usage_count -= !(tpnt-rtld_flags  RTLD_NODELETE);
 Instead, or, better -- and then
 usage_count = MAX(usage_count, usage_count  RTLD_NODELETE);
 or something to that effect.

The first chunk's:
+   if (rtld_flags  RTLD_NODELETE)
+   tpnt-usage_count++;

Will ensure that the reference count never drops below one. There's no
way this extra reference gets deleted.

However, it seems that do_dlopen() has:
if (tpnt-usage_count  1) { // assume already dlopened }

So the extra ref count actually messes up this. This should be simple
enough to replace with test like:
if (tpnt-init_flag  DL_OPENED)

But then I believe the the following search for previous dlopen handle
will fail. Perhaps it is not a problem since the same happens with .so
files that where NEEDED for the application and get later dlopened.

- Timo
___
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc