Musl treats DT_VERSYM as an array of int16_t (signed) instead of what we do and should do - Elf64_Versym aka uint16_t. And then the simply check if that version number is negative:
int16_t *versym; ... (!dso->versym || dso->versym[i] >= 0) We instead check if the bit is set which is more correct I think, no? Yeah it took me a while to connect the dots as at first also I did not understand the purpose of this comparison inn musl. But once I saw debug print the version values for old symbols from DT_VERSYM it all clicked. Waldek On Thu, Aug 29, 2019 at 7:37 AM Nadav Har'El <n...@scylladb.com> wrote: > > On Thu, Aug 29, 2019 at 1:28 AM Waldek Kozaczuk <jwkozac...@gmail.com> > wrote: > >> Here is this bit from https://www.akkadia.org/drepper/symbol-versioning: >> >> "If the highest bit (no. 15) of the version symbol value is set, the >> >> object is hidden and must not be used. In this case the linker must >> treat the symbol as not present in the object." >> >> >> This seems to explain what musl is doing which is a good aproximation. Which >> we should also follow. >> >> > I'm happy that this "hidden symbol bit" was all that it took to fix this > problem, because it's easy to understand and self-contained, > but what I don't understand is why what Musl is doing in the code snippet > you included below, has anything to do with this "bit 15" thing you ended > up with? > >> >> Waldek >> >> >> On Wednesday, August 28, 2019 at 1:58:25 PM UTC-4, Waldek Kozaczuk wrote: >>> >>> From https://wiki.musl-libc.org/functional-differences-from-glibc.html >>> >>> "Symbol versioning >>> >>> glibc allows the usage of versioned symbols through version tables in >>> gnu-specific elf sections. the musl dynlinker only supports the subset of >>> symbol versioning that allows to pick the default symbol version, instead >>> of the oldest version, which comes first in the regular symbol tables." >>> >>> So it might be enough to simply detect the newest symbol rather than 1st. >>> >>> This code from musl: >>> >>> static Sym *gnu_lookup(uint32_t h1, uint32_t *hashtab, struct dso *dso, >>> const char *s) >>> { >>> uint32_t nbuckets = hashtab[0]; >>> uint32_t *buckets = hashtab + 4 + hashtab[2]*(sizeof(size_t)/4); >>> uint32_t i = buckets[h1 % nbuckets]; >>> >>> if (!i) return 0; >>> >>> uint32_t *hashval = buckets + nbuckets + (i - hashtab[1]); >>> >>> for (h1 |= 1; ; i++) { >>> uint32_t h2 = *hashval++; >>> if ((h1 == (h2|1)) && (!dso->versym || dso->versym[i] >= 0) >>> && !strcmp(s, dso->strings + dso->syms[i].st_name)) >>> return dso->syms+i; >>> if (h2 & 1) break; >>> } >>> >>> return 0; >>> } >>> >>> >>> Look at versym. >>> >>> >>> On Wednesday, August 28, 2019 at 1:01:52 PM UTC-4, Waldek Kozaczuk wrote: >>>> >>>> As I was working to make ffmpeg wiith x265 with libnuma work on OSv, I >>>> stumbled across a limitation in OSv dynamic linker that leads to >>>> interesting symptoms. >>>> >>>> But before I go into the linker issues, I wanted to say that completing >>>> libnuma support on OSv is not that difficult and it mainly requires: >>>> >>>> - adding minimal sysfs support to OSv to expose 3 pseudo-files >>>> libnuma needs >>>> - /sys/devices/system/node/node0/distance (single number, but >>>> what it really means) >>>> - /sys/devices/system/node/node0/meminfo (subset only for free >>>> and total memory) >>>> - /sys/devices/system/node/node0/cpumap (some mask to identify >>>> all vCPUs) >>>> - exposing set_mempolicy() and sched_setaffinity() as syscalls >>>> (the get versions already are) >>>> >>>> Now the bigger issue - versioned symbols. It turns out the libnuma >>>> library (version 1.2 at least) comes with a number of functions that have >>>> different implementations, worse different signatures. Look at this portion >>>> of the libnuma code that defines two versions of sched_getaffinity >>>> (but there is a handful of other functions with 2 implementations): >>>> >>>> int numa_sched_getaffinity_v1(pid_t pid, unsigned len, const unsigned >>>> long *mask) >>>> { >>>> return syscall(__NR_sched_getaffinity,pid,len,mask); >>>> >>>> } >>>> __asm__(".symver >>>> numa_sched_getaffinity_v1,numa_sched_getaffinity@libnuma_1.1"); >>>> >>>> int numa_sched_getaffinity_v2(pid_t pid, struct bitmask *mask) >>>> { >>>> /* len is length in bytes */ >>>> return syscall(__NR_sched_getaffinity, pid, numa_bitmask_nbytes(mask), >>>> mask->maskp); >>>> /* sched_getaffinity returns sizeof(cpumask_t) */ >>>> >>>> } >>>> __asm__(".symver numa_sched_getaffinity_v2,numa_sched_getaffinity@ >>>> @libnuma_1.2"); >>>> >>>> make_internal_alias(numa_sched_getaffinity_v1); >>>> make_internal_alias(numa_sched_getaffinity_v2); >>>> >>>> Now the specific version of two is used in the libnuma init functions >>>> and ends up calling correctly the right version - no surprise here. But if >>>> a library or executable like numactl runs and calls it the wrong version >>>> will be used and lead to a crash like this: >>>> >>>> ./scripts/manifest_from_host.sh -w numactl && ./scripts/build fs=rofs >>>> -j4 --append-manifest >>>> ./scripts/run.py -e '/numactl --show' -c 1 >>>> OSv v0.53.0-95-g2aa889ea >>>> eth0: 192.168.122.15 >>>> Booted up in 133.21 ms >>>> sched_getaffinity_syscall: pid:0, len:512, mask:0xffffa000012a7600 >>>> sched_getaffinity_syscall: pid:0, len:8, mask:0xffffa0000095f388 >>>> policy: default >>>> preferred node: current >>>> sched_getaffinity_syscall: pid:0, len:9999184, mask:0 >>>> page fault outside application, addr: 0x0000000000000000 >>>> [registers] >>>> RIP: 0x000000004039538e <memset_repstosb+14> >>>> RFL: 0x0000000000010202 CS: 0x0000000000000008 SS: >>>> 0x0000000000000010 >>>> RAX: 0x0000000000000000 RBX: 0xffff80000149b040 RCX: >>>> 0x0000000000989350 RDX: 0x0000000000989350 >>>> RSI: 0x0000000000000000 RDI: 0x0000000000000000 RBP: >>>> 0x0000200000200980 R8: 0x0000000000000000 >>>> R9: 0x0000000004c49a80 R10: 0x0000000000000000 R11: >>>> 0xffffa00001139100 R12: 0x0000000000989350 >>>> R13: 0x0000000004c49a80 R14: 0x0000000000000000 R15: >>>> 0xffffa00001139100 RSP: 0x0000200000200968 >>>> Aborted >>>> >>>> [backtrace] >>>> 0x0000000040349582 <???+1077187970> >>>> 0x000000004034aea9 <mmu::vm_fault(unsigned long, exception_frame*)+393> >>>> 0x00000000403a74b3 <page_fault+147> >>>> 0x00000000403a6316 <???+1077568278> >>>> 0x0000000040460acc <sched_getaffinity+28> >>>> 0x00000000403f524e <__syscall+1134> >>>> 0x0000100000014cd6 <???+85206> >>>> 0xffffa0000095f38f <???+9827215> >>>> >>>> Look at the 3rd call where sched_getaffinity_syscall on OSv side gets >>>> completely wrong data. >>>> >>>> So libnuma has 2 implementations of sched_getaffinity but numactl uses >>>> specific 1.2 version: >>>> readelf -s /usr/lib/x86_64-linux-gnu/libnuma.so.1 | grep >>>> numa_sched_getaffinity >>>> 123: 0000000000006cc0 28 FUNC GLOBAL DEFAULT 13 >>>> numa_sched_getaffinity@libnuma_1.1 >>>> 125: 0000000000006ce0 46 FUNC GLOBAL DEFAULT 13 >>>> numa_sched_getaffinity@@libnuma_1.2 >>>> >>>> readelf -s /usr/bin/numactl | grep numa_sched_getaffinity >>>> 37: 0000000000000000 0 FUNC GLOBAL DEFAULT UND >>>> numa_sched_getaffinity@libnuma_1.2 (5) >>>> >>>> Our dynamic linker (elf.cc) does not seem to handle versioned symbols >>>> and ends up linking the wrong version of this function. >>>> >>>> What would it take to implement version symbols. It looks like per >>>> https://refspecs.linuxfoundation.org/LSB_2.1.0/LSB-Core-generic/LSB-Core-generic/symvertbl.html >>>> we >>>> would have to read Symbol Version Table on both sides caller and callee ELF >>>> and enhance symbol lookup by using the version on top of the name. It does >>>> not seem super difficult but might be quite tricky, no. >>>> >>>> I am not sure how many libraries use symbol versioning and how >>>> prevalent it is, but this gap in OSv dynamic linker might lead to many >>>> surprising errors like this with libnuma. >>>> >>>> BTW this nasty hack gets numactl and ffmpeg work on OSv: >>>> --- a/core/elf.cc >>>> +++ b/core/elf.cc >>>> @@ -847,7 +847,23 @@ Elf64_Sym* object::lookup_symbol_gnu(const char* >>>> name) >>>> continue; >>>> } >>>> if (strcmp(&strtab[symtab[idx].st_name], name) == 0) { >>>> - return &symtab[idx]; >>>> + //if (strcmp("numa_sched_getaffinity", name) == 0 && >>>> dynamic_exists(DT_VERSYM)) { >>>> + // auto versym = dynamic_ptr<Elf64_Versym>(DT_VERSYM); >>>> + // printf("---> Idx: %d, version: %d\n", idx, >>>> versym[idx]); >>>> + //} >>>> + if (strcmp("numa_sched_getaffinity", name) == 0 && idx == >>>> 123) { >>>> + continue; >>>> + } >>>> + else >>>> + if (strcmp("numa_node_to_cpus", name) == 0 && idx == 115) { >>>> + continue; >>>> + } >>>> + else >>>> + if (strcmp("numa_get_run_node_mask", name) == 0 && idx == >>>> 96) { >>>> + continue; >>>> + } >>>> + else >>>> + return &symtab[idx]; >>>> } >>>> } while ((chains[idx++] & 1) == 0); >>>> return nullptr; >>>> @@ -1108,6 +1124,20 @@ void object::init_static_tls() >>>> } >>>> } >>>> >>>> -- >> You received this message because you are subscribed to the Google Groups >> "OSv Development" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to osv-dev+unsubscr...@googlegroups.com. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/osv-dev/c8829cc2-8ade-4a3a-b16a-63455e27e3f0%40googlegroups.com >> <https://groups.google.com/d/msgid/osv-dev/c8829cc2-8ade-4a3a-b16a-63455e27e3f0%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/CAL9cFfOYKModdtLq6m-zG2VjL-LHQNXQUEyg3cb9NLYV%3Di3%2BtQ%40mail.gmail.com.