On Wed, Mar 19, 2014 at 08:44:46AM -0700, H. Peter Anvin wrote: > On 03/19/2014 07:48 AM, Neil Horman wrote: > > The recent conversion to build dpdk as a DSO has an error in > > rte_cpu_get_features. When being build with -fpie, %ebx gets clobbered by > > the > > cpuid instruction which is also the pic register. Therefore the inline asm > > tries to save off %ebx, but does so incorrectly. It starts by loading > > params.ebx to "D" which is %edi, but then the first instruction moves %ebx > > to > > %edi, clobbering the input value. Then after the operation is complete, "D" > > (%edi) is stored to the local ebx variable, but only after the xchgl > > instruction > > has happened, which means ebx holds only the PIC pointer. This behavior was > > causing strange segfults for me when running the cpuid instruction. > > > > The fix is pretty easy, split the asm into two separate directives, the > > first > > saving ebx, and using it to grab the appropriate cpuid info (and correctly > > listing %edi as a clobbered register in the process, and then a subsequent > > asm > > directive preforming the reverse exchange (again, listing %edi as being > > clobbered). > > > > Signed-off-by: Neil Horman <nhorman at tuxdriver.com> > > > So after some discussion with hpa, I need to self NAK this again, apologies for the noise. Theres some clean up to be done in this area, and I'm still getting a segfault that is in some way related to this code, but I need to dig deeper to understand it.
Neil > Hi Neil :) > > If I'm reading this correctly, this is at the very best extremely > dangerous (I'm confused why it would compile at all with PIC enabled), > since it leaves the CPU state in an unexpected way between two asm > statements, where the compiler is perfectly allowed to put code. > > Instead, I would do simple xchg/xchg, which is an idiom I have used for > this particular purpose in a lot of code. The minimal patch is simply > to change "mov" to "xchg" inside the asm statement. > > There is no fundamental reason to nail down the register to %edi, > though; thus I would suggest instead: > > diff --git a/lib/librte_eal/common/eal_common_cpuflags.c > b/lib/librte_eal/common/eal_common_cpuflags.c > index 1ebf78cc2a48..6b75992fec1a 100644 > --- a/lib/librte_eal/common/eal_common_cpuflags.c > +++ b/lib/librte_eal/common/eal_common_cpuflags.c > @@ -206,16 +206,16 @@ rte_cpu_get_features(struct cpuid_parameters_t params) > "d" (params.edx)); > #else > asm volatile ( > - "mov %%ebx, %%edi\n" > + "xchgl %%ebx, %1\n" > "cpuid\n" > - "xchgl %%ebx, %%edi;\n" > + "xchgl %%ebx, %1;\n" > : "=a" (eax), > - "=D" (ebx), > + "=r" (ebx), > "=c" (ecx), > "=d" (edx) > /* input */ > : "a" (params.eax), > - "D" (params.ebx), > + "1" (params.ebx), > "c" (params.ecx), > "d" (params.edx)); > #endif > > > > --- > > Change notes > > > > v2) Fix constraints to ensure that ebx isn't overwritten before asm starts > > --- > > lib/librte_eal/common/eal_common_cpuflags.c | 13 ++++++++----- > > 1 file changed, 8 insertions(+), 5 deletions(-) > > > > diff --git a/lib/librte_eal/common/eal_common_cpuflags.c > > b/lib/librte_eal/common/eal_common_cpuflags.c > > index 1ebf78c..75b505f 100644 > > --- a/lib/librte_eal/common/eal_common_cpuflags.c > > +++ b/lib/librte_eal/common/eal_common_cpuflags.c > > @@ -190,7 +190,7 @@ static const struct feature_entry cpu_feature_table[] = > > { > > static inline int > > rte_cpu_get_features(struct cpuid_parameters_t params) > > { > > - int eax, ebx, ecx, edx; /* registers */ > > + int eax, ebx, ecx, edx, oldebx; /* registers */ > > > > #ifndef __PIC__ > > asm volatile ("cpuid" > > @@ -206,18 +206,21 @@ rte_cpu_get_features(struct cpuid_parameters_t params) > > "d" (params.edx)); > > #else > > asm volatile ( > > - "mov %%ebx, %%edi\n" > > + "xchgl %%ebx, %%edi\n" > > "cpuid\n" > > - "xchgl %%ebx, %%edi;\n" > > : "=a" (eax), > > - "=D" (ebx), > > + "=b" (ebx), > > "=c" (ecx), > > - "=d" (edx) > > + "=d" (edx), > > + "=D" (oldebx) > > /* input */ > > : "a" (params.eax), > > "D" (params.ebx), > > "c" (params.ecx), > > "d" (params.edx)); > > + > > + asm volatile ("xchgl %%ebx, %%edi;\n" > > + : : "D" (oldebx)); > > #endif > > > > switch (params.return_register) { > > > >