Thank you Keith and Monroy. With your help I was able to track down the problem: my /var/run was too small to hold the hugepage information, so when I increased its size, it worked. Thank you so much.
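
For anyone hitting the same crash, here is a rough Python sketch of the check that would have caught this. It assumes the figures quoted further down in this thread (512 hugepages and sizeof(struct hugepage_file) = 4144 bytes) and that the hugepage information file is created on the filesystem mounted at /var/run. The function names are made up for illustration and are not part of DPDK.

```python
import os


def hugepage_info_bytes(nr_hugefiles: int, entry_size: int) -> int:
    """Bytes needed to store nr_hugefiles entries of entry_size bytes each."""
    return nr_hugefiles * entry_size


def free_bytes(path: str) -> int:
    """Bytes available to an unprivileged caller on the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def fits(path: str, nr_hugefiles: int, entry_size: int) -> bool:
    """True if the filesystem at path has room for the hugepage information."""
    return free_bytes(path) >= hugepage_info_bytes(nr_hugefiles, entry_size)


if __name__ == "__main__":
    # Values taken from this thread; treat them as assumptions on any other setup.
    NR_HUGEFILES = 512
    ENTRY_SIZE = 4144  # sizeof(struct hugepage_file) as reported in the thread
    RUN_DIR = "/var/run"

    needed = hugepage_info_bytes(NR_HUGEFILES, ENTRY_SIZE)
    print(f"hugepage info needs {needed} bytes")

    if os.path.isdir(RUN_DIR):
        print(f"{RUN_DIR} has {free_bytes(RUN_DIR)} bytes free")
        print("fits" if fits(RUN_DIR, NR_HUGEFILES, ENTRY_SIZE) else "too small")
    else:
        print(f"{RUN_DIR} not found; nothing to check")
```

With these numbers, 256 hugepages need 1060864 bytes and 512 need 2121728 bytes. That would fit what I saw (256 worked, 512 crashed) if /var/run was capped somewhere between those two sizes, although I did not record the exact limit.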

On Thu, Feb 23, 2017 at 10:35 AM, Sergio Gonzalez Monroy <[email protected]> wrote:

> As Keith suggested, gdb is probably your best bet now.
> You could also do 'strace' to see if something shows up there.
>
> If you are running as root, the application is opening a file in /var/run
> to store some hugepage information, then it memsets to 0.
>
> What distro and kernel are you running on?
>
> On 23/02/2017 16:19, Sushil Adhikari wrote:
>
>> I didn't understand what you mean by hugepage value, if you mean number of
>> hugepages here's what it looks like
>> [~]$ grep -ri hugepages /proc/meminfo
>> AnonHugePages: 0 kB
>> HugePages_Total: 512
>> HugePages_Free: 512
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>>
>> And the linux version is 4.4.20.
>>
>> On Thu, Feb 23, 2017 at 9:17 AM, Wiles, Keith <[email protected]> wrote:
>>
>>> On Feb 22, 2017, at 7:18 PM, Sushil Adhikari <[email protected]> wrote:
>>>
>>>> Thank you Keith for the response,
>>>>
>>>> Yes it should be line 1142 not 1405, I was using 16.11 and now I'm using
>>>> 17.02 and still getting the same error.
>>>
>>> Not sure what to say here, it looks like some type of system configuration
>>> issue as I do not see it on my machine.
>>>
>>> Can you tell if the hugepage has a value and is it sane? The next thing is
>>> to see where in that memory it is failing: start, end or middle someplace.
>>> Use GDB and compile the code with 'make install
>>> T=x86_64-native-linuxapp-gcc EXTRA_CFLAGS="-g -O0"' then set a break point
>>> on 'b eal_memory.c:1142' and inspect the memory pointer hugepage. I do not
>>> think it is an overrun error, meaning the size for memset is different than
>>> what was allocated and just stepping off the end.
>>>
>>> Also you did not tell me the linux version you are using?
>>>
>>>> On Wed, Feb 22, 2017 at 8:46 PM, Wiles, Keith <[email protected]> wrote:
>>>>
>>>> On Feb 22, 2017, at 6:43 PM, Wiles, Keith <[email protected]> wrote:
>>>>
>>>>> On Feb 22, 2017, at 6:30 PM, Sushil Adhikari <[email protected]> wrote:
>>>>>
>>>>>> I used the basic command line option "dpdkTimer -c 0xf -n 4"
>>>>>> And to update on my findings so far I have narrowed down to this
>>>>>> line(1405)
>>>>>> memset(hugepage, 0, nr_hugefiles * sizeof(struct hugepage_file));
>>>>>> of function rte_eal_hugepage_init() in file
>>>>>> dpdk\lib\librte_eal\linuxapp\eal\eal_memory.c
>>>>>
>>>>> What version of DPDK are you using? I was looking at the file at 1405
>>>>> and I do not see a memset() call.
>>>>
>>>> I found the memset call at 1142 in my 17.05-rc0 code. Please try the
>>>> latest version and see if you get the same problem.
>>>>
>>>>>> Yes I have the hugepages of size 2MB(2048) and when I calculate the
>>>>>> memory this memset function is trying to set, it comes out to
>>>>>> 512(nr_hugefiles) * 4144 ( sizeof(struct hugepage_file) ) = 2121728 which
>>>>>> is larger than 2MB, so my doubt is that the hugepages I have
>>>>>> allocated (512*2MB) are not a contiguous 1GB of memory and it's trying to
>>>>>> access memory that's not part of a hugepage. Is that a possibility, even
>>>>>> though I am setting up hugepages during boot time by providing them
>>>>>> through a kernel option?
>>>>>>
>>>>>> On Wed, Feb 22, 2017 at 8:05 PM, Wiles, Keith <[email protected]> wrote:
>>>>>>
>>>>>> On Feb 22, 2017, at 3:05 PM, Sushil Adhikari <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was trying to run dpdk timer app by setting 512 2MB hugepages but the
>>>>>>> application crashed with following error
>>>>>>> EAL: Detected 4 lcore(s)
>>>>>>> EAL: Probing VFIO support...
>>>>>>> Bus error (core dumped)
>>>>>>>
>>>>>>> If I reduce the number of hugepages to 256 it works fine.
>>>>>>> I am wondering what could be the problem here. Here's my cpu info
>>>>>>
>>>>>> I normally run with 2048 x 2 or 2048 per socket on my machine. What is
>>>>>> the command line you are using to start the application?
>>>>>>
>>>>>>> processor : 0
>>>>>>> vendor_id : GenuineIntel
>>>>>>> cpu family : 6
>>>>>>> model : 26
>>>>>>> model name : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
>>>>>>> stepping : 5
>>>>>>> microcode : 0x11
>>>>>>> cpu MHz : 2794.000
>>>>>>> cache size : 8192 KB
>>>>>>> physical id : 0
>>>>>>> siblings : 4
>>>>>>> core id : 0
>>>>>>> cpu cores : 4
>>>>>>> apicid : 0
>>>>>>> initial apicid : 0
>>>>>>> fpu : yes
>>>>>>> fpu_exception : yes
>>>>>>> cpuid level : 11
>>>>>>> wp : yes
>>>>>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>>>>>>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
>>>>>>> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
>>>>>>> nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16
>>>>>>> xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi
>>>>>>> flexpriority ept vpid
>>>>>>> bugs :
>>>>>>> bogomips : 5600.00
>>>>>>> clflush size : 64
>>>>>>> cache_alignment : 64
>>>>>>> address sizes : 36 bits physical, 48 bits virtual
>>>>>>> power management:
>>>>>>>
>>>>>>> processor : 1
>>>>>>> vendor_id : GenuineIntel
>>>>>>> cpu family : 6
>>>>>>> model : 26
>>>>>>> model name : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
>>>>>>> stepping : 5
>>>>>>> microcode : 0x11
>>>>>>> cpu MHz : 2794.000
>>>>>>> cache size : 8192 KB
>>>>>>> physical id : 0
>>>>>>> siblings : 4
>>>>>>> core id : 1
>>>>>>> cpu cores : 4
>>>>>>> apicid : 2
>>>>>>> initial apicid : 2
>>>>>>> fpu : yes
>>>>>>> fpu_exception : yes
>>>>>>> cpuid level : 11
>>>>>>> wp : yes
>>>>>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>>>>>>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
>>>>>>> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
>>>>>>> nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16
>>>>>>> xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi
>>>>>>> flexpriority ept vpid
>>>>>>> bugs :
>>>>>>> bogomips : 5600.00
>>>>>>> clflush size : 64
>>>>>>> cache_alignment : 64
>>>>>>> address sizes : 36 bits physical, 48 bits virtual
>>>>>>> power management:......
>>>>>>>
>>>>>>> And Here's my meminfo
>>>>>>>
>>>>>>> MemTotal: 24679608 kB
>>>>>>> MemFree: 24014156 kB
>>>>>>> MemAvailable: 23950600 kB
>>>>>>> Buffers: 3540 kB
>>>>>>> Cached: 31436 kB
>>>>>>> SwapCached: 0 kB
>>>>>>> Active: 21980 kB
>>>>>>> Inactive: 22256 kB
>>>>>>> Active(anon): 10760 kB
>>>>>>> Inactive(anon): 2940 kB
>>>>>>> Active(file): 11220 kB
>>>>>>> Inactive(file): 19316 kB
>>>>>>> Unevictable: 0 kB
>>>>>>> Mlocked: 0 kB
>>>>>>> SwapTotal: 0 kB
>>>>>>> SwapFree: 0 kB
>>>>>>> Dirty: 32 kB
>>>>>>> Writeback: 0 kB
>>>>>>> AnonPages: 9252 kB
>>>>>>> Mapped: 11912 kB
>>>>>>> Shmem: 4448 kB
>>>>>>> Slab: 27712 kB
>>>>>>> SReclaimable: 11276 kB
>>>>>>> SUnreclaim: 16436 kB
>>>>>>> KernelStack: 2672 kB
>>>>>>> PageTables: 1000 kB
>>>>>>> NFS_Unstable: 0 kB
>>>>>>> Bounce: 0 kB
>>>>>>> WritebackTmp: 0 kB
>>>>>>> CommitLimit: 12077660 kB
>>>>>>> Committed_AS: 137792 kB
>>>>>>> VmallocTotal: 34359738367 kB
>>>>>>> VmallocUsed: 0 kB
>>>>>>> VmallocChunk: 0 kB
>>>>>>> HardwareCorrupted: 0 kB
>>>>>>> AnonHugePages: 2048 kB
>>>>>>> CmaTotal: 0 kB
>>>>>>> CmaFree: 0 kB
>>>>>>> HugePages_Total: 256
>>>>>>> HugePages_Free: 0
>>>>>>> HugePages_Rsvd: 0
>>>>>>> HugePages_Surp: 0
>>>>>>> Hugepagesize: 2048 kB
>>>>>>> DirectMap4k: 22000 kB
>>>>>>> DirectMap2M: 25133056 kB
>>>>>>
>>>>>> Regards,
>>>>>> Keith
>>>>>
>>>>> Regards,
>>>>> Keith
>>>>
>>>> Regards,
>>>> Keith
>>>
>>> Regards,
>>> Keith
