Re: [vpp-dev] About the vppcom_epoll_wait function in vcl

2018-08-03 Thread Florin Coras
Hi Yalei, 

Are you testing with the latest master? Yes, blocking mode will be properly 
supported. The reason it won’t block now is that we need to poll multiple 
message queues (vpp’s and the cut-through sessions’) and -1 is not interpreted as a 
large number. This will be fixed once we switch from condvars to eventfds for 
notifications, at which point we’ll manage message queue notifications with 
epoll. 
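
For illustration, a minimal sketch of that scheme using plain Linux eventfd and epoll 
(assumed names and layout, not taken from VPP's code): each message queue gets an 
eventfd, the eventfds are registered with one epoll instance, and a single epoll_wait 
with a -1 timeout then blocks properly until any queue is signalled.

/* Hypothetical sketch, not VPP code: block on several message-queue
 * notifications by giving each queue an eventfd and multiplexing the
 * eventfds with a single epoll instance. */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

int
main (void)
{
  int epfd = epoll_create1 (0);
  int mq_efds[2];   /* e.g. one for vpp's mq, one for a cut-through mq */

  for (uint32_t i = 0; i < 2; i++)
    {
      mq_efds[i] = eventfd (0, EFD_NONBLOCK);
      struct epoll_event ev = { .events = EPOLLIN, .data.u32 = i };
      epoll_ctl (epfd, EPOLL_CTL_ADD, mq_efds[i], &ev);
    }

  /* Simulate a producer posting one notification on queue 0. */
  uint64_t one = 1;
  write (mq_efds[0], &one, sizeof (one));

  /* With eventfds, a -1 timeout genuinely blocks until a queue signals. */
  struct epoll_event events[2];
  int n = epoll_wait (epfd, events, 2, -1);

  for (int i = 0; i < n; i++)
    {
      uint64_t val;
      read (mq_efds[events[i].data.u32], &val, sizeof (val)); /* clear it */
      printf ("queue %u signalled %llu time(s)\n", events[i].data.u32,
              (unsigned long long) val);
    }
  return 0;
}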

As for your second question, with the latest changes we only support 
edge-triggered events. That is, vpp notifies apps (vcl included) only once that an io 
event (rx or tx space available) has occurred. VCL consumes that 
event and then notifies the app. The expectation is that apps will proceed with 
rx/tx until no more data/space is available and then enter epoll again. A 
side effect of that is that epoll will not return EPOLLOUT unless the fifo 
really underwent a change from full to not full. Select, on the other hand, does 
poll all fifos to check if tx space is available, but that also makes it 
considerably slower. 
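
As a rough sketch of the drain-until-EAGAIN pattern this implies for applications, 
written with plain POSIX calls rather than the vppcom_* API (function name and buffer 
handling are illustrative only; real code would actually process the data it reads):

/* Illustrative edge-triggered consumption loop: once epoll reports an event,
 * keep reading until EAGAIN before calling epoll_wait again, since the
 * notification will not be repeated. */
#include <errno.h>
#include <unistd.h>
#include <sys/epoll.h>

void
drain_ready_sessions (int epfd, struct epoll_event *events, int maxevents,
                      char *buf, int buf_len)
{
  int n = epoll_wait (epfd, events, maxevents, -1);
  for (int i = 0; i < n; i++)
    {
      if (!(events[i].events & EPOLLIN))
        continue;
      int fd = events[i].data.fd;
      while (1)
        {
          ssize_t rv = read (fd, buf, buf_len);
          if (rv > 0)
            continue;           /* keep draining available data */
          if (rv < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;              /* fifo drained; safe to wait again */
          break;                /* rv == 0 (EOF) or hard error */
        }
    }
}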

Florin

> On Aug 3, 2018, at 1:14 AM, wylandrea  wrote:
> 
> Hi,
> 
> These days I have been testing VCL and I am a bit confused. I don't know whether I 
> understand it right. Could anyone help me?
> 
> It looks like vppcom_epoll_wait is implemented in non-blocking mode: it loops and 
> checks the fifos of the related sessions.
> 
> Will vpp implement a blocking-mode epoll_wait?
> 
> Another issue is that it seems to only support EPOLLLT; clear_et_mask is not 
> used.
> 
> Any ideas?
> 
>  /Yalei 


Re: [vpp-dev] Regarding VPP TCP Stack usage

2018-08-03 Thread Florin Coras
Hi Yalei, 

Pretty much. We supported nginx forking at one point, but that code was not 
maintained. 

I’m now working on refactoring vcl and in the process adding multiple worker 
support in both vcl and the stack. The high level plan is pretty much the one 
I’ve stated on the list. I don’t have anything beyond that written down but if 
you have specific questions in mind, I’ll try to answer them :-)

As for your question concerning Envoy, I’m tempted to answer yes, mainly 
because latency/throughput is important for it. Will it work with LDP after I 
finish the refactor? I don’t know! I’ve never looked at the code and how it 
uses the sockets. Who knows, we may get lucky :-)

Florin

> On Aug 2, 2018, at 11:18 PM, 汪亚雷  wrote:
> 
> Hi Dave & Florin,
> 
> I am curious about the line "(and only with single workers)". Could you shed 
> some more light on it? Do you mean that VCL currently supports only apps with a 
> single worker, i.e. the app cannot 'fork'?
> 
> And, since you mentioned refactoring the VCL infrastructure, is there a 
> detailed plan? Will it be completed in 18.10?
> 
> If you advise "refactor legacy applications to use the VCL API directly", 
> is modification of the Envoy source code necessary for the Envoy integration?
> 
> Thank you all!
> 
> /Yalei
> 
> Dave Wallace <dwallac...@gmail.com> wrote on Wednesday, August 1, 2018 at 3:39 AM:
> Florin is correct.  There is also a performance and/or scaling penalty due to 
> the need to handle both kernel socket based file descriptors and VCL/VPP 
> created file descriptors with the LD_PRELOAD callback functions.
> 
> Thanks,
> -daw-
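
To illustrate why that dual bookkeeping has a cost, here is a rough, hypothetical 
sketch of the interposition pattern an LD_PRELOAD socket shim typically uses. It is 
not the actual libvcl_ldpreload code; shim_owns_fd() and shim_read() are made-up 
stand-ins for the library's own fd tracking and data path, and the shim would be 
built as a shared object linked with -ldl.

/* Hypothetical sketch of an LD_PRELOAD socket shim; NOT the actual
 * libvcl_ldpreload implementation. It shows why every intercepted POSIX
 * call must first decide whether an fd is kernel-owned or owned by the
 * preloaded stack. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/types.h>
#include <unistd.h>

/* Made-up stand-ins for the shim's own fd bookkeeping and data path. */
static int
shim_owns_fd (int fd)
{
  (void) fd;
  return 0;                     /* stub: pretend no fds are shim-owned */
}

static ssize_t
shim_read (int fd, void *buf, size_t n)
{
  (void) fd; (void) buf; (void) n;
  return -1;                    /* stub: a real shim would read its own fifo */
}

ssize_t
read (int fd, void *buf, size_t n)
{
  /* Lazily resolve the real libc read() the first time through. */
  static ssize_t (*real_read) (int, void *, size_t);
  if (!real_read)
    real_read = (ssize_t (*) (int, void *, size_t)) dlsym (RTLD_NEXT, "read");

  /* Per-call dispatch: this check, and the bookkeeping behind it, is the
   * overhead paid for every intercepted call, even on plain kernel fds. */
  if (shim_owns_fd (fd))
    return shim_read (fd, buf, n);

  return real_read (fd, buf, n);
}
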
> 
> On 7/31/18 2:11 PM, Florin Coras wrote:
>> Hi Matt, 
>> 
>> I’d say that trying to cover all possible combinations of POSIX calls is the 
>> main issue. Also, statically linked applications won’t work fine with 
>> ld_preload. But, I’ll let Dave provide more details since he is more closely 
>> involved with the effort. 
>> 
>> Florin
>> 
>> 
>>> On Jul 31, 2018, at 7:01 AM, Matthew Smith wrote:
>>> 
>>> 
>>> Hi Florin and Dave,
>>> 
>>> I’m curious what problems were observed with the LD_PRELOAD mechanism. Were 
>>> there performance issues? Or was it too difficult to try and cover 
>>> different usage of POSIX calls? Or something else?
>>> 
>>> Thanks!
>>> -Matt
>>> 
>>> 
 On Jul 30, 2018, at 10:39 AM, Florin Coras wrote:
 
 Prashant, 
 
 Dave is exactly right. If you still want to try out the LDP layer, I 
 wouldn’t set a global LD_PRELOAD variable because that will end up 
 preloading all the applications and, inevitably, lead to some unsupported usage 
 patterns and crashes. Instead, start only your app with LD_PRELOAD set, 
 something like:
 
 LD_PRELOAD=../vpp/build-root/install-vpp_debug-native/vpp/lib64/libvcl_ldpreload.so
  
 
 Note that we’re exercising both the vcl and ldp layers with our test 
 infrastructure. So, you may also want to take a look at test_vcl for more 
 details on how we use the ldp layer. 
 
 Hope this helps,
 Florin
 
 
> On Jul 30, 2018, at 8:09 AM, Dave Wallace wrote:
> 
> Prashant,
> 
> The VCL LD_PRELOAD library is experimental and only works with a very 
> limited set of legacy POSIX sockets applications (and only with single 
> workers).
> 
> The conclusion based on the results of the initial experimentation with 
> LD_PRELOAD is that it is not a viable mechanism for accelerating legacy 
> POSIX sockets based applications using the VPP host stack.  The current 
> recommendation is to refactor legacy applications to use the VCL API 
> directly.
> 
> You should also be aware that the VCL infrastructure is in the middle of 
> being refactored at this time and thus the VCL API may change.  I'll let 
> Florin, who is doing the refactoring, add his input on the VCL API 
> roadmap.
> 
> Thanks,
> -daw-
> 
> On 7/30/2018 7:21 AM, Prashant Upadhyaya wrote:
>> Hi,
>> 
>> I have compiled VPP and it's running. I have an interface up and can
>> ping the IP applied there.
>> 
>> Now I am trying to bring up a legacy application TCP server (the one
>> which uses POSIX calls). So I set the LD_PRELOAD to point to
>> .../vpp/build-root/install-vpp_debug-native/vpp/lib64/libvcl_ldpreload.so
>> But the server application now crashes on startup.
>> Even the ldd command starts crashing.
>> 
>> Can somebody point me to the correct set of steps to be used for
>> LD_PRELOAD to bring up my legacy tcp server which will then engage the
>> VPP TCP stack instead of the kernel's
>> 
>> Regards
>> -Prashant
>> 
>> 

Re: [vpp-dev] Large memory spike during make verify on ARM machine ThunderX

2018-08-03 Thread Neale Ranns via Lists.Fd.Io

The C++ language bindings are all templates. It’s the VOM compilation (which 
uses those templates) that consumes the memory. VOM is already in extras and 
these days is only compiled if you do ‘make test-ext’ or ‘make ’

/neale


From: Ole Troan 
Date: Friday, 3 August 2018 at 12:51
To: Juraj Linkeš 
Cc: "Neale Ranns (nranns)" , "vpp-dev@lists.fd.io" 

Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Move the C++ language binding to extras?

Ole

On 3 Aug 2018, at 12:45, Juraj Linkeš <juraj.lin...@pantheon.tech> wrote:
Hi Neale,

Yea they do require a lot of memory - the same is true for x86. Is there a way 
to specify the max number of these? Or is that done with -j?

Would it be worthwhile to investigate if it's possible to reduce the memory 
requirements of these?

Is there a way to clear the cache so that I could run make verify back to back 
without deleting and recloning the vpp repo? ccache -C didn't work for me.

Thanks,
Juraj

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Thursday, August 2, 2018 11:11 AM
To: Juraj Linkeš <juraj.lin...@pantheon.tech>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Juraj,

I couldn’t say how much each compile ‘should’ use, but it has been noted in the 
past that these template heavy C++ files do require a lot of memory to compile. 
With the many cores you have, then that’s a lot in total.
‘make wipe’ does not clear the ccache, so any subsequent builds will require 
less memory because the compile is skipped.

/neale

From: <vpp-dev@lists.fd.io> on behalf of Juraj Linkeš <juraj.lin...@pantheon.tech>
Date: Thursday, 2 August 2018 at 10:10
To: "Neale Ranns (nranns)" <nra...@cisco.com>, "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Neale,

I'm not specifying -j, but I see a lot of processes running in parallel when 
the spike is happening. The processes are attached. They utilized most of 96 
available cores and most of them used more than 400MB - is that how much they 
should be using?

Also, here's the gcc version on the box:
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/5/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 
5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs 
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr 
--program-suffix=-5 --enable-shared --enable-linker-build-id 
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix 
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-libquadmath --enable-plugin --with-system-zlib 
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo 
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home 
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 
--with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar 
--enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror 
--enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu 
--target=aarch64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4)

Thanks,
Juraj

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Wednesday, August 1, 2018 5:09 PM
To: Juraj Linkeš <juraj.lin...@pantheon.tech>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Juraj,

How many parallel compiles do you have? What’s the j factor

/neale



From: <vpp-dev@lists.fd.io> on behalf of Juraj Linkeš <juraj.lin...@pantheon.tech>
Date: Wednesday, 1 August 2018 at 16:59
To: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: [vpp-dev] Large memory spike during make verify on ARM machine ThunderX

Hi vpp-devs,

I noticed that during a specific portion of make verify build on an ARM 
ThunderX machine the build consumes a lot of memory - around 25GB. I can 
identify the spot in the logs:
Jul 31 03:12:48   CXX  gbp_contract.lo

25GB memory hog

Jul 31 03:16:13   CXXLDlibvom.la

but not much else. I created a ticket which contains some more information. I 
didn't see this memory spike when trying to reproduce the behavior on my x86 
laptop. Does anyone have any idea what could be the cause or how to debug this?

Thanks,
Juraj

Re: [vpp-dev] Large memory spike during make verify on ARM machine ThunderX

2018-08-03 Thread Neale Ranns via Lists.Fd.Io
Hi Juraj,

Answers/comments inline with [nr]

Regards,
neale

From: Juraj Linkeš 
Date: Friday, 3 August 2018 at 12:45
To: "Neale Ranns (nranns)" , "vpp-dev@lists.fd.io" 

Subject: RE: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Neale,

Yea they do require a lot of memory - the same is true for x86. Is there a way 
to specify the max number of these? Or is that done with -j?

[nr] The j factor for a build is determined based on the number of cores your 
box has.
From build-root/Makefile

# /proc/cpuinfo does not exist on platforms without a /proc and on some
# platforms, notably inside containers, it has no content. In those cases
# we assume there's 1 processor; we use 2*ncpu for the -j option.
# NB: GNU Make 4.2 will let us use '$(file </proc/cpuinfo)' ...

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Thursday, August 2, 2018 11:11 AM
To: Juraj Linkeš ; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Juraj,

I couldn’t say how much each compile ‘should’ use, but it has been noted in the 
past that these template heavy C++ files do require a lot of memory to compile. 
With the many cores you have, then that’s a lot in total.
‘make wipe’ does not clear the ccache, so any subsequent builds will require 
less memory because the compile is skipped.

/neale

From: <vpp-dev@lists.fd.io> on behalf of Juraj Linkeš <juraj.lin...@pantheon.tech>
Date: Thursday, 2 August 2018 at 10:10
To: "Neale Ranns (nranns)" <nra...@cisco.com>, "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Neale,

I'm not specifying -j, but I see a lot of processes running in parallel when 
the spike is happening. The processes are attached. They utilized most of 96 
available cores and most of them used more than 400MB - is that how much they 
should be using?

Also, here's the gcc version on the box:
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/5/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 
5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs 
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr 
--program-suffix=-5 --enable-shared --enable-linker-build-id 
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix 
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-libquadmath --enable-plugin --with-system-zlib 
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo 
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home 
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 
--with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar 
--enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror 
--enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu 
--target=aarch64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4)

Thanks,
Juraj

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Wednesday, August 1, 2018 5:09 PM
To: Juraj Linkeš <juraj.lin...@pantheon.tech>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Juraj,

How many parallel compiles do you have? What’s the j factor

/neale



From: <vpp-dev@lists.fd.io> on behalf of Juraj Linkeš <juraj.lin...@pantheon.tech>
Date: Wednesday, 1 August 2018 at 16:59
To: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: [vpp-dev] Large memory spike during make verify on ARM machine ThunderX

Hi vpp-devs,

I noticed that during a specific portion of make verify build on an ARM 
ThunderX machine the build consumes a lot of memory - around 25GB. I can 
identify the spot in the logs:
Jul 31 03:12:48   CXX  gbp_contract.lo

25GB memory hog

Jul 31 03:16:13   CXXLDlibvom.la

but not much else. I created a ticket which contains some more information. I 
didn't see this memory spike when trying to reproduce the behavior on my x86 
laptop. Does anyone have any idea what could be the cause or how to debug this?

Thanks,
Juraj


Re: [vpp-dev] Large memory spike during make verify on ARM machine ThunderX

2018-08-03 Thread Ole Troan
Move the C++ language binding to extras?

Ole

> On 3 Aug 2018, at 12:45, Juraj Linkeš  wrote:
> 
> Hi Neale,
>  
> Yea they do require a lot of memory - the same is true for x86. Is there a 
> way to specify the max number of these? Or is that done with -j?
>  
> Would it be worthwhile to investigate if it's possible to reduce the memory 
> requirements of these?
>  
> Is there a way to clear the cache so that I could run make verify back to 
> back without deleting and recloning the vpp repo? ccache -C didn't work for 
> me.
>  
> Thanks,
> Juraj
>  
> From: Neale Ranns (nranns) [mailto:nra...@cisco.com] 
> Sent: Thursday, August 2, 2018 11:11 AM
> To: Juraj Linkeš ; vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
> ThunderX
>  
> Hi Juraj,
>  
> I couldn’t say how much each compile ‘should’ use, but it has been noted in 
> the past that these template heavy C++ files do require a lot of memory to 
> compile. With the many cores you have, then that’s a lot in total.
> ‘make wipe’ does not clear the ccache, so any subsequent builds will require 
> less memory because the compile is skipped.
>  
> /neale
>  
> From:  on behalf of Juraj Linkeš 
> 
> Date: Thursday, 2 August 2018 at 10:10
> To: "Neale Ranns (nranns)" , "vpp-dev@lists.fd.io" 
> 
> Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
> ThunderX
>  
> Hi Neale,
>  
> I'm not specifying -j, but I see a lot of processes running in parallel when 
> the spike is happening. The processes are attached. They utilized most of 96 
> available cores and most of them used more than 400MB - is that how much they 
> should be using?
>  
> Also, here's the gcc version on the box:
> gcc -v
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/5/lto-wrapper
> Target: aarch64-linux-gnu
> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 
> 5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs 
> --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr 
> --program-suffix=-5 --enable-shared --enable-linker-build-id 
> --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix 
> --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu 
> --enable-libstdcxx-debug --enable-libstdcxx-time=yes 
> --with-default-libstdcxx-abi=new --enable-gnu-unique-object 
> --disable-libquadmath --enable-plugin --with-system-zlib 
> --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo 
> --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home 
> --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 
> --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 
> --with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar 
> --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror 
> --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu 
> --target=aarch64-linux-gnu
> Thread model: posix
> gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4)
>  
> Thanks,
> Juraj
>  
> From: Neale Ranns (nranns) [mailto:nra...@cisco.com] 
> Sent: Wednesday, August 1, 2018 5:09 PM
> To: Juraj Linkeš ; vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
> ThunderX
>  
> Hi Juraj,
>  
> How many parallel compiles do you have? What’s the j factor
>  
> /neale
>  
>  
>  
> From:  on behalf of Juraj Linkeš 
> 
> Date: Wednesday, 1 August 2018 at 16:59
> To: "vpp-dev@lists.fd.io" 
> Subject: [vpp-dev] Large memory spike during make verify on ARM machine 
> ThunderX
>  
> Hi vpp-devs,
>  
> I noticed that during a specific portion of make verify build on an ARM 
> ThunderX machine the build consumes a lot of memory - around 25GB. I can 
> identify the spot in the logs:
> Jul 31 03:12:48   CXX  gbp_contract.lo
>  
> 25GB memory hog
>  
> Jul 31 03:16:13   CXXLDlibvom.la
>  
> but not much else. I created a ticket which contains some more information. I 
> didn't see this memory spike when trying to reproduce the behavior on my x86 
> laptop. Does anyone have any idea what could be the cause or how to debug this?
>  
> Thanks,
> Juraj


Re: [vpp-dev] Large memory spike during make verify on ARM machine ThunderX

2018-08-03 Thread Juraj Linkeš
Hi Neale,

Yea they do require a lot of memory - the same is true for x86. Is there a way 
to specify the max number of these? Or is that done with -j?

Would it be worthwhile to investigate if it's possible to reduce the memory 
requirements of these?

Is there a way to clear the cache so that I could run make verify back to back 
without deleting and recloning the vpp repo? ccache -C didn't work for me.

Thanks,
Juraj

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Thursday, August 2, 2018 11:11 AM
To: Juraj Linkeš ; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Juraj,

I couldn’t say how much each compile ‘should’ use, but it has been noted in the 
past that these template heavy C++ files do require a lot of memory to compile. 
With the many cores you have, then that’s a lot in total.
‘make wipe’ does not clear the ccache, so any subsequent builds will require 
less memory because the compile is skipped.

/neale

From: <vpp-dev@lists.fd.io> on behalf of Juraj Linkeš <juraj.lin...@pantheon.tech>
Date: Thursday, 2 August 2018 at 10:10
To: "Neale Ranns (nranns)" <nra...@cisco.com>, "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Neale,

I'm not specifying -j, but I see a lot of processes running in parallel when 
the spike is happening. The processes are attached. They utilized most of 96 
available cores and most of them used more than 400MB - is that how much they 
should be using?

Also, here's the gcc version on the box:
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/5/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 
5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs 
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr 
--program-suffix=-5 --enable-shared --enable-linker-build-id 
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix 
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-libquadmath --enable-plugin --with-system-zlib 
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo 
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-arm64/jre --enable-java-home 
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-arm64 
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-arm64 
--with-arch-directory=aarch64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar 
--enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror 
--enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu 
--target=aarch64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4)

Thanks,
Juraj

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Wednesday, August 1, 2018 5:09 PM
To: Juraj Linkeš <juraj.lin...@pantheon.tech>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Large memory spike during make verify on ARM machine 
ThunderX

Hi Juraj,

How many parallel compiles do you have? What’s the j factor

/neale



From: <vpp-dev@lists.fd.io> on behalf of Juraj Linkeš <juraj.lin...@pantheon.tech>
Date: Wednesday, 1 August 2018 at 16:59
To: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: [vpp-dev] Large memory spike during make verify on ARM machine ThunderX

Hi vpp-devs,

I noticed that during a specific portion of make verify build on an ARM 
ThunderX machine the build consumes a lot of memory - around 25GB. I can 
identify the spot in the logs:
Jul 31 03:12:48   CXX  gbp_contract.lo

25GB memory hog

Jul 31 03:16:13   CXXLDlibvom.la

but not much else. I created a ticket which contains some more information. I 
didn't see this memory spike when trying to reproduce the behavior on my x86 
laptop. Does anyone have any idea what could be the cause or how to debug this?

Thanks,
Juraj


[vpp-dev] About the vppcom_epoll_wait function in vcl

2018-08-03 Thread wylandrea
Hi,

These days I have been testing VCL and I am a bit confused. I don't know whether
I understand it right. Could anyone help me?

It looks like vppcom_epoll_wait is implemented in non-blocking mode: it loops
and checks the fifos of the related sessions.

Will vpp implement a blocking-mode epoll_wait?

Another issue is that it seems to only support EPOLLLT; clear_et_mask is not
used.

Any ideas?

 /Yalei


Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

2018-08-03 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

Can you please take a look and advise? Currently 50-70% of CSIT tests on SKX 
(Ubuntu 18.04) are failing. About 10% are affected on Haswell testbeds (Ubuntu 
16.04).

Thank you in advance.

Peter Mikus
Engineer – Software
Cisco Systems Limited


-Original Message-
From: Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) 
Sent: Thursday, August 02, 2018 1:05 PM
To: Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) 
; Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
; Ray Kinsella ; vpp-dev@lists.fd.io
Subject: RE: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

Added a Jira comment [1] with some details and attached the same dump (just 
compressed better) to the Jira bug report.

Vratko.

[1] 
https://jira.fd.io/browse/VPP-1361?focusedCommentId=13104&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13104

-Original Message-
From: vpp-dev@lists.fd.io  On Behalf Of Vratko Polak -X 
(vrpolak - PANTHEON TECHNOLOGIES at Cisco) via Lists.Fd.Io
Sent: Wednesday, 2018-August-01 18:13
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
; Ray Kinsella ; vpp-dev@lists.fd.io
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

> VPP is not crashing, so no core dumps are available.

I tried to use the "gcore" command to create a core dump from the running VPP.
So far I got this [0] archive, compressed to around 25 MB, but the core file 
inside is around 160 GB.

Not sure how to make it smaller; even with small numbers in startup.conf, the 
core file is around 140 GB.

Vratko.

[0] 
https://jenkins.fd.io/sandbox/job/csit-vpp-perf-verify-master-2n-skx/13/artifact/archive/DUT1_cores.tar.xz

-Original Message-
From: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
Sent: Tuesday, 2018-July-31 13:25
To: Ray Kinsella ; vpp-dev@lists.fd.io
Cc: csit-...@lists.fd.io; Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at 
Cisco) ; yulong@intel.com
Subject: RE: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

Hello,

Thanks to Vratko (cc), who tested the latest master with DPDK 18.02.2 [0]. The issue 
is there as well.

I cannot confirm whether "no JSON data.VAT" is related. The bad thing is that there 
is no meaningful return message even with more verbose output.

(We see this on pretty much all NIC cards in LF and all TBs.)

[0] 
https://jenkins.fd.io/sandbox/job/vpp-csit-verify-hw-perf-master-2n-skx/6/consoleFull

Peter Mikus
Engineer – Software
Cisco Systems Limited

-Original Message-
From: Ray Kinsella [mailto:m...@ashroe.eu]
Sent: Tuesday, July 31, 2018 12:06 PM
To: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco) 
; vpp-dev@lists.fd.io; yulong@intel.com
Subject: Re: [vpp-dev] CSIT - sw_interface_set_flags admin-up link-up failing

Hi Peter,

It may be unrelated, but I think we see this issue also pretty regularly with 
FD.io VPP 18.04 and the x520, on our local test rig.

The error we typically see is "VAT command sw_interface_set_flags sw_if_index 1 
admin-up: no JSON data.VAT".

Do you think it is the same or a separate issue?

Ray K


On 30/07/2018 08:02, Peter Mikus via Lists.Fd.Io wrote:
> Hello vpp-dev,
> 
> I am looking for consultation. We started to test VPP for the report on 
> all LF CSIT testbeds, Skylakes and Haswells.
> 
> We are observing weird behavior. In each test we use a sequence that 
> first brings both physical interfaces up via VAT:
> 
>    sw_interface_set_flags sw_if_index  admin-up (I 
> also tried sw_interface_set_flags sw_if_index idx admin-up link-up)
> 
> After setting all interfaces UP, we test whether the interfaces are 
> really UP via VAT (looping 30 times, 1s between API call checks): 
> “sw_interface_dump”.
> 
> It wasn’t an issue in the past, but recently we started seeing 
> sw_interface_dump report interfaces as link_down (admin-up).
> 
> Notes/symptoms:
> 
> - Our sw_interface_dump check runs 30x (1s interval) in a loop.
> 
> - Link-down is random: sometimes both interfaces are link-up, sometimes 
> just one, and sometimes both links are down.
> 
> - _It is not TB related_, nor cabling related; we see it on 
> Haswells-3node in roughly 1 out of 70 tests, on Skylakes-2node in 1 out of 70, 
> but on Skylake-3node in more than half of the tests.
> 
> - Checking state during a test reveals that the interfaces are link-down 
> (show int), so “sw_interface_dump” is reporting the state correctly.
> 
> - Issuing “set interface state … up” on the CLI during a test does bring 
> the interfaces UP -> (but it is hard to check the timing here).
> 
> - Affected are mostly x520 and x710, but that is most probably because 
> of statistics (low coverage of other NICs like xxv710 and xl710).
> 
> - We have seen this on master vpp as well as rc2 vpp.
> 
> - It is not clear when this started to happen, so bisecting would take a 
> lot of time.
> 
> - This was spotted on VIRL as well, and also on Memif interfaces, which leads 
> us to suspect that this is 

Re: [vpp-dev] Regarding VPP TCP Stack usage

2018-08-03 Thread wylandrea
Hi Dave & Florin,

I am curious about the line "(and only with single workers)". Could you shed
some more light on it? Do you mean that VCL currently supports only apps with a
single worker, i.e. the app cannot 'fork'?

And, since you mentioned refactoring the VCL infrastructure, is there a
detailed plan? Will it be completed in 18.10?

If you advise "refactor legacy applications to use the VCL API directly",
is modification of the Envoy source code necessary for the Envoy integration?

Thank you all!

/Yalei

Dave Wallace wrote on Wednesday, August 1, 2018 at 3:39 AM:

> Florin is correct.  There is also a performance and/or scaling penalty due
> to the need to handle both kernel socket based file descriptors and VCL/VPP
> created file descriptors with the LD_PRELOAD callback functions.
>
> Thanks,
> -daw-
>
> On 7/31/18 2:11 PM, Florin Coras wrote:
>
> Hi Matt,
>
> I’d say that trying to cover all possible combinations of POSIX calls is
> the main issue. Also, statically linked applications won’t work fine with
> ld_preload. But, I’ll let Dave provide more details since he is more
> closely involved with the effort.
>
> Florin
>
>
> On Jul 31, 2018, at 7:01 AM, Matthew Smith  wrote:
>
>
> Hi Florin and Dave,
>
> I’m curious what problems were observed with the LD_PRELOAD mechanism.
> Were there performance issues? Or was it too difficult to try and cover
> different usage of POSIX calls? Or something else?
>
> Thanks!
> -Matt
>
>
> On Jul 30, 2018, at 10:39 AM, Florin Coras  wrote:
>
> Prashant,
>
> Dave is exactly right. If you still want to try out the LDP layer, I
> wouldn’t set a global LD_PRELOAD variable because that will end up
> preloading all the applications and, inevitably, lead to some unsupported usage
> patterns and crashes. Instead, start only your app with LD_PRELOAD set,
> something like:
>
> LD_PRELOAD=../vpp/build-root/install-vpp_debug-native/vpp/lib64/libvcl_ldpreload.so
> 
>
> Note that we’re exercising both the vcl and ldp layers with our test
> infrastructure. So, you may also want to take a look at test_vcl for more
> details on how we use the ldp layer.
>
> Hope this helps,
> Florin
>
>
> On Jul 30, 2018, at 8:09 AM, Dave Wallace  wrote:
>
> Prashant,
>
> The VCL LD_PRELOAD library is experimental and only works with a very
> limited set of legacy POSIX sockets applications (and only with single
> workers).
>
> The conclusion based on the results of the initial experimentation with
> LD_PRELOAD is that it is not a viable mechanism for accelerating legacy
> POSIX sockets based applications using the VPP host stack.  The current
> recommendation is to refactor legacy applications to use the VCL API
> directly.
>
> You should also be aware that the VCL infrastructure is in the middle of
> being refactored at this time and thus the VCL API may change.  I'll let
> Florin, who is doing the refactoring, add his input on the VCL API roadmap.
>
> Thanks,
> -daw-
>
> On 7/30/2018 7:21 AM, Prashant Upadhyaya wrote:
>
> Hi,
>
> I have compiled VPP and it's running. I have an interface up and can
> ping the IP applied there.
>
> Now I am trying to bring up a legacy application TCP server (the one
> which uses POSIX calls). So I set the LD_PRELOAD to point to
> .../vpp/build-root/install-vpp_debug-native/vpp/lib64/libvcl_ldpreload.so
> But the server application now crashes on startup.
> Even the ldd command starts crashing.
>
> Can somebody point me to the correct set of steps to be used for
> LD_PRELOAD to bring up my legacy tcp server which will then engage the
> VPP TCP stack instead of the kernel's
>
> Regards
> -Prashant
>