Re: [vpp-dev] building vpp v19.04.2

2020-07-23 Thread sadhanakesavan
OK, commenting out the session enable config fixes it.
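
For anyone hitting the same thing, a minimal sketch of the change against the
startup.conf quoted later in this thread (illustrative only; your stanza may differ):

session {
  # evt_qs_memfd_seg
  # enable
}

With the session stanza effectively disabled, vpp daemonizes as expected when
nodaemon is not set.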


Re: [vpp-dev] building vpp v19.04.2

2020-07-23 Thread sadhanakesavan
Additional logs after adding nodaemon:
/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: clib_elf_parse_file: open 
`linux-vdso.so.1': No such file or directory

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: lacp_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: memif_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: nat_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: vmxnet3_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: nsim_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: ct6_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: acl_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: ioam_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: avf_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: cdp_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: ikev2_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: lb_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: pppoe_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: nsh_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: mactime_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: stn_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: gtpu_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: load_one_vat_plugin:67: 
Loaded plugin: flowprobe_test_plugin.so

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: vlib_pci_bind_to_uio: 
Skipping PCI device :03:00.0 as host interface ens160 is up

/vpp/build-root/build-vpp-native/vpp/bin/vpp[18911]: dpdk: EAL init args: -c 2 
-n 4 --in-memory --file-prefix vpp -b :03:00.0 -b :0b:00.0 -b 
:13:00.0 -b :1b:00.0 --master-lcore 1


[vpp-dev] building vpp v19.04.2

2020-07-23 Thread sadhanakesavan
Hi,
I am trying to build VPP v19.04.2 for SPDK integration.
After I run ./extras/vagrant/build.sh and make build, and then run
sudo /vpp/build-root/install-vpp_debug-native/vpp/bin/vpp -c /etc/vpp/startup.conf,
vpp does not go into the background as a daemon; instead, the plugin load logs
are printed to the console.
This is the startup.conf file:

unix {

# nodaemon

log /var/log/vpp.log

full-coredump

cli-listen /run/vpp/cli.sock

}

session {

evt_qs_memfd_seg

enable

}

api-segment {

# gid vpp

prefix vpp1

}

socksvr { socket-name /var/run/vpp.sock }

plugins {

path /vpp/build-root/build-vpp_debug-native/vpp/lib/vpp_plugins

## Disable all plugins by default and then selectively enable specific plugins

plugin default { enable }

}

Was it somewhat different in v19.04 with respect to configuring it as a
background process? I am running this on Ubuntu.


Re: [vpp-dev] debugging corrupted frame arguments

2020-07-23 Thread Dave Barach via lists.fd.io
What is the invalid buffer index value? How many elements are in the frame? Is 
it always the same frame element which takes a lightning hit?

Without having all of the source code available and a reasonable way to repro 
the issue, it's going to be quite hard to help you find the culprit.

D.
-Original Message-
From: vpp-dev@lists.fd.io  On Behalf Of Christian Hopps
Sent: Thursday, July 23, 2020 1:10 PM
To: vpp-dev 
Cc: Christian Hopps 
Subject: [vpp-dev] debugging corrupted frame arguments

I have a very intermittent memory corruption occurring in the buffer indices 
passed in a node's frame (encryption node).

Basically one of the indices is clearly not a valid buffer index, and this is 
leading to a SIGSEGV when the code attempts to use the buffer.

I see that vlib_frame_ts are allocated from the main heap, so I'm wondering is 
there any heap/alloc debugging I can enable to help figure out what is 
corrupting this vlib_frame_t?

FWIW what seems weird is that I am validating the indices in 
vlib_put_frame_to_node() (I changed the validation code to actually resolve the 
buffer index), so apparently the vector is being corrupted between the node 
that creates and puts the vlib_frame_t, and the pending node being dispatched. 
The heap appears to be per CPU as well, this all makes things odd since the 
pending node should be running immediately after the node that put the valid 
frame there. So it's hard to imagine what could be corrupting this memory in 
between the put and the dispatch.

Thanks,
Chris.
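
As a rough illustration of the kind of check described above (purely a sketch: the
helper name is made up, vlib_frame_vector_args(), vlib_get_buffer() and ASSERT() are
the standard vlib/vppinfra facilities, and checking ref_count is just one plausible
sanity test on an allocated buffer):

#include <vlib/vlib.h>

/* Walk a frame's buffer indices and touch each buffer header, so that a wild
 * index faults (or trips the assert) close to the producer rather than later
 * in the consumer node. */
static_always_inline void
my_validate_frame_indices (vlib_main_t *vm, vlib_frame_t *f)
{
  u32 *from = vlib_frame_vector_args (f);
  for (u32 i = 0; i < f->n_vectors; i++)
    {
      vlib_buffer_t *b = vlib_get_buffer (vm, from[i]);
      /* An allocated buffer is expected to have a non-zero reference count. */
      ASSERT (b->ref_count != 0);
    }
}

Called just before and just after vlib_put_frame_to_node(), a check like this at
least narrows down on which side of the hand-off the indices go bad.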


Re: [vpp-dev] Regarding worker loop in VPP

2020-07-23 Thread Prashant Upadhyaya
Thanks Dave for the useful suggestions.

Regards
-Prashant

On Thu, Jul 23, 2020 at 4:24 PM Dave Barach (dbarach)  wrote:
>
> You could use the vlib_node_runtime_perf_counter callback hook to run code 
> between node dispatches, which SHOULD give adequate precision.
>
> Alternatively, spin up 1-N threads to run the shaper and driver TX path, and 
> nothing else. See also the handoff node.
>
> HTH... Dave
>
> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Prashant 
> Upadhyaya
> Sent: Thursday, July 23, 2020 2:39 AM
> To: vpp-dev@lists.fd.io
> Subject: [vpp-dev] Regarding worker loop in VPP
>
> Hi,
>
> I have implemented a shaper as a poll node in a VPP worker.
> The implementation is such that the shaper needs to send packets out which 
> are sitting/scheduled in a timer wheel with microsecond-granularity slots.
> The shaper must be invoked at a precise regular interval, say every 250 
> microseconds, where it will rotate the wheel and, if any timers expire, 
> send out the packets corresponding to those timers.
>
> Everything works well until the various other nodes start getting loaded and 
> disturb the invocation of the shaper poll node at precise intervals. This 
> leads to multiple slots expiring from the timer wheel at once, which in turn 
> sends out an uneven amount of data depending on how many slots expire in 
> the wheel.
>
> Given the nature of the while(1) loop operating in the worker and the graph 
> scheduling present there, is there any way I can have my poll node invoked at 
> a high-precision time boundary as an exception out of the main loop, do the 
> job there, and go back to what the worker loop was doing?
>
> Regards
> -Prashant
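
For readers following along, a rough sketch of the poll-node pattern being discussed
(all names hypothetical; the timer-wheel functions come from the vppinfra tw_timer
template, and wheel initialization and handling of expired handles are elided):

#include <vlib/vlib.h>
#include <vppinfra/tw_timer_2t_1w_2048sl.h>

/* Hypothetical state for the sketch. */
static tw_timer_wheel_2t_1w_2048sl_t shaper_wheel;
static f64 shaper_last_run;

static uword
shaper_node_fn (vlib_main_t *vm, vlib_node_runtime_t *node, vlib_frame_t *f)
{
  f64 now = vlib_time_now (vm);

  /* Only rotate the wheel once per 250 us tick. This is exactly the part that
     loses precision when other nodes in the same worker loop take a long time
     to dispatch, as described above. */
  if (now - shaper_last_run < 250e-6)
    return 0;
  shaper_last_run = now;

  /* Expire due slots; sending the packets that correspond to the expired
     timers is not shown. */
  tw_timer_expire_timers_2t_1w_2048sl (&shaper_wheel, now);
  return 0;
}

VLIB_REGISTER_NODE (shaper_node) = {
  .function = shaper_node_fn,
  .name = "shaper-poll",
  .type = VLIB_NODE_TYPE_INPUT,
  .state = VLIB_NODE_STATE_POLLING,
};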


Re: [vpp-dev] debugging corrupted frame arguments

2020-07-23 Thread Benoit Ganne (bganne) via lists.fd.io
Hi Christian,

You might want to try AddressSanitizer: 
https://fd.io/docs/vpp/master/troubleshooting/sanitizer.html
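
(For reference, an ASan-instrumented build is produced with an extra cmake knob;
the page above has the authoritative invocation, but it is roughly:

make build VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON

and then running the resulting binaries as usual.)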

Best
ben

> -Original Message-
> From: vpp-dev@lists.fd.io  On Behalf Of Christian
> Hopps
> Sent: jeudi 23 juillet 2020 19:10
> To: vpp-dev 
> Cc: Christian Hopps 
> Subject: [vpp-dev] debugging corrupted frame arguments
> 
> I have a very intermittent memory corruption occurring in the buffer
> indices passed in a node's frame (encryption node).
> 
> Basically one of the indices is clearly not a valid buffer index, and
> this is leading to a SIGSEGV when the code attempts to use the buffer.
> 
> I see that vlib_frame_ts are allocated from the main heap, so I'm
> wondering is there any heap/alloc debugging I can enable to help figure
> out what is corrupting this vlib_frame_t?
> 
> FWIW what seems weird is that I am validating the indices in
> vlib_put_frame_to_node() (I changed the validation code to actually
> resolve the buffer index), so apparently the vector is being corrupted
> between the node that creates and puts the vlib_frame_t, and the pending
> node being dispatched. The heap appears to be per CPU as well, this all
> makes things odd since the pending node should be running immediately
> after the node that put the valid frame there. So it's hard to imagine
> what could be corrupting this memory in between the put and the dispatch.
> 
> Thanks,
> Chris.


Re: [vpp-dev] Create big tables on huge-page

2020-07-23 Thread Honnappa Nagarahalli
Sure. We will create a couple of patches (in the areas we are analyzing 
currently) and we can decide from there.
Thanks,
Honnappa

From: Damjan Marion 
Sent: Thursday, July 23, 2020 12:17 PM
To: Honnappa Nagarahalli 
Cc: Lijian Zhang ; vpp-dev ; nd 
; Govindarajan Mohandoss ; 
Jieqiang Wang 
Subject: Re: [vpp-dev] Create big tables on huge-page



Hard to say without seeing the patch. Can you summarize what the changes will 
be in each particular .c file?



On 23 Jul 2020, at 18:00, Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> wrote:

Hi Damjan,
Thank you. Till your patch is ready, would you accept patches 
that would enable creating these tables in 1G huge pages as temporary solution?

Thanks,
Honnappa

From: Damjan Marion <dmar...@me.com>
Sent: Thursday, July 23, 2020 7:15 AM
To: Lijian Zhang <lijian.zh...@arm.com>
Cc: vpp-dev <vpp-dev@lists.fd.io>; nd <n...@arm.com>; Honnappa Nagarahalli 
<honnappa.nagaraha...@arm.com>; Govindarajan Mohandoss 
<govindarajan.mohand...@arm.com>; Jieqiang Wang <jieqiang.w...@arm.com>
Subject: Re: [vpp-dev] Create big tables on huge-page


I started working on patch which addresses most of this points, few weeks ago, 
but likely I will not have it completed for 20.09.
Even if it is completed, it is probably bad idea to merge it so late in the 
release process….

—
Damjan




On 23 Jul 2020, at 10:45, Lijian Zhang <lijian.zh...@arm.com> wrote:

Hi Maintainers,
From VPP source code, ip4-mtrie table is created on huge-page only when below 
parameters are set in configuration file.
While adjacency table is created on normal-page always.
  36 ip {
  37   heap-size 256M
  38   mtrie-hugetlb
  39 }
In the 10K flow testing, I configured 10K routing entries in ip4-mtrie and 10K 
entries in adjacency table.
By creating ip4-mtrie table on 1G huge-page with above parameters set and 
similarly create adjacency table on 1G huge-page, although I don’t observe 
obvious throughput performance improvement, but TLB misses are dramatically 
reduced.
Do you think configuration of 10K routing entries + 10K adjacency entries is a 
reasonable and possible config, or normally it would be 10K routing entries + 
only several adjacency entries?
Does it make sense to create adjacency table on huge-pages?
Another problem is although above assigned heap-size is 256M, but on 1G 
huge-page system, it seems to occupy a huge-page completely, other memory space 
within that huge-page seems will not be used by other tables.

Same as the bihash based tables, only 2M huge-page system is supported. To 
support creating bihash based tables on 1G huge-page system, each table will 
occupy a 1G huge-page completely, but that will waste a lot of memories.
Is it possible just like pmalloc module, reserving a big memory space on 1G/2M 
huge-pages in initialization stage, and then allocate memory pieces per 
requirement for Bihash, ip4-mtrie and adjacency tables, so that all tables 
could be created on huge-pages and will fully utilize the huge-pages.
I tried to create MAC table on 1G huge-page, and it does improve throughput 
performance.
vpp# show bihash
Name Actual Configured
GBP Endpoints - MAC/BD   1m 1m
b4s 64m 64m
b4s 64m 64m
in2out   10.12m 10.12m
in2out   10.12m 10.12m
ip4-dr   2m 2m
ip4-dr   2m 2m
ip6 FIB fwding table32m 32m
ip6 FIB non-fwding table32m 32m
ip6 mFIB table  32m 32m
l2fib mac table512m 512m
mapping_by_as4  64m 64m
out2in 128m 128m
out2in 128m 128m
out2in   10.12m 10.12m
out2in   10.12m 10.12m
pppoe link table 8m 8m
pppoe session table  8m 8m
static_mapping_by_external  64m 64m
static_mapping_by_local 64m 64m
stn addresses1m 1m
users  648k 648k
users  648k 648k
vip_index_per_port  64m 64m
vxlan4   1m 1m
vxlan4-gbp   1m 1m
Total 1.28g 1.28g

Thanks.
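
(Side note, in case it helps with reproducing this: before vpp can place anything on
1G pages, the host has to reserve them, typically via kernel boot parameters along
the lines of

default_hugepagesz=1G hugepagesz=1G hugepages=4

since 1G pages generally cannot be reserved at runtime the way 2M pages can with
vm.nr_hugepages. The exact count is deployment-specific.)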





Re: [vpp-dev] Create big tables on huge-page

2020-07-23 Thread Damjan Marion via lists.fd.io


Hard to say without seeing the patch. Can you summarize what the changes will 
be in each particular .c file?


> On 23 Jul 2020, at 18:00, Honnappa Nagarahalli  
> wrote:
> 
> Hi Damjan,
> Thank you. Till your patch is ready, would you accept patches 
> that would enable creating these tables in 1G huge pages as temporary 
> solution?
>
> Thanks,
> Honnappa
>
> From: Damjan Marion <dmar...@me.com>
> Sent: Thursday, July 23, 2020 7:15 AM
> To: Lijian Zhang <lijian.zh...@arm.com>
> Cc: vpp-dev <vpp-dev@lists.fd.io>; nd <n...@arm.com>; Honnappa Nagarahalli 
> <honnappa.nagaraha...@arm.com>; Govindarajan Mohandoss 
> <govindarajan.mohand...@arm.com>; Jieqiang Wang <jieqiang.w...@arm.com>
> Subject: Re: [vpp-dev] Create big tables on huge-page
>
>
> I started working on patch which addresses most of this points, few weeks 
> ago, but likely I will not have it completed for 20.09.
> Even if it is completed, it is probably bad idea to merge it so late in the 
> release process….
>
> — 
> Damjan
>
> 
> 
> On 23 Jul 2020, at 10:45, Lijian Zhang <lijian.zh...@arm.com> wrote:
>
> Hi Maintainers,
> From VPP source code, ip4-mtrie table is created on huge-page only when below 
> parameters are set in configuration file.
> While adjacency table is created on normal-page always.
>   36 ip {
>   37   heap-size 256M
>   38   mtrie-hugetlb
>   39 }
> In the 10K flow testing, I configured 10K routing entries in ip4-mtrie and 
> 10K entries in adjacency table.
> By creating ip4-mtrie table on 1G huge-page with above parameters set and 
> similarly create adjacency table on 1G huge-page, although I don’t observe 
> obvious throughput performance improvement, but TLB misses are dramatically 
> reduced.
> Do you think configuration of 10K routing entries + 10K adjacency entries is 
> a reasonable and possible config, or normally it would be 10K routing entries 
> + only several adjacency entries?
> Does it make sense to create adjacency table on huge-pages?
> Another problem is although above assigned heap-size is 256M, but on 1G 
> huge-page system, it seems to occupy a huge-page completely, other memory 
> space within that huge-page seems will not be used by other tables.
>
> Same as the bihash based tables, only 2M huge-page system is supported. To 
> support creating bihash based tables on 1G huge-page system, each table will 
> occupy a 1G huge-page completely, but that will waste a lot of memories.
> Is it possible just like pmalloc module, reserving a big memory space on 
> 1G/2M huge-pages in initialization stage, and then allocate memory pieces per 
> requirement for Bihash, ip4-mtrie and adjacency tables, so that all tables 
> could be created on huge-pages and will fully utilize the huge-pages.
> I tried to create MAC table on 1G huge-page, and it does improve throughput 
> performance.
> vpp# show bihash
> Name Actual Configured
> GBP Endpoints - MAC/BD   1m 1m
> b4s 64m 64m
> b4s 64m 64m
> in2out   10.12m 10.12m
> in2out   10.12m 10.12m
> ip4-dr   2m 2m
> ip4-dr   2m 2m
> ip6 FIB fwding table32m 32m
> ip6 FIB non-fwding table32m 32m
> ip6 mFIB table  32m 32m
> l2fib mac table512m 512m
> mapping_by_as4  64m 64m
> out2in 128m 128m
> out2in 128m 128m
> out2in   10.12m 10.12m
> out2in   10.12m 10.12m
> pppoe link table 8m 8m
> pppoe session table  8m 8m
> static_mapping_by_external  64m 64m
> static_mapping_by_local 64m 64m
> stn addresses1m 1m
> users  648k 648k
> users  648k 648k
> vip_index_per_port  64m 64m
> vxlan4   1m 1m
> vxlan4-gbp   1m 1m
> Total 1.28g 1.28g
>
> Thanks.
>
> 



[vpp-dev] debugging corrupted frame arguments

2020-07-23 Thread Christian Hopps
I have a very intermittent memory corruption occurring in the buffer indices 
passed in a node's frame (encryption node).

Basically one of the indices is clearly not a valid buffer index, and this is 
leading to a SIGSEGV when the code attempts to use the buffer.

I see that vlib_frame_ts are allocated from the main heap, so I'm wondering is 
there any heap/alloc debugging I can enable to help figure out what is 
corrupting this vlib_frame_t?

FWIW what seems weird is that I am validating the indices in 
vlib_put_frame_to_node() (I changed the validation code to actually resolve the 
buffer index), so apparently the vector is being corrupted between the node 
that creates and puts the vlib_frame_t, and the pending node being dispatched. 
The heap appears to be per CPU as well, this all makes things odd since the 
pending node should be running immediately after the node that put the valid 
frame there. So it's hard to imagine what could be corrupting this memory in 
between the put and the dispatch.

Thanks,
Chris.




Re: [vpp-dev] Create big tables on huge-page

2020-07-23 Thread Honnappa Nagarahalli
Hi Damjan,
Thank you. Till your patch is ready, would you accept patches 
that would enable creating these tables in 1G huge pages as temporary solution?

Thanks,
Honnappa

From: Damjan Marion 
Sent: Thursday, July 23, 2020 7:15 AM
To: Lijian Zhang 
Cc: vpp-dev ; nd ; Honnappa Nagarahalli 
; Govindarajan Mohandoss 
; Jieqiang Wang 
Subject: Re: [vpp-dev] Create big tables on huge-page


I started working on patch which addresses most of this points, few weeks ago, 
but likely I will not have it completed for 20.09.
Even if it is completed, it is probably bad idea to merge it so late in the 
release process….

—
Damjan



On 23 Jul 2020, at 10:45, Lijian Zhang <lijian.zh...@arm.com> wrote:

Hi Maintainers,
From VPP source code, ip4-mtrie table is created on huge-page only when below 
parameters are set in configuration file.
While adjacency table is created on normal-page always.
  36 ip {
  37   heap-size 256M
  38   mtrie-hugetlb
  39 }
In the 10K flow testing, I configured 10K routing entries in ip4-mtrie and 10K 
entries in adjacency table.
By creating ip4-mtrie table on 1G huge-page with above parameters set and 
similarly create adjacency table on 1G huge-page, although I don’t observe 
obvious throughput performance improvement, but TLB misses are dramatically 
reduced.
Do you think configuration of 10K routing entries + 10K adjacency entries is a 
reasonable and possible config, or normally it would be 10K routing entries + 
only several adjacency entries?
Does it make sense to create adjacency table on huge-pages?
Another problem is although above assigned heap-size is 256M, but on 1G 
huge-page system, it seems to occupy a huge-page completely, other memory space 
within that huge-page seems will not be used by other tables.

Same as the bihash based tables, only 2M huge-page system is supported. To 
support creating bihash based tables on 1G huge-page system, each table will 
occupy a 1G huge-page completely, but that will waste a lot of memories.
Is it possible just like pmalloc module, reserving a big memory space on 1G/2M 
huge-pages in initialization stage, and then allocate memory pieces per 
requirement for Bihash, ip4-mtrie and adjacency tables, so that all tables 
could be created on huge-pages and will fully utilize the huge-pages.
I tried to create MAC table on 1G huge-page, and it does improve throughput 
performance.
vpp# show bihash
Name Actual Configured
GBP Endpoints - MAC/BD   1m 1m
b4s 64m 64m
b4s 64m 64m
in2out   10.12m 10.12m
in2out   10.12m 10.12m
ip4-dr   2m 2m
ip4-dr   2m 2m
ip6 FIB fwding table32m 32m
ip6 FIB non-fwding table32m 32m
ip6 mFIB table  32m 32m
l2fib mac table512m 512m
mapping_by_as4  64m 64m
out2in 128m 128m
out2in 128m 128m
out2in   10.12m 10.12m
out2in   10.12m 10.12m
pppoe link table 8m 8m
pppoe session table  8m 8m
static_mapping_by_external  64m 64m
static_mapping_by_local 64m 64m
stn addresses1m 1m
users  648k 648k
users  648k 648k
vip_index_per_port  64m 64m
vxlan4   1m 1m
vxlan4-gbp   1m 1m
Total 1.28g 1.28g

Thanks.




Re: [vpp-dev] TCP timer race and another possible TCP issue

2020-07-23 Thread Florin Coras
Ah, I didn’t try running test.sh 80. The only difference in how I’m running the 
test is that I start vpp outside of start.sh straight from binaries. 

Regards,
Florin

> On Jul 23, 2020, at 8:22 AM, Ivan Shvedunov  wrote:
> 
> Well, I always run the same test, the difference being only
> "test.sh 80" for http_static (it's configured to be listening on that port)
> or just "test.sh" for the proxy. As far as I understand, you run the tests 
> without using the containers, does that include setting up netem like this 
> [1] ?
> 
> [1] https://github.com/ivan4th/vpp-tcp-test/blob/a3b02ec/start.sh#L34-L35 
> 
> On Thu, Jul 23, 2020 at 5:10 PM Florin Coras  > wrote:
> Hi Ivan, 
> 
> Updated [1] but I’m not seeing [3] after several test iterations. 
> 
> Probably the static server needs the same treatment as the proxy. Are you 
> running a slightly different test? All of the builtin apps have the potential 
> to crash vpp or leave the host stack in an unwanted state since they run 
> inline. 
> 
> Either way, to solve this, first step would be to get rid of error like, “no 
> http session for thread 0 session_index x”. Will eventually try to look into 
> it if nobody beats me to it. 
> 
> Regards,
> Florin
> 
>> On Jul 23, 2020, at 4:59 AM, Ivan Shvedunov > > wrote:
>> 
>> http_static produces some errors:
>> /usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session 
>> for thread 0 session_index 4124
>> /usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session 
>> for thread 0 session_index 4124
>> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp error 
>> state CLOSE_WAIT flags 0x02 SYN
>> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp error 
>> state CLOSE_WAIT flags 0x02 SYN
>> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13350 disp error 
>> state CLOSE_WAIT flags 0x02 SYN
>> 
>> also with multiple different CP connection states related to connections 
>> being closed and receiving SYN / SYN+ACK.
>> The release build crashes (did already happen before, so it's unrelated to 
>> any of the fixes [1]):
>> 
>> /usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!
>> /usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!
>> 
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828, elt=)
>> at /src/vpp/src/vppinfra/tw_timer_template.c:154
>> 154 /src/vpp/src/vppinfra/tw_timer_template.c: No such file or directory.
>> #0  0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828,
>> elt=) at /src/vpp/src/vppinfra/tw_timer_template.c:154
>> #1  tw_timer_stop_2t_1w_2048sl (
>> tw=0x7fffb0967728 , handle=7306)
>> at /src/vpp/src/vppinfra/tw_timer_template.c:374
>> ---Type  to continue, or q  to quit---
>> #2  0x7fffb076146f in http_static_server_session_timer_stop 
>> (hs=)
>> at /src/vpp/src/plugins/http_static/static_server.c:126
>> #3  http_static_server_rx_tx_callback (s=0x7fffb5e13a40, cf=CALLED_FROM_RX)
>> at /src/vpp/src/plugins/http_static/static_server.c:1026
>> #4  0x7fffb0760eb8 in http_static_server_rx_callback (
>> s=0x7fffb0967728 )
>> at /src/vpp/src/plugins/http_static/static_server.c:1037
>> #5  0x7774a9de in app_worker_builtin_rx (app_wrk=, 
>> s=0x7fffb5e13a40)
>> at /src/vpp/src/vnet/session/application_worker.c:485
>> #6  app_send_io_evt_rx (app_wrk=, s=0x7fffb5e13a40)
>> at /src/vpp/src/vnet/session/application_worker.c:691
>> #7  0x77713d9a in session_enqueue_notify_inline (s=0x7fffb5e13a40)
>> at /src/vpp/src/vnet/session/session.c:632
>> #8  0x77713fd1 in session_main_flush_enqueue_events 
>> (transport_proto=,
>> thread_index=0) at /src/vpp/src/vnet/session/session.c:736
>> #9  0x763960e9 in tcp46_established_inline (vm=0x75ddc6c0 
>> ,
>> node=, frame=, is_ip4=1) at 
>> /src/vpp/src/vnet/tcp/tcp_input.c:1558
>> #10 tcp4_established_node_fn_hsw (vm=0x75ddc6c0 , 
>> node=,
>> from_frame=0x7fffb5458480) at /src/vpp/src/vnet/tcp/tcp_input.c:1573
>> #11 0x75b5f509 in dispatch_node (vm=0x75ddc6c0 
>> , node=0x7fffb4baf400,
>> type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, 
>> frame=,
>> last_time_stamp=) at /src/vpp/src/vlib/main.c:1194
>> #12 dispatch_pending_node (vm=0x75ddc6c0 , 
>> pending_frame_index=,
>> last_time_stamp=) at /src/vpp/src/vlib/main.c:1353
>> #13 vlib_main_or_worker_loop (vm=, is_main=1) at 
>> /src/vpp/src/vlib/main.c:1848
>> #14 vlib_main_loop (vm=) at /src/vpp/src/vlib/main.c:1976
>> #15 0x75b5daf0 in vlib_main (vm=0x75ddc6c0 , 
>> input=0x7fffb4762fb0)
>> at /src/vpp/src/vlib/main.c:
>> #16 0x75bc2816 in thread0 (arg=140737318340288) at 
>> /src/vpp/src/vlib/unix/main.c:660
>> #17 0

Re: [vpp-dev] TCP timer race and another possible TCP issue

2020-07-23 Thread Ivan Shvedunov
Well, I always run the same test, the difference being only
"test.sh 80" for http_static (it's configured to be listening on that port)
or just "test.sh" for the proxy. As far as I understand, you run the tests
without using the containers, does that include setting up netem like this
[1] ?

[1] https://github.com/ivan4th/vpp-tcp-test/blob/a3b02ec/start.sh#L34-L35

On Thu, Jul 23, 2020 at 5:10 PM Florin Coras  wrote:

> Hi Ivan,
>
> Updated [1] but I’m not seeing [3] after several test iterations.
>
> Probably the static server needs the same treatment as the proxy. Are you
> running a slightly different test? All of the builtin apps have the
> potential to crash vpp or leave the host stack in an unwanted state since
> they run inline.
>
> Either way, to solve this, first step would be to get rid of error like,
> “no http session for thread 0 session_index x”. Will eventually try to look
> into it if nobody beats me to it.
>
> Regards,
> Florin
>
> On Jul 23, 2020, at 4:59 AM, Ivan Shvedunov  wrote:
>
> http_static produces some errors:
> /usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session
> for thread 0 session_index 4124
> /usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session
> for thread 0 session_index 4124
> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp
> error state CLOSE_WAIT flags 0x02 SYN
> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp
> error state CLOSE_WAIT flags 0x02 SYN
> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13350 disp
> error state CLOSE_WAIT flags 0x02 SYN
>
> also with multiple different CP connection states related to connections
> being closed and receiving SYN / SYN+ACK.
> The release build crashes (did already happen before, so it's unrelated to
> any of the fixes [1]):
>
> /usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!
> /usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828, elt= out>)
> at /src/vpp/src/vppinfra/tw_timer_template.c:154
> 154 /src/vpp/src/vppinfra/tw_timer_template.c: No such file or
> directory.
> #0  0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828,
> elt=) at /src/vpp/src/vppinfra/tw_timer_template.c:154
> #1  tw_timer_stop_2t_1w_2048sl (
> tw=0x7fffb0967728 , handle=7306)
> at /src/vpp/src/vppinfra/tw_timer_template.c:374
> ---Type  to continue, or q  to quit---
> #2  0x7fffb076146f in http_static_server_session_timer_stop
> (hs=)
> at /src/vpp/src/plugins/http_static/static_server.c:126
> #3  http_static_server_rx_tx_callback (s=0x7fffb5e13a40, cf=CALLED_FROM_RX)
> at /src/vpp/src/plugins/http_static/static_server.c:1026
> #4  0x7fffb0760eb8 in http_static_server_rx_callback (
> s=0x7fffb0967728 )
> at /src/vpp/src/plugins/http_static/static_server.c:1037
> #5  0x7774a9de in app_worker_builtin_rx (app_wrk=,
> s=0x7fffb5e13a40)
> at /src/vpp/src/vnet/session/application_worker.c:485
> #6  app_send_io_evt_rx (app_wrk=, s=0x7fffb5e13a40)
> at /src/vpp/src/vnet/session/application_worker.c:691
> #7  0x77713d9a in session_enqueue_notify_inline (s=0x7fffb5e13a40)
> at /src/vpp/src/vnet/session/session.c:632
> #8  0x77713fd1 in session_main_flush_enqueue_events
> (transport_proto=,
> thread_index=0) at /src/vpp/src/vnet/session/session.c:736
> #9  0x763960e9 in tcp46_established_inline (vm=0x75ddc6c0
> ,
> node=, frame=, is_ip4=1) at
> /src/vpp/src/vnet/tcp/tcp_input.c:1558
> #10 tcp4_established_node_fn_hsw (vm=0x75ddc6c0 ,
> node=,
> from_frame=0x7fffb5458480) at /src/vpp/src/vnet/tcp/tcp_input.c:1573
> #11 0x75b5f509 in dispatch_node (vm=0x75ddc6c0
> , node=0x7fffb4baf400,
> type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING,
> frame=,
> last_time_stamp=) at /src/vpp/src/vlib/main.c:1194
> #12 dispatch_pending_node (vm=0x75ddc6c0 ,
> pending_frame_index=,
> last_time_stamp=) at /src/vpp/src/vlib/main.c:1353
> #13 vlib_main_or_worker_loop (vm=, is_main=1) at
> /src/vpp/src/vlib/main.c:1848
> #14 vlib_main_loop (vm=) at /src/vpp/src/vlib/main.c:1976
> #15 0x75b5daf0 in vlib_main (vm=0x75ddc6c0 ,
> input=0x7fffb4762fb0)
> at /src/vpp/src/vlib/main.c:
> #16 0x75bc2816 in thread0 (arg=140737318340288) at
> /src/vpp/src/vlib/unix/main.c:660
> #17 0x74fa9ec4 in clib_calljmp () from
> /usr/lib/x86_64-linux-gnu/libvppinfra.so.20.09
> #18 0x7fffd8b0 in ?? ()
> #19 0x75bc27c8 in vlib_unix_main (argc=,
> argv=)
> at /src/vpp/src/vlib/unix/main.c:733
>
> [1]
> https://github.com/ivan4th/vpp-tcp-test/blob/master/logs/crash-release-http_static-timer_remove.log
>
> On Thu, Jul 23, 2020 at 2:47 PM Ivan Shvedunov via lists.fd.io  gmail@lists.fd.io> wrote:
>
>> Hi,
>> I've found a problem with the timer

Re: [vpp-dev] TCP timer race and another possible TCP issue

2020-07-23 Thread Florin Coras
Hi Ivan, 

Updated [1] but I’m not seeing [3] after several test iterations. 

Probably the static server needs the same treatment as the proxy. Are you 
running a slightly different test? All of the builtin apps have the potential 
to crash vpp or leave the host stack in an unwanted state since they run 
inline. 

Either way, to solve this, first step would be to get rid of error like, “no 
http session for thread 0 session_index x”. Will eventually try to look into it 
if nobody beats me to it. 

Regards,
Florin

> On Jul 23, 2020, at 4:59 AM, Ivan Shvedunov  wrote:
> 
> http_static produces some errors:
> /usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session for 
> thread 0 session_index 4124
> /usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session for 
> thread 0 session_index 4124
> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp error 
> state CLOSE_WAIT flags 0x02 SYN
> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp error 
> state CLOSE_WAIT flags 0x02 SYN
> /usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13350 disp error 
> state CLOSE_WAIT flags 0x02 SYN
> 
> also with multiple different CP connection states related to connections 
> being closed and receiving SYN / SYN+ACK.
> The release build crashes (did already happen before, so it's unrelated to 
> any of the fixes [1]):
> 
> /usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!
> /usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828, elt=)
> at /src/vpp/src/vppinfra/tw_timer_template.c:154
> 154 /src/vpp/src/vppinfra/tw_timer_template.c: No such file or directory.
> #0  0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828,
> elt=) at /src/vpp/src/vppinfra/tw_timer_template.c:154
> #1  tw_timer_stop_2t_1w_2048sl (
> tw=0x7fffb0967728 , handle=7306)
> at /src/vpp/src/vppinfra/tw_timer_template.c:374
> ---Type  to continue, or q  to quit---
> #2  0x7fffb076146f in http_static_server_session_timer_stop 
> (hs=)
> at /src/vpp/src/plugins/http_static/static_server.c:126
> #3  http_static_server_rx_tx_callback (s=0x7fffb5e13a40, cf=CALLED_FROM_RX)
> at /src/vpp/src/plugins/http_static/static_server.c:1026
> #4  0x7fffb0760eb8 in http_static_server_rx_callback (
> s=0x7fffb0967728 )
> at /src/vpp/src/plugins/http_static/static_server.c:1037
> #5  0x7774a9de in app_worker_builtin_rx (app_wrk=, 
> s=0x7fffb5e13a40)
> at /src/vpp/src/vnet/session/application_worker.c:485
> #6  app_send_io_evt_rx (app_wrk=, s=0x7fffb5e13a40)
> at /src/vpp/src/vnet/session/application_worker.c:691
> #7  0x77713d9a in session_enqueue_notify_inline (s=0x7fffb5e13a40)
> at /src/vpp/src/vnet/session/session.c:632
> #8  0x77713fd1 in session_main_flush_enqueue_events 
> (transport_proto=,
> thread_index=0) at /src/vpp/src/vnet/session/session.c:736
> #9  0x763960e9 in tcp46_established_inline (vm=0x75ddc6c0 
> ,
> node=, frame=, is_ip4=1) at 
> /src/vpp/src/vnet/tcp/tcp_input.c:1558
> #10 tcp4_established_node_fn_hsw (vm=0x75ddc6c0 , 
> node=,
> from_frame=0x7fffb5458480) at /src/vpp/src/vnet/tcp/tcp_input.c:1573
> #11 0x75b5f509 in dispatch_node (vm=0x75ddc6c0 
> , node=0x7fffb4baf400,
> type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, 
> frame=,
> last_time_stamp=) at /src/vpp/src/vlib/main.c:1194
> #12 dispatch_pending_node (vm=0x75ddc6c0 , 
> pending_frame_index=,
> last_time_stamp=) at /src/vpp/src/vlib/main.c:1353
> #13 vlib_main_or_worker_loop (vm=, is_main=1) at 
> /src/vpp/src/vlib/main.c:1848
> #14 vlib_main_loop (vm=) at /src/vpp/src/vlib/main.c:1976
> #15 0x75b5daf0 in vlib_main (vm=0x75ddc6c0 , 
> input=0x7fffb4762fb0)
> at /src/vpp/src/vlib/main.c:
> #16 0x75bc2816 in thread0 (arg=140737318340288) at 
> /src/vpp/src/vlib/unix/main.c:660
> #17 0x74fa9ec4 in clib_calljmp () from 
> /usr/lib/x86_64-linux-gnu/libvppinfra.so.20.09
> #18 0x7fffd8b0 in ?? ()
> #19 0x75bc27c8 in vlib_unix_main (argc=, 
> argv=)
> at /src/vpp/src/vlib/unix/main.c:733
> 
> [1] 
> https://github.com/ivan4th/vpp-tcp-test/blob/master/logs/crash-release-http_static-timer_remove.log
>  
> 
> On Thu, Jul 23, 2020 at 2:47 PM Ivan Shvedunov via lists.fd.io 
>   > wrote:
> Hi,
> I've found a problem with the timer fix and commented in Gerrit [1] 
> accordingly.
> Basically this change [2] makes the tcp_prepare_retransmit_segment() issue go 
> away for me.
> 
> Concerning the proxy example, I can no longer see the SVM FIFO crashes, but 
> when using debug build, VPP crashes with this error (fu

Re: Re: [vpp-dev] Do VPP NAT have Conntrack-like feature?

2020-07-23 Thread Date Huang
Hi Ole,

Thanks for your reply again!!

Let me work on it!!

Regards,
Date


From: otr...@employees.org
Sent: Thursday, July 23, 2020 9:48 PM
To: Date Huang
Cc: vpp-dev@lists.fd.io
Subject: Re: Re: [vpp-dev] Do VPP NAT have Conntrack-like feature?

Hi Date,

> Actually, I have some idea to patch for Conntrack-like feature.
> But I think I will need some guideline to submit a patch.
> Could you kindly share some code or docs of "Port overloading with NAT ED" to 
> me? and I can refer it.
>
> I found some gerrit guide of it.
> It will help me to submit some patch.

The port allocation algo is here:
https://git.fd.io/vpp/tree/src/plugins/nat/in2out_ed.c#n194

The static mapping code is here:
https://git.fd.io/vpp/tree/src/plugins/nat/nat44_cli.c#n995

Cheers,
Ole


Re: Re: [vpp-dev] Do VPP NAT have Conntrack-like feature?

2020-07-23 Thread Ole Troan
Hi Date,

> Actually, I have some idea to patch for Conntrack-like feature.
> But I think I will need some guideline to submit a patch.
> Could you kindly share some code or docs of "Port overloading with NAT ED" to 
> me? and I can refer it.
> 
> I found some gerrit guide of it.
> It will help me to submit some patch.

The port allocation algo is here:
https://git.fd.io/vpp/tree/src/plugins/nat/in2out_ed.c#n194

The static mapping code is here:
https://git.fd.io/vpp/tree/src/plugins/nat/nat44_cli.c#n995
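
(If it helps, the usual flow for pushing a change to VPP's Gerrit is roughly the
following; see the project contribution docs for the authoritative steps, and this
assumes the Gerrit account and ssh setup are already done:

git clone https://gerrit.fd.io/r/vpp
cd vpp
# make the change, then:
git commit -s
git review

where "git review" comes from the git-review package.)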

Cheers,
Ole


Re: [vpp-dev] Do VPP NAT have Conntrack-like feature?

2020-07-23 Thread Date Huang
Hi Ole

Great thanks for your reply.

Actually, I have some ideas for a patch for a Conntrack-like feature,
but I think I will need some guidance on submitting a patch.
Could you kindly share some code or docs on "Port overloading with NAT ED" 
that I can refer to?

I found the Gerrit guide; it will help me submit a patch.
Thanks again for your help!

Thanks a lot
Regards,
Date


From: otr...@employees.org
Sent: Thursday, July 23, 2020 6:43 PM
To: Date Huang
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Do VPP NAT have Conntrack-like feature?

Hi Date,


Port overloading was added to NAT ED for 20.05.
The static mapping with port overloading isn't yet there.
We would have to split that function from non-port overloading NAT and NAT ED.
Feel free to submit a patch!

Best regards,
Ole


> On 22 Jul 2020, at 18:34, Date Huang  wrote:
>
> Hi all,
>
> I'm using VPP to develop my program.
> Here is my scenario.
> I want to use VPP to build a NAT gateway with only one public IPv4 address, and all 
> traffic needs to use this public IP to reach the internet (for example: 1.1.1.1).
> I can only allow one port through the external firewall,
> so I can only use 1.1.1.1:443, for example.
>
>  is in LAN side.
>  is in internet side.
>
> If I setup a DNAT rule to map :1234 to 1.1.1.1:443, and  
> connected to :1234 via 1.1.1.1:443.
> I will need to re-use 1.1.1.1:443 for  connect to :4321.
> In Linux Kernel Netfilter, we can use "Conntrack" to save session, and keep 
> TCP connection.
> So I can remove DNAT rule and create a new rule to map  to  
> without losing  to  connection.
>
> I am trying to use VPP to speed up performance.
> I found that VPP deletes the related session when I remove the DNAT rule,
> so I cannot keep the session in VPP.
>
> Here is my startup.conf
>
> nat { endpoint-dependent }
>
> Here is my config in vppctl
>
> set interface mac address TenGigabitEthernet6/0/0 00:00:00:00:00:01
> set interface mac address TenGigabitEthernet6/0/1 00:00:00:00:00:02
> create bond mode round-robin
> bond add BondEthernet0 TenGigabitEthernet6/0/0
> bond add BondEthernet0 TenGigabitEthernet6/0/1
> create sub-interfaces BondEthernet0 10
> create sub-interfaces BondEthernet0 11
> set interface ip address BondEthernet0.10 192.168.1.1/16
> set interface ip address BondEthernet0.11 1.1.1.1/24
> ip route add 0.0.0.0/0 via 1.1.1.254 BondEthernet0.11
> set ip neighbor BondEthernet0.11 1.1.1.254 00:00:00:00:00:03
> set interface state BondEthernet0 up
> set interface state BondEthernet0.10 up
> set interface state BondEthernet0.11 up
> set interface state TenGigabitEthernet6/0/0 up
> set interface state TenGigabitEthernet6/0/1 up
> nat44 add interface address BondEthernet0.11
> set interface nat44 in BondEthernet0.10
> set interface nat44 out BondEthernet0.11
>
> nat44 add static mapping tcp local 10.0.0.2 1234 external 1.1.1.1 443
>
>
> Do you guys have some advice for me?
>
> Thanks a lot
> Regards,
> Date Huang
> 



Re: [vpp-dev] How to do Bond interface configuration as fail_over_mac=active in VPP

2020-07-23 Thread Venkatarao M
Thanks, Steven, for the quick reply. Is there any roadmap for supporting this?

Thanks
Venkatarao Malempati
From: Steven Luong (sluong) [mailto:slu...@cisco.com]
Sent: 20 July 2020 21:53
To: Venkatarao M; vpp-dev@lists.fd.io
Cc: praveenkumar A S; Lokesh Chimbili; Mahesh Sivapuram
Subject: Re: [vpp-dev] How to do Bond interface configuration as 
fail_over_mac=active in VPP


It is not supported.

From:  on behalf of Venkatarao M 

Date: Monday, July 20, 2020 at 8:35 AM
To: "vpp-dev@lists.fd.io" 
Cc: praveenkumar A S , Lokesh Chimbili 
, Mahesh Sivapuram 
Subject: [vpp-dev] How to do Bond interface configuration as 
fail_over_mac=active in VPP

Hi all,
We are trying bond interface configuration with VPP and are looking for the 
equivalent of fail_over_mac=active, as mentioned in the snippet below.
We observed that in VPP the default bond interface behaviour corresponds to 
fail_over_mac=none, and we couldn't find any CLI in VPP to configure a bond 
interface with fail_over_mac set to active.

Could you please let us know the configuration to achieve this?

Snip from the below link
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-using_channel_bonding

[Image: excerpt from the Red Hat channel bonding documentation describing the fail_over_mac option]
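
(In Linux bonding terms, the behaviour being asked about is the fail_over_mac=active
module option: on failover the bond takes on the MAC address of the currently active
slave instead of keeping a single fixed MAC. An illustrative ifcfg-style fragment,
per the Red Hat guide linked above:

BONDING_OPTS="mode=active-backup miimon=100 fail_over_mac=active"

As noted in the reply above, VPP's bond implementation currently has no equivalent
knob.)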
VPP Configuration
==
create bond mode active-backup id 100
bond add BondEthernet100 vpp_itf_1
bond add BondEthernet100 vpp_itf_2
ip6 table add 100
set interface ip6 table BondEthernet100 100
set interface state vpp_itf_1 up
set interface state vpp_itf_2 up
set interface state BondEthernet100 up
set interface reassembly BondEthernet100 on

Thanks
Venkatarao Malempati


Re: [vpp-dev] NAT port number selection problem, leads to wrong thread index for some sessions

2020-07-23 Thread Ole Troan
Thanks Elias. Merged.

Cheers,
Ole


> On 23 Jul 2020, at 12:24, Elias Rudberg  wrote:
> 
> Hello,
> Just a reminder about this, see below.
> Best regards,
> Elias
> 
>  Forwarded Message 
> From: Elias Rudberg 
> To: vpp-dev@lists.fd.io 
> Subject: [vpp-dev] NAT port number selection problem, leads to wrong 
> thread index for some sessions
> Date: Thu, 02 Jul 2020 20:43:12 +
> 
> Hello VPP experts,
> 
> There seems to be a problem with the way port number is selected for
> NAT: sometimes the selected port number leads to a different thread
> index being selected for out2in packets, making that session useless.
> This applies to the current master branch as well as the latest stable
> branches, I think.
> 
> Here is the story as I understand it, please correct me if I have
> misunderstood something. Each NAT thread has a range of port numbers
> that it can use, and when a new session is created a port number is
> picked at random from within that range. That happens when a in2out
> packet is NATed. Then later when a response comes as a out2in packet,
> VPP needs to make sure it is handled by the correct thread, the same
> thread that created the session.
> 
> The port number to use for a new session is selected in
> nat_alloc_addr_and_port_default() like this:
> 
> portnum = (port_per_thread * snat_thread_index) + snat_random_port(1,
> port_per_thread) + 1024;
> 
> where port_per_thread is the number of ports each thread is allowed to
> use, and snat_random_port() returns a random number in the given range.
> This means that the smallest possible portnum is 1025, that can happen
> when snat_thread_index is zero.
> 
> The corresponding calculation to get the thread index back based on the
> port number is essentially this:
> 
> (portnum - 1024) / port_per_thread
> 
> This works most of the time, but not always. It works in all cases
> except when snat_random_port() returns the largest possible value, in
> that case we end up with the wrong thread index. That means that out2in
> packets arriving for that session get handed off to another thread. The
> other thread is unaware of that session so all out2in packets are then
> dropped for that session.
> 
> Since each thread has thousands of port numbers to choose from and the
> problem only appears for one particular choice, only a small fraction
> of all sessions are affected by this. In my tests there was 8 NAT
> threads, then the port_per_thread value was about 8000 so that the
> probability was about 1/8000 or roughly 0.0125% of all sessions that
> failed.
> 
> The test I used was simply to try many separate ping commands with the
> "-c 1" option, all should give the normal result "1 packets
> transmitted, 1 received, 0% packet loss" but due to this problem some
> of the pings fail. Note that it needs to be separate ping commands so
> that VPP creates a new session for each of them. Provided that you test
> a large enough number of sessions, it is straightforward to reproduce
> the problem.
> 
> It could be fixed in different ways, one way is to simply shift the
> arguments to snat_random_port() down by one:
> snat_random_port(1, port_per_thread)
> -->
> snat_random_port(0, port_per_thread-1)
> 
> I pushed such a change to gerrit, here: 
> https://gerrit.fd.io/r/c/vpp/+/27786
> 
> The smallest port number used then becomes 1024 instead of 1025 as it
> has been so far, I suppose that should be OK since it is the "well-
> known ports" from 0 to 1023 that should be avoided, port 1024 should be
> okay to use. What do you think, does it make sense to fix it in this
> way?
> 
> Best regards,
> Elias
> 
> 
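
For illustration, a worked example of the off-by-one (taking port_per_thread = 8000,
roughly the value from the 8-thread test described above): for snat_thread_index = 0
the formula yields portnum in [1025, 9024]. When snat_random_port(1, port_per_thread)
returns its maximum value of 8000, portnum = 0 * 8000 + 8000 + 1024 = 9024, and the
reverse mapping gives (9024 - 1024) / 8000 = 1, i.e. thread 1 instead of thread 0.
With the range shifted to snat_random_port(0, port_per_thread - 1), the largest result
for thread 0 becomes 9023, and (9023 - 1024) / 8000 = 0 as expected, while the smallest
possible port becomes 1024.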



Re: [vpp-dev] Create big tables on huge-page

2020-07-23 Thread Damjan Marion via lists.fd.io

I started working on a patch which addresses most of these points a few weeks ago, 
but likely I will not have it completed for 20.09.
Even if it is completed, it is probably a bad idea to merge it so late in the 
release process….

— 
Damjan


> On 23 Jul 2020, at 10:45, Lijian Zhang  wrote:
> 
> Hi Maintainers,
> From VPP source code, ip4-mtrie table is created on huge-page only when below 
> parameters are set in configuration file.
> While adjacency table is created on normal-page always.
>   36 ip {
>   37   heap-size 256M
>   38   mtrie-hugetlb
>   39 }
> In the 10K flow testing, I configured 10K routing entries in ip4-mtrie and 
> 10K entries in adjacency table.
> By creating ip4-mtrie table on 1G huge-page with above parameters set and 
> similarly create adjacency table on 1G huge-page, although I don’t observe 
> obvious throughput performance improvement, but TLB misses are dramatically 
> reduced.
> Do you think configuration of 10K routing entries + 10K adjacency entries is 
> a reasonable and possible config, or normally it would be 10K routing entries 
> + only several adjacency entries?
> Does it make sense to create adjacency table on huge-pages?
> Another problem is although above assigned heap-size is 256M, but on 1G 
> huge-page system, it seems to occupy a huge-page completely, other memory 
> space within that huge-page seems will not be used by other tables.
>
> Same as the bihash based tables, only 2M huge-page system is supported. To 
> support creating bihash based tables on 1G huge-page system, each table will 
> occupy a 1G huge-page completely, but that will waste a lot of memories.
> Is it possible just like pmalloc module, reserving a big memory space on 
> 1G/2M huge-pages in initialization stage, and then allocate memory pieces per 
> requirement for Bihash, ip4-mtrie and adjacency tables, so that all tables 
> could be created on huge-pages and will fully utilize the huge-pages.
> I tried to create MAC table on 1G huge-page, and it does improve throughput 
> performance.
> vpp# show bihash
> Name Actual Configured
> GBP Endpoints - MAC/BD   1m 1m
> b4s 64m 64m
> b4s 64m 64m
> in2out   10.12m 10.12m
> in2out   10.12m 10.12m
> ip4-dr   2m 2m
> ip4-dr   2m 2m
> ip6 FIB fwding table32m 32m
> ip6 FIB non-fwding table32m 32m
> ip6 mFIB table  32m 32m
> l2fib mac table512m 512m
> mapping_by_as4  64m 64m
> out2in 128m 128m
> out2in 128m 128m
> out2in   10.12m 10.12m
> out2in   10.12m 10.12m
> pppoe link table 8m 8m
> pppoe session table  8m 8m
> static_mapping_by_external  64m 64m
> static_mapping_by_local 64m 64m
> stn addresses1m 1m
> users  648k 648k
> users  648k 648k
> vip_index_per_port  64m 64m
> vxlan4   1m 1m
> vxlan4-gbp   1m 1m
> Total 1.28g 1.28g
>
> Thanks.
> 



Re: [vpp-dev] TCP timer race and another possible TCP issue

2020-07-23 Thread Ivan Shvedunov
http_static produces some errors:
/usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session
for thread 0 session_index 4124
/usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session
for thread 0 session_index 4124
/usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp error
state CLOSE_WAIT flags 0x02 SYN
/usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13658 disp error
state CLOSE_WAIT flags 0x02 SYN
/usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp conn 13350 disp error
state CLOSE_WAIT flags 0x02 SYN

also with multiple different CP connection states related to connections
being closed and receiving SYN / SYN+ACK.
The release build crashes (did already happen before, so it's unrelated to
any of the fixes [1]):

/usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!
/usr/bin/vpp[39]: state_sent_ok:973: BUG: couldn't send response header!

Program received signal SIGSEGV, Segmentation fault.
0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828, elt=)
at /src/vpp/src/vppinfra/tw_timer_template.c:154
154 /src/vpp/src/vppinfra/tw_timer_template.c: No such file or
directory.
#0  0x74fdcfb9 in timer_remove (pool=0x7fffb56b6828,
elt=) at /src/vpp/src/vppinfra/tw_timer_template.c:154
#1  tw_timer_stop_2t_1w_2048sl (
tw=0x7fffb0967728 , handle=7306)
at /src/vpp/src/vppinfra/tw_timer_template.c:374
---Type  to continue, or q  to quit---
#2  0x7fffb076146f in http_static_server_session_timer_stop
(hs=)
at /src/vpp/src/plugins/http_static/static_server.c:126
#3  http_static_server_rx_tx_callback (s=0x7fffb5e13a40, cf=CALLED_FROM_RX)
at /src/vpp/src/plugins/http_static/static_server.c:1026
#4  0x7fffb0760eb8 in http_static_server_rx_callback (
s=0x7fffb0967728 )
at /src/vpp/src/plugins/http_static/static_server.c:1037
#5  0x7774a9de in app_worker_builtin_rx (app_wrk=,
s=0x7fffb5e13a40)
at /src/vpp/src/vnet/session/application_worker.c:485
#6  app_send_io_evt_rx (app_wrk=, s=0x7fffb5e13a40)
at /src/vpp/src/vnet/session/application_worker.c:691
#7  0x77713d9a in session_enqueue_notify_inline (s=0x7fffb5e13a40)
at /src/vpp/src/vnet/session/session.c:632
#8  0x77713fd1 in session_main_flush_enqueue_events
(transport_proto=,
thread_index=0) at /src/vpp/src/vnet/session/session.c:736
#9  0x763960e9 in tcp46_established_inline (vm=0x75ddc6c0
,
node=, frame=, is_ip4=1) at
/src/vpp/src/vnet/tcp/tcp_input.c:1558
#10 tcp4_established_node_fn_hsw (vm=0x75ddc6c0 ,
node=,
from_frame=0x7fffb5458480) at /src/vpp/src/vnet/tcp/tcp_input.c:1573
#11 0x75b5f509 in dispatch_node (vm=0x75ddc6c0
, node=0x7fffb4baf400,
type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING,
frame=,
last_time_stamp=) at /src/vpp/src/vlib/main.c:1194
#12 dispatch_pending_node (vm=0x75ddc6c0 ,
pending_frame_index=,
last_time_stamp=) at /src/vpp/src/vlib/main.c:1353
#13 vlib_main_or_worker_loop (vm=, is_main=1) at
/src/vpp/src/vlib/main.c:1848
#14 vlib_main_loop (vm=) at /src/vpp/src/vlib/main.c:1976
#15 0x75b5daf0 in vlib_main (vm=0x75ddc6c0 ,
input=0x7fffb4762fb0)
at /src/vpp/src/vlib/main.c:
#16 0x75bc2816 in thread0 (arg=140737318340288) at
/src/vpp/src/vlib/unix/main.c:660
#17 0x74fa9ec4 in clib_calljmp () from
/usr/lib/x86_64-linux-gnu/libvppinfra.so.20.09
#18 0x7fffd8b0 in ?? ()
#19 0x75bc27c8 in vlib_unix_main (argc=,
argv=)
at /src/vpp/src/vlib/unix/main.c:733

[1]
https://github.com/ivan4th/vpp-tcp-test/blob/master/logs/crash-release-http_static-timer_remove.log

On Thu, Jul 23, 2020 at 2:47 PM Ivan Shvedunov via lists.fd.io  wrote:

> Hi,
> I've found a problem with the timer fix and commented in Gerrit [1]
> accordingly.
> Basically this change [2] makes the tcp_prepare_retransmit_segment() issue
> go away for me.
>
> Concerning the proxy example, I can no longer see the SVM FIFO crashes,
> but when using debug build, VPP crashes with this error (full log [3])
> during my test:
> /usr/bin/vpp[39]: /src/vpp/src/vnet/tcp/tcp_input.c:2857
> (tcp46_input_inline) assertion `tcp_lookup_is_valid (tc1, b[1],
> tcp_buffer_hdr (b[1]))' fails
>
> When using release build, it produces a lot of messages like this instead:
> /usr/bin/vpp[39]: tcp_input_dispatch_buffer:2812: tcp conn 15168 disp
> error state CLOSE_WAIT flags 0x02 SYN
> /usr/bin/vpp[39]: tcp_input_dispatch_buffer:2812: tcp conn 9417 disp error
> state FIN_WAIT_2 flags 0x12 SYN ACK
> /usr/bin/vpp[39]: tcp_input_dispatch_buffer:2812: tcp conn 10703 disp
> error state TIME_WAIT flags 0x12 SYN ACK
>
> and also
>
> /usr/bin/vpp[39]: active_open_connected_callback:439: connection 85557
> failed!
>
> [1] https://gerrit.fd.io/r/c/vpp/+/27952/4/src/vnet/tcp/tcp_timer.h#39
> [2]
> https://github.com/travelping/vpp/commit/04512323f311ceebfda351672372033b567d37ca
> [3]
> https://github.com/ivan4th/vpp-tcp-test/blob/maste

Re: [vpp-dev] TCP timer race and another possible TCP issue

2020-07-23 Thread Ivan Shvedunov
Hi,
I've found a problem with the timer fix and commented in Gerrit [1]
accordingly.
Basically this change [2] makes the tcp_prepare_retransmit_segment() issue
go away for me.

Concerning the proxy example, I can no longer see the SVM FIFO crashes, but
when using debug build, VPP crashes with this error (full log [3]) during
my test:
/usr/bin/vpp[39]: /src/vpp/src/vnet/tcp/tcp_input.c:2857
(tcp46_input_inline) assertion `tcp_lookup_is_valid (tc1, b[1],
tcp_buffer_hdr (b[1]))' fails

When using release build, it produces a lot of messages like this instead:
/usr/bin/vpp[39]: tcp_input_dispatch_buffer:2812: tcp conn 15168 disp error
state CLOSE_WAIT flags 0x02 SYN
/usr/bin/vpp[39]: tcp_input_dispatch_buffer:2812: tcp conn 9417 disp error
state FIN_WAIT_2 flags 0x12 SYN ACK
/usr/bin/vpp[39]: tcp_input_dispatch_buffer:2812: tcp conn 10703 disp error
state TIME_WAIT flags 0x12 SYN ACK

and also

/usr/bin/vpp[39]: active_open_connected_callback:439: connection 85557
failed!

[1] https://gerrit.fd.io/r/c/vpp/+/27952/4/src/vnet/tcp/tcp_timer.h#39
[2]
https://github.com/travelping/vpp/commit/04512323f311ceebfda351672372033b567d37ca
[3]
https://github.com/ivan4th/vpp-tcp-test/blob/master/logs/crash-debug-proxy-tcp_lookup_is_valid.log#L71

I will look into src/vcl/test/test_vcl.py to see if I can reproduce
something like my test there, thanks!
And waiting for Dave's input concerning the CSIT part, too, of course.


On Thu, Jul 23, 2020 at 5:22 AM Florin Coras  wrote:

> Hi Ivan,
>
> Thanks for the test. After modifying it a bit to run straight from
> binaries, I managed to repro the issue. As expected, the proxy is not
> cleaning up the sessions correctly (example apps do run out of sync ..).
> Here’s a quick patch that solves some of the obvious issues [1] (note that
> it’s chained with gerrit 27952). I didn’t do too much testing, so let me
> know if you hit some other problems. As far as I can tell, 27952 is needed.
>
> As for the CI, I guess there are two types of tests we might want (cc-ing
> Dave since he has experience with this):
> - functional test that could live as part of “make test” infra. The host
> stack already has some functional integration tests, i.e., the vcl tests in
> src/vcl/test/test_vcl.py (quic, tls, tcp also have some). We could do
> something   similar for the proxy app, but the tests need to be lightweight
> as they’re run as part of the verify jobs
> - CSIT scale/performance tests. We could use something like your scripts
> to test the proxy but also ld_preload + nginx and other applications. Dave
> should have more opinions here :-)
>
> Regards,
> Florin
>
> [1] https://gerrit.fd.io/r/c/vpp/+/28041
>
> On Jul 22, 2020, at 1:18 PM, Ivan Shvedunov  wrote:
>
> Concerning the CI: I'd be glad to add that test to "make test", but not
> sure how to approach it. The test is not about containers but more about
> using network namespaces and some tools like wrk to create a lot of TCP
> connections to do some "stress testing" of VPP host stack (and as it was
> noted, it fails not only on the proxy example, but also on http_static
> plugin). It's probably doable w/o any external tooling at all, and even
> without the network namespaces either, using only VPP's own TCP stack, but
> that is probably rather hard. Could you suggest some ideas how it could be
> added to "make test"? Should I add a `test_py` under `tests/` that
> creates host interfaces in VPP and uses these via OS networking instead of
> the packet generator? As far as I can see there's something like that in
> srv6-mobile plugin [1].
>
> [1]
> https://github.com/travelping/vpp/blob/feature/2005/upf/src/plugins/srv6-mobile/extra/runner.py#L125
>
> On Wed, Jul 22, 2020 at 8:25 PM Florin Coras 
> wrote:
>
>> I missed the point about the CI in my other reply. If we can somehow
>> integrate some container based tests into the “make test” infra, I wouldn’t
>> mind at all! :-)
>>
>> Regards,
>> Florin
>>
>> On Jul 22, 2020, at 4:17 AM, Ivan Shvedunov  wrote:
>>
>> Hi,
>> sadly the patch apparently didn't work. It should have worked but for
>> some reason it didn't ...
>>
>> On the bright side, I've made a test case [1] using fresh upstream VPP
>> code with no UPF that reproduces the issues I mentioned, including both the
>> timer and the TCP retransmit ones, along with some other possible problems in
>> the http_static plugin and the proxy example, using nginx (with the proxy) and
>> wrk.
>>
>> It is docker-based, but the main scripts (start.sh and test.sh) can be
>> used without Docker, too.
>> I've used our own Dockerfiles to build the images, but I'm not sure if
>> that makes any difference.
>> I've added some log files resulting from the runs that crashed in
>> different places. For me, the tests crash on each run, but in different
>> places.
>>
>> The TCP retransmit problem happens with http_static when using the debug
>> build. When using release build, some unrelated crash in timer_remove()
>> happens instead.
>> The SVM FIFO crash happens wh

[vpp-dev] FYI/RFC: artifact naming change for -rc1 and -rc2 builds, and for the per-patch post-major-release builds

2020-07-23 Thread Andrew Yourtchenko
Hi all,

This is to make you aware of a patch that will, when committed, tweak
the versioning/naming for VPP builds on -rc1 and -rc2 tags (the version
string will also show "-0~gXX"), and also for the per-patch builds on
throttle branches after the major release but before the .1 release, so
that they read as if they were tagged "vXX.YY.0" as opposed to "vXX.YY".

This is to address the discrepancy that happens when sorting these
artifacts; the details are in
https://wiki.fd.io/view/VPP/ArtifactVersioning

The change is:

https://gerrit.fd.io/r/c/vpp/+/27782

It has already had some eyes on it; this is just to make the broader
community aware, as well as to get more eyes on it.

I would like to get it in within about a week, unless there are
suggestions on solving the issue more elegantly/gently.

NB: it still does not resolve the version sort issue that would
surface if "viewing" the version strings across all the builds, but
because we use separate repositories for per-patch/release-candidate
builds and the release builds, we do not see the issue. Fixing it
would require bigger changes to the version string.

Thanks!


--a
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#17055): https://lists.fd.io/g/vpp-dev/message/17055
Mute This Topic: https://lists.fd.io/mt/75743748/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Regarding worker loop in VPP

2020-07-23 Thread Dave Barach via lists.fd.io
You could use the vlib_node_runtime_perf_counter callback hook to run code 
between node dispatches, which SHOULD give adequate precision. 

Alternatively, spin up 1-N threads to run the shaper and driver TX path, and 
nothing else. See also the handoff node. 

HTH... Dave 

-Original Message-
From: vpp-dev@lists.fd.io  On Behalf Of Prashant Upadhyaya
Sent: Thursday, July 23, 2020 2:39 AM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] Regarding worker loop in VPP

Hi,

I have implemented a shaper as a poll node in a VPP worker.
The implementation is such that the shaper needs to send out packets which are
sitting/scheduled in a timer wheel with microsecond-granularity slots.
The shaper must be invoked at a precise regular interval, say every 250
microseconds, at which point it rotates the wheel and, if any timers expire, sends
out the packets corresponding to those timers.

Everything works well until the various other nodes start getting loaded and
disturb the invocation of the shaper poll node at precise intervals. This
leads to multiple slots expiring from the timer wheel at once, which in turn
leads to an uneven amount of data being sent out, depending on how many slots
expire in the wheel.
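
For reference, a minimal sketch of the kind of poll node I mean follows (this
assumes the 2t_1w_2048sl timer-wheel template from vppinfra and a 250 us tick;
names such as shaper_main and shaper_send_expired are placeholders, not the
actual code):

  /* Sketch only: a polling input node that advances a timer wheel. */
  #include <vlib/vlib.h>
  #include <vppinfra/tw_timer_2t_1w_2048sl.h>

  typedef struct
  {
    /* assumed to be initialized elsewhere with a 250e-6 s tick */
    tw_timer_wheel_2t_1w_2048sl_t wheel;
  } shaper_main_t;

  static shaper_main_t shaper_main;

  static uword
  shaper_node_fn (vlib_main_t * vm, vlib_node_runtime_t * node,
                  vlib_frame_t * frame)
  {
    shaper_main_t *sm = &shaper_main;

    /* Advance the wheel to "now". If the node was dispatched late because
     * other nodes hogged the loop, several slots expire at once, which is
     * exactly the burstiness described above. */
    u32 *expired =
      tw_timer_expire_timers_2t_1w_2048sl (&sm->wheel, vlib_time_now (vm));

    /* send out the packets attached to the expired timer handles
     * (hypothetical helper) */
    /* shaper_send_expired (vm, expired); */

    return vec_len (expired);
  }

  VLIB_REGISTER_NODE (shaper_node) = {
    .function = shaper_node_fn,
    .name = "shaper-poll",
    .type = VLIB_NODE_TYPE_INPUT,
    .state = VLIB_NODE_STATE_POLLING,
  };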

Given the nature of the while(1) loop operating in the worker and the graph
scheduling present there, is there any way I can have my poll node invoked at a
high-precision time boundary as an exception out of the main loop, do the job
there, and then go back to what the worker loop was doing?

Regards
-Prashant
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#17054): https://lists.fd.io/g/vpp-dev/message/17054
Mute This Topic: https://lists.fd.io/mt/75740975/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] Do VPP NAT have Conntrack-like feature?

2020-07-23 Thread Ole Troan
Hi Date,


Port overloading was added to NAT ED for 20.05.
The static mapping with port overloading isn't yet there.
We would have to split that function from non-port overloading NAT and NAT ED.
Feel free to submit a patch!

Best regards,
Ole


> On 22 Jul 2020, at 18:34, Date Huang  wrote:
> 
> Hi all,
> 
> I'm using VPP to develop my program.
> Here is my scenario.
> I want to use VPP to build a NAT gateway with only one public IPv4 address, and all
> traffic needs to use this public IP to reach the internet (for example: 1.1.1.1).
> I can only allow one port through the external firewall,
> so I can only use 1.1.1.1:443, for example.
> 
>  is in LAN side.
>  is in internet side.
> 
> If I set up a DNAT rule to map :1234 to 1.1.1.1:443, and  connected
> to :1234 via 1.1.1.1:443,
> I will need to re-use 1.1.1.1:443 for  to connect to :4321.
> In Linux kernel Netfilter, we can use conntrack to save the session and keep the
> TCP connection alive.
> So I can remove the DNAT rule and create a new rule to map  to 
> without losing the  to  connection.
> 
> I am trying to use VPP to speed up performance.
> I found that VPP deletes the related session when I remove the DNAT rule,
> so I cannot keep the session in VPP.
> 
> Here is my startup.conf
> 
> nat { endpoint-dependent }
> 
> Here is my config in vppctl
> 
> set interface mac address TenGigabitEthernet6/0/0 00:00:00:00:00:01
> set interface mac address TenGigabitEthernet6/0/1 00:00:00:00:00:02
> create bond mode round-robin
> bond add BondEthernet0 TenGigabitEthernet6/0/0
> bond add BondEthernet0 TenGigabitEthernet6/0/1
> create sub-interfaces BondEthernet0 10
> create sub-interfaces BondEthernet0 11
> set interface ip address BondEthernet0.10 192.168.1.1/16
> set interface ip address BondEthernet0.11 1.1.1.1/24
> ip route add 0.0.0.0/0 via 1.1.1.254 BondEthernet0.11
> set ip neighbor BondEthernet0.11 1.1.1.254 00:00:00:00:00:03
> set interface state BondEthernet0 up
> set interface state BondEthernet0.10 up
> set interface state BondEthernet0.11 up
> set interface state TenGigabitEthernet6/0/0 up
> set interface state TenGigabitEthernet6/0/1 up
> nat44 add interface address BondEthernet0.11
> set interface nat44 in BondEthernet0.10
> set interface nat44 out BondEthernet0.11
> 
> nat44 add static mapping tcp local 10.0.0.2 1234 external 1.1.1.1 443
> 
> 
> Do you guys have some advice for me?
> 
> Thanks a lot
> Regards,
> Date Huang
> 

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#17053): https://lists.fd.io/g/vpp-dev/message/17053
Mute This Topic: https://lists.fd.io/mt/75728368/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] NAT port number selection problem, leads to wrong thread index for some sessions

2020-07-23 Thread Elias Rudberg
Hello,
Just a reminder about this, see below.
Best regards,
Elias

 Forwarded Message 
From: Elias Rudberg 
To: vpp-dev@lists.fd.io 
Subject: [vpp-dev] NAT port number selection problem, leads to wrong 
thread index for some sessions
Date: Thu, 02 Jul 2020 20:43:12 +

Hello VPP experts,

There seems to be a problem with the way the port number is selected for
NAT: sometimes the selected port number leads to a different thread
index being selected for out2in packets, making that session useless.
This applies to the current master branch as well as the latest stable
branches, I think.

Here is the story as I understand it; please correct me if I have
misunderstood something. Each NAT thread has a range of port numbers
that it can use, and when a new session is created a port number is
picked at random from within that range. That happens when an in2out
packet is NATed. Then, later, when a response comes as an out2in packet,
VPP needs to make sure it is handled by the correct thread, the same
thread that created the session.

The port number to use for a new session is selected in
nat_alloc_addr_and_port_default() like this:

portnum = (port_per_thread * snat_thread_index) + snat_random_port(1, port_per_thread) + 1024;

where port_per_thread is the number of ports each thread is allowed to
use, and snat_random_port() returns a random number in the given range.
This means that the smallest possible portnum is 1025, which can happen
when snat_thread_index is zero.

The corresponding calculation to get the thread index back based on the
port number is essentially this:

(portnum - 1024) / port_per_thread

This works most of the time, but not always. It works in all cases
except when snat_random_port() returns the largest possible value; in
that case we end up with the wrong thread index. That means that out2in
packets arriving for that session get handed off to another thread. The
other thread is unaware of that session, so all out2in packets for that
session are then dropped.

Since each thread has thousands of port numbers to choose from and the
problem only appears for one particular choice, only a small fraction
of all sessions are affected by this. In my tests there were 8 NAT
threads and the port_per_thread value was about 8000, so roughly 1/8000,
or about 0.0125%, of all sessions failed.
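
To make the boundary case concrete, here is a small standalone C sketch of the
arithmetic (illustrative only, not VPP code; the numbers follow the 8-thread,
port_per_thread = 8000 setup mentioned above):

  #include <stdio.h>

  int main (void)
  {
    unsigned port_per_thread = 8000;
    unsigned snat_thread_index = 0;

    /* worst case today: snat_random_port (1, port_per_thread) returns
     * port_per_thread */
    unsigned portnum = port_per_thread * snat_thread_index
                       + port_per_thread + 1024;             /* = 9024 */
    unsigned recovered = (portnum - 1024) / port_per_thread; /* = 1, wrong */

    /* with the proposed fix, the largest random value is
     * port_per_thread - 1 */
    unsigned portnum_fixed = port_per_thread * snat_thread_index
                             + (port_per_thread - 1) + 1024;             /* = 9023 */
    unsigned recovered_fixed = (portnum_fixed - 1024) / port_per_thread; /* = 0 */

    printf ("old: port %u -> thread %u, fixed: port %u -> thread %u\n",
            portnum, recovered, portnum_fixed, recovered_fixed);
    return 0;
  }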

The test I used was simply to try many separate ping commands with the
"-c 1" option; all should give the normal result "1 packets
transmitted, 1 received, 0% packet loss", but due to this problem some
of the pings fail. Note that they need to be separate ping commands so
that VPP creates a new session for each of them. Provided that you test
a large enough number of sessions, it is straightforward to reproduce
the problem.

It could be fixed in different ways; one way is to simply shift the
arguments to snat_random_port() down by one:
snat_random_port(1, port_per_thread)
-->
snat_random_port(0, port_per_thread-1)

I pushed such a change to gerrit, here: 
https://gerrit.fd.io/r/c/vpp/+/27786

The smallest port number used then becomes 1024 instead of 1025 as it
has been so far. I suppose that should be OK, since it is the "well-known
ports" from 0 to 1023 that should be avoided; port 1024 should be
okay to use. What do you think, does it make sense to fix it in this
way?

Best regards,
Elias

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#17052): https://lists.fd.io/g/vpp-dev/message/17052
Mute This Topic: https://lists.fd.io/mt/75267169/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] Create big tables on huge-page

2020-07-23 Thread Lijian Zhang
Hi Maintainers,
From the VPP source code, the ip4-mtrie table is created on huge pages only when
the parameters below are set in the configuration file, while the adjacency table
is always created on normal pages.
  36 ip {
  37   heap-size 256M
  38   mtrie-hugetlb
  39 }
In the 10K-flow testing, I configured 10K routing entries in the ip4-mtrie and
10K entries in the adjacency table.
When I create the ip4-mtrie table on 1G huge pages with the above parameters set,
and similarly create the adjacency table on 1G huge pages, I don't observe an
obvious throughput improvement, but TLB misses are dramatically reduced.
Do you think a configuration of 10K routing entries + 10K adjacency entries is
reasonable and realistic, or would it normally be 10K routing entries + only a
few adjacency entries?
Does it make sense to create the adjacency table on huge pages?
Another problem: although the heap-size assigned above is 256M, on a 1G
huge-page system it seems to occupy a huge page completely, and the remaining
memory space within that huge page does not seem to be usable by other tables.

The same applies to the bihash-based tables: only 2M huge-page systems are
supported. To support creating bihash-based tables on a 1G huge-page system,
each table would occupy a 1G huge page completely, which would waste a lot of
memory.
Is it possible, as in the pmalloc module, to reserve a big memory space on
1G/2M huge pages in the initialization stage and then allocate memory pieces
on demand for the bihash, ip4-mtrie and adjacency tables, so that all tables
could be created on huge pages and fully utilize them?
I tried creating the MAC table on 1G huge pages, and it does improve throughput
performance.
vpp# show bihash
Name Actual Configured
GBP Endpoints - MAC/BD   1m 1m
b4s 64m 64m
b4s 64m 64m
in2out   10.12m 10.12m
in2out   10.12m 10.12m
ip4-dr   2m 2m
ip4-dr   2m 2m
ip6 FIB fwding table32m 32m
ip6 FIB non-fwding table32m 32m
ip6 mFIB table  32m 32m
l2fib mac table512m 512m
mapping_by_as4  64m 64m
out2in 128m 128m
out2in 128m 128m
out2in   10.12m 10.12m
out2in   10.12m 10.12m
pppoe link table 8m 8m
pppoe session table  8m 8m
static_mapping_by_external  64m 64m
static_mapping_by_local 64m 64m
stn addresses1m 1m
users  648k 648k
users  648k 648k
vip_index_per_port  64m 64m
vxlan4   1m 1m
vxlan4-gbp   1m 1m
Total 1.28g 1.28g

Thanks.
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#17051): https://lists.fd.io/g/vpp-dev/message/17051
Mute This Topic: https://lists.fd.io/mt/75742152/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-