Hi Ivan,
Inline.
> On Jul 29, 2020, at 9:40 AM, Ivan Shvedunov wrote:
>
> Hi Florin,
>
> while trying to fix the proxy cleanup issue, I've spotted another problem in
> the TCP stack, namely RSTs being ignored in SYN_SENT (half-open) connection
> state:
> https://gerrit.fd.io/r/c/vpp/+/28103
Hi Florin,
while trying to fix the proxy cleanup issue, I've spotted another problem
in the TCP stack, namely RSTs being ignored in SYN_SENT (half-open)
connection state:
https://gerrit.fd.io/r/c/vpp/+/28103
The following fix for handling failed active connections in the proxy has
worked for me,
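Under RFC 793, an RST arriving in SYN-SENT is acceptable only when its ACK field acknowledges our SYN, in which case the connection should move to CLOSED rather than the segment being ignored. A minimal sketch of that acceptability check (the struct and function names here are illustrative, not VPP's actual identifiers):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative connection state; in SYN-SENT, snd_una is our ISS
   and snd_nxt is ISS + 1 (the SYN consumes one sequence number). */
typedef struct
{
  uint32_t snd_una;
  uint32_t snd_nxt;
} toy_conn_t;

/* RFC 793 (SYN-SENT processing): an RST is acceptable only if the
   segment carries an ACK with snd_una < ack <= snd_nxt.  Serial
   arithmetic keeps the comparison correct across wraparound. */
static bool
rst_acceptable_in_syn_sent (const toy_conn_t *tc, bool ack_set, uint32_t ack)
{
  if (!ack_set)
    return false;		/* RST without ACK: drop the segment */
  return (int32_t) (ack - tc->snd_una) > 0
	 && (int32_t) (tc->snd_nxt - ack) >= 0;
}
```

An acceptable RST would then reset the half-open connection; an unacceptable one is simply dropped.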
Hi Ivan,
Inline.
> On Jul 28, 2020, at 8:45 AM, Ivan Shvedunov wrote:
>
> Hi Florin,
> thanks, the fix has worked and http_static no longer crashes.
Perfect, thanks for confirming!
>
> I still get a number of messages like this when using release build:
> /usr/bin/vpp[39]: state_sent_ok:954: BUG: couldn't send response header!
Hi Florin,
thanks, the fix has worked and http_static no longer crashes.
I still get a number of messages like this when using release build:
/usr/bin/vpp[39]: state_sent_ok:954: BUG: couldn't send response header!
Not sure if it's actually a bug or just the queue being full because of
the pac
Hi Ivan,
Took a look at the static http server and, as far as I can tell, it has the
same type of issue the proxy had, i.e., premature session cleanup/reuse. Does
this solve the problem for you [1]?
Also, merged your elog fix patch. Thanks!
Regards,
Florin
[1] https://gerrit.fd.io/r/c/vpp/+
Hi.
I've debugged http server issue a bit more and here are my observations:
if I add an ASSERT(0) in the place of "No http session for thread 0
session_index 54",
I get a stack trace along the lines of
Program received signal SIGABRT, Aborted.
0x7470bf47 in raise () from /lib/x86_64-linux-gn
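The trick described above, replacing an error log with a hard assert so the process traps at the exact failure point and gdb can show the full backtrace, looks roughly like this (the lookup function and `DEBUG_TRAP` macro are stand-ins, not the real http_static code):

```c
#include <assert.h>
#include <stdio.h>

#define N_SESSIONS 8
static int session_valid[N_SESSIONS] = { 1, 1, 0, 1, 0, 0, 0, 0 };

/* Returns 1 if the session exists.  With DEBUG_TRAP defined, a miss
   raises SIGABRT right here, so gdb stops with the caller's stack
   on screen instead of just scrolling a log line past. */
static int
http_session_check (int session_index)
{
  if (session_index < 0 || session_index >= N_SESSIONS
      || !session_valid[session_index])
    {
      fprintf (stderr, "No http session for session_index %d\n",
	       session_index);
#ifdef DEBUG_TRAP
      assert (0);		/* trap for the debugger */
#endif
      return 0;
    }
  return 1;
}
```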
Great! Thanks for confirming!
Let me know how it goes with the static http server.
Cheers,
Florin
> On Jul 24, 2020, at 2:00 PM, Ivan Shvedunov wrote:
>
> Hi Florin,
> I re-verified the patches and the modified patch doesn't crash either, so I
> think it's safe to merge it.
> Thanks!
>
> I
Hi Florin,
I re-verified the patches and the modified patch doesn't crash either, so I
think it's safe to merge it.
Thanks!
I will try to see what the remaining problem with http_static is
On Fri, Jul 24, 2020 at 8:15 PM Florin Coras wrote:
> Hi Ivan,
>
> Adding Vanessa to see if she can help w
Hi Ivan,
Adding Vanessa to see if she can help with the account issues.
Thanks a lot for the patches! Pushed them here [1] and [2]. I took the liberty
of slightly changing [2], so if you get a chance, do try it out again.
Finally, the static http server still needs fixes. Most probably it mi
I did a bit more debugging and found an issue that was causing invalid TCP
connection lookups.
Basically, if session_connected_callback was failing for an app (in the case
of the proxy, e.g. because the other corresponding connection got closed), it
was leaving an invalid entry in the session lookup table.
Ano
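The failure mode described above can be sketched with a toy lookup table: if the connected callback fails and the entry added at connect time is not scrubbed, later lookups resolve to a dead session. All names here are illustrative, not VPP's session-layer API.

```c
#include <assert.h>

#define TABLE_SIZE 16
#define NO_ENTRY (-1)

/* toy session lookup table: 5-tuple hash slot -> session index */
static int lookup_table[TABLE_SIZE];

static void
table_init (void)
{
  for (int i = 0; i < TABLE_SIZE; i++)
    lookup_table[i] = NO_ENTRY;
}

/* The connected-callback result decides whether the entry survives. */
static int
session_connected (int slot, int session_index, int app_rv)
{
  lookup_table[slot] = session_index;	/* added when the SYN went out */
  if (app_rv != 0)
    {
      /* The fix: a failing callback must also scrub the lookup
	 table, otherwise later lookups return this stale session. */
      lookup_table[slot] = NO_ENTRY;
      return -1;
    }
  return 0;
}
```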
Ah, I didn’t try running test.sh 80. The only difference in how I’m running the
test is that I start vpp outside of start.sh straight from binaries.
Regards,
Florin
> On Jul 23, 2020, at 8:22 AM, Ivan Shvedunov wrote:
>
> Well, I always run the same test, the difference being only
> "test.sh
Well, I always run the same test, the difference being only
"test.sh 80" for http_static (it's configured to be listening on that port)
or just "test.sh" for the proxy. As far as I understand, you run the tests
without using the containers. Does that include setting up netem like this
[1]?
[1] ht
Hi Ivan,
Updated [1] but I’m not seeing [3] after several test iterations.
Probably the static server needs the same treatment as the proxy. Are you
running a slightly different test? All of the builtin apps have the potential
to crash vpp or leave the host stack in an unwanted state since th
http_static produces some errors:
/usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session
for thread 0 session_index 4124
/usr/bin/vpp[40]: http_static_server_rx_tx_callback:1010: No http session
for thread 0 session_index 4124
/usr/bin/vpp[40]: tcp_input_dispatch_buffer:2812: tcp
Hi,
I've found a problem with the timer fix and commented in Gerrit [1]
accordingly.
Basically this change [2] makes the tcp_prepare_retransmit_segment() issue
go away for me.
Concerning the proxy example, I can no longer see the SVM FIFO crashes, but
when using debug build, VPP crashes with this
Hi Ivan,
Thanks for the test. After modifying it a bit to run straight from binaries, I
managed to repro the issue. As expected, the proxy is not cleaning up the
sessions correctly (example apps do run out of sync ..). Here’s a quick patch
that solves some of the obvious issues [1] (note that
Concerning the CI: I'd be glad to add that test to "make test", but not
sure how to approach it. The test is not about containers but more about
using network namespaces and some tools like wrk to create a lot of TCP
connections to do some "stress testing" of VPP host stack (and as it was
noted, it
I missed the point about the CI in my other reply. If we can somehow integrate
some container based tests into the “make test” infra, I wouldn’t mind at all!
:-)
Regards,
Florin
> On Jul 22, 2020, at 4:17 AM, Ivan Shvedunov wrote:
>
> Hi,
> sadly the patch apparently didn't work. It should ha
Hi Ivan,
Will try to reproduce but given the types of crashes, it could be that the
proxy app is not cleanly releasing the connections.
Regards,
Florin
> On Jul 22, 2020, at 8:29 AM, Ivan Shvedunov wrote:
>
> Some preliminary observations concerning the crashes in the proxy example:
> * !rb
Some preliminary observations concerning the crashes in the proxy example:
* !rb_tree_is_init(...) assertion failures are likely caused by
multiple active_open_connected_callback() invocations for the same
connection
* f_update_ooo_deq() SIGSEGV crash is possibly caused by late callbacks
for conne
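A common guard against the first bullet is to make the connected callback idempotent per connection, so a duplicate notification cannot re-run one-time setup such as rb-tree initialization. A hedged sketch (the flag and counter names are made up):

```c
#include <assert.h>
#include <stdbool.h>

typedef struct
{
  bool connected_notified;	/* set once the callback has run */
  int rb_tree_inits;		/* counts the work we must not repeat */
} toy_session_t;

/* Runs the one-time setup (standing in for rb_tree_init and friends)
   at most once, even if the transport delivers the "connected"
   notification twice for the same connection. */
static void
active_open_connected (toy_session_t *s)
{
  if (s->connected_notified)
    return;			/* duplicate notification: ignore */
  s->connected_notified = true;
  s->rb_tree_inits++;
}
```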
Hi,
sadly the patch apparently didn't work. It should have worked but for some
reason it didn't ...
On the bright side, I've made a test case [1] using fresh upstream VPP code
with no UPF that reproduces the issues I mentioned, including both timer
and TCP retransmit one along with some other poss
Hi Ivan,
Thanks for the detailed report!
I assume this is a situation where most of the connections time out and the
rate limiting we apply on the pending timer queue delays handling for long
enough to be in a situation like the one you described. Here’s a draft patch
that starts tracking pen
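The rate limiting described can be sketched as a bounded drain of a tracked pending-expiry queue: each dispatch handles at most a fixed batch, and the remainder stays queued so expirations are delayed rather than lost. The constants and names below are illustrative, not VPP's timer-wheel code.

```c
#include <assert.h>

#define MAX_TIMERS 64
#define MAX_PER_DISPATCH 4	/* illustrative rate limit */

typedef struct
{
  int pending[MAX_TIMERS];	/* expired-but-unhandled timer handles */
  int n_pending;
  int n_handled;
} toy_timer_wheel_t;

/* Handle at most MAX_PER_DISPATCH expired timers per call; the rest
   stay tracked in the queue for later dispatch cycles. */
static void
timer_dispatch (toy_timer_wheel_t *tw)
{
  int n = tw->n_pending < MAX_PER_DISPATCH ? tw->n_pending : MAX_PER_DISPATCH;
  for (int i = 0; i < n; i++)
    tw->n_handled++;		/* stands in for the timer callback */
  /* shift the remainder to the front of the queue */
  for (int i = n; i < tw->n_pending; i++)
    tw->pending[i - n] = tw->pending[i];
  tw->n_pending -= n;
}
```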
Hi,
I'm working on the Travelping UPF project: https://github.com/travelping/vpp
For a variety of reasons, it's presently maintained as a fork of VPP that's
rebased on top of upstream master from time to time, but really it's just a
plugin. During 40K TCP conne