Thanks for following up. 😊
________________________________
From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
Sent: Wednesday, September 26, 2018 12:31 AM
To: Rubina Bianchi
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its 
session table is full

Dear Rubina,

On 9/25/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
> Dear Andrew,
>
> Actually, it's not an RFC standard or a known attack; we derived the
> restriction by experimenting with Linux netfilter and tried to implement its
> behavior.

hmm okay. so my position about implementing it still stands then :-)

Also, another point - if you send two echo requests at once with the
same code (which is fine, because they are differentiated by ICMP ID
and sequence number), then the second request will overwrite bihash
values on activation (which is fine) and then the first reply that
comes will erase those upon deactivation, so the second request will
be dropped. This can make the troubleshooting harder in some cases
since this is confusing.  One way to deal with that is to keep some
sort of reference counter, or extend the bihash key, but again in the
absence of a real-world threat model this looks like shooting pigeons
from ICBMs... (the only attack I know for this scenario is smurf and
the existing implementation covers it fine)
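
To make the overwrite problem concrete, here is a minimal Python sketch. It
is not the patch's code (the patch is not shown in this thread). It assumes
a table keyed only by a hypothetical (src, dst, ICMP code) tuple, so two
outstanding echo requests share one entry. A plain dict stands in for the
bihash, and OverwriteTracker / RefcountTracker are illustrative names.

```python
from collections import Counter


class OverwriteTracker:
    """One entry per key: a request overwrites it, a reply deletes it."""

    def __init__(self):
        self.table = {}

    def on_request(self, key, icmp_id, seq):
        # A second request with the same key replaces the first entry.
        self.table[key] = (icmp_id, seq)

    def on_reply(self, key):
        # The reply is permitted only if an entry exists; it is then erased.
        return self.table.pop(key, None) is not None


class RefcountTracker:
    """Counts outstanding requests per key instead of storing one entry."""

    def __init__(self):
        self.pending = Counter()

    def on_request(self, key, icmp_id, seq):
        self.pending[key] += 1

    def on_reply(self, key):
        if self.pending[key] == 0:
            return False
        self.pending[key] -= 1
        if self.pending[key] == 0:
            del self.pending[key]
        return True


# Hypothetical key: (source, destination, ICMP code).
key = ("10.0.0.1", "10.0.0.2", 0)

overwrite = OverwriteTracker()
overwrite.on_request(key, icmp_id=7, seq=1)
overwrite.on_request(key, icmp_id=7, seq=2)
overwrite_results = [overwrite.on_reply(key), overwrite.on_reply(key)]

refcount = RefcountTracker()
refcount.on_request(key, icmp_id=7, seq=1)
refcount.on_request(key, icmp_id=7, seq=2)
refcount_results = [refcount.on_reply(key), refcount.on_reply(key)]

print(overwrite_results)  # [True, False]: the second reply finds no entry
print(refcount_results)   # [True, True]
```

The counter admits one reply per outstanding request without matching ICMP
ID or sequence number; extending the key with those fields would be the
stricter alternative.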

>
> I checked your patch; throughput is fine. After about 2000 seconds my
> throughput was still at maximum and there were no drops on trex. Also,
> rx-miss was very low with this patch.

Excellent, thanks a lot!

> However, one thing that caught my
> attention is that, in the same scenario as in the previous discussion, vpp
> served this traffic with 3.6 million sessions, but with the current patch it
> serves it with 3.9 million sessions. Is this related to your last changes?

Yeah, it could be...  the new change makes the expiry somewhat longer:
the session in any given timeout list is checked twice per timeout,
rather than every X seconds (X being the shortest list timeout). So if
e.g. a UDP session timeout is 100 seconds, and there is a packet on it
10 seconds before the session is checked, the next check will be 50
seconds later, at which point the idle time on the session (60
seconds) is still smaller than 100, so the session will be expunged
only 50 seconds later, with the idle time of 110 seconds.

Contrast this with the previous behavior where all the lists might be
checked as often as every 2 seconds - this would have resulted in more
precise timing out, but at the expense of a lot of CPU usage.
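
The arithmetic above can be checked with a small Python sketch. expiry_time
is a hypothetical helper, not ACL plugin code. It assumes checks happen at
fixed intervals and that a session is expunged at the first check where its
idle time has reached the timeout. The timestamps (last packet at t=40,
checks at t=50, 100, 150) are chosen to match the 100-second UDP example.

```python
def expiry_time(last_packet, timeout, check_interval, first_check):
    """Return (check_time, idle_time) at which the session is expunged.

    The session is examined at first_check and then every check_interval
    seconds; it is removed at the first check where idle time >= timeout.
    """
    t = first_check
    while t - last_packet < timeout:
        t += check_interval
    return t, t - last_packet


# New behavior: the list is checked twice per timeout (every 50 s).
coarse = expiry_time(last_packet=40, timeout=100, check_interval=50,
                     first_check=50)

# Previous behavior: checks could happen as often as every 2 s.
fine = expiry_time(last_packet=40, timeout=100, check_interval=2,
                   first_check=42)

print(coarse)  # (150, 110): expunged with 110 s of idle time
print(fine)    # (140, 100): expunged right at the 100 s timeout
```

With checks at half the timeout, a session can outlive its nominal timeout
by up to one check interval, which is consistent with a somewhat higher
steady-state session count.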

But then all of the TCP sessions will transition into the tcp
transient state upon closure, so the only impact in the real world will
be with the UDP connections, which should be relatively small.

All that said: I am considering moving the timeout infra to the
tw_timers (which weren't available back when I was writing the code) -
then in theory we can get both the precise expiry and the efficiency.
But that is a larger change, and I am not sure of it yet, so I wanted
to get this simpler solution in first.
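
For readers unfamiliar with timer wheels, a minimal single-level wheel in
Python illustrates why they can give both exact expiry and cheap per-tick
work. This is a generic sketch, not the VPP tw_timers API; the class name,
method names, and slot count are assumptions for illustration only.

```python
class TimerWheel:
    """Single-level timer wheel with one slot per tick."""

    def __init__(self, slots):
        self.slots = [set() for _ in range(slots)]
        self.where = {}  # session -> slot index it currently sits in
        self.now = 0

    def start(self, session, ticks):
        """Start or restart a timer that fires `ticks` ticks from now."""
        assert 0 < ticks < len(self.slots)
        self.stop(session)
        slot = (self.now + ticks) % len(self.slots)
        self.slots[slot].add(session)
        self.where[session] = slot

    def stop(self, session):
        slot = self.where.pop(session, None)
        if slot is not None:
            self.slots[slot].discard(session)

    def tick(self):
        """Advance one tick and return the sessions expiring now."""
        self.now += 1
        slot = self.now % len(self.slots)
        expired = self.slots[slot]
        self.slots[slot] = set()
        for session in expired:
            del self.where[session]
        return expired


wheel = TimerWheel(slots=128)
wheel.start("udp-1", 100)          # 100-tick idle timeout
expired_at = None
for _ in range(200):
    if wheel.now == 40:
        wheel.start("udp-1", 100)  # a packet at t=40 restarts the timer
    if "udp-1" in wheel.tick():
        expired_at = wheel.now
        break

print(expired_at)  # 140: exactly 100 ticks after the last packet
```

Each tick touches only the sessions due at that tick, and a packet arrival
costs one remove and one insert, so no periodic scan of whole lists is
needed.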

--a

>
> Thanks,
> Sincerely
> ________________________________
> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
> Sent: Monday, September 24, 2018 6:49 PM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
> session table is full
>
> Dear Rubina,
>
> On 9/24/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
>> Dear Andrew,
>>
>> It's hardcoded because it was a simple and fast solution for our default
>> scenario implementation.
>> As you correctly mentioned in your previous email, one of the bug fixes was
>> the restriction. Another one is preventing reply packets from passing
>> through even
>
> ok. that's more a "featurette" - I deliberately did not attempt to
> implement the "strict checking" because I had a difficult time finding
> the attack vector. (rather than maybe some kind of "compliance" checks
> for the reasons of compliance ?)
>
>> if those packets are matched by an ACL rule. In other words, these
>> reply packets do not belong to any echo request in the reverse direction.
>
> hmm so you are sort-of making a "protocol inspection engine" there ? :-)
>
> Anyway, so far I haven't managed to recreate this condition - though
> if you were running the 18.07 rather than 18.07.1 code, then the bug
> related to hash acl manipulation on ACL changes might have caused that
> effect... I will experiment a bit more, though.
>
> Also, remember the other thread we discussed a while ago about the
> throughput getting lower over time.. I have made
> https://gerrit.fd.io/r/#/c/14821/ which should significantly reduce
> the amount of session list shuffling work in normal case scenarios.
> Before I commit it, could you give it a shot to see if it indeed
> behaves as I would expect it to behave ?  Thanks a lot!
>
> --a
>
>> Thanks,
>> Sincerely
>> ________________________________
>> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
>> Sent: Tuesday, September 18, 2018 4:06 PM
>> To: Rubina Bianchi
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
>> session table is full
>>
>> Dear Rubina,
>>
>> On 9/18/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
>>> Dear Andrew,
>>>
>>> Our changes are provided as a patch attached to this email.
>>> I didn't commit it to gerrit due to our specific scenario (permit+reflect
>>> on all inputs, permit+reflect or deny on all outputs).
>>
>> Why do you hardcode it as opposed to making it part of configuration ?
>> permit+reflect in one direction and deny except established sessions
>> is a fairly standard config.
>>
>>> In addition to ICMP timeout handling, our code fixes some ICMP bugs.
>>
>> Do you mean the "strict"  enforcement of the one-request-one-response
>> policy for ICMP that this code does ?
>>
>> --a
>>
>>> Although I think the code is clear to you, I can explain it in detail if
>>> you ask.
>>>
>>> Thanks,
>>> Sincerely
>>> ________________________________
>>> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
>>> Sent: Tuesday, September 18, 2018 11:27 AM
>>> To: Rubina Bianchi
>>> Cc: vpp-dev@lists.fd.io
>>> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
>>> session table is full
>>>
>>>
>>>
>>>
>>> Hi Rubina,
>>>
>>> On 18 Sep 2018, at 11:14, Rubina Bianchi
>>> <r_bian...@outlook.com> wrote:
>>>
>>> Hi Dear Andrew
>>>
>>> 1) I just attached my init.conf to this email. As you guessed, the session
>>> table size is 1000000. This problem occurred on vpp stable/1807.
>>>
>>> Ah, cool, that helps, thanks!
>>>
>>>
>>> 2) Yes, there are 6 timeout lists. We added a list for handling icmp
>>> timeouts.
>>>
>>> That is not the stable/1807, then ☺️ would you mind submitting the
>>> change to gerrit so we could take a look at it and ideally incorporate it
>>> into the master?
>>>
>>> --a
>>>
>>>
>>> ________________________________
>>> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
>>> Sent: Monday, September 17, 2018 8:03 PM
>>> To: Rubina Bianchi
>>> Cc: vpp-dev@lists.fd.io
>>> Subject: Re: [vpp-dev] Odd problem in adding new session on vpp when its
>>> session table is full
>>>
>>> Dear Rubina,
>>>
>>> looking at the outputs, there are a few anomalies that hopefully you
>>> can clarify:
>>>
>>> 1) the max session count is 1000000. The latest master has the default
>>> limit of 500000, and I do not see any startup config parameters
>>> changing that. Which version are you testing with/building off ?
>>>
>>> 2) there are 6 fa_conn_list_head elements in each worker for your
>>> outputs. That number was initially 3, and in the early spring when I
>>> introduced the purgatory list and the reserved unused list this number
>>> has increased to 5. The vectors are initialized at startup with a
>>> constant, so I am wondering why your outputs have a different number.
>>>
>>> Would you be able to comment on these observations ?
>>>
>>> Thank you!
>>>
>>> --a
>>>
>>> On 9/17/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
>>>> Dear VPP,
>>>>
>>>> I ran a test on VPP configured with permit+reflect ACL rules with
>>>> t-rex. In this test, I put two interfaces in one bridge-domain and had an
>>>> ACL on all of its input and output interfaces. The ACL had just one rule,
>>>> which allowed any traffic. I ran my test until VPP's session table was
>>>> full. I ran t-rex with the following command:
>>>>
>>>> "./t-rex-64 -f cap2/sfr.yaml    -m 10  -d  10000"
>>>>
>>>>
>>>> After a couple of days, I ran another test on VPP. I tried to
>>>> establish an ssh session between two clients via my VPP, but the session
>>>> could not be established. When I checked the VPP trace, all of my ssh
>>>> packets were dropped due to the following error:
>>>>
>>>> "acl-plugin-in-ip4-l2: too many sessions to add new"
>>>>
>>>> When I checked VPP's session table, I realized that it was full. No
>>>> sessions had been deleted since my previous test and no sessions were
>>>> being added to the session table. I also checked my /var/log/hawk.log
>>>> file and saw the following error:
>>>>
>>>> "acl_fa_node_fn:516: BUG: session LSB16(sw_if_index) and 5-tuple
>>>> collision!"
>>>>
>>>> I could not fix this problem so I restarted my VPP service. After that,
>>>> I could not reproduce this state again. Does anyone have any idea on
>>>> what my problem on VPP was?
>>>>
>>>> I attached my hawk.log, vpp trace, "vppctl sh acl-plugin sessions"
>>>> output, and startup.conf file to this email.
>>>>
>>>>
>>>>
>>>>
>>>>
>>> <init.conf>
>>>
>>
>
