Hi Peter,
It may be unrelated, but I think we also see this issue fairly regularly
with FD.io VPP 18.04 and the x520 on our local test rig.
The error we typically see is "VAT command sw_interface_set_flags
sw_if_index 1 admin-up: no JSON data.VAT".
Do you think it is the same or a separate issue?
Ray K
On 30/07/2018 08:02, Peter Mikus via Lists.Fd.Io wrote:
Hello vpp-dev,
I am looking for advice. We have started testing VPP for the report on all
LF CSIT testbeds, both Skylakes and Haswells.
We are observing odd behavior. In each test we first bring both physical
interfaces up via VAT:
sw_interface_set_flags sw_if_index <idx> admin-up (I also
tried sw_interface_set_flags sw_if_index <idx> admin-up link-up)
After setting all interfaces up, we check via VAT that they really are
up by calling “sw_interface_dump” (up to 30 times, 1 s between API calls).
This was not an issue in the past, but recently we started seeing
sw_interface_dump report interfaces as link-down (while admin-up).
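The check described above can be sketched roughly as follows. This is only an illustration, not the actual CSIT code: `dump_interfaces` is a hypothetical callable standing in for whatever issues sw_interface_dump, and the dictionary keys `name`, `admin_up` and `link_up` are assumed names, not the real API fields.

```python
import time


def wait_for_links_up(dump_interfaces, retries=30, interval=1.0, sleep=time.sleep):
    """Poll until every admin-up interface also reports link-up.

    dump_interfaces: hypothetical callable returning a list of dicts with
    the assumed keys "name", "admin_up" and "link_up".
    Returns the names of interfaces still link-down after the last attempt
    (an empty list means all links came up).
    """
    down = []
    for attempt in range(retries):
        interfaces = dump_interfaces()
        # Only interfaces that were set admin-up are expected to have link.
        down = [i["name"] for i in interfaces if i["admin_up"] and not i["link_up"]]
        if not down:
            return []
        # Do not sleep after the final attempt.
        if attempt < retries - 1:
            sleep(interval)
    return down
```

Returning the names that are still down, rather than a plain boolean, makes it easier to see whether one or both interfaces failed in a given run.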
Notes/symptoms:
- Our sw_interface_dump check runs up to 30 times in a loop (1 s interval).
- Link-down is random: sometimes both interfaces are link-up, sometimes
just one, and sometimes both links are down.
- _It is not TB related_, nor cabling related; we see it on
Haswells-3node in roughly 1 out of 70 tests, Skylakes-2node 1 out of 70,
but on Skylake-3node in more than half of the tests.
- Checking the state during a test reveals that the interfaces are
link-down (show int), so “sw_interface_dump” is reporting the state correctly.
- Issuing the CLI command “set interface state … up” during a test does
bring the interfaces up (but it is hard to check the timing here).
- Affected are mostly x520 and x710, but that is most probably a sampling
effect (low coverage of other NICs such as xxv710 and xl710).
- We have seen this in master vpp as well as rc2 vpp.
- It is not clear when this started to happen, so bisecting would take a
lot of time.
- This was also spotted on VIRL and on a Memif interface, which leads us
to suspect that this is related to the API, not the HW.
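Since the timing is hard to observe by hand, one option would be to sample the link state at a short interval and log when it changes. A minimal sketch, assuming a hypothetical `read_state` callable that returns the current state of one interface (for example the string "up" or "down"); how that state is obtained from VPP is left open here.

```python
import time


def record_transitions(read_state, samples, interval=0.1,
                       clock=time.monotonic, sleep=time.sleep):
    """Sample read_state() and return [(elapsed_seconds, state), ...].

    read_state is a hypothetical callable; one entry is recorded for the
    first sample and for every later sample whose state differs from the
    previous one.
    """
    start = clock()
    transitions = []
    last = object()  # sentinel, so the first sample is always recorded
    for n in range(samples):
        state = read_state()
        if state != last:
            transitions.append((round(clock() - start, 3), state))
            last = state
        # Do not sleep after the final sample.
        if n < samples - 1:
            sleep(interval)
    return transitions
```

Comparing such a log across passing and failing runs might show whether the link never comes up or comes up and then drops again.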
Do you have any idea what we could check further? VPP is not crashing, so
no core dumps are available. The issue is not 100% reproducible, which
makes it hard to debug.
Is there a way to get a more verbose error from the API call mentioned
above, to reveal more information?
Thank you.
*Peter Mikus*
Engineer – Software
*Cisco Systems Limited*
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#9967): https://lists.fd.io/g/vpp-dev/message/9967
Mute This Topic: https://lists.fd.io/mt/23857615/675355
Group Owner: [email protected]
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-