Re: [vpp-dev] build queue backed up a bit foo

2018-09-10 Thread Ed Kern via Lists.Fd.Io
After a couple of hours of testing on the sandbox and getting good builds, I went
with option 1 for now.  (It should be getting merged as I type.)

If we run into any more issues, I'll remove the job from the main gerrit trigger
so builds will only be started via comment.

Ed



On Sep 10, 2018, at 12:48 PM, Florin Coras <fcoras.li...@gmail.com> wrote:

If 1 makes the arm jobs pass and supposedly run much faster, I’d be fine with 
that as well.

Florin

On Sep 10, 2018, at 7:48 AM, Marco Varlese <mvarl...@suse.de> wrote:

On Mon, 2018-09-10 at 14:29 +, Ed Kern (ejk) wrote:
At least three different possible actions to take at this point:  (outside of 
fixing the issue)

1. remove make test attempt from arm build (return it to the way it was before 
a week ago).
2. lower timeout further; my first thought would be in the 75 minute range (from
120)
I don't like this option mainly because it would still imply that developers
will have to wait for 75 minutes to see a patch verified...
The worst is that the job is a non-voting one hence having it running does add
any meaningful insight into the build result.
I'm in favour of the issue being resolved in the sandbox and eventually moved
to production when stable.

3. remove job altogether.
I believe this should be the option to pursue unless it could be fixed quickly.
As it is, it only causes delays to the overall build / review / merging
process...


I'm happy to push a patch that accomplishes the above or other options I haven’t
thought of for this mail.

Just let me know..

Ed



On Sep 10, 2018, at 4:31 AM, Marco Varlese <mvarl...@suse.de> wrote:
Last example (today) can be found here

10:47:06 Not running extended tests (some tests will be skipped)
10:47:06 Perform 4 attempts to pass the suite...
10:47:10 *** Error in `python': double free or corruption (out): 0x75d483f0 ***
12:27:54 Build timed out (after 120 minutes). Marking the build as failed.

Patch: https://gerrit.fd.io/r/#/c/14744/

Other jobs finished more than an hour and a half ago and the patch cannot be
marked Verified+1 because Jenkins is still waiting for the ARM job to complete
(timeout = 120 minutes). IMHO it makes the overall patch submission and review
very painful for authors and committers.

I would recommend (it has been done in the past for other jobs) to disable this
job in production, move it again to sandbox, get it fixed and eventually moved
again to production...

- Marco

On Fri, 2018-09-07 at 12:04 -0700, Florin Coras wrote:
ARM jobs have not been working for some days now. That’s why their result is
skipped. Timeout is 2h but probably we should drop that even further …

Florin

On Sep 7, 2018, at 11:59 AM, Ole Troan <otr...@employees.org> wrote:
Trying out a change in:
https://gerrit.fd.io/r/#/c/14732/

All others succeed, but for ARM this doesn’t look too good.
Stuck apparently.

https://jenkins.fd.io/job/vpp-arm-verify-master-ubuntu1604/2157/console
20:16:04 ==
20:16:04 ARP Test Case
20:16:04 ==
20:16:07 *** Error in `python': free(): invalid pointer: 0x553263a8 ***
20:16:09 ARP  OK
20:16:09 ARP Duplicates   OK
20:16:09 test_arp_incomplete (test_neighbor.ARPTestCase)  OK
20:16:09 ARP Static   OK
20:16:15 ARP reply with VRRP virtual src hw addr  OK
20:16:15 GARP OK
20:16:15 MPLS OK


Cheers,
Ole


On 7 Sep 2018, at 18:50, Ole Troan <otr...@employees.org> wrote:

Ed,

Let me take a closer look at these.
It appears that if VPP is slow to start, it might not have created the socket
yet. Let me try to put in a retry loop and see if that fixes verify.

Cheers
Ole
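
For readers following along, the retry loop Ole mentions could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name wait_for_socket, the attempt count and the delay are hypothetical and are not taken from the VPP test framework. The only detail drawn from the thread is that the stats socket may not exist yet when the test tries to connect.

```python
import os
import time


def wait_for_socket(path, attempts=10, delay=0.5):
    """Poll until `path` exists; return True if it appeared, False otherwise.

    Hypothetical helper: `attempts` and `delay` are illustrative defaults.
    """
    for attempt in range(attempts):
        if os.path.exists(path):
            return True
        # Sleep between polls, but not after the final failed check.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

In the traceback quoted in this thread the path in question is cls.tempdir + '/stats.sock'; a caller might check wait_for_socket(path) before constructing VPPStats and report a failure only if it returns False.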

On 7 Sep 2018, at 18:06, Ed Kern via Lists.Fd.Io <ejk=cisco@lists.fd.io> wrote:

The make test errors below are causing pretty consistent failures.
Note:  For whatever reason the failure rate is not 100%.
The failures, together with the retries on nonconcurrent merge jobs, may lead
to long build queues.

These are not infra issues but I'll be keeping an eye on it.


Ed



13:08:25 Using /var/cache/vpp/python/virtualenv/lib/python2.7/site-packages
13:08:25 Finished processing dependencies for vpp-papi==1.6.1
13:08:27 Traceback (most recent call last):
13:08:27   File "sanity_run_vpp.py", line 21, in <module>
13:08:27     tc.setUpClass()
13:08:27   File "/w/workspace/vpp-merge-master-ubuntu1604/test/framework.py", line 394, in setUpClass
13:08:27     cls.statistics = VPPStats(socketname=cls.tempdir+'/stats.sock')

13:08:27

Re: [vpp-dev] build queue backed up a bit foo

2018-09-10 Thread Florin Coras
If 1 makes the arm jobs pass and supposedly run much faster, I’d be fine with 
that as well. 

Florin

> On Sep 10, 2018, at 7:48 AM, Marco Varlese  wrote:
> 
> On Mon, 2018-09-10 at 14:29 +, Ed Kern (ejk) wrote:
>> At least three different possible actions to take at this point:  (outside
>> of fixing the issue)
>>
>> 1. remove make test attempt from arm build (return it to the way it was
>> before a week ago).
>> 2. lower timeout further; my first thought would be in the 75 minute range
>> (from 120)
> I don't like this option mainly because it would still imply that developers
> will have to wait for 75 minutes to see a patch verified...
> The worst is that the job is a non-voting one hence having it running does
> add any meaningful insight into the build result.
> I'm in favour of the issue being resolved in the sandbox and eventually
> moved to production when stable.
>
>> 3. remove job altogether.
> I believe this should be the option to pursue unless it could be fixed quickly.
> As it is, it only causes delays to the overall build / review / merging
> process...
>
>>
>> I'm happy to push a patch that accomplishes the above or other options I
>> haven’t thought of for this mail.
>>
>> Just let me know..
>>
>> Ed
>>
>>> On Sep 10, 2018, at 4:31 AM, Marco Varlese wrote:
>>> Last example (today) can be found here
>>>
>>> 10:47:06 Not running extended tests (some tests will be skipped)
>>> 10:47:06 Perform 4 attempts to pass the suite...
>>> 10:47:10 *** Error in `python': double free or corruption (out): 0x75d483f0 ***
>>> 12:27:54 Build timed out (after 120 minutes). Marking the build as failed.
>>>
>>> Patch: https://gerrit.fd.io/r/#/c/14744/
>>>
>>> Other jobs finished more than an hour and a half ago and the patch cannot be
>>> marked Verified+1 because Jenkins is still waiting for the ARM job to complete
>>> (timeout = 120 minutes). IMHO it makes the overall patch submission and review
>>> very painful for authors and committers.
>>>
>>> I would recommend (it has been done in the past for other jobs) to disable this
>>> job in production, move it again to sandbox, get it fixed and eventually moved
>>> again to production...
>>>
>>> - Marco
>>>
>>> On Fri, 2018-09-07 at 12:04 -0700, Florin Coras wrote:
>>>> ARM jobs have not been working for some days now. That’s why their result is
>>>> skipped. Timeout is 2h but probably we should drop that even further …
>>>>
>>>> Florin
>>>>
>>>>> On Sep 7, 2018, at 11:59 AM, Ole Troan wrote:
>>>>> Trying out a change in:
>>>>> https://gerrit.fd.io/r/#/c/14732/
>>>>>
>>>>> All others succeed, but for ARM this doesn’t look too good.
>>>>> Stuck apparently.
>>>>>
>>>>> https://jenkins.fd.io/job/vpp-arm-verify-master-ubuntu1604/2157/console
>>>>> 20:16:04 ==
>>>>> 20:16:04 ARP Test Case
>>>>> 20:16:04 ==
>>>>> 20:16:07 *** Error in `python': free(): invalid pointer: 0x553263a8 ***
>>>>> 20:16:09 ARP  OK
>>>>> 20:16:09 ARP Duplicates   OK
>>>>> 20:16:09 test_arp_incomplete (test_neighbor.ARPTestCase)  OK
>>>>> 20:16:09 ARP Static   OK
>>>>> 20:16:15 ARP reply with VRRP virtual src hw addr  OK
>>>>> 20:16:15 GARP OK
>>>>> 20:16:15 MPLS OK
>>>>>
>>>>> Cheers,
>>>>> Ole
>>>>>
>>>>>> On 7 Sep 2018, at 18:50, Ole Troan wrote:
>>>>>>
>>>>>> Ed,
>>>>>>
>>>>>> Let me take a closer look at these.
>>>>>> It appears that if VPP is slow to start, it might not have created the socket
>>>>>> yet. Let me try to put in a retry loop and see if that fixes verify.
>>>>>>
>>>>>> Cheers
>>>>>> Ole
>>>>>>
>>>>>> On 7 Sep 2018, at 18:06, Ed Kern via Lists.Fd.Io <ejk=cisco@lists.fd.io> wrote:
>>>>>>
>>>>>>> The make test errors below are causing pretty consistent failures.
>>>>>>> Note:  For whatever reason the failure rate is not 100%.
>>>>>>> The failures, together with the retries on nonconcurrent merge jobs, may lead
>>>>>>> to long build queues.
>>>>>>>
>>>>>>> These are not infra issues but I'll be keeping an eye on it.
>>>>>>>
>>>>>>> Ed
>>>>>>>
>>>>>>> 13:08:25

Re: [vpp-dev] build queue backed up a bit foo

2018-09-10 Thread Marco Varlese
Obviously I meant "does NOT add any meaningful insight..."
On Mon, 2018-09-10 at 16:48 +0200, Marco Varlese wrote:
> On Mon, 2018-09-10 at 14:29 +, Ed Kern (ejk) wrote:
> > At least three different possible actions to take at this point:  (outside
> > of fixing the issue)
> >
> > 1. remove make test attempt from arm build (return it to the way it was
> > before a week ago).
> >
> > 2. lower timeout further; my first thought would be in the 75 minute range
> > (from 120)
> I don't like this option mainly because it would still imply that developers
> will have to wait for 75 minutes to see a patch verified... The worst is that
> the job is a non-voting one hence having it running does add any meaningful
> insight into the build result. I'm in favour of the issue being resolved in
> the sandbox and eventually moved to production when stable.
> > 3. remove job altogether.
> I believe this should be the option to pursue unless it could be fixed
> quickly. As it is, it only causes delays to the overall build / review / merging
> process...
> >
> > I'm happy to push a patch that accomplishes the above or other options I
> > haven’t thought of for this mail.
> >
> > Just let me know..
> >
> > Ed
> >
> > > On Sep 10, 2018, at 4:31 AM, Marco Varlese wrote:
> > >
> > > Last example (today) can be found here
> > >
> > > 10:47:06 Not running extended tests (some tests will be skipped)
> > > 10:47:06 Perform 4 attempts to pass the suite...
> > > 10:47:10 *** Error in `python': double free or corruption (out): 0x75d483f0 ***
> > > 12:27:54 Build timed out (after 120 minutes). Marking the build as failed.
> > >
> > > Patch: https://gerrit.fd.io/r/#/c/14744/
> > >
> > > Other jobs finished more than an hour and a half ago and the patch cannot be
> > > marked Verified+1 because Jenkins is still waiting for the ARM job to complete
> > > (timeout = 120 minutes). IMHO it makes the overall patch submission and review
> > > very painful for authors and committers.
> > >
> > > I would recommend (it has been done in the past for other jobs) to disable this
> > > job in production, move it again to sandbox, get it fixed and eventually moved
> > > again to production...
> > >
> > > - Marco
> > >
> > > On Fri, 2018-09-07 at 12:04 -0700, Florin Coras wrote:
> > >
> > > > ARM jobs have not been working for some days now. That’s why their result is
> > > > skipped. Timeout is 2h but probably we should drop that even further …
> > > >
> > > > Florin
> > > >
> > > > > On Sep 7, 2018, at 11:59 AM, Ole Troan wrote:
> > > > >
> > > > > Trying out a change in:
> > > > > https://gerrit.fd.io/r/#/c/14732/
> > > > >
> > > > > All others succeed, but for ARM this doesn’t look too good.
> > > > > Stuck apparently.
> > > > >
> > > > > https://jenkins.fd.io/job/vpp-arm-verify-master-ubuntu1604/2157/console
> > > > > 20:16:04 ==
> > > > > 20:16:04 ARP Test Case
> > > > > 20:16:04 ==
> > > > > 20:16:07 *** Error in `python': free(): invalid pointer: 0x553263a8 ***
> > > > > 20:16:09 ARP  OK
> > > > > 20:16:09 ARP Duplicates   OK
> > > > > 20:16:09 test_arp_incomplete (test_neighbor.ARPTestCase)  OK
> > > > > 20:16:09 ARP Static   OK
> > > > > 20:16:15 ARP reply with VRRP virtual src hw addr  OK
> > > > > 20:16:15 GARP OK
> > > > > 20:16:15 MPLS

Re: [vpp-dev] build queue backed up a bit foo

2018-09-10 Thread Marco Varlese
On Mon, 2018-09-10 at 14:29 +, Ed Kern (ejk) wrote:
> At least three different possible actions to take at this point:  (outside of
> fixing the issue)
>
> 1. remove make test attempt from arm build (return it to the way it was before
> a week ago).
>
> 2. lower timeout further; my first thought would be in the 75 minute range
> (from 120)
I don't like this option mainly because it would still imply that developers
will have to wait for 75 minutes to see a patch verified... The worst is that
the job is a non-voting one hence having it running does add any meaningful
insight into the build result. I'm in favour of the issue being resolved in the
sandbox and eventually moved to production when stable.
> 3. remove job altogether.
I believe this should be the option to pursue unless it could be fixed quickly.
As it is, it only causes delays to the overall build / review / merging process...
>
> I'm happy to push a patch that accomplishes the above or other options I
> haven’t thought of for this mail.
>
> Just let me know..
>
> Ed
>
> > On Sep 10, 2018, at 4:31 AM, Marco Varlese wrote:
> >
> > Last example (today) can be found here
> >
> > 10:47:06 Not running extended tests (some tests will be skipped)
> > 10:47:06 Perform 4 attempts to pass the suite...
> > 10:47:10 *** Error in `python': double free or corruption (out): 0x75d483f0 ***
> > 12:27:54 Build timed out (after 120 minutes). Marking the build as failed.
> >
> > Patch: https://gerrit.fd.io/r/#/c/14744/
> >
> > Other jobs finished more than an hour and a half ago and the patch cannot be
> > marked Verified+1 because Jenkins is still waiting for the ARM job to complete
> > (timeout = 120 minutes). IMHO it makes the overall patch submission and review
> > very painful for authors and committers.
> >
> > I would recommend (it has been done in the past for other jobs) to disable this
> > job in production, move it again to sandbox, get it fixed and eventually moved
> > again to production...
> >
> > - Marco
> >
> > On Fri, 2018-09-07 at 12:04 -0700, Florin Coras wrote:
> >
> > > ARM jobs have not been working for some days now. That’s why their result is
> > > skipped. Timeout is 2h but probably we should drop that even further …
> > >
> > > Florin
> > >
> > > > On Sep 7, 2018, at 11:59 AM, Ole Troan wrote:
> > > >
> > > > Trying out a change in:
> > > > https://gerrit.fd.io/r/#/c/14732/
> > > >
> > > > All others succeed, but for ARM this doesn’t look too good.
> > > > Stuck apparently.
> > > >
> > > > https://jenkins.fd.io/job/vpp-arm-verify-master-ubuntu1604/2157/console
> > > > 20:16:04 ==
> > > > 20:16:04 ARP Test Case
> > > > 20:16:04 ==
> > > > 20:16:07 *** Error in `python': free(): invalid pointer: 0x553263a8 ***
> > > > 20:16:09 ARP  OK
> > > > 20:16:09 ARP Duplicates   OK
> > > > 20:16:09 test_arp_incomplete (test_neighbor.ARPTestCase)  OK
> > > > 20:16:09 ARP Static   OK
> > > > 20:16:15 ARP reply with VRRP virtual src hw addr  OK
> > > > 20:16:15 GARP OK
> > > > 20:16:15 MPLS OK
> > > >
> > > > Cheers,
> > > > Ole
> > > >
> > > > > On 7 Sep 2018, at 18:50, Ole Troan wrote:
> > > > >
> > > > > Ed,
> > > > >
> > > > > Let me take a closer look at these.
> > > > > It appears that if VPP is slow to start, it might not have created the socket
> > > > > yet. Let me try to put in a retry loop and see if that fixes verify.

Re: [vpp-dev] build queue backed up a bit foo

2018-09-10 Thread Ed Kern via Lists.Fd.Io
At least three different possible actions to take at this point:  (outside of 
fixing the issue)

1. remove make test attempt from arm build (return it to the way it was before 
a week ago).
2. lower timeout further; my first thought would be in the 75 minute range (from
120)
3. remove job altogether.

I'm happy to push a patch that accomplishes the above or other options I haven’t
thought of for this mail.

Just let me know..

Ed



On Sep 10, 2018, at 4:31 AM, Marco Varlese <mvarl...@suse.de> wrote:

Last example (today) can be found here

10:47:06 Not running extended tests (some tests will be skipped)
10:47:06 Perform 4 attempts to pass the suite...
10:47:10 *** Error in `python': double free or corruption (out): 0x75d483f0 ***
12:27:54 Build timed out (after 120 minutes). Marking the build as failed.

Patch: https://gerrit.fd.io/r/#/c/14744/

Other jobs finished more than an hour and a half ago and the patch cannot be
marked Verified+1 because Jenkins is still waiting for the ARM job to complete
(timeout = 120 minutes). IMHO it makes the overall patch submission and review
very painful for authors and committers.

I would recommend (it has been done in the past for other jobs) to disable this
job in production, move it again to sandbox, get it fixed and eventually moved
again to production...

- Marco

On Fri, 2018-09-07 at 12:04 -0700, Florin Coras wrote:
ARM jobs have not been working for some days now. That’s why their result is
skipped. Timeout is 2h but probably we should drop that even further …

Florin

On Sep 7, 2018, at 11:59 AM, Ole Troan <otr...@employees.org> wrote:
Trying out a change in:
https://gerrit.fd.io/r/#/c/14732/

All others succeed, but for ARM this doesn’t look too good.
Stuck apparently.

https://jenkins.fd.io/job/vpp-arm-verify-master-ubuntu1604/2157/console
20:16:04 ==
20:16:04 ARP Test Case
20:16:04 ==
20:16:07 *** Error in `python': free(): invalid pointer: 0x553263a8 ***
20:16:09 ARP  OK
20:16:09 ARP Duplicates   OK
20:16:09 test_arp_incomplete (test_neighbor.ARPTestCase)  OK
20:16:09 ARP Static   OK
20:16:15 ARP reply with VRRP virtual src hw addr  OK
20:16:15 GARP OK
20:16:15 MPLS OK


Cheers,
Ole


On 7 Sep 2018, at 18:50, Ole Troan  wrote:

Ed,

Let me take a closer look at these.
It appears that if VPP is slow to start, it might not have created the socket
yet. Let me try to put in a retry loop and see if that fixes verify.

Cheers
Ole

On 7 Sep 2018, at 18:06, Ed Kern via Lists.Fd.Io <ejk=cisco@lists.fd.io> wrote:

The make test errors below are causing pretty consistent failures.
Note:  For whatever reason the failure rate is not 100%.
The failures, together with the retries on nonconcurrent merge jobs, may lead
to long build queues.

These are not infra issues but I'll be keeping an eye on it.


Ed



13:08:25 Using /var/cache/vpp/python/virtualenv/lib/python2.7/site-packages
13:08:25 Finished processing dependencies for vpp-papi==1.6.1
13:08:27 Traceback (most recent call last):
13:08:27   File "sanity_run_vpp.py", line 21, in <module>
13:08:27     tc.setUpClass()
13:08:27   File "/w/workspace/vpp-merge-master-ubuntu1604/test/framework.py", line 394, in setUpClass
13:08:27     cls.statistics = VPPStats(socketname=cls.tempdir+'/stats.sock')
13:08:27   File "build/bdist.linux-x86_64/egg/vpp_papi/vpp_stats.py", line 117, in __init__
13:08:27 IOError
13:08:27 ***
13:08:27 * Sanity check failed, cannot run vpp
13:08:27 ***

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10432): https://lists.fd.io/g/vpp-dev/message/10432
Mute This Topic: https://lists.fd.io/mt/25308161/675193
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsubb  
[otr...@employees.org
]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + Contiv-VPP network plugin)

2018-09-10 Thread Stanislav Chlebec
Hi Neale
The patch solved my problem.
Thank you very much.

Stan

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Friday, September 7, 2018 10:51 AM
To: Stanislav Chlebec; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Hi Stan,

Thanks for the decode.

Given that I cannot analyse your core, I cannot be sure why the crash occurred.
But I do notice that using the route type we see in the trace in a new unit
test doesn’t produce the correct result. Here is the patch:
  https://gerrit.fd.io/r/#/c/14714/
maybe it will fix your crash too.

Regards
Neale


From: Stanislav Chlebec <stanislav.chle...@pantheon.tech>
Date: Thursday, 6 September 2018 at 11:00
To: "Neale Ranns (nranns)" <nra...@cisco.com>, "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: RE: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Thanks for the advice.
Here is the result:
https://gist.github.com/stanislav-chlebec/7466935c41b60eb23ea711f6a4fcafeb

Stan
From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Wednesday, September 5, 2018 1:58 PM
To: Stanislav Chlebec <stanislav.chle...@pantheon.tech>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

On the exact same version of VPP that produced the crash do:
  api trace custom-dump /path/to/trace/file.txt

/neale


From: Stanislav Chlebec <stanislav.chle...@pantheon.tech>
Date: Wednesday, 5 September 2018 at 13:24
To: "Neale Ranns (nranns)" <nra...@cisco.com>, "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: RE: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Hi Neale
Could you please describe, how to do it?
Thanks
Stan

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Tuesday, September 4, 2018 3:27 PM
To: Stanislav Chlebec <stanislav.chle...@pantheon.tech>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Hi Stan,

Unfortunately I don’t have an ARM machine on which to decode the post-mortem data.
Could you do this?

Thanks,
Neale


From: Stanislav Chlebec <stanislav.chle...@pantheon.tech>
Date: Tuesday, 4 September 2018 at 11:06
To: Stanislav Chlebec <stanislav.chle...@pantheon.tech>, "Neale Ranns (nranns)" <nra...@cisco.com>, "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: RE: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Hi Neale
Have you had a chance to look at that api_post_mortem data?
Have you found the reason for the crash?
Thanks
Stan


From: Stanislav Chlebec [mailto:stanislav.chle...@pantheon.tech]
Sent: Wednesday, August 22, 2018 3:39 PM
To: Neale Ranns (nranns) <nra...@cisco.com>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Hi Neale
I attached the file api_post_mortem.43407 to the issue https://jira.fd.io/browse/VPP-1394
Thanks
Stan

From: Neale Ranns (nranns) [mailto:nra...@cisco.com]
Sent: Tuesday, August 21, 2018 5:02 PM
To: Stanislav Chlebec <stanislav.chle...@pantheon.tech>; Nitin Saxena <nitin.sax...@cavium.com>; vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Hi Stan,

What route were you adding at the time? Can you give me the post-mortem API
dump [1]?

/neale

[1] see https://wiki.fd.io/view/VPP/BugReports


From: <vpp-dev@lists.fd.io> on behalf of Stanislav Chlebec <stanislav.chle...@pantheon.tech>
Date: Tuesday, 21 August 2018 at 16:41
To: Nitin Saxena <nitin.sax...@cavium.com>, "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: [vpp-dev] Cavium ThunderX (ARM64) - Crash in VPP (Kubernetes + 
Contiv-VPP network plugin)

Hello all

Could you please help me with this issue:
https://jira.fd.io/browse/VPP-1394

Thanks.
Stan
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10458): https://lists.fd.io/g/vpp-dev/message/10458
Mute This Topic: https://lists.fd.io/mt/24876710/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] anomaly detection changes

2018-09-10 Thread Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) via Lists.Fd.Io

Hello people watching the trending pages [0].

We have just merged a Change [1]
which affects how MRR tests work.
Instead of a single trial of 10 seconds duration,
the test now executes 10 trials of 1 second each.

PAL is told to aggregate the results,
so in theory nothing should change,
but in practice the anomaly detection
could spot some regressions or progressions.

If the anomalies are big, we will revert.
If they are small (or none),
we will tell PAL (next week)
to take standard deviations into account,
which will definitely make anomaly detection
mark many regressions and progressions
(just because it is not easy to have the detection
algorithm compatible with several trial durations at once).

After that, the anomaly detection should be able
to detect smaller regressions and progressions.
If it starts detecting more noise than signal,
we will revert (and try something else).

Vratko.

[0] https://docs.fd.io/csit/master/trending/introduction/dashboard.html
[1] https://gerrit.fd.io/r/14596
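
As an illustration of the aggregation described above, here is a minimal sketch. It assumes each trial yields one measured rate and that aggregating means taking the arithmetic mean together with a sample standard deviation. The function name aggregate_trials and both of those choices are assumptions made for illustration; this is not PAL's actual code.

```python
import statistics


def aggregate_trials(rates):
    """Return (mean, sample standard deviation) of per-trial rates.

    Hypothetical helper: `rates` holds one measured value per trial.
    """
    if not rates:
        raise ValueError("at least one trial result is required")
    mean = statistics.mean(rates)
    # A single trial has no spread to estimate; report 0.0 in that case.
    stdev = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return mean, stdev
```

With ten 1-second trials the mean plays the role of the old single 10-second result, while the standard deviation is the extra information a later detection step could take into account.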
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10457): https://lists.fd.io/g/vpp-dev/message/10457
Mute This Topic: https://lists.fd.io/mt/25500458/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [vpp-dev] vl_api_sw_interface_dump problem

2018-09-10 Thread emma sdi
I tested your patch; the bug still exists.

On Mon, Sep 10, 2018 at 4:22 PM Damjan Marion  wrote:

> Can you please try this patch:
>
> https://paste.ubuntu.com/p/GtqHd2yWqK/
>
> --
> Damjan
>
> On 10 Sep 2018, at 13:44, khers  wrote:
>
> Dear Damjan
>
> This is the output of 'show
> event-logger' before the problem.
> I set the GigabitEthernet0/4/0 interface up, then call the dump_sw_interface api
> in a C program.
> Duplex and speed are not accurate.
> Output of 'show event-logger'
> after I saw the bug.
>
> footnote: output of 'git diff' 
> Cheers
>
> On Sun, Sep 9, 2018 at 2:47 PM Damjan Marion  wrote:
>
>>
>> Dear Emma, Chore,
>>
>> Patch 14647 is not a valid solution to the problem.
>> DPDK_DEVICE_FLAG_ADMIN_UP is not a valid hw interface flag.
>> From that code section, you can see that dpdk_update_link_state()
>> function is called which is supposed to update link speed and duplex after
>> link goes up.
>>
>> Can you change LINK_STATE_ELOGS  in src/plugins/dpdk/device/init.c to 1,
>> recompile vpp and capture link state events with "show event-logger" after
>> problem happens?
>>
>> --
>> Damjan
>>
>> On 9 Sep 2018, at 08:50, emma sdi  wrote:
>>
>> Dear community
>>
>> I have the same problem, and committed this suggestion in
>> https://gerrit.fd.io/r/#/c/14647/.
>> Could someone please review this code? It seems OK to me.
>>
>> Cheers,
>> Khers
>>
>> On Mon, Sep 3, 2018 at 12:18 PM sadjad  wrote:
>>
>>> Hi Dear VPP
>>> I tried to solve this problem, so I changed device.c in the dpdk plugin as
>>> you can see below:
>>>
>>> On branch stable/1807
>>> Your branch is up-to-date with 'origin/stable/1807'.
>>> Changes not staged for commit:
>>>
>>> modified:   src/plugins/dpdk/device/device.c
>>>
>>> diff --git a/src/plugins/dpdk/device/device.c b/src/plugins/dpdk/device/device.c
>>> index d5ab2585..159a395e 100644
>>> --- a/src/plugins/dpdk/device/device.c
>>> +++ b/src/plugins/dpdk/device/device.c
>>> @@ -547,11 +547,12 @@ dpdk_interface_admin_up_down (vnet_main_t * vnm, u32 hw_if_index, u32 flags)
>>>
>>>if (xd->flags & DPDK_DEVICE_FLAG_PMD_INIT_FAIL)
>>>  return clib_error_return (0, "Interface not initialized");
>>> -
>>> +  u32 hw_flags = hif->flags;
>>>if (is_up)
>>>  {
>>> + hw_flags |= DPDK_DEVICE_FLAG_ADMIN_UP;
>>>vnet_hw_interface_set_flags (vnm, xd->hw_if_index,
>>> -  VNET_HW_INTERFACE_FLAG_LINK_UP);
>>> +  hw_flags);
>>>if ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0)
>>> dpdk_device_start (xd);
>>>xd->flags |= DPDK_DEVICE_FLAG_ADMIN_UP;
>>> @@ -561,7 +562,8 @@ dpdk_interface_admin_up_down (vnet_main_t * vnm, u32 hw_if_index, u32 flags)
>>>  }
>>>else
>>>  {
>>> -  vnet_hw_interface_set_flags (vnm, xd->hw_if_index, 0);
>>> + hw_flags &= ~DPDK_DEVICE_FLAG_ADMIN_UP;
>>> +  vnet_hw_interface_set_flags (vnm, xd->hw_if_index, hw_flags);
>>>if ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) != 0)
>>> dpdk_device_stop (xd);
>>>xd->flags &= ~DPDK_DEVICE_FLAG_ADMIN_UP;
>>>
>>> and the problem is fixed. What do you think?
>>>
>>> Best Regards
>>>
>>> On Mon, Sep 3, 2018 at 5:54 AM chore  wrote:
>>>
 Hi Dear VPP
 I wrote a small api client like vpp_api_test that contains
 sw_interface_dump api. when i was trying to use this api client i faced a
 problem in "stable/1807".
 At first i disconnected one of my links and my api client printed below
 output:

 GigabitEthernet0/9/0 duplex half speed 0
 admin: down
 link: down

 Then i connected the link and got this:

 GigabitEthernet0/9/0 duplex full speed 1000
 admin: down
 link: down

 At the end i changed admin status of link and saw below output:

 GigabitEthernet0/9/0 duplex bogus speed 0
 admin: up
 link: up

 and show hardware-interface GigabitEthernet0/9/0:
   NameIdx   Link  Hardware
 GigabitEthernet0/9/0   2 up   GigabitEthernet0/9/0
   Ethernet address 08:00:27:94:50:ba
   Intel 82540EM (e1000)
 carrier up full duplex speed 1000 mtu 9202
 flags: admin-up pmd maybe-multiseg tx-offload intel-phdr-cksum
 rx queues 1, rx desc 1024, tx queues 1, tx desc 1024
 cpu socket 0

 Based on these results, there seems to be a problem in
 vl_api_sw_interface_dump.
 More precisely, the 'duplex' and 'speed' values returned by the API
 are wrong, whereas the "show hardware-interface" CLI output shows the
 correct values.

 In addition, GDB output shows that both 'speed' and 'duplex' are zero
 in the reply message (mp).

 Breakpoint 1, vl_api_sw_interface_details_t_handler (mp=0x3005eabc) at
 interface-api.c:24
 24  int 

Re: [vpp-dev] vl_api_sw_interface_dump problem

2018-09-10 Thread Damjan Marion via Lists.Fd.Io
Can you please try this patch:

https://paste.ubuntu.com/p/GtqHd2yWqK/

-- 
Damjan

> On 10 Sep 2018, at 13:44, khers  wrote:
> 
> Dear Damjan
> 
> This is the output of 'show event-logger' before the problem.
> I set the GigabitEthernet0/4/0 interface up, then called the dump_sw_interface API from a
> C program.
> The duplex and speed values are not accurate.
> This is the output of 'show event-logger' after I saw the bug.
>
> Footnote: output of 'git diff'
> Cheers
> 
> On Sun, Sep 9, 2018 at 2:47 PM Damjan Marion  > wrote:
> 
> Dear Emma, Chore,
> 
> Patch 14647 is not a valid solution to the problem. DPDK_DEVICE_FLAG_ADMIN_UP
> is not a valid hw interface flag.
> In that code section you can see that the dpdk_update_link_state() function is
> called, which is supposed to update link speed and duplex after the link goes up.
>
> Can you change LINK_STATE_ELOGS in src/plugins/dpdk/device/init.c to 1,
> recompile vpp and capture link state events with "show event-logger" after the
> problem happens?
> 
> -- 
> Damjan
> 
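The objection that DPDK_DEVICE_FLAG_ADMIN_UP is not a valid hw interface flag can be illustrated with a small sketch. The bit values below are hypothetical and chosen only for illustration; the real definitions live in the VPP headers and may differ. The point is only that a bit taken from one flag namespace is silently reinterpreted when OR-ed into a word that belongs to another namespace.

```python
# Hypothetical bit values, chosen only for illustration. They stand in for
# the two separate flag namespaces discussed in the thread and are NOT
# copied from the VPP sources.
HW_FLAG_LINK_UP = 1 << 0       # stand-in for VNET_HW_INTERFACE_FLAG_LINK_UP
HW_FLAG_FULL_DUPLEX = 1 << 2   # stand-in for a hw-interface duplex bit
DEV_FLAG_ADMIN_UP = 1 << 0     # stand-in for DPDK_DEVICE_FLAG_ADMIN_UP

HW_FLAG_NAMES = {
    HW_FLAG_LINK_UP: "link-up",
    HW_FLAG_FULL_DUPLEX: "full-duplex",
}


def describe_hw_flags(flags):
    """Name the bits of a flags word as the hw-interface layer would read them."""
    return sorted(name for bit, name in HW_FLAG_NAMES.items() if flags & bit)


hw_flags = 0
hw_flags |= DEV_FLAG_ADMIN_UP  # device-layer bit mixed into hw-interface flags
# The hw-interface layer only sees a bit position, not the intended meaning.
print(describe_hw_flags(hw_flags))
```

Under these assumed values the device-level "admin up" bit is read back as "link-up". That would make such a change appear to work while setting a flag with a different meaning.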

Re: [vpp-dev] vl_api_sw_interface_dump problem

2018-09-10 Thread emma sdi
Dear Damjan

This is the output of 'show event-logger' before the problem.
I set the GigabitEthernet0/4/0 interface up, then called the dump_sw_interface API
from a C program.
The duplex and speed values are not accurate.
This is the output of 'show event-logger' after I saw the bug.

Footnote: output of 'git diff'
Cheers

On Sun, Sep 9, 2018 at 2:47 PM Damjan Marion  wrote:

>
> Dear Emma, Chore,
>
> Patch 14647 is not a valid solution to the problem.
> DPDK_DEVICE_FLAG_ADMIN_UP is not a valid hw interface flag.
> In that code section you can see that the dpdk_update_link_state() function
> is called, which is supposed to update link speed and duplex after the link
> goes up.
>
> Can you change LINK_STATE_ELOGS in src/plugins/dpdk/device/init.c to 1,
> recompile vpp and capture link state events with "show event-logger" after
> the problem happens?
>
> --
> Damjan
>
> On 9 Sep 2018, at 08:50, emma sdi  wrote:
>
> Dear community
>
> I have the same problem, and I have submitted this suggestion at
> https://gerrit.fd.io/r/#/c/14647/.
> Could someone please review this code? It seems OK to me.
>
> Cheers,
> Khers
>
> On Mon, Sep 3, 2018 at 12:18 PM sadjad  wrote:
>
>> Hi Dear VPP
>> I tried to solve this problem by changing device.c in the dpdk plugin, as
>> you can see below:
>>
>> On branch stable/1807
>> Your branch is up-to-date with 'origin/stable/1807'.
>> Changes not staged for commit:
>>
>> modified:   src/plugins/dpdk/device/device.c
>>
>> diff --git a/src/plugins/dpdk/device/device.c
>> b/src/plugins/dpdk/device/device.c
>> index d5ab2585..159a395e 100644
>> --- a/src/plugins/dpdk/device/device.c
>> +++ b/src/plugins/dpdk/device/device.c
>> @@ -547,11 +547,12 @@ dpdk_interface_admin_up_down (vnet_main_t * vnm,
>> u32 hw_if_index, u32 flags)
>>
>>if (xd->flags & DPDK_DEVICE_FLAG_PMD_INIT_FAIL)
>>  return clib_error_return (0, "Interface not initialized");
>> -
>> +  u32 hw_flags = hif->flags;
>>if (is_up)
>>  {
>> + hw_flags |= DPDK_DEVICE_FLAG_ADMIN_UP;
>>vnet_hw_interface_set_flags (vnm, xd->hw_if_index,
>> -  VNET_HW_INTERFACE_FLAG_LINK_UP);
>> +  hw_flags);
>>if ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0)
>> dpdk_device_start (xd);
>>xd->flags |= DPDK_DEVICE_FLAG_ADMIN_UP;
>> @@ -561,7 +562,8 @@ dpdk_interface_admin_up_down (vnet_main_t * vnm, u32
>> hw_if_index, u32 flags)
>>  }
>>else
>>  {
>> -  vnet_hw_interface_set_flags (vnm, xd->hw_if_index, 0);
>> + hw_flags &= ~DPDK_DEVICE_FLAG_ADMIN_UP;
>> +  vnet_hw_interface_set_flags (vnm, xd->hw_if_index, hw_flags);
>>if ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) != 0)
>> dpdk_device_stop (xd);
>>xd->flags &= ~DPDK_DEVICE_FLAG_ADMIN_UP;
>>
>> and the problem is fixed. What do you think?
>>
>> Best Regards
>>
>> On Mon, Sep 3, 2018 at 5:54 AM chore  wrote:
>>
>>> Hi Dear VPP
>>> I wrote a small API client, similar to vpp_api_test, that calls the
>>> sw_interface_dump API. While using this client I ran into a
>>> problem on "stable/1807".
>>> First I disconnected one of my links, and my API client printed the
>>> following output:
>>>
>>> GigabitEthernet0/9/0 duplex half speed 0
>>> admin: down
>>> link: down
>>>
>>> Then I connected the link and got this:
>>>
>>> GigabitEthernet0/9/0 duplex full speed 1000
>>> admin: down
>>> link: down
>>>
>>> Finally I changed the admin status of the link and saw the following output:
>>>
>>> GigabitEthernet0/9/0 duplex bogus speed 0
>>> admin: up
>>> link: up
>>>
>>> and show hardware-interface GigabitEthernet0/9/0:
>>>               Name                Idx   Link  Hardware
>>> GigabitEthernet0/9/0               2     up   GigabitEthernet0/9/0
>>>   Ethernet address 08:00:27:94:50:ba
>>>   Intel 82540EM (e1000)
>>> carrier up full duplex speed 1000 mtu 9202
>>> flags: admin-up pmd maybe-multiseg tx-offload intel-phdr-cksum
>>> rx queues 1, rx desc 1024, tx queues 1, tx desc 1024
>>> cpu socket 0
>>>
>>> Based on these results, there seems to be a problem in
>>> vl_api_sw_interface_dump.
>>> More precisely, the 'duplex' and 'speed' values returned by the API are
>>> wrong, whereas the "show hardware-interface" CLI output shows the correct
>>> values.
>>>
>>> In addition, GDB output shows that both 'speed' and 'duplex' are zero in
>>> the reply message (mp).
>>>
>>> Breakpoint 1, vl_api_sw_interface_details_t_handler (mp=0x3005eabc) at
>>> interface-api.c:24
>>> 24  int speed = 0;
>>> (gdb) p *mp
>>> $2 = {_vl_msg_id = 21504, context = 0, sw_if_index = 33554432,
>>> sup_sw_if_index = 33554432, l2_address_length = 100663296, l2_address =
>>> "\b\000'\224P\272\000",
>>>   interface_name = "GigabitEthernet0/9/0", '\000' ,
>>> admin_up_down = 1 '\001', link_up_down = 1 '\001', link_duplex = 0 '\000',
>>>   link_speed = 0 '\000', link_mtu = 61987, 
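
Several integers in the gdb dump above look implausible at first glance (for example sw_if_index = 33554432). They become sensible if the multi-byte fields were printed without converting from network byte order. The following is a minimal sketch of that conversion; the field widths are an assumption inferred from the magnitudes of the values, not taken from the API definition.

```python
import struct


def swap16(value):
    """Reinterpret a 16-bit value that was read in the wrong byte order."""
    return struct.unpack("<H", struct.pack(">H", value))[0]


def swap32(value):
    """Reinterpret a 32-bit value that was read in the wrong byte order."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


# (name, assumed width in bits, raw value as printed by gdb)
raw_fields = [
    ("_vl_msg_id", 16, 21504),
    ("sw_if_index", 32, 33554432),
    ("sup_sw_if_index", 32, 33554432),
    ("l2_address_length", 32, 100663296),
    ("link_mtu", 16, 61987),
]


def decode(fields):
    """Return {name: byte-swapped value} for (name, width, raw) tuples."""
    swappers = {16: swap16, 32: swap32}
    return {name: swappers[width](raw) for name, width, raw in fields}


for name, value in decode(raw_fields).items():
    print(f"{name} = {value}")
```

Decoded this way, sw_if_index is 2, l2_address_length is 6 and link_mtu is 9202, which agree with the interface index and mtu in the "show hardware-interface" output. link_duplex and link_speed are printed by gdb as single characters, so byte order cannot explain their zero values.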

Re: [vpp-dev] build queue backed up a bit foo

2018-09-10 Thread Marco Varlese
Last example (today) can be found here

10:47:06 Not running extended tests (some tests will be skipped)
10:47:06 Perform 4 attempts to pass the suite...
10:47:10 *** Error in `python': double free or corruption (out):
0x75d483f0 ***
12:27:54 Build timed out (after 120 minutes). Marking the build as failed.

Patch: https://gerrit.fd.io/r/#/c/14744/

The other jobs finished more than an hour and a half ago, but the patch cannot be
marked Verified+1 because Jenkins is still waiting for the ARM job to complete
(timeout = 120 minutes). IMHO this makes the overall patch submission and review
process very painful for authors and committers.

I would recommend (as has been done in the past for other jobs) disabling this
job in production, moving it back to the sandbox, getting it fixed, and eventually
moving it back to production.

- Marco

On Fri, 2018-09-07 at 12:04 -0700, Florin Coras wrote:
> ARM jobs have not been working for some days now. That’s why their result is
> skipped. Timeout is 2h but probably we should drop that even further … 
> 
> Florin
> 
> > On Sep 7, 2018, at 11:59 AM, Ole Troan  wrote:
> > Trying out a change in:
> > https://gerrit.fd.io/r/#/c/14732/
> > 
> > > All others succeed, but the ARM job doesn’t look too good.
> > > It is apparently stuck.
> > 
> > > https://jenkins.fd.io/job/vpp-arm-verify-master-ubuntu1604/2157/console
> > > 20:16:04 ==
> > > 20:16:04 ARP Test Case
> > > 20:16:04 ==
> > > 20:16:07 *** Error in `python': free(): invalid pointer: 0x553263a8 ***
> > > 20:16:09 ARP                                              OK
> > > 20:16:09 ARP Duplicates                                   OK
> > > 20:16:09 test_arp_incomplete (test_neighbor.ARPTestCase)  OK
> > > 20:16:09 ARP Static                                       OK
> > > 20:16:15 ARP reply with VRRP virtual src hw addr          OK
> > > 20:16:15 GARP                                             OK
> > > 20:16:15 MPLS                                             OK
> > 
> > 
> > Cheers,
> > Ole
> > 
> > 
> > > On 7 Sep 2018, at 18:50, Ole Troan  wrote:
> > > 
> > > Ed,
> > > 
> > > Let me take a closer look at these.
> > > > It appears that if VPP is slow to start, it might not have created the
> > > > socket yet. Let me try to put in a retry loop and see if that fixes verify.
> > > 
> > > Cheers 
> > > Ole
> > > 
> > > On 7 Sep 2018, at 18:06, Ed Kern via Lists.Fd.Io <
> > > ejk=cisco@lists.fd.io> wrote:
> > > 
> > > > > make test is failing fairly consistently because of the error below.
> > > > > Note: for whatever reason the failure rate is not 100%.
> > > > > The failures, together with the retries on non-concurrent merge jobs, may
> > > > > lead to long build queues.
> > > > >
> > > > > These are not infra issues, but I'll be keeping an eye on it.
> > > > 
> > > > 
> > > > Ed
> > > > 
> > > > 
> > > > 
> > > > 13:08:25 
> > > > Using /var/cache/vpp/python/virtualenv/lib/python2.7/site-packages
> > > > 
> > > > 13:08:25 
> > > > Finished processing dependencies for vpp-papi==1.6.1
> > > > 
> > > > 13:08:27 
> > > > Traceback (most recent call last):
> > > > 
> > > > 13:08:27 
> > > >  File "sanity_run_vpp.py", line 21, in 
> > > > 
> > > > 13:08:27 
> > > >tc.setUpClass()
> > > > 
> > > > 13:08:27 
> > > >  File "/w/workspace/vpp-merge-master-ubuntu1604/test/framework.py", line
> > > > 394, in setUpClass
> > > > 
> > > > 13:08:27 
> > > >cls.statistics = VPPStats(socketname=cls.tempdir+'/stats.sock')
> > > > 
> > > > 13:08:27 
> > > >  File "build/bdist.linux-x86_64/egg/vpp_papi/vpp_stats.py", line 117, in
> > > > __init__
> > > > 
> > > > 13:08:27 
> > > > IOError
> > > > 
> > > > 13:08:27 
> > > > ***
> > > > 
> > > > 13:08:27 
> > > > * Sanity check failed, cannot run vpp
> > > > 
> > > > 13:08:27
> > > > ***
> > > > 
> > > > -=-=-=-=-=-=-=-=-=-=-=-
> > > > Links: You receive all messages sent to this group.
> > > > 
> > > > View/Reply Online (#10432): https://lists.fd.io/g/vpp-dev/message/10432
> > > > Mute This Topic: https://lists.fd.io/mt/25308161/675193
> > > > Group Owner: vpp-dev+ow...@lists.fd.io
> > > > Unsubscribe: https://lists.fd.io/g/vpp-dev/unsubb  [otr...@employees.org
> > > > ]
> > > > -=-=-=-=-=-=-=-=-=-=-=-
> > > -=-=-=-=-=-=-=-=-=-=-=-
> > > Links: You receive all messages sent to this group.
> > > 
> > > View/Reply Online (#10435): https://lists.fd.io/g/vpp-dev/message/10435
> > > Mute This Topic: https://lists.fd.io/mt/25308161/675193
> > > Group Owner: vpp-dev+ow...@lists.fd.io
> > > 

Re: [vpp-dev] vpp_api_test missing after vpp installation

2018-09-10 Thread Damjan Marion via Lists.Fd.Io

Should be fixed with this one:

https://gerrit.fd.io/r/#/c/14744/

-- 
Damjan

> On 10 Sep 2018, at 10:04, Peter Mikus via Lists.Fd.Io 
>  wrote:
> 
> Hello vpp-dev,
>  
> In CSIT we are facing an issue: the “vpp_api_test” binary, normally located in
> “/usr/bin”, is missing after the *deb packages are installed on the system.
>  
> testuser@s14-t32-sut1:/ $ sudo find / -name vpp_api_test
> testuser@s14-t32-sut1:/ $
>  
> testuser@s14-t32-sut1:/ $ ll /usr/bin/vpp*
> -rwxr-xr-x 1 root root 881800 Sep  7 15:19 /usr/bin/vpp*
> -rwxr-xr-x 1 root root  24246 Sep  7 15:19 /usr/bin/vppapigen*
> -rwxr-xr-x 1 root root  14712 Sep  7 15:19 /usr/bin/vppctl*
> -rwxr-xr-x 1 root root  10544 Sep  7 15:19 /usr/bin/vpp_get_metrics*
> -rwxr-xr-x 1 root root  14768 Sep  7 15:19 /usr/bin/vpp_prometheus_export*
> -rwxr-xr-x 1 root root  10528 Sep  7 15:19 /usr/bin/vpp_restart*
>  
>  
> From the CSIT perspective this is crucial, as “vpp_api_test” is used as the
> primary tool for configuring VPP.
>  
> A short bisect investigation narrows it down to this commit-id range:
> Last good: 4e588aa
> First bad: 74cac88
>  
> git log --pretty="short" 4e588aa..74cac88
>  
> commit 74cac8839efae6a69baea031fb01602ef8907e8a
> Author: Florin Coras <fco...@cisco.com>
>  
> session: fix reentrant listens
>  
> commit d790c7e1fa5f1accb621aa75089212be586c137f
> Author: Matthew Smith <mgsm...@netgate.com>
>  
> update regex used by rpm build to find lib files
>  
> commit 36feebb42f1fb9734c1b99b4afae87d3c8233548
> Author: Dave Barach <dbar...@cisco.com>
>  
> Improve NTP / kernel time change event handling
>  
> commit 833de8cab672c806176d580a1ebc001f394b2eaf
> Author: Damjan Marion <damar...@cisco.com>
>  
> cmake: set packaging component for different files
>  
> commit 0745036cb9ead1a3aaf9686c8c8046cb7285ea52
> Author: Marco Varlese <marco.varl...@suse.com>
>  
> Cavium OcteonTX: cache line fix
>  
> A short elimination among those changes leaves one primary suspect.
>  
> Can you please take a look and advise?
>  
> Thank you.
>  
> Peter Mikus
> Engineer – Software
> Cisco Systems Limited
>  

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10452): https://lists.fd.io/g/vpp-dev/message/10452
Mute This Topic: https://lists.fd.io/mt/25499282/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


[vpp-dev] vpp_api_test missing after vpp installation

2018-09-10 Thread Peter Mikus via Lists.Fd.Io
Hello vpp-dev,

In CSIT we are facing an issue: the "vpp_api_test" binary, normally located in
"/usr/bin", is missing after the *deb packages are installed on the system.

testuser@s14-t32-sut1:/ $ sudo find / -name vpp_api_test
testuser@s14-t32-sut1:/ $

testuser@s14-t32-sut1:/ $ ll /usr/bin/vpp*
-rwxr-xr-x 1 root root 881800 Sep  7 15:19 /usr/bin/vpp*
-rwxr-xr-x 1 root root  24246 Sep  7 15:19 /usr/bin/vppapigen*
-rwxr-xr-x 1 root root  14712 Sep  7 15:19 /usr/bin/vppctl*
-rwxr-xr-x 1 root root  10544 Sep  7 15:19 /usr/bin/vpp_get_metrics*
-rwxr-xr-x 1 root root  14768 Sep  7 15:19 /usr/bin/vpp_prometheus_export*
-rwxr-xr-x 1 root root  10528 Sep  7 15:19 /usr/bin/vpp_restart*


From the CSIT perspective this is crucial, as "vpp_api_test" is used as the
primary tool for configuring VPP.

A short bisect investigation narrows it down to this commit-id range:
Last good: 4e588aa
First bad: 74cac88

git log --pretty="short" 4e588aa..74cac88

commit 74cac8839efae6a69baea031fb01602ef8907e8a
Author: Florin Coras 

session: fix reentrant listens

commit d790c7e1fa5f1accb621aa75089212be586c137f
Author: Matthew Smith 

update regex used by rpm build to find lib files

commit 36feebb42f1fb9734c1b99b4afae87d3c8233548
Author: Dave Barach 

Improve NTP / kernel time change event handling

commit 833de8cab672c806176d580a1ebc001f394b2eaf
Author: Damjan Marion 

cmake: set packaging component for different files

commit 0745036cb9ead1a3aaf9686c8c8046cb7285ea52
Author: Marco Varlese 

Cavium OcteonTX: cache line fix

A short elimination among those changes leaves one primary suspect.

Can you please take a look and advise?

Thank you.

Peter Mikus
Engineer - Software
Cisco Systems Limited

View/Reply Online (#10451): https://lists.fd.io/g/vpp-dev/message/10451


[vpp-dev] vpp api test issue

2018-09-10 Thread Kevin Yan
Hi,
I started to learn VPP recently and I am trying to write a control plane
application to start and configure VPP automatically. As a first step I
compiled test_client.c and ran it. Some commands work, but some do not. For
example, typing 'L', which the help below lists as "link evts on", does
nothing at all. It seems that "vl_api_want_interface_events_t_handler" is not
implemented. Can anybody help? I would expect all the commands listed in
test_client to work.

Type 'h' for help, 'q' to quit...
h
q=quit,d=dump,L=link evts on,l=link evts off
S=stats on,s=stats off
4=add v4 route, 3=del v4 route
6=add v6 route, 5=del v6 route
A=add v4 intfc route, a=del v4 intfc route
B=add v6 intfc route, b=del v6 intfc route
z=del all intfc routes
t=set v4 intfc table, T=set v6 intfc table
c=connect unix tap
j=set dhcpv4 and v6 link-address/option-82 params
k=set dhcpv4 relay agent params
K=set dhcpv6 relay agent params
E=add l2 patch, e=del l2 patch
V=ip6 set link-local address
w=ip6 enable
W=ip6 disable
x=ip6 nd config
X=no ip6 nd config
y=ip6 nd prefix
Y=no ip6 nd prefix
@=l2 xconnect
#=l2 bridge

BRs,
Kevin



View/Reply Online (#10450): https://lists.fd.io/g/vpp-dev/message/10450