Re: [vpp-dev] can't establish tcp connection with new introduced transport_endpoint_freelist

2023-03-14 Thread Zhang Dongya
Just use this patch and the connection can be reconnected after closed.

However, I find another possible bug when using local ip + local port for
different target server due to transport_endpoint_mark_used return error
if it find local ip + port being created.

I think it should increase the refcnt instead if it find 6 tuple is unique.

static int
> transport_endpoint_mark_used (u8 proto, ip46_address_t *ip, u16 port)
> {
>   transport_main_t *tm = &tp_main;
>   local_endpoint_t *lep;
>   u32 tei;
>
>   ASSERT (vlib_get_thread_index () <= transport_cl_thread ());
>
  // BUG??? maybe should allow reuse ???
>
  tei =
> transport_endpoint_lookup (&tm->local_endpoints_table, proto, ip,
> port);
>   if (tei != ENDPOINT_INVALID_INDEX)
> return SESSION_E_PORTINUSE;
>
>   /* Pool reallocs with worker barrier */
>   lep = transport_endpoint_alloc ();
>   clib_memcpy_fast (&lep->ep.ip, ip, sizeof (*ip));
>   lep->ep.port = port;
>   lep->proto = proto;
>   lep->refcnt = 1;
>
>   transport_endpoint_table_add (&tm->local_endpoints_table, proto,
> &lep->ep,
> lep - tm->local_endpoints);
>
>   return 0;
> }
>

Florin Coras  于2023年3月14日周二 11:38写道:

> Hi,
>
> Could you try this out [1]? I’ve hit this issue myself today but with udp
> sessions. Unfortunately, as you’ve correctly pointed out, we were forcing a
> cleanup only on the non-fixed local port branch.
>
> Regards,
> Florin
>
> [1] https://gerrit.fd.io/r/c/vpp/+/38473
>
> On Mar 13, 2023, at 7:35 PM, Zhang Dongya 
> wrote:
>
> Hi list,
>
> We have update coded from the upstream session&tcp changes to our code
> base and find a possible bug which cause tcp connection can't be
> established anymore.
>
> Our scenario is that we will connect to a remote tcp server with specified
> local port and local ip, however, new vpp code have introduced a
> lcl_endpts_freelist which will be either flushed when pending local
> endpoint exceeded the limit (32) or when transport_alloc_local_port is
> called.
>
> However, since we specify the local port and local ip and the total
> session count is limited (< 32), in this case, the
> transport_cleanup_freelist will never be called which cause the previous
> session which use the specified local port and local ip will not be
> released after the session aborted.
>
> I think we should also try to free the list in such case as I did in the
> following code:
>
> int
>> transport_alloc_local_endpoint (u8 proto, transport_endpoint_cfg_t *
>> rmt_cfg,
>> ip46_address_t * lcl_addr, u16 * lcl_port)
>> {
>>   // ZDY:
>>   transport_main_t *tm = &tp_main;
>>   transport_endpoint_t *rmt = (transport_endpoint_t *) rmt_cfg;
>>   session_error_t error;
>>   int port;
>>
>>   /*
>>* Find the local address
>>*/
>>   if (ip_is_zero (&rmt_cfg->peer.ip, rmt_cfg->peer.is_ip4))
>> {
>>   error = transport_find_local_ip_for_remote
>> (&rmt_cfg->peer.sw_if_index,
>>  rmt, lcl_addr);
>>   if (error)
>> return error;
>> }
>>   else
>> {
>>   /* Assume session layer vetted this address */
>>   clib_memcpy_fast (lcl_addr, &rmt_cfg->peer.ip,
>> sizeof (rmt_cfg->peer.ip));
>> }
>>
>>   /*
>>* Allocate source port
>>*/
>>   if (rmt_cfg->peer.port == 0)
>> {
>>   port = transport_alloc_local_port (proto, lcl_addr, rmt_cfg);
>>   if (port < 1)
>> return SESSION_E_NOPORT;
>>   *lcl_port = port;
>> }
>>   else
>> {
>>   port = clib_net_to_host_u16 (rmt_cfg->peer.port);
>>   *lcl_port = port;
>>
>>
>>
>>
>>
>>
>> *  // ZDY: need add this to to cleanup because in specified src port
>> // case, we will not run to transport_alloc_local_port, then  //
>> freelist will only be freeed when list is full (>32).  /* Cleanup
>> freelist if need be */  if (vec_len (tm->lcl_endpts_freelist))
>> transport_cleanup_freelist ();*
>>
>>   return transport_endpoint_mark_used (proto, lcl_addr, port);
>> }
>>
>>   return 0;
>> }
>>
>
>
>
>
>
>
> 
>
>

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22700): https://lists.fd.io/g/vpp-dev/message/22700
Mute This Topic: https://lists.fd.io/mt/97596886/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] Race condition between bihash deletion and searching - misuse or bug?

2023-03-14 Thread Hao Tian
Hi all,

I found that bihash might return zero'ed value (0xff's) if deletion and 
searching were performed in parallel on different threads. This is reproducible 
by only ~100 lines of code, from 21.06 all the way up to git master, and on 
multiple machines from Coffee Lake to Tiger Lake.

The code (a plugin) and test command is shown below. Given how easy it is to 
reproduce the problem, I'm not sure whether this is a bug or something missing 
on my end. Any advice is welcomed. Thanks!

 bihash_race.c 

#include 
#include 
#include 

static volatile int br_run;
static clib_bihash_8_8_t br_bihash_8_8;
static int test_run_cnt[2];

static void
bihash_race_set (u64 key)
{
  clib_bihash_kv_8_8_t kv = { .key = key, .value = 1 };

  clib_bihash_add_del_8_8 (&br_bihash_8_8, &kv, BIHASH_ADD);
}

static void
bihash_race_del (u64 key)
{
  clib_bihash_kv_8_8_t kv = { .key = key, .value = 1 };

  clib_bihash_add_del_8_8 (&br_bihash_8_8, &kv, BIHASH_DEL);
}

static void
bihash_race_get (u64 key)
{
  clib_bihash_kv_8_8_t kv = { .key = key, .value = 1 };
  clib_bihash_kv_8_8_t value;

  if (!clib_bihash_search_8_8 (&br_bihash_8_8, &kv, &value))
    {
  if (value.value != 1)
    {
  clib_warning ("suspicious value: 0x%lx\n", value.value);
    }
    }
}

static clib_error_t *
bihash_race_init (vlib_main_t *vm)
{
  clib_bihash_init_8_8 (&br_bihash_8_8, "bihash_race hash", 4096, 4096 * 256);

  return 0;
}

VLIB_NODE_FN (br_test_node)
(vlib_main_t *vm, vlib_node_runtime_t *node, vlib_frame_t *frame)
{
  u64 k = 0x01010101UL;

  if (!br_run || vm->thread_index == 0)
    {
  return 0;
    }

  // perform bihash set and delete for 5 times each in thread 1
  if (vm->thread_index == 1 && test_run_cnt[0] < 10)
    {
  if (test_run_cnt[1] & 1)
    bihash_race_del (k);
  else
    bihash_race_set (k);
  test_run_cnt[0]++;
  return 0;
    }

  // perform bihash lookup for 10 times in thread 2
  if (vm->thread_index == 2 && test_run_cnt[1] < 10)
    {
  bihash_race_get (k);
  test_run_cnt[1]++;
  return 0;
    }
  return 0;
}

void
bihash_race_test_reset ()
{
  memset (test_run_cnt, 0, sizeof (test_run_cnt));
  clib_warning ("== test reset ==");
}

static clib_error_t *
bihash_race_test_command_fn (vlib_main_t *vm, unformat_input_t *input,
 vlib_cli_command_t *cmd)
{
  u8 state = ~0;

  while (unformat_check_input (input) != UNFORMAT_END_OF_INPUT)
    {
  if (unformat (input, "on"))
    state = 1;
  else if (unformat (input, "off"))
    state = 0;
  else
    return clib_error_return (0, "invalid input");
    }

  if (state)
    {
  br_run = 1;
    }
  else
    {
  br_run = 0;
  bihash_race_test_reset ();
  vlib_cli_output (vm, "== test reset ==");
    }

  return 0;
}

VLIB_CLI_COMMAND (bihash_race_test_command, static) = {
  .path = "set bihash-race-test",
  .short_help = "set bihash-race-test |",
  .function = bihash_race_test_command_fn,
};

VLIB_REGISTER_NODE (br_test_node) = {
  .name = "bihash-race-test",
  .type = VLIB_NODE_TYPE_INPUT,
  .state = VLIB_NODE_STATE_POLLING,
};

VLIB_INIT_FUNCTION (bihash_race_init);

VLIB_PLUGIN_REGISTER () = {
  .version = VPP_BUILD_VER,
  .description = "bihash race test",
};

= CMakeLists.txt =

add_vpp_plugin(bihash_race
  SOURCES
  bihash_race.c
)

= startup.conf =

unix {
  nodaemon
  full-coredump
  cli-listen /run/vpp/cli.sock
  gid 1000
  interactive
}

cpu {
  workers 3
  main-core 1
}

dpdk {
  no-pci
}

= test command =

terminal1 # vpp -c startup.conf
terminal2 # while true; do vppctl set bihash-race-test on; sleep 1; vppctl set 
bihash-race-test off; done

VPP will spit out a lot of "suspicious value: 0x" in terminal1, 
while the code above never saves such value into bihash - the value comes from 
the !is_add branch in clib_bihash_add_del_inline_with_hash. Changing the memset 
value (and clib_bihash_is_free_* obviously) in this branch will lead to this 
new value being returned.

Regards,
Hao Tian
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22701): https://lists.fd.io/g/vpp-dev/message/22701
Mute This Topic: https://lists.fd.io/mt/97599770/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] VPP 23.06 release plan is available

2023-03-14 Thread Andrew Yourtchenko
Hi all,

I’ve prepared the 23.06 release plan - and linked it off the usual place on VPP 
wiki:

https://wiki.fd.io/view/VPP#Get_Involved

Tl;dr: release the last Wednesday of June, RC2 two weeks prior; RC1 three weeks 
prior to RC2. Looks like this schedule works well. Same logic as usual - 
post-RC2 only the fixes for issues found in CSIT, post-RC1 only the fixes + 
specially agreed low risk commits.

This kind of schedule seems to have worked pretty well, so I will keep it, 
unless there’s anyone tells a good reason not to.

Onwards to 23.06! :-)

--a /* your friendly 23.06 release manager */
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22702): https://lists.fd.io/g/vpp-dev/message/22702
Mute This Topic: https://lists.fd.io/mt/97602729/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] Race condition between bihash deletion and searching - misuse or bug?

2023-03-14 Thread Dave Barach
Quick experiment: in
src/vppinfra/bihash_template.h:clib_bihash_search_inline_2_with_hash(),
replace this:

  if (PREDICT_FALSE (b->lock))
{
  volatile BVT (clib_bihash_bucket) * bv = b;
  while (bv->lock)
CLIB_PAUSE ();
}

With:

BV(clib_bihash_lock_bucket(b));

and make sure to BV(clib_bihash_unlock_bucket(b)); just prior to return 0
and return -1 in that function.

Please let me know what happens. 

Thanks... Dave

-Original Message-
From: vpp-dev@lists.fd.io  On Behalf Of Hao Tian
Sent: Tuesday, March 14, 2023 4:13 AM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] Race condition between bihash deletion and searching -
misuse or bug?

Hi all,

I found that bihash might return zero'ed value (0xff's) if deletion and
searching were performed in parallel on different threads. This is
reproducible by only ~100 lines of code, from 21.06 all the way up to git
master, and on multiple machines from Coffee Lake to Tiger Lake.

The code (a plugin) and test command is shown below. Given how easy it is to
reproduce the problem, I'm not sure whether this is a bug or something
missing on my end. Any advice is welcomed. Thanks!

 bihash_race.c 

#include 
#include 
#include 

static volatile int br_run;
static clib_bihash_8_8_t br_bihash_8_8;
static int test_run_cnt[2];

static void
bihash_race_set (u64 key)
{
  clib_bihash_kv_8_8_t kv = { .key = key, .value = 1 };

  clib_bihash_add_del_8_8 (&br_bihash_8_8, &kv, BIHASH_ADD); }

static void
bihash_race_del (u64 key)
{
  clib_bihash_kv_8_8_t kv = { .key = key, .value = 1 };

  clib_bihash_add_del_8_8 (&br_bihash_8_8, &kv, BIHASH_DEL); }

static void
bihash_race_get (u64 key)
{
  clib_bihash_kv_8_8_t kv = { .key = key, .value = 1 };
  clib_bihash_kv_8_8_t value;

  if (!clib_bihash_search_8_8 (&br_bihash_8_8, &kv, &value))
    {
  if (value.value != 1)
    {
  clib_warning ("suspicious value: 0x%lx\n", value.value);
    }
    }
}

static clib_error_t *
bihash_race_init (vlib_main_t *vm)
{
  clib_bihash_init_8_8 (&br_bihash_8_8, "bihash_race hash", 4096, 4096 *
256);

  return 0;
}

VLIB_NODE_FN (br_test_node)
(vlib_main_t *vm, vlib_node_runtime_t *node, vlib_frame_t *frame) {
  u64 k = 0x01010101UL;

  if (!br_run || vm->thread_index == 0)
    {
  return 0;
    }

  // perform bihash set and delete for 5 times each in thread 1
  if (vm->thread_index == 1 && test_run_cnt[0] < 10)
    {
  if (test_run_cnt[1] & 1)
    bihash_race_del (k);
  else
    bihash_race_set (k);
  test_run_cnt[0]++;
  return 0;
    }

  // perform bihash lookup for 10 times in thread 2
  if (vm->thread_index == 2 && test_run_cnt[1] < 10)
    {
  bihash_race_get (k);
  test_run_cnt[1]++;
  return 0;
    }
  return 0;
}

void
bihash_race_test_reset ()
{
  memset (test_run_cnt, 0, sizeof (test_run_cnt));
  clib_warning ("== test reset =="); }

static clib_error_t *
bihash_race_test_command_fn (vlib_main_t *vm, unformat_input_t *input,
 vlib_cli_command_t *cmd) {
  u8 state = ~0;

  while (unformat_check_input (input) != UNFORMAT_END_OF_INPUT)
    {
  if (unformat (input, "on"))
    state = 1;
  else if (unformat (input, "off"))
    state = 0;
  else
    return clib_error_return (0, "invalid input");
    }

  if (state)
    {
  br_run = 1;
    }
  else
    {
  br_run = 0;
  bihash_race_test_reset ();
  vlib_cli_output (vm, "== test reset ==");
    }

  return 0;
}

VLIB_CLI_COMMAND (bihash_race_test_command, static) = {
  .path = "set bihash-race-test",
  .short_help = "set bihash-race-test |",
  .function = bihash_race_test_command_fn, };

VLIB_REGISTER_NODE (br_test_node) = {
  .name = "bihash-race-test",
  .type = VLIB_NODE_TYPE_INPUT,
  .state = VLIB_NODE_STATE_POLLING,
};

VLIB_INIT_FUNCTION (bihash_race_init);

VLIB_PLUGIN_REGISTER () = {
  .version = VPP_BUILD_VER,
  .description = "bihash race test",
};

= CMakeLists.txt =

add_vpp_plugin(bihash_race
  SOURCES
  bihash_race.c
)

= startup.conf =

unix {
  nodaemon
  full-coredump
  cli-listen /run/vpp/cli.sock
  gid 1000
  interactive
}

cpu {
  workers 3
  main-core 1
}

dpdk {
  no-pci
}

= test command =

terminal1 # vpp -c startup.conf
terminal2 # while true; do vppctl set bihash-race-test on; sleep 1; vppctl
set bihash-race-test off; done

VPP will spit out a lot of "suspicious value: 0x" in
terminal1, while the code above never saves such value into bihash - the
value comes from the !is_add branch in clib_bihash_add_del_inline_with_hash.
Changing the memset value (and clib_bihash_is_free_* obviously) in this
branch will lead to this new value being returned.

Regards,
Hao Tian


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22703): https://lists.fd.io/g/vpp-dev/message/22703
Mute This Topic: https://lists.fd.io/

Re: [vpp-dev] can't establish tcp connection with new introduced transport_endpoint_freelist

2023-03-14 Thread Florin Coras
Hi, 

Are you looking for behavior similar to the one when random local ports are 
allocated when, if port is used, we check if the 5-tuple is available? 

Don’t think we explicitly supported this before but here’s a patch [1]. 

Regards,
Florin

[1] https://gerrit.fd.io/r/c/vpp/+/38486


> On Mar 14, 2023, at 12:56 AM, Zhang Dongya  wrote:
> 
> Just use this patch and the connection can be reconnected after closed.
> 
> However, I find another possible bug when using local ip + local port for 
> different target server due to transport_endpoint_mark_used return error
> if it find local ip + port being created.
> 
> I think it should increase the refcnt instead if it find 6 tuple is unique.
> 
>> static int
>> transport_endpoint_mark_used (u8 proto, ip46_address_t *ip, u16 port)
>> {
>>   transport_main_t *tm = &tp_main;
>>   local_endpoint_t *lep;
>>   u32 tei;
>> 
>>   ASSERT (vlib_get_thread_index () <= transport_cl_thread ());
>>   // BUG??? maybe should allow reuse ??? 
>>   tei =
>> transport_endpoint_lookup (&tm->local_endpoints_table, proto, ip, port);
>>   if (tei != ENDPOINT_INVALID_INDEX)
>> return SESSION_E_PORTINUSE;
>> 
>>   /* Pool reallocs with worker barrier */
>>   lep = transport_endpoint_alloc ();
>>   clib_memcpy_fast (&lep->ep.ip, ip, sizeof (*ip));
>>   lep->ep.port = port;
>>   lep->proto = proto;
>>   lep->refcnt = 1;
>> 
>>   transport_endpoint_table_add (&tm->local_endpoints_table, proto, &lep->ep,
>> lep - tm->local_endpoints);
>> 
>>   return 0;
>> }
> 
> Florin Coras mailto:fcoras.li...@gmail.com>> 
> 于2023年3月14日周二 11:38写道:
>> Hi, 
>> 
>> Could you try this out [1]? I’ve hit this issue myself today but with udp 
>> sessions. Unfortunately, as you’ve correctly pointed out, we were forcing a 
>> cleanup only on the non-fixed local port branch. 
>> 
>> Regards, 
>> Florin
>> 
>> [1] https://gerrit.fd.io/r/c/vpp/+/38473
>> 
>>> On Mar 13, 2023, at 7:35 PM, Zhang Dongya >> > wrote:
>>> 
>>> Hi list,
>>> 
>>> We have update coded from the upstream session&tcp changes to our code base 
>>> and find a possible bug which cause tcp connection can't be established 
>>> anymore.
>>> 
>>> Our scenario is that we will connect to a remote tcp server with specified 
>>> local port and local ip, however, new vpp code have introduced a 
>>> lcl_endpts_freelist which will be either flushed when pending local 
>>> endpoint exceeded the limit (32) or when transport_alloc_local_port is 
>>> called.
>>> 
>>> However, since we specify the local port and local ip and the total session 
>>> count is limited (< 32), in this case, the transport_cleanup_freelist will 
>>> never be called which cause the previous session which use the specified 
>>> local port and local ip will not be released after the session aborted.
>>> 
>>> I think we should also try to free the list in such case as I did in the 
>>> following code:
>>> 
 int
 transport_alloc_local_endpoint (u8 proto, transport_endpoint_cfg_t * 
 rmt_cfg,
 ip46_address_t * lcl_addr, u16 * lcl_port)
 {
   // ZDY:
   transport_main_t *tm = &tp_main;
   transport_endpoint_t *rmt = (transport_endpoint_t *) rmt_cfg;
   session_error_t error;
   int port;
 
   /*
* Find the local address
*/
   if (ip_is_zero (&rmt_cfg->peer.ip, rmt_cfg->peer.is_ip4))
 {
   error = transport_find_local_ip_for_remote 
 (&rmt_cfg->peer.sw_if_index,
  rmt, lcl_addr);
   if (error)
 return error;
 }
   else
 {
   /* Assume session layer vetted this address */
   clib_memcpy_fast (lcl_addr, &rmt_cfg->peer.ip,
 sizeof (rmt_cfg->peer.ip));
 }
 
   /*
* Allocate source port
*/
   if (rmt_cfg->peer.port == 0)
 {
   port = transport_alloc_local_port (proto, lcl_addr, rmt_cfg);
   if (port < 1)
 return SESSION_E_NOPORT;
   *lcl_port = port;
 }
   else
 {
   port = clib_net_to_host_u16 (rmt_cfg->peer.port);
   *lcl_port = port;
 
   // ZDY: need add this to to cleanup because in specified src port
   // case, we will not run to transport_alloc_local_port, then
   // freelist will only be freeed when list is full (>32).
   /* Cleanup freelist if need be */
   if (vec_len (tm->lcl_endpts_freelist))
 transport_cleanup_freelist ();
 
   return transport_endpoint_mark_used (proto, lcl_addr, port);
 }
 
   return 0;
 }
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22704): https://lists.fd.io/g/vpp-dev/message/22704
Mute This Topic: https://lists.fd.io/mt/97596886/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-

Re: [vpp-dev] lcpng with dhcp client on Linux

2023-03-14 Thread Bauruine

Sali Pim

I just found the solution.

set interface unnumbered GigabitEthernet3/0/0 use tap2

This in combination with RequestBroadcast worked and I've got an IP on 
Linux.


Thank you for your help, for your work on lcpng and for your articles on 
ipng.ch they are very interesting and helpful.


Best regards

Stefan

On 02.03.23 07:52, Bauruine wrote:


Hoi Pim

The dhclient of Ubuntu 22.04 doesn't have the -B option but 
systemd-networkd has a RequestBroadcast option 
(https://www.freedesktop.org/software/systemd/man/systemd.network.html#RequestBroadcast=) 
which seems to do the same. I still don't see the response on the TAP 
interface.


07:45:19.392398 00:0d:b9:50:99:0a > ff:ff:ff:ff:ff:ff, ethertype IPv4 
(0x0800), length 335: (tos 0xc0, ttl 64, id 0, offset 0, flags [none], 
proto UDP (17), length 321)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request 
from 00:0d:b9:50:99:0a, length 293, xid 0x50045ee8, secs 257, Flags 
[Broadcast] (0x8000)

      Client-Ethernet-Address 00:0d:b9:50:99:0a
      Vendor-rfc1048 Extensions
    Magic Cookie 0x63825363
    DHCP-Message (53), length 1: Discover
    Client-ID (61), length 19: hardware-type 255, 
fa:b6:49:fa:00:02:00:00:ab:11:0f:e3:35:ec:c0:75:5b:6f

    Parameter-Request (55), length 10:
      Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server 
(6), Hostname (12)

      Domain-Name (15), Static-Route (33), NTP (42), Unknown (119)
      Unknown (120), Classless-Static-Route (121)
    MSZ (57), length 2: 576
    Hostname (12), length 10: "gw02"
07:45:20.398006 00:0d:b9:50:9a:ad > ff:ff:ff:ff:ff:ff, ethertype IPv4 
(0x0800), length 342: (tos 0x10, ttl 128, id 0, offset 0, flags 
[none], proto UDP (17), length 328)
    192.168.25.1.67 > 255.255.255.255.68: [udp sum ok] BOOTP/DHCP, 
Reply, length 300, xid 0x50045ee8, secs 257, Flags [Broadcast] (0x8000)

      Your-IP 192.168.25.36
      Server-IP 192.168.25.1
      Client-Ethernet-Address 00:0d:b9:50:99:0a
      Vendor-rfc1048 Extensions
    Magic Cookie 0x63825363
    DHCP-Message (53), length 1: Offer
    Server-ID (54), length 4: 192.168.25.1
    Lease-Time (51), length 4: 43200
    Subnet-Mask (1), length 4: 255.255.255.0
    Default-Gateway (3), length 4: 192.168.25.1
    Domain-Name-Server (6), length 4: 192.168.1.1
    Domain-Name (15), length 15: "local.tuxli.ch."



On 02.03.23 00:19, Pim van Pelt via lists.fd.io wrote:

Hoi,

aha- the response is going to an IPv4 address that the server has 
assigned to you (192.168.25.1.67 > 192.168.25.36.68) and because that 
address is not configured in VPP, it will not relay it to the TAP.
For isc-dhclient, I think you can use -B 
(https://linux.die.net/man/8/dhclient) to ask it to tell the 
dhcp-server to /broadcast/ it's reply. Does that help (can you show a 
tcpdump of that transaction) ?


groet,
Pim

On Thu, Mar 2, 2023 at 12:05 AM Bauruine  wrote:

Grüezi Pim

Thank you for the reply.

I rewired it to a DHCP server under my control and did a tcpdump
on it while running dhclient -v ge0-3.

23:56:25.067464 00:0d:b9:50:99:0a > ff:ff:ff:ff:ff:ff, ethertype
IPv4 (0x0800), length 342: (tos 0x10, ttl 128, id 0, offset 0,
flags [none], proto UDP (17), length 328)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP,
Request from 00:0d:b9:50:99:0a, length 300, xid 0xbba5cf6a, secs
48, Flags [none] (0x)
      Client-Ethernet-Address 00:0d:b9:50:99:0a
      Vendor-rfc1048 Extensions
    Magic Cookie 0x63825363
    DHCP-Message (53), length 1: Discover
    Hostname (12), length 10: "gw02"
    Parameter-Request (55), length 13:
      Subnet-Mask (1), BR (28), Time-Zone (2),
Default-Gateway (3)
      Domain-Name (15), Domain-Name-Server (6), Unknown
(119), Hostname (12)
      Netbios-Name-Server (44), Netbios-Scope (47), MTU (26),
Classless-Static-Route (121)
      NTP (42)
23:56:25.072057 00:0d:b9:50:9a:ad > 00:0d:b9:50:99:0a, ethertype
IPv4 (0x0800), length 342: (tos 0x10, ttl 128, id 0, offset 0,
flags [none], proto UDP (17), length 328)
    192.168.25.1.67 > 192.168.25.36.68: [udp sum ok] BOOTP/DHCP,
Reply, length 300, xid 0xbba5cf6a, secs 48, Flags [none] (0x)
      Your-IP 192.168.25.36
      Server-IP 192.168.25.1
      Client-Ethernet-Address 00:0d:b9:50:99:0a
      Vendor-rfc1048 Extensions
    Magic Cookie 0x63825363
    DHCP-Message (53), length 1: Offer
    Server-ID (54), length 4: 192.168.25.1
    Lease-Time (51), length 4: 43200
    Subnet-Mask (1), length 4: 255.255.255.0
    Default-Gateway (3), length 4: 192.168.25.1
    Domain-Name (15), length 15: "local.tuxli.ch
."
    Domain-Name-Server (6), length 4: 192.168.1.1

I'm not sure how to do a "tcpdump" in VPP. On the TAP interface I
can't s

Re: [vpp-dev] Race condition between bihash deletion and searching - misuse or bug?

2023-03-14 Thread Hao Tian
Hi,

I've done the change you asked, and the "suspicious value" warnings are all 
gone. Here is the diff:

diff --git a/src/vppinfra/bihash_template.h b/src/vppinfra/bihash_template.h
index c4e120e4a..b9f658db3 100644
--- a/src/vppinfra/bihash_template.h
+++ b/src/vppinfra/bihash_template.h
@@ -532,12 +532,7 @@ static inline int BV 
(clib_bihash_search_inline_2_with_hash)
   if (PREDICT_FALSE (BV (clib_bihash_bucket_is_empty) (b)))
 return -1;
 
-  if (PREDICT_FALSE (b->lock))
-{
-  volatile BVT (clib_bihash_bucket) * bv = b;
-  while (bv->lock)
-   CLIB_PAUSE ();
-}
+  BV (clib_bihash_lock_bucket) (b);
 
   v = BV (clib_bihash_get_value) (h, b->offset);
 
@@ -557,9 +552,11 @@ static inline int BV 
(clib_bihash_search_inline_2_with_hash)
   if (BV (clib_bihash_key_compare) (v->kvp[i].key, search_key->key))
{
  *valuep = v->kvp[i];
+ BV (clib_bihash_unlock_bucket) (b);
  return 0;
}
 }
+  BV (clib_bihash_unlock_bucket) (b);
   return -1;
 }
 

Some questions regarding the change:

1. There is similar logic in clib_bihash_search_inline_with_hash. I suppose 
that this change needs to be applied there as well. Am I right?

2. This patch does eliminate the race condition, but appears to introduce a 
huge performance regression. The clocks in `show runtime` will double after the 
change (with the 10-ops limit removed, so the test node runs indefinitely).

Regards,
Hao Tian
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22706): https://lists.fd.io/g/vpp-dev/message/22706
Mute This Topic: https://lists.fd.io/mt/97599770/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-