Hi, Jiri On 2017/9/25 14:57, Jiri Pirko wrote: > Mon, Sep 25, 2017 at 02:45:08AM CEST, linyunsh...@huawei.com wrote: >> Hi, Jiri >> >> On 2017/9/24 19:37, Jiri Pirko wrote: >>> Sat, Sep 23, 2017 at 02:47:20AM CEST, linyunsh...@huawei.com wrote: >>>> Hi, Jiri >>>> >>>> On 2017/9/23 0:03, Jiri Pirko wrote: >>>>> Fri, Sep 22, 2017 at 04:11:51PM CEST, linyunsh...@huawei.com wrote: >>>>>> Hi, Jiri >>>>>> >>>>>>>> - if (!tc) { >>>>>>>> + if (if_running) { >>>>>>>> + (void)hns3_nic_net_stop(netdev); >>>>>>>> + msleep(100); >>>>>>>> + } >>>>>>>> + >>>>>>>> + ret = (kinfo->dcb_ops && kinfo->dcb_ops->>setup_tc) ? >>>>>>>> + kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : ->EOPNOTSUPP; >>>>>> >>>>>>> This is most odd. Why do you call dcb_ops from >ndo_setup_tc callback? >>>>>>> Why are you mixing this together? prio->tc mapping >can be done >>>>>>> directly in dcbnl >>>>>> >>>>>> Here is what we do in dcb_ops->setup_tc: >>>>>> Firstly, if current tc num is different from the tc num >>>>>> that user provide, then we setup the queues for each >>>>>> tc. >>>>>> >>>>>> Secondly, we tell hardware the pri to tc mapping that >>>>>> the stack is using. In rx direction, our hardware need >>>>>> that mapping to put different packet into different tc' >>>>>> queues according to the priority of the packet, then >>>>>> rss decides which specific queue in the tc should the >>>>>> packet goto. >>>>>> >>>>>> By mixing, I suppose you meant why we need the >>>>>> pri to tc infomation? >>>>> >>>>> by mixing, I mean what I wrote. You are calling dcb_ops callback from >>>>> ndo_setup_tc callback. So you are mixing DCBNL subsystem and TC >>>>> subsystem. Why? Why do you need sch_mqprio? Why DCBNL is not enough for >>>>> all? >>>> >>>> When using lldptool, dcbnl is involved. >>>> >>>> But when using tc qdisc, dcbbl is not involved, below is the a few key >>>> call graph in the kernel when tc qdisc cmd is executed. >>>> >>>> cmd: >>>> tc qdisc add dev eth0 root handle 1:0 mqprio num_tc 4 map 1 2 3 3 1 3 1 1 >>>> hw 1 >>>> >>>> call graph: >>>> rtnetlink_rcv_msg -> tc_modify_qdisc -> qdisc_create -> mqprio_init -> >>>> hns3_nic_setup_tc >>>> >>>> When hns3_nic_setup_tc is called, we need to know how many tc num and >>>> prio_tc mapping from the tc_mqprio_qopt which is provided in the paramter >>>> in the ndo_setup_tc function, and dcb_ops is the our hardware specific >>>> method to setup the tc related parameter to the hardware, so this is why >>>> we call dcb_ops callback in ndo_setup_tc callback. >>>> >>>> I hope this will answer your question, thanks for your time. >>> >>> Okay. I understand that you have a usecase for mqprio mapping offload >>> without lldptool being involved. Ok. I believe it is wrong to call dcb_ops >>> from tc callback. You should have a generic layer inside the driver and >>> call it from both dcb_ops and tc callbacks. >> >> Actually, dcb_ops is our generic layer inside the driver. >> Below is high level architecture: >> >> [ tc qdisc ] [ lldpad ] >> | | >> | | >> | | >> [ hns3_enet ] [ hns3_dcbnl ] >> \ / >> \ / >> \ / >> [ hclge_dcb ] >> / \ >> / \ >> / \ >> [ hclgc_main ] [ hclge_tm ] >> >> hns3_enet.c implements the ndo_setup_tc callback. >> hns3_dcbnl.c implements the dcbnl_rtnl_ops for stack's DCBNL system. >> hclge_dcb implements the dcb_ops. >> So we already have a generic layer that tc and dcbnl all call from. >> >>> >>> Also, what happens If I run lldptool concurrently with mqprio? Who wins >>> and is going to configure the mapping? >> >> Both lldptool and tc qdisc cmd use rtnl interface provided by stack, so >> they are both protected by rtnl_lock, so we do not have to do the locking >> in the driver. > > I was not asking about locking, which is obvious, I was asking about the > behaviour. Like for example: > If I use tc to configure some mapping, later on I run lldptool and change > the mapping. Does the tc dump show the updated values or the original > ones?
If it is that case, tc dump show the updated values. Normally, we use tc qdisc to configure the netdev to use mqprio, and then use the lldptool the tc_prio mapping, tc schedule mode, tc bandwidth and pfc option. if lldptool change the tc num and tc_prio mapping, it will tell the tc system by the following function which is called in client_ops->setup_tc: netdev_set_tc_queue netdev_set_prio_tc_map So lldptool and tc qdisc can work together. > >> >> The locking is in rtnetlink_rcv_msg: >> >> rtnl_lock(); >> handlers = rtnl_dereference(rtnl_msg_handlers[family]); >> if (handlers) { >> doit = READ_ONCE(handlers[type].doit); >> if (doit) >> err = doit(skb, nlh, extack); >> } >> rtnl_unlock(); >> >> Thanks. >> >>> >>> >>>> >>>>> >>>>> >>>>> >>>>>> I hope I did not misunderstand your question, thanks >>>>>> for your time reviewing. >>>>> >>>>> . >>>>> >>>> >>> >>> . >>> >> > > . >