Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Andrew, Andrew Lunn wrote on Fri, 1 Feb 2019 15:08:31 +0100: > On Fri, Feb 01, 2019 at 12:01:19PM +0100, Miquel Raynal wrote: > > Hi Vivien, > > > > Vivien Didelot wrote on Wed, 30 Jan 2019 > > 19:46:08 -0500: > > > > > Hi Miquèl, > > > > > > On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal > > > wrote: > > > > > > > > > > Today, there is no S2RAM support for switches. First, I proposed > > > > > > > to add > > > > > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to > > > > > > > avoid > > > > > > > crashing the kernel. > > > > > > > > > > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually > > > > > > that > > > > > > probably is the first correct step. We don't have suspend support, > > > > > > so > > > > > > stop the suspend happening, so preventing the kernel crash. > > > > > > Actually can you show me the crash that is happening? > > > > Sure, here it is: http://code.bulix.org/swwb11-569137 > > > > Actually it is a silent crash but the platform never resumes. I am > > pretty sure this is due to the kthread_queue_delayed_work() loop which > > might access registers before it is allowed to do so. > > Hi Miquel > > That sounds like it is an MDIO driver problem, or at least, a resume > ordering problem. You need to ensure that the MDIO bus driver is > resumed before the switch driver. Also, that the switch is suspended > before the MDIO bus driver is suspended. I don't think there is an ordering problem. The MDIO bus is suspended after the switch and resumed before. But if there is no cancellation of the work thread (which is always automatically restarted) in the suspend path, soon or later this work will run at a time when the MDIO bus is still not accessible and will freeze the platform entirely. Thanks, Miquèl
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
On Fri, Feb 01, 2019 at 12:01:19PM +0100, Miquel Raynal wrote: > Hi Vivien, > > Vivien Didelot wrote on Wed, 30 Jan 2019 > 19:46:08 -0500: > > > Hi Miquèl, > > > > On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal > > wrote: > > > > > > > > Today, there is no S2RAM support for switches. First, I proposed to > > > > > > add > > > > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to > > > > > > avoid > > > > > > crashing the kernel. > > > > > > > > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that > > > > > probably is the first correct step. We don't have suspend support, so > > > > > stop the suspend happening, so preventing the kernel crash. > > > > Actually can you show me the crash that is happening? > > Sure, here it is: http://code.bulix.org/swwb11-569137 > > Actually it is a silent crash but the platform never resumes. I am > pretty sure this is due to the kthread_queue_delayed_work() loop which > might access registers before it is allowed to do so. Hi Miquel That sounds like it is an MDIO driver problem, or at least, a resume ordering problem. You need to ensure that the MDIO bus driver is resumed before the switch driver. Also, that the switch is suspended before the MDIO bus driver is suspended. Andrew
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Vivien, Vivien Didelot wrote on Wed, 30 Jan 2019 19:46:08 -0500: > Hi Miquèl, > > On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal > wrote: > > > > > > Today, there is no S2RAM support for switches. First, I proposed to > > > > > add > > > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to > > > > > avoid > > > > > crashing the kernel. > > > > > > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that > > > > probably is the first correct step. We don't have suspend support, so > > > > stop the suspend happening, so preventing the kernel crash. > > Actually can you show me the crash that is happening? Sure, here it is: http://code.bulix.org/swwb11-569137 Actually it is a silent crash but the platform never resumes. I am pretty sure this is due to the kthread_queue_delayed_work() loop which might access registers before it is allowed to do so. In my proposal I just canceled it at suspend and restarted it at resume. Next week I will send a patch to refuse the suspend as you both suggested and if people want to suspend, they will have to remove the switch support. Thanks, Miquèl
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Miquèl, On Wed, 30 Jan 2019 10:46:06 +0100, Miquel Raynal wrote: > > > > Today, there is no S2RAM support for switches. First, I proposed to add > > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid > > > > crashing the kernel. > > > > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that > > > probably is the first correct step. We don't have suspend support, so > > > stop the suspend happening, so preventing the kernel crash. Actually can you show me the crash that is happening?
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
> So your proposal is to refuse suspending when using a mv88e6xxx switch. Hi Miquèl That is the first step. It makes the mv88e6xxx suspend compliant, in that it currently does not support suspend. > What about the current situation where suspending is allowed, but all > the configuration gone? That is broken. The whole point of suspending is that you resume back to the original state. Andrew
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Vivien & Andrew, Vivien Didelot wrote on Tue, 29 Jan 2019 10:46:13 -0500: > Hi Miquèl, > > On Tue, 29 Jan 2019 15:51:57 +0100, Andrew Lunn wrote: > > > > Today, there is no S2RAM support for switches. First, I proposed to add > > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid > > > crashing the kernel. > > > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that > > probably is the first correct step. We don't have suspend support, so > > stop the suspend happening, so preventing the kernel crash. > > I am not confortable with adding support for S2RAM in mv88e6xxx yet because > as it was explained, we are aware of much complicated scenarios out there > to pretend that DSA /partly/ supports suspend-resume. The prefered approach > for the moment is to keep things simple and not supporting this feature yet, > especially at the mv88e6xxx driver level. > > However crashing is unacceptable so I'm backing Andrew's point here, please > submit a fix to prevent the suspend (and crash) for the moment. > > Sorry if you felt that your work is being delayed, it is much appreciated! Thanks for the more detailed explanation, I got your point and I better understand your reluctance. So your proposal is to refuse suspending when using a mv88e6xxx switch. What about the current situation where suspending is allowed, but all the configuration gone? As long as all the ports are disabled during suspend, it should not hurt anything, right? Plus, this is what the bcm_sf2 and qca8k drivers are doing. I can even add an error message in the resume path to warn about this drawback. Thanks, Miquèl
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Miquèl, On Tue, 29 Jan 2019 15:51:57 +0100, Andrew Lunn wrote: > > Today, there is no S2RAM support for switches. First, I proposed to add > > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid > > crashing the kernel. > > Then i would suggest the mv88e6xxx refuses the suspend. Actually that > probably is the first correct step. We don't have suspend support, so > stop the suspend happening, so preventing the kernel crash. I am not confortable with adding support for S2RAM in mv88e6xxx yet because as it was explained, we are aware of much complicated scenarios out there to pretend that DSA /partly/ supports suspend-resume. The prefered approach for the moment is to keep things simple and not supporting this feature yet, especially at the mv88e6xxx driver level. However crashing is unacceptable so I'm backing Andrew's point here, please submit a fix to prevent the suspend (and crash) for the moment. Sorry if you felt that your work is being delayed, it is much appreciated! Thanks, Vivien
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
On Tue, Jan 29, 2019 at 10:01:17AM +0100, Miquel Raynal wrote: > Hi Andrew, > > Andrew Lunn wrote on Mon, 28 Jan 2019 18:42:46 +0100: > > > On Mon, Jan 28, 2019 at 04:57:49PM +0100, Miquel Raynal wrote: > > > Hi Andrew, > > > > > > Thanks for helping! > > > > > > Andrew Lunn wrote on Mon, 28 Jan 2019 15:44:17 +0100: > > > > > > > > I don't see where VLAN and bridge information are cached, can you > > > > > point > > > > > me to the relevant locations? > > > > > > > > Miquèl > > > > > > > > The bridge should have all that information. You need to ask it to > > > > enumerate the current configuration and replay it to the switch. > > > > > > > > There might be something in the Mellanox driver you can copy? But i've > > > > not looked, i'm just guessing. > > > > > > I am still searching but so far I did not find a mechanism reading the > > > configuration of the bridge out of a 'net' object. Indeed there are > > > multiple lists with the configuration but they are all 'mellanox' > > > objects, they do not belong to the core. > > > > Hi Miquèl > > > > Look at how iproute2 works. How does the bridge command enumerate the > > fdb and mdb's? How does bridge vlan show work? bridge link show? See > > if you can use this infrastructure within the kernel. > > Thanks! > > > > > > > We also need to think about how we are going to test this. There is a > > > > lot of state information in a switch. So we are going to need some > > > > pretty good tests to show we have recreated all of it. > > > > > > My understanding of all this is rather short, until know I used what > > > you proposed in the v1 of this series but I am all ears if I need to > > > add anything to my test list. > > > > What you probably need is a generic DSA test suite, with a number of > > hardware devices, with different generations of mv88e6xxx devices, and > > ideally different sf2, kzs, etc switches. Setup a configuration and > > test is works correctly. Suspend, resume, and test is still works. And > > you probably need to go through a number of cycles of suspend/resume. > > And you are going to need to maintain that for a number of years, > > testing every release, to see what breaks as we add new features and > > new devices. > > I am very sorry but I kind of disagree with the above proposal. Usually > contributors try to write the best solution with the help of the > community, test on the hardware they have in hand and propose the > changes. I cannot bond on a 10-years involvement in testing several > switches over the releases. Hi Miquèl I was trying to point out this is a very hard subject to tackle. And to do it right is not going to be a few patches. It needs a lot of work, and a lot of testing, and it needs ongoing work because the mv88e6xxx driver is not complete, there are more features to add, which are going to need suspend/resume support adding. > Today, there is no S2RAM support for switches. First, I proposed to add > suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid > crashing the kernel. Then i would suggest the mv88e6xxx refuses the suspend. Actually that probably is the first correct step. We don't have suspend support, so stop the suspend happening, so preventing the kernel crash. Having to maintain the mv88e6xxx, i don't want a suspend which might work in the simplest configuration, but fails badly for more complex configurations. Before accepting any patches, i want a good feeling it works correctly. I would be willing to accept support and testing on one Marvell family of switches, but again, i want to know it is well tested. And i want to know somebody is going to stay around and look after the support as the switch driver develops new features, which are going to need suspend/resume support. If you are only willing to consider a limited number of features, you need to track if the switch is still within those limited set of features, and refuse the suspend if not. > > There also needs to be some though put into what happens when the > > network changes while the switch is suspended. A port looses its link, > > a port comes up, an SFP module is ejected, and SFP module is > > inserted. The PTP grand master moves, etc. I hope the usual mechanisms > > just work, but it all needs testing. > > Is this really specific to switches? I know it is an issue and I > understand you would prefer not to support S2RAM at all rather than > addressing part of it, but isn't it better to support the simplest > situation first, than supporting nothing at all? Worst case scenario, you induce a loop in your network, and a broadcast storm takes down the whole network. It is unlikely, but it is very disruptive if it does happen. It is also the sort of situation which is probably not going to get tested, making it more likely to actually happen. And this is specific to switches. A single network card cannot do this, you need two ports to form a loop. Andrew
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Andrew, Andrew Lunn wrote on Mon, 28 Jan 2019 18:42:46 +0100: > On Mon, Jan 28, 2019 at 04:57:49PM +0100, Miquel Raynal wrote: > > Hi Andrew, > > > > Thanks for helping! > > > > Andrew Lunn wrote on Mon, 28 Jan 2019 15:44:17 +0100: > > > > > > I don't see where VLAN and bridge information are cached, can you point > > > > me to the relevant locations? > > > > > > Miquèl > > > > > > The bridge should have all that information. You need to ask it to > > > enumerate the current configuration and replay it to the switch. > > > > > > There might be something in the Mellanox driver you can copy? But i've > > > not looked, i'm just guessing. > > > > I am still searching but so far I did not find a mechanism reading the > > configuration of the bridge out of a 'net' object. Indeed there are > > multiple lists with the configuration but they are all 'mellanox' > > objects, they do not belong to the core. > > Hi Miquèl > > Look at how iproute2 works. How does the bridge command enumerate the > fdb and mdb's? How does bridge vlan show work? bridge link show? See > if you can use this infrastructure within the kernel. Thanks! > > > > We also need to think about how we are going to test this. There is a > > > lot of state information in a switch. So we are going to need some > > > pretty good tests to show we have recreated all of it. > > > > My understanding of all this is rather short, until know I used what > > you proposed in the v1 of this series but I am all ears if I need to > > add anything to my test list. > > What you probably need is a generic DSA test suite, with a number of > hardware devices, with different generations of mv88e6xxx devices, and > ideally different sf2, kzs, etc switches. Setup a configuration and > test is works correctly. Suspend, resume, and test is still works. And > you probably need to go through a number of cycles of suspend/resume. > And you are going to need to maintain that for a number of years, > testing every release, to see what breaks as we add new features and > new devices. I am very sorry but I kind of disagree with the above proposal. Usually contributors try to write the best solution with the help of the community, test on the hardware they have in hand and propose the changes. I cannot bond on a 10-years involvement in testing several switches over the releases. Today, there is no S2RAM support for switches. First, I proposed to add suspend/resume callbacks to the mv88e6xxx driver - just enough to avoid crashing the kernel. It was reported that the configuration was lost so I wrote a rule-saving mechanism to replay the rules at resume. I was told that this mechanism would best fit in the DSA core directly. I am open to do that, I don't think it is that much work. But it is also required that I use as less memory as possible. This is going to take more time but I think I can do it as well. At least for a minimal set of configuration. Then, why not let other people improve things as they need? IIUC Switch S2RAM does not work at all, I may try to improve the situation but I do not have the abilities nor the time to do it exhaustively for every piece of hardware and every situation. > > There also needs to be some though put into what happens when the > network changes while the switch is suspended. A port looses its link, > a port comes up, an SFP module is ejected, and SFP module is > inserted. The PTP grand master moves, etc. I hope the usual mechanisms > just work, but it all needs testing. Is this really specific to switches? I know it is an issue and I understand you would prefer not to support S2RAM at all rather than addressing part of it, but isn't it better to support the simplest situation first, than supporting nothing at all? Thanks Andrew for your guidance and help anyway, Miquèl
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
On Mon, Jan 28, 2019 at 04:57:49PM +0100, Miquel Raynal wrote: > Hi Andrew, > > Thanks for helping! > > Andrew Lunn wrote on Mon, 28 Jan 2019 15:44:17 +0100: > > > > I don't see where VLAN and bridge information are cached, can you point > > > me to the relevant locations? > > > > Miquèl > > > > The bridge should have all that information. You need to ask it to > > enumerate the current configuration and replay it to the switch. > > > > There might be something in the Mellanox driver you can copy? But i've > > not looked, i'm just guessing. > > I am still searching but so far I did not find a mechanism reading the > configuration of the bridge out of a 'net' object. Indeed there are > multiple lists with the configuration but they are all 'mellanox' > objects, they do not belong to the core. Hi Miquèl Look at how iproute2 works. How does the bridge command enumerate the fdb and mdb's? How does bridge vlan show work? bridge link show? See if you can use this infrastructure within the kernel. > > We also need to think about how we are going to test this. There is a > > lot of state information in a switch. So we are going to need some > > pretty good tests to show we have recreated all of it. > > My understanding of all this is rather short, until know I used what > you proposed in the v1 of this series but I am all ears if I need to > add anything to my test list. What you probably need is a generic DSA test suite, with a number of hardware devices, with different generations of mv88e6xxx devices, and ideally different sf2, kzs, etc switches. Setup a configuration and test is works correctly. Suspend, resume, and test is still works. And you probably need to go through a number of cycles of suspend/resume. And you are going to need to maintain that for a number of years, testing every release, to see what breaks as we add new features and new devices. There also needs to be some though put into what happens when the network changes while the switch is suspended. A port looses its link, a port comes up, an SFP module is ejected, and SFP module is inserted. The PTP grand master moves, etc. I hope the usual mechanisms just work, but it all needs testing. S2RAM is hard for a device like this. It is not something i personally would want to do :-( Andrew
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Andrew, Thanks for helping! Andrew Lunn wrote on Mon, 28 Jan 2019 15:44:17 +0100: > > I don't see where VLAN and bridge information are cached, can you point > > me to the relevant locations? > > Miquèl > > The bridge should have all that information. You need to ask it to > enumerate the current configuration and replay it to the switch. > > There might be something in the Mellanox driver you can copy? But i've > not looked, i'm just guessing. I am still searching but so far I did not find a mechanism reading the configuration of the bridge out of a 'net' object. Indeed there are multiple lists with the configuration but they are all 'mellanox' objects, they do not belong to the core. Maybe I don't find this configuration because I don't know what it is. I imagine this configuration being one (or multiple) list(s), stored somewhere in a net_device being a bridge. Am I on the wrong path? Otherwise I might just save my own structures in net/dsa/switch.c like I did for the mv88e6xx driver, and once this works, net-folks might want to optimize the memory consumption and re-use the bridge configuration directly? > We also need to think about how we are going to test this. There is a > lot of state information in a switch. So we are going to need some > pretty good tests to show we have recreated all of it. My understanding of all this is rather short, until know I used what you proposed in the v1 of this series but I am all ears if I need to add anything to my test list. Thanks, Miquèl
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
> I don't see where VLAN and bridge information are cached, can you point > me to the relevant locations? Miquèl The bridge should have all that information. You need to ask it to enumerate the current configuration and replay it to the switch. There might be something in the Mellanox driver you can copy? But i've not looked, i'm just guessing. We also need to think about how we are going to test this. There is a lot of state information in a switch. So we are going to need some pretty good tests to show we have recreated all of it. Andrew
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Florian, Florian Fainelli wrote on Fri, 25 Jan 2019 10:37:38 -0800: > Hi Miquel, > > On 1/25/19 1:55 AM, Miquel Raynal wrote: > > The user might apply a specific switch configuration, with specific > > forwarding rules, VLAN, bridges, etc. > > > > During suspend to RAM the switch power will be turned off and the > > switch will lost its configuration. In an attempt to bring S2RAM > > support to the mv88e6xxx DSA, let's first save these rules in a > > per-chip list thanks to the mv88e6xxx_add/del_xxx_rule() > > helpers. These helpers are then called from various callbacks: > > * mv88e6xxx_port_fdb_add/del() > > * mv88e6xxx_port_mdb_add/del() > > * mv88e6xxx_port_vlan_add/del() > > * mv88e6xxx_port_bridge_join/leave() > > * mv88e6xxx_crosschip_bridge_join/leave() > > > > To avoid recursion problems when replaying the rules, the content of > > the above *_add()/*_join() callbacks has been moved in separate > > helpers with a '_' prefix. Hence, each callback just calls the > > corresponding helper and the corresponding *_add_xxx_rule(). > > None of this should be done in the driver IMHO, because this is > presumably applicable to all switch devices that lose their state during > suspend/resume, so at best, this should be moved to the core DSA layer, > but doing this means that we should also have a well established > contract between the DSA layer and individual switch drivers as far as > quiescing/saving/restoring state goes. > > By moving things to the core we can also more tightly control what data > structures get used to represent e.g.: VLANs, FDBs, MDBs etc and > possibly push/utilize caching into the original subsystem. For instance > VLAN/bridge already do maintain caches of VLANs, so if we could somehow > expose those, we would not bloat the kernel's memory footprint by having > an additional layer to maintain with identical information. So you suggest to move the intelligence of FDBs/MDBs in net/dsa/port.c, is this right? I don't see where VLAN and bridge information are cached, can you point me to the relevant locations? What about cross-chip bridges? There is nothing about them in net/dsa/port.c. The implementation I see in the mv88e6xxx driver only touches the PVT but I don't get whether we should handle this calls like regular bridge-join/leave events or not (maybe they are cached with regular bridge events?). Thanks, Miquèl
Re: [PATCH net-next v2 1/2] net: dsa: mv88e6xxx: Save switch rules
Hi Miquel, On 1/25/19 1:55 AM, Miquel Raynal wrote: > The user might apply a specific switch configuration, with specific > forwarding rules, VLAN, bridges, etc. > > During suspend to RAM the switch power will be turned off and the > switch will lost its configuration. In an attempt to bring S2RAM > support to the mv88e6xxx DSA, let's first save these rules in a > per-chip list thanks to the mv88e6xxx_add/del_xxx_rule() > helpers. These helpers are then called from various callbacks: > * mv88e6xxx_port_fdb_add/del() > * mv88e6xxx_port_mdb_add/del() > * mv88e6xxx_port_vlan_add/del() > * mv88e6xxx_port_bridge_join/leave() > * mv88e6xxx_crosschip_bridge_join/leave() > > To avoid recursion problems when replaying the rules, the content of > the above *_add()/*_join() callbacks has been moved in separate > helpers with a '_' prefix. Hence, each callback just calls the > corresponding helper and the corresponding *_add_xxx_rule(). None of this should be done in the driver IMHO, because this is presumably applicable to all switch devices that lose their state during suspend/resume, so at best, this should be moved to the core DSA layer, but doing this means that we should also have a well established contract between the DSA layer and individual switch drivers as far as quiescing/saving/restoring state goes. By moving things to the core we can also more tightly control what data structures get used to represent e.g.: VLANs, FDBs, MDBs etc and possibly push/utilize caching into the original subsystem. For instance VLAN/bridge already do maintain caches of VLANs, so if we could somehow expose those, we would not bloat the kernel's memory footprint by having an additional layer to maintain with identical information. Just my 2 cents. -- Florian