Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Tue, Sep 22, 2015 at 03:36:28AM CEST, vivien.dide...@savoirfairelinux.com wrote: >Hi Jiri, > >On Sep. Monday 21 (39) 08:25 PM, Jiri Pirko wrote: >> Mon, Sep 21, 2015 at 07:13:58PM CEST, vivien.dide...@savoirfairelinux.com >> wrote: >> >Hi Jiri, Scott, >> > >> >On Sep. Monday 21 (39) 10:09 AM, Jiri Pirko wrote: >> >> Mon, Sep 21, 2015 at 09:23:24AM CEST, sfel...@gmail.com wrote: >> >> >On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: >> >> >> Jiri Pirko (6): >> >> >> switchdev: rename "trans" to "trans_ph". >> >> >> switchdev: introduce transaction infrastructure for attr_set and >> >> >> obj_add >> >> >> rocker: switch to local transaction phase enum >> >> >> switchdev: move transaction phase enum under transaction structure >> >> >> rocker: use switchdev transaction queue for allocated memory >> >> >> switchdev: split commit and prepare phase into two callbacks >> >> > >> >> >Patches compile, but first test bombs. Cut-and-paste of dump at end >> >> >of this email. >> >> >> >> Told you :) >> >> >> >> >> >> > >> >> >I'm not sure I'm liking this patchset because it looks like a way for >> >> >switchdev drivers to easily opt-out of the prepare-commit transaction >> >> >model by simply not implementing the *_pre op. I would rather drivers >> >> >explicitly handle the PREPARE phase in code, even if that means >> >> >skipping it gracefully (in code) with a comment (in code) explaining >> >> >why it does not matter for this device/operation. That's what DSA had >> >> >done, mostly because it was a retro-fit. >> >> >> >> Each driver should handle this inside it. If it does not need prepare >> >> state, it simply does not implement it. That is the same for all cb, >> >> ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as >> >> separate callbacks. Implementing multiple callback in one is just ugly, >> >> sorry. >> > >> >This is true, (in DSA) we don't have to implement the prepare phase if >> >we fully support the feature in hardware. >> > >> >To give a real example, Marvell switch drivers currently implement all >> >add/del/dump calls for VLAN FDB (where VID 0 means the port itself). No >> >prepare phase needed. >> > >> >Now, I have local patches to enable strict 802.1Q mode in these switches >> >(all the logic is based on the hardware VLAN table). But it does not use >> >per-port FDB, so fdb_add with VID 0 doesn't make sense anymore. That's >> >why we need to push the feature checking down to the drivers in DSA. >> > >> >I have another pending patch to add .port_fdb_pre_add, where mv88e6xxx >> >code will return -EOPNOTSUPP if the given VID is 0. >> > >> >Another example: mv88e6xxx support tagged VLANs, so no hardware check >> >needed. But the Broadcom Starfighter 2 only supports port-based VLANs >> >(which is today wrongly implemented through "bridge_join/leave"). By >> >implementing .port_vlan_pre_add (another pending patch for DSA), the >> >driver will be able to return -EOPNOTSUPP if !BRIDGE_VLAN_INFO_PVID. >> > >> >Also, having logic in switchdev drivers to check SWITCHDEV_TRANS_NONE >> >and SWITCHDEV_TRANS_ABORT is not really nice. Having switchdev handle >> >the abort phase (calling each destructor) and getting rid of the >> >SWITCHDEV_TRANS_* flags sounds better to me. >> >> Agree, if pre/commit is going to be in one function, we should have >> only prepare/commit enums. It can be carried around as a single bool >> value in switchdev_trans structure. Will include that in my transaction >> patchset. >> >> >> > >> >> >Also, the patchset removes the ABORT callback in case of a rollback >> >> >due to a failed PREPARE. We can't make the assumption that it's just >> >> >a memory list to destroy on ABORT. The driver, on PREPARE, may have >> >> >reserved device space or staged an operation on the device which we'll >> >> >need to undo on ABORT. >> >> >> >> Yep, just register an item with custom destructor, there you can do >> >> whatever. Also, I believe much nicer comparing to current code. >> >> >> >> >> >> > >> >> >So we need ABORT back, and we need PREPARE to not be optional, so >> >> >what's left list enqueue/dequeue helpers, which I'm not seeing much >> >> >value in up-leveling as the driver can do list_add/del itself. >> >> >> >> Why would every driver do it itself, over and over when there can be a >> >> clean infrastructure to do that. Including abort phase. Without the driver >> >> needed to be involved. >> > >> >Maybe the term ".destructor" has a too strong meaning to deallocation, >> >but you can indeed do whatever you need in this function. >> >> It is a destructor. Don't know about a better name, suggestions? > >Nope, I'm personally fine with this term. > >> > >> >> >Am I missing something? I didn't see a motivation statement for the >> >> >RFC so I'm not sure where you wanted to take this. >> >> >> >> I want to make current code much nicer, easier to read and implement in >> >> other drivers. Look at rocker.c and how often there is == PREP
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Hi Jiri, On Sep. Monday 21 (39) 08:25 PM, Jiri Pirko wrote: > Mon, Sep 21, 2015 at 07:13:58PM CEST, vivien.dide...@savoirfairelinux.com > wrote: > >Hi Jiri, Scott, > > > >On Sep. Monday 21 (39) 10:09 AM, Jiri Pirko wrote: > >> Mon, Sep 21, 2015 at 09:23:24AM CEST, sfel...@gmail.com wrote: > >> >On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: > >> >> Jiri Pirko (6): > >> >> switchdev: rename "trans" to "trans_ph". > >> >> switchdev: introduce transaction infrastructure for attr_set and > >> >> obj_add > >> >> rocker: switch to local transaction phase enum > >> >> switchdev: move transaction phase enum under transaction structure > >> >> rocker: use switchdev transaction queue for allocated memory > >> >> switchdev: split commit and prepare phase into two callbacks > >> > > >> >Patches compile, but first test bombs. Cut-and-paste of dump at end > >> >of this email. > >> > >> Told you :) > >> > >> > >> > > >> >I'm not sure I'm liking this patchset because it looks like a way for > >> >switchdev drivers to easily opt-out of the prepare-commit transaction > >> >model by simply not implementing the *_pre op. I would rather drivers > >> >explicitly handle the PREPARE phase in code, even if that means > >> >skipping it gracefully (in code) with a comment (in code) explaining > >> >why it does not matter for this device/operation. That's what DSA had > >> >done, mostly because it was a retro-fit. > >> > >> Each driver should handle this inside it. If it does not need prepare > >> state, it simply does not implement it. That is the same for all cb, > >> ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as > >> separate callbacks. Implementing multiple callback in one is just ugly, > >> sorry. > > > >This is true, (in DSA) we don't have to implement the prepare phase if > >we fully support the feature in hardware. > > > >To give a real example, Marvell switch drivers currently implement all > >add/del/dump calls for VLAN FDB (where VID 0 means the port itself). No > >prepare phase needed. > > > >Now, I have local patches to enable strict 802.1Q mode in these switches > >(all the logic is based on the hardware VLAN table). But it does not use > >per-port FDB, so fdb_add with VID 0 doesn't make sense anymore. That's > >why we need to push the feature checking down to the drivers in DSA. > > > >I have another pending patch to add .port_fdb_pre_add, where mv88e6xxx > >code will return -EOPNOTSUPP if the given VID is 0. > > > >Another example: mv88e6xxx support tagged VLANs, so no hardware check > >needed. But the Broadcom Starfighter 2 only supports port-based VLANs > >(which is today wrongly implemented through "bridge_join/leave"). By > >implementing .port_vlan_pre_add (another pending patch for DSA), the > >driver will be able to return -EOPNOTSUPP if !BRIDGE_VLAN_INFO_PVID. > > > >Also, having logic in switchdev drivers to check SWITCHDEV_TRANS_NONE > >and SWITCHDEV_TRANS_ABORT is not really nice. Having switchdev handle > >the abort phase (calling each destructor) and getting rid of the > >SWITCHDEV_TRANS_* flags sounds better to me. > > Agree, if pre/commit is going to be in one function, we should have > only prepare/commit enums. It can be carried around as a single bool > value in switchdev_trans structure. Will include that in my transaction > patchset. > > > > > >> >Also, the patchset removes the ABORT callback in case of a rollback > >> >due to a failed PREPARE. We can't make the assumption that it's just > >> >a memory list to destroy on ABORT. The driver, on PREPARE, may have > >> >reserved device space or staged an operation on the device which we'll > >> >need to undo on ABORT. > >> > >> Yep, just register an item with custom destructor, there you can do > >> whatever. Also, I believe much nicer comparing to current code. > >> > >> > >> > > >> >So we need ABORT back, and we need PREPARE to not be optional, so > >> >what's left list enqueue/dequeue helpers, which I'm not seeing much > >> >value in up-leveling as the driver can do list_add/del itself. > >> > >> Why would every driver do it itself, over and over when there can be a > >> clean infrastructure to do that. Including abort phase. Without the driver > >> needed to be involved. > > > >Maybe the term ".destructor" has a too strong meaning to deallocation, > >but you can indeed do whatever you need in this function. > > It is a destructor. Don't know about a better name, suggestions? Nope, I'm personally fine with this term. > > > >> >Am I missing something? I didn't see a motivation statement for the > >> >RFC so I'm not sure where you wanted to take this. > >> > >> I want to make current code much nicer, easier to read and implement in > >> other drivers. Look at rocker.c and how often there is == PREPARE there. > >> It's nearly impossible to followthe code, sorry. > >> > >> My next patchset is to un-mess rocker.c (that freaking ofdpa stuff is > >> everywhere) > > > >I'm basically fo
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Mon, Sep 21, 2015 at 06:34:33PM CEST, sfel...@gmail.com wrote: >On Mon, Sep 21, 2015 at 1:09 AM, Jiri Pirko wrote: >> Mon, Sep 21, 2015 at 09:23:24AM CEST, sfel...@gmail.com wrote: >>>On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: Jiri Pirko (6): switchdev: rename "trans" to "trans_ph". switchdev: introduce transaction infrastructure for attr_set and obj_add rocker: switch to local transaction phase enum switchdev: move transaction phase enum under transaction structure rocker: use switchdev transaction queue for allocated memory switchdev: split commit and prepare phase into two callbacks >>> >>>Patches compile, but first test bombs. Cut-and-paste of dump at end >>>of this email. >> >> Told you :) >> >> >>> >>>I'm not sure I'm liking this patchset because it looks like a way for >>>switchdev drivers to easily opt-out of the prepare-commit transaction >>>model by simply not implementing the *_pre op. I would rather drivers >>>explicitly handle the PREPARE phase in code, even if that means >>>skipping it gracefully (in code) with a comment (in code) explaining >>>why it does not matter for this device/operation. That's what DSA had >>>done, mostly because it was a retro-fit. >> >> Each driver should handle this inside it. If it does not need prepare >> state, it simply does not implement it. That is the same for all cb, >> ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as >> separate callbacks. Implementing multiple callback in one is just ugly, >> sorry. >> >> >>> >>>Also, the patchset removes the ABORT callback in case of a rollback >>>due to a failed PREPARE. We can't make the assumption that it's just >>>a memory list to destroy on ABORT. The driver, on PREPARE, may have >>>reserved device space or staged an operation on the device which we'll >>>need to undo on ABORT. >> >> Yep, just register an item with custom destructor, there you can do >> whatever. Also, I believe much nicer comparing to current code. >> >> >>> >>>So we need ABORT back, and we need PREPARE to not be optional, so >>>what's left list enqueue/dequeue helpers, which I'm not seeing much >>>value in up-leveling as the driver can do list_add/del itself. >> >> Why would every driver do it itself, over and over when there can be a >> clean infrastructure to do that. Including abort phase. Without the driver >> needed to be involved. >> >> >>> >>>Am I missing something? I didn't see a motivation statement for the >>>RFC so I'm not sure where you wanted to take this. >> >> I want to make current code much nicer, easier to read and implement in >> other drivers. Look at rocker.c and how often there is == PREPARE there. >> It's nearly impossible to followthe code, sorry. > >You're going to end up duplicate a lot of code and make it easier to >make mistakes. The reason why the == PREPARE checks are there is the >same code is run for both PREPARE and COMMIT, with == PREPARE checks >right at the last moment to avoid a irreversible change, such as >deleting an entry from a table or changing a state variable or bumping >a hardware desc pointer. Otherwise the code paths should be the same >for PREPARE and COMMIT. If you split these, you're going to duplicate >lots of code and one path will deviate from the other over time and >now we've broken the prepare-commit model. We want PREPARE to answer >the question "is it OK to do this?" and COMMIT to do without failure. Okay, I'll leave the last patch out for now. Let's see what the future brings :) I really don't like "NONE" phase, does not make sense for switchdev code and it is just a hack for rocker. So I will kick it out from switchdev anyway. "ABORT" phase also does not allign to what you say about one-code-two-passes and I think that my patchset is actually handling abort nicely. That leaves only "PREPARE" and "COMMIT" Makes sense? > >Do you have a specific example where you can't follow the code with == PREPARE? > >> My next patchset is to un-mess rocker.c (that freaking ofdpa stuff is >> everywhere) > >Rocker implements OF-DPA so it's seems natural that freaking OF-DPA >stuff would be everywhere. Well, no. It is only one of many possible worlds. In qemu, the split is done nicely. However in kernel, the ofdpa stuff is squashed to the generic rocker stuff. I have a patchset for splitting that under construction for some time. This week I have some time allocated for that so I hopefully will send the patchset this Thursday (fingers crossed). Thanks for review! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Mon, Sep 21, 2015 at 07:13:58PM CEST, vivien.dide...@savoirfairelinux.com wrote: >Hi Jiri, Scott, > >On Sep. Monday 21 (39) 10:09 AM, Jiri Pirko wrote: >> Mon, Sep 21, 2015 at 09:23:24AM CEST, sfel...@gmail.com wrote: >> >On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: >> >> Jiri Pirko (6): >> >> switchdev: rename "trans" to "trans_ph". >> >> switchdev: introduce transaction infrastructure for attr_set and >> >> obj_add >> >> rocker: switch to local transaction phase enum >> >> switchdev: move transaction phase enum under transaction structure >> >> rocker: use switchdev transaction queue for allocated memory >> >> switchdev: split commit and prepare phase into two callbacks >> > >> >Patches compile, but first test bombs. Cut-and-paste of dump at end >> >of this email. >> >> Told you :) >> >> >> > >> >I'm not sure I'm liking this patchset because it looks like a way for >> >switchdev drivers to easily opt-out of the prepare-commit transaction >> >model by simply not implementing the *_pre op. I would rather drivers >> >explicitly handle the PREPARE phase in code, even if that means >> >skipping it gracefully (in code) with a comment (in code) explaining >> >why it does not matter for this device/operation. That's what DSA had >> >done, mostly because it was a retro-fit. >> >> Each driver should handle this inside it. If it does not need prepare >> state, it simply does not implement it. That is the same for all cb, >> ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as >> separate callbacks. Implementing multiple callback in one is just ugly, >> sorry. > >This is true, (in DSA) we don't have to implement the prepare phase if >we fully support the feature in hardware. > >To give a real example, Marvell switch drivers currently implement all >add/del/dump calls for VLAN FDB (where VID 0 means the port itself). No >prepare phase needed. > >Now, I have local patches to enable strict 802.1Q mode in these switches >(all the logic is based on the hardware VLAN table). But it does not use >per-port FDB, so fdb_add with VID 0 doesn't make sense anymore. That's >why we need to push the feature checking down to the drivers in DSA. > >I have another pending patch to add .port_fdb_pre_add, where mv88e6xxx >code will return -EOPNOTSUPP if the given VID is 0. > >Another example: mv88e6xxx support tagged VLANs, so no hardware check >needed. But the Broadcom Starfighter 2 only supports port-based VLANs >(which is today wrongly implemented through "bridge_join/leave"). By >implementing .port_vlan_pre_add (another pending patch for DSA), the >driver will be able to return -EOPNOTSUPP if !BRIDGE_VLAN_INFO_PVID. > >Also, having logic in switchdev drivers to check SWITCHDEV_TRANS_NONE >and SWITCHDEV_TRANS_ABORT is not really nice. Having switchdev handle >the abort phase (calling each destructor) and getting rid of the >SWITCHDEV_TRANS_* flags sounds better to me. Agree, if pre/commit is going to be in one function, we should have only prepare/commit enums. It can be carried around as a single bool value in switchdev_trans structure. Will include that in my transaction patchset. > >> >Also, the patchset removes the ABORT callback in case of a rollback >> >due to a failed PREPARE. We can't make the assumption that it's just >> >a memory list to destroy on ABORT. The driver, on PREPARE, may have >> >reserved device space or staged an operation on the device which we'll >> >need to undo on ABORT. >> >> Yep, just register an item with custom destructor, there you can do >> whatever. Also, I believe much nicer comparing to current code. >> >> >> > >> >So we need ABORT back, and we need PREPARE to not be optional, so >> >what's left list enqueue/dequeue helpers, which I'm not seeing much >> >value in up-leveling as the driver can do list_add/del itself. >> >> Why would every driver do it itself, over and over when there can be a >> clean infrastructure to do that. Including abort phase. Without the driver >> needed to be involved. > >Maybe the term ".destructor" has a too strong meaning to deallocation, >but you can indeed do whatever you need in this function. It is a destructor. Don't know about a better name, suggestions? > >> >Am I missing something? I didn't see a motivation statement for the >> >RFC so I'm not sure where you wanted to take this. >> >> I want to make current code much nicer, easier to read and implement in >> other drivers. Look at rocker.c and how often there is == PREPARE there. >> It's nearly impossible to followthe code, sorry. >> >> My next patchset is to un-mess rocker.c (that freaking ofdpa stuff is >> everywhere) > >I'm basically for this patchset, but I do have some concerns about the >general switchdev transaction design. > >I certainly don't have the perspective that you guys have, so please >tell me if I'm totally wrong. From my point of view (with DSA drivers), >the emphase should be done on asking the hardware if it can handle or >not
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Hi Jiri, Scott, On Sep. Monday 21 (39) 10:09 AM, Jiri Pirko wrote: > Mon, Sep 21, 2015 at 09:23:24AM CEST, sfel...@gmail.com wrote: > >On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: > >> Jiri Pirko (6): > >> switchdev: rename "trans" to "trans_ph". > >> switchdev: introduce transaction infrastructure for attr_set and > >> obj_add > >> rocker: switch to local transaction phase enum > >> switchdev: move transaction phase enum under transaction structure > >> rocker: use switchdev transaction queue for allocated memory > >> switchdev: split commit and prepare phase into two callbacks > > > >Patches compile, but first test bombs. Cut-and-paste of dump at end > >of this email. > > Told you :) > > > > > >I'm not sure I'm liking this patchset because it looks like a way for > >switchdev drivers to easily opt-out of the prepare-commit transaction > >model by simply not implementing the *_pre op. I would rather drivers > >explicitly handle the PREPARE phase in code, even if that means > >skipping it gracefully (in code) with a comment (in code) explaining > >why it does not matter for this device/operation. That's what DSA had > >done, mostly because it was a retro-fit. > > Each driver should handle this inside it. If it does not need prepare > state, it simply does not implement it. That is the same for all cb, > ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as > separate callbacks. Implementing multiple callback in one is just ugly, > sorry. This is true, (in DSA) we don't have to implement the prepare phase if we fully support the feature in hardware. To give a real example, Marvell switch drivers currently implement all add/del/dump calls for VLAN FDB (where VID 0 means the port itself). No prepare phase needed. Now, I have local patches to enable strict 802.1Q mode in these switches (all the logic is based on the hardware VLAN table). But it does not use per-port FDB, so fdb_add with VID 0 doesn't make sense anymore. That's why we need to push the feature checking down to the drivers in DSA. I have another pending patch to add .port_fdb_pre_add, where mv88e6xxx code will return -EOPNOTSUPP if the given VID is 0. Another example: mv88e6xxx support tagged VLANs, so no hardware check needed. But the Broadcom Starfighter 2 only supports port-based VLANs (which is today wrongly implemented through "bridge_join/leave"). By implementing .port_vlan_pre_add (another pending patch for DSA), the driver will be able to return -EOPNOTSUPP if !BRIDGE_VLAN_INFO_PVID. Also, having logic in switchdev drivers to check SWITCHDEV_TRANS_NONE and SWITCHDEV_TRANS_ABORT is not really nice. Having switchdev handle the abort phase (calling each destructor) and getting rid of the SWITCHDEV_TRANS_* flags sounds better to me. > >Also, the patchset removes the ABORT callback in case of a rollback > >due to a failed PREPARE. We can't make the assumption that it's just > >a memory list to destroy on ABORT. The driver, on PREPARE, may have > >reserved device space or staged an operation on the device which we'll > >need to undo on ABORT. > > Yep, just register an item with custom destructor, there you can do > whatever. Also, I believe much nicer comparing to current code. > > > > > >So we need ABORT back, and we need PREPARE to not be optional, so > >what's left list enqueue/dequeue helpers, which I'm not seeing much > >value in up-leveling as the driver can do list_add/del itself. > > Why would every driver do it itself, over and over when there can be a > clean infrastructure to do that. Including abort phase. Without the driver > needed to be involved. Maybe the term ".destructor" has a too strong meaning to deallocation, but you can indeed do whatever you need in this function. > >Am I missing something? I didn't see a motivation statement for the > >RFC so I'm not sure where you wanted to take this. > > I want to make current code much nicer, easier to read and implement in > other drivers. Look at rocker.c and how often there is == PREPARE there. > It's nearly impossible to followthe code, sorry. > > My next patchset is to un-mess rocker.c (that freaking ofdpa stuff is > everywhere) I'm basically for this patchset, but I do have some concerns about the general switchdev transaction design. I certainly don't have the perspective that you guys have, so please tell me if I'm totally wrong. From my point of view (with DSA drivers), the emphase should be done on asking the hardware if it can handle or not a given transaction (not simply an hardware feature!). The driver can handle any allocation and preparation itself, and also errors can actually occur *during* a commit phase. Being able to return -EOPNOTSUPP from any add/del/dump function would be just perfect. I ideally imagine the following implementation in switchdev drivers: int foo_port_obj_add(struct net_device *dev, int id, void *obj) { struct foo *foo = netdev_priv(dev); struct s
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
On Mon, Sep 21, 2015 at 1:09 AM, Jiri Pirko wrote: > Mon, Sep 21, 2015 at 09:23:24AM CEST, sfel...@gmail.com wrote: >>On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: >>> Jiri Pirko (6): >>> switchdev: rename "trans" to "trans_ph". >>> switchdev: introduce transaction infrastructure for attr_set and >>> obj_add >>> rocker: switch to local transaction phase enum >>> switchdev: move transaction phase enum under transaction structure >>> rocker: use switchdev transaction queue for allocated memory >>> switchdev: split commit and prepare phase into two callbacks >> >>Patches compile, but first test bombs. Cut-and-paste of dump at end >>of this email. > > Told you :) > > >> >>I'm not sure I'm liking this patchset because it looks like a way for >>switchdev drivers to easily opt-out of the prepare-commit transaction >>model by simply not implementing the *_pre op. I would rather drivers >>explicitly handle the PREPARE phase in code, even if that means >>skipping it gracefully (in code) with a comment (in code) explaining >>why it does not matter for this device/operation. That's what DSA had >>done, mostly because it was a retro-fit. > > Each driver should handle this inside it. If it does not need prepare > state, it simply does not implement it. That is the same for all cb, > ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as > separate callbacks. Implementing multiple callback in one is just ugly, > sorry. > > >> >>Also, the patchset removes the ABORT callback in case of a rollback >>due to a failed PREPARE. We can't make the assumption that it's just >>a memory list to destroy on ABORT. The driver, on PREPARE, may have >>reserved device space or staged an operation on the device which we'll >>need to undo on ABORT. > > Yep, just register an item with custom destructor, there you can do > whatever. Also, I believe much nicer comparing to current code. > > >> >>So we need ABORT back, and we need PREPARE to not be optional, so >>what's left list enqueue/dequeue helpers, which I'm not seeing much >>value in up-leveling as the driver can do list_add/del itself. > > Why would every driver do it itself, over and over when there can be a > clean infrastructure to do that. Including abort phase. Without the driver > needed to be involved. > > >> >>Am I missing something? I didn't see a motivation statement for the >>RFC so I'm not sure where you wanted to take this. > > I want to make current code much nicer, easier to read and implement in > other drivers. Look at rocker.c and how often there is == PREPARE there. > It's nearly impossible to followthe code, sorry. You're going to end up duplicate a lot of code and make it easier to make mistakes. The reason why the == PREPARE checks are there is the same code is run for both PREPARE and COMMIT, with == PREPARE checks right at the last moment to avoid a irreversible change, such as deleting an entry from a table or changing a state variable or bumping a hardware desc pointer. Otherwise the code paths should be the same for PREPARE and COMMIT. If you split these, you're going to duplicate lots of code and one path will deviate from the other over time and now we've broken the prepare-commit model. We want PREPARE to answer the question "is it OK to do this?" and COMMIT to do without failure. Do you have a specific example where you can't follow the code with == PREPARE? > My next patchset is to un-mess rocker.c (that freaking ofdpa stuff is > everywhere) Rocker implements OF-DPA so it's seems natural that freaking OF-DPA stuff would be everywhere. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Mon, Sep 21, 2015 at 09:23:24AM CEST, sfel...@gmail.com wrote: >On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: >> Jiri Pirko (6): >> switchdev: rename "trans" to "trans_ph". >> switchdev: introduce transaction infrastructure for attr_set and >> obj_add >> rocker: switch to local transaction phase enum >> switchdev: move transaction phase enum under transaction structure >> rocker: use switchdev transaction queue for allocated memory >> switchdev: split commit and prepare phase into two callbacks > >Patches compile, but first test bombs. Cut-and-paste of dump at end >of this email. Told you :) > >I'm not sure I'm liking this patchset because it looks like a way for >switchdev drivers to easily opt-out of the prepare-commit transaction >model by simply not implementing the *_pre op. I would rather drivers >explicitly handle the PREPARE phase in code, even if that means >skipping it gracefully (in code) with a comment (in code) explaining >why it does not matter for this device/operation. That's what DSA had >done, mostly because it was a retro-fit. Each driver should handle this inside it. If it does not need prepare state, it simply does not implement it. That is the same for all cb, ndos, netdev notifiers, etc. It is much cleaner and nicer to have these as separate callbacks. Implementing multiple callback in one is just ugly, sorry. > >Also, the patchset removes the ABORT callback in case of a rollback >due to a failed PREPARE. We can't make the assumption that it's just >a memory list to destroy on ABORT. The driver, on PREPARE, may have >reserved device space or staged an operation on the device which we'll >need to undo on ABORT. Yep, just register an item with custom destructor, there you can do whatever. Also, I believe much nicer comparing to current code. > >So we need ABORT back, and we need PREPARE to not be optional, so >what's left list enqueue/dequeue helpers, which I'm not seeing much >value in up-leveling as the driver can do list_add/del itself. Why would every driver do it itself, over and over when there can be a clean infrastructure to do that. Including abort phase. Without the driver needed to be involved. > >Am I missing something? I didn't see a motivation statement for the >RFC so I'm not sure where you wanted to take this. I want to make current code much nicer, easier to read and implement in other drivers. Look at rocker.c and how often there is == PREPARE there. It's nearly impossible to followthe code, sorry. My next patchset is to un-mess rocker.c (that freaking ofdpa stuff is everywhere) > >-scott > > > >[1.998791] BUG: unable to handle kernel NULL pointer dereference >at 0008 >[2.05] IP: [] >rocker_port_kzalloc.isra.50+0x5/0x10 [rocker] >[2.05] PGD 0 >[2.05] Oops: [#1] SMP >[2.05] Modules linked in: floppy(+) ata_piix(+) libata >rocker(+) virtio_pci(+) virtio_ring virtio scsi_mod >[2.05] CPU: 0 PID: 91 Comm: modprobe Not tainted 4.2.0+ #3 >[2.05] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 >[2.05] task: 88000f5e0800 ti: 88000f5f8000 task.ti: >88000f5f8000 >[2.05] RIP: 0010:[] [] >rocker_port_kzalloc.isra.50+0x5/0x10 [rocker] >[2.05] RSP: 0018:88000f5fba20 EFLAGS: 00010246 >[2.05] RAX: 88000f17c050 RBX: 88000b40 RCX: >0020 >[2.05] RDX: RSI: RDI: > >[2.05] RBP: R08: R09: >a0058680 >[2.05] R10: 0001 R11: 9319 R12: > >[2.05] R13: 88000f5b7000 R14: 88000f5b7840 R15: > >[2.05] FS: 7fd132237700() GS:88000fc0() >knlGS: >[2.05] CS: 0010 DS: ES: CR0: 8005003b >[2.05] CR2: 0008 CR3: 0f5d5000 CR4: >06f0 >[2.05] Stack: >[2.05] a005a32a 0001 00ff88000f5b7000 >8175725a >[2.05] a0057750 a0058680 >0020001a >[2.05] 88000b40 88000ea06000 >88000f5b7000 >[2.05] Call Trace: >[2.05] [] ? rocker_cmd_exec+0x5a/0x1f0 [rocker] >[2.05] [] ? >rocker_cmd_get_port_stats_prep+0x90/0x90 [rocker] >[2.05] [] ? >rocker_cmd_get_port_settings_phys_name_proc+0xb0/0xb0 [rocker] >[2.05] [] ? rocker_probe+0xb6a/0xe59 [rocker] >[2.05] [] ? local_pci_probe+0x48/0xa0 >[2.05] [] ? pci_device_probe+0x112/0x130 >[2.05] [] ? driver_probe_device+0x196/0x2a0 >[2.05] [] ? driver_probe_device+0x2a0/0x2a0 >[2.05] [] ? __driver_attach+0x8d/0x90 >[2.05] [] ? driver_probe_device+0x2a0/0x2a0 >[2.05] [] ? bus_for_each_dev+0x4c/0x80 >[2.00
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: > Jiri Pirko (6): > switchdev: rename "trans" to "trans_ph". > switchdev: introduce transaction infrastructure for attr_set and > obj_add > rocker: switch to local transaction phase enum > switchdev: move transaction phase enum under transaction structure > rocker: use switchdev transaction queue for allocated memory > switchdev: split commit and prepare phase into two callbacks Patches compile, but first test bombs. Cut-and-paste of dump at end of this email. I'm not sure I'm liking this patchset because it looks like a way for switchdev drivers to easily opt-out of the prepare-commit transaction model by simply not implementing the *_pre op. I would rather drivers explicitly handle the PREPARE phase in code, even if that means skipping it gracefully (in code) with a comment (in code) explaining why it does not matter for this device/operation. That's what DSA had done, mostly because it was a retro-fit. Also, the patchset removes the ABORT callback in case of a rollback due to a failed PREPARE. We can't make the assumption that it's just a memory list to destroy on ABORT. The driver, on PREPARE, may have reserved device space or staged an operation on the device which we'll need to undo on ABORT. So we need ABORT back, and we need PREPARE to not be optional, so what's left list enqueue/dequeue helpers, which I'm not seeing much value in up-leveling as the driver can do list_add/del itself. Am I missing something? I didn't see a motivation statement for the RFC so I'm not sure where you wanted to take this. -scott [1.998791] BUG: unable to handle kernel NULL pointer dereference at 0008 [2.05] IP: [] rocker_port_kzalloc.isra.50+0x5/0x10 [rocker] [2.05] PGD 0 [2.05] Oops: [#1] SMP [2.05] Modules linked in: floppy(+) ata_piix(+) libata rocker(+) virtio_pci(+) virtio_ring virtio scsi_mod [2.05] CPU: 0 PID: 91 Comm: modprobe Not tainted 4.2.0+ #3 [2.05] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 [2.05] task: 88000f5e0800 ti: 88000f5f8000 task.ti: 88000f5f8000 [2.05] RIP: 0010:[] [] rocker_port_kzalloc.isra.50+0x5/0x10 [rocker] [2.05] RSP: 0018:88000f5fba20 EFLAGS: 00010246 [2.05] RAX: 88000f17c050 RBX: 88000b40 RCX: 0020 [2.05] RDX: RSI: RDI: [2.05] RBP: R08: R09: a0058680 [2.05] R10: 0001 R11: 9319 R12: [2.05] R13: 88000f5b7000 R14: 88000f5b7840 R15: [2.05] FS: 7fd132237700() GS:88000fc0() knlGS: [2.05] CS: 0010 DS: ES: CR0: 8005003b [2.05] CR2: 0008 CR3: 0f5d5000 CR4: 06f0 [2.05] Stack: [2.05] a005a32a 0001 00ff88000f5b7000 8175725a [2.05] a0057750 a0058680 0020001a [2.05] 88000b40 88000ea06000 88000f5b7000 [2.05] Call Trace: [2.05] [] ? rocker_cmd_exec+0x5a/0x1f0 [rocker] [2.05] [] ? rocker_cmd_get_port_stats_prep+0x90/0x90 [rocker] [2.05] [] ? rocker_cmd_get_port_settings_phys_name_proc+0xb0/0xb0 [rocker] [2.05] [] ? rocker_probe+0xb6a/0xe59 [rocker] [2.05] [] ? local_pci_probe+0x48/0xa0 [2.05] [] ? pci_device_probe+0x112/0x130 [2.05] [] ? driver_probe_device+0x196/0x2a0 [2.05] [] ? driver_probe_device+0x2a0/0x2a0 [2.05] [] ? __driver_attach+0x8d/0x90 [2.05] [] ? driver_probe_device+0x2a0/0x2a0 [2.05] [] ? bus_for_each_dev+0x4c/0x80 [2.05] [] ? bus_add_driver+0x119/0x220 [2.05] [] ? 0xa0065000 [2.05] [] ? driver_register+0x5a/0xe0 [2.05] [] ? 0xa0065000 [2.05] [] ? rocker_module_init+0x33/0x1000 [rocker] [2.05] [] ? do_one_initcall+0x90/0x1e0 [2.05] [] ? do_init_module+0x50/0x1d8 [2.05] [] ? load_module+0x1c1c/0x2240 [2.05] [] ? show_initstate+0x50/0x50 [2.05] [] ? SyS_init_module+0x104/0x130 [2.05] [] ? entry_SYSCALL_64_fastpath+0x16/0x75 [2.05] Code: 48 89 c1 48 c7 c2 10 dc 15 81 48 89 c6 e8 e4 0d 4a e1 48 8b 3b e8 1c 1b 4a e1 eb b6 66 2e 0f 1f 84 00 00 00 00 00 48 89 d1 89 f2 <8b> 77 08 e9 73 ff ff ff 0f 1f 00 48 83 ec 68 4c 89 7c 24 60 41 [2.05] RIP [] rocker_port_kzalloc.isra.50+0x5/0x10 [rocker] -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Sat, Sep 19, 2015 at 07:46:14PM CEST, sfel...@gmail.com wrote: >On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: >> Jiri Pirko (6): >> switchdev: rename "trans" to "trans_ph". >> switchdev: introduce transaction infrastructure for attr_set and >> obj_add >> rocker: switch to local transaction phase enum >> switchdev: move transaction phase enum under transaction structure >> rocker: use switchdev transaction queue for allocated memory >> switchdev: split commit and prepare phase into two callbacks > >Whew, that's a lot of work! Seems like a good idea to up-level this >for other drivers to share. Let me apply the patches and run my tests >and get back to you. Expect some blow-ups. Untested. Will test, fix some spelling issues and post v1, most probably during Monday. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Sat, Sep 19, 2015 at 09:02:00PM CEST, vivien.dide...@savoirfairelinux.com wrote: >Hi Jiri, > >On Sep. Saturday 19 (38) 06:23 PM, Jiri Pirko wrote: >> Sat, Sep 19, 2015 at 03:35:51PM CEST, rami.ro...@intel.com wrote: >> >Hi, >> > >> >>introduce tranction enfra and for pre-commit split >> > >> >Typo: >> >Instead "tranction enfra" should be "transaction infrastructure". >> >> Will fix. Thanks! > >Just being picky, there are a couple more typos in: > >2/6: s/separatelly/separately/ >6/6: s/nore/more/ and s/separete/separate/ Will fix. Thanks! > > >Thanks, >-v -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Hi Jiri, On Sep. Saturday 19 (38) 06:23 PM, Jiri Pirko wrote: > Sat, Sep 19, 2015 at 03:35:51PM CEST, rami.ro...@intel.com wrote: > >Hi, > > > >>introduce tranction enfra and for pre-commit split > > > >Typo: > >Instead "tranction enfra" should be "transaction infrastructure". > > Will fix. Thanks! Just being picky, there are a couple more typos in: 2/6: s/separatelly/separately/ 6/6: s/nore/more/ and s/separete/separate/ Thanks, -v -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
On Sat, Sep 19, 2015 at 5:29 AM, Jiri Pirko wrote: > Jiri Pirko (6): > switchdev: rename "trans" to "trans_ph". > switchdev: introduce transaction infrastructure for attr_set and > obj_add > rocker: switch to local transaction phase enum > switchdev: move transaction phase enum under transaction structure > rocker: use switchdev transaction queue for allocated memory > switchdev: split commit and prepare phase into two callbacks Whew, that's a lot of work! Seems like a good idea to up-level this for other drivers to share. Let me apply the patches and run my tests and get back to you. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Sat, Sep 19, 2015 at 03:35:51PM CEST, rami.ro...@intel.com wrote: >Hi, > >>introduce tranction enfra and for pre-commit split > >Typo: >Instead "tranction enfra" should be "transaction infrastructure". Will fix. Thanks! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Hi, >introduce tranction enfra and for pre-commit split Typo: Instead "tranction enfra" should be "transaction infrastructure". Regards, Rami Rosen Intel Corporation -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[patch net-next RFC 0/6] switchdev: introduce tranction enfra and for pre-commit split
Jiri Pirko (6): switchdev: rename "trans" to "trans_ph". switchdev: introduce transaction infrastructure for attr_set and obj_add rocker: switch to local transaction phase enum switchdev: move transaction phase enum under transaction structure rocker: use switchdev transaction queue for allocated memory switchdev: split commit and prepare phase into two callbacks drivers/net/ethernet/rocker/rocker.c | 579 ++- include/net/switchdev.h | 39 ++- net/dsa/slave.c | 129 +--- net/switchdev/switchdev.c| 153 +++-- 4 files changed, 545 insertions(+), 355 deletions(-) -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html