Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Thu, 2017-01-19 at 00:38 -0500, David Miller wrote: > From: Tom Herbert > Date: Wed, 18 Jan 2017 21:22:53 -0800 > > > In principle I agree, but as I previously wrote backporting this > > driver is already a bear. We need to do something here as this is > > quickly approaching being infeasible to backport. Even if we don't > > restructure the directories I'd still like to see some effort > towards > > feature modularization and isolation. > > There simply is a lot changing all the time, no restructuring is > going > to reduce that. The driver simply gets a lot of churn. > > Backporting such things through a bunch of file and/or directory > renames is seriously nothing but asking for more punishment. > > That's why I am against it. So, I spent the better part of two years dealing with this after the drivers/net/mellanox drivers/net/ethernet/mellanox rename, so I'll chime in. Especially since Tom's problem directly relates to the issues I dealt with when trying to take the latest Mellanox driver and shoe-horn it into a Red Hat kernel that didn't have the same net core (I had to pick and choose which mlx patches to take in order to work with our existing net core, and the files were renamed on top of that). If you rename a file from place A to place B and keep the file the same, then a git backport can be done using techniques such as git cherry-pick with the --follow option enabled. It's slow, but works. But that's not what you've brought up. You brought up continuing the rename to go further and split code out into files based upon features. While git cherry-pick can follow renames mostly automatically, it can not follow refactors. If they move the tcflower stuff out to a file on its own, and you then attempt to backport some fix that touches the tcflower support in the new location while you have not taken the move patch, you will find that git will be totally stymied. What you have now is not perfect and you have to sort through commits to get the right ones and skip the ones you can't support, but things mostly apply except when context diffs break issues. With the refactor and separation you speak of, git will be totally unable to help and you will have to manually edit patches to point at new files or hand edit the files to apply the patches. I understand the desire for order instead of chaos, but it really will not help. Bringing patches from the new order back to the old chaos will be so much harder than just continuing as things are that you will silently curse the change shortly after it is made. -- Doug Ledford GPG KeyID: B826A3330E572FDD Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD signature.asc Description: This is a digitally signed message part
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
From: Tom Herbert Date: Wed, 18 Jan 2017 21:22:53 -0800 > In principle I agree, but as I previously wrote backporting this > driver is already a bear. We need to do something here as this is > quickly approaching being infeasible to backport. Even if we don't > restructure the directories I'd still like to see some effort towards > feature modularization and isolation. There simply is a lot changing all the time, no restructuring is going to reduce that. The driver simply gets a lot of churn. Backporting such things through a bunch of file and/or directory renames is seriously nothing but asking for more punishment. That's why I am against it.
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Wed, Jan 18, 2017 at 9:32 AM, David Miller wrote: > From: Saeed Mahameed > Date: Mon, 16 Jan 2017 22:30:29 +0200 > >>> But please bear with me here, what if we queue this patch up to -stable ? > > You've got to be seriously kidding me that your idea is to submit an > incredibly diruptive driver rename to -stable to solve this problem. > > That is completely a non-starter. > > The whole idea is to _MINIMIZE_ the amount of change happening in > -stable in order to avoid regressions and negative consequence for > everyone using the -stable tree. > > I am strongly against a major reorganization of this driver, sorry. > The Synopsys folks want to do the same thing for the stmmac driver, > for even more nefarious reasons, and I'm rejecting all of their > attempts to do that as well. > > If you look in that thread, they said they would "help" with the > -stable backports, and in there I explained why that is a completely > empty gesture. There are people doing the backports outside of your > spehere of influence, who need to get their work done "right now" and > aren't going to consult you and be on your schedule for doing those > backports. David, In principle I agree, but as I previously wrote backporting this driver is already a bear. We need to do something here as this is quickly approaching being infeasible to backport. Even if we don't restructure the directories I'd still like to see some effort towards feature modularization and isolation. Tom
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
From: Saeed Mahameed Date: Mon, 16 Jan 2017 22:30:29 +0200 >> But please bear with me here, what if we queue this patch up to -stable ? You've got to be seriously kidding me that your idea is to submit an incredibly diruptive driver rename to -stable to solve this problem. That is completely a non-starter. The whole idea is to _MINIMIZE_ the amount of change happening in -stable in order to avoid regressions and negative consequence for everyone using the -stable tree. I am strongly against a major reorganization of this driver, sorry. The Synopsys folks want to do the same thing for the stmmac driver, for even more nefarious reasons, and I'm rejecting all of their attempts to do that as well. If you look in that thread, they said they would "help" with the -stable backports, and in there I explained why that is a completely empty gesture. There are people doing the backports outside of your spehere of influence, who need to get their work done "right now" and aren't going to consult you and be on your schedule for doing those backports.
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Mon, Jan 16, 2017 at 11:06 PM, Tom Herbert wrote: > On Mon, Jan 16, 2017 at 12:30 PM, Saeed Mahameed > wrote: >> On Sat, Jan 14, 2017 at 12:07 AM, Saeed Mahameed >> wrote: >>> On Fri, Jan 13, 2017 at 7:14 PM, David Miller wrote: From: Saeed Mahameed Date: Thu, 12 Jan 2017 19:22:34 +0200 > This pull request includes one patch from Leon, this patch as described > below will change the driver directory structure and layout for better, > logical and modular driver files separation. > > This change is important to both rdma and net maintainers in order to > have smoother management of driver patches for different mlx5 sub modules > and smoother rdma-next vs. net-next features submissions. > > Please find more info below -in the tag commit message-, > review and let us know if there's any problem. > > This change doesn't introduce any conflicts with the current mlx5 > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests > worked flawlessly with no issues. > > This is the last pull request meant for both rdma-next and net-next. > Once pulled, this will be the base shared code for both trees. This is pretty crazy, it will make all bug fix backporting to -stable a complete nightmare for myself, Doug, various distribution maintainers and many other people who quietly have to maintain their own trees and do backporting. >>> >>> I hear you, >>> But please bear with me here, what if we queue this patch up to -stable ? >>> and we >>> (Mellanox) and specifically our dedicated inbox team, will make sure >>> that this patch >>> will land on the various distributions and for those maintaining their >>> own trees. >>> This patch is really straight forward (rename files) and I already >>> tried to cherry-pick it >>> on older kernels, I only got a couple of conflicts on some of the >>> "#inlcude" lines we've >>> changed, and they are pretty straightforward to fix, we can even avoid >>> this if we decide to >>> not move mlx5 header files in this phase. >>> >>> If this is possible then all trees will be aligned and it will be a >>> win win situation. >>> >> >> Hi Dave, >> >> Any chance you saw my -stable suggestion above ? >> I think it would really close the backporting gap. >> >> Sorry i am bothering you with this topic, but we really desire the new >> structure and >> I never got your feedback on this suggestion, so i would like to hear >> your thoughts. >> > Saeed, > > I've already you specific suggestions on your new structure, please > consider your reviewers feedback more carefully. Again, the starting I know, sorry i didn't respond on time, I though Leno's response was sufficient. > point for your restructuring should be to separate out the minimum set > of files required to build reasonable driver and then cleanly > compartmentalize the rest of the features to make it easy for your > users to include or not include those in their build. Unless you've I see your point and i am 100% with you on this, we are all on the same page here, we all want to achieve the same goal -modular break down and features logic separation-. And in order to achieve this we need to do some big steps. some of them will have real benefits and others are just moving code around. One thing I need to point out (that Leon already mentioned) that the current driver files sitting flat in mlx5/core directory can be categorized to 4 categories 1. core related & init flows 2. common (infinband/ethernet resource management API) 3. infiniband related logic and FW commands 4. ethernet related logic and FW commands so we figured that first we should split the bigger chunks (ethernet vs. inifniband) into their own directories. then i can follow up with ethernet features break down and kconfig separation. > done this I'm not seeing much benefit for this restructuring. Also, I > would rather see this done in one shot then expecting some sort of > evolution over time to the right solution-- as Dave said the > complexity of this driver is to far down the road to do that. > Well, I also prefer to do this in one shot, but I don't want to start wasting time doing such a huge change then get a NACK on submission. Let's assume we did this in one shot i.e: 1. restructure files and directories 2. re-arrange code (separate logic) 3. Kconfig features control - to show the immediate benefit of this change If this is acceptable by you and we get Dave's blessing on this then i would be happy to provide (as this is our long term plan), but if Dave doesn't approve, then ... I don't really know. Also for 1. backporting is really easy since such patch can be applied easily to any kernel release back to 4.4. but 2. and 3. are really tricky ones. > Tom > >> Thanks, >> Saeed. >> I really don't think you can justify this rearrangement based upon the consequences and how much activity happens in this driver. >>> >>> Ri
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Mon, Jan 16, 2017 at 12:30 PM, Saeed Mahameed wrote: > On Sat, Jan 14, 2017 at 12:07 AM, Saeed Mahameed > wrote: >> On Fri, Jan 13, 2017 at 7:14 PM, David Miller wrote: >>> From: Saeed Mahameed >>> Date: Thu, 12 Jan 2017 19:22:34 +0200 >>> This pull request includes one patch from Leon, this patch as described below will change the driver directory structure and layout for better, logical and modular driver files separation. This change is important to both rdma and net maintainers in order to have smoother management of driver patches for different mlx5 sub modules and smoother rdma-next vs. net-next features submissions. Please find more info below -in the tag commit message-, review and let us know if there's any problem. This change doesn't introduce any conflicts with the current mlx5 fixes and cleanups posted on 2017-01-10 to net branch, and merge tests worked flawlessly with no issues. This is the last pull request meant for both rdma-next and net-next. Once pulled, this will be the base shared code for both trees. >>> >>> This is pretty crazy, it will make all bug fix backporting to -stable >>> a complete nightmare for myself, Doug, various distribution maintainers >>> and many other people who quietly have to maintain their own trees and >>> do backporting. >>> >> >> I hear you, >> But please bear with me here, what if we queue this patch up to -stable ? >> and we >> (Mellanox) and specifically our dedicated inbox team, will make sure >> that this patch >> will land on the various distributions and for those maintaining their >> own trees. >> This patch is really straight forward (rename files) and I already >> tried to cherry-pick it >> on older kernels, I only got a couple of conflicts on some of the >> "#inlcude" lines we've >> changed, and they are pretty straightforward to fix, we can even avoid >> this if we decide to >> not move mlx5 header files in this phase. >> >> If this is possible then all trees will be aligned and it will be a >> win win situation. >> > > Hi Dave, > > Any chance you saw my -stable suggestion above ? > I think it would really close the backporting gap. > > Sorry i am bothering you with this topic, but we really desire the new > structure and > I never got your feedback on this suggestion, so i would like to hear > your thoughts. > Saeed, I've already you specific suggestions on your new structure, please consider your reviewers feedback more carefully. Again, the starting point for your restructuring should be to separate out the minimum set of files required to build reasonable driver and then cleanly compartmentalize the rest of the features to make it easy for your users to include or not include those in their build. Unless you've done this I'm not seeing much benefit for this restructuring. Also, I would rather see this done in one shot then expecting some sort of evolution over time to the right solution-- as Dave said the complexity of this driver is to far down the road to do that. Tom > Thanks, > Saeed. > >>> I really don't think you can justify this rearrangement based upon the >>> consequences and how much activity happens in this driver. >>> >> >> Right, but this is not the only justification, I can sum it up to that >> we would like >> to lay out the foundation for many years to come for a well designed >> driver with a modular >> sub modules break down and scalable infrastructure. We already plan to >> submit more mlx5 >> independent sub modules - just like mlx5e (en_*) files and mlx5_ib >> driver- so this was also >> a reason for us to consider this re-engagement at this stage. >> >>> You should have thought long and hard about the layout a long time ago >>> rather than after the driver has been in the tree for many years. >>> >> >> I had this Idea for the separation before the mlx5 Ethernet >> submission, but I wasn't the maintainer >> back then, and i have been itching to submit such patch for long as >> well, still i don't think >> it is too late, We (Me and Leon) will keep maintaining this driver for >> only god knows how many years to come, >> and the mlx5 drivers are meant to serve at least 3-4 more future HW >> generations. >> >> Long story short, We would like to re-arrange the driver in a way that >> would serve us (the maintainers) and serve those >> who are going do develop the future Stack features and the future HW >> generations over the well designed (Hopefully) >> mlx5 infrastructure. >> Keeping it as it is today, will only make the situation worst in the >> future and it will be really hard to avoid having a spaghetti code >> in the mlx5 driver. All i want to point out here is that maintaining >> such a flat subtree is also nightmare. >> >> So i am only asking you to reconsider this change and give my -stable >> suggestion a thought. >> >> Thank you. >> Saeed.
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Sat, Jan 14, 2017 at 12:07 AM, Saeed Mahameed wrote: > On Fri, Jan 13, 2017 at 7:14 PM, David Miller wrote: >> From: Saeed Mahameed >> Date: Thu, 12 Jan 2017 19:22:34 +0200 >> >>> This pull request includes one patch from Leon, this patch as described >>> below will change the driver directory structure and layout for better, >>> logical and modular driver files separation. >>> >>> This change is important to both rdma and net maintainers in order to >>> have smoother management of driver patches for different mlx5 sub modules >>> and smoother rdma-next vs. net-next features submissions. >>> >>> Please find more info below -in the tag commit message-, >>> review and let us know if there's any problem. >>> >>> This change doesn't introduce any conflicts with the current mlx5 >>> fixes and cleanups posted on 2017-01-10 to net branch, and merge tests >>> worked flawlessly with no issues. >>> >>> This is the last pull request meant for both rdma-next and net-next. >>> Once pulled, this will be the base shared code for both trees. >> >> This is pretty crazy, it will make all bug fix backporting to -stable >> a complete nightmare for myself, Doug, various distribution maintainers >> and many other people who quietly have to maintain their own trees and >> do backporting. >> > > I hear you, > But please bear with me here, what if we queue this patch up to -stable ? and > we > (Mellanox) and specifically our dedicated inbox team, will make sure > that this patch > will land on the various distributions and for those maintaining their > own trees. > This patch is really straight forward (rename files) and I already > tried to cherry-pick it > on older kernels, I only got a couple of conflicts on some of the > "#inlcude" lines we've > changed, and they are pretty straightforward to fix, we can even avoid > this if we decide to > not move mlx5 header files in this phase. > > If this is possible then all trees will be aligned and it will be a > win win situation. > Hi Dave, Any chance you saw my -stable suggestion above ? I think it would really close the backporting gap. Sorry i am bothering you with this topic, but we really desire the new structure and I never got your feedback on this suggestion, so i would like to hear your thoughts. Thanks, Saeed. >> I really don't think you can justify this rearrangement based upon the >> consequences and how much activity happens in this driver. >> > > Right, but this is not the only justification, I can sum it up to that > we would like > to lay out the foundation for many years to come for a well designed > driver with a modular > sub modules break down and scalable infrastructure. We already plan to > submit more mlx5 > independent sub modules - just like mlx5e (en_*) files and mlx5_ib > driver- so this was also > a reason for us to consider this re-engagement at this stage. > >> You should have thought long and hard about the layout a long time ago >> rather than after the driver has been in the tree for many years. >> > > I had this Idea for the separation before the mlx5 Ethernet > submission, but I wasn't the maintainer > back then, and i have been itching to submit such patch for long as > well, still i don't think > it is too late, We (Me and Leon) will keep maintaining this driver for > only god knows how many years to come, > and the mlx5 drivers are meant to serve at least 3-4 more future HW > generations. > > Long story short, We would like to re-arrange the driver in a way that > would serve us (the maintainers) and serve those > who are going do develop the future Stack features and the future HW > generations over the well designed (Hopefully) > mlx5 infrastructure. > Keeping it as it is today, will only make the situation worst in the > future and it will be really hard to avoid having a spaghetti code > in the mlx5 driver. All i want to point out here is that maintaining > such a flat subtree is also nightmare. > > So i am only asking you to reconsider this change and give my -stable > suggestion a thought. > > Thank you. > Saeed.
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Sat, Jan 14, 2017 at 12:56 AM, Tom Herbert wrote: > On Fri, Jan 13, 2017 at 2:45 PM, Saeed Mahameed > wrote: >> On Sat, Jan 14, 2017 at 12:06 AM, Tom Herbert wrote: >>> Btw, we did hit one issue in the backport. We started to get rx csum >>> faults (checksum complete value indicates TCP checksum is bad, but >>> host computation says checksum is good). I ran against 4.9 upstream >>> kernel and do see these, however don't see them in 4.10. I haven't >>> bisected yet. Is this a known issue? >>> >> >> Not to me, I don't recall any csum related fixes or feature submitted >> lately to mlx5, >> Maybe something changed in the stack ? >> >> what configuration are you running ? what traffic ? >> > Nothing fancy. 8 queues and 20 concurrent netperf TCP_STREAMs trips > it. Not a lot of them, but I don't think we really should ever see > these errors. > Hi Tom I've been trying to repro with no luck with kernel v4.9. I am looking for checksum errors in dmesg, is that the place where you saw the errors ? can you dump the error in here ?
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Sat, Jan 14, 2017 at 09:37:20AM -0800, Tom Herbert wrote: > On Fri, Jan 13, 2017 at 2:45 PM, Saeed Mahameed > wrote: > > On Sat, Jan 14, 2017 at 12:06 AM, Tom Herbert wrote: > >> On Fri, Jan 13, 2017 at 12:29 PM, Leon Romanovsky wrote: > >>> On Fri, Jan 13, 2017 at 12:14:07PM -0500, David Miller wrote: > From: Saeed Mahameed > Date: Thu, 12 Jan 2017 19:22:34 +0200 > > > This pull request includes one patch from Leon, this patch as described > > below will change the driver directory structure and layout for better, > > logical and modular driver files separation. > > > > This change is important to both rdma and net maintainers in order to > > have smoother management of driver patches for different mlx5 sub > > modules > > and smoother rdma-next vs. net-next features submissions. > > > > Please find more info below -in the tag commit message-, > > review and let us know if there's any problem. > > > > This change doesn't introduce any conflicts with the current mlx5 > > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests > > worked flawlessly with no issues. > > > > This is the last pull request meant for both rdma-next and net-next. > > Once pulled, this will be the base shared code for both trees. > > This is pretty crazy, it will make all bug fix backporting to -stable > a complete nightmare for myself, Doug, various distribution maintainers > and many other people who quietly have to maintain their own trees and > do backporting. > >>> > >>> Hi Dave, > >>> > >>> I understand your worries, but our case is similar to various other > >>> drivers, for example hfi1 which was in staging for years while > >>> supported in RedHat and moved from there to IB. The Chelsio drivers did > >>> similar reorg in 2016 (drivers/net/ethernet/chelsio/libcxgb) while their > >>> drivers were in the tree for long time before. > >>> > >>> Additionally, Doug doesn't need to maintain -stable queue and it is done > >>> by relevant submaintainers who are adding stable tags by themselves. In > >>> the IB case, the burden will continue to be on me and not on Doug. > >>> > >> Recently I had to backport the mlx5 driver from 4.9 to 4.6 in order to > >> get support for XDP. The biggest issue I faced was the lack of > >> modularity in the many driver features that are now supported. The > >> problem with backporting these new features is the spider web of > >> dependencies that they bring in from the rest of the kernel. I ended > >> up taking out en_rep, vxlan, en_tc, eswitch, and dcbnl. The result was > >> ~340 patches which is still a lot but at least this was constrained to > >> patches in the mlx5 directories and are relevant to what we want to > >> do. We had that complexity in mind. This pull request is our initial step to simplify backporters' work (e.g. rename of en_XXX.c to be XXX.c is the same as en/XXX.c for the backports). We are aware of IB backports which required EN part of driver which had all net dependencies mentioned above and IMHO it is wrong. Our plan is: 1. Reorg/rename files. 2. Separate en and IB from shared code, to allow parallel work. 3. Reduce en and IB dependencies, to simplify backporting. It is really awkward to move to item 2 and 3 before we cleaned the desk. That rename patch provided to all backporters an easy way to bring new features into the driver. > >> > >> In lieu of restructuring the directories, I would much rather see more > >> config options so that we can build drivers that don't unnecessarily > >> complicate our lives with features we don't use. This is not just true > >> for Mellanox, but I would say it would be true of any driver that > >> someone is trying to deploy and maintain at large scale. Different maintainers have a different view on the subject and not everyone is ready to accept new drivers config options. > >> > > > > I think we should have both, if the restructuring made right, > > new whole features (e.g eswitch and eswitch offlaods or any independent > > module), > > can sit in their own directory and keep their own logic concentrated > > in one place, and only touch the > > main driver code with simple entry points in the main flow, this way > > you can simply compile their whole directories > > out with a config flag directly from the Makefile. > > > That would be nice, but that is not what your new layout gives us yet. > The starting point for this structure should be to discern what the > minimum set of files is to build a functional and good performance > driver with the basic functionality including only the most common > offloads. These I would think constitute the core files for the driver > and hence make sense to be in the "core" directory. Personally I would > drop the en_ file prefix and not make an en directory, we're already > in the ethernet driver part of the tree so I don't know what this name >
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Fri, Jan 13, 2017 at 2:45 PM, Saeed Mahameed wrote: > On Sat, Jan 14, 2017 at 12:06 AM, Tom Herbert wrote: >> On Fri, Jan 13, 2017 at 12:29 PM, Leon Romanovsky wrote: >>> On Fri, Jan 13, 2017 at 12:14:07PM -0500, David Miller wrote: From: Saeed Mahameed Date: Thu, 12 Jan 2017 19:22:34 +0200 > This pull request includes one patch from Leon, this patch as described > below will change the driver directory structure and layout for better, > logical and modular driver files separation. > > This change is important to both rdma and net maintainers in order to > have smoother management of driver patches for different mlx5 sub modules > and smoother rdma-next vs. net-next features submissions. > > Please find more info below -in the tag commit message-, > review and let us know if there's any problem. > > This change doesn't introduce any conflicts with the current mlx5 > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests > worked flawlessly with no issues. > > This is the last pull request meant for both rdma-next and net-next. > Once pulled, this will be the base shared code for both trees. This is pretty crazy, it will make all bug fix backporting to -stable a complete nightmare for myself, Doug, various distribution maintainers and many other people who quietly have to maintain their own trees and do backporting. >>> >>> Hi Dave, >>> >>> I understand your worries, but our case is similar to various other >>> drivers, for example hfi1 which was in staging for years while >>> supported in RedHat and moved from there to IB. The Chelsio drivers did >>> similar reorg in 2016 (drivers/net/ethernet/chelsio/libcxgb) while their >>> drivers were in the tree for long time before. >>> >>> Additionally, Doug doesn't need to maintain -stable queue and it is done >>> by relevant submaintainers who are adding stable tags by themselves. In >>> the IB case, the burden will continue to be on me and not on Doug. >>> >> Recently I had to backport the mlx5 driver from 4.9 to 4.6 in order to >> get support for XDP. The biggest issue I faced was the lack of >> modularity in the many driver features that are now supported. The >> problem with backporting these new features is the spider web of >> dependencies that they bring in from the rest of the kernel. I ended >> up taking out en_rep, vxlan, en_tc, eswitch, and dcbnl. The result was >> ~340 patches which is still a lot but at least this was constrained to >> patches in the mlx5 directories and are relevant to what we want to >> do. >> >> In lieu of restructuring the directories, I would much rather see more >> config options so that we can build drivers that don't unnecessarily >> complicate our lives with features we don't use. This is not just true >> for Mellanox, but I would say it would be true of any driver that >> someone is trying to deploy and maintain at large scale. >> > > I think we should have both, if the restructuring made right, > new whole features (e.g eswitch and eswitch offlaods or any independent > module), > can sit in their own directory and keep their own logic concentrated > in one place, and only touch the > main driver code with simple entry points in the main flow, this way > you can simply compile their whole directories > out with a config flag directly from the Makefile. > That would be nice, but that is not what your new layout gives us yet. The starting point for this structure should be to discern what the minimum set of files is to build a functional and good performance driver with the basic functionality including only the most common offloads. These I would think constitute the core files for the driver and hence make sense to be in the "core" directory. Personally I would drop the en_ file prefix and not make an en directory, we're already in the ethernet driver part of the tree so I don't know what this name is supposed to convey. For reference I gave you this list of components I disabled to do the backport, some we're not split out in your directory layout. A good example is the VXLAN offload support, the changes in that area we're mostly due to Alexander's patch set to redo the interface for these accelerations-- so to back port driver from 4.9 to 4.6 I would need to pull those in and whatever else those depend on. It was way easier to just ifdef out the relevant calls and not compile the vxlan files to do the backport. Tom >> Btw, we did hit one issue in the backport. We started to get rx csum >> faults (checksum complete value indicates TCP checksum is bad, but >> host computation says checksum is good). I ran against 4.9 upstream >> kernel and do see these, however don't see them in 4.10. I haven't >> bisected yet. Is this a known issue? >> > > Not to me, I don't recall any csum related fixes or feature submitted > lately to mlx5, > Maybe something changed in the stack ? > > what
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Fri, Jan 13, 2017 at 2:59 PM, Rick Jones wrote: > On 01/13/2017 02:56 PM, Tom Herbert wrote: >> >> On Fri, Jan 13, 2017 at 2:45 PM, Saeed Mahameed >>> >>> what configuration are you running ? what traffic ? >>> >> Nothing fancy. 8 queues and 20 concurrent netperf TCP_STREAMs trips >> it. Not a lot of them, but I don't think we really should ever see >> these errors. > > > Straight-up defaults with netperf, or do you use specific -s/S or -m/M > options? > ./super_netperf_tput 20 -H test001 -l 100 -t TCP_STREAM > happy benchmarking, > > rick jones >
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On 01/13/2017 02:56 PM, Tom Herbert wrote: On Fri, Jan 13, 2017 at 2:45 PM, Saeed Mahameed what configuration are you running ? what traffic ? Nothing fancy. 8 queues and 20 concurrent netperf TCP_STREAMs trips it. Not a lot of them, but I don't think we really should ever see these errors. Straight-up defaults with netperf, or do you use specific -s/S or -m/M options? happy benchmarking, rick jones
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Fri, Jan 13, 2017 at 2:45 PM, Saeed Mahameed wrote: > On Sat, Jan 14, 2017 at 12:06 AM, Tom Herbert wrote: >> On Fri, Jan 13, 2017 at 12:29 PM, Leon Romanovsky wrote: >>> On Fri, Jan 13, 2017 at 12:14:07PM -0500, David Miller wrote: From: Saeed Mahameed Date: Thu, 12 Jan 2017 19:22:34 +0200 > This pull request includes one patch from Leon, this patch as described > below will change the driver directory structure and layout for better, > logical and modular driver files separation. > > This change is important to both rdma and net maintainers in order to > have smoother management of driver patches for different mlx5 sub modules > and smoother rdma-next vs. net-next features submissions. > > Please find more info below -in the tag commit message-, > review and let us know if there's any problem. > > This change doesn't introduce any conflicts with the current mlx5 > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests > worked flawlessly with no issues. > > This is the last pull request meant for both rdma-next and net-next. > Once pulled, this will be the base shared code for both trees. This is pretty crazy, it will make all bug fix backporting to -stable a complete nightmare for myself, Doug, various distribution maintainers and many other people who quietly have to maintain their own trees and do backporting. >>> >>> Hi Dave, >>> >>> I understand your worries, but our case is similar to various other >>> drivers, for example hfi1 which was in staging for years while >>> supported in RedHat and moved from there to IB. The Chelsio drivers did >>> similar reorg in 2016 (drivers/net/ethernet/chelsio/libcxgb) while their >>> drivers were in the tree for long time before. >>> >>> Additionally, Doug doesn't need to maintain -stable queue and it is done >>> by relevant submaintainers who are adding stable tags by themselves. In >>> the IB case, the burden will continue to be on me and not on Doug. >>> >> Recently I had to backport the mlx5 driver from 4.9 to 4.6 in order to >> get support for XDP. The biggest issue I faced was the lack of >> modularity in the many driver features that are now supported. The >> problem with backporting these new features is the spider web of >> dependencies that they bring in from the rest of the kernel. I ended >> up taking out en_rep, vxlan, en_tc, eswitch, and dcbnl. The result was >> ~340 patches which is still a lot but at least this was constrained to >> patches in the mlx5 directories and are relevant to what we want to >> do. >> >> In lieu of restructuring the directories, I would much rather see more >> config options so that we can build drivers that don't unnecessarily >> complicate our lives with features we don't use. This is not just true >> for Mellanox, but I would say it would be true of any driver that >> someone is trying to deploy and maintain at large scale. >> > > I think we should have both, if the restructuring made right, > new whole features (e.g eswitch and eswitch offlaods or any independent > module), > can sit in their own directory and keep their own logic concentrated > in one place, and only touch the > main driver code with simple entry points in the main flow, this way > you can simply compile their whole directories > out with a config flag directly from the Makefile. > >> Btw, we did hit one issue in the backport. We started to get rx csum >> faults (checksum complete value indicates TCP checksum is bad, but >> host computation says checksum is good). I ran against 4.9 upstream >> kernel and do see these, however don't see them in 4.10. I haven't >> bisected yet. Is this a known issue? >> > > Not to me, I don't recall any csum related fixes or feature submitted > lately to mlx5, > Maybe something changed in the stack ? > > what configuration are you running ? what traffic ? > Nothing fancy. 8 queues and 20 concurrent netperf TCP_STREAMs trips it. Not a lot of them, but I don't think we really should ever see these errors. Tom >> Thanks, >> Tom >> I really don't think you can justify this rearrangement based upon the consequences and how much activity happens in this driver. You should have thought long and hard about the layout a long time ago rather than after the driver has been in the tree for many years. Sorry. >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Sat, Jan 14, 2017 at 12:06 AM, Tom Herbert wrote: > On Fri, Jan 13, 2017 at 12:29 PM, Leon Romanovsky wrote: >> On Fri, Jan 13, 2017 at 12:14:07PM -0500, David Miller wrote: >>> From: Saeed Mahameed >>> Date: Thu, 12 Jan 2017 19:22:34 +0200 >>> >>> > This pull request includes one patch from Leon, this patch as described >>> > below will change the driver directory structure and layout for better, >>> > logical and modular driver files separation. >>> > >>> > This change is important to both rdma and net maintainers in order to >>> > have smoother management of driver patches for different mlx5 sub modules >>> > and smoother rdma-next vs. net-next features submissions. >>> > >>> > Please find more info below -in the tag commit message-, >>> > review and let us know if there's any problem. >>> > >>> > This change doesn't introduce any conflicts with the current mlx5 >>> > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests >>> > worked flawlessly with no issues. >>> > >>> > This is the last pull request meant for both rdma-next and net-next. >>> > Once pulled, this will be the base shared code for both trees. >>> >>> This is pretty crazy, it will make all bug fix backporting to -stable >>> a complete nightmare for myself, Doug, various distribution maintainers >>> and many other people who quietly have to maintain their own trees and >>> do backporting. >> >> Hi Dave, >> >> I understand your worries, but our case is similar to various other >> drivers, for example hfi1 which was in staging for years while >> supported in RedHat and moved from there to IB. The Chelsio drivers did >> similar reorg in 2016 (drivers/net/ethernet/chelsio/libcxgb) while their >> drivers were in the tree for long time before. >> >> Additionally, Doug doesn't need to maintain -stable queue and it is done >> by relevant submaintainers who are adding stable tags by themselves. In >> the IB case, the burden will continue to be on me and not on Doug. >> > Recently I had to backport the mlx5 driver from 4.9 to 4.6 in order to > get support for XDP. The biggest issue I faced was the lack of > modularity in the many driver features that are now supported. The > problem with backporting these new features is the spider web of > dependencies that they bring in from the rest of the kernel. I ended > up taking out en_rep, vxlan, en_tc, eswitch, and dcbnl. The result was > ~340 patches which is still a lot but at least this was constrained to > patches in the mlx5 directories and are relevant to what we want to > do. > > In lieu of restructuring the directories, I would much rather see more > config options so that we can build drivers that don't unnecessarily > complicate our lives with features we don't use. This is not just true > for Mellanox, but I would say it would be true of any driver that > someone is trying to deploy and maintain at large scale. > I think we should have both, if the restructuring made right, new whole features (e.g eswitch and eswitch offlaods or any independent module), can sit in their own directory and keep their own logic concentrated in one place, and only touch the main driver code with simple entry points in the main flow, this way you can simply compile their whole directories out with a config flag directly from the Makefile. > Btw, we did hit one issue in the backport. We started to get rx csum > faults (checksum complete value indicates TCP checksum is bad, but > host computation says checksum is good). I ran against 4.9 upstream > kernel and do see these, however don't see them in 4.10. I haven't > bisected yet. Is this a known issue? > Not to me, I don't recall any csum related fixes or feature submitted lately to mlx5, Maybe something changed in the stack ? what configuration are you running ? what traffic ? > Thanks, > Tom > >>> >>> I really don't think you can justify this rearrangement based upon the >>> consequences and how much activity happens in this driver. >>> >>> You should have thought long and hard about the layout a long time ago >>> rather than after the driver has been in the tree for many years. >>> >>> Sorry. > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Fri, Jan 13, 2017 at 7:14 PM, David Miller wrote: > From: Saeed Mahameed > Date: Thu, 12 Jan 2017 19:22:34 +0200 > >> This pull request includes one patch from Leon, this patch as described >> below will change the driver directory structure and layout for better, >> logical and modular driver files separation. >> >> This change is important to both rdma and net maintainers in order to >> have smoother management of driver patches for different mlx5 sub modules >> and smoother rdma-next vs. net-next features submissions. >> >> Please find more info below -in the tag commit message-, >> review and let us know if there's any problem. >> >> This change doesn't introduce any conflicts with the current mlx5 >> fixes and cleanups posted on 2017-01-10 to net branch, and merge tests >> worked flawlessly with no issues. >> >> This is the last pull request meant for both rdma-next and net-next. >> Once pulled, this will be the base shared code for both trees. > > This is pretty crazy, it will make all bug fix backporting to -stable > a complete nightmare for myself, Doug, various distribution maintainers > and many other people who quietly have to maintain their own trees and > do backporting. > I hear you, But please bear with me here, what if we queue this patch up to -stable ? and we (Mellanox) and specifically our dedicated inbox team, will make sure that this patch will land on the various distributions and for those maintaining their own trees. This patch is really straight forward (rename files) and I already tried to cherry-pick it on older kernels, I only got a couple of conflicts on some of the "#inlcude" lines we've changed, and they are pretty straightforward to fix, we can even avoid this if we decide to not move mlx5 header files in this phase. If this is possible then all trees will be aligned and it will be a win win situation. > I really don't think you can justify this rearrangement based upon the > consequences and how much activity happens in this driver. > Right, but this is not the only justification, I can sum it up to that we would like to lay out the foundation for many years to come for a well designed driver with a modular sub modules break down and scalable infrastructure. We already plan to submit more mlx5 independent sub modules - just like mlx5e (en_*) files and mlx5_ib driver- so this was also a reason for us to consider this re-engagement at this stage. > You should have thought long and hard about the layout a long time ago > rather than after the driver has been in the tree for many years. > I had this Idea for the separation before the mlx5 Ethernet submission, but I wasn't the maintainer back then, and i have been itching to submit such patch for long as well, still i don't think it is too late, We (Me and Leon) will keep maintaining this driver for only god knows how many years to come, and the mlx5 drivers are meant to serve at least 3-4 more future HW generations. Long story short, We would like to re-arrange the driver in a way that would serve us (the maintainers) and serve those who are going do develop the future Stack features and the future HW generations over the well designed (Hopefully) mlx5 infrastructure. Keeping it as it is today, will only make the situation worst in the future and it will be really hard to avoid having a spaghetti code in the mlx5 driver. All i want to point out here is that maintaining such a flat subtree is also nightmare. So i am only asking you to reconsider this change and give my -stable suggestion a thought. Thank you. Saeed.
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Fri, Jan 13, 2017 at 12:29 PM, Leon Romanovsky wrote: > On Fri, Jan 13, 2017 at 12:14:07PM -0500, David Miller wrote: >> From: Saeed Mahameed >> Date: Thu, 12 Jan 2017 19:22:34 +0200 >> >> > This pull request includes one patch from Leon, this patch as described >> > below will change the driver directory structure and layout for better, >> > logical and modular driver files separation. >> > >> > This change is important to both rdma and net maintainers in order to >> > have smoother management of driver patches for different mlx5 sub modules >> > and smoother rdma-next vs. net-next features submissions. >> > >> > Please find more info below -in the tag commit message-, >> > review and let us know if there's any problem. >> > >> > This change doesn't introduce any conflicts with the current mlx5 >> > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests >> > worked flawlessly with no issues. >> > >> > This is the last pull request meant for both rdma-next and net-next. >> > Once pulled, this will be the base shared code for both trees. >> >> This is pretty crazy, it will make all bug fix backporting to -stable >> a complete nightmare for myself, Doug, various distribution maintainers >> and many other people who quietly have to maintain their own trees and >> do backporting. > > Hi Dave, > > I understand your worries, but our case is similar to various other > drivers, for example hfi1 which was in staging for years while > supported in RedHat and moved from there to IB. The Chelsio drivers did > similar reorg in 2016 (drivers/net/ethernet/chelsio/libcxgb) while their > drivers were in the tree for long time before. > > Additionally, Doug doesn't need to maintain -stable queue and it is done > by relevant submaintainers who are adding stable tags by themselves. In > the IB case, the burden will continue to be on me and not on Doug. > Recently I had to backport the mlx5 driver from 4.9 to 4.6 in order to get support for XDP. The biggest issue I faced was the lack of modularity in the many driver features that are now supported. The problem with backporting these new features is the spider web of dependencies that they bring in from the rest of the kernel. I ended up taking out en_rep, vxlan, en_tc, eswitch, and dcbnl. The result was ~340 patches which is still a lot but at least this was constrained to patches in the mlx5 directories and are relevant to what we want to do. In lieu of restructuring the directories, I would much rather see more config options so that we can build drivers that don't unnecessarily complicate our lives with features we don't use. This is not just true for Mellanox, but I would say it would be true of any driver that someone is trying to deploy and maintain at large scale. Btw, we did hit one issue in the backport. We started to get rx csum faults (checksum complete value indicates TCP checksum is bad, but host computation says checksum is good). I ran against 4.9 upstream kernel and do see these, however don't see them in 4.10. I haven't bisected yet. Is this a known issue? Thanks, Tom >> >> I really don't think you can justify this rearrangement based upon the >> consequences and how much activity happens in this driver. >> >> You should have thought long and hard about the layout a long time ago >> rather than after the driver has been in the tree for many years. >> >> Sorry.
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
On Fri, Jan 13, 2017 at 12:14:07PM -0500, David Miller wrote: > From: Saeed Mahameed > Date: Thu, 12 Jan 2017 19:22:34 +0200 > > > This pull request includes one patch from Leon, this patch as described > > below will change the driver directory structure and layout for better, > > logical and modular driver files separation. > > > > This change is important to both rdma and net maintainers in order to > > have smoother management of driver patches for different mlx5 sub modules > > and smoother rdma-next vs. net-next features submissions. > > > > Please find more info below -in the tag commit message-, > > review and let us know if there's any problem. > > > > This change doesn't introduce any conflicts with the current mlx5 > > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests > > worked flawlessly with no issues. > > > > This is the last pull request meant for both rdma-next and net-next. > > Once pulled, this will be the base shared code for both trees. > > This is pretty crazy, it will make all bug fix backporting to -stable > a complete nightmare for myself, Doug, various distribution maintainers > and many other people who quietly have to maintain their own trees and > do backporting. Hi Dave, I understand your worries, but our case is similar to various other drivers, for example hfi1 which was in staging for years while supported in RedHat and moved from there to IB. The Chelsio drivers did similar reorg in 2016 (drivers/net/ethernet/chelsio/libcxgb) while their drivers were in the tree for long time before. Additionally, Doug doesn't need to maintain -stable queue and it is done by relevant submaintainers who are adding stable tags by themselves. In the IB case, the burden will continue to be on me and not on Doug. > > I really don't think you can justify this rearrangement based upon the > consequences and how much activity happens in this driver. > > You should have thought long and hard about the layout a long time ago > rather than after the driver has been in the tree for many years. > > Sorry. signature.asc Description: PGP signature
Re: [pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
From: Saeed Mahameed Date: Thu, 12 Jan 2017 19:22:34 +0200 > This pull request includes one patch from Leon, this patch as described > below will change the driver directory structure and layout for better, > logical and modular driver files separation. > > This change is important to both rdma and net maintainers in order to > have smoother management of driver patches for different mlx5 sub modules > and smoother rdma-next vs. net-next features submissions. > > Please find more info below -in the tag commit message-, > review and let us know if there's any problem. > > This change doesn't introduce any conflicts with the current mlx5 > fixes and cleanups posted on 2017-01-10 to net branch, and merge tests > worked flawlessly with no issues. > > This is the last pull request meant for both rdma-next and net-next. > Once pulled, this will be the base shared code for both trees. This is pretty crazy, it will make all bug fix backporting to -stable a complete nightmare for myself, Doug, various distribution maintainers and many other people who quietly have to maintain their own trees and do backporting. I really don't think you can justify this rearrangement based upon the consequences and how much activity happens in this driver. You should have thought long and hard about the layout a long time ago rather than after the driver has been in the tree for many years. Sorry.
[pull request][for-next] Mellanox mlx5 Reorganize core driver directory layout
Hi Dave and Doug, This pull request includes one patch from Leon, this patch as described below will change the driver directory structure and layout for better, logical and modular driver files separation. This change is important to both rdma and net maintainers in order to have smoother management of driver patches for different mlx5 sub modules and smoother rdma-next vs. net-next features submissions. Please find more info below -in the tag commit message-, review and let us know if there's any problem. This change doesn't introduce any conflicts with the current mlx5 fixes and cleanups posted on 2017-01-10 to net branch, and merge tests worked flawlessly with no issues. This is the last pull request meant for both rdma-next and net-next. Once pulled, this will be the base shared code for both trees. Thanks, Saeed and Leon. The following changes since commit f502d834950a28e02651bb7e2cc7111ddd352644: net/mlx5: Activate support for 4K UARs (2017-01-09 20:25:10 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git tags/mlx5-dir-layout-reorg for you to fetch changes up to 8a9a0375cee4888f7d70e651f3e76347c731c51f: {net, IB}/mlx5: Reorganize driver file layout (2017-01-11 13:40:43 +0200) {net, IB}/mlx5: Reorganize driver file layout This patch organizes mlx5 driver file layout to better reflect mlx5 modularity and allow future separation between EN, IB and shared code parts. The new structure: * drivers/net/ethernet/mellanox/mlx5/* - core HW/PCI driver logic * drivers/net/ethernet/mellanox/mlx5/en/*- ethernet * drivers/net/ethernet/mellanox/mlx5/fs/*- flow steering * drivers/net/ethernet/mellanox/mlx5/sriov/* - SR-IOV and E-Switch * drivers/net/ethernet/mellanox/mlx5/lib/* - common mlx5 commands and API library In future submissions, we will shrink the "lib" directory to the code related to both subsystems only, while IB part will be moved to drivers/infiniband/hw/mlx5, and EN will be moved to "en" directory. Such separation will make this library (shared) code to be lean and minimal, and help avoid future conflicts between IB and net submissions. The proposed structure allows us to remove include/linux/mlx5, which belongs solely to Mellanox's devices and don't need to be exposed in common linux include directory. The following change goes together with update of MAINTAINERS file to more granular maintainership roles: * drivers/net/ethernet/mellanox/mlx5/* - Saeed, Matan and Leon * drivers/net/ethernet/mellanox/mlx5/en/*- Saeed * drivers/net/ethernet/mellanox/mlx5/fs/*- Saeed, Matan and Leon * drivers/net/ethernet/mellanox/mlx5/sriov/* - Saeed * drivers/net/ethernet/mellanox/mlx5/lib/* - Saeed, Matan and Leon * include/uapi/rdma/mlx5-abi.h - Matan and Leon Thanks, Leon Romanovsky. Leon Romanovsky (1): {net, IB}/mlx5: Reorganize driver file layout MAINTAINERS | 11 ++- drivers/infiniband/hw/mlx5/Makefile | 1 + drivers/infiniband/hw/mlx5/ib_virt.c| 2 +- drivers/infiniband/hw/mlx5/mad.c| 4 ++-- drivers/infiniband/hw/mlx5/main.c | 6 +++--- drivers/infiniband/hw/mlx5/mlx5_ib.h| 10 +- drivers/infiniband/hw/mlx5/srq.c| 4 ++-- drivers/net/ethernet/mellanox/Kconfig | 2 +- drivers/net/ethernet/mellanox/Makefile | 2 +- .../net/ethernet/mellanox/mlx5/{core => }/Kconfig | 0 drivers/net/ethernet/mellanox/mlx5/Makefile | 21 + drivers/net/ethernet/mellanox/mlx5/{core => }/cmd.c | 5 ++--- drivers/net/ethernet/mellanox/mlx5/core/Makefile| 13 - .../net/ethernet/mellanox/mlx5/{core => }/debugfs.c | 8 drivers/net/ethernet/mellanox/mlx5/{core => }/dev.c | 4 ++-- .../mellanox/mlx5/{core/en_arfs.c => en/arfs.c} | 4 ++-- .../mellanox/mlx5/{core/en_clock.c => en/clock.c} | 2 +- .../mellanox/mlx5/{core/en_common.c => en/common.c} | 2 +- .../mellanox/mlx5/{core/en_dcbnl.c => en/dcbnl.c} | 2 +- .../net/ethernet/mellanox/mlx5/{core => en}/en.h| 18 +- .../mlx5/{core/en_ethtool.c => en/ethtool.c}| 2 +- .../mellanox/mlx5/{core/en_fs.c => en/fs.c} | 4 ++-- .../mlx5/{core/en_fs_ethtool.c => en/fs_ethtool.c} | 4 ++-- .../mellanox/mlx5/{core/en_main.c => en/main.c} | 10 +- .../mellanox/mlx5/{core/en_rep.c => en/rep.c} | 8 .../mellanox/mlx5/{core/en_rx.c => en/rx.c} | 6 +++--- .../mellanox/mlx5/{core/en_rx_am.c => en/rx_am.c} | 2 +- .../mlx5/{core/en_selftest.c => en/selftest.c} | 2 +- ...