Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hi Gregory On Fri, 26 Aug 2016 10:43:43 +0200 Gregory CLEMENT wrote: > Hi Ralph, > > On jeu., août 25 2016, Ralph Sennhauser > wrote: > > > Hi Jason. > > > > On Wed, 24 Aug 2016 21:48:36 + > > Jason Cooper wrote: > > > >> All, > >> > >> On Wed, Aug 24, 2016 at 10:41:02PM +0200, Ralph Sennhauser wrote: > >> > On Wed, 24 Aug 2016 20:15:31 +0200 > >> > Thomas Petazzoni wrote: > >> > > On Wed, 24 Aug 2016 19:10:04 +0200, Ralph Sennhauser wrote: > >> > > > >> > > The people who can take this decision are rather the > >> > > maintainers of the platform itself, or possibly the arm-soc > >> > > maintainers if you still don't like what the platform > >> > > maintainers decided. > >> > > >> > We both have a conflict of interest here, so your offer in an > >> > other message to let the platform maintainers decide is > >> > appreciated. > >> > >> Ok, I'm going to jump in here with the caveats that a) I don't mind > >> being overruled, and b) I'm not going to participate in a > >> never-ending thread on this topic. ;-) > > > > I'm also not interested in a never ending thread. It's moot that > > udev > > I am the one who applied this commit. > > First it is really unfortunate this problem was not raised when this > patch was discussed especially because openwrt was part of the > discussion: > http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/411382.html > > Then, the main motivation for merging this patch was to ease the work > of people doing bring-up of new board using Armada 38x SoCs. When > doing this we just rely on datasheet and U-Boot and it occurred that > the way the Soc was designed they put the first GMAC at a higher > address to the second one. Respecting this order helps low level > development. > > However for more advanced configuration we expect that more clever > tools such as udev should be used. (and Lennart confirm that it is > still working). 
> > Also note that in the past we considered to being able to modify the > order of the ethernet device from the device tree by using the > alias. But the idea was rejected by David Miller (the network > subsystem maintainer) because this kind of policy should be done at > userspace level. > > For these reasons I won't revert this commit. > Look at the Gentoo ebuild for Lennart's definition of works. But yes, obviously it can among other ways be addressed in userspace. For me this never was a technical discussion but one about politics. From your mail I take the following: 1) It's not a case of sneaking in the change and hoping no one notices before it hits stable. 2) It "broke" those Linksys devices but as you had an offer for testing I can only assume fair game. 3) You feel comfortable holding out your neck creating a precedent. Therefore I respect your decision and won't pursue the issue any further. Thanks Ralph > Gregory > > > > can't rename to kernel names sanely and we were sold ep34aj17asz98 > > as the replacement. Or that tearing apart the casing to replace the > > wifi modules with an ethernet one when there are already 5 Gbit > > ports is not a case I care about. > > > > Relying on the order might in theory have flaws but works just fine > > in practice. Thomas Petazzoni, I, OpenWrt / Lede / etc all do so > > with success. > > > > It's also not a side effect from major changes, so it didn't break > > by accident but as the subject says deliberately. > > > >> So, I'm just a back-seat co-maintainer for mvebu nowadays, my > >> opinion should be taken with a grain of salt. > >> > >> From the kernel-side, there is no guarantee of device naming. The > >> kernel simply doesn't have sufficient information at boot time. > >> This is why udev, systemd-udev-ntpd-syslog-kitchensink, and others > >> were created. To read immutable attributes of a device and apply > >> consistent naming to them based on configuration files. 
That's why > >> userspace frequently uses uuids to locate partitions, rather > >> than /dev/sdX[0-9] nodes. > >> > >> The devicetree "ordered by address" rule is, in my opinion, an > >> arbitrary CDO-rule [1]. It doesn't describe the hardware, or a > >> relationship between them. As such, it's just as arbitrary as > >> probe order. It just happens to be something we can twiddle. It > >> also depends on dtc preserving that order. > >> > > > > How exceptional is this exception we are talking about? I mean if > > it's the only place you might want to change it even in 2 years but > > you can't because of the very same rule which was broken here. > > > >> From the user side, without udev and friends, shit changed from one > >> kernel to the next. That's not good. I can certainly see the > >> point that it should have been left alone to begin with. But we > >> aren't there today. > >> > > > > I think if we were at 4.6-rcX reverting would be an easy call. I > > know it's more difficult to make this call now. > > > > Ltsi is 4.1, longterm is 4.4, so penetration
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hi Ralph, On jeu., août 25 2016, Ralph Sennhauser wrote: > Hi Jason. > > On Wed, 24 Aug 2016 21:48:36 + > Jason Cooper wrote: > >> All, >> >> On Wed, Aug 24, 2016 at 10:41:02PM +0200, Ralph Sennhauser wrote: >> > On Wed, 24 Aug 2016 20:15:31 +0200 >> > Thomas Petazzoni wrote: >> > > On Wed, 24 Aug 2016 19:10:04 +0200, Ralph Sennhauser wrote: >> > > >> > > The people who can take this decision are rather the maintainers >> > > of the platform itself, or possibly the arm-soc maintainers if >> > > you still don't like what the platform maintainers decided. >> > >> > We both have a conflict of interest here, so your offer in an other >> > message to let the platform maintainers decide is appreciated. >> >> Ok, I'm going to jump in here with the caveats that a) I don't mind >> being overruled, and b) I'm not going to participate in a never-ending >> thread on this topic. ;-) > > I'm also not interested in a never ending thread. It's moot that udev I am the one who applied this commit. First it is really unfortunate this problem was not raised when this patch was discussed especially because openwrt was part of the discussion: http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/411382.html Then, the main motivation for merging this patch was to ease the work of people doing bring-up of new board using Armada 38x SoCs. When doing this we just rely on datasheet and U-Boot and it occurred that the way the Soc was designed they put the first GMAC at a higher address to the second one. Respecting this order helps low level development. However for more advanced configuration we expect that more clever tools such as udev should be used. (and Lennart confirm that it is still working). Also note that in the past we considered to being able to modify the order of the ethernet device from the device tree by using the alias. But the idea was rejected by David Miller (the network subsystem maintainer) because this kind of policy should be done at userspace level. 
For these reasons I won't revert this commit. Gregory > can't rename to kernel names sanely and we were sold ep34aj17asz98 as > the replacement. Or that tearing apart the casing to replace the wifi > modules with an ethernet one when there are already 5 Gbit ports is not > a case I care about. > > Relying on the order might in theory have flaws but works just fine in > practice. Thomas Petazzoni, I, OpenWrt / Lede / etc all do so with > success. > > It's also not a side effect from major changes, so it didn't break by > accident but as the subject says deliberately. > >> So, I'm just a back-seat co-maintainer for mvebu nowadays, my opinion >> should be taken with a grain of salt. >> >> From the kernel-side, there is no guarantee of device naming. The >> kernel simply doesn't have sufficient information at boot time. This >> is why udev, systemd-udev-ntpd-syslog-kitchensink, and others were >> created. To read immutable attributes of a device and apply >> consistent naming to them based on configuration files. That's why >> userspace frequently uses uuids to locate partitions, rather >> than /dev/sdX[0-9] nodes. >> >> The devicetree "ordered by address" rule is, in my opinion, an >> arbitrary CDO-rule [1]. It doesn't describe the hardware, or a >> relationship between them. As such, it's just as arbitrary as probe >> order. It just happens to be something we can twiddle. It also >> depends on dtc preserving that order. >> > > How exceptional is this exception we are talking about? I mean if it's > the only place you might want to change it even in 2 years but you > can't because of the very same rule which was broken here. > >> From the user side, without udev and friends, shit changed from one >> kernel to the next. That's not good. I can certainly see the point >> that it should have been left alone to begin with. But we aren't >> there today. >> > > I think if we were at 4.6-rcX reverting would be an easy call. I know > it's more difficult to make this call now. 
> > Ltsi is 4.1, longterm is 4.4, so penetration is probably marginal at > best at this time. From my view the damage caused by a revert would be > less than the damage that will be caused by the commit if left in but we > can't wait much much longer until this changes. > >> So what happens if we revert now? Right or wrong, in a couple of >> months, someone else will complain, asking to revert the revert. >> And what will our answer be? "We did it last time, but not this >> time." or "Ok, but gosh-darnit, this is the last time." >> > > If you use the ordering by address as main argument for the revert there > will be nothing to argue about. > >> To be blunt, I think our best path forward is to just hold our noses >> and let it stand as is. Some will fix their userspace to adapt, >> others will carry a patch. It's more important at this point to be >> consistent moving forward. It's better to hear "Yeah, it fucking >> changed once." rather
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Thu, Aug 25, 2016 at 09:38:39AM +0200, Ralph Sennhauser wrote: > I'm also not interested in a never ending thread. It's moot that udev > can't rename to kernel names sanely and we were sold ep34aj17asz98 as > the replacement. Or that tearing apart the casing to replace the wifi > modules with an ethernet one when there are already 5 Gbit ports is not > a case I care about. > > Relying on the order might in theory have flaws but works just fine in > practice. Thomas Petazzoni, I, OpenWrt / Lede / etc all do so with > success. It has worked so far. I expect it to break at some point. > It's also not a side effect from major changes, so it didn't break by > accident but as the subject says deliberately. In this case that is certainly true. > How exceptional is this exception we are talking about? I mean if it's > the only place you might want to change it even in 2 years but you > can't because of the very same rule which was broken here. I did a quick look at the dts files, and there are quite a few cases that are not ordered, and I can't find anything in Documentation that says they should be sorted. It just appears that in most cases people do sort them, since that just makes sense for finding things and it is as good an order as any other. > I think if we were at 4.6-rcX reverting would be an easy call. I know > it's more difficult to make this call now. > > Ltsi is 4.1, longterm is 4.4, so penetration is probably marginal at > best at this time. From my view the damage caused by a revert would be > less than the damage that will be caused by the commit if left in but we > can't wait much much longer until this changes. Certainly no longterm kernels have this change yet, so it certainly does seem the impact would be rather minimal. Usually the rule seems to be "Don't break user space". This did potentially break userspace, so then the question is whether what user space was doing was considered proper and something that should expect not to break. 
> If you use the ordering by address as main argument for the revert there > will be nothing to argue about. If only there was a documented place that had that rule. I can't find it. -- Len Sorensen
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hi Jason. On Wed, 24 Aug 2016 21:48:36 + Jason Cooper wrote: > All, > > On Wed, Aug 24, 2016 at 10:41:02PM +0200, Ralph Sennhauser wrote: > > On Wed, 24 Aug 2016 20:15:31 +0200 > > Thomas Petazzoni wrote: > > > On Wed, 24 Aug 2016 19:10:04 +0200, Ralph Sennhauser wrote: > > > > > > The people who can take this decision are rather the maintainers > > > of the platform itself, or possibly the arm-soc maintainers if > > > you still don't like what the platform maintainers decided. > > > > We both have a conflict of interest here, so your offer in an other > > message to let the platform maintainers decide is appreciated. > > Ok, I'm going to jump in here with the caveats that a) I don't mind > being overruled, and b) I'm not going to participate in a never-ending > thread on this topic. ;-) I'm also not interested in a never ending thread. It's moot that udev can't rename to kernel names sanely and we were sold ep34aj17asz98 as the replacement. Or that tearing apart the casing to replace the wifi modules with an ethernet one when there are already 5 Gbit ports is not a case I care about. Relying on the order might in theory have flaws but works just fine in practice. Thomas Petazzoni, I, OpenWrt / Lede / etc all do so with success. It's also not a side effect from major changes, so it didn't break by accident but as the subject says deliberately. > So, I'm just a back-seat co-maintainer for mvebu nowadays, my opinion > should be taken with a grain of salt. > > From the kernel-side, there is no guarantee of device naming. The > kernel simply doesn't have sufficient information at boot time. This > is why udev, systemd-udev-ntpd-syslog-kitchensink, and others were > created. To read immutable attributes of a device and apply > consistent naming to them based on configuration files. That's why > userspace frequently uses uuids to locate partitions, rather > than /dev/sdX[0-9] nodes. 
> > The devicetree "ordered by address" rule is, in my opinion, an > arbitrary CDO-rule [1]. It doesn't describe the hardware, or a > relationship between them. As such, it's just as arbitrary as probe > order. It just happens to be something we can twiddle. It also > depends on dtc preserving that order. > How exceptional is this exception we are talking about? I mean if it's the only place you might want to change it even in 2 years but you can't because of the very same rule which was broken here. > From the user side, without udev and friends, shit changed from one > kernel to the next. That's not good. I can certainly see the point > that it should have been left alone to begin with. But we aren't > there today. > I think if we were at 4.6-rcX reverting would be an easy call. I know it's more difficult to make this call now. Ltsi is 4.1, longterm is 4.4, so penetration is probably marginal at best at this time. From my view the damage caused by a revert would be less than the damage that will be caused by the commit if left in but we can't wait much much longer until this changes. > So what happens if we revert now? Right or wrong, in a couple of > months, someone else will complain, asking to revert the revert. > And what will our answer be? "We did it last time, but not this > time." or "Ok, but gosh-darnit, this is the last time." > If you use the ordering by address as main argument for the revert there will be nothing to argue about. > To be blunt, I think our best path forward is to just hold our noses > and let it stand as is. Some will fix their userspace to adapt, > others will carry a patch. It's more important at this point to be > consistent moving forward. It's better to hear "Yeah, it fucking > changed once." rather than "I don't know what to expect, it changes > every few releases." > > thx, > > Jason. > > > [1] CDO: OCD with the letters neatly arranged in alphabetical order. Thanks for sharing your thoughts Regards Ralph
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
All, On Wed, Aug 24, 2016 at 10:41:02PM +0200, Ralph Sennhauser wrote: > On Wed, 24 Aug 2016 20:15:31 +0200 > Thomas Petazzoni wrote: > > On Wed, 24 Aug 2016 19:10:04 +0200, Ralph Sennhauser wrote: > > > > The people who can take this decision are rather the maintainers of > > the platform itself, or possibly the arm-soc maintainers if you still > > don't like what the platform maintainers decided. > > We both have a conflict of interest here, so your offer in an other > message to let the platform maintainers decide is appreciated. Ok, I'm going to jump in here with the caveats that a) I don't mind being overruled, and b) I'm not going to participate in a never-ending thread on this topic. ;-) So, I'm just a back-seat co-maintainer for mvebu nowadays, my opinion should be taken with a grain of salt. From the kernel-side, there is no guarantee of device naming. The kernel simply doesn't have sufficient information at boot time. This is why udev, systemd-udev-ntpd-syslog-kitchensink, and others were created. To read immutable attributes of a device and apply consistent naming to them based on configuration files. That's why userspace frequently uses uuids to locate partitions, rather than /dev/sdX[0-9] nodes. The devicetree "ordered by address" rule is, in my opinion, an arbitrary CDO-rule [1]. It doesn't describe the hardware, or a relationship between them. As such, it's just as arbitrary as probe order. It just happens to be something we can twiddle. It also depends on dtc preserving that order. From the user side, without udev and friends, shit changed from one kernel to the next. That's not good. I can certainly see the point that it should have been left alone to begin with. But we aren't there today. So what happens if we revert now? Right or wrong, in a couple of months, someone else will complain, asking to revert the revert. And what will our answer be? "We did it last time, but not this time." or "Ok, but gosh-darnit, this is the last time." 
To be blunt, I think our best path forward is to just hold our noses and let it stand as is. Some will fix their userspace to adapt, others will carry a patch. It's more important at this point to be consistent moving forward. It's better to hear "Yeah, it fucking changed once." rather than "I don't know what to expect, it changes every few releases." thx, Jason. [1] CDO: OCD with the letters neatly arranged in alphabetical order.
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Wed, Aug 24, 2016 at 09:52:00PM +0200, Thomas Petazzoni wrote: > I'll let the platform maintainers decide what's the least > intrusive/problematic option. Both solutions have drawbacks, so it's > really a "political" decision to make here. I think the main valid argument for a revert is that it violates the documented dtb ordering rule. It also fails to actually do what it was intended to do since network device probing order really isn't defined in linux and if you care you should fix it in userspace. Fixing a regression would be a side effect, and since the ordering isn't certain anyhow, anyone that did see a regression was doing it "wrong" already, although also anyone that saw a benefit from the change was also doing it "wrong". > Not only async probing, but also PCIe devices, as you mentioned > earlier :-) Yeah they better come up with a safer way to determine which network device is which from user space. -- Len Sorensen
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hi Thomas On Wed, 24 Aug 2016 20:15:31 +0200 Thomas Petazzoni wrote: > Hello, > > On Wed, 24 Aug 2016 19:10:04 +0200, Ralph Sennhauser wrote: > > > Going forward, as we disagree and it's basically a political > > decision, whom do we ask to rule here? Linus? > > I don't think Linus will care about random issues on a random > platform :-) > Probably not about the ordering of the interfaces per se. Let me ask instead do you think he would sign off on that commit? What I do not yet understand is why you not simply carry this patch for your particular board and firmware. As I see it, there is a good chance every one else will just carry the revert of it themselves otherwise. > The people who can take this decision are rather the maintainers of > the platform itself, or possibly the arm-soc maintainers if you still > don't like what the platform maintainers decided. > > Thomas We both have a conflict of interest here, so your offer in an other message to let the platform maintainers decide is appreciated. Regards Ralph
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hello, On Wed, 24 Aug 2016 14:27:58 -0400, Lennart Sorensen wrote: > On Wed, Aug 24, 2016 at 08:14:44PM +0200, Thomas Petazzoni wrote: > > Depends on the network driver I believe. But with an e1000e NIC plugged > > in a PCIe slot, it indeed gets assigned as eth0, and the internal > > mvneta devices get eth1, eth2, etc. > > Which of course means the change does not actually ensure the port > ordering matches the Marvell documentation or u-boot. It only handles > the relative order of the ports. For now. Correct. > So since it doesn't actually work, maybe reverting it so it no longer > violates the dtb ordering rule makes sense. I'll let the platform maintainers decide what's the least intrusive/problematic option. Both solutions have drawbacks, so it's really a "political" decision to make here. > Doesn't mean openwrt/lede/etc don't have to deal with the ordering in > the future if async probing takes off. Not only async probing, but also PCIe devices, as you mentioned earlier :-) Thomas -- Thomas Petazzoni, CTO, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Wed, Aug 24, 2016 at 01:10:23PM -0400, Lennart Sorensen wrote:
> Well certainly doing udevtrigger -n -v I see no ethernet devices (but
> lots of other things). Looking in sysfs it is possible to derive which
> ethX belongs to which port based on the directory names, but that's
> probably not the most convenient manner to deal with it.

OK, udev DOES work:

# udevadm info -p /sys/class/net/eth0
P: /devices/platform/soc/soc:internal-regs/f1034000.ethernet/net/eth0
E: DEVPATH=/devices/platform/soc/soc:internal-regs/f1034000.ethernet/net/eth0
E: IFINDEX=2
E: INTERFACE=eth0
E: SUBSYSTEM=net

# udevadm info -p /sys/class/net/eth1
P: /devices/platform/soc/soc:internal-regs/f107.ethernet/net/eth1
E: DEVPATH=/devices/platform/soc/soc:internal-regs/f107.ethernet/net/eth1
E: IFINDEX=3
E: INTERFACE=eth1
E: SUBSYSTEM=net

So it isn't hopeless.

-- Len Sorensen
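A note on the udevadm output above: the path component just before /net/ethX in DEVPATH (f1034000.ethernet for eth0) names the platform device behind each interface, so it stays tied to the register address regardless of which ethX name the kernel handed out. The following is only a minimal shell sketch of extracting that component. The function names devpath_to_device and iface_device are made up for illustration, and the sketch assumes the sysfs layout shown above (.../<device>/net/<ifname>) and a readlink that supports -f.

```shell
#!/bin/sh
# Hypothetical helpers, not part of udev or any distribution.
# Assumed sysfs layout (as in the udevadm output above):
#   .../<device>/net/<ifname>

# Print the device component of a DEVPATH-style string.
devpath_to_device() {
    _p=${1%/net/*}              # drop the trailing "/net/<ifname>"
    printf '%s\n' "${_p##*/}"   # keep only the last remaining component
}

# Resolve a live interface name through sysfs and print its device.
iface_device() {
    devpath_to_device "$(readlink -f "/sys/class/net/$1")"
}
```

On the board above, "iface_device eth0" would be expected to print f1034000.ethernet. A udev rule could match on the same path through its DEVPATH or KERNELS keys; which names to assign is a policy choice and is left out here.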
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hello, On Wed, 24 Aug 2016 19:10:04 +0200, Ralph Sennhauser wrote: > Going forward, as we disagree and it's basically a political decision, > whom do we ask to rule here? Linus? I don't think Linus will care about random issues on a random platform :-) The people who can take this decision are rather the maintainers of the platform itself, or possibly the arm-soc maintainers if you still don't like what the platform maintainers decided. Thomas -- Thomas Petazzoni, CTO, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Wed, Aug 24, 2016 at 08:14:44PM +0200, Thomas Petazzoni wrote: > Depends on the network driver I believe. But with an e1000e NIC plugged > in a PCIe slot, it indeed gets assigned as eth0, and the internal > mvneta devices get eth1, eth2, etc. Which of course means the change does not actually ensure the port ordering matches the Marvell documentation or u-boot. It only handles the relative order of the ports. For now. So since it doesn't actually work, maybe reverting it so it no longer violates the dtb ordering rule makes sense. Doesn't mean openwrt/lede/etc don't have to deal with the ordering in the future if async probing takes off. -- Len Sorensen
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hello, On Wed, 24 Aug 2016 14:07:27 -0400, Lennart Sorensen wrote: > > The nice thing about having the order in the dtb I thought was that it > > won't ever change. > > I wonder, if someone was to build a box with this cpu, and add a PCIe > network device, which order would they get probed in? Any chance the > PCIe could grab eth0 before the mvneta devices get probed? Depends on the network driver I believe. But with an e1000e NIC plugged in a PCIe slot, it indeed gets assigned as eth0, and the internal mvneta devices get eth1, eth2, etc. Thomas -- Thomas Petazzoni, CTO, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Wed, Aug 24, 2016 at 07:10:04PM +0200, Ralph Sennhauser wrote: > And in how many places this discrepancy was documented? You won't be > able to update them all. Mailing lists, blogs, fora posts and what ever > else. I'd say the damage is done and can't be fixed by simply changing > the order now. > > Whether strong rule or not, obviously it went in so bending the rule is > at least accepted. It's not what I have an issue with but with the > reordering at this point. > > I'm clearly not convinced reverting would do more harm than good, > otherwise I wouldn't have brought it up. I would agree, although if you see below I think it may be something that user space will have to figure out a way to deal with sooner or later anyhow. > It's not about being able to fix the code or documentation but about > having to do so in the first place. > > The nice thing about having the order in the dtb I thought was that it > won't ever change. I wonder, if someone was to build a box with this cpu, and add a PCIe network device, which order would they get probed in? Any chance the PCIe could grab eth0 before the mvneta devices get probed? I also would not count on this not changing in the future potentially by accident. The mmc block devices used to get probed in DTB order, then that changed at some point and reordering the dtb no longer controlled it, rather it was whichever device seemed to finish fastest, and then later that was changed to using the controller's opinion of the device number (which at least made it predictable again, even if not controllable from the dtb). So at the moment the dtb order controls ethernet probe order. Well if other things have stopped doing that, why expect ethernet won't some day stop doing that too? "Fixing" the port numbering by reordering is hence a hack as far as I am concerned and not certain to stay working long term. > Going forward, as we disagree and it's basically a political decision, > whom do we ask to rule here? Linus? 
-- Len Sorensen
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Wed, 24 Aug 2016 16:50:11 +0200 Thomas Petazzoni wrote: > Hello, > > On Sun, 21 Aug 2016 15:11:58 +0200, Ralph Sennhauser wrote: > > > Commit cb4f71c4298853db0c6751b1209e4535956f136c changes the order of > > the network interfaces for armada-38x. As a special exception to the > > "order by register address" rule says the comment in the dtsi. The > > commit message even calls it a violation. > > > > I can't remember having owned a device where the internal and > > external numbering actually matched, so the important bit for me is > > whatever the order is it should remain constant. > > > > Distributions like OpenWrt have to fix their code when moving from > > 4.4 currently to past 4.6 [1]. Worse the so called "wrong ordering" > > is actually documented [2]. There are likely more victims out > > there. In case it goes unnoticed by the distribution the user's lan > > becomes wan and vice versa. > > We had many many users getting confused by the fact that the order of > the network interfaces was inverted compared to: > > * The board documentations > * The U-Boot numbering > * And to a lesser extent, the vendor kernel > And in how many places this discrepancy was documented? You won't be able to update them all. Mailing lists, blogs, fora posts and what ever else. I'd say the damage is done and can't be fixed by simply changing the order now. > So having things match the documentation numbering was in our opinion > the least confusing thing moving forward. We should have done it > earlier, but we thought that the rule "order by register address" was > a very strong rule. > Whether strong rule or not, obviously it went in so bending the rule is at least accepted. It's not what I have an issue with but with the reordering at this point. > At this point, reverting the patch will I believe cause more harm than > good. It's going to re-confuse again people. I'm clearly not convinced reverting would do more harm than good, otherwise I wouldn't have brought it up. 
> > Regarding the fact that the "wrong numbering is actually documented" > is a fairly specious argument. The OpenWRT Wiki has never been an > official documentation of any sort. I see it as a much more important > aspect that the numbering of the Ethernet interfaces matches the user > manual Marvell provides with its development and evaluation boards. > The OpenWRT Wiki can certainly be fixed accordingly. > It's not about being able to fix the code or documentation but about having to do so in the first place. The nice thing about having the order in the dtb I thought was that it won't ever change. Going forward, as we disagree and it's basically a political decision, whom do we ask to rule here? Linus? Best regards Ralph > Best regards, > > Thomas
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Wed, Aug 24, 2016 at 06:43:34PM +0200, Thomas Petazzoni wrote: > Well, just like for the documentation aspect, you're seeing this > from the OpenWRT/LEDE angle only. Other people are using plenty of > other things. > > We knew it would potentially cause some breakage, so it was a > trade-off. I still believe the new arrangement is better, and you've so > far been the only person reporting an issue with this (compared to > numerous people being confused by the original ordering problem). I didn't report it. I just commented. I will be affected when LEDE does move their kernel past 4.4 I suppose, but I imagine something will be done to deal with it. I have run into issues with things being reordered on other systems though, and it sure is annoying, especially when the new behaviour removes the ability to control the order that was previously there (that is not what is happening here of course). > This is more problematic, and something to be investigated. I don't > immediately see why the Marvell network interfaces would not be visible > by udev, but I haven't tested. Well certainly doing udevtrigger -n -v I see no ethernet devices (but lots of other things). Looking in sysfs it is possible to derive which ethX belongs to which port based on the directory names, but that's probably not the most convenient manner to deal with it. > The solution of adding an alias in the DT, and using that to name > network interfaces has already been proposed multiple times, but has > been rejected by the networking maintainer, who suggests to use > userspace tools (udev or something else) to rename network interfaces. > See for example https://patchwork.kernel.org/patch/4122441/, which was > proposed by my colleague Boris Brezillon. Sure, although it seems many embedded systems would rather avoid udev (even more so after systemd seems to have taken it over), and in this case I get the impression so far that udev may not even be able to help on this system. 
(although maybe I just did it wrong). > You can always try to propose again a solution, but I doubt it will be > accepted. Yeah me too, hence why I can't be bothered to try. I have accepted there are some things the maintainers won't fix so I have a few patches I maintain myself and can keep doing so. They are tiny after all. -- Len Sorensen
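
The sysfs lookup described in this message could be sketched roughly as follows. This is only an illustration, assuming the usual Linux layout in which each entry under /sys/class/net has a `device` symlink pointing at its parent device directory; the function name `interfaces_by_device` is made up for the example, and what the directory names look like on a given board is not something this thread specifies.

```python
import os


def interfaces_by_device(sys_class_net="/sys/class/net"):
    """Map each interface name to the sysfs directory name of its parent device."""
    mapping = {}
    for name in sorted(os.listdir(sys_class_net)):
        device_link = os.path.join(sys_class_net, name, "device")
        if not os.path.exists(device_link):
            # Virtual interfaces such as "lo" have no backing device.
            continue
        # Resolve the symlink and keep only the last path component,
        # which for platform devices typically encodes the register address.
        mapping[name] = os.path.basename(os.path.realpath(device_link))
    return mapping


if __name__ == "__main__":
    for ifname, device in interfaces_by_device().items():
        print(f"{ifname}: {device}")
```

The directory name printed next to each ethX is what a script would then match against the port it expects, without relying on probe order.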
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hello,

On Wed, 24 Aug 2016 12:19:33 -0400, Lennart Sorensen wrote:

> > So having things match the documentation numbering was in our opinion
> > the least confusing thing moving forward. We should have done it
> > earlier, but we thought that the rule "order by register address" was a
> > very strong rule.
> >
> > At this point, reverting the patch would, I believe, cause more harm than
> > good. It's going to confuse people again.
>
> Well, wouldn't it only affect the tiny proportion that have moved to a >4.4
> kernel? Looking at LEDE, OpenWrt, etc., it seems very few have so far.

Well, just like for the documentation aspect, you're seeing this from the OpenWRT/LEDE angle only. Other people are using plenty of other things.

We knew it would potentially cause some breakage, so it was a trade-off. I still believe the new arrangement is better, and you've so far been the only person reporting an issue with this (compared to numerous people being confused by the original ordering problem).

> > The fact that the "wrong numbering" is actually documented is
> > a fairly specious argument. The OpenWRT Wiki has never been an official
> > documentation of any sort. I see it as a much more important aspect
> > that the numbering of the Ethernet interfaces matches the user manual
> > Marvell provides with its development and evaluation boards. The
> > OpenWRT Wiki can certainly be fixed accordingly.
>
> That could be a good argument.
>
> Of course this would not be the first Linux release to change network
> device ordering. It has happened many times before.
>
> Unfortunately I don't see the eth ports on the Marvell in udev, so I
> don't know if udev could even help fix the names based on the address
> of the port.

This is more problematic, and something to be investigated. I don't immediately see why the Marvell network interfaces would not be visible by udev, but I haven't tested.

> On any system I have been part of making in the past, I always added an
> ifname attribute to the dtb and made the driver use that. This problem
> is just too silly to put up with. There needs to be some sane way to
> control the names of the interfaces.

The solution of adding an alias in the DT, and using that to name network interfaces has already been proposed multiple times, but has been rejected by the networking maintainer, who suggests using userspace tools (udev or something else) to rename network interfaces. See for example https://patchwork.kernel.org/patch/4122441/, which was proposed by my colleague Boris Brezillon.

You can always try to propose a solution again, but I doubt it will be accepted.

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
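
For the userspace renaming suggested here, one practical wrinkle is that swapping two kernel names (eth0 and eth1, say) cannot be done in a single step, because the target name is still taken. A minimal sketch of a two-phase plan follows. The function `rename_plan`, the `tmpN` names and the device identifiers are all hypothetical, and the sketch assumes that `wanted` covers every interface whose name is a target and that no real interface is already called `tmpN`. It only computes the steps; actually applying them is left to whatever tool is in use.

```python
def rename_plan(current, wanted):
    """Return (old, new) rename steps that turn `current` into `wanted`.

    current: interface name -> device identifier (as seen now)
    wanted:  device identifier -> desired interface name
    """
    # Interfaces whose present name differs from the desired one.
    moves = {
        name: wanted[dev]
        for name, dev in current.items()
        if dev in wanted and wanted[dev] != name
    }
    ordered = sorted(moves)
    # Phase 1: park every interface that must move under a temporary name,
    # so that swaps such as eth0 <-> eth1 never collide.
    steps = [(old, f"tmp{i}") for i, old in enumerate(ordered)]
    # Phase 2: move each parked interface to its final name.
    steps += [(f"tmp{i}", moves[old]) for i, old in enumerate(ordered)]
    return steps
```

Interfaces that already carry the desired name produce no steps, so the plan is empty when nothing needs to change.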
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
On Wed, Aug 24, 2016 at 04:50:11PM +0200, Thomas Petazzoni wrote:
> We had many, many users getting confused by the fact that the order of
> the network interfaces was inverted compared to:
>
>  * The board documentation
>  * The U-Boot numbering
>  * And to a lesser extent, the vendor kernel
>
> So having things match the documentation numbering was in our opinion
> the least confusing thing moving forward. We should have done it
> earlier, but we thought that the rule "order by register address" was a
> very strong rule.
>
> At this point, reverting the patch would, I believe, cause more harm than
> good. It's going to confuse people again.

Well, wouldn't it only affect the tiny proportion that have moved to a >4.4 kernel? Looking at LEDE, OpenWrt, etc., it seems very few have so far. I don't think that's a good argument.

> The fact that the "wrong numbering" is actually documented is
> a fairly specious argument. The OpenWRT Wiki has never been an official
> documentation of any sort. I see it as a much more important aspect
> that the numbering of the Ethernet interfaces matches the user manual
> Marvell provides with its development and evaluation boards. The
> OpenWRT Wiki can certainly be fixed accordingly.

That could be a good argument.

Of course this would not be the first Linux release to change network device ordering. It has happened many times before.

Unfortunately I don't see the eth ports on the Marvell in udev, so I don't know if udev could even help fix the names based on the address of the port.

On any system I have been part of making in the past, I always added an ifname attribute to the dtb and made the driver use that. This problem is just too silly to put up with. There needs to be some sane way to control the names of the interfaces.

--
Len Sorensen
Re: [Regression?] Commit cb4f71c429 deliberately changes order of network interfaces
Hello,

On Sun, 21 Aug 2016 15:11:58 +0200, Ralph Sennhauser wrote:

> Commit cb4f71c4298853db0c6751b1209e4535956f136c changes the order of
> the network interfaces for armada-38x, as a special exception to the
> "order by register address" rule, says the comment in the dtsi. The
> commit message even calls it a violation.
>
> I can't remember having owned a device where the internal and external
> numbering actually matched, so the important bit for me is that whatever
> the order is, it should remain constant.
>
> Distributions like OpenWrt have to fix their code when moving from 4.4
> currently to past 4.6 [1]. Worse, the so-called "wrong ordering" is
> actually documented [2]. There are likely more victims out there. In
> case it goes unnoticed by the distribution, the user's LAN becomes WAN
> and vice versa.

We had many, many users getting confused by the fact that the order of the network interfaces was inverted compared to:

 * The board documentation
 * The U-Boot numbering
 * And to a lesser extent, the vendor kernel

So having things match the documentation numbering was in our opinion the least confusing thing moving forward. We should have done it earlier, but we thought that the rule "order by register address" was a very strong rule.

At this point, reverting the patch would, I believe, cause more harm than good. It's going to confuse people again.

The fact that the "wrong numbering" is actually documented is a fairly specious argument. The OpenWRT Wiki has never been an official documentation of any sort. I see it as a much more important aspect that the numbering of the Ethernet interfaces matches the user manual Marvell provides with its development and evaluation boards. The OpenWRT Wiki can certainly be fixed accordingly.

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
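
The difference between the two orderings can be illustrated with a small Python sketch. The register offsets below are illustrative assumptions rather than values quoted in this thread, and the sketch assumes the kernel hands out ethN names in the order the nodes are listed, which the kernel does not guarantee.

```python
# Hypothetical register offsets for three Ethernet controllers,
# keyed by the name the board documentation uses for each port.
DOCUMENTED = {"eth0": 0x70000, "eth1": 0x30000, "eth2": 0x34000}


def kernel_names(probe_order):
    """Assign ethN names in probe order, as happens when nothing renames them."""
    return {f"eth{i}": offset for i, offset in enumerate(probe_order)}


# "Order by register address": nodes sorted by offset.
by_address = kernel_names(sorted(DOCUMENTED.values()))

# Exception to the rule: nodes listed in documentation order.
by_documentation = kernel_names(DOCUMENTED[name] for name in sorted(DOCUMENTED))

for name in sorted(DOCUMENTED):
    print(
        f"{name}: by address -> {by_address[name]:#x}, "
        f"by documentation -> {by_documentation[name]:#x}"
    )
```

Under these assumptions the controller the documentation calls eth0 ends up as eth2 when nodes are sorted by address, which is the kind of mismatch described above; listing the nodes in documentation order makes the kernel names line up, at the cost of changing names on systems that relied on the old order.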