I'm going to risk the wrath of the open source purists amongst you by 
top-posting. The reason is that there has been lots of email on this subject, 
and I want to summarise the key issue and clearly state our position on it. 
Hoperfully nobody is offended by this!

This thread has generated lots of debate, which is good, and there are a number 
of items that I believe everybody agrees on (having a MAINTAINERS file, better 
tools etc.). However, the key issue that we do not agree on is the granularity 
of the repositories within DPDK.

The current state is:
- One master repository
- A small number of sub-repositories, each with a separate maintainer/SME, 
including: i40e, fm10k, bnx2x, docs, dts. These are used to generate pull 
requests to the main repo.

You're proposing the following:
- Remove the individual PMD repositories, and replace them with a single 
repository containing all PMDs, plus parts of the core code that are closely 
linked to the PMDs, with you as the maintainer and SMEs for each PMD. 

As I've said before, and as Venky has also explained in private email on this, 
we do not agree with this proposed change. We believe it would be a backward 
step, and would have an adverse impact on DPDK.

The key issue here is that we've deliberately tried to give as much control as 
possible to those who are intimately familiar with a particular PMD and the 
underlying hardware. In our view, that's just common sense. What you're 
proposing is to take some of that control away, and give it to someone else (in 
this case you) who isn't familiar with the details. We don't agree with that 
approach.

The arguments I've heard in favour of your proposal are summarised below. 
Apologies for paraphrasing, and for any errors, but the email thread is too big 
and too convoluted to address these all separately. I've also included an 
explanation for each item to say why we don't believe they're sufficient to 
justify your proposed change.

1. Your proposal is consistent with the Linux kernel, but current state is not.
In both cases (current state and your proposal), the workflow is exactly the 
same as the Linux kernel. The difference we're discussing is simply the 
granularity of the repositories.

DPDK is much smaller than the kernel, so the granularity is naturally going to 
be different. The kernel may combine drivers into a single repository due to 
its size and complexity, but that doesn't mean we need to do the same for DPDK.

2. Maintainer and SME are separate roles and should not be combined.
We understand that they're separate roles. For the PMDs that we're developing, 
the most efficient way for us to manage the work is to combine these and have 
one of our SMEs act as maintainer as well. That's an internal decision that 
suits the structure of our development team and the current state of the i40e 
and fm10k PMDs. Others are obviously free to make their own choices for PMDs 
that they're developing.

3. A maintainer can handle a higher volume of patches than will be present in 
any individual PMD.
That's true, but it's also not relevant. Our goal is not to make the 
SME/maintainer for i40e, fm10k etc. a full-time position. Our goal is to have 
an expert in this position, so that we can move quickly and still ensure good 
quality software.

4. We shouldn't give maintainer work to an SME as it detracts from their other 
tasks.
We don't anticipate a problem with workload for the people that we have in 
these positions.

5. There will be extra overhead for developers who want to implement changes 
spanning multiple PMDs.
That's true, but it's also something of a hypothetical argument. The people 
who've spoken against your proposal on the mailing list are from Intel and 
6Wind, who are also the people contributing to most of these PMDs. I had a 
quick scan of the commits to see if I could spot anything from another 
contributor that might fall into this category, and I couldn't (admittedly it 
wasn't an exhaustive search, which someone else is free to do if they want). If 
this situation does arise, then Thomas has previously outlined how it can be 
handled.


In terms of where we go from here, I'd suggest the following:
- Thomas has already asked us about a maintainer for older Intel NICs, which 
we're looking into. We may choose to have a single repository with a single 
maintainer/SME for e1000/igb/ixgbe combined, which would limit the overall 
number of repositories.
- You could pursue a path of having a single repository for non-Intel NICs. 
This would obviously need to be worked with those responsible (Stephen, Sujith 
etc.).
- As Thomas previously suggested, we should review this in future, possibly 
after 2.0. Maybe you'll be proven right and we'll all have to apologise for 
disagreeing!


Tim

> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Friday, January 16, 2015 4:51 PM
> To: O'driscoll, Tim
> Cc: Thomas Monjalon; dev at dpdk.org
> Subject: Re: [dpdk-dev] Why nothing since 1.8.0?
> 
> On Thu, Jan 15, 2015 at 09:55:00PM +0000, O'driscoll, Tim wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > Sent: Thursday, January 15, 2015 6:51 PM
> > > To: Thomas Monjalon
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] Why nothing since 1.8.0?
> > >
> > > On Thu, Jan 15, 2015 at 06:25:33PM +0100, Thomas Monjalon wrote:
> > > > 2015-01-15 08:06, Neil Horman:
> > > > > On Thu, Jan 15, 2015 at 10:51:38AM +0100, Thomas Monjalon wrote:
> > > > > > 2015-01-15 04:27, Ouyang, Changchun:
> > > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhang,
> Helin
> > > > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil
> > > Horman
> > > > > > > > > On Wed, Jan 14, 2015 at 12:23:52PM -0800, Stephen
> Hemminger
> > > wrote:
> > > > > > > > > > Ok, so 1.8.0 came out almost a month ago and none of the
> > > patches
> > > > > > > > > > that were deferred waiting for the release got merged since
> > > then.
> > > > > > > > > > Last commit in git is the 1.8.0 release.
> > > > > > > > > >
> > > > > > > > > > Where is the post-merge window bundle, where are the
> later
> > > commits?
> > > > > > > > > > Lots of patches are sitting rotting in patchwork...
> > > > > > > > >
> > > > > > > > > +1, I've had the same questions.
> > > > > > > > > Neil
> > > > > > > >
> > > > > > > > +1, Some patch set might be ready for being merged.
> > > > > > >
> > > > > > > +1,  the earlier some patches are merged into mainline, and the
> easier
> > > those
> > > > > > > sequent patch sets can resolve their conflicts.
> > > > > >
> > > > > > +1, there are some patches which are properly reviewed
> > > > > >
> > > > > > Reminder: sub-tree to manage specific part of DPDK can be open on
> > > request
> > > > >
> > > > > Ok, I think what you're saying here is you're too busy to handle all 
> > > > > the
> > > patches
> > > > > comming in at the moment.  As such I'd like to propose a sub-tree
> > > encompassing
> > > > > all the pmds in DPDK.  I would envision that including all the acutal
> pmd's
> > > in
> > > > > the tree, as well as the infrastructure that is used to interface 
> > > > > them to
> the
> > > > > core (i.e. the ethdev/rte_ether library).  I'll gladly maintain the 
> > > > > patch
> pool
> > > > > and send you pull requests.
> > > >
> > > > The list of PMDs is increasing:
> > > >         librte_pmd_af_packet
> > > >         librte_pmd_bond
> > > >         librte_pmd_e1000
> > > >         librte_pmd_enic
> > > >         librte_pmd_i40e
> > > >         librte_pmd_ixgbe
> > > >         librte_pmd_pcap
> > > >         librte_pmd_ring
> > > >         librte_pmd_virtio
> > > >         librte_pmd_vmxnet3
> > > >         librte_pmd_xenvirt
> > > > There is already some sub-trees for bnx2x, fm10k and i40e:
> > > >         http://dpdk.org/browse/
> > > >
> > > Yes, and I've mentioned before that that is an absolutely silly way to
> break
> > > out
> > > subtrees.  You have to find a balance of workload distribution and
> developer
> > > convienience.
> > >
> > > I also note that these are problematic because you're not merging
> anything
> > > from them. Is it your intention to keep bnx2 and fm10k separate in
> > > perpituity?
> > > If so, thats a real problem, because then we effectively just have several
> out
> > > of tree drivers, and thats just unacceptible.
> > >
> > > > > If you could set me up with a login to dpdk.org, I'd appreciate it.
> > > >
> > > > It is preferred to have 1 sub-tree per module.
> > > > What do you think of managing contributions for af_packet and/or
> virtio?
> > > > It would make sense as virtio is a RedHat technology.
> > > > Maybe it could include vhost lib and example.
> > > >
> > > No, for reasons I've mentioned before.  If you take each pmd/library and
> > > create
> > > a subtree for it, you've created the most fine grained control of subtrees
> you
> > > could ask for, but you've created a nighmare of a burden on developers
> who
> > > want
> > > to update any code, especially if they have patches that hit multiple
> trees.
> > >
> > > Look at some of the stats in the dpdk tree:
> > >
> > > Library           Commits between 1.7.0 and 1.8.0
> > > librte_acl                5
> > > librte_cfgfile            0
> > > librte_cmdline            4
> > > librte_compat             0
> > > librte_distributor        5
> > > librte_eal                125
> > > librte_ether              31
> > > librte_hash               1
> > > librte_ip_frag            5
> > > librte_ivshmem            0
> > > librte_kni                2
> > > librte_kvargs             0
> > > librte_lpm                1
> > > librte_malloc             1
> > > librte_mbuf               39
> > > librte_mempool            4
> > > librte_meter              0
> > > librte_net                4
> > > librte_pipeline   0
> > > librte_pmd_af_packet      4
> > > librte_pmd_bond   20
> > > librte_pmd_e1000  21
> > > librte_pmd_enic   12
> > > librte_pmd_i40e   90
> > > librte_pmd_ixgbe  83
> > > librte_pmd_pcap   4
> > > librte_pmd_ring   0
> > > librte_pmd_virtio         21
> > > librte_pmd_vmxnet3        21
> > > librte_pmd_xenvirt        6
> > > librte_port               6
> > > librte_power              3
> > > librte_ring               2
> > > librte_sched              1
> > > librte_table              7
> > > librte_timer              0
> > > librte_vhost              30
> > >
> > > If you look at all of the pmds in the dpdk tree, we're talking about ~300
> > > patches per release.  If you look at the net-next tree for the linux 
> > > kernel,
> > > Dave Miller merged 569 patches on his own (based on the following
> > > command:
> > > git log --pretty=format:%H v3.17..v3.18 -- drivers/net/ethernet/
> net/core/ |
> > > wc
> > > -l)
> > >
> > > And that doesn't account for the ~500 patches that come in via pull
> request
> > > from
> > > the wireless subtree.  Nor does it account for the merge window for net-
> > > next
> > > being 2 months instead of dpdk's 6 months.  Theres no need in any way
> for
> > > 12
> > > maintainers to be twiddling their thumbs waiting on ~20 patches each,
> and
> > > for
> > > that split, you've forced developers to potentially develop patches
> against 12
> > > trees (12 being the current number of PMD's that are in the dpdk).
> > >
> > > The right answer here is balance.  Let me split out the pmd's and ethernet
> > > infrastructure libraries to a subtree.  I'll pull in patches posted 
> > > regarding
> > > pmd's and librte_ether/ip_frag etc, and send you a pull requests after
> each
> > > release so you get all the latest bits, and then pulls for stabilization 
> > > on
> each
> > > -rc. I can manage 300 patches without issue, and that takes a load off 
> > > your
> > > shoulders.  I'll get fm10k integrated, as well as bnx2.  That gives us a 
> > > single
> > > alternate tree for developers to go to for pmd and pmd infrastructure
> > > updates.
> > > Its a win-win.
> > >
> > > Regards
> > > Neil
> >
> > I agree with Thomas on this. The approach we've been taking for PMDs for
> newer Intel NICs is to have a separate sub-repository with a maintainer
> who's an expert in the area. This offloads some work from Thomas, ensures
> that the maintainer is very familiar with the PMD code and the corresponding
> hardware,
> As I mentioned to you in a private note, you're convoluting two roles into a
> single one here to the detriment of both:
> 
> 1) A tree maintainer is a machine, who is responsible for the mechanicm of
> SCM
> and tree maintenence.  They are responsible for merging patches, making
> sure
> that merged patches make it to their upstream tree, ensuring conflicts get
> resolved, and above all, making sure that SME's review patches before they
> get
> merged
> 
> 2) An SME (subject matter expert), is just that, someone who is intimately
> familiar with the hardware and software of a certain bit of a project.  Their
> only responsibility is to ensure that proposed changes are legitimate and
> safe.
> They review patches within their purview and provide acks to the tree
> maintainer.
> 
> When you merge these roles, you take the person most responsible for the
> stability of their code, and distract them with 1000 patch management
> operations, and from the work of developing new code for their area of
> expertise.
> 
> The separation of these roles has evolved in several communities because of
> exactly the above.  Linus knows alot about the kernel, but he's never opened
> the
> I40e datasheet, and he doesn't have to because he trusts Dave M to ensure
> that
> code in his pull requests is legitimate.  Dave in turn relies on someone at
> Intel to ack every I40e patch on the list before he merges it.  He does 
> exactly
> the same thing with every single other sub-component as listed in the
> MAINTAINERS file.  Thats how he is able to manage twice as many patches as
> the
> dpdk can in half the amount of time.  If its efficiency you're after, thats 
> the
> route to go.  Please don't ignore all that evolutionary refinement.
> 
> 
> >and doesn't involve too much additional work for the developers involved
> (as you said, there isn't a huge volume of commits for any individual PMD).
> Patch volume isn't where the additional effort comes in.  Its in the
> determination of what tree to develop against.
> 
> Lets take an example.  I'm doing work on the I40e card, so I have to look to
> see
> if theres a separate tree for it.  I figure out there is, so I have to clone
> that tree, and develop a patch against it.  So far so good.
> 
> A few days later I notice a bug in the pcap pmd, so I want to write a patch to
> fix it.  If we carry your model out to the point where each pmd has its own
> subtree, I have to go find that new tree, and clone it as well (or add a new
> remote to my existing tree), pull those changes, branch from the proper
> master,
> and continue my work).
> 
> If later I find a bug in the virtio pmd, I have to repeat the above process
> again, cloning a new tree, or adding a new remote for that drivers tree.
> 
> It doesn't sound like alot when you're just sitting in a room talking about 
> it,
> but for the developers actually working on the code, its a significant
> inconvienience, since it means that any pmd you want to touch requires you
> to go
> through this lookup process, ensuring your branching from the right master
> branch in the proper tree.
> 
> Conversely, if you put all pmds in a single subtree, it doesn't matter what
> pmd a person is working on, they only have to clone the pmd subtree, and
> they're
> good to go.
> 
> 
> > We have this in place for i40e now, and will be applying this to fm10k, 
> > which
> hasn't been upstreamed yet but will be in time for the 2.0 release. Based on
> our experiences with those, we may well look at extending the model to
> include PMDs for other Intel NICs, and possibly other libraries, as well.
> >
> You really haven't.  You have an i40e tree, but you have dozens of I40e
> patches
> on the list, and all of them thus far have been integrated into the main tree.
> Ditto with the bnx2x tree and others that have been separated.  Please
> remember
> subject matter expert != repo maintainer.  The roles can be, and should be,
> separeated.
> 
> > As you said, there's a balance to be struck, and too many subtrees may
> become unmanageable. With respect to your concern about developers
> having to potentially develop patches against multiple subtrees, this has
> never been raised as a concern by any of our development team. Is there
> any historical data on the number of changes that would fall into this
> category so we can see if it's a real problem or not?
> Look at the linux kernel if you want historical data on where the balance is.
> Major subsystems has become the natural breaking point for subtree
> maintenence.
> Every net driver goes through Millers net or net-next tree.  All scsi updates
> go
> through Bottomleys scsi tree.  FCoE changes used go through what used to
> be Rob
> Loves FCoE tree.  It just makes sense.  I'm sure there isn't recorded data to
> show anything more granular than that because no one ever seriously
> considered
> breaking out all 301 network drivers into their own subtree, because its
> obviously unmanageable at that scale. Even taken all as a single unit (as
> the kernel does for network drivers), a single maintainer can still handle the
> patch volume with the right subject matter experts doing timely reviews.
> 
> 
> The point of this very long message is that maintainers != repos.  Just
> because
> you're a subject matter expert in a given bit of hardware/pmd, in no way
> means
> you need to be responsible for patch merging/maintenence.  Doing so just
> makes
> everything harder.
> 
> Neil

Reply via email to