I've been helping package a load of stuff recently for Robot OS, and in
checking the copyright files I've come up against the question of exactly
how much segmentation there should be in copyright files. The answer
to that depends on what they are actually for.

Is it sufficient to specify what licence things are under, or do we
really want to split it up into every licence x copyright-holder, or
even every licence x copyright statement (i.e. date + holder)?

Clearly we need to know what licence things are under, and that seems
to me to be the main purpose of the file. One can imagine
circumstances when some argument develops and we might need to care
about exactly _who_ owns the copyright on each file, but under normal
circumstances that simply doesn't matter. We just care if it was BSD
or GPL or Apache or whatever, not who actually contributed it under
those terms: that's part of the point of free-software licensing. It's
easy enough to go look at exactly which file is copyright who if need
be.

So, for an example of why this matters, look at ompl
(https://tracker.debian.org/pkg/ompl).

The package is largely BSD-3-Clause, with a couple of files that are
Apache-2.0 and Expat.

However, there are numerous copyright holders and files contributed on
various dates, so I spent several hours making this copyright file:
https://sources.debian.net/src/ompl/1.0.0%2Bds2-1/debian/copyright/
with each copyright owner split out into a separate stanza.

Is there any real benefit in doing this? It's moderately accurate, but
what is the practical benefit over lumping all the BSD-3-clause
copyright holders together into one list?

If the answer is 'it's more accurate' then shouldn't we be requiring a
stanza for every different copyright statement (which in this package
would split the Rice University and Willow Garage sections into
another 10-15 stanzas with different dates/date-combinations)? This
plethora of stanzas would also make the file very hard to read, which
is why I didn't go that far. But then I thought about it and started
to wonder what exactly the 'splitting' criteria are.

The criterion that makes most sense to me is 'by licence'.

I just uploaded rosdistro
(https://tracker.debian.org/pkg/ros-rosdistro) and got a comment from
the reviewing ftpmaster that combining the two different copyright holders for 
BSD-3-clause files into one stanza was not really right:
https://sources.debian.net/src/ros-rosdistro/0.4.2-1/debian/copyright/

Files: *
Copyright: 2012 Willow Garage, Inc.
           2013-2014 Open Source Robotics Foundation
License: BSD-3-clause

This interpretation says that the copyright line is not a list of
copyright owners having stuff under this licence in the package, but a
statement of the copyright on all of those files. However if you
follow that logic then shouldn't we be having separate stanzas for
each statement:

Files: foo
Copyright: 2012 Willow Garage, Inc.
License: BSD-3-clause

Files: bar
Copyright: 2008 Willow Garage, Inc.
License: BSD-3-clause

Files: bar
Copyright: 2010-2012 Willow Garage, Inc.
License: BSD-3-clause

and so on? Because if not, then we are concatenating statements into stanzas:
Files: bar
Copyright: 2008,2010-2012 Willow Garage, Inc.
License: BSD-3-clause

And if that's OK then why not concatenate owners too, to get:
Files: *
Copyright: 2012 Willow Garage, Inc.
           2013-2014 Open Source Robotics Foundation
License: BSD-3-clause
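
To make the three possible splitting criteria concrete, here is a rough
Python sketch. It is only an illustration, not an existing tool: the file
names and per-file data are made up (foo and bar echo the examples above,
baz and qux are hypothetical), and the helper names are my own. It groups
the same per-file records by licence, by licence x holder, and by
licence x statement, and shows how many stanzas each criterion produces.

```python
from collections import defaultdict

# Hypothetical per-file records: (path, years, holder, licence).
FILES = [
    ("foo", "2012", "Willow Garage, Inc.", "BSD-3-clause"),
    ("bar", "2008", "Willow Garage, Inc.", "BSD-3-clause"),
    ("baz", "2010-2012", "Willow Garage, Inc.", "BSD-3-clause"),
    ("qux", "2013-2014", "Open Source Robotics Foundation", "BSD-3-clause"),
]


# Each splitting criterion maps a file record to a stanza key.
def by_licence(record):
    return (record[3],)


def by_holder(record):
    return (record[3], record[2])


def by_statement(record):
    return (record[3], record[2], record[1])


def stanzas(files, key):
    """Group file records into stanzas, one stanza per distinct key."""
    groups = defaultdict(list)
    for record in files:
        groups[key(record)].append(record)
    return groups


def render(files, key):
    """Render the grouped records as copyright-file stanzas."""
    out = []
    for group in stanzas(files, key).values():
        paths = " ".join(path for path, _, _, _ in group)
        # Keep each distinct "years holder" statement once, in first-seen order.
        statements = list(
            dict.fromkeys(f"{years} {holder}" for _, years, holder, _ in group)
        )
        lines = [f"Files: {paths}", f"Copyright: {statements[0]}"]
        lines += [f"           {s}" for s in statements[1:]]
        lines.append(f"License: {group[0][3]}")
        out.append("\n".join(lines))
    return "\n\n".join(out)


for name, key in [
    ("licence", by_licence),
    ("licence x holder", by_holder),
    ("licence x statement", by_statement),
]:
    print(f"{name}: {len(stanzas(FILES, key))} stanza(s)")
```

Calling render(FILES, by_licence) gives one combined stanza of the kind
shown above, while render(FILES, by_statement) gives one stanza per
distinct statement. The only thing that changes between them is the key
function, which is really the question I am asking.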

Having lots of little stanzas is more accurate, but gives a much less
clear overview to the casual inspector. Which comes back to the
question of what exactly this file is for (people or computers)?

So, after thinking about this for a while I decided that it was not
clear to me what best/acceptable practice is, and that it would be
best to ask here.

I can't see much real usefulness in splitting beyond licence-type,
although where separate contributions are clear (e.g. debian/*) then
having a stanza for that is generally informative. But in a package
like ompl with loads of mixed-up files from various people and
institutions over several years, the separation into
copyright-holders doesn't tell you much and is very laborious to
produce. Do we always want this done? What logic is there for that
which doesn't also imply splitting by date too?

I'm happy to do whatever we agree is right, but at the moment this
feels like pointless makework, so I'd like to understand what the
ftpmasters/policy actually requires, and why, and what other people do
about this.

I hope I have adequately explained the issue, and await guidance.

Wookey
-- 
Principal hats:  Linaro, Debian, Wookware, ARM
http://wookware.org/
