On 11 February 2016 at 12:03, Shane Curcuru <a...@shanecurcuru.org> wrote:
> I need to annotate our structured data set of Apache projects to track
> which project names are registered trademarks.  This is needed to be
> able to properly generate a.o/foundation/marks/list (which is currently
> sadly outdated since it's manually built now).  This is a serious need
> for Brand Management, since we regularly have third parties say "but you
> didn't SAY it was your trademark, so I can do it anyway..."
>
> My thought is to annotate the PMC DOAP files with a registered marker,
> then use the existing projects.a.o building of the organized data.  Then
> use either JS or some cron static generation to display the actual
> marks/list page.

There are two kinds of RDF files:
- the PMC RDF files [1], which are mainly stored in the comdev area [2],
  though they can also be stored elsewhere.
  The locations of these files are held in committees.xml [3].
  [These are not actually DOAP files, though the format looks similar.]

- the project DOAP files, which are stored by individual projects; they
  are listed in projects.xml [4].

A single PMC RDF file can be associated with multiple DOAP files, e.g.
Commons, Creadur, Tomcat all have multiple independent project
releases.

> Is annotating the project data sources the best idea, or should I simply
> create a new stable URL data source that's just a list of registered
> names, and join the tables?

I doubt that either of the above file types is suitable.
The location of the index XML files [3], [4] has already been changed
once (when projects-new was established).

DOAP files are located all over the place and are often moved within
the SCM without updating the index file.
If they are located in the source tree there are often multiple copies
in different branches.

PMC RDF files may not be updateable except by the project (if located
in their SCM), and again may move without warning if they are not in
[2].

It would potentially be possible to recover the PMC RDF files from
their external locations and insist that they only be stored in the
comdev area.
But a single PMC may have multiple marks, and a project may also move
out of a PMC to become its own PMC, so marks do not map one-to-one
onto PMC RDF files.

Therefore I think a separate file is needed.
That would also allow write access to be limited if necessary.
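
As a rough illustration of what such a separate file could look like and
how it might be consumed, here is a minimal Python sketch. The JSON
layout, the field names ("pmc", "registered") and the project names are
all invented for the example; nothing here is an existing ASF data
source or an agreed format.

```python
import json

# Hypothetical contents of a standalone marks file, keyed by mark name.
# Names and registration flags are made up for illustration only.
MARKS_JSON = """
{
  "Apache Example": {"pmc": "example", "registered": false},
  "Apache Example Studio": {"pmc": "example", "registered": true},
  "Apache Sample": {"pmc": "sample", "registered": true}
}
"""


def load_marks(text):
    """Parse the marks file into a dict keyed by mark name."""
    return json.loads(text)


def marks_by_pmc(marks):
    """Group marks by owning PMC; one PMC may own several marks."""
    grouped = {}
    for name, info in sorted(marks.items()):
        grouped.setdefault(info["pmc"], []).append((name, info["registered"]))
    return grouped


if __name__ == "__main__":
    for pmc, entries in marks_by_pmc(load_marks(MARKS_JSON)).items():
        print(pmc, entries)
```

Keying the file by mark name rather than by PMC means that a product
moving to a different PMC only needs its "pmc" field changed.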

> The end result needs to be webcontent listing projects like:
>
> <h2>The ASF claims these trademarks</h2>
> ...list all active TLPs
> <a href="{$homepage}">Apache <b>{$projectname}</b></a>
> {$if registered then "&reg;" else "&trade;"}
>
> <br/>
>   {$shortdesc}
> ...
> <h2>The following projects are retired</h2>
> ...list all Attic projects
>
> <h2>The following projects are in incubation; all trademarks here may be
> property of respective owners</h2>
> ...list all Incubation projects
>
> Separately, we should list the name of each software *product* here,
> since if we offer something with a clear name as an independently
> downloadable software product, it can be our trademark.  So I'd like to
> list "Apache Directory Studio", since that's a notable name and a major
> product.  But I don't want to list "Apache Commons Foo Bar Baz and
> Kitchensink", since those are effectively just minor components that
> aren't really worth claiming.
>
> Comments/suggestions please?  I'm including the Whimsical project since
> they are also major consumers of this data.
>
> - Shane
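
For what it's worth, the per-project entry in the template quoted above
could be generated along these lines. This is only a sketch: the
function name and its parameters are hypothetical, and it assumes the
homepage, name and description have already been looked up from
whatever data source ends up being used.

```python
from html import escape


def render_mark(name, homepage, shortdesc, registered):
    """Render one project entry in the shape of the quoted template."""
    # Registered marks get the (R) entity, all others the (TM) entity.
    symbol = "&reg;" if registered else "&trade;"
    return (
        f'<a href="{escape(homepage, quote=True)}">'
        f"Apache <b>{escape(name)}</b></a>{symbol}<br/>\n"
        f"  {escape(shortdesc)}"
    )


if __name__ == "__main__":
    print(render_mark("Example", "https://example.apache.org/",
                      "A sample project", True))
```

Escaping the values keeps a stray "<" or "&" in a project description
from breaking the generated page.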

[1] https://projects.apache.org/pmc_rdf.html
[2] https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/committees/
[3] https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/committees.xml
[4] https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/projects.xml
