On Thu, Feb 11, 2016 at 2:38 PM, Stian Soiland-Reyes <st...@apache.org> wrote:
> How about something very modern - moving to JSON-LD schema.org annotations
> in the root index of the project homepage and just fetching all of those..?
>
> Seriously; keeping them under a single comdev control sounds most sensible
> as I doubt the distributed DOAP files are well maintained.  Projects can
> raise pull requests to update and then see their changes live on the new
> projects.apache.org pages

I agree: centralize first, and decentralize when the need shows itself.

As for format: let's prototype.  Seriously.

If Shane can provide some initial test data in any format (e.g. CSV) I
can convert that to YAML and you can convert it to JSON-LD, and Shane
can determine which would be easier for him to maintain.  I'll also take
the extra step and write a small script that converts it to JSON
(note: POJO, not LD), and write an ugly page that fetches and displays
that data.  Others can do likewise.
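
As a rough illustration of the conversion step described above, here is a
minimal Python sketch that turns CSV test data into plain JSON (not
JSON-LD).  The column names (name, homepage, registered) and the sample
rows are made up for the example; no schema has been agreed on in this
thread.

```python
import csv
import io
import json

# Hypothetical test data: column names and rows are placeholders only.
SAMPLE_CSV = """name,homepage,registered
Example One,https://example.org/one/,true
Example Two,https://example.org/two/,false
"""


def csv_to_records(text):
    """Parse CSV text into a list of plain dicts, with 'registered' as a bool."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append({
            "name": row["name"].strip(),
            "homepage": row["homepage"].strip(),
            "registered": row["registered"].strip().lower() == "true",
        })
    return records


def records_to_json(records):
    """Serialize the records as plain, human-diffable JSON."""
    return json.dumps(records, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(records_to_json(csv_to_records(SAMPLE_CSV)))
```

Sorting keys and indenting keeps the output stable between runs, which
should make changes easier to review if the file ends up under version
control.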

Shane should be able to use these programs as examples and extend them
as he sees fit.
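
One such example program might render list entries in the shape of the
template quoted further down.  This is only a sketch: the record fields
mirror the template's placeholders, the boolean "registered" flag is an
assumption, and the sample projects are invented.

```python
from html import escape

# Hypothetical records; field names mirror the placeholders in the quoted
# template ({$homepage}, {$projectname}, {$shortdesc}) plus a boolean flag.
PROJECTS = [
    {"homepage": "https://example.org/one/", "projectname": "Example One",
     "shortdesc": "A made-up project.", "registered": True},
    {"homepage": "https://example.org/two/", "projectname": "Example Two",
     "shortdesc": "Another made-up project.", "registered": False},
]


def render_entry(project):
    """Render one project: link, then (R) or (TM) entity, then description."""
    symbol = "&reg;" if project["registered"] else "&trade;"
    return '<a href="{0}">Apache <b>{1}</b></a>\n{2}\n<br/>\n  {3}'.format(
        escape(project["homepage"]),
        escape(project["projectname"]),
        symbol,
        escape(project["shortdesc"]),
    )


def render_list(projects):
    """Render the heading followed by one entry per project."""
    lines = ["<h2>The ASF claims these trademarks</h2>"]
    lines.extend(render_entry(p) for p in projects)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_list(PROJECTS))
```

The same function could be run from cron to produce a static page, or
ported to JS if the page is to be built client-side.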

- Sam Ruby

> On 11 Feb 2016 17:35, "sebb" <seb...@gmail.com> wrote:
>
>> On 11 February 2016 at 12:03, Shane Curcuru <a...@shanecurcuru.org> wrote:
>> > I need to annotate our structured data set of Apache projects to track
>> > which project names are registered trademarks.  This is needed to be
>> > able to properly generate a.o/foundation/marks/list (which is currently
>> > sadly outdated since it's manually built now).  This is a serious need
>> > for Brand Management, since we regularly have third parties say "but you
>> > didn't SAY it was your trademark, so I can do it anyway..."
>> >
>> > My thought is to annotate the PMC DOAP files with a registered marker,
>> > then use the existing projects.a.o building of the organized data.  Then
>> > use either JS or some cron static generation to display the actual
>> > marks/list page.
>>
>> There are two kinds of RDF files:
>> - the PMC RDF files [1] which are mainly stored in the comdev area
>> [2], though they can also be stored elsewhere.
>> The locations of the files are held in committees.xml [3]
>> [These are not actually DOAP files, though the format looks similar.]
>>
>> - the project DOAP files which are stored by individual projects; they
>> are listed in projects.xml [4]
>>
>> A single PMC RDF file can be associated with multiple DOAP files, e.g.
>> Commons, Creadur, Tomcat all have multiple independent project
>> releases.
>>
>> > Is annotating the project data sources the best idea, or should I simply
>> > create a new stable URL data source that's just a list of registered
>> > names, and join the tables?
>>
>> I doubt that either of the above file types is suitable.
>> The location of the index XML files [3], [4] has already been changed
>> once (when projects-new was established).
>>
>> DOAP files are located all over the place and are often moved within
>> the SCM without updating the index file.
>> If they are located in the source tree there are often multiple copies
>> in different branches.
>>
>> PMC RDF files may not be updateable except by the project (if located
>> in their SCM), and again may move without warning if they are not in
>> [2].
>>
>> It would potentially be possible to recover the PMC RDF files from
>> their external locations and insist that they only be stored in the
>> comdev area.
>> But a single PMC may have multiple marks. Potentially also a project
>> may move from a PMC to become its own PMC.
>>
>> Therefore I think a separate file is needed.
>> That would also allow write access to be limited if necessary.
>>
>> > The end result needs to be webcontent listing projects like:
>> >
>> > <h2>The ASF claims these trademarks</h2>
>> > ...list all active TLPs
>> > <a href="{$homepage}">Apache <b>{$projectname}</b></a>
>> > {$if registered then "&reg;" else "&trade;"}
>> >
>> > <br/>
>> >   {$shortdesc}
>> > ...
>> > <h2>The following projects are retired</h2>
>> > ...list all Attic projects
>> >
>> > <h2>The following projects are in incubation; all trademarks here may be
>> > property of respective owners</h2>
>> > ...list all Incubation projects
>> >
>> > Separately, we should list the name of each software *product* here,
>> > since if we offer something with a clear name as an independently
>> > downloadable software product, it can be our trademark.  So I'd like to
>> > list "Apache Directory Studio", since that's a notable name and a major
>> > product.  But I don't want to list "Apache Commons Foo Bar Baz and
>> > Kitchensink", since those are effectively just minor components that
>> > aren't really worth claiming.
>> >
>> > Comments/suggestions please?  I'm including the Whimsical project since
>> > they are also major consumers of this data.
>> >
>> > - Shane
>>
>> [1] https://projects.apache.org/pmc_rdf.html
>>
>> [2] https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/committees/
>>
>> [3] https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/committees.xml
>>
>> [4] https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/projects.xml
>>
