Re: DOAP format question

2015-05-11 Thread Hervé BOUTEMY
Le lundi 11 mai 2015 08:46:41 Sergio Fernández a écrit :
> Hi sebb,
> 
> On Tue, May 5, 2015 at 7:08 PM, sebb  wrote:
> > > What I can already say is that I do not understand what
> > > https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files
> > > aim to represent.
> > 
> > This is the default location for the PMC data [1] files which provide
> > data about the PMC.
> > A single such file may be referenced by multiple DOAPs.
> > E.g. all the Commons components refer to the same PMC data file.
> 
> I do understand how DOAP is being used, and I guess it has been wrong from
> the very beginning.
I fear that's true :/
let's analyze what is wrong and fix it :)

> 
> Taking commons-lang as example, they currently have:
> 
>   <Project rdf:about="http://commons.apache.org/lang/">
>     <asfext:pmc rdf:resource="http://commons.apache.org/"/>
>   </Project>
> 
> which does not really link (in RDF) to the file
> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/commons.rdf
> 
> RDF is a directed, labeled graph data model that uses URIs as names. Therefore
> the URI you put as the value of a property has a meaning: you should be able
> to fetch it directly, without relying on implicit rules about where files are
> located in svn. I guess applying that would mean a major restructuring of
> the current DOAP data, but that's something I can help all PMCs to do.
> 
> Besides that linking issue, I have come to the conclusion that asfext
> actually uses the same name for two things:
> 
> * asfext:pmc is the property that links a project with its PMC
> * asfext:PMC should be a class for referring to PMCs
> 
> And that is completely valid, but the tooling should know the difference and
> not just try to fix wrong data.
while working on it, I think I found one root cause: we're not clear about
TLPs (with PMCs) vs software (often called "projects" too, but which has
releases)

it seems
https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files
is about TLPs/PMCs
and "classical" DOAP files written by people are about the software that is
produced by TLPs/PMCs

We have an authoritative source of information for TLPs/PMCs: that is
committee-info.txt
I used it as the main source of data for
https://projects-new.apache.org/json/foundation/tlps.json
The only information in tlps.json that does not come from committee-info.txt
is:
- committers list (committee-info.txt only lists PMC members, LDAP gives
committers)
- description of the TLPs: I'm still looking for what could be the
authoritative information source

for more details, the parsing code is here:
http://svn.apache.org/viewvc/comdev/projects.apache.org/scripts/import/parsecommittees.py?view=markup

I think we could generate an authoritative DOAP URL for TLPs from
committee-info.txt
then give instructions to projects to update their software DOAP files to point
to these reference TLP DOAP files

this would:
- fix issues regarding standard DOAP/RDF semantics
- be easy to explain
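
A rough sketch of what generating such a reference TLP file could look like,
in Python. Everything specific here is an assumption for illustration only:
the URL layout under the doap/tlp/ location, the use of an asfext:PMC class
(as suggested earlier in this thread), the choice of asfext:name and
foaf:homepage as properties, and the idea that the id, name and homepage have
already been extracted from committee-info.txt.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ASFEXT = "http://projects.apache.org/ns/asfext#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical base URL: the file naming below it is an assumption.
TLP_BASE = "http://projects-new.apache.org/doap/tlp/"


def tlp_url(tlp_id: str) -> str:
    """Return the (assumed) fetchable URL of the reference file for a TLP."""
    return f"{TLP_BASE}{tlp_id}.rdf"


def tlp_rdf(tlp_id: str, name: str, homepage: str) -> str:
    """Build a minimal RDF/XML document describing one TLP."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("asfext", ASFEXT)
    ET.register_namespace("foaf", FOAF)

    root = ET.Element(f"{{{RDF}}}RDF")
    # The node is named by the very URL it is served from, so any RDF
    # tool can follow a link to it without extra conventions.
    node = ET.SubElement(root, f"{{{ASFEXT}}}PMC",
                         {f"{{{RDF}}}about": tlp_url(tlp_id)})
    ET.SubElement(node, f"{{{ASFEXT}}}name").text = name
    ET.SubElement(node, f"{{{FOAF}}}homepage",
                  {f"{{{RDF}}}resource": homepage})
    return ET.tostring(root, encoding="unicode")
```

A software DOAP file would then set its asfext:pmc rdf:resource to the value
returned by tlp_url("commons"), for example, instead of the TLP homepage.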

I can generate http://projects-new.apache.org/doap/tlp/ tonight as a POC for a
replacement of
https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files

note: between TLP and PMC, I prefer to talk about TLP, since the TLP has a PMC
+ a homepage, committers, and other attributes

WDYT?

Regards,

Hervé

> 
> Let me try to find some time to put some of these things in order to have a
> better overview...



Re: Chairs: A small addition to the Marvin email you received yesterday.

2015-05-11 Thread David Crossley
On Mon, May 11, 2015 at 11:51:44AM +0200, Daniel Gruno wrote:
> Hi David,
> the quarterly reminder from Marvin should have a link to the reporter 
> service in its messages to chairs now. Earlier it did not, which is why 
> I sent an extra email to the chairs that had a report coming up in 
> March, so that they too would be aware of the new site.
> 
> I hope this answers your question :)

It does. Sorry for the noise. I read Shazron's reply earlier in this
thread as being about the current month.

-David

> With regards,
> Daniel.
> 
> On 2015-05-09 01:23, David Crossley wrote:
> >Hi Daniel, i wonder about your list of people that were sent this.
> >I did not receive either the email below or the previous one that you 
> >refer to.
> >I did receive marvin's initial reminder on 27 April.
> >
> >-David
> >
> >>On Thu, Mar 5, 2015 at 6:31 AM, Daniel Gruno  wrote:
> >>>Hi Project chairs,
> >>>In yesterday's email to you about your upcoming board report, we forgot 
> >>>to
> >>>mention that we have a new tool that can help you in cobbling together a
> >>>report, or just view statistics of the PMCs you are on.
> >>>
> >>>The new service is located at: https://reporter.apache.org and is PMC
> >>>members only.
> >>>Should you choose to make use of the board report template in this 
> >>>system,
> >>>do remember to add in the important activity bits and any issues that
> >>>require board activity.
> >>>
> >>>Next time Marvin sends you an email, it will include the URL for the
> >>>reporter system.
> >>>
> >>>If you have ANY feedback about this system, don't hesitate to let us 
> >>>know!
> >>>:)
> >>>
> >>>On behalf of the Community Development Project,
> >>>Daniel.
> 


Re: DOAP format question

2015-05-11 Thread sebb
On 11 May 2015 at 13:21, Sergio Fernández  wrote:
> Hi,
>
> On Mon, May 11, 2015 at 1:13 PM, sebb  wrote:
>>
>> There is some special-case XSL code which does this conversion.
>> If the rdf:resource URL does not end in ".rdf", then the http://
>> and .apache.org/ header and trailer are stripped off, leaving
>> "commons"
>> This is then assumed to be the name of an RDF file in data_files/
>>
>
> Then such an "assumption" is a custom patch; it would need to be known by any
> other tool. Therefore no external tool is able to process such data.

Agreed, but I suspect that may not be the only assumption made
by the XSL scripts.
IMO it would be best to establish all the changes that need to be made
in one hit rather than to have to keep asking PMCs to fix another
aspect of their DOAPs.

>
>> > RDF is a directed, labeled graph data model that uses URIs as names.
>> > Therefore the URI you put as the value of a property has a meaning: you
>> > should be able to fetch it directly, without relying on implicit rules
>> > about where files are located in svn. I guess applying that would mean a
>> > major restructuring of the current DOAP data, but that's something I can
>> > help all PMCs to do.
>>
>> Changing it now would be very time-consuming and tedious.
>> However, if this were to be done, I suggest dropping the data_files
>> directory entirely and insisting that PMCs host their own PMC data
>> files. This would mean adding a new script to create a basic PMC file.
>>
>
> We can provide some infrastructure, either to validate or to create whatever
> is needed in the right way.
>
>
>> Another aspect of this is where the RDF files are stored.
>> These need to be under the control of the project, and it makes sense
>> to keep them under source control (SC).
>> However SC URLs tend to change, thus breaking the project build.
>> Therefore I suggest it would be best if the RDF files were stored
>> under the website URL - which is much less prone to change, and
>> redirects can deal with any required changes.
>> Website source is nowadays stored in SC anyway.
>>
>
> FMPOV the cleanest approach would be to serve all those files from each web
> site.

Yes, that was my point.

>
>> Another aspect is that DOAP files are not useful in source/binary releases.
>>
>
> We can work out a DOAP extension for such a feature.

I don't think that would be a good idea.
If the DOAP file is part of the source release, how can it have the
correct release date for the release it is part of?
Also, source trees are duplicated in tags and branches; it's confusing
to have multiple copies of a DOAP.

The point is that DOAPs are metadata about a project and its releases.
They are not data that is useful to the project code.

> I just need time ;-)
>
> Cheers,
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 6602747925
> e: sergio.fernan...@redlink.co
> w: http://redlink.co


Re: DOAP format question

2015-05-11 Thread Sergio Fernández
Hi,

On Mon, May 11, 2015 at 1:13 PM, sebb  wrote:
>
> There is some special-case XSL code which does this conversion.
> If the rdf:resource URL does not end in ".rdf", then the http://
> and .apache.org/ header and trailer are stripped off, leaving
> "commons"
> This is then assumed to be the name of an RDF file in data_files/
>

Then such an "assumption" is a custom patch; it would need to be known by any
other tool. Therefore no external tool is able to process such data.
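
To make the point concrete, here is a minimal Python sketch of what a generic
consumer sees when it reads the link as written. The sample snippet is an
assumption modelled on the commons-lang example discussed in this thread, and
pmc_links is a hypothetical helper, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"
ASFEXT = "http://projects.apache.org/ns/asfext#"

# Assumed shape of a software DOAP file, reduced to the PMC link.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns="{DOAP}" xmlns:asfext="{ASFEXT}">
  <Project rdf:about="http://commons.apache.org/lang/">
    <asfext:pmc rdf:resource="http://commons.apache.org/"/>
  </Project>
</rdf:RDF>"""


def pmc_links(doap_xml: str) -> list:
    """Return (project URI, PMC URI) pairs exactly as stated in the file."""
    root = ET.fromstring(doap_xml)
    links = []
    for project in root.iter(f"{{{DOAP}}}Project"):
        subject = project.get(f"{{{RDF}}}about")
        for pmc in project.findall(f"{{{ASFEXT}}}pmc"):
            links.append((subject, pmc.get(f"{{{RDF}}}resource")))
    return links
```

A tool that knows nothing about the data_files convention ends up with the
homepage URI http://commons.apache.org/ as the PMC, and has no way to discover
that commons.rdf in svn was meant.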


> > RDF is a directed, labeled graph data model that uses URIs as names.
> > Therefore the URI you put as the value of a property has a meaning: you
> > should be able to fetch it directly, without relying on implicit rules
> > about where files are located in svn. I guess applying that would mean a
> > major restructuring of the current DOAP data, but that's something I can
> > help all PMCs to do.
>
> Changing it now would be very time-consuming and tedious.
> However, if this were to be done, I suggest dropping the data_files
> directory entirely and insisting that PMCs host their own PMC data
> files. This would mean adding a new script to create a basic PMC file.
>

We can provide some infrastructure, either to validate or to create whatever
is needed in the right way.


> Another aspect of this is where the RDF files are stored.
> These need to be under the control of the project, and it makes sense
> to keep them under source control (SC).
> However SC URLs tend to change, thus breaking the project build.
> Therefore I suggest it would be best if the RDF files were stored
> under the website URL - which is much less prone to change, and
> redirects can deal with any required changes.
> Website source is nowadays stored in SC anyway.
>

FMPOV the cleanest approach would be to serve all those files from each web
site.


> Another aspect is that DOAP files are not useful in source/binary releases.
>

We can work out a DOAP extension for such a feature.

I just need time ;-)

Cheers,

-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co


Re: DOAP format question

2015-05-11 Thread sebb
On 11 May 2015 at 07:46, Sergio Fernández  wrote:
> Hi sebb,
>
> On Tue, May 5, 2015 at 7:08 PM, sebb  wrote:
>>
>> > What I can already say is that I do not understand what
>> >
>> > https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files
>> > aim to represent.
>>
>> This is the default location for the PMC data [1] files which provide
>> data about the PMC.
>> A single such file may be referenced by multiple DOAPs.
>> E.g. all the Commons components refer to the same PMC data file.
>
>
> I do understand how DOAP is being used, and I guess it has been wrong from
> the very beginning.
>
> Taking commons-lang as example, they currently have:
>
>   <Project rdf:about="http://commons.apache.org/lang/">
>     <asfext:pmc rdf:resource="http://commons.apache.org/"/>
>   </Project>
>
> which does not really link (in RDF) to the file
> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/commons.rdf

There is some special-case XSL code which does this conversion.
If the rdf:resource URL does not end in ".rdf", then the http://
and .apache.org/ header and trailer are stripped off, leaving
"commons"
This is then assumed to be the name of an RDF file in data_files/
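
The rule just described can be sketched as follows. This is a Python
illustration, not the actual XSL; the function name pmc_data_file is
hypothetical, and what the real code does with URLs that do not match the
expected pattern is not covered here.

```python
DATA_FILES = (
    "https://svn.apache.org/repos/asf/infrastructure/"
    "site-tools/trunk/projects/data_files/"
)


def pmc_data_file(resource: str) -> str:
    """Map an asfext:pmc rdf:resource value to the PMC data file it implies."""
    # A value ending in ".rdf" is taken as a real file URL and used as-is.
    if resource.endswith(".rdf"):
        return resource

    name = resource
    # Strip the "http://" header ...
    if name.startswith("http://"):
        name = name[len("http://"):]
    # ... and the ".apache.org/" trailer, leaving e.g. "commons".
    trailer = ".apache.org/"
    if name.endswith(trailer):
        name = name[:-len(trailer)]

    # The remainder is assumed to name an RDF file in data_files/.
    return DATA_FILES + name + ".rdf"
```

With this, pmc_data_file("http://commons.apache.org/") gives the commons.rdf
URL under data_files/ quoted above.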

I've no idea why the actual URL was not required - perhaps it was
thought that the data_files directory location might not be stable.
This was in the code before I got involved.

> RDF is a directed, labeled graph data model that uses URIs as names. Therefore
> the URI you put as the value of a property has a meaning: you should be able
> to fetch it directly, without relying on implicit rules about where files are
> located in svn. I guess applying that would mean a major restructuring of
> the current DOAP data, but that's something I can help all PMCs to do.

Changing it now would be very time-consuming and tedious.
However, if this were to be done, I suggest dropping the data_files
directory entirely and insisting that PMCs host their own PMC data
files.
This would mean adding a new script to create a basic PMC file.

Another aspect of this is where the RDF files are stored.
These need to be under the control of the project, and it makes sense
to keep them under source control (SC).
However SC URLs tend to change, thus breaking the project build.
Therefore I suggest it would be best if the RDF files were stored
under the website URL - which is much less prone to change, and
redirects can deal with any required changes.
Website source is nowadays stored in SC anyway.

Another aspect is that DOAP files are not useful in source/binary releases.

> Besides that linking issue, I have come to the conclusion that asfext
> actually uses the same name for two things:
>
> * asfext:pmc is the property that links a project with its PMC
> * asfext:PMC should be a class for referring to PMCs
>
> And that is completely valid, but the tooling should know the difference and
> not just try to fix wrong data.
>
> Let me try to find some time to put some of these things in order to have a
> better overview...
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 6602747925
> e: sergio.fernan...@redlink.co
> w: http://redlink.co


Re: Chairs: A small addition to the Marvin email you received yesterday.

2015-05-11 Thread Daniel Gruno

Hi David,
the quarterly reminder from Marvin should have a link to the reporter 
service in its messages to chairs now. Earlier it did not, which is why 
I sent an extra email to the chairs that had a report coming up in 
March, so that they too would be aware of the new site.


I hope this answers your question :)

With regards,
Daniel.

On 2015-05-09 01:23, David Crossley wrote:

Hi Daniel, i wonder about your list of people that were sent this.
I did not receive either the email below or the previous one that you refer to.
I did receive marvin's initial reminder on 27 April.

-David


On Thu, Mar 5, 2015 at 6:31 AM, Daniel Gruno  wrote:

Hi Project chairs,
In yesterday's email to you about your upcoming board report, we forgot to
mention that we have a new tool that can help you in cobbling together a
report, or just view statistics of the PMCs you are on.

The new service is located at: https://reporter.apache.org and is PMC
members only.
Should you choose to make use of the board report template in this system,
do remember to add in the important activity bits and any issues that
require board activity.

Next time Marvin sends you an email, it will include the URL for the
reporter system.

If you have ANY feedback about this system, don't hesitate to let us know!
:)

On behalf of the Community Development Project,
Daniel.