You could standardize on website-located data with something like:

project.apache.org/doap

using the current data format, maybe with an extra layer of subproject in
there somewhere.

Even if the project frequently updates the site (including that location,
for some reason), that doesn't mean you have to invalidate any cache; just
stick to something like a monthly crawl of the data.
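
As a rough sketch of what I mean (nothing here exists today: the
doap_url and needs_crawl names, the subproject path segment, and the
30-day interval are all assumptions for illustration), the crawler only
needs a fixed URL convention and an age check on its cached copy:

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: "monthly" is approximated as 30 days.
CRAWL_INTERVAL = timedelta(days=30)


def doap_url(project: str, subproject: Optional[str] = None) -> str:
    """Build the conventional DOAP location for a project.

    The /doap path follows the convention suggested above; the extra
    subproject path segment is only a guess at the "extra layer".
    """
    base = f"https://{project}.apache.org/doap"
    return f"{base}/{subproject}" if subproject else base


def needs_crawl(last_crawled: Optional[datetime], now: datetime) -> bool:
    """Return True if there is no cached copy or it is older than the interval.

    Site updates never invalidate the cache; only its age does.
    """
    return last_crawled is None or now - last_crawled >= CRAWL_INTERVAL
```

The point of the design is that the crawl schedule is independent of how
often a project republishes its site.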

However, getting projects to do this could be an interesting exercise!

Peter Hunsberger


On Fri, Sep 8, 2023 at 5:22 AM sebb <seb...@gmail.com> wrote:

> On Fri, 8 Sept 2023 at 10:41, Bertrand Delacretaz
> <bdelacre...@apache.org> wrote:
> >
> > Hi,
> >
> > On Fri, Sep 8, 2023 at 12:26 AM sebb <seb...@gmail.com> wrote:
> > > ...I think it would be worth considering setting up a central store
> > > for DOAPs...
> >
> > As we require our projects to have a website anyway, wouldn't it be
> > better to get that information from the project's homepage instead of
> > a separate file ?
> >
> > As you mention, I think it's only the fields that rarely change that
> > are actually useful: the project's description, a few useful URLs,
> > programming language, communications channel URLs, project category,
> > code repository and download URLs, that's probably all we need ?
> >
> > That info can be embedded in HTML, for example using <meta> elements,
> > Open Graph [1] maybe, with some ASF-specific extensions such as
> > og:asf:category ?
> >
> > This would put the information in a natural place for projects to
> > maintain it, and Open Graph metadata has other benefits.
> >
> > OTOH this means writing a conversion layer that, starting from a list
> > of *.apache.org subdomain names, grabs that data and converts it to a
> > format that's useful for projects.a.o.
> >
> > Another option would be to embed the current DOAP format using a
> > <script> element with a specific type, example at [2]. That simplifies
> > the conversion layer but it's less convenient to manage at the website
> > level.
>
> Sorry, but though it is neat, I don't think it is sensible to use websites.
>
> Any solution that embeds the data in a webpage means downloading lots
> of data to get at a few bits of information.
> Websites tend to change much more frequently than DOAP data, so
> caching would not help much here.
> Also it should not be necessary to re-publish the website to update DOAP
> data.
> Not all projects have websites that are easy to update.
>
> Also, projects are not the same as PMCs.
> Whilst PMCs have an associated website which can readily be found,
> that is not the case for sub-projects.
>
> The other reason I suggested centralising the data is that it means it
> can potentially be maintained by people outside the project.
> For example, correcting typos and repo moves. Some fixes don't need
> the attention of a person from the project.
>
> In my experience, most PMCs are reasonably quick to fix issues in
> DOAPs and websites, but there are some who just don't respond, even
> for trivial changes.
>
> > -Bertrand
> >
> > [1] https://ogp.me/
> > [2] https://gist.github.com/bdelacretaz/66b72523c15cc5d8cb28ae7eeac2b7c3
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
> > For additional commands, e-mail: dev-h...@community.apache.org
> >
>
