On Saturday 16 May 2015 00:30:55, sebb wrote:
> On 15 May 2015 at 23:28, Hervé BOUTEMY <herve.bout...@free.fr> wrote:
> > On Friday 15 May 2015 15:34:47, sebb wrote:
> >> > I think we really have some data model problem here regarding what is a
> >> > "project's DOAP file": sometimes, a project is a PMC, sometimes a
> >> > project is a deliverable, more like what is called in
> >> > projects-new.a.o a "sub-project"
> >>
> >> That is not how I understand DOAPs.
> >>
> >> DOAP == Description Of A Project
> >>
> >> i.e. some releasable artifact.
> >>
> >> A single PMC may have multiple projects, each with its own releases
> >> and repositories.
> >> These are modelled quite well in the DOAPs that PMCs have created.
> >
> > +1
> >
> >> Information about the PMC which manages the projects is NOT stored in
> >> a DOAP, it is stored in a PMC data file.
> >> This is referenced from a DOAP using
> >>
> >> <asfext:pmc rdf:resource="URL"/>
> >>
> >> where URL is either an actual URL of a PMC data file or a dummy URL e.g.
> >>
> >> <asfext:pmc rdf:resource="http://<pmcname>.apache.org" />
> >>
> >> which leads to a file here:
> >>
> >> https://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/<pmcname>.rdf
> >
> > I'm not an RDF expert, but this Apache-specific algorithm to find the PMC
> > rdf file seems strange: I understand it is coded/known from the
> > projects.a.o xslt transformation
>
> Yes.
>
> > But this should be usable from any RDF tooling, no?
>
> It's not currently usable except by using special processing.
>
> The problem is that the shorthand URL is used by all but about 4 of
> the PMCs, so it would be a major challenge to get this fixed.
>
> Some PMCs are quick to fix such issues; some may take weeks or months
> to fix even a simple error.

I think people don't understand this PMC data rdf file (I didn't until our current discussion).
But with good explanations and the visualization help given by projects-new.a.o, we can move much faster: I'm ready to try once we're clear :)
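
To make the "special processing" concrete, here is a minimal sketch of the shorthand lookup as sebb describes it in the quoted text. It is not the actual projects.a.o XSLT logic: the function name `resolve_pmc_data_url`, the regular expression, and the rule "anything ending in .rdf is already a real URL" are my assumptions for illustration only.

```python
import re

# Location of the PMC data files, as quoted earlier in this thread.
DATA_FILES_BASE = (
    "https://svn.apache.org/repos/asf/infrastructure/site-tools/"
    "trunk/projects/data_files/"
)

# Hypothetical pattern for the shorthand form http://<pmcname>.apache.org
_SHORTHAND = re.compile(r"^https?://([a-z0-9-]+)\.apache\.org/?$", re.IGNORECASE)


def resolve_pmc_data_url(resource: str) -> str:
    """Map an asfext:pmc rdf:resource value to a PMC data file URL.

    Assumption: a value that already names an .rdf file is a real URL and
    is returned unchanged; a bare http://<pmcname>.apache.org is shorthand.
    """
    if resource.lower().endswith(".rdf"):
        return resource
    match = _SHORTHAND.match(resource)
    if match is None:
        raise ValueError(f"not a recognised PMC reference: {resource!r}")
    return f"{DATA_FILES_BASE}{match.group(1).lower()}.rdf"


if __name__ == "__main__":
    print(resolve_pmc_data_url("http://ant.apache.org"))
```

Generic RDF tooling would simply dereference the rdf:resource value as given, which is why the shorthand only works with a rewrite step like this one in front of it.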

>
> > Another problem I see with these PMC data rdf files is that they seem to
> > not be really maintained: I doubt PMCs update PMC data rdf files on each
> > PMC Chair change.
>
> Yes.
>
> > That's why I had the idea of generating/updating the chair when
> > parsing committee-info.txt.
>
> Fair enough, but that does not mean the code needs to create yet
> another RDF file.

+1
my intent was not to create a new one, but to replace it with generated info

> > But other information manually written in current PMC data rdf files can't
> > be found anywhere else, AFAIK.
>
> Yes.

that's where it hurts: we need to mix handwritten with generated content... we need to be clear on the process

> > Last problem: I personally really didn't understand this PMC data rdf
> > file until now. I don't know who understands it :)
> > IMHO, the magic algorithm to find the rdf file is a root cause.
>
> The PMC data file is documented here:
>
> http://projects.apache.org/docs/pmc.html

yeah, I read it several times before, I knew I was not confident about what I read, and now I know I completely misread it until now.

> >> > if you look at https://projects-new.apache.org/projects.html?pmc,
> >> > typical cases for that are:
> >> > - Incubator: there is "the Incubator project", displayed without DOAP
> >> > file since the incubator has special source info, and many sub-projects
> >> > which provide DOAP files
> >> > - Commons: there is no "Commons' DOAP file", then no TLP... one
> >> > sub-project is quasi randomly chosen... Commons' DOAP file, if it
> >> > existed, would not release anything, it's a pure "organizational"
> >> > project
> >>
> >> There is an ambiguity here: project can mean an organisational entity
> >> and project can mean a releasable artifact.
> >>
> >> There are different RDF files for the two meanings; only the artifact
> >> has an associated DOAP.
> >>
> >> > - Ant: there is an Ant DOAP file that represents the TLP and the main
> >> > released artifact
> >>
> >> No, it only links to the TLP = PMC data file, it does not represent the
> >> TLP. The Ant DOAP file only represents the Ant product.
> >
> > ok, IIUC, I should rephrase
> > https://projects-new.apache.org/project.html?ant :
> > 1. "Top Level Project data:" to "Apache Committee data:"
> > 2. "Project established:" to "Committee established:"
>
> That does not seem necessary.
>
> > 3. "Sub-projects (8):" to "Projects (8):", possibly bolding the TLP if
> > one is the TLP
>
> No - none of the projects are the TLP.

as said in the other thread, this assertion is confusing: "none of the projects are the Top Level Project"

> The TLP / PMC is not the same as any of its projects.
>
> Most PMCs happen to have the same name as one of their projects, but
> they are distinct entities.
>
> To take the Ant example, there needs to be an Ant PMC/TLP page and a
> separate Ant project page.
> These should be linked somehow.
>
> > and I should rename tlps.json to committees.json (and update code
> > accordingly)
>
> No need.

given this problem with "a TLP is not a project", I think using committee or PMC would avoid confusion

> > then on https://projects-new.apache.org/ , do we really want to graph TLPs
> > evolution or committees?
>
> No idea

ok, for a later discussion :)

> > I suppose commons can be called a TLP, even if it does not have any "main"
> > project that is the effective TLP
>
> Yes, Commons is a TLP/PMC.
>
> I don't think it's helpful to think of PMCs having a "main" project.
>
> PMCs have one or more projects; each project has a single PMC.
>
> > comdev is not really a TLP: should probably not be listed in projects
> > list, but as "special committee not producing projects"?
>
> Well, it is responsible for this mailing list and is probably
> responsible for the projects.a.o website.
>
> > is Labs a TLP? or like comdev?
>
> What does committee-info.txt say?

these are normal committees, but from a software perspective they're not expected to produce any project AFAIK; that's why I think they are special compared to the other 161 PMCs that are meant to produce projects

> > I suppose we can hard-code the list of committees that are not expected to
> > have projects, the list should not change often: Labs and comdev seem to
> > be the only 2 (that extend special committees from 5 to 7)
> >
> > and finally, in https://projects-new.apache.org/
> > change "163 top level software projects
> > 107 sub-projects" to "270 projects managed by 163 committees" (or 161 if
> > labs and comdev are special committees)
> >
> >
> > this seems to make sense
> > if no objection, I'll code it
> >
> > Regards,
> >
> > Hervé
> >
> >> > I chose Commons, but it could have been HttpComponents or Logging
> >> > Services, or Lucene (Lucene have been very clear that there is a
> >> > "Lucene core" sub-project), Web Services, Axis, Xalan, Xerces,
> >> > XML Graphics, Attic, Creadur, DB, jUDDI, Tcl
> >> >
> >> > I chose Ant, but it could have been Velocity, MINA, Directory, HTTP
> >> > Server, MyFaces, Tomcat
> >> >
> >> >> - (future) UI additions for *other* places. It would be awesome, for
> >> >> example, to provide a tiny scriptlet that any project could inject in
> >> >> their website that displays a "see also" menu. That would link to a
> >> >> specific URL on projects.a.o that would say "hey, you came from
> >> >> Cassandra, here are: -other big data projects, -other projects in
> >> >> Java, -other projects with the same committers... etc." as a service.
> >> >>
> >> >> - Shane
> >> >
> >> > I'll continue tonight on this
> >> > Any help appreciated
> >> >
> >> > Regards,
> >> >
> >> > Hervé
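
PS: a rough sketch of the data model discussed in this thread, where each committee (PMC) manages one or more projects, each project has exactly one committee, and a hard-coded list names the committees that are not expected to produce projects. The class name `Committee`, the set `NO_PROJECT_COMMITTEES`, the `summary` function and the sample data are all hypothetical; this is not the projects-new.a.o code.

```python
from dataclasses import dataclass, field


@dataclass
class Committee:
    """A PMC: an organisational entity, described by a PMC data file."""
    name: str
    # Names of the projects (releasable artifacts), each described by a DOAP.
    projects: list[str] = field(default_factory=list)


# Hard-coded per the proposal in this thread; the ids are illustrative.
NO_PROJECT_COMMITTEES = {"labs", "comdev"}


def summary(committees: list[Committee]) -> str:
    """Build a front-page line: "<n> projects managed by <m> committees"."""
    producing = [c for c in committees if c.name not in NO_PROJECT_COMMITTEES]
    n_projects = sum(len(c.projects) for c in producing)
    return f"{n_projects} projects managed by {len(producing)} committees"


if __name__ == "__main__":
    # Toy data only, not the real project list.
    sample = [
        Committee("ant", ["ant", "ivy"]),
        Committee("commons", ["commons-io", "commons-lang"]),
        Committee("comdev"),
    ]
    print(summary(sample))
```

Keeping the project list on the committee, rather than marking one project as "main", matches sebb's point that a PMC and its projects are distinct entities even when they share a name.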