I'm trying to understand the twisty maze of data sources that fuel projects.apache.org and either I'm confused, or there's some inconsistency in how this all fits together.

I'll start with just one data source for now, so that I don't muddle multiple things together.

https://svn.apache.org/repos/asf/comdev/projects.apache.org/trunk/data/committees.xml

This file lists the RDF files that are supposed to be in the committees/ subdirectory. The file itself says:

   This list should agree with the files in the directory committees/

However, in addition to the entries that look like:

  <location>committees/any23.rdf</location>

there are also lines that look like:

  <location>http://flex.apache.org/pmc_Flex.rdf</location>

(4 of them, for whatever that's worth - flex, ofbiz, plc4x, and tez)

Is that correct? Or is that not how the data is supposed to be stored?

Meanwhile, committees.xml contains 209 location entries, not counting lines marked Retired:

grep location committees.xml | grep -vc Retired
209

while the committees/ directory contains just 206 rdf files:

ls committees/*.rdf | wc -l
206
(Note: one of those files is _template.rdf, so it's really 205 committee files. 205 local entries + 4 remote URLs = 209, so at least the counts match up.)
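
Matching counts don't prove the names match, so here is a rough sketch of a name-by-name check. It mirrors the grep above by working line by line and skipping any line that mentions "Retired", rather than assuming anything about the XML structure. The function names (read_locations, compare, report) are made up for illustration and are not part of the projects.apache.org tooling.

```python
import re
from pathlib import Path

# Matches one <location>...</location> entry on a line (assumed layout,
# based on the two example lines quoted above).
LOCATION_RE = re.compile(r"<location>\s*([^<\s]+)\s*</location>")


def read_locations(xml_text):
    """Return (local, remote) location lists, skipping 'Retired' lines."""
    local, remote = [], []
    for line in xml_text.splitlines():
        if "Retired" in line:
            continue  # same filter as `grep -v Retired`
        for loc in LOCATION_RE.findall(line):
            if loc.startswith(("http://", "https://")):
                remote.append(loc)
            else:
                local.append(loc)
    return local, remote


def compare(local, rdf_paths):
    """Return (listed but not on disk, on disk but not listed)."""
    listed = set(local)
    # _template.rdf is not a committee, so leave it out of the comparison.
    on_disk = {p for p in rdf_paths if not p.endswith("/_template.rdf")}
    return sorted(listed - on_disk), sorted(on_disk - listed)


def report(xml_path="committees.xml", rdf_dir="committees"):
    """Print a summary; run from the data/ directory of a checkout."""
    text = Path(xml_path).read_text(encoding="utf-8")
    local, remote = read_locations(text)
    on_disk = [p.as_posix() for p in Path(rdf_dir).glob("*.rdf")]
    missing, unlisted = compare(local, on_disk)
    print(f"local entries:  {len(local)}")
    print(f"remote entries: {len(remote)}")
    print(f"listed but missing on disk: {missing}")
    print(f"on disk but not listed:     {unlisted}")
```

Calling report() from the data/ directory prints the two entry counts along with any names that appear on only one side. If both lists come back empty, the local entries and the committees/ directory really do agree, leaving only the remote URLs as the open question.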



--
Rich Bowen - rbo...@rcbowen.com
http://rcbowen.com/
@rbowen
