Re: Data inconsistency in projects.apache.org

2020-03-28 Thread Rich Bowen
On Fri, Mar 27, 2020, 16:16 sebb wrote: > > > So, while to me that seems like an obvious and enormous improvement, my > > understanding is that this was proposed before and someone (I understood > > it was you?) vetoed the change. So I'm a teensy bit confused. > > Not me. > I have always been in

Re: Data inconsistency in projects.apache.org

2020-03-28 Thread Hervé BOUTEMY
yes, I'm convinced some data can be extracted automatically, but I also know that some data can't, for example: - for committees with multiple projects, like https://projects.apache.org/projects.html?committee#commons - for projects still using svn - the definition of languages and categories and

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Hervé BOUTEMY
there are many more parts, see some examples of human-readable output: https://projects.apache.org/project.html?accumulo https://projects.apache.org/project.html?calcite On Friday, 27 March 2020, 21:44:56 CET, Dave Fisher wrote: > metadata for project releases is discoverable from the dist in

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Dave Fisher
metadata for project releases is discoverable from the dist in svn. It is already done for podlings in the Incubator in the clutch analysis. It is python. I can provide some help late next week. Sent from my iPhone > On Mar 27, 2020, at 1:20 PM, sebb wrote: > > On Fri, 27 Mar 2020 at 20:01,

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread sebb
On Fri, 27 Mar 2020 at 20:01, Hervé BOUTEMY wrote: > > On Friday, 27 March 2020, 20:29:14 CET, you wrote: > > On 3/27/20 3:07 PM, Hervé BOUTEMY wrote: > > > It's good to see some interest back on DOAP files content and organisation, > > > now that the projects.apache.org rendering makes them

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread sebb
On Fri, 27 Mar 2020 at 18:04, Rich Bowen wrote: > > > > On 3/27/20 1:13 PM, sebb wrote: > > On Fri, 27 Mar 2020 at 13:44, Rich Bowen wrote: > >> there are also lines that look like: > >> > >> http://flex.apache.org/pmc_Flex.rdf > >> > >> (4 of them, for whatever that's worth - flex, ofbiz,

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Rich Bowen
On 3/27/20 4:01 PM, Hervé BOUTEMY wrote: my point about "PMC RDF files" vs "projects DOAP files" is not a question of format, but a question of amount of data and who would have real knowledge to update content: - PMC RDF files are very light, rarely updated, and contain data that are

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Hervé BOUTEMY
On Friday, 27 March 2020, 20:11:33 CET, Rich Bowen wrote: > For context, I'm trying to address Sally's complaint that the data on > projects.a.o is inconsistent, out of date, and wonky. yes, I like this objective > I am very willing > to reach out to various projects about data updates (and am

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Hervé BOUTEMY
On Friday, 27 March 2020, 20:29:14 CET, you wrote: > On 3/27/20 3:07 PM, Hervé BOUTEMY wrote: > > It's good to see some interest back on DOAP files content and organisation, > > now that the projects.apache.org rendering makes them really useful: a > > few years ago, trying to open any

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Rich Bowen
On 3/27/20 3:07 PM, Hervé BOUTEMY wrote: It's good to see some interest back on DOAP files content and organisation, now that the projects.apache.org rendering makes them really useful: a few years ago, trying to open any discussion on that was doomed to failure. But any change is hard, since

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Rich Bowen
On 3/27/20 3:07 PM, Hervé BOUTEMY wrote: It's good to see some interest back on DOAP files content and organisation, now that the projects.apache.org rendering makes them really useful: a few years ago, trying to open any discussion on that was doomed to failure. But any change is hard, since

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Rich Bowen
On 3/27/20 2:45 PM, Hervé BOUTEMY wrote: please start by reading the human-oriented explanation: https://projects.apache.org/about.html this should ease the deep dive into data behind the recurring "Committees vs Projects" discussion Thanks. That is indeed where I started. I think where I

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Hervé BOUTEMY
On Friday, 27 March 2020, 19:04:28 CET, Rich Bowen wrote: > > If any changes are made, I strongly recommend centralising the data files. > > DOAP files maintained in project data areas often get moved, and the > > project forgets to update the entry in projects.xml > > Also, sometimes edits to

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Hervé BOUTEMY
please start by reading the human-oriented explanation: https://projects.apache.org/about.html this should ease the deep dive into data behind the recurring "Committees vs Projects" discussion Regards, Hervé On Friday, 27 March 2020, 14:43:52 CET, Rich Bowen wrote: > I'm trying to

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread Rich Bowen
On 3/27/20 1:13 PM, sebb wrote: On Fri, 27 Mar 2020 at 13:44, Rich Bowen wrote: there are also lines that look like: http://flex.apache.org/pmc_Flex.rdf (4 of them, for whatever that's worth - flex, ofbiz, plc4x, and tez) Is that correct? Or is that not how the data is supposed to be

Re: Data inconsistency in projects.apache.org

2020-03-27 Thread sebb
On Fri, 27 Mar 2020 at 13:44, Rich Bowen wrote: > > I'm trying to understand the twisty maze of data sources that fuel > projects.apache.org and either I'm confused, or there's some > inconsistency in how this all fits together. > > I'll start with just one data source for now, so that I don't

Data inconsistency in projects.apache.org

2020-03-27 Thread Rich Bowen
I'm trying to understand the twisty maze of data sources that fuel projects.apache.org and either I'm confused, or there's some inconsistency in how this all fits together. I'll start with just one data source for now, so that I don't muddle multiple things together.