Re: parsing debian-devel-changes archives
On Tue, Jul 22, 2008 at 07:36:43PM +0200, Filippo Giunchedi wrote:
> The weekly update period is rather arbitrary and can be switched to daily effortlessly.

There's now a script subscribed to d-d-changes, so the updates should be live and pushed rather than pulled.

I welcome Lucas's idea of pushing other export formats after UDD.

filippo
-- 
Filippo Giunchedi - http://esaurito.net
PGP key: 0x6B79D401
random quote follows:
God may not play dice with the universe, but something strange is going on with the prime numbers.
-- Paul Erdos

-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Re: parsing debian-devel-changes archives
On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> Hi,
> I've produced a script[0] to parse d-d-changes archives to a sane format so as to produce a history of uploads made to debian[1] in this form:
>
> Source: netselect
> Version: 0.3.ds1-12.1
> Date: Wed, 09 Jul 2008 19:47:21 +0200
> Changed-By: Christian Perrier <[EMAIL PROTECTED]>
> Maintainer: Filippo Giunchedi <[EMAIL PROTECTED]>
> NMU: True
> Key: D4E5EDACC0143D2D
> Key-UID: Christian Perrier <[EMAIL PROTECTED]>

After a bit of discussion, it seems more natural and informative to have Signed-By: instead of Key-UID:, so I just changed that.

filippo
-- 
Filippo Giunchedi - http://esaurito.net
PGP key: 0x6B79D401
random quote follows:
It is easier to change the specification to fit the program than vice versa.
-- Alan Perlis
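For readers who want to consume the stanza format above, here is a minimal sketch of a reader. It is not Filippo's script: the function names are made up for illustration, and it assumes one "Field: value" pair per line with stanzas separated by blank lines (as in Debian control files), which the thread does not actually confirm. The email addresses are placeholders.

```python
def parse_stanza(text):
    """Parse one 'Field: value' stanza into a dict (field names kept as-is)."""
    record = {}
    for line in text.strip().splitlines():
        # Split on the first colon only: the Date value contains more colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not a 'Field: value' line
        record[name.strip()] = value.strip()
    return record


def parse_stanzas(text):
    """Split blank-line-separated stanzas (an assumption) into a list of dicts."""
    blocks = [block for block in text.split("\n\n") if block.strip()]
    return [parse_stanza(block) for block in blocks]


# Sample stanza modelled on the one in the message; addresses are placeholders.
SAMPLE = """\
Source: netselect
Version: 0.3.ds1-12.1
Date: Wed, 09 Jul 2008 19:47:21 +0200
Changed-By: Christian Perrier <cp@example.org>
Maintainer: Filippo Giunchedi <fg@example.org>
NMU: True
Key: D4E5EDACC0143D2D
Signed-By: Christian Perrier <cp@example.org>
"""

record = parse_stanza(SAMPLE)
print(record["Source"], record["Version"])
```

Values stay plain strings here; turning "NMU: True" into a boolean or the date into a datetime is left to the caller.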
Re: parsing debian-devel-changes archives
On 22/07/08 at 19:36 +0200, Filippo Giunchedi wrote:
> On Tue, Jul 22, 2008 at 09:29:49AM +0200, Stefano Zacchiroli wrote:
> > On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> > > I've produced a script[0] to parse d-d-changes archives to a sane format so as to produce a history of uploads made to debian[1] in this form:
> >
> > Very cool!
>
> Thanks :)
>
> > > comments/ideas welcome as usual,
> >
> > [...]
> > As the data encoded by d-d-changes is basically timed change notifications, I would say that the most appropriate format to represent it is a feed format, such as RSS or Atom. What do you think? Having such a format would enable cool stuff to be created quite easily (e.g. even using, say, Pipes), such as per-maintainer change feeds. Personally, it is something I would like to have linked from the DDPO.
> >
> > It shouldn't be too hard to convert (or maybe pair) your format to RSS, with an appropriate XML encoding of its content; shout if you want to discuss a potential DTD. What may be a bit harder is providing more lively updates: if one then wants to use RSS as such, weekly updates are too coarse grained ...
>
> Of course the idea of having more up-to-date data is appealing; I'd myself welcome RSS/Atom feeds per-package (almost the same as the PTS' upload news) or per-maintainer (either changed-by or upload key or whatever).
>
> How to proceed for the XML encoding? And what might be the most interesting?
>
> The weekly update period is rather arbitrary and can be switched to daily effortlessly.

I'd like to avoid the UDD GSoC project becoming the universal answer in QA-dom, but it seems that it would make sense to do:

[d-d-c import] -> [UDD DB] --> [raw XML export]
                           `-> [RSS/atom export]
-- 
| Lucas Nussbaum
| [EMAIL PROTECTED]   http://www.lucas-nussbaum.net/ |
| jabber: [EMAIL PROTECTED]             GPG: 1024D/023B3F4F |
Re: parsing debian-devel-changes archives
On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> I've produced a script[0] to parse d-d-changes archives to a sane format so as to produce a history of uploads made to debian[1] in this form:

Very cool!

> comments/ideas welcome as usual,

I think the greatest part of your work is that you crunched the mail archives, which weren't really accessible, and put them in a more accessible format. Still, the format you chose is not *that* accessible either (but hey, it is way better than before :)).

As the data encoded by d-d-changes is basically timed change notifications, I would say that the most appropriate format to represent it is a feed format, such as RSS or Atom. What do you think? Having such a format would enable cool stuff to be created quite easily (e.g. even using, say, Pipes), such as per-maintainer change feeds. Personally, it is something I would like to have linked from the DDPO.

It shouldn't be too hard to convert (or maybe pair) your format to RSS, with an appropriate XML encoding of its content; shout if you want to discuss a potential DTD. What may be a bit harder is providing more lively updates: if one then wants to use RSS as such, weekly updates are too coarse grained ...

Many thanks for the idea!

Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science \ PostDoc @ Univ. Paris 7
[EMAIL PROTECTED],pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
I'm still an SGML person,this newfangled /\ All one has to do is hit the
XML stuff is so ... simplistic -- Manoj  \/ right keys at the right time
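To make the feed idea concrete, here is a rough sketch of how one upload record could be turned into an RSS 2.0 item using only the Python standard library. Nothing here comes from the thread beyond the field names: the function names, the channel title, the example.org link and the choice of which field goes into which RSS element are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from email.utils import format_datetime, parsedate_to_datetime


def upload_to_item(record):
    """Map one upload record (a dict of stanza fields) to an RSS <item>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = f"{record['Source']} {record['Version']}"
    # The Date field is already RFC 2822; round-trip it to reject bad input.
    when = parsedate_to_datetime(record["Date"])
    ET.SubElement(item, "pubDate").text = format_datetime(when)
    ET.SubElement(item, "description").text = f"Changed-By: {record['Changed-By']}"
    # Hypothetical identifier: source and version joined with an underscore.
    guid = ET.SubElement(item, "guid", isPermaLink="false")
    guid.text = f"{record['Source']}_{record['Version']}"
    return item


def build_feed(records, title, link):
    """Wrap the items in a minimal RSS 2.0 channel and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for record in records:
        channel.append(upload_to_item(record))
    return ET.tostring(rss, encoding="unicode")


# Placeholder record; the address and the link below are not real.
upload = {
    "Source": "netselect",
    "Version": "0.3.ds1-12.1",
    "Date": "Wed, 09 Jul 2008 19:47:21 +0200",
    "Changed-By": "Christian Perrier <cp@example.org>",
}
feed = build_feed([upload], "Debian uploads", "http://example.org/uploads")
print(feed)
```

ElementTree escapes the angle brackets around the address on output, so no manual escaping is needed.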
Re: parsing debian-devel-changes archives
On Tue, Jul 22, 2008 at 09:29:49AM +0200, Stefano Zacchiroli wrote:
> On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> > I've produced a script[0] to parse d-d-changes archives to a sane format so as to produce a history of uploads made to debian[1] in this form:
>
> Very cool!

Thanks :)

> > comments/ideas welcome as usual,
>
> [...]
> As the data encoded by d-d-changes is basically timed change notifications, I would say that the most appropriate format to represent it is a feed format, such as RSS or Atom. What do you think? Having such a format would enable cool stuff to be created quite easily (e.g. even using, say, Pipes), such as per-maintainer change feeds. Personally, it is something I would like to have linked from the DDPO.
>
> It shouldn't be too hard to convert (or maybe pair) your format to RSS, with an appropriate XML encoding of its content; shout if you want to discuss a potential DTD. What may be a bit harder is providing more lively updates: if one then wants to use RSS as such, weekly updates are too coarse grained ...

Of course the idea of having more up-to-date data is appealing; I'd myself welcome RSS/Atom feeds per-package (almost the same as the PTS' upload news) or per-maintainer (either changed-by or upload key or whatever).

How to proceed for the XML encoding? And what might be the most interesting?

The weekly update period is rather arbitrary and can be switched to daily effortlessly.

filippo
-- 
Filippo Giunchedi - http://esaurito.net
PGP key: 0x6B79D401
random quote follows:
I never forget a face, but in your case I'll be glad to make an exception.
-- Groucho Marx
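The per-package and per-maintainer feeds mentioned above boil down to grouping the upload records by one field. A small sketch, assuming the records are already available as dicts keyed by the stanza field names; the helper name and the sample records are invented for illustration.

```python
from collections import defaultdict


def group_uploads(records, field="Changed-By"):
    """Group upload records by one field, e.g. 'Source', 'Changed-By' or 'Key'."""
    groups = defaultdict(list)
    for record in records:
        # Records lacking the field are collected under a catch-all key.
        groups[record.get(field, "unknown")].append(record)
    return dict(groups)


# Invented sample data; only the field names come from the thread.
uploads = [
    {"Source": "netselect", "Version": "0.3.ds1-12", "Changed-By": "A"},
    {"Source": "netselect", "Version": "0.3.ds1-12.1", "Changed-By": "B"},
    {"Source": "hello", "Version": "2.2-2", "Changed-By": "B"},
]

per_package = group_uploads(uploads, field="Source")
per_uploader = group_uploads(uploads, field="Changed-By")
print(sorted(per_package), sorted(per_uploader))
```

Each group can then be fed to whatever produces the feed, so the choice between changed-by and upload key is just a different `field` argument.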
Re: parsing debian-devel-changes archives
On Tue, Jul 22, 2008 at 07:36:43PM +0200, Filippo Giunchedi wrote:
> Of course the idea of having more up-to-date data is appealing; I'd myself welcome RSS/Atom feeds per-package (almost the same as the PTS' upload news) or per-maintainer (either changed-by or upload key or whatever).

Note that I was thinking of generating one big RSS feed (rotated as needed, assuming such a concept exists for RSS) of all d-d-changes. Having that, you can define filters on top of it which dynamically produce the other needed RSS feeds. But sure, it depends on who will need to serve the data, for efficiency reasons ...

> How to proceed for the XML encoding? And what might be the most interesting?

I would go for the good old mantra of encoding all the available information, i.e. simply translating the stanza you already generate to XML.

Given that RSS is often handy to look at directly from browsers, it is probably worth going for a microformat approach (http://microformats.org), i.e. just use XHTML as your XML language, and encode semantic information using CSS classes as needed. Quickly drafted example:

    <dl>
      <dt>source</dt> <dd class="source-package">netselect</dd>
      <dt>version</dt> <dd class="package-version">0.3.ds1-12.1</dd>
      <dt>date</dt> <dd class="date">Wed, 09 Jul 2008 19:47:21 +0200</dd>
      <!-- check what are the used conventions for date in other microformats ... -->
      <dt>changed by</dt> <dd class="changed-by">Christian Perrier &lt;[EMAIL PROTECTED]&gt;</dd>
      <!-- probably should be structured a bit more, to distinguish email from name ..., also avoiding annoying escapes -->
      <dt>maintainer</dt> <dd class="maintainer">Filippo Giunchedi &lt;[EMAIL PROTECTED]&gt;</dd>
      <!-- and so on, you got the idea :-) -->
    </dl>

This way you get rendering for free in browsers (maybe with just a tiny bit of CSS) and preserve semantic annotations for whoever might want to mix the data with something else that plays along with XML.

Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science \ PostDoc @ Univ. Paris 7
[EMAIL PROTECTED],pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
I'm still an SGML person,this newfangled /\ All one has to do is hit the
XML stuff is so ... simplistic -- Manoj  \/ right keys at the right time
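The microformat draft above could be generated from a parsed record roughly as follows. This is only a sketch: the CSS class names for the five fields are taken from Stefano's example, while the function name, the field-to-class table and the sample addresses are assumptions for illustration.

```python
from html import escape

# (stanza field, label shown in <dt>, CSS class on <dd>); classes follow the draft.
FIELDS = [
    ("Source", "source", "source-package"),
    ("Version", "version", "package-version"),
    ("Date", "date", "date"),
    ("Changed-By", "changed by", "changed-by"),
    ("Maintainer", "maintainer", "maintainer"),
]


def to_microformat(record):
    """Render one upload record as an XHTML <dl> with semantic CSS classes."""
    lines = ["<dl>"]
    for field, label, css_class in FIELDS:
        if field not in record:
            continue  # leave out fields the record does not carry
        # escape() turns the < > around email addresses into &lt; &gt;
        value = escape(record[field])
        lines.append(f'  <dt>{label}</dt> <dd class="{css_class}">{value}</dd>')
    lines.append("</dl>")
    return "\n".join(lines)


# Placeholder record; the addresses are not real.
upload = {
    "Source": "netselect",
    "Version": "0.3.ds1-12.1",
    "Date": "Wed, 09 Jul 2008 19:47:21 +0200",
    "Changed-By": "Christian Perrier <cp@example.org>",
    "Maintainer": "Filippo Giunchedi <fg@example.org>",
}
xhtml = to_microformat(upload)
print(xhtml)
```

The resulting fragment could be embedded as the description of a feed item, which is one way to get the browser rendering Stefano mentions.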