Re: parsing debian-devel-changes archives

2008-07-27 Thread Filippo Giunchedi
On Tue, Jul 22, 2008 at 07:36:43PM +0200, Filippo Giunchedi wrote:
> The weekly update period is rather arbitrary; it can be switched to daily
> effortlessly.

There's now a script subscribed to d-d-changes so the updates should be live and
pushed rather than pulled.
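
For the curious, the push side could be sketched roughly as below. This is not
the actual subscribed script, only an illustration under assumptions: each
list message is handed to the script as raw text (in practice it would likely
come from sys.stdin.read()), the mail is not multipart, and the changes fields
appear as plain "Field: value" lines in the body. The helper name and the
sample mail are made up.

```python
import email

# Fields we want to pull out of the changes file embedded in the mail body.
WANTED = ("Source", "Version", "Date", "Changed-By", "Maintainer")


def fields_from_message(raw):
    """Extract selected changes-file fields from one raw mail (hypothetical helper)."""
    msg = email.message_from_string(raw)
    body = msg.get_payload()  # a str for a non-multipart message (assumed here)
    found = {}
    for line in body.splitlines():
        # Split on the first colon only, since the Date value contains colons.
        name, sep, value = line.partition(":")
        if sep and name in WANTED and name not in found:
            found[name] = value.strip()
    return found


# Invented sample message; a real deployment would read it from stdin.
raw_mail = """\
From: uploader@example.org
Subject: sample upload notification

Format: 1.8
Date: Wed, 09 Jul 2008 19:47:21 +0200
Source: netselect
Version: 0.3.ds1-12.1
"""

record = fields_from_message(raw_mail)
print(record)
```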

I welcome Lucas's idea of generating the other export formats downstream of UDD.

filippo
--
Filippo Giunchedi - http://esaurito.net
PGP key: 0x6B79D401
random quote follows:

God may not play dice with the universe, but something strange is going on with
the prime numbers.
-- Paul Erdos


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: parsing debian-devel-changes archives

2008-07-25 Thread Filippo Giunchedi
On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> Hi,
> I've produced a script[0] to parse d-d-changes archives into a sane format,
> so as to produce a history of uploads made to Debian[1] in this form:
>
> Source: netselect
> Version: 0.3.ds1-12.1
> Date: Wed, 09 Jul 2008 19:47:21 +0200
> Changed-By: Christian Perrier [EMAIL PROTECTED]
> Maintainer: Filippo Giunchedi [EMAIL PROTECTED]
> NMU: True
> Key: D4E5EDACC0143D2D
> Key-UID: Christian Perrier [EMAIL PROTECTED]

After a bit of discussion, it seems more natural and informative to have
Signed-By: instead of Key-UID:, so I have just changed that.

filippo
--
Filippo Giunchedi - http://esaurito.net
PGP key: 0x6B79D401
random quote follows:

It is easier to change the specification to fit the program than vice versa.
-- Alan Perlis





Re: parsing debian-devel-changes archives

2008-07-23 Thread Lucas Nussbaum
On 22/07/08 at 19:36 +0200, Filippo Giunchedi wrote:
> On Tue, Jul 22, 2008 at 09:29:49AM +0200, Stefano Zacchiroli wrote:
> > On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> > > I've produced a script[0] to parse d-d-changes archives into a sane
> > > format, so as to produce a history of uploads made to Debian[1] in
> > > this form:
> >
> > Very cool!
>
> Thanks :)
>
> > > comments/ideas welcome as usual,
> >
> [...]
> > As the data encoded by d-d-changes is basically timed change
> > notifications, I would say that the most appropriate format to represent
> > it is a feed format, such as RSS or Atom. What do you think? Having such
> > a format would enable cool stuff to be created quite easily (e.g. even
> > using, say, Pipes), such as per-maintainer change feeds. Personally, it
> > is something I would like to have linked from the DDPO.
> >
> > It shouldn't be too hard to convert (or maybe pair) your format to RSS,
> > with an appropriate XML encoding of its content; shout if you want to
> > discuss a potential DTD.  Providing more lively updates may be a bit
> > harder: if one then wants to use RSS as such, weekly updates are too
> > coarse-grained ...
>
> Of course the idea of having more up-to-date data is appealing; I'd myself
> welcome RSS/Atom feeds per-package (almost the same as PTS' upload news) or
> per-maintainer (either changed-by or upload key or whatever). How should we
> proceed with the XML encoding? And what might be the most interesting?
>
> The weekly update period is rather arbitrary; it can be switched to daily
> effortlessly.

I'd like to avoid the UDD GSoC project becoming the universal answer
in QA-dom, but it seems that it would make sense to do:
[d-d-c import] -> [UDD DB] --> [raw XML export]
                           `-> [RSS/atom export]
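
To make the shape of that pipeline concrete, here is a rough sketch. It is
only an illustration under assumptions: an in-memory SQLite table stands in
for the UDD database, and the table layout, function names and element names
are all hypothetical, not UDD's actual schema or export format.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Stand-in for the UDD database: an in-memory SQLite table (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE uploads (source TEXT, version TEXT, date TEXT)")


def import_upload(record):
    """[d-d-c import] step: store one parsed upload record."""
    db.execute(
        "INSERT INTO uploads VALUES (?, ?, ?)",
        (record["Source"], record["Version"], record["Date"]),
    )


def export_xml():
    """[raw XML export] step: dump every row as an <upload> element."""
    root = ET.Element("uploads")
    rows = db.execute("SELECT source, version, date FROM uploads")
    for source, version, date in rows:
        ET.SubElement(root, "upload", source=source, version=version, date=date)
    return ET.tostring(root, encoding="unicode")


def export_rss():
    """[RSS/atom export] step: one RSS <item> per row."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "uploads"
    rows = db.execute("SELECT source, version, date FROM uploads")
    for source, version, date in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"{source} {version}"
        ET.SubElement(item, "pubDate").text = date
    return ET.tostring(rss, encoding="unicode")


import_upload({"Source": "netselect", "Version": "0.3.ds1-12.1",
               "Date": "Wed, 09 Jul 2008 19:47:21 +0200"})
xml_dump = export_xml()
rss_dump = export_rss()
print(xml_dump)
print(rss_dump)
```

Both exports read from the same table, so adding another output format would
not touch the import side.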
-- 
| Lucas Nussbaum
| [EMAIL PROTECTED]   http://www.lucas-nussbaum.net/ |
| jabber: [EMAIL PROTECTED] GPG: 1024D/023B3F4F |





Re: parsing debian-devel-changes archives

2008-07-22 Thread Stefano Zacchiroli
On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> I've produced a script[0] to parse d-d-changes archives into a sane format,
> so as to produce a history of uploads made to Debian[1] in this form:

Very cool!

> comments/ideas welcome as usual,

I think the greatest part of your work is that you crunched the mail
archives, which weren't really accessible, and put them in a format
which is more accessible.  Still, the format you chose is not *that*
accessible either (but hey, it is way better than before :)).

As the data encoded by d-d-changes is basically timed change
notifications, I would say that the most appropriate format to represent
it is a feed format, such as RSS or Atom. What do you think? Having such
a format would enable cool stuff to be created quite easily (e.g. even
using, say, Pipes), such as per-maintainer change feeds. Personally, it
is something I would like to have linked from the DDPO.

It shouldn't be too hard to convert (or maybe pair) your format to RSS,
with an appropriate XML encoding of its content; shout if you want to
discuss a potential DTD.  Providing more lively updates may be a bit
harder: if one then wants to use RSS as such, weekly updates are too
coarse-grained ...

Many thanks for the idea!
Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science \ PostDoc @ Univ. Paris 7
[EMAIL PROTECTED],pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
I'm still an SGML person,this newfangled /\ All one has to do is hit the
XML stuff is so ... simplistic  -- Manoj \/ right keys at the right time




Re: parsing debian-devel-changes archives

2008-07-22 Thread Filippo Giunchedi
On Tue, Jul 22, 2008 at 09:29:49AM +0200, Stefano Zacchiroli wrote:
> On Tue, Jul 22, 2008 at 12:45:34AM +0200, Filippo Giunchedi wrote:
> > I've produced a script[0] to parse d-d-changes archives into a sane
> > format, so as to produce a history of uploads made to Debian[1] in
> > this form:
>
> Very cool!

Thanks :)

>
> > comments/ideas welcome as usual,
>
[...]
> As the data encoded by d-d-changes is basically timed change
> notifications, I would say that the most appropriate format to represent
> it is a feed format, such as RSS or Atom. What do you think? Having such
> a format would enable cool stuff to be created quite easily (e.g. even
> using, say, Pipes), such as per-maintainer change feeds. Personally, it
> is something I would like to have linked from the DDPO.
>
> It shouldn't be too hard to convert (or maybe pair) your format to RSS,
> with an appropriate XML encoding of its content; shout if you want to
> discuss a potential DTD.  Providing more lively updates may be a bit
> harder: if one then wants to use RSS as such, weekly updates are too
> coarse-grained ...

Of course the idea of having more up-to-date data is appealing; I'd myself
welcome RSS/Atom feeds per-package (almost the same as PTS' upload news) or
per-maintainer (either changed-by or upload key or whatever). How should we
proceed with the XML encoding? And what might be the most interesting?

The weekly update period is rather arbitrary; it can be switched to daily
effortlessly.

filippo
--
Filippo Giunchedi - http://esaurito.net
PGP key: 0x6B79D401
random quote follows:

I never forget a face, but in your case I'll be glad to make an exception.
-- Groucho Marx





Re: parsing debian-devel-changes archives

2008-07-22 Thread Stefano Zacchiroli
On Tue, Jul 22, 2008 at 07:36:43PM +0200, Filippo Giunchedi wrote:
> Of course the idea of having more up-to-date data is appealing; I'd myself
> welcome RSS/Atom feeds per-package (almost the same as PTS' upload news) or
> per-maintainer (either changed-by or upload key or whatever).

Note that I was thinking of generating one big RSS feed (rotated as needed,
assuming such a concept exists for RSS) of all d-d-changes.
With that in place, you can define filters on top of it which dynamically
produce the other RSS feeds needed. But of course it depends on who will need
to serve the data, for efficiency reasons ...
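
A filter of that kind could be sketched as follows. This is only an
illustration under assumptions: the feed layout, the use of a bare name in the
author element, the second sample item and the helper name are all invented
for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample of the "big" feed; the second item is made-up data.
BIG_FEED = """\
<rss version="2.0"><channel><title>d-d-changes</title>
<item><title>netselect 0.3.ds1-12.1</title><author>Christian Perrier</author></item>
<item><title>hello 1.0-1</title><author>Jane Doe</author></item>
</channel></rss>"""


def filter_feed(feed_xml, author):
    """Return a new RSS document keeping only the items by the given author."""
    rss = ET.fromstring(feed_xml)
    channel = rss.find("channel")
    # findall() returns a list, so removing items while looping is safe.
    for item in channel.findall("item"):
        if item.findtext("author") != author:
            channel.remove(item)
    return ET.tostring(rss, encoding="unicode")


filtered = filter_feed(BIG_FEED, "Christian Perrier")
print(filtered)
```

The same function applied with a different key (package name, signing key)
would give the other derived feeds, at the cost of re-reading the big feed on
every request.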

> How should we proceed with the XML encoding? And what might be the most
> interesting?

I would go for the good old mantra of encoding all the available
information, i.e. simply translating the stanza you already generate to
XML. Given that RSS is often handy to look at directly in
browsers, it is probably worth going for a microformat approach
(http://microformats.org), i.e. just use XHTML as your XML language,
and encode semantic information using CSS classes as needed.

Quickly drafted example:

  <dl>
    <dt>source</dt>
    <dd class="source-package">netselect</dd>

    <dt>version</dt>
    <dd class="package-version">0.3.ds1-12.1</dd>

    <dt>date</dt>
    <dd class="date">Wed, 09 Jul 2008 19:47:21 +0200</dd>
    <!-- check what are the used conventions for date in other
         microformats ... -->

    <dt>changed by</dt>
    <dd class="changed-by">Christian Perrier &lt;[EMAIL PROTECTED]&gt;</dd>
    <!-- probably should be structured a bit more, to distinguish email
         from name ..., also avoiding annoying escapes -->

    <dt>maintainer</dt>
    <dd class="maintainer">Filippo Giunchedi &lt;[EMAIL PROTECTED]&gt;</dd>

    <!-- and so on, you got the idea :-) -->
  </dl>

This way you get rendering for free in browsers (maybe with just a tiny
bit of CSS) and preserve semantic annotations for whoever might want to mix
the data with something else that plays along with XML.
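
Generating such a fragment from an existing record could look roughly like
this. It is only a sketch: the class names follow the draft above, while the
helper name, the field mapping and the sample address are assumptions made up
for the example.

```python
from xml.sax.saxutils import escape

# Stanza field -> (visible label, CSS class); class names follow the draft above.
FIELDS = {
    "Source": ("source", "source-package"),
    "Version": ("version", "package-version"),
    "Date": ("date", "date"),
    "Changed-By": ("changed by", "changed-by"),
    "Maintainer": ("maintainer", "maintainer"),
}


def to_dl(record):
    """Render one upload record as an XHTML definition list (hypothetical helper)."""
    lines = ["<dl>"]
    for field, (label, css_class) in FIELDS.items():
        if field in record:
            lines.append(f"  <dt>{label}</dt>")
            # escape() turns <, > and & into entities so the output stays well-formed.
            lines.append(f'  <dd class="{css_class}">{escape(record[field])}</dd>')
    lines.append("</dl>")
    return "\n".join(lines)


html = to_dl({"Source": "netselect", "Version": "0.3.ds1-12.1",
              "Changed-By": "Christian Perrier <someone@example.org>"})
print(html)
```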

Cheers.

-- 
Stefano Zacchiroli -*- PhD in Computer Science \ PostDoc @ Univ. Paris 7
[EMAIL PROTECTED],pps.jussieu.fr,debian.org} -- http://upsilon.cc/zack/
I'm still an SGML person,this newfangled /\ All one has to do is hit the
XML stuff is so ... simplistic  -- Manoj \/ right keys at the right time




parsing debian-devel-changes archives

2008-07-21 Thread Filippo Giunchedi
Hi,
I've produced a script[0] to parse d-d-changes archives into a sane format,
so as to produce a history of uploads made to Debian[1] in this form:

Source: netselect
Version: 0.3.ds1-12.1
Date: Wed, 09 Jul 2008 19:47:21 +0200
Changed-By: Christian Perrier [EMAIL PROTECTED]
Maintainer: Filippo Giunchedi [EMAIL PROTECTED]
NMU: True
Key: D4E5EDACC0143D2D
Key-UID: Christian Perrier [EMAIL PROTECTED]

that is, by basically munging info out of the changes file appended to the d-d-c
message.
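
For anyone wanting to consume the results, reading one record back could be
sketched like this. It is not taken from munge_ddc.py; it assumes one
"Field: value" pair per line, as in the example above, and the helper name is
hypothetical.

```python
def parse_stanza(text):
    """Parse one 'Field: value' stanza into a dict (hypothetical helper)."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only: the Date value contains colons too.
        field, _, value = line.partition(":")
        record[field.strip()] = value.strip()
    return record


stanza = """\
Source: netselect
Version: 0.3.ds1-12.1
Date: Wed, 09 Jul 2008 19:47:21 +0200
NMU: True
Key: D4E5EDACC0143D2D
"""

upload = parse_stanza(stanza)
print(upload["Source"], upload["Version"])
```

Note that every value stays a string here, so NMU comes back as the text
"True" rather than a boolean.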

comments/ideas welcome as usual,
filippo

[0] http://svn.debian.org/wsvn/collab-qa/upload-history/munge_ddc.py?op=file&rev=0&sc=0
[1] (weekly updated) results available at http://qa.debian.org/~filippo/ddc/
--
Filippo Giunchedi - http://esaurito.net
PGP key: 0x6B79D401
random quote follows:

We are human because our ancestors learned to share their food and their skills
in an honored network of obligation.
-- Richard Leakey 

