Jeszenszky Peter wrote:
> Hello Ryan,
> 
> I have tested your deb2n3 script on several (approximately 10
> randomly chosen) different package files from
> 
>       http://packages.debian.org/
> 
> and I always get the "Missing control files" error message.
> 
> For example try this:
> 
>       http://packages.debian.org/stable/editors/emacs21
>       (choose package file for i386)

Fixed; I had mistaken an optional file for a required one.  I haven't 
tested extensively, since the issue of modelling dependencies was more 
distracting.
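
As a rough illustration of that kind of check (this is not the deb2n3 
source, and the function name is made up), only the `control` member of 
a .deb's control archive is treated as mandatory here; maintainer 
scripts, md5sums, conffiles and similar members are assumed to be 
optional:

```python
# Sketch only: not the deb2n3 code. The function name is hypothetical.
# Assumption: within a .deb control archive only "control" is mandatory;
# md5sums, conffiles, preinst, postinst, prerm, postrm may be absent.
REQUIRED = {"control"}

def missing_control_files(member_names):
    """Return the required control-archive members that are absent."""
    # tar members are often stored as "./control", so strip that prefix.
    present = {
        name[2:] if name.startswith("./") else name
        for name in member_names
    }
    return sorted(REQUIRED - present)
```

A package whose control archive lacks md5sums would pass this check, 
while one lacking `control` itself would not.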

> I have updated my rpm2rdf converter. Now it extracts and renders
> dependency information also and the resulting RDF/XML is better
> structured. See:
> 
>       http://www.inf.unideb.hu/~jeszy/rdfizers/rpm.rdf
> 
> Once an ontology is available and in use, some parts of the result
> should be modeled better. (For example, currently the ChangeLog is
> represented by a Seq of ChangeLogEntry elements.)
> 
> The use of RDF containers is also inconsistent: while Files and
> ChangeLogEntries are grouped together and wrapped in a container,
> each dependency is represented separately.

I'm not sure I'd bother using a container - but then I suppose that's a 
modelling question.  For some things, I think it's a bit more useful to 
directly relate, say, the file to the package that provides it, even if 
that's just a matter of sugar when it comes to querying.  Or, ugly as it 
is in RDF/XML, maybe an rdf:List would be better suited.
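
To make the querying difference concrete, here is a small sketch with 
triples held as plain Python tuples.  The ex: property names and the 
file paths are invented for illustration; only the rdf:_n membership 
properties come from RDF itself.

```python
# Triples as (subject, predicate, object). All ex: names and the file
# paths are hypothetical, chosen only to illustrate the two shapes.
PKG = "ex:emacs21"

# Option 1: the package points at an rdf:Seq whose members hang off
# rdf:_1, rdf:_2, ...
container = [
    (PKG, "ex:files", "_:seq"),
    ("_:seq", "rdf:type", "rdf:Seq"),
    ("_:seq", "rdf:_1", "/usr/bin/emacs21"),
    ("_:seq", "rdf:_2", "/usr/share/doc/emacs21/README"),
]

# Option 2: each file is related directly to the package.
direct = [
    (PKG, "ex:providesFile", "/usr/bin/emacs21"),
    (PKG, "ex:providesFile", "/usr/share/doc/emacs21/README"),
]

def files_via_container(triples, pkg):
    # Two hops: find the container node, then its rdf:_n members.
    seqs = {o for s, p, o in triples if s == pkg and p == "ex:files"}
    return sorted(
        o for s, p, o in triples if s in seqs and p.startswith("rdf:_")
    )

def files_direct(triples, pkg):
    # One hop: a single pattern match on the package.
    return sorted(
        o for s, p, o in triples if s == pkg and p == "ex:providesFile"
    )
```

Both functions return the same file list; the direct form just needs 
one pattern instead of two.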

> But the resulting RDF/XML contains almost all metadata that might
> be interesting and useful.
> 
> It is also quite similar to the output of your deb2rdf converter.
> 
> Based on the outputs of our converters it is clear that there are
> RPM package metadata elements and Debian package metadata elements
> with the same meaning. If we would like to develop a universal
> software package ontology, a detailed investigation of the formats
> is required. The next step should be to compare the RPM and Debian
> package formats.
> 
> RPM information can be found here:
> 
>       http://www.rpm.org/
> 
> The following document describes the RPM package format:
> 
>       http://fedora.redhat.com/docs/drafts/rpm-guide-en/
> 
> More precisely Chapter 24 contains the details of the RPM format:
> 
>       http://fedora.redhat.com/docs/drafts/rpm-guide-en/ch-package-structure.html
> 
> Unfortunately, it is incomplete. I18N features of the format are not
> discussed, and it is not clear how it is used by package maintainers.
> The most painful deficiency is that it does not discuss character
> encoding issues. These problems may be discussed on the following
> mailing lists:
> 
>       https://lists.rpm.org/mailman/listinfo/rpm-maint
>       https://www.redhat.com/mailman/listinfo/rpm-list

There is an older project called rpm2html that also generates RDF, for 
use with a tool for finding other RPMs called, fittingly, rpmfind.  
rpmfind.net, associated with the author of rpm2html, used to publish RPM 
RDF as well; that feature appears to have died.

rpm2html itself is still available and under development:

   https://savannah.nongnu.org/projects/rpm2html/

> Is there an official Debian package format specification?

The official line is that each Debian release documents its own version 
of the package format in the deb(5) man page; here's a copy in HTML 
that claims the latest format has been that way since Debian 0.93:

   http://linuxreviews.org/man/deb/

The official line comes from:

   http://www.debian.org/doc/FAQ/ch-pkg_basics.en.html#s-deb-format

which states that the format is subject to change between major 
releases.  Apparently they haven't changed it for a while.

More succinctly:

   http://en.wikipedia.org/wiki/.DEB

> In the next two weeks I will be busy because of my work but I will
> try to write down a few thoughts on a possible RPM software package
> ontology.

Indeed; in addition to finding overlap, the issue to me is how to go 
beyond rpm2html's simplistic dependency modelling (if old examples are 
to be believed, <RPM:dependency>libc6</RPM:dependency> was the sum total 
of capturing that information).

Or it may just be a matter of doing some minimal modelling and hoping 
the tools of the future will be able to do the necessary inferencing / 
querying as needed to determine dependency trees.
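
A minimal sketch of what that later inferencing might look like, 
assuming nothing more than flat "depends" triples.  The ex: names and 
the sample data are illustrative only, not real archive metadata:

```python
# Sketch: walk flat (package, "ex:depends", package) triples to collect
# everything reachable from one package. Names and data are made up.
def dependency_closure(triples, start, predicate="ex:depends"):
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if s == node and p == predicate and o not in seen:
                seen.add(o)      # record each dependency once
                stack.append(o)  # then follow its own dependencies
    return seen

sample = [
    ("ex:emacs21-nox", "ex:depends", "ex:emacs21"),
    ("ex:emacs21", "ex:depends", "ex:libc6"),
    ("ex:unrelated", "ex:depends", "ex:zlib"),
]
```

The `seen` set keeps the walk finite even if the data contains 
dependency cycles.  Version constraints are ignored entirely here, 
which is exactly the part minimal modelling leaves to the tools.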

In more concrete terms, I suppose my question was whether

<http://packages/emacs21-nox/21.4/> :dependsEqual 
<http://packages/emacs21/21.4/> .

is 'better' than

<http://packages/emacs21-nox/21.4/> :depends [ :package 
<http://packages/emacs21/> ; :version "21.4"; :dependency :equal ] .
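
One way to weigh the two shapes is to look at what a consumer has to 
do to read them.  The sketch below holds both as plain Python tuples 
and normalises each to (package, relation, version).  The helper names 
and the predicate table are assumptions for illustration; only the 
triples themselves mirror the two examples above.

```python
NOX = "http://packages/emacs21-nox/21.4/"

# Shape 1: the relation is baked into the predicate, the version into
# the object URI.
flat = [(NOX, ":dependsEqual", "http://packages/emacs21/21.4/")]

# Shape 2: an intermediate node carries package, version and relation.
structured = [
    (NOX, ":depends", "_:d1"),
    ("_:d1", ":package", "http://packages/emacs21/"),
    ("_:d1", ":version", "21.4"),
    ("_:d1", ":dependency", ":equal"),
]

# Hypothetical table: every further relation would need its own
# predicate listed here.
FLAT_PREDICATES = {":dependsEqual": ":equal"}

def deps_flat(triples, pkg):
    out = []
    for s, p, o in triples:
        if s == pkg and p in FLAT_PREDICATES:
            # The version has to be recovered by parsing the URI.
            base, version = o.rstrip("/").rsplit("/", 1)
            out.append((base + "/", FLAT_PREDICATES[p], version))
    return out

def deps_structured(triples, pkg):
    out = []
    for s, p, node in triples:
        if s == pkg and p == ":depends":
            props = {p2: o2 for s2, p2, o2 in triples if s2 == node}
            out.append(
                (props[":package"], props[":dependency"], props[":version"])
            )
    return out
```

Both calls yield the same tuple for the emacs21-nox example.  The flat 
shape is a single triple but leans on URI conventions and a growing set 
of predicates; the structured shape costs extra triples but reads the 
same way for every kind of constraint.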

What's the right granularity for a resource?  Does each architecture 
deserve to be considered its own resource, or is it sufficient to note 
which architectures are available for each version?  Does an actual 
instance of a package matter, or is this a more abstract matter?

On one hand, it shouldn't matter, even if it's not uniform, so long as 
all the information is there, somewhere; on the other, life would be 
easier if it were at least agreed upon at the outset.

-- 
Ryan Lee                  [EMAIL PROTECTED]
MIT CSAIL Research Staff  http://simile.mit.edu/
