Jeszenszky Peter wrote:
> Hello Ryan,
>
> I have tested your deb2n3 script on several (approximately 10
> randomly chosen) different package files from
>
> http://packages.debian.org/
>
> and I always get the "Missing control files" error message.
>
> For example try this:
>
> http://packages.debian.org/stable/editors/emacs21
> (choose package file for i386)

Fixed; I mistook an optional file for a required one. I haven't tested
extensively, as the issue of modelling dependencies was more distracting.

> I have updated my rpm2rdf converter. Now it also extracts and renders
> dependency information, and the resulting RDF/XML is better
> structured. See:
>
> http://www.inf.unideb.hu/~jeszy/rdfizers/rpm.rdf
>
> If an ontology becomes available and is used, some parts of the result
> should be modeled better. (For example, currently the ChangeLog is
> represented by a Seq of ChangeLogEntry elements.)
>
> The use of RDF containers is not consistent either: while Files and
> ChangeLogEntries are grouped together and wrapped in a container,
> each dependency is represented separately.

I'm not sure I'd bother using a container, but then I suppose that's a
modelling question. For some things, I think it's a bit more useful to
directly relate, say, the file to the package that provides it, even if
that's just a matter of sugar when it comes to querying. Or, ugly as it
is in RDF/XML, maybe an rdf:List would be better suited.

> But the resulting RDF/XML contains almost all metadata that might
> be interesting and useful.
>
> It is also quite similar to the output of your deb2rdf converter.
>
> Based on the outputs of our converters it is clear that there are
> RPM package metadata elements and Debian package metadata elements
> with the same meaning. If we would like to develop a universal
> software package ontology, a detailed investigation of the formats
> is required. The next step should be to compare the RPM and Debian
> package formats.
>
> RPM information can be found here:
>
> http://www.rpm.org/
>
> The following document describes the RPM package format:
>
> http://fedora.redhat.com/docs/drafts/rpm-guide-en/
>
> More precisely, Chapter 24 contains the details of the RPM format:
>
> http://fedora.redhat.com/docs/drafts/rpm-guide-en/ch-package-structure.html
>
> Unfortunately, it is incomplete.
> I18N features of the format are not
> discussed, it is not clear how it is used by package maintainers. The
> most painful deficiency is that it does not discuss character encoding
> issues. These problems may be discussed on the following mailing lists:
>
> https://lists.rpm.org/mailman/listinfo/rpm-maint
> https://www.redhat.com/mailman/listinfo/rpm-list

There is an older project called rpm2html that also generates RDF, for
use by a tool for finding other RPMs called, fittingly, rpmfind.
rpmfind.net, associated with the author of rpm2html, used to publish RPM
RDF as well; that feature appears to have died. The tool is still
available and under development:

http://savannah.nongnu.org/projects/rpm2html/

> Is there an official Debian package format specification?

The official line is that each Debian distribution carries its version's
packaging spec in the man page for deb(5); here's one available in HTML
that claims the latest format has been that way since Debian 0.93:

http://linuxreviews.org/man/deb/

The official line comes from:

http://www.debian.org/doc/FAQ/ch-pkg_basics.en.html#s-deb-format

which states that the format is subject to change between major
releases. Apparently they haven't changed it for a while. More
succinctly:

http://en.wikipedia.org/wiki/.DEB

> In the next two weeks I will be busy because of my work but I will
> try to write down a few thoughts on a possible RPM software package
> ontology.

Indeed; in addition to finding overlap, the issue to me is how to go
beyond rpm2html's simplistic dependency modelling (if old examples are
to be believed, <RPM:dependency>libc6</RPM:dependency> was the sum total
of capturing that information). Or it may just be a matter of doing
some minimal modelling and hoping the tools of the future will be able
to do the necessary inferencing / querying as needed to determine
dependency trees.

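To make the "minimal modelling, query later" option a bit more tangible,
here is a rough sketch in Python of what such a query might look like.
Everything in it is an assumption for illustration: the flat "depends"
predicate, the function names, and the toy edges, which are not real
Debian metadata. It only shows that a dependency tree can be recovered
from plain package-to-package statements by walking them transitively.

```python
from collections import deque

# Toy (subject, predicate, object) statements. The edges are made up
# for illustration and do not reflect real package metadata.
TRIPLES = [
    ("emacs21-nox", "depends", "emacs21"),
    ("emacs21", "depends", "libc6"),
    ("emacs21", "depends", "emacsen-common"),
    ("emacsen-common", "depends", "libc6"),
]


def direct_dependencies(triples, package):
    """Return the packages that `package` directly depends on, in order."""
    return [o for s, p, o in triples if s == package and p == "depends"]


def all_dependencies(triples, package):
    """Breadth-first walk over 'depends' edges; terminates on cycles."""
    seen = set()
    order = []
    queue = deque(direct_dependencies(triples, package))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        order.append(current)
        queue.extend(direct_dependencies(triples, current))
    return order


if __name__ == "__main__":
    print(all_dependencies(TRIPLES, "emacs21-nox"))
```

Note that this flat form carries no version or comparison information,
so it cannot by itself distinguish "depends on exactly 21.4" from
"depends on any version".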
In more concrete terms, I suppose my question was whether

  <http://packages/emacs21-nox/21.4/> :dependsEqual <http://packages/emacs21/21.4/> .

is 'better' than

  <http://packages/emacs21-nox/21.4/> :depends [
      :package <http://packages/emacs21/> ;
      :version "21.4" ;
      :dependency :equal
  ] .

What's the right granularity for a resource? Does each architecture
deserve to be considered its own resource, or is it sufficient to note
which architectures are available for each version? Does an actual
instance of a package matter, or is this a more abstract matter? On one
hand, it shouldn't matter, even if it's not uniform, so long as all the
information is there, somewhere; on the other, life would be easier if
it were at least agreed upon at the outset.

-- 
Ryan Lee                 [EMAIL PROTECTED]
MIT CSAIL Research Staff
http://simile.mit.edu/
