On Wed, Mar 26, 2008 at 7:27 PM, Aristotle Pagaltzis <[EMAIL PROTECTED]> wrote:
> > authors/id/D/DA/DAGOLDEN/Foo-Bar-1.23.tar.gz
> > authors/id/R/RJ/RJBS/Foo-Bar-1.23.tar.gz
>
> http://search.cpan.org/dist/Foo-Bar/ does something useful
> anyhow. URI::cpan should produce the same results.

That it produces something useful comes from some set of heuristics.
Unless the same set of heuristics is applied everywhere the URI is
used, the same URI could lead to different endpoints, which is not
desirable.

If what you really want is URI::cpan::search, then write that. But
search.cpan.org is not CPAN, it's just one particular (albeit very
useful) view of it.

> > They don't even need to contain the same modules (*.pm files)
> > or packages (package statement within a .pm file). Both
> > versions of Foo-Bar-1.23 could appear in the 02packages file.
>
> That's completely inconsequential for our purposes AFAICT.

It is inconsequential until someone intentionally or maliciously
collides a tarball name. Nothing prevents it from happening, so I
wouldn't suggest writing a URI scheme that ignores the possibility.

(I'm half tempted to collide with an Acme distribution just to see how
search.cpan.org resolves "dist/Acme-Foo-Bar" when there are two of
Acme-Foo-Bar. But I wouldn't do it without permission, of course :-)

> > That's what I don't really get. What does that *mean*? If a URI
> > is supposed to identify a resource (c.f [Uniform Resource
> > Identifier on Wikipedia]), what "resource" does (2) identify? I
> > said "latest" because that attempts to pin it to a specific
> > resource. In the abstract, it doesn't seem to have any standard
> > meaning and thus no real utility.
>
> Don't confuse resources and representations. A resource is not a
> file or anything concrete; it's a platonic ideal. It may have any
> number of representations – or none at all. Some of them may be
> identical with the representation of a different resource during
> a particular time period, even though they're different resources
> (such as /dist/Foo-Bar and /dist/Foo-Bar-1.23).

I'm not confusing them, or at least I don't think I am. I'm asking
what the "platonic ideal" is actually supposed to be and what utility
it has, because the examples so far are either ambiguous or rest on
some unstated assumption that I'm trying to draw out into the light of
day.

Is "cpan://dist/AUTHOR/Foo-Bar":

(a) the set of "Foo-Bar" distributions by AUTHOR?
(b) a particular "Foo-Bar" distribution by AUTHOR? If so, which one?

If (b), is it predictably deterministic? If not, what practical use is
it?

And if AUTHOR is removed from the URI, the ambiguity increases but the
same questions apply.

Here's a practical example I just found: Test::Unit

If you load the cpan shell and inquire about Test::Unit, you find that
it's in MCAST/Test-Unit-0.25.tar.gz:

    cpan[3]> m Test::Unit
    Module id = Test::Unit
        DESCRIPTION  framework for XP style unit testing
        CPAN_USERID  MCAST (Matthew Astley <[EMAIL PROTECTED]>)
        CPAN_VERSION 0.25
        CPAN_FILE    M/MC/MCAST/Test-Unit-0.25.tar.gz
        DSLIP_STATUS bmpOp (beta,mailing-list,perl,object-oriented,Standard-Perl)
        INST_FILE    (not installed)

However, if you go to http://search.cpan.org/dist/Test-Unit/ you'll
find yourself at CLEMBERG/Test-Unit-0.14.tar.gz. "Other releases" on
that page won't even show 0.25. Yet if you click into any of the
individual modules, you'll see a link for "Latest Release:
Test-Unit-0.25".

In one view, 0.25 is "unauthorized" and not shown. In another view,
0.25 is a valid "latest release". Which is "correct"? Arguably, either
one. Which should "dist/Test-Unit" lead to? I don't know. One way or
another, a decision has to be made about what it should mean before we
can figure out how to use it.

Practical example -- continued. Barbie set up the CPAN Testers wiki to
have a "shortcut" to CPAN modules. For whatever reason (perhaps the
wiki software doesn't like colons), a link to Test::Unit would be
[[cpan:Test-Unit]] -- which of course is rendered as a link to
search.cpan.org/dist/Test-Unit, which isn't the latest Test::Unit.
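
To make the ambiguity concrete, here is a minimal Python sketch. It is
only an illustration, not how search.cpan.org or the cpan shell actually
resolve names. The tarball names come from the examples in this thread;
the function names, the simplified filename parser, and the two
resolution "policies" are all hypothetical.

```python
import re

# AUTHOR/tarball pairs taken from the examples in this thread.
PATHS = [
    "DAGOLDEN/Foo-Bar-1.23.tar.gz",
    "RJBS/Foo-Bar-1.23.tar.gz",
    "MCAST/Test-Unit-0.25.tar.gz",
    "CLEMBERG/Test-Unit-0.14.tar.gz",
]

# Hypothetical, simplified parser for AUTHOR/Name-Version.tar.gz.
# Real distribution names have more edge cases than this handles.
PATH_RE = re.compile(
    r"^(?P<author>[^/]+)/(?P<name>.+)-(?P<version>[0-9.]+)\.tar\.gz$"
)


def parse(path):
    """Split a path into (author, dist_name, version)."""
    m = PATH_RE.match(path)
    if m is None:
        raise ValueError("unrecognized path: " + path)
    return m.group("author"), m.group("name"), m.group("version")


def candidates(dist_name, paths=PATHS):
    """Every tarball an author-less 'dist/NAME' identifier could mean."""
    return [p for p in paths if parse(p)[1] == dist_name]


def version_key(version):
    # Simplification: compare dotted integer parts. Real Perl version
    # comparison has more rules than this.
    return tuple(int(part) for part in version.split("."))


def resolve_highest_version(dist_name, paths=PATHS):
    """Hypothetical policy 1: pick the highest version, whoever uploaded it."""
    return max(candidates(dist_name, paths),
               key=lambda p: version_key(parse(p)[2]))


def resolve_by_author(dist_name, author, paths=PATHS):
    """Hypothetical policy 2: only accept tarballs from one given author."""
    return [p for p in candidates(dist_name, paths) if parse(p)[0] == author]
```

With this data, candidates("Foo-Bar") returns two tarballs with the same
name and version, so a "highest version" rule can only break the tie
arbitrarily. For Test-Unit, the first policy picks MCAST's 0.25 while
the second, given CLEMBERG, picks 0.14. The same identifier leads to
different endpoints depending on which rule the consumer happens to
apply.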

My point is that these decisions do have practical consequences, and we
should seek to remove as much ambiguity as possible through clear,
consistent definitions.

Regards,
David