Re: MD5 and Mirrors ( was Re: MD5 Hash )
Well, after my own little survey, I've determined the following:

md5 on BSD (Apache Minotaur):

    [EMAIL PROTECTED]:/home/mdiggory> md5 foo.bar
    MD5 (foo.bar) = 7f5e787ff3b930d906d01243ccf7c237

md5 has no built-in option to compare the file to the checksum and return true/false.

Output of md5sum (GNU textutils) on Redhat:

    [EMAIL PROTECTED]:/home/mdiggory> md5sum foo.bar
    7f5e787ff3b930d906d01243ccf7c237  foo.bar

md5sum has a built-in option which compares the md5 from the signature against the original file:

    [EMAIL PROTECTED] mdiggory]$ md5sum -c foo.bar.md5
    foo.bar: OK

The output of Maven when publishing to the repository is the md5 string minus the filename, and it depends on GNU md5sum.

*example snippet of the command as it's run in jelly*

    cd ${directory}; md5sum ${artifactName} | sed 's/ .*$//' | tee ${artifactName}.md5; chgrp ${maven.repository.group} *; chmod g+w,a+r *;

This results in the string with no filename on ibiblio, and actually fails on minotaur, as it's BSD and the executable is not present.

What is the right/wrong way is not really a reasonable question to ask. How to appropriately deal with the variants in md5/md5sum generation and file structure, specifically in relation to the repository, are the important questions to throw around.

My opinions are the following:

Server-side, OS-dependent tools are usually accessed in scripts (say, in a cron script which does checking and reports errors). These scripts will always be unique to an OS, and they will often be custom for that particular need. The author usually writes their own string parsing routines (i.e. md5sum foo.bar | sed 's/ .*$//').

A client-side tool needs a simple and standard means of validating the content it is about to download or upload onto a server. If the repository structure already enforces the name of the md5 sum in relation to the file name, any internal naming done inside the md5 file is redundant.

It would be good to just have the file contain the checksum, which reduces parsing requirements on both the server and the client. Client tools should be robust enough (or extensible enough) to generate the appropriate md5 sum for a particular artifact and to easily find and read/compare it to the content on the server.

-Mark

Markus M. May wrote:
> Hello Mark,
>
> this is probably my fault. I checked this whole stuff with a very old
> maven.md5-file. The format is now equal between the two projects. Sorry
> for the confusion.
>
> Markus

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu
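To make the parsing cost of the three layouts surveyed in this message concrete, here is a minimal Python sketch of a client-side reader that accepts any of them (BSD md5 output, GNU md5sum output, and the bare hash that Maven publishes). It is not taken from Maven, Depot, or Ruper; the function name extract_md5 is hypothetical, and the sketch assumes a .md5 file holds a single entry.

```python
import re

_HEX32 = r"[0-9a-fA-F]{32}"


def extract_md5(text):
    """Return the lowercase MD5 hex digest found in a .md5 file's text, or None.

    Hypothetical helper: assumes the file holds exactly one checksum entry.
    """
    line = text.strip()

    # BSD md5 layout: "MD5 (foo.bar) = <hash>"
    match = re.fullmatch(r"MD5 \((.+)\) = (" + _HEX32 + r")", line)
    if match:
        return match.group(2).lower()

    # GNU md5sum layout "<hash>  foo.bar" (or "<hash> *foo.bar"),
    # and the bare "<hash>" layout with no filename at all.
    match = re.fullmatch(r"(" + _HEX32 + r")(\s+\*?.+)?", line)
    if match:
        return match.group(1).lower()

    return None
```

If the repository fixed the file contents to the bare hash, only the last branch would be needed, which is the reduction in parsing argued for above.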
Re: MD5 and Mirrors ( was Re: MD5 Hash )
Hello Mark, this is probably my fault. I checked this whole stuff with a very old maven.md5-file. The format is now equal between the two projects. Sorry for the confusion. Markus
Re: MD5 and Mirrors ( was Re: MD5 Hash )
> Working together, I believe both Depot, Repository, Maven, ... can come
> to a common agreement on the Apache Repository Structure. The separate
> groups maintaining different views provides the "tension" necessary for
> growth of an agreement. The key is eventual compromise and
> non-possessiveness in all the parties involved.

Well said. Count us in on that.

regards

Adam
Re: MD5 and Mirrors ( was Re: MD5 Hash )
Adam R. B. Jack wrote:
> > Adam is perfectly right about this stuff. There is one more thing we need to
> > think about. Some repositories treat md5-files differently. The structure on
> > apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it is
> > just [MD5 Hash]. So this needs to be somehow configurable.
>
> I think we need to shoot for what is considered the Apache Repository, none
> other. What Maven do w/ ibiblio will clearly impact us, but ought be
> secondary.

A standard file format for md5 is more important in the long run, I think, than the way either Apache in general or the Maven project specifically generates the contents of md5 checksum files. Currently, neither Apache nor Maven md5s can be validated using the standard FSF GNU md5sum implementation.

> That said, Apache Repository is about to become Maven Repository (taken over
> by the Maven team) unless we help Apache get its act together. Still, it
> might not be a bad thing, we expect them to be a primary publisher.
>
> regards
>
> Adam

Working together, I believe both Depot, Repository, Maven, ... can come to a common agreement on the Apache Repository Structure. The separate groups maintaining different views provides the "tension" necessary for growth of an agreement. The key is eventual compromise and non-possessiveness in all the parties involved.

-Mark
MD5 Standards (was Re: MD5 and Mirrors ( was Re: MD5 Hash ))
Besides, my current experiments with GNU md5sum (2.0.21) show that the sums on the Maven contents aren't verifiable by any tool other than the Maven checksum plugin. If they aren't verifiable by external tools, that's a bad situation. I'm going to bring this up on the Maven list too.

http://www.faqs.org/rfcs/rfc1321.html

A quick "dig" through the RFC suggests a loophole here, as there is no reference to what the contents of an md5 signature file should look like. It seems to be more of an inherent "suggestion" in the implementation itself.

-Mark

Mark R. Diggory wrote:
> It's a tough call. Is there any "standard" for the structure of the md5
> contents out there? I think the Maven team would be keen to play along with
> a standard and yet play along with any configurability as well.
>
> -Mark Diggory
>
> Markus M. May wrote:
> > Adam is perfectly right about this stuff. There is one more thing we need to
> > think about. Some repositories treat md5-files differently. The structure on
> > apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it is
> > just [MD5 Hash]. So this needs to be somehow configurable.
> >
> > One more thing to think about :-)
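On the point in this message that a bare-hash .md5 file cannot be checked by GNU md5sum: `md5sum -c` reads lines made of the hash, two spaces, and the filename. As a rough illustration only, the Python sketch below rebuilds such a line from a bare hash and the artifact name that the repository layout already implies. The function name gnu_check_line is hypothetical and is not part of Maven or Depot.

```python
def gnu_check_line(md5_hex, filename):
    """Format one checksum entry in the layout that `md5sum -c` reads.

    Hypothetical helper: the layout is the hash, two spaces, then the
    filename (text mode), terminated by a newline.
    """
    return f"{md5_hex.strip().lower()}  {filename}\n"
```

Writing the returned line to foo.bar.md5 next to foo.bar is what would let an external `md5sum -c foo.bar.md5` succeed; whether the repository should store that form or the bare hash is exactly the open question in this thread.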
Re: MD5 and Mirrors ( was Re: MD5 Hash )
It's a tough call. Is there any "standard" for the structure of the md5 contents out there? I think the Maven team would be keen to play along with a standard and yet play along with any configurability as well.

-Mark Diggory

Markus M. May wrote:
> Adam is perfectly right about this stuff. There is one more thing we need to
> think about. Some repositories treat md5-files differently. The structure on
> apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it is
> just [MD5 Hash]. So this needs to be somehow configurable.
>
> One more thing to think about :-)
>
> > Nick wrote:
> >
> > > The MD5 should always come from the authoritative source (apache.org)
> > > using https.
> >
> > I'm not sure if all environments (JVMs) have HTTPS available. In a
> > somewhat perfect world we'd try HTTPS and if it failed try HTTP, unless
> > some 'minimum security' was requested.
> >
> > I think we'll have to experiment and experience this area over
> > time/iterations.
> >
> > > How are we going to know what the "authoritative" source for a resource
> > > is.
> > > For java we could enforce a reverse domain name.
> >
> > Four things:
> >
> > 1) Repository URI/URL is what it is (whatever it is) and the URL for the
> > MD5 ought be the URL for the resources plus ".md5" on the end.
> >
> > 2) As current Ruper thinking (coding) goes ... Mirrors ought mirror the
> > hierarchy, so wherever a resource is in the repo, the .md5 ought be next
> > to it, and the original .md5 ought be in exactly the same relative
> > position (just relative to an apache root).
> >
> > 3) Mirroring is kinda hacked into Ruper right now, it silently moves the
> > root of a repository (originally set relative to the mirror locator CGI
> > script) to one such mirror. As such Ruper doesn't really know about
> > mirrors.
> >
> > 4) We probably need to rethink current thinking... ;-)
> >
> > regards,
> >
> > Adam
Re: MD5 and Mirrors ( was Re: MD5 Hash )
> Adam is perfectly right about this stuff. There is one more thing we need to
> think about. Some repositories treat md5-files differently. The structure on
> apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it is
> just [MD5 Hash]. So this needs to be somehow configurable.

I think we need to shoot for what is considered the Apache Repository, none other. What Maven do w/ ibiblio will clearly impact us, but ought be secondary.

That said, Apache Repository is about to become Maven Repository (taken over by the Maven team) unless we help Apache get its act together. Still, it might not be a bad thing, we expect them to be a primary publisher.

regards

Adam
Re: MD5 and Mirrors ( was Re: MD5 Hash )
Adam is perfectly right about this stuff. There is one more thing we need to think about. Some repositories treat md5-files differently. The structure on apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it is just [MD5 Hash]. So this needs to be somehow configurable.

One more thing to think about :-)

> Nick wrote:
>
> > The MD5 should always come from the authoritative source (apache.org)
> > using https.
>
> I'm not sure if all environments (JVMs) have HTTPS available. In a somewhat
> perfect world we'd try HTTPS and if it failed try HTTP, unless some 'minimum
> security' was requested.
>
> I think we'll have to experiment and experience this area over
> time/iterations.
>
> > How are we going to know what the "authoritative" source for a resource
> > is.
> > For java we could enforce a reverse domain name.
>
> Four things:
>
> 1) Repository URI/URL is what it is (whatever it is) and the URL for the MD5
> ought be the URL for the resources plus ".md5" on the end.
>
> 2) As current Ruper thinking (coding) goes ... Mirrors ought mirror the
> hierarchy, so wherever a resource is in the repo, the .md5 ought be next to
> it, and the original .md5 ought be in exactly the same relative position
> (just relative to an apache root).
>
> 3) Mirroring is kinda hacked into Ruper right now, it silently moves the
> root of a repository (originally set relative to the mirror locator CGI
> script) to one such mirror. As such Ruper doesn't really know about
> mirrors.
>
> 4) We probably need to rethink current thinking... ;-)
>
> regards,
>
> Adam
MD5 and Mirrors ( was Re: MD5 Hash )
Nick wrote:

> The MD5 should always come from the authoritative source (apache.org)
> using https.

I'm not sure if all environments (JVMs) have HTTPS available. In a somewhat perfect world we'd try HTTPS and if it failed try HTTP, unless some 'minimum security' was requested.

I think we'll have to experiment and experience this area over time/iterations.

> How are we going to know what the "authoritative" source for a resource
> is.
> For java we could enforce a reverse domain name.

Four things:

1) Repository URI/URL is what it is (whatever it is) and the URL for the MD5 ought be the URL for the resources plus ".md5" on the end.

2) As current Ruper thinking (coding) goes ... Mirrors ought mirror the hierarchy, so wherever a resource is in the repo, the .md5 ought be next to it, and the original .md5 ought be in exactly the same relative position (just relative to an apache root).

3) Mirroring is kinda hacked into Ruper right now, it silently moves the root of a repository (originally set relative to the mirror locator CGI script) to one such mirror. As such Ruper doesn't really know about mirrors.

4) We probably need to rethink current thinking... ;-)

regards,

Adam
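Points 1 and 2 in this message can be sketched in a few lines of Python. This is an illustration under assumptions, not Ruper's code: the function name md5_urls is hypothetical, the hosts used in the usage note are placeholders, and the sketch only derives the two URLs (it does not download anything or attempt the HTTPS-then-HTTP fallback).

```python
def md5_urls(resource_url, mirror_root, origin_root):
    """Return (mirror_md5_url, origin_md5_url) for a resource fetched from a mirror.

    Hypothetical helper. It assumes the mirror reproduces the origin's
    hierarchy, so the path relative to the mirror root is also the path
    relative to the origin root.
    """
    root = mirror_root.rstrip("/") + "/"
    if not resource_url.startswith(root):
        raise ValueError("resource is not under the mirror root")

    # Same relative position on both sides of the mirror.
    relative = resource_url[len(root):]

    # Point 1: the checksum URL is the resource URL plus ".md5".
    mirror_md5 = resource_url + ".md5"
    # Point 2: the original .md5 sits at the same relative path on the origin.
    origin_md5 = origin_root.rstrip("/") + "/" + relative + ".md5"
    return mirror_md5, origin_md5
```

For example, with the placeholder roots http://mirror.example.org/apache and https://origin.example.org/dist, the resource http://mirror.example.org/apache/ant/ant.jar would map to .../apache/ant/ant.jar.md5 on the mirror and https://origin.example.org/dist/ant/ant.jar.md5 on the origin.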
Re: MD5 Hash
Adam R. B. Jack wrote:
> Hmm, what makes folk think that the file could be changed without the MD5
> hash file being changed also. I feel there has to be some private key from
> the originator, to ensure that nobody could fake both.

The MD5 should always come from the authoritative source (apache.org) using https. How are we going to know what the "authoritative" source for a resource is? For Java we could enforce a reverse domain name, i.e. packages like org.apache must get an md5 from an apache.org website.

> So, if there are such keys, how do we acquire them? How do we trust them?
>
> regards
>
> Adam
Re: MD5 Hash
I think in the same direction. First I will try to compare the generated hash with the hash from the mirror. In a second step I will then try to determine the original .md5 file and compare against that one.

Basically the web-of-trust is pretty hard to automate right now. You already have a KEYS file with quite a lot of keys, but you cannot tell which key signed the file. There is no way to do this (or I missed it). So right now, I will concentrate on the MD5 stuff.

Markus

> > Basically the MD5 Hash does not need keys.
> > [...]
> > Also apache.org delivers a file named .asc
>
> Ok, thanks, I get it now (I think.)
>
> This explains some of the negative comments I've heard about MD5 then (it
> not being too strong). I read on one Apache list that folks will be ok with
> this being strong enough though. What will be tricky for us, should we
> choose to attempt it, will be supporting mirrors yet using the original MD5
> from Apache...
>
> Since ASC has keys, that ties in to the 'web of trust' that Apache is
> working on, I think. Once one trusts a certain set of keys, those keys can
> be used to verify others that are acquired, and those can be used to verify
> the ASC. This is much harder to automate, but something we could aspire
> to...
>
> regards,
>
> Adam
Re: MD5 Hash
> Basically the MD5 Hash does not need keys.
> [...]
> Also apache.org delivers a file named .asc

Ok, thanks, I get it now (I think.)

This explains some of the negative comments I've heard about MD5 then (it not being too strong). I read on one Apache list that folks will be ok with this being strong enough though. What will be tricky for us, should we choose to attempt it, will be supporting mirrors yet using the original MD5 from Apache...

Since ASC has keys, that ties in to the 'web of trust' that Apache is working on, I think. Once one trusts a certain set of keys, those keys can be used to verify others that are acquired, and those can be used to verify the ASC. This is much harder to automate, but something we could aspire to...

regards,

Adam
Re: MD5 Hash
Hello once again,

> > yes, I can enlighten you all a little bit about MD5 hashes. The basic idea
> > is that
>
> Ok, thanks for that. I get the gist, I get the premise. Now, more
> practically...
>
> What are the inputs to the algorithm? Meaning, we have the file, we have
> the MD5 resultant hash (assuming the file on the server has not been
> modified), and we have the algorithm, but do we need anything else (e.g.
> keys) in order to re-compute/check the resultant hash?

Basically the MD5 hash does not need keys. It is generated from the file itself without any password or anything like that. The code is just a hashcode of the file (a hex number).

> Hmm, what makes folk think that the file could be changed without the MD5
> hash file being changed also. I feel there has to be some private key from
> the originator, to ensure that nobody could fake both.

As stated earlier, there are no keys there. Since a normal user uses a mirror to download apache.org sources or binaries, you can then check whether the file has the same hash code as the original file from apache.org (this can be checked by using the original .md5 file from apache).

Also, apache.org delivers a file named .asc (at least some projects, like ant, do this). This file contains a signature for the original file. It can then be checked by using the public key stored in the KEYS file in the root directory of each project. But this has nothing really to do with the MD5 stuff. MD5 basically just ensures integrity during the download, but does not, as you said, ensure that the file is really the one which was published or intended to be published.

> So, if there are such keys, how do we acquire them? How do we trust them?
>
> regards
>
> Adam

R,

Markus
Re: MD5 Hash
> yes, I can enlighten you all a little bit about MD5 hashes. The basic idea
> is that

Ok, thanks for that. I get the gist, I get the premise. Now, more practically...

What are the inputs to the algorithm? Meaning, we have the file, we have the MD5 resultant hash (assuming the file on the server has not been modified), and we have the algorithm, but do we need anything else (e.g. keys) in order to re-compute/check the resultant hash?

Hmm, what makes folk think that the file could be changed without the MD5 hash file being changed also. I feel there has to be some private key from the originator, to ensure that nobody could fake both.

So, if there are such keys, how do we acquire them? How do we trust them?

regards

Adam
Re: MD5 Hash
Hello,

yes, I can enlighten you all a little bit about MD5 hashes. The basic idea is that a hash is an effectively unique key for a value (in this case a file). From this key you cannot guess or even generate the original value (file). It is generated with the MD5 algorithm using Java's security stuff.

So basically, if a file is updated on the apache.org servers, the MD5 hash is generated. When the file is updated, the hash of the updated file is normally not the same (the chance of it being the same is very, very small). So, basically, a new hash is generated for each file.

You can then create another hash from the same file. If you are using the same algorithm (MD5/SHA) you then get the same hash (meaning: from the same file you always get the same hash code when using the same algorithm).

On the apache.org servers there is always an .md5 file for each deployed file. In this file the original filename and the hashcode are written. This basically means you can generate the hashcode with the same algorithm and then check whether the hash in the .md5 file is the same as the generated one. If it is not, it is a good guess that the file you downloaded is not the one published by apache.org.

Hope this helps you understand the issue. If there are more questions concerning this, just go ahead and ask.

R,

Markus

> > I just browsed a little around and found some special solutions for the
> > checksum stuff with MD5-Hashes. ANT has already a nice task for this
>
> How would we integrate with it? Is the task part of 'core'? I'm not against
> leveraging others, especially ant, 'cos I suspect that'll be a large part
> of our user base. So, as a start, I'd be for it.
>
> That said, longer term I'd love to see it for command line also.
>
> BTW: Are you at a point where you can explain the mechanics of this? What
> keys does one use to check an MD5? Where do the keys come from, can we
> trust them, etc.? Can you educate us all?
>
> regards,
>
> Adam
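The check described in this message (recompute the hash with the same algorithm, then compare it with the published one) can be sketched in Python with the standard hashlib module. This is only an illustration; the thread itself is about Java and Ant tooling, and the function names below are hypothetical.

```python
import hashlib


def md5_of_stream(stream, chunk_size=65536):
    """Return the MD5 hex digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Return the MD5 hex digest of the file at the given path."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


def matches_published(stream, published_hex):
    """True if the stream's digest equals the published hash (no keys involved)."""
    return md5_of_stream(stream) == published_hex.strip().lower()
```

No key or password enters the computation, which is why a matching hash only shows that the download is intact relative to whoever published the .md5 file, not who that publisher was.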
Re: MD5 Hash
> I just browsed a little around and found some special solutions for the
> checksum stuff with MD5-Hashes. ANT has already a nice task for this

How would we integrate with it? Is the task part of 'core'? I'm not against leveraging others, especially ant, 'cos I suspect that'll be a large part of our user base. So, as a start, I'd be for it.

That said, longer term I'd love to see it for command line also.

BTW: Are you at a point where you can explain the mechanics of this? What keys does one use to check an MD5? Where do the keys come from, can we trust them, etc.? Can you educate us all?

regards,

Adam
Re: [GUMP@lsd]: depot/depot-version failed
This is telling us that the build is working, but that the output (jar) is not placed/named as Gump has been told it is.

> [jar] Building jar: /data/gump/depot/version/dist/depot-version-20040211.jar

and:

> - Error - Missing Output: /data/gump/depot/version/build/version-20040211.jar

I'll change the Gump descriptor.

regards,

Adam

----- Original Message -----
From: "Adam Jack" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, February 11, 2004 8:00 AM
Subject: [EMAIL PROTECTED]: depot/depot-version failed

> Project: depot-version
> State: Failed
> URL: http://lsd.student.utwente.nl/gump/depot/depot-version.html
>
> - G U M P Y
>
> Annotations:
> - Info - Set
> - Info - Dependency on ant exists, no need to add for property ant.home.
> - Error - Failed with reason missing build outputs
> - Error - Missing Output: /data/gump/depot/version/build/version-20040211.jar
> - Error - No such directory (where output is expected) : /data/gump/depot/version/build
>
> - G U M P Y
> Work Name: build_depot_depot-version (Type: Build)
> State: Success
> Elapsed: 0 hours, 0 minutes, 34 seconds
> Command Line: java -Djava.awt.headless=true org.apache.tools.ant.Main -Dbuild.clonevm=true -Dgump.merge=/data/gump/gump/work/merge.xml -Dbuild.sysclasspath=only -Dant.home=/data/gump/ant/dist -DDATE_STAMP=20040211 -f build.xml gump
> [Working Directory: /data/gump/depot/version]
> -
> Buildfile: build.xml
>
> init:
> [echo] Version Library 0.1d1
>
> prepare:
> [mkdir] Created dir: /data/gump/depot/version/target
> [mkdir] Created dir: /data/gump/depot/version/target/classes
> [mkdir] Created dir: /data/gump/depot/version/target/docs
> [mkdir] Created dir: /data/gump/depot/version/target/docs/api
> [mkdir] Created dir: /data/gump/depot/version/target/tests
>
> static:
>
> compile:
> [javac] Compiling 182 source files to /data/gump/depot/version/target/classes
>
> gump:
> [mkdir] Created dir: /data/gump/depot/version/dist
> [jar] Building jar: /data/gump/depot/version/dist/depot-version-20040211.jar
> [mkdir] Created dir: /data/gump/depot/version/dist/lib
>
> BUILD SUCCESSFUL
> Total time: 32 seconds
> -
>
> - G U M P Y
> RSS: http://lsd.student.utwente.nl/gump/depot/depot-version.rss | Atom: http://lsd.student.utwente.nl/gump/depot/depot-version.atom
>
> --
> Gump http://jakarta.apache.org/gump
> [lsd]
MD5 Hash
Hello,

I just browsed a little around and found some special solutions for the checksum stuff with MD5-Hashes. ANT has already a nice task for this, which I did not know about. Anyway, what do you think, should we use this? This basically means a tight integration with ant. Any comments on this one?

R,

Markus

To think without knowing makes the coincidence the ruler...
Re: WWW Sites
Hi,

+1 for the umbrella site, another +1 for the separation of version and ruper.

The build process for ruper is pretty much done for the ANT 1.5 build. I would like to step up and use Nick's basics to make a build for ANT 1.6. This seems to be another move.

Concerning the separation of version and ruper: there are some dependencies in ruper, some of which could be resolved easily, others not. I would like to set up a kind of plugin structure for ruper. That means another subdirectory in the src directory (src/adapter). I hope that everybody agrees that ant is a core dependency and that we therefore leave its directory at the top source level (src/ant) and not under (src/adapter/ant).

R,

Markus

Adam R. B. Jack wrote:
> All,
>
> We need to get a WWW site up, to give us a face, and help us get
> momentum/community.
>
> Currently we have two things in depot -- ruper and version -- and we could
> have more (w/ Avalon folks, when we invite them, a TODO in itself.) As such
> we could have one depot site that references the others, or we could
> combine them into one site.
>
> My +1 for an umbrella site. [I'd like to keep version separate from ruper,
> for good separation if nothing more, and I'd like to use it separately.]
>
> regards
>
> Adam
[GUMP@lsd]: depot/depot-version failed
Project: depot-version State: Failed URL: http://lsd.student.utwente.nl/gump/depot/depot-version.html - G U M P Y Annotations: - Info - Set http://lsd.student.utwente.nl/gump/depot/depot-version.rss | Atom: http://lsd.student.utwente.nl/gump/depot/depot-version.atom -- Gump http://jakarta.apache.org/gump [lsd]