Re: MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Mark R. Diggory
Well, after my own little survey, I've determined the following:
md5 on BSD (Apache Minotaur):
[EMAIL PROTECTED]:/home/mdiggory> md5 foo.bar
MD5 (foo.bar) = 7f5e787ff3b930d906d01243ccf7c237
md5 has no built-in option to compare the file to the checksum and
return true/false.

Output of md5sum (GNU textutils) on Redhat:
[EMAIL PROTECTED]:/home/mdiggory> md5sum foo.bar
7f5e787ff3b930d906d01243ccf7c237 foo.bar
md5sum has a built-in option which compares the md5 recorded in the
checksum file against the original file.

[EMAIL PROTECTED] mdiggory]$ md5sum -c foo.bar.md5
foo.bar: OK
When Maven publishes to the repository, its output is the md5 string minus
the filename, and it depends on GNU md5sum.

*example snippet of the command as it's run in Jelly*

  cd ${directory};
  md5sum ${artifactName} | sed 's/ .*$//' | tee ${artifactName}.md5;
  chgrp ${maven.repository.group} *;
  chmod g+w,a+r *;

This results in a bare checksum string with no filename on ibiblio, and it
actually fails on Minotaur, since it's BSD and the md5sum executable is not
present.

Which way is right or wrong is not really a reasonable question to ask.
The important questions to throw around are how to appropriately deal with
the variants in md5/md5sum generation and file structure, specifically in
relation to the repository.

My opinions are the following:
Server-side, OS-dependent tools are usually accessed in scripts (say, in
a cron script which does checking and reports errors). These scripts
will always be unique to an OS, and they will often be custom-written for
that particular need. The author usually writes their own
string parsing routines (e.g. md5sum foo.bar | sed 's/ .*$//').

A client-side tool needs a simple and standard means of validating the
content it is about to download or upload onto a server. If the
repository structure already enforces the name of the md5 file in
relation to the file name, any internal naming done inside the md5 file
is redundant. It would be good to just have the file contain the
checksum, which reduces parsing requirements on both the server and the
client.

Client tools should be robust enough (or extensible enough) to generate 
the appropriate md5 sum for a particular artifact and to easily find and 
read/compare it to the content on the server.
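
To make the "robust enough" point concrete, here is a minimal Python sketch of a client-side reader that accepts the three layouts shown earlier in this message: the BSD `md5` line, the GNU `md5sum` line, and the bare checksum that Maven writes. The function name `parse_md5_text` and the regular expressions are illustrative assumptions, not part of any existing tool, and the apache.org "[filename - MD5 Hash]" layout mentioned elsewhere in the thread is deliberately not handled because its exact form is not shown here.

```python
import re

# Patterns for the three layouts seen in this thread (illustrative only).
_BSD = re.compile(r"^MD5 \((?P<name>.+)\) = (?P<hash>[0-9a-fA-F]{32})$")
_BARE = re.compile(r"^(?P<hash>[0-9a-fA-F]{32})$")
_GNU = re.compile(r"^(?P<hash>[0-9a-fA-F]{32}) [ *]?(?P<name>.+)$")


def parse_md5_text(text):
    """Return (hex_hash, filename_or_None) from the first non-empty line."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        for pattern in (_BSD, _BARE, _GNU):
            match = pattern.match(line)
            if match:
                # The bare layout has no "name" group, so this yields None.
                name = match.groupdict().get("name")
                return match.group("hash").lower(), name
        raise ValueError("unrecognised md5 line: %r" % line)
    raise ValueError("empty md5 file")
```

With a reader like this, the client only ever compares the returned hash, so the optional filename inside the file stays redundant, as argued above.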

-Mark
Markus M. May wrote:
> Hello Mark,
> this is probably my fault. I checked this whole stuff with a very old
> maven.md5-file. The format is now equal between the two projects.
>
> Sorry for the confusion.
> Markus

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu


Re: MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Markus M. May
Hello Mark,
this is probably my fault. I checked this whole stuff with a very old 
maven.md5-file. The format is now equal between the two projects.

Sorry for the confusion.
Markus



Re: MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Adam R. B. Jack
> Working together, I believe Depot, Repository, Maven, ... can all come
> to a common agreement on the Apache Repository Structure. The separate
> groups maintaining different views provide the "tension" necessary for
> growth of an agreement. The key is eventual compromise and
> non-possessiveness in all the parties involved.

Well said. Count us in on that.

regards

Adam


Re: MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Mark R. Diggory

Adam R. B. Jack wrote:
> > Adam is perfectly right about this stuff. There is one more thing we need
> > to think about. Some repositories treat md5-files differently. The
> > structure on apache.org is [filename - MD5 Hash]. But on ibiblio
> > (maven-repository) it is just [MD5 Hash]. So this needs to be somehow
> > configurable.
>
> I think we need to shoot for what is considered the Apache Repository, none
> other. What Maven does w/ ibiblio will clearly impact us, but ought to be
> secondary.

A standard file format for md5 is more important in the long run, I
think, than the way either Apache in general or, more specifically, the
Maven project generates the file contents of md5 checksums.

Currently, neither Apache nor Maven md5s can be validated using the
standard FSF GNU md5sum implementation.

> That said, Apache Repository is about to become Maven Repository (taken over
> by the Maven team) unless we help Apache get its act together. Still, it
> might not be a bad thing, we expect them to be a primary publisher.
>
> regards
>
> Adam

Working together, I believe Depot, Repository, Maven, ... can all come
to a common agreement on the Apache Repository Structure. The separate
groups maintaining different views provide the "tension" necessary for
growth of an agreement. The key is eventual compromise and
non-possessiveness in all the parties involved.

-Mark


MD5 Standards (was Re: MD5 and Mirrors ( was Re: MD5 Hash ))

2004-02-11 Thread Mark R. Diggory
Besides, my current experiments with GNU md5sum (2.0.21) show that the
sums on the Maven contents aren't verifiable by any tool other than the
Maven checksum plugin.

If they aren't verifiable by external tools, that's a bad situation. I'm
going to bring this up on the Maven list too.

http://www.faqs.org/rfcs/rfc1321.html
A hard, fast "dig" through the RFC suggests a loophole here, as there is
no reference to what the contents of an md5 signature file should look
like. It seems to be more of an inherent "suggestion" in the implementation
itself.
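
Since RFC 1321 is silent on the file layout, the closest thing to a convention is what GNU `md5sum -c` reads: the hex digest, two spaces, then the filename. A small Python sketch of producing such a line follows; the helper name `gnu_md5_line` is hypothetical, and this is only one possible way to make the published sums checkable by external tools, not what Maven or Apache actually do.

```python
import hashlib


def gnu_md5_line(filename, data):
    """Build one checksum line: hex digest, two spaces, filename, newline."""
    digest = hashlib.md5(data).hexdigest()
    return "%s  %s\n" % (digest, filename)
```

Writing that line to `foo.bar.md5` next to `foo.bar` should let a plain `md5sum -c foo.bar.md5` succeed, at the cost of clients that expect a bare checksum having to strip the filename.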

-Mark
Mark R. Diggory wrote:
> It's a tough call. Is there any "standard" for the structure of the md5
> contents out there? I think the Maven team would be keen to play along
> with a standard and yet play along with any configurability as well.
>
> -Mark Diggory
>
> Markus M. May wrote:
> > Adam is perfectly right about this stuff. There is one more thing we
> > need to think about. Some repositories treat md5-files differently. The
> > structure on apache.org is [filename - MD5 Hash]. But on ibiblio
> > (maven-repository) it is just [MD5 Hash]. So this needs to be somehow
> > configurable.
> >
> > One more thing to think about :-)


Re: MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Mark R. Diggory
It's a tough call. Is there any "standard" for the structure of the md5
contents out there? I think the Maven team would be keen to play along
with a standard and yet play along with any configurability as well.

-Mark Diggory
Markus M. May wrote:
> Adam is perfectly right about this stuff. There is one more thing we need to
> think about. Some repositories treat md5-files differently. The structure on
> apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it is
> just [MD5 Hash]. So this needs to be somehow configurable.
>
> One more thing to think about :-)
>
> > Nick wrote:
> >
> > > The MD5 should always come from the authoritative source (apache.org)
> > > using https.
> >
> > I'm not sure if all environments (JVMs) have HTTPS available. In a
> > somewhat perfect world we'd try HTTPS and if it failed try HTTP, unless
> > some 'minimum security' was requested.
> >
> > I think we'll have to experiment and experience this area over
> > time/iterations.
> >
> > > How are we going to know what the "authoritative" source for a resource
> > > is?
> > > For java we could enforce a reverse domain name.
> >
> > Four things:
> >
> > 1) Repository URI/URL is what it is (whatever it is) and the URL for the
> > MD5 ought be the URL for the resources plus ".md5" on the end.
> >
> > 2) As current Ruper thinking (coding) goes ... Mirrors ought mirror the
> > hierarchy, so wherever a resource is in the repo, the .md5 ought be next
> > to it, and the original .md5 ought be in exactly the same relative
> > position (just relative to an apache root).
> >
> > 3) Mirroring is kinda hacked into Ruper right now, it silently moves the
> > root of a repository (originally set relative to the mirror locator CGI
> > script) to one such mirror. As such Ruper doesn't really know about
> > mirrors.
> >
> > 4) We probably need to rethink current thinking... ;-)
> >
> > regards,
> >
> > Adam



Re: MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Adam R. B. Jack

> Adam is perfectly right about this stuff. There is one more thing we need
> to think about. Some repositories treat md5-files differently. The structure
> on apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it
> is just [MD5 Hash]. So this needs to be somehow configurable.

I think we need to shoot for what is considered the Apache Repository, none
other. What Maven does w/ ibiblio will clearly impact us, but ought to be
secondary.

That said, Apache Repository is about to become Maven Repository (taken over
by the Maven team) unless we help Apache get its act together. Still, it
might not be a bad thing, we expect them to be a primary publisher.

regards

Adam



Re: MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Markus M. May
Adam is perfectly right about this stuff. There is one more thing we need to
think about. Some repositories treat md5-files differently. The structure on
apache.org is [filename - MD5 Hash]. But on ibiblio (maven-repository) it is
just [MD5 Hash]. So this needs to be somehow configurable.

One more thing to think about :-)

> Nick wrote:
>
> > The MD5 should always come from the authoritative source (apache.org)
> > using https.
>
> I'm not sure if all environments (JVMs) have HTTPS available. In a
> somewhat perfect world we'd try HTTPS and if it failed try HTTP, unless
> some 'minimum security' was requested.
>
> I think we'll have to experiment and experience this area over
> time/iterations.
>
> > How are we going to know what the "authoritative" source for a resource
> > is?
> > For java we could enforce a reverse domain name.
>
> Four things:
>
> 1) Repository URI/URL is what it is (whatever it is) and the URL for the
> MD5 ought be the URL for the resources plus ".md5" on the end.
>
> 2) As current Ruper thinking (coding) goes ... Mirrors ought mirror the
> hierarchy, so wherever a resource is in the repo, the .md5 ought be next
> to it, and the original .md5 ought be in exactly the same relative position
> (just relative to an apache root).
>
> 3) Mirroring is kinda hacked into Ruper right now, it silently moves the
> root of a repository (originally set relative to the mirror locator CGI
> script) to one such mirror. As such Ruper doesn't really know about
> mirrors.
>
> 4) We probably need to rethink current thinking... ;-)
>
> regards,
>
> Adam



MD5 and Mirrors ( was Re: MD5 Hash )

2004-02-11 Thread Adam R. B. Jack
Nick wrote:

> The MD5 should always come from the authoritative source (apache.org)
> using https.

I'm not sure if all environments (JVMs) have HTTPS available. In a somewhat
perfect world we'd try HTTPS and if it failed try HTTP, unless some 'minimum
security' was requested.

I think we'll have to experiment and experience this area over
time/iterations.
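
As a starting point for that experimentation, here is a rough Python sketch of the "try HTTPS, fall back to HTTP unless a minimum security is requested" idea. Everything named here is an assumption for illustration: `fetch_md5`, the `require_secure` flag, and especially `fetch`, which stands in for whatever download routine the tool really uses and is expected to raise `OSError` on failure.

```python
def fetch_md5(resource_url, fetch, require_secure=False):
    """Fetch the .md5 for a resource, preferring https over http.

    `fetch` is a caller-supplied function (hypothetical) that takes a URL,
    returns the body as text, and raises OSError when the request fails.
    """
    # Drop the scheme so both variants can be built from the same path.
    path = resource_url.split("://", 1)[1]
    try:
        return fetch("https://" + path + ".md5")
    except OSError:
        if require_secure:
            # The caller asked for a minimum level of security: no fallback.
            raise
        return fetch("http://" + path + ".md5")
```

Passing the fetcher in keeps the policy separate from the transport, so the same logic can be tried in environments with and without HTTPS support.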

> How are we going to know what the "authoritative" source for a resource
> is?
> For java we could enforce a reverse domain name.

Four things:

1) Repository URI/URL is what it is (whatever it is) and the URL for the MD5
ought be the URL for the resources plus ".md5" on the end.

2) As current Ruper thinking (coding) goes ... Mirrors ought mirror the
hierarchy, so wherever a resource is in the repo, the .md5 ought be next to
it, and the original .md5 ought be in exactly the same relative position
(just relative to an apache root).

3) Mirroring is kinda hacked into Ruper right now, it silently moves the
root of a repository (originally set relative to the mirror locator CGI
script) to one such mirror. As such Ruper doesn't really know about mirrors.

4) We probably need to rethink current thinking... ;-)
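
Points 1 and 2 above can be sketched in a few lines of Python. This is only an illustration of the URL arithmetic, assuming the mirror really does reproduce the hierarchy; the function names and the example roots used in the comments are made up and are not how Ruper currently does it.

```python
def md5_url(resource_url):
    """Point 1: the .md5 URL is the resource URL plus ".md5"."""
    return resource_url + ".md5"


def original_md5_url(mirror_resource_url, mirror_root, apache_root):
    """Point 2: same relative position, just relative to the apache root.

    For example, a resource under a hypothetical mirror root such as
    http://mirror.example.org/apache/ maps to the same relative path
    under the apache root that the caller supplies.
    """
    if not mirror_resource_url.startswith(mirror_root):
        raise ValueError("resource is not under the given mirror root")
    relative = mirror_resource_url[len(mirror_root):]
    return md5_url(apache_root.rstrip("/") + "/" + relative.lstrip("/"))
```

This only works if the tool remembers both roots, which is exactly what point 3 says Ruper does not do at the moment.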

regards,

Adam



Re: MD5 Hash

2004-02-11 Thread Nick Chalko
Adam R. B. Jack wrote:
> Hmm, what makes folk think that the file could be changed without the MD5
> hash file being changed also. I feel there has to be some private key from
> the originator, to ensure that nobody could fake both.

The MD5 should always come from the authoritative source (apache.org)
using https.
How are we going to know what the "authoritative" source for a resource
is?
For java we could enforce a reverse domain name,
i.e. packages like org.apache must get their md5 from an apache.org
website.
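
A minimal Python sketch of the reverse-domain rule, purely as an illustration: it assumes the first two labels of the package name give the domain (so `org.apache.tools.ant` maps to `apache.org`), which would not hold for prefixes such as `uk.co.*`. Both function names are hypothetical.

```python
from urllib.parse import urlparse


def authoritative_domain(package_name, labels=2):
    """Reverse the leading labels of a package name into a domain."""
    parts = package_name.split(".")
    if len(parts) < labels:
        raise ValueError("package name too short: %r" % package_name)
    return ".".join(reversed(parts[:labels]))


def is_authoritative(md5_url, package_name):
    """True if the .md5 URL is served from the package's own domain."""
    host = (urlparse(md5_url).hostname or "").lower()
    domain = authoritative_domain(package_name).lower()
    return host == domain or host.endswith("." + domain)
```
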
> So, if there are such keys, how do we acquire them? How do we trust them?
> regards
> Adam
 




Re: MD5 Hash

2004-02-11 Thread Markus M. May
I am thinking in the same direction. First I will try to compare the generated
hash with the hash from the mirror. In a second step I will then try to
determine the original .md5 file and compare against that one.
Basically the web-of-trust is pretty hard to automate right now. You already
have a KEYS file with quite a lot of keys, but you cannot tell which key
signed the file. There is no way to do this (or I missed it). So right now, I
will concentrate on the MD5 stuff.
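
The two-step check described at the top of this message could look roughly like the following Python sketch. The function name and the result strings are invented for illustration, and it assumes both .md5 files have already been reduced to a bare hex string.

```python
import hashlib


def check_download(data, mirror_md5, original_md5=None):
    """Compare a download against the mirror's hash, then the original's.

    Returns one of "corrupt", "mirror-only", "mismatch-with-origin", "ok".
    """
    actual = hashlib.md5(data).hexdigest()
    # Step 1: does the download match what the mirror itself advertises?
    if actual != mirror_md5.strip().lower():
        return "corrupt"
    # Step 2: if the original .md5 could be found, it has the final say.
    if original_md5 is None:
        return "mirror-only"
    if actual != original_md5.strip().lower():
        return "mismatch-with-origin"
    return "ok"
```

Note that a pass in step 1 alone only says the download arrived intact from the mirror; it says nothing about whether the mirror's copy matches what apache.org published.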

Markus

> > Basically the MD5 Hash does not need keys.
> > [...]
> > Also apache.org delivers a file named .asc
>
> Ok, thanks, I get it now (I think.)
>
> This explains some of the negative comments I've heard about MD5 then (it
> not being too strong). I read somewhere, on one Apache list, that folks will
> be ok with this being strong enough though. What will be tricky for us,
> should we choose to attempt it, will be supporting mirrors yet using the
> original MD5 from Apache...
>
> Since ASC has keys, that ties in to the 'web of trust' that Apache is
> working on, I think. Once one trusts a certain set of keys, those keys can
> be used to verify others that are acquired, and those can be used to verify
> the ASC. This is much harder to automate, but something we could aspire to...
>
> regards,
>
> Adam



Re: MD5 Hash

2004-02-11 Thread Adam R. B. Jack
> Basically the MD5 Hash does not need keys.
> [...]
> Also apache.org delivers a file named .asc

Ok, thanks, I get it now (I think.)

This explains some of the negative comments I've heard about MD5 then (it
not being too strong). I read somewhere, on one Apache list, that folks will
be ok with this being strong enough though. What will be tricky for us,
should we choose to attempt it, will be supporting mirrors yet using the
original MD5 from Apache...

Since ASC has keys, that ties in to the 'web of trust' that Apache is
working on, I think. Once one trusts a certain set of keys, those keys can be
used to verify others that are acquired, and those can be used to verify the
ASC. This is much harder to automate, but something we could aspire to...

regards,

Adam



Re: MD5 Hash

2004-02-11 Thread Markus M. May
Hello once again,

> > yes, I can enlighten you all a little bit about MD5 hashes. The basic is
> > that
>
> Ok, thanks for that. I get the gist, I get the premise. Now, more
> practically...
>
> What are the inputs to the algorithm? Meaning, we have the file, we have
> the MD5 resultant hash (assuming the file on the server has not been
> modified), and we have the algorithm, but do we need anything else (e.g.
> keys) in order to re-compute/check the resultant hash?

Basically the MD5 hash does not need keys. It is generated from the file
itself without any password or anything like that. The code is just a hash
code of the file (a hex number).
> 
> Hmm, what makes folk think that the file could be changed without the MD5
> hash file being changed also. I feel there has to be some private key from
> the originator, to ensure that nobody could fake both.
> 
As stated earlier, there are no keys involved. Since a normal user uses a
mirror to download apache.org sources or binaries, you can then check whether
the file has the same hash code as the original file from apache.org (this can
be checked by using the original .md5 file from apache).
apache.org also delivers a file named .asc (at least some projects, like ant,
do this). This file contains a signature for the original file, which can
then be checked using the public key stored in the KEYS file in the root
directory of each project. But this has nothing really to do with the MD5
stuff. MD5 basically just ensures integrity during the download, but does not,
as you said, ensure that the file is really the one which was published or
intended to be published.

> So, if there are such keys, how do we acquire them? How do we trust them?
> 
> regards
> 
> Adam
> 


R,

Markus



Re: MD5 Hash

2004-02-11 Thread Adam R. B. Jack
> yes, I can enlighten you all a little bit about MD5 hashes. The basic is
> that

Ok, thanks for that. I get the gist, I get the premise. Now, more
practically...

What are the inputs to the algorithm? Meaning, we have the file, we have the
MD5 resultant hash (assuming the file on the server has not been modified),
and we have the algorithm, but do we need anything else (e.g. keys) in order
to re-compute/check the resultant hash?

Hmm, what makes folk think that the file could be changed without the MD5
hash file being changed also. I feel there has to be some private key from
the originator, to ensure that nobody could fake both.

So, if there are such keys, how do we acquire them? How do we trust them?

regards

Adam



Re: MD5 Hash

2004-02-11 Thread Markus M. May
Hello,

yes, I can enlighten you all a little bit about MD5 hashes. The basic idea is
that a hash is a practically unique key for a value (in this case a file). From
this key you cannot guess or regenerate the original value (file). It is
generated with the MD5 algorithm using Java's security classes. So basically,
when a file is published on the apache.org servers, its MD5 hash is generated.
When the file is updated, the hash of the updated file is normally not the same
(there is only a very, very small chance that it is). So, basically, for each
file a new hash is generated. You can then create another hash from the same
file. If you are using the same algorithm (MD5/SHA) you get the same hash
(meaning: from the same file you always get the same hash code when using the
same algorithm). On the apache.org servers there is always an .md5 file for
each deployed file. This file contains the original filename and the hash
code. This basically means you can generate the hash code with the same
algorithm and then check whether the hash in the .md5 file is the same as the
generated one. If it is not, it is a good guess that the file you downloaded
is not the one published by apache.org.
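
The check described above takes no keys at all, only the file and the published hash. A small Python sketch using the standard `hashlib` module (the thread itself is about Java, so this is a stand-in for illustration; the function names are made up):

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's MD5 equals the hash taken from the .md5 file."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Running `md5_of_file` twice on the same unchanged file gives the same hex string, which is the property the whole scheme relies on.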

Hope this helps you understand the issue. If there are more questions
concerning this, just go ahead and ask.

R,

Markus


> > I just browsed a little around and found some special solutions for the
> > checksum stuff with MD5-Hashes. ANT has already a nice task for this
>
> How would we integrate with it? Is the task part of 'core'? I'm not
> against leveraging others, especially ant, 'cos I suspect that'll be a
> large part of our user base. So, as a start, I'd be for it.
>
> That said, longer term I'd love to see it for command line also.
>
> BTW: Are you at a point where you can explain the mechanics of this? What
> keys does one use to check an MD5? Where do the keys come from, can we
> trust them, etc.? Can you educate us all?
>
> regards,
>
> Adam



Re: MD5 Hash

2004-02-11 Thread Adam R. B. Jack
> I just browsed a little around and found some special solutions for the
> checksum stuff with MD5-Hashes. ANT has already a nice task for this

How would we integrate with it? Is the task part of 'core'? I'm not against
leveraging others, especially ant, 'cos I suspect that'll be a large part of
our user base. So, as a start, I'd be for it.

That said, longer term I'd love to see it for command line also.

BTW: Are you at a point where you can explain the mechanics of this? What
keys does one use to check an MD5? Where do the keys come from, can we trust
them, etc.? Can you educate us all?

regards,

Adam



Re: [GUMP@lsd]: depot/depot-version failed

2004-02-11 Thread Adam R. B. Jack
This is telling us that the build is working, but that the output (jar) is
not placed/named as Gump has been told it is.

>   [jar] Building jar: /data/gump/depot/version/dist/depot-version-20040211.jar

and:

>  - Error - Missing Output: /data/gump/depot/version/build/version-20040211.jar

I'll change the Gump descriptor.

regards,

Adam
----- Original Message -----
From: "Adam Jack" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, February 11, 2004 8:00 AM
Subject: [EMAIL PROTECTED]: depot/depot-version failed


> Project: depot-version
> State: Failed
> URL: http://lsd.student.utwente.nl/gump/depot/depot-version.html
> - G U M P Y
>
>
> Annotations:
>  - Info - Set   - Info - Dependency on ant exists, no need to add for property ant.home.
>  - Error - Failed with reason missing build outputs
>  - Error - Missing Output: /data/gump/depot/version/build/version-20040211.jar
>  - Error - No such directory (where output is expected) : /data/gump/depot/version/build
>
>
> - G U M P Y
> Work Name: build_depot_depot-version (Type: Build)
> State: Success
> Elapsed: 0 hours, 0 minutes, 34 seconds
> Command Line: java -Djava.awt.headless=true org.apache.tools.ant.Main -Dbuild.clonevm=true -Dgump.merge=/data/gump/gump/work/merge.xml -Dbuild.sysclasspath=only -Dant.home=/data/gump/ant/dist -DDATE_STAMP=20040211 -f build.xml gump
> [Working Directory: /data/gump/depot/version]
> -
> Buildfile: build.xml
>
> init:
>  [echo]  Version Library 0.1d1 
>
> prepare:
> [mkdir] Created dir: /data/gump/depot/version/target
> [mkdir] Created dir: /data/gump/depot/version/target/classes
> [mkdir] Created dir: /data/gump/depot/version/target/docs
> [mkdir] Created dir: /data/gump/depot/version/target/docs/api
> [mkdir] Created dir: /data/gump/depot/version/target/tests
>
> static:
>
> compile:
> [javac] Compiling 182 source files to /data/gump/depot/version/target/classes
>
> gump:
> [mkdir] Created dir: /data/gump/depot/version/dist
>   [jar] Building jar: /data/gump/depot/version/dist/depot-version-20040211.jar
> [mkdir] Created dir: /data/gump/depot/version/dist/lib
>
> BUILD SUCCESSFUL
> Total time: 32 seconds
> -
>
> - G U M P Y
> RSS: http://lsd.student.utwente.nl/gump/depot/depot-version.rss | Atom: http://lsd.student.utwente.nl/gump/depot/depot-version.atom
>
> --
> Gump http://jakarta.apache.org/gump
> [lsd]
>



MD5 Hash

2004-02-11 Thread Markus M. May
Hello,
I just browsed around a little and found some special solutions for the
checksum stuff with MD5 hashes. Ant already has a nice task for this, which I
did not know about. Anyway, what do you think, should we use this? This
basically means a tight integration with Ant.

Any comments on this one? 

R,

Markus


To think without knowing makes the coincidence the ruler...





Re: WWW Sites

2004-02-11 Thread Markus M. May
Hi,
+1 for the umbrella site,
another +1 for the separation of version and ruper.
The build process for ruper is pretty much done for the ANT 1.5 build. I
would like to step up and use Nick's basics to make a build for ANT 1.6.
This seems to be another move.
Concerning the separation of version and ruper: there are some
dependencies in ruper which could be resolved easily, others not. I
would like to set up a kind of plugin structure for ruper. This means another
subdirectory in the src directory (src/adapter). I hope that everybody
agrees that ant is a core dependency and that we therefore leave its directory
at the top source level (src/ant) and not under the adapters (src/adapter/ant).

R,
Markus
Adam R. B. Jack wrote:
> All,
>
> We need to get a WWW site up, to give us a face, and help us get
> momentum/community. Currently we have two things in depot -- ruper and
> version -- and we could have more (w/ Avalon folks, when we invite them, a
> TODO in itself.) As such we could have one depot site that references the
> others, or we could combine them into one site.
>
> My +1 for an umbrella site. [I'd like to keep version separate from ruper,
> for good separation if nothing more, and I'd like to use it separately.]
>
> regards
>
> Adam




[GUMP@lsd]: depot/depot-version failed

2004-02-11 Thread Adam Jack
Project: depot-version
State: Failed
URL: http://lsd.student.utwente.nl/gump/depot/depot-version.html
- G U M P Y


Annotations:
 - Info - Set
RSS: http://lsd.student.utwente.nl/gump/depot/depot-version.rss | Atom: http://lsd.student.utwente.nl/gump/depot/depot-version.atom

--
Gump http://jakarta.apache.org/gump
[lsd]