Hi Alexander,

Stupid question maybe, but why can't you use your given name for this?

Caveat: please don't let this email stand in the way of your giving your gift.

- Steph

----- Original Message -----
From: "Solar Designer" <[EMAIL PROTECTED]>
To: "Dmitry Stogov" <[EMAIL PROTECTED]>
Cc: "Stanislav Malyshev" <[EMAIL PROTECTED]>; "Andi Gutmans" <[EMAIL PROTECTED]>; "PHP Internals List" <internals@lists.php.net>
Sent: Tuesday, February 05, 2008 11:50 PM
Subject: Re: [PHP-DEV] faster & public domain MD5 implementation


Hi Dmitry and all,

First of all, please accept my apologies for failing to find the time to
participate in the licensing issues discussion in December.  It is a
topic that I would like to discuss and reach a conclusion on, as I often
happen to write code that I'd like to release to the public under the most
relaxed terms possible.  I thought that not claiming copyright (or even
disclaiming copyright) and placing the code into the public domain would
be it, but apparently in many (most? all?) jurisdictions there's no
explicitly specified way for someone to place their works into the
public domain (although the concept of public domain does exist - and
stuff "falls" there as old copyrights expire) and now it has also been
mentioned that some jurisdictions don't even recognize public domain at
all (I have not yet seen/heard a lawyer state that, though).

A possible solution could be to simultaneously try to place stuff in the
public domain with a statement to that effect and license it to the
public under very liberal terms.  One issue with it is that I have to
not claim or disclaim copyright in order to place a work of mine into
the public domain, yet I have to be the copyright holder in order to
license that work.  Maybe this can be taken care of with a severability
clause, making either the public domain or the license work in any given
jurisdiction.  But I'd rather see/hear a lawyer comment on that before I
possibly go that route.

That said, a lot of software that we use has been placed in the public
domain by its authors.  This includes some software by D. J. Bernstein,
perhaps best known as the author of qmail, who is also known for the
Bernstein vs. United States litigation - http://cr.yp.to/export.html -
so perhaps he should know the law.  Then, public domain is officially
recognized as being compatible with GNU GPL by the FSF -
http://www.fsf.org/licensing/licenses/ - and is apparently recognized by
the OSI - http://opensource.org/node/239

On Tue, Feb 05, 2008 at 02:34:40PM +0300, Dmitry Stogov wrote:
We are going to include your md5() implementation into php-5.3.0.

Great!

I confirm at least 25% md5() speedup on my Core2 3GHz, however license
issues are not clear.
We are going to distribute files under standard PHP license including
your original copyright notes.
The files which are going to be committed are attached.

Please confirm your agreement.

Confirmed.  Please note, however, that there were no "copyright notes"
on my original files; instead, there was an authorship note and a public
domain statement.

I also have some comments on the modified files:

| Copyright (c) 1997-2008 The PHP Group |
...
| Author: Solar Designer <solar at openwall.com> |

So you claim copyright to a modified version of my code, that I had
placed in the public domain.  This is fine by me.

I do not formally require it (in fact, I can't), but maybe the "Author"
line could be changed to either:

  | Original author: Solar Designer <solar at openwall.com>              |

or:

  | Authors: Solar Designer <solar at openwall.com> with further         |
  | modifications by others.                                             |

(or you can make it more explicit, e.g. "... by The PHP Group" if that
is appropriate - or whatever).

/* MD5 context. */
typedef struct {
php_uint32 lo, hi;
php_uint32 a, b, c, d;
unsigned char buffer[64];
php_uint32 block[16];
} PHP_MD5_CTX;

Maybe it would be better to do:

typedef php_uint32 MD5_u32plus;

and use the latter type.  This would reduce the number of changes
between my version of the code and yours, making it easier for you to
sync to any newer versions of the code that I might make.
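
A minimal sketch of what that suggestion could look like in the header. The uint32_t stand-in for php_uint32 is an assumption made only so the fragment compiles on its own; in PHP the type comes from the PHP headers. The struct fields are the ones quoted above.

```c
#include <stdint.h>

/* Stand-in for PHP's php_uint32 so this sketch is self-contained (assumption). */
typedef uint32_t php_uint32;

/* Alias suggested in the email; the name implies "32 bits or wider". */
typedef php_uint32 MD5_u32plus;

/* MD5 context, using the alias instead of php_uint32 directly. */
typedef struct {
	MD5_u32plus lo, hi;
	MD5_u32plus a, b, c, d;
	unsigned char buffer[64];
	MD5_u32plus block[16];
} PHP_MD5_CTX;
```

With the alias in place, only the single typedef line differs from the upstream file, rather than every field declaration.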

| Author: Solar Designer <solar at openwall.com> |

If you do choose to change this in the .h file, then do the same in the
.c, obviously.

#if (defined(__APPLE__) || defined(__APPLE_CC__)) && (defined(__BIG_ENDIAN__) || defined(__LITTLE_ENDIAN__))
# if defined(__LITTLE_ENDIAN__)
#  undef WORDS_BIGENDIAN
# else
#  if defined(__BIG_ENDIAN__)
#   define WORDS_BIGENDIAN
#  endif
# endif
#endif

This looks wrong to me.  One of the specific properties of my
implementation is that it does not strictly depend on the endianness
being correctly specified at compile-time (or at all, for that matter).
However, if you do happen to use the (little-endian and unaligned-OK)
optimized code on a system that is not in fact little-endian or does not
in fact tolerate unaligned accesses, then problems will arise!  So any
#if's you use must assume (might-be-big-endian and might-disallow-unaligned)
by default.

In fact, I am only aware of three widespread and general-purpose
architectures that satisfy the criteria for the optimized code:

#if defined(__i386__) || defined(__x86_64__) || defined(__vax__)

Thus, I suggest that you leave the above #if intact, the way it was in
the patch that I submitted.  Do not explicitly check for any endianness
macros - this is bound to cause problems.
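
The "safe by default" pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: load_le32 is a hypothetical helper name, and the fast path uses memcpy rather than a pointer cast so that the sketch stays within ISO C. The architecture list is the one given in the email.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): read 4 input bytes in
 * little-endian byte order and return them as a host-order word.
 */
static uint32_t load_le32(const unsigned char *p)
{
#if defined(__i386__) || defined(__x86_64__) || defined(__vax__)
	/* Opt-in fast path: architectures known to be little-endian. */
	uint32_t w;
	memcpy(&w, p, sizeof(w));
	return w;
#else
	/*
	 * Default path: assemble the word byte by byte.  Correct on any
	 * endianness and for any alignment of p.
	 */
	return (uint32_t)p[0] |
	    ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) |
	    ((uint32_t)p[3] << 24);
#endif
}
```

The design choice is that an unknown or misdetected architecture falls into the portable branch, so the worst case is a slower load, never a wrong digest.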

/*
 * SET reads 4 input bytes in little-endian byte order and stores them
 * in a properly aligned word in host byte order.
 *
 * The check for little-endian architectures that tolerate unaligned
 * memory accesses is just an optimization.  Nothing will break if it
 * doesn't work.
 */
#ifndef WORDS_BIGENDIAN
# define SET(n) \
(*(php_uint32 *)&ptr[(n) * 4])
# define GET(n) \
SET(n)
#else
...

As explained above, I strongly recommend that you revert your "#ifndef
WORDS_BIGENDIAN" to my "#if ..."

What if an architecture is big-endian, but WORDS_BIGENDIAN just happens
to not be specified?  You'll have incorrect results (not MD5), whereas
with my version of the code, everything will be just fine.

Similarly, regardless of endianness, if WORDS_BIGENDIAN is not specified
(maybe because the architecture is in fact little-endian), but the
architecture does not tolerate unaligned accesses (either not at all, or
only via kernel emulation), things will go wrong (SIGBUS, or very poor
performance and a flood of kernel messages).  This issue can't occur
with my original #if that only lists specific known-safe architectures.

data = body(ctx, data, size & ~(unsigned long)0x3f);

If you change all of my unsigned long's to size_t, you should change
this one as well.

On a 64-bit system (userland pointer size), your size_t had better be
64-bit as well (I have not checked whether this is necessarily the case;
I hope so).
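
A minimal sketch of why the cast should follow the type, assuming a hypothetical helper named whole_blocks that is not in the patch. On a platform where unsigned long is 32 bits but size_t is 64 bits (64-bit Windows, for example), ~(unsigned long)0x3f is zero-extended when combined with a size_t, which would clear the upper 32 bits of the size. Building the mask in size_t avoids that.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): round size down to a
 * multiple of 64, the MD5 block size in bytes.  The mask is built in
 * size_t so that it is as wide as the value it is applied to.
 */
static size_t whole_blocks(size_t size)
{
	return size & ~(size_t)0x3f;
}
```

For example, whole_blocks(130) leaves 128 bytes for the block loop and 2 bytes to be buffered.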

PHPAPI void PHP_MD5Final(unsigned char *result, PHP_MD5_CTX *ctx)
{
unsigned long used, free;

Here's another one.

Thanks,

Alexander Peslyak <solar at openwall.com>
GPG key ID: 5B341F15 fp: B3FB 63F4 D7A3 BCCC 6F6E FC55 A2FC 027C 5B34 1F15
http://www.openwall.com - bringing security into open computing environments

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

