Hi all,
Some comments:
About caching: I am aware that DNS entries are cached by the
underlying OS (at least by the most successful ones :-)), but I liked it
nevertheless, as it might speed things up. I guess what I really mean is
that it could be an option that can be enabled or disabled through system
properties, or for selected versions/OSes.


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
Behalf Of Bryce McKinlay
Sent: Monday, April 17, 2000 7:08 AM
To: Aaron M. Renn
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: java.net: Classpath vs. libgcj Comparison


Thanks for these comparisons, they should be very useful.

"Aaron M. Renn" wrote:

> java.net.InetAddress

> Really liked the caching mechanism of the classpath implementation, this
> classpath class seems very strong, and is preferable to the libgcj version.
> This comparison should definitely include some real world testing, but
> unfortunately, I don't have time.

I disagree with this. Caching inside InetAddress is a misfeature. The
underlying OS already caches DNS entries. This cache is shared with other
processes, and the OS is in a better position to make decisions about what
addresses are cacheable, and for how long. This caching would be very
annoying for some applications if we don't provide a way to bypass it. For
example, what happens if you are trying to connect to a server with a
revolving DNS? With the classpath cache implementation, getByName() will
always return the same address until it expires from the cache.

In environments where native calls are expensive, there is an argument that
a cache will provide better performance. But well-written Java applications
tend to call getByName() once, and keep the returned InetAddress object
around for as long as they need it. In any case, I think the possibility of
getting stale results (and being out of sync with other applications on the
system) outweighs any benefits here.

How about a duration for the cache, as well as a fixed-size cache?
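
A minimal sketch of what a bounded, expiring cache with an on/off switch
could look like. Everything here is hypothetical and for illustration only:
the class name ExpiringLruCache, the property name
"example.inetaddress.cache", and the choice of LinkedHashMap are assumptions,
not code from Classpath or libgcj. Values are generic so the sketch stays
independent of InetAddress itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Classpath or libgcj.
class ExpiringLruCache<V> {
    // A cached value together with the time at which it stops being valid.
    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;

        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final Map<String, Timed<V>> entries;

    ExpiringLruCache(final int maxEntries, long ttlMillis) {
        this.ttlMillis = ttlMillis;
        // Access-ordered map: the eldest entry is the least recently used one,
        // and it is dropped as soon as the map grows past maxEntries.
        this.entries = new LinkedHashMap<String, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Timed<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Assumed property name; caching stays off unless it is set to "true".
    static boolean enabled() {
        return Boolean.getBoolean("example.inetaddress.cache");
    }

    // The caller supplies the clock value so expiry is easy to test.
    synchronized void put(String host, V value, long nowMillis) {
        entries.put(host, new Timed<V>(value, nowMillis + ttlMillis));
    }

    // Returns null when the host is unknown or its entry has expired.
    synchronized V get(String host, long nowMillis) {
        Timed<V> hit = entries.get(host);
        if (hit == null) {
            return null;
        }
        if (nowMillis >= hit.expiresAtMillis) {
            entries.remove(host);
            return null;
        }
        return hit.value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

A lookup routine would consult get() only when enabled() returns true, and
fall through to the native resolver otherwise, which gives applications the
bypass asked for above.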

Merging the two codebases is something that should really be considered on a
method-by-method (and line-by-line!) basis rather than class-by-class.
Here are a few examples of why this is important, taken from
java.net.InetAddress:

Example 1 - from the classpath's InetAddress:

int[] my_ip;

This is inefficient. IP addresses should be represented as byte[], as libgcj
does.
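
To illustrate the byte[] point: an IPv4 address needs 4 bytes of array
payload as byte[] versus 16 as int[] (array header overhead aside). The only
cost is masking when an octet is read, since Java bytes are signed. The
sketch below uses a hypothetical class name, Ipv4Bytes, and is not taken from
either codebase.

```java
// Hypothetical class for illustration only.
class Ipv4Bytes {
    private final byte[] address;

    Ipv4Bytes(int a, int b, int c, int d) {
        // Narrowing casts keep the low 8 bits of each octet.
        address = new byte[] { (byte) a, (byte) b, (byte) c, (byte) d };
    }

    // Java bytes are signed, so mask with 0xFF to recover the range 0..255.
    int octet(int index) {
        return address[index] & 0xFF;
    }

    String toDottedQuad() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < address.length; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(octet(i));
        }
        return sb.toString();
    }
}
```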

String hostname_alias;
long lookup_time;

I'm not sure what hostname_alias is for, but it looks like it has something
to do with reverse-lookup on cached entries? It's important to consider that
this, along with lookup_time, is adding extra bulk to every InetAddress
object that gets created, and in some applications could add up to
significant additional memory consumption.

Example 2: equals() implementation

from libgcj:

  public boolean equals (Object obj)
  {
    if (obj == null || ! (obj instanceof InetAddress))
      return false;
    // "The Java Class Libraries" 2nd edition says "If a machine has
    // multiple names instances of InetAddress for different name of
    // that same machine are not equal.  This is because they have
    // different host names."  This violates the description in the
    // JDK 1.2 API documentation.  A little experimentation
    // shows that the latter is correct.
    byte[] addr1 = address;
    byte[] addr2 = ((InetAddress) obj).address;
    if (addr1.length != addr2.length)
      return false;
    for (int i = addr1.length;  --i >= 0;  )
      if (addr1[i] != addr2[i])
        return false;
    return true;
  }

from classpath:

public boolean
equals(Object addr)
{
  if (!(addr instanceof InetAddress))
    return(false);

  byte[] test_ip = ((InetAddress)addr).getAddress();

  if (test_ip.length != my_ip.length)
    return(false);

  for (int i = 0; i < my_ip.length; i++)
    if (test_ip[i] != (byte)my_ip[i])
       return(false);

  return(true);
}

Although the classpath implementation perhaps looks a bit nicer here, there
are a few problems with it. It is slower, because it makes an unnecessary
call to getAddress(), and it has to cast my_ip[i] to byte because my_ip was
declared wrong to begin with. The libgcj implementation isn't perfect
either - it has a redundant check for null (instanceof will return false if
given a null operand). The comment in libgcj provides useful insight, and
should be included in the merged version.

I hate to be picky about comments, but the way things are commented in
libgcj is far from the 'standard'. I don't want to step on anyone's toes
here, but this Classpath/libgcj implementation is probably going to be more
like a reference implementation than a benchmark-killer in the foreseeable
future, therefore I tend to prefer clean coding, readability and
functionality over performance. IMO comments should be put in the header if
possible, and in the method body only for non-intuitive stuff. What I'm
trying to say is that _lots_ of people are going to look to this
implementation, therefore readability _has_ value.
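
One way both points could be combined, as a sketch only: compare the byte[]
field directly, drop the redundant null check, and move the rationale into
the header comment. MergedAddress is a hypothetical stand-in class, not the
actual merged InetAddress, and the hashCode() shown is an assumption added
to keep it consistent with equals().

```java
import java.util.Arrays;

// Hypothetical stand-in for the merged class; only the fields needed here.
class MergedAddress {
    private final byte[] address;

    MergedAddress(byte[] address) {
        this.address = (byte[]) address.clone();
    }

    /**
     * Two addresses are equal when their raw address bytes match. Host
     * names are not compared: the libgcj comment notes that this follows
     * the JDK 1.2 API documentation rather than "The Java Class Libraries".
     */
    @Override
    public boolean equals(Object obj) {
        // instanceof is false for null, so no separate null check is needed.
        if (!(obj instanceof MergedAddress)) {
            return false;
        }
        // Direct field access avoids the copy that getAddress() would make.
        byte[] other = ((MergedAddress) obj).address;
        if (address.length != other.length) {
            return false;
        }
        for (int i = 0; i < address.length; i++) {
            if (address[i] != other[i]) {
                return false;
            }
        }
        return true;
    }

    // Kept consistent with equals(): equal bytes give equal hash codes.
    @Override
    public int hashCode() {
        return Arrays.hashCode(address);
    }
}
```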


Example 3: getAddress()

libgcj implementation:

  public byte[] getAddress ()
  {
    // An experiment shows that JDK1.2 returns a different byte array each
    // time.  This makes sense, in terms of security.
    return (byte[]) address.clone();
  }

classpath implementation:

public byte[]
getAddress()
{
  byte[] addr = new byte[my_ip.length];

  for (int i = 0; i < my_ip.length; i++)
    {
      addr[i] = (byte)my_ip[i];
    }

  return(addr);
}

It's difficult to say which would be faster here - it's a question of the
overhead of an extra method call (to clone()) vs. the looping and setting of
classpath. In the absence of hard benchmarks, I would go with the libgcj
implementation here. It provides a useful comment, and the implementation is
simpler as well as probably being faster.

I disagree: the clone() method more than likely does exactly what is done in
the classpath version.
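
Either way, the two approaces are interchangeable as far as callers can tell.
The sketch below (CopyDemo is a hypothetical name, and both helpers work on
byte[] for simplicity) shows that each returns an independent array with the
same contents, which is the property the libgcj comment cares about. Which
one is faster depends on how the VM implements array clone(); that would
need measuring and is not something this sketch settles.

```java
// Hypothetical helper for illustration only.
class CopyDemo {
    // libgcj style: let the runtime copy the array.
    static byte[] copyWithClone(byte[] src) {
        return (byte[]) src.clone();
    }

    // classpath style: allocate a new array and copy element by element.
    static byte[] copyWithLoop(byte[] src) {
        byte[] copy = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            copy[i] = src[i];
        }
        return copy;
    }
}
```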

These are just some arbitrary examples I came up with because I happened to
have the two InetAddress implementations open on my screen, but the types of
issues here apply to all the classes that need to be merged. I just want to
stress that merging shouldn't be a case of "we'll take this class from
libgcj, that one from classpath", but something that needs to be done on a
lower level than that.

regards

  [ bryce ]

- Gaute
