>Names like Marshall Rose, Paul Vixie and
>Frederick Avolio come to mind.

And let's not forget Dave Crocker for his part in creating RFC 822 and RFC
1767, the original Internet EDI RFC. Dave also worked at decwrl (DEC's
Western Research Lab) during my years with the company.

>We don't need to limit this model to the use of transactions via email,
>or even to the use of the Internet.  The concept of distributed "name
>spaces" with distributed management of the different domains can be used
>for naming both the trading partners and determining how to reach the
>trading partner.  Once we know how to reach a trading partner, then the
>appropriate communication program can be invoked (e.g. FTP or Kermit) to
>contact that trading partner.

I don't disagree with your logic or the concept. The DNS model provides a
fine example of distributed management and discovery of names and
addresses, while isolating itself from the nasty business of routing.
Routing is IP's job. Once again we see the split between identifiers and
routing.

However, there is one "fly in the ointment" that typically must be dealt
with regarding B2B/E-Commerce that a DNS-like system can't resolve:
ACCESS CONTROLS!

All of the B2B/E-Commerce system implementations I've been involved with
(30+) required some form of access controls to prevent unauthorized access
to a company's B2B/E-Commerce server.

Even if it were possible to use a DNS-like system to resolve identifiers to
"EDI Addresses" and routing could be made "transparent", I suspect that some
organizations will want to maintain tight security over their B2B/E-Commerce
"gateways", and they will implement some form of access control (e.g.
usernames/passwords or something else) to prevent unauthorized access. The
alternative is to eliminate access controls (as in the case of e-mail), in
which case anyone can send "anything" to a B2B server (can you say "B2B spam"
or "B2B virus catcher"?).

The exchange of access control information typically requires a one-to-one
relationship/interaction. I can easily see MrBigPayer saying to MrProvider
or MrClearinghouse, "please send all your claims to my B2B server
(http://claims.mrbigpayer.com/b2bservlet) using the following
username/password, "Iwannagetpaid"/"itsasecret"". Once this type of
interaction is needed between two trading partners, the value of automated
directory and discovery (like the DNS concept introduced by Kepa) becomes
diminished. The moment MrBigPayer is forced to interact directly with
MrProvider or MrClearinghouse to provide access control information, they
might as well exchange all their other information (e.g. identifier, EDI
Address, contact info, etc.). The alternative is for MrBigPayer to operate
without the protection of access controls, and therefore anyone can send
*anything* to MrBigPayer's B2B/E-Commerce server. That seems a bit risky to
me.
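
To make the point concrete, here is a minimal sketch (modern Python,
standard library only, purely for illustration) of what that one-to-one
arrangement looks like on the wire; the endpoint and username/password are
the made-up values from the example above and have to be exchanged
directly between the partners before a single byte can flow:

# Hypothetical sketch only: POST an X12 claim batch to a trading partner's
# B2B server using HTTP Basic authentication.  The URL and credentials are
# the made-up values from the MrBigPayer example and must be agreed on
# out of band, one trading partner pair at a time.
import base64
import urllib.request

B2B_URL = "http://claims.mrbigpayer.com/b2bservlet"   # made-up endpoint
USERNAME = "Iwannagetpaid"                            # agreed one-to-one
PASSWORD = "itsasecret"                               # agreed one-to-one

def send_claims(x12_payload: bytes) -> int:
    """POST an 837 claim batch and return the HTTP status code."""
    token = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
    request = urllib.request.Request(
        B2B_URL,
        data=x12_payload,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/EDI-X12",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status   # e.g. 200 if the server accepted the batch

# Example: send_claims(open("claims_837.x12", "rb").read())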

In summary, the DNS concept described by Kepa works fine for name/address
management, discovery and translation. However, the DNS model does not
relieve two parties from interacting "directly" when access controls are in
place and usernames/passwords must be exchanged before data can flow. When
this happens it may actually be more efficient (and secure) to provide
"authorized parties" directly with all the information necessary (identifiers,
EDI Addresses, contact names/numbers, public keys, etc.).

I also believe this topic should be allocated more time than 2 hours during
the Seattle meeting; there is much to *discuss*.


Dick Brooks
Systrends, Inc
7855 South River Parkway, Suite 111
Tempe, Arizona 85284
Web: www.systrends.com <http://www.systrends.com>
Phone:480.756.6777,Mobile:205-790-1542,eFax:240-352-0714


-----Original Message-----
From: Kepa Zubeldia [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 30, 2002 3:18 PM
To: WEDi/SNIP ID & Routing
Subject: some history on email addresses and thoughts for a proposal


Before you read this, take a seat and get some popcorn...

Since history gives us a way to avoid repeating the same mistakes all
over again, let me give a little history on Internet and pre-Internet
email addresses, in case we can learn something from it.  There are some
known pioneers in this field, and you can check their books to get a
better understanding.  Names like Marshall Rose, Paul Vixie and
Frederick Avolio come to mind.

Back in the pre-Internet era there was "open" email as part of the Unix
set of programs known as "uucp".  These programs included a facility,
through "uux", by which I could relay a mail message from my machine to
the next hop known to me, with an email address that was actually a route
descriptor, each hop separated from the next with a "!".  It looked
something like:

eecs!ounorman!okstate!decwrl!netsrus!novannet!wkammerer

This assumed that I knew the path from my machine at "eecs" to William's
machine at Novannet. And he could respond back to me by sending a
message to:

novannet!netsrus!decwrl!okstate!ounorman!eecs!zubeldia

However, for my brother in Spain to send me a message, the route was not
the same.  He had to send it to

spri!telefonica!uunet!ounorman!eecs!zubeldia

So essentially the "address" was not an address, but was a "route" and
the route changed between each pair of correspondents.  Of course the
route could change if I moved my connectivity or if somebody else along
the path changed their connectivity.
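
(A tiny illustration, in modern Python and purely as a sketch, of why the
"address" was really a route: replying meant reversing the hops.  The
host and user names are the ones from the example above.)

# Toy sketch: a "bang path" is a route, not an address, so a reply
# reverses the hops.  Names are the ones from the example above.
forward_hops = "eecs!ounorman!okstate!decwrl!netsrus!novannet!wkammerer".split("!")[:-1]
reply_path = "!".join(reversed(forward_hops)) + "!zubeldia"
print(reply_path)   # novannet!netsrus!decwrl!okstate!ounorman!eecs!zubeldia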

To put it in perspective, there were not that many hosts at that time,
in the late 1970s, so it was not as hard as it sounds.  And, there were
a few very smart mail systems that acted as routers of last resort,
contributed generously by some sponsors.  For example DEC and UUNET
contributed smart mail routers at "decwrl" and "uunet" so if you did not
know somebody's direct path, there would be some chance of getting the
mail to one of the smart mailers which, through the magic of path
optimization, would find a way to deliver the message.  At that time
(late 1970s) the decwrl mailer was processing about 1000 mail messages
per hour.  Amazingly high traffic!  And they had many high-speed leased
lines at 9600 baud.

At about the same time that traffic and the number of hosts started growing,
the TCP/IP protocol started to be deployed.  This brought progress in
the form of IP addresses, mapped to a hierarchical name structure
through DNS, and standard mail headers in MIME, as well as the work of
Marshall Rose, Avolio, Vixie, and others in distinguishing between
addresses and routes in what later became known as RFC-822.  The first
attempts had somewhat of a confusing hybrid scheme, as not everybody was
connected with IP yet, so the uucp links and the IP links had to
co-exist.  We would use addresses like
eecs!ounorman!okstate!decwrl!wkammerer@novannet so I could get my
message to decwrl over the uucp hops and then let decwrl find a route to
William by whatever means decwrl could figure out.  This worked most of
the time, even if decwrl had to convert the wkammerer@novannet into
netsrus!novannet!wkammerer to get the mail into the right hands.  And
note that these were actual host names; we did not have the "domain
names" that we use today, but each host had a worldwide-unique name.
There was a long list in the uunet repository that contained all the
publicly recorded host names in the world, and you had to check it to
make sure you were not going to use a name that was already taken.

As things got more complicated, decwrl did a wonderful job of keeping
these routes up to date, but it became clear that at some point the
whole system would quit working due to its own inefficiency. Then the
famous "sendmail" program was released to help spread the RFC-822 gospel
and put the efficiency of the decwrl mail processing into everybody's
hands.  If you look at the configuration file for the latest versions of
sendmail, processing most of the Internet mail today, you will find that
it can still handle the "bang" ("!") type of addressing, and even the
old proprietary DECnet mail addresses that use the ":" as the delimiter.
About 25 years old and still ticking.

The introduction of RFC-822 addresses of the form that we know today, as
[EMAIL PROTECTED], only happened once the domain system and the
DNS were put in place, so let me talk about the DNS for a while.

We started with IP addresses.  Those are the four numbers with dots
between them, like 12.24.48.96.  If I wanted to connect my network to
other networks, I had to have a set of unique addresses.  You could
request a block of addresses, either a 256 address block or a 65K
address block, and if you could justify that many computers in your
network, you would get the numbers assigned to you for life. For free
too!  I was the administrator of my own "Class C" network, with room for
up to 253 computers, and I was free to allocate these addresses at will
within my network, as if it were one of those medieval "domains" and my
subjects were my computers.  I had certain authority over the assignment
of numbers in my "domain" and nobody else could use IP numbers that were
assigned to me.  These numbers were easy to transpose (and hard to
remember), so the Internet whizzes invented a hierarchical structure.

The hierarchical structure was organized from right to left, and was
called the "domain name" as it represented a public name for the IP
addresses under my control.  There is one top of the hierarchy under
which there are the "edu", "org", "net", "gov", "mil", "int", "nato",
"arpa", and later "com" domains.  The very top under which all converge
is simply ".", but since everybody is assumed to be under the "top", the
"." is not mandatory.

Then for each one of the domains that are at the top (or Top Level
Domains, TLDs) there is a registrar that assigns unique names upon
request of the users.  For example, the "claredi.com." domain is
assigned to claredi by the registrar that controls the "com." TLD. In
the early days there was only one registrar (InterNIC, which later
changed its name to Network Solutions, and was later acquired by
VeriSign) but now there are a few competitors.  The registrar
sets the rules for registering a domain name under their TLD.  Believe
it or not, you could get domain names registered for free until
1994-1995.  Before the Internet land run.

Once I had my domain name and my Class C block of IP addresses, I was
free to allocate names within my domain.  For example, I can have three
machines at 12.24.48.10, 12.24.48.11, and 12.24.48.12 and I can call
them one.claredi.com, two.claredi.com, and three.claredi.com if I so
desire.  What I need is to have a DNS server that knows the IP addresses
and the DNS names in my network.  If somebody wants to reach
"one.two.claredi.com", they first go to one of the Internet "root" DNS
servers asking for "com." to find out who is the DNS server that serves
the "com." domain.  The "root server" will say that the "com." domain is
served by VeriSign and the VeriSign DNS server is located at address
(made up) 134.200.89.111.  Then you have to go to the VeriSign DNS
server and ask who has authority over "claredi.com." and VeriSign will
say that "claredi.com." is served by a DNS server located at
98.123.222.2.  Then you go to the Claredi DNS server and ask who has
authority over the "two.claredi.com." domain.  The Claredi DNS server
may say that this domain is handled by a DNS server located at
127.32.43.200, perhaps in another city.  Then you go to that DNS server
and ask what the IP address is for "one.two.claredi.com." and finally
you get an IP address.  Just to clarify, all these IP addresses are made
up for this example.
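
For the curious, here is a small sketch (modern Python with the
third-party dnspython package, just for illustration) of the right-to-left
delegation chain described above; it asks an ordinary resolver for the
name servers at each level rather than walking the roots by hand, and the
claredi.com names are only the ones from the example:

# Illustration only: show the delegation chain, right to left, for a name
# like "one.two.claredi.com", by asking for the NS (name server) records
# at each level.  Uses the third-party dnspython package; a stub resolver
# does the actual legwork, but the hierarchy it reflects is the same.
import dns.resolver

def show_delegation(name: str) -> None:
    labels = name.rstrip(".").split(".")
    # Build "com.", "claredi.com.", "two.claredi.com.", ... right to left.
    for i in range(len(labels) - 1, -1, -1):
        zone = ".".join(labels[i:]) + "."
        try:
            answers = dns.resolver.resolve(zone, "NS")
            servers = ", ".join(sorted(r.target.to_text() for r in answers))
            print(f"{zone:<22} served by {servers}")
        except Exception as exc:
            print(f"{zone:<22} no delegation here ({exc.__class__.__name__})")

show_delegation("one.two.claredi.com")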

In the dotted name of one.two.claredi.com, the "one" is the host name
and the "two.claredi.com" is the hierarchy of domain names.

It seems like the process is cumbersome, as the query must be repeated
one step at a time, from right to left, in order to get an IP address.
But this happens very fast.  The DNS protocol is very simple and very fast.

And there are some very definite advantages to doing it this way.  For
example, I control the Claredi.COM domain, and there is no centralized
authority at VeriSign that controls my domain.  All that VeriSign knows
is that Claredi.COM is mine, and that I have a DNS server at a
constant address that responds to queries for the next level down.  By
the same token, I can delegate different sections of my company to other
departments, each with their own DNS server, and they control their own
destiny.  This principle of delegation of authority works VERY well, and
it is very usable in health care, as we will see later.

But, going back to e-mail, the DNS gives not only the IP address of
every host in my "domain", but also other information.  For example, the
DNS server has the "MX" records for my domain.  Here is how it works.

If you want to send a message to [EMAIL PROTECTED], your mailer
has to find the DNS server for one.two.claredi.com using the process
described above.  Once it finds that DNS server it will ask the server
for its MX record.  The MX record will "point" to a machine, either by
IP address or by DNS name, that is in charge of receiving mail for that
domain or host.  For instance, it could point to "mail.claredi.com" or
to "mail.aol.com" as the mail "server" for "one.two.claredi.com".  The
mail server and the DNS server can be separate machines in separate
parts of the world, and they can also be in different domains
themselves.  The mail server will be "listening" for connections on its
SMTP port (port 25), to which any mail sender can connect to send a
message to "Kepa". That is how the mail is delivered to
"[EMAIL PROTECTED]".  From that point, the mail server (perhaps
mail.aol.com) will know how to deliver mail to "Kepa".  Note that for
aesthetic reasons, and just in case there is another "Kepa" out there
;-) the custom is to use names like Kepa.Zubeldia.  The "." is only a
hierarchical domain separator when it is on the right of the "@" sign;
on the left of the "@" it is purely cosmetic and does not mean a
hierarchy of names.

So, the trick is to have a chain of DNS servers that follow the
hierarchical structure of domain names and host names, as well as a mail
server that is ready to accept messages for the designated domains.

But there is another trick.

Remember the "bang" (or "!") notation?  Some of those paths were more
"expensive", either because they were longer, or at a lower baud rate
(e.g. 300 baud) or went through less reliable systems.  Other paths were
less "expensive", and thus "preferred" for mail delivery.

So the MX record in the DNS contains both the address of the mail
server and a "preference" factor for that server.  And there can be
several mail servers for one domain.  For instance, you could have a
production server with a "preference" factor of, say, "10", a backup
mail server with a preference factor of "100" and a disaster server
with a preference factor of "1000" (the lower the number, the more
preferred the server).  That way the sender can rank the receiver's
servers and make a determination of which one to use, taking into
account factors such as the servers being up or down or the
availability of a high-speed link to a less preferred server.  The
"preference" is just an imaginary "cost" to the receiver for using
different delivery mechanisms, and the sender uses those costs to
decide how to send the message to the receiver.  It is good practice
for the receiver to have at least two MX records, with different
"preference" factors, in case the primary mail server goes down.

So, let's review this long email.

We have a hierarchical mechanism that allows flexibility in naming
computers, hides IP numbers that are difficult to remember, and allows
delegation of authority over the administration of sections of the
address space.  It is called DNS.

As part of DNS we have a mechanism to designate a server that will
receive a set of transactions (such as electronic mail) by using a
pointer to the server that will get the mail, and with the possibility
to designate multiple servers and even assign a "preference" factor to
each one of them.

I hope that by now you are starting to see the similarity with what we
need in health care.  The clearinghouses could operate the DNS service
on behalf of their provider and payer clients.  The MX records could
actually point to the preferred entry point for each type of
transaction, with the possibility of multiple paths.

For example, ACME insurance could have a HIPAA address of ACME.HIPAA.NET
and under that DNS server have other domains such as 837.ACME.HIPAA.NET,
270.ACME.HIPAA.NET, and 276.ACME.HIPAA.NET. Then, under the
837.ACME.HIPAA.NET DNS server there could be several
MX records listing:
ACME.PAYERS.WEBMD.HIPAA.NET (20) and
NDC.HIPAA.NET (20) and
ACME.CLEARINGHOUSE-A.HIPAA.NET (75) and
2125551212.PHONE.HIPAA.NET (300), which would say
that they can receive transactions directly over the telephone at
2125551212, or through three different clearinghouses.  The best path is
through either WEBMD or NDC, the next choice would be through
clearinghouse "A", and the last choice would be sending direct
transactions to ACME.  I have put the "preference" factors in
parentheses for readability, but you get the picture.
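
Purely hypothetical sketch (dnspython again): if such records existed, a
submitter could discover and rank ACME's 837 routes with the same MX
query mailers already use.  None of these names exist today.

# Hypothetical sketch: none of these domains exist today.  If they did,
# a submitter could rank ACME's routes for 837 claims with an ordinary
# MX query, exactly as mailers do.  Uses the third-party dnspython package.
import dns.resolver

def routes_for(payer: str, transaction: str):
    domain = f"{transaction}.{payer}.HIPAA.NET"    # e.g. 837.ACME.HIPAA.NET
    answers = dns.resolver.resolve(domain, "MX")
    return sorted((r.preference, r.exchange.to_text()) for r in answers)

# With the records above in place, routes_for("ACME", "837") would return
# something like:
#   [(20, 'ACME.PAYERS.WEBMD.HIPAA.NET.'), (20, 'NDC.HIPAA.NET.'),
#    (75, 'ACME.CLEARINGHOUSE-A.HIPAA.NET.'), (300, '2125551212.PHONE.HIPAA.NET.')]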

In fact, this could be used to represent more information in the MX
records, such as an "IP port" or a "protocol" or even, stretching it a
bit, something like xmodem.8005551212.phone.hipaa.net as a direct
connection if we assign the "phone.hipaa.net" to be a magic token.

The infrastructure for all of this is already in place.  The use of DNS
servers, MX records and the "cost" of the MX record is well understood.
This would be very easy to implement if we agree to do it.  And it
lends itself to distributed administration, where a clearinghouse would
have complete control of their routes for all their clients.

It is working very well for the rest of the Internet for routing email,
and there is no reason why it would not work for routing health care
EDI transactions, as long as we agree on how to do it.

We don't need to limit this model to the use of transactions via email,
or even to the use of the Internet.  The concept of distributed "name
spaces" with distributed management of the different domains can be used
for naming both the trading partners and determining how to reach the
trading partner.  Once we know how to reach a trading partner, then the
appropriate communication program can be invoked (e.g. FTP or Kermit) to
contact that trading partner.

I would like to bring this up for discussion next week in Seattle, or
before next week if you feel so inclined.  This concept uses an existing
distributed infrastructure of DNS servers.  It would be a lot more
elegant to do it with UDDI or LDAP, but the cost of deploying that
infrastructure would be higher.

Comments?

Kepa

PS: Sorry about the long message...
