Re: [GENERAL] Primary keys for companies and people

Ted Byers Thu, 02 Feb 2006 14:09:29 -0800

----- Original Message -----
From: "Leif B. Kristensen" <[EMAIL PROTECTED]>

To: <[email protected]>
Sent: Thursday, February 02, 2006 4:07 AM
Subject: Re: [GENERAL] Primary keys for companies and people

[snip]
I'm very interested to hear what others use in their applications for
holding people and companies.


I've been thinking long and hard about the same thing myself, in
developing my genealogy database. For identification of people, there
seems to be no realistic alternative to an arbitrary ID number.

Still, I'm struggling with the basic concept of /identity/, eg. is the
William Smith born to John Smith and Jane Doe in 1733, the same William

[snip]

I have long been interested in this issue, and it is one that transcends the problem of IDs in IT. For my second doctorate, I examined this in the context of historical investigation, applying numerical classification techniques to biographical information that can be extracted from historical documents. It is, I fear, a problem for which only a probabilistic answer can be obtained in most historical cases. For example, there was an eleventh-century Viking king Harold who as a teenager was part of his cousin's court, and then found it necessary to flee to Kiev when his cousin found himself on the losing side of a rebellion. He then made his way into the Byzantine Empire and served the emperor as a mercenary through much of the Mediterranean, finally returning in fame and glory to Norway, where he found another relative (a nephew IIRC) on the throne, which he inherited about a year after his return. Imagine yourself as a historian trying to write his biography. You'd find various documents all over the western world (as known in the Viking age) written in a variety of languages, and using different names to refer to him. It isn't an easy task to determine which documents refer specifically to him. And to make things even more interesting, many documents refer to a given person only by his official title, and in other cases, the eldest son of each generation was given the same name as his father.

In my own case, in the time I was at the University of Toronto, I know of four other men who had precisely the same name I have. I know this from strange phone calls from faculty I never studied with about assignments and examinations for courses I had never taken. In each case, the professor checked again with the university's records department and found the correct student. The last case was particularly disturbing because I had actually taken a graduate course with the professor in question, and he stopped me on campus and asked about an assignment for an advanced undergraduate course that I had not taken, but my namesake had. What made this disturbing is that not only did the other student carry my name, but he also looked enough like me that our professor could mistake me for him on campus! I can only hope that he is a well-behaved, law-abiding citizen! The total time period in question was 18 years. In general, the problem only gets more challenging as the length, and as the age, of the historical period considered increases.

The point is, not only are the combinations of family and given names not reliably unique, even certain biological data, such as photographs of the human face, are not adequately unique. Even DNA fingerprints, putatively the best available biometric technology, are not entirely reliable, since even they cannot distinguish between identical twins, and at the same time, there can be developmental anomalies, admittedly extremely rare as far as we know, resulting in a person being his own twin (this results from twin fetuses merging, with the consequence that the resulting person has some organ systems from one of the original fetuses and some from the other). For historical questions, I don't believe one can get any better than inference based on a balance of probabilities. A genealogist has no option but to become an applied statistician! For purposes of modern investigation or for the purpose of modern business, one may do better through an appropriate use of a combination of technologies. This is a hard problem, even with the use of the best available technologies and especially given the current problems associated with identity theft.
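
To make the point about names concrete for a database, here is a minimal sketch, assuming a hypothetical `person` table in an in-memory SQLite database. The table, its columns, the `candidates` helper, and the sample rows are all invented for illustration and are not taken from any real schema. The name is an ordinary, non-unique attribute and an arbitrary integer serves as the key, so a search by name returns candidates rather than an identity.

```python
import sqlite3

# Hypothetical schema: an arbitrary surrogate key identifies each person;
# the name is just a searchable, non-unique attribute.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        person_id   INTEGER PRIMARY KEY,  -- arbitrary ID, carries no meaning
        given_name  TEXT NOT NULL,
        family_name TEXT NOT NULL,
        birth_year  INTEGER               -- may be unknown (NULL)
    )
    """
)

# Invented sample rows: two distinct people who share a name.
conn.executemany(
    "INSERT INTO person (given_name, family_name, birth_year) VALUES (?, ?, ?)",
    [
        ("William", "Smith", 1733),
        ("William", "Smith", 1735),
        ("Jane", "Doe", 1710),
    ],
)


def candidates(given_name, family_name):
    """Return every (person_id, birth_year) row matching the name."""
    rows = conn.execute(
        "SELECT person_id, birth_year FROM person "
        "WHERE given_name = ? AND family_name = ? ORDER BY person_id",
        (given_name, family_name),
    )
    return rows.fetchall()


matches = candidates("William", "Smith")
print(matches)  # two rows with different person_id values
```

A lookup by name yields two rows here. Deciding which of them a given document refers to is the probabilistic part, and the key itself cannot settle it.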

For software developers in general, and database developers in particular, there are several distinct questions to consider:

1) How does one reliably determine identity to begin with, and then use that identity with whatever technology one might use to represent it?

2) How good do this technology and identification process need to be? In other words, how does the cost of a mistake (esp. in identification) relate to the increased cost of using better technology? In this analysis, one needs to consider both the cost of such a mistake to the person identified or misidentified, and the cost to the owner or user of the application or database. Who will suffer if a mistake is made? Will, or can, bad things happen if a given person ends up with more than one ID? What is the cost, and who bears this cost, if more than one person can use the same ID?

3) Can we construct a suite of best practices from which we can select, given specific functional or non-functional constraints as developed for our application? Included with this question is consideration of protection of sensitive data in general, and protection of data that might conceivably be used by cyber-criminals in activity related to identity theft, or otherwise used to the harm of the person so identified.

4) How is biometric data best stored and searched for use in authentication processes within an arbitrary application? I guess this question assumes that biometric data needs to be used in an authentication request, and it occurs to me that for some applications, it may be sufficient to use biometric data in the creation of a unique user ID, after which it may be needed only for certain sensitive processes or resources.
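
Question 2 can be framed as a rough expected-cost comparison. The following is only a sketch: the `IdOption` class, the `expected_annual_cost` function, the option names, and every figure are invented placeholders. The model itself (expected annual cost = technology cost + number of checks x error rate x cost per mistake) is a simplifying assumption, not an established method.

```python
from dataclasses import dataclass


@dataclass
class IdOption:
    """One hypothetical way of identifying people (all figures are placeholders)."""
    name: str
    annual_cost: float       # cost of running the technology for a year
    error_rate: float        # probability of a misidentification per check
    cost_per_mistake: float  # combined cost to the person and to the operator


def expected_annual_cost(option: IdOption, checks_per_year: int) -> float:
    """Technology cost plus the expected cost of misidentifications."""
    expected_mistakes = checks_per_year * option.error_rate
    return option.annual_cost + expected_mistakes * option.cost_per_mistake


# Invented numbers, chosen only to show the trade-off.
options = [
    IdOption("name only", 0.0, 0.01, 500.0),
    IdOption("name plus document check", 20_000.0, 0.001, 500.0),
]
checks = 10_000

costs = {o.name: expected_annual_cost(o, checks) for o in options}
best = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name}: {cost:.2f}")
print("lowest expected cost:", best)
```

With these placeholder figures the more expensive technology wins because mistakes are costly. Lower the cost per mistake enough and the cheaper option wins instead, which is the sense in which the answer depends on who bears the cost of an error.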

My own feeling is that some options are very easy, and some of these are adequate for some situations, but that there are others that may be needed depending on the sensitivity of the data in question or on the potential cost to one or more parties to a given business process. I expect to be considering these issues extensively over the next few years since they are relevant to some of the web applications I am designing. Any insights you, or others, may have on these questions would be greatly appreciated.


Cheers,

Ted

R.E. (Ted) Byers, Ph.D., Ed.D.
R & D Decision Support Solutions

http://www.randddecisionsupportsolutions.com/



