Jon Frisby wrote:

The application is payroll/personnel. A programmer is tasked with creating forms for data entry on new employees, including supervisor. If the user doesn't enter a new employee's supervisor, the application accepts it, figuring that it is not yet known, and stores NULL for the field ("not known" use of NULL).

Meanwhile, a payroll programmer has been tasked with writing an application to give the CEO a huge bonus and stock options. To figure out which employee is the CEO, the application looks for the employee with NULL for supervisor ("not applicable" use of NULL).

Suddenly, a large number of new hires are fabulously wealthy. Who screwed up? Answer: the DB designer who didn't specify what NULL meant.



I would disagree here. I'd place the blame squarely with the programmer
who made an *assumption* about the meaning of NULL in the absence of
documentation.


Even if I grant you that, the DB designer should have produced the documentation. I'm willing to place the blame on both people; either one could have avoided the problem. But, the DBD (data administrator, if you prefer) is more likely to have been *aware* of the problem in the first place; it's their job to understand the data and serve as a reference to it. Even if the programmer caught the ambiguity, the DBD would be the person who needs to resolve it -- and who should have done so in the first place. The programmer was probably unaware of the ambiguity, and assumed that the system worked the way they wanted it to and thought it should.

Heck, we can blame it on QA as well. But it's still the DBD who needs to fix it. And it seems to me that the logical primary target for the blame is the person who has to redo their work once the problem is identified.

It's the responsibility of the programmer to understand
the system he or she is writing code for, before running that code.


When you think you understand the system, the tendency is to go no deeper.

If
documentation can't be found, the programmer should have asked the DB
designer. If the DB designer was unavailable the programmer should have
at LAST tested the assumption ("SELECT COUNT(*) FROM employee WHERE
supervisor_id IS NULL" -- there's only one CEO, so if it returns a value


1 the assumption is definitely false, if it returned 0, the assumption


is definitely false, and if it returned exactly 1, the assumption MAY be
true) before mucking with data.

A good programmer would do that, certainly. But again, in doing so, the programmer is working around the deficiencies of the DBD.

Because you know that a given person has no middle name?
To represent "no value", as differentiated from "not known"?



That's an ugly way to make the distinction between A-mark and I-mark.


Why? If it's documented, it's a reasonably intuitive, clear, and probably efficient way.

In most situations, I'd move the relevant column(s) to a separate table,
with a NULL-allowed column in that table and a FK reference back to the
original table.  The absence of a row in this child table indicates
I-mark, and the presence of a row with a NULL in the column indicates an
A-mark.

So in the example I gave you'd set up a separate middle name table, making the middle name in it NULL (indicating A-mark) and have lack of a row in the middle name table be the I-mark? (I like the A/I-mark notation, by the way -- I hadn't seen it before.) I don't see the advantage, although logically it would do the job. It'd be nasty for performance and programmatic complexity, though, especially for a minor value such as middle name.

Or alternatively you could just get a database that has two kinds of
NULLs, specifically defined to represent the distinction you bring up.
( http://www.firstsql.com )


Interesting! I hadn't been aware of any DBMSs that did that. I'm not sure that it's a good idea, actually, because I suspect that there are other shades in meaning that might cause bugs as well. For instance, is "no value" really the same as "inapplicable" in all cases? I have trouble finding an example of this; here's the best I can do -- a password system. A password might be unknown to the system, it might be an empty string (the user needs to explictly enter the fact that the password is '' somehow), or there might be no password, indicating that the user doesn't even want to be prompted for one. I get very suspicious of scenarios where there are more than two possibilities -- it suggests that there might be an unlimited, or at least a large, number, and that not all possibilities have been clearly identified.

Bruce Feist



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to