Re: Accountability with MySQL

2006-03-16 Thread Martijn Tonies

From: James Harvard
 At 7:28 pm +0100 14/3/06, Martijn Tonies wrote:
 Given that NULLs basically means the absence of a value (eg: unknown),
you shouldn't be storing NULLs.

 At 1:49 pm -0500 14/3/06, Rhino wrote:
 A null means unknown or not applicable and is a perfectly valid value
to use in many, many situations.


 It's not often you can say that two people are quite literally arguing
about nothing!

:-)

 James Harvard

 (... being flippant because the actual arguments started going over my
head about half-a-dozen posts ago...)


We're arguing about whether or not your database design
(as stored on disk) should contain NULLs.

IMO: no, or at least as few as possible.

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Mark Leith

Martijn Tonies wrote:


We're arguing about whether or not your database design
(as stored on disk) should contain NULLs.

IMO: no, or at least as few as possible.
 


I believe the answer really is - *it depends*.

You are both right, really. Martijn, yes, according to academia and 
proper database design, you should not really be using NULLs, (Rhino, 
see the writings of Chris Date and Fabian Pascal somewhere like 
www.dbdebunk.com, www.thethirdmanifesto.com etc.). Here's a good one:


http://www.dbdebunk.citymax.com/page/page/1396241.htm

Of course, you're still going to pay $10-$15 for this..

However, what is good for academia is *not* always good for the real 
world. Why store Unknown in a column for a table that could store 
millions, or billions, of rows, when you could simply allow a NULL (and 
save a *bunch* of disk space across your schemas) in the real world. If 
you can allow for the use of the NULL in the application, and *not* try 
to interpret it 5 ways from Sunday, then, in my honest opinion, a NULL 
would be acceptable.


Short answer, as long as you do it without trying to get your 
application to handle them in many different ways, I don't see a great 
problem (when weighing them against extra costs, in both space and 
speed). If you can get away with not using NULL as well, then great.


This argurment will, however, carry on between academia and the real 
world for at least the next 4000 years. :)


Best regards

Mark

--
Mark Leith, Support Engineer
MySQL AB, Worcester, England, www.mysql.com
Are you MySQL certified?  www.mysql.com/certification


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Martijn Tonies
From: Mark Leith
 Martijn Tonies wrote:
 
 We're arguing about whether or not your database design
 (as stored on disk) should contain NULLs.
 
 IMO: no, or at least as few as possible.
   
 
 I believe the answer really is - *it depends*.
 
 You are both right, really. Martijn, yes, according to academia and 
 proper database design, you should not really be using NULLs, (Rhino, 
 see the writings of Chris Date and Fabian Pascal somewhere like 
 www.dbdebunk.com, www.thethirdmanifesto.com etc.). Here's a good one:
 
 http://www.dbdebunk.citymax.com/page/page/1396241.htm

Got that one ;)

The book I quoted is from Date, btw, and I enjoy reading it ...
 
 Of course, you're still going to pay $10-$15 for this..
 
 However, what is good for academia is *not* always good for the real 
 world. Why store Unknown in a column for a table that could store 
 millions, or billions, of rows, when you could simply allow a NULL (and 
 save a *bunch* of disk space across your schemas) in the real world. 

Well, the question still is if you should store unknown at all ;)

Not according to Date: you should store what is known. See the remarks
about the true propositions, from which relational databases are derived
(but you probably know that).

If 
 you can allow for the use of the NULL in the application, and *not* try 
 to interpret it 5 ways from Sunday, then, in my honest opinion, a NULL 
 would be acceptable.
 
 Short answer, as long as you do it without trying to get your 
 application to handle them in many different ways, I don't see a great 
 problem (when weighing them against extra costs, in both space and 
 speed). If you can get away with not using NULL as well, then great.

I guess I can agree with that one.

 This argurment will, however, carry on between academia and the real 
 world for at least the next 4000 years. :)

Until it gets replaced by a better model, but I don't expect that anytime
soon ;-)

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread James Harvard
But (in at least some situations) is not appropriate to record that you know 
that you don't have a value?

I think the words of Donald Rumsfeld are appropriate here:
There are known knowns. These are things we know that we know. There are known 
unknowns. That is to say, there are things that we know we don't know. But 
there are also unknown unknowns. There are things we don't know we don't know.

At 12:02 pm +0100 16/3/06, Martijn Tonies wrote:
Well, the question still is if you should store unknown at all ;)

Not according to Date: you should store what is known.
snip



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Martijn Tonies


 But (in at least some situations) is not appropriate to record that you
know that you don't have a value?


Sure it would. It it's needed.

But the answer doesn't have to be a NULL. Remember
the example about recording the fact that a student did
not make a test for whatever reason?

It was then said you could store null for the test score.

I then replied that that was a bit backward IMO, cause
you're using the test scores table to store that a student
did not take a test, which is wrong, given that each row
in a table should mean exactly the same thing (theory
knocking at the door once again) and that makes a design
more clear.

 I think the words of Donald Rumsfeld are appropriate here:
 There are known knowns. These are things we know that we know. There are
known unknowns. That is to say, there are things that we know we don't know.
But there are also unknown unknowns. There are things we don't know we don't
know.


I think we'll have a hard time storing Donalds knows and
unknowns in a database... :-)

 At 12:02 pm +0100 16/3/06, Martijn Tonies wrote:
 Well, the question still is if you should store unknown at all ;)
 
 Not according to Date: you should store what is known.
 snip

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Alec . Cawley
Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32:

 Well, the question still is if you should store unknown at all ;)
 
 Not according to Date: you should store what is known. See the remarks
 about the true propositions, from which relational databases are 
derived
 (but you probably know that).

As someone totally unread in the theory of databases, that seems unduly 
puritanical. I assume that what Date would propose is that you have 
another table (related by master key) in which, if you do not know 
something, you do not enter it. But this means that if you have 10 
different pieces of potentially but not necessarily available information 
about a single master record (e.g. a person), you have to do a 10-way join 
in order to retrieve all the information about them. Replacing a 
theoretically ugly null flag with a 10 way join strikes me, as an engineer 
rather than a theoretician, the wrong side of the elegance/practicality 
trade-off.

Alec



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Coalesce (was: Re: Accountability with MySQL)

2006-03-16 Thread Martijn Tonies

  Ah well, SQL (these days) provides plenty of functions to avoid
  the problems. Coalesce being one of them...
 
  How often do people write:
 
  (probably not valid MySQL)
  select firstname || coalesce(middlename || ' ', ' ') || lastname
  from ...
 
 Your syntax isn't quite correct; coalesce doesn't use a concatenation 
 symbol, it uses a list of arguments like this:

 coalesce(Salary, commission, bonus)

Yes, the reason I was using || ' ' there was when middle
name was not null, it would already add a space, and
when it's null, it would just return a space.
Given that NULL || ' ' returns NULL anyways, this would
be give me the results for all cases, I think :)

--
Martijn Tonies


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Martijn Tonies
From: [EMAIL PROTECTED]
 Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32:

  Well, the question still is if you should store unknown at all ;)
 
  Not according to Date: you should store what is known. See the remarks
  about the true propositions, from which relational databases are
 derived
  (but you probably know that).

 As someone totally unread in the theory of databases, that seems unduly
 puritanical. I assume that what Date would propose is that you have
 another table (related by master key) in which, if you do not know
 something, you do not enter it. But this means that if you have 10
 different pieces of potentially but not necessarily available information
 about a single master record (e.g. a person), you have to do a 10-way join
 in order to retrieve all the information about them. Replacing a
 theoretically ugly null flag with a 10 way join strikes me, as an engineer
 rather than a theoretician, the wrong side of the elegance/practicality
 trade-off.

Using NULLs as well as de-normalization brings the risk of
integrity problems to your storage, storing what is right is only
a good thing.

And when it comes to having to writing JOINs for all your queries,
lo and behold, I bring you the wonder of the VIEW.

;-)



Martijn Tonies
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Alec . Cawley
Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:32:45:

 From: [EMAIL PROTECTED]
  Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32:
 
   Well, the question still is if you should store unknown at all ;)
  
   Not according to Date: you should store what is known. See the 
remarks
   about the true propositions, from which relational databases are
  derived
   (but you probably know that).
 
  As someone totally unread in the theory of databases, that seems 
unduly
  puritanical. I assume that what Date would propose is that you have
  another table (related by master key) in which, if you do not know
  something, you do not enter it. But this means that if you have 10
  different pieces of potentially but not necessarily available 
information
  about a single master record (e.g. a person), you have to do a 10-way 
join
  in order to retrieve all the information about them. Replacing a
  theoretically ugly null flag with a 10 way join strikes me, as an 
engineer
  rather than a theoretician, the wrong side of the 
elegance/practicality
  trade-off.
 
 Using NULLs as well as de-normalization brings the risk of
 integrity problems to your storage, storing what is right is only
 a good thing.
 
 And when it comes to having to writing JOINs for all your queries,
 lo and behold, I bring you the wonder of the VIEW.
 
 ;-)

The VIEW eases the syntax, but does it do anything for performance? Surely 
it must be much slower to read 11 different tables (Master record 
containing all NOT NULL fields, and 10 slave records which may or may not 
contain relevant fields)? Ignoring caching, you are going to have at least 
one disk access for every NULL field (index lookup which fails) and two 
for every non NULL field (index lookup, data lookup) for every null field. 
This means that you have multiplied your number of disk accesses (ignoring 
caching, again) by 6-11 times (assuming the master record takes two disk 
accesses). That again seems a very high price to pay for theoretical 
elegance.

Alec





-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Martijn Tonies

   in order to retrieve all the information about them. Replacing a
   theoretically ugly null flag with a 10 way join strikes me, as an
 engineer
   rather than a theoretician, the wrong side of the
 elegance/practicality
   trade-off.
 
  Using NULLs as well as de-normalization brings the risk of
  integrity problems to your storage, storing what is right is only
  a good thing.
 
  And when it comes to having to writing JOINs for all your queries,
  lo and behold, I bring you the wonder of the VIEW.
 
  ;-)

 The VIEW eases the syntax, but does it do anything for performance? Surely
 it must be much slower to read 11 different tables (Master record
 containing all NOT NULL fields, and 10 slave records which may or may not
 contain relevant fields)? Ignoring caching, you are going to have at least
 one disk access for every NULL field (index lookup which fails) and two
 for every non NULL field (index lookup, data lookup) for every null field.
 This means that you have multiplied your number of disk accesses (ignoring
 caching, again) by 6-11 times (assuming the master record takes two disk
 accesses). That again seems a very high price to pay for theoretical
 elegance.

Keeping your design clean is hardly sacrificing anything.

The most important feature of your database and the database
design is the integrity of your data. Period.

De-normalization, NULLs et all make this task much harder,
as well as understanding your design for the guy that takes
over your job ...

Performance comes next, first thing to do there is getting
your indices straight.

Next thing is optimizing the slow stuff - if you can go around
by keeping, for example, (these much smaller slave) tables
in memory, and this works out fine, then do so.

If all your queries are optimized, the next thing to do is
to throw hardware at it. This is very cheap comparing to
your hourly rate. ;-)

Obviously, if your queries and design are valid and the
database system is still slow, you can also complain with
the creators of the database system - performance, after
all, is a physical attribute, not a database design issue perse.

During the years, many database systems have seen many
improvements to bring better performance to its users.

Only recently, for example, the Firebird DBMS implemented
a new on-disk structure for it's indices greatly enhancing
performance for indices with a large number of duplicates. In
the past, people said: you have to modify your design such
and so, with all risks included. Now, you can keep your design
as clear as possible and enjoy greater performance, just because
someone thought of something clever. If no-one ever complained,
this particular piece of code wouldn't have changed.

Martijn Tonies
Database Workbench - tool for InterBase, Firebird, MySQL, Oracle  MS SQL
Server
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread James Harvard
Also, surely if you left join to a table then if there's no matching row you 
get a null value in your result set. So I'm not sure what the difference is 
between getting that null because you store it in a column in the primary 
table, and deriving it by left joining to a related table with no matching rows.
James Harvard

At 11:27 am + 16/3/06, [EMAIL PROTECTED] wrote:
As someone totally unread in the theory of databases, that seems unduly 
puritanical. I assume that what Date would propose is that you have another 
table (related by master key) in which, if you do not know something, you do 
not enter it. But this means that if you have 10 different pieces of 
potentially but not necessarily available information about a single master 
record (e.g. a person), you have to do a 10-way join in order to retrieve all 
the information about them. Replacing a theoretically ugly null flag with a 10 
way join strikes me, as an engineer rather than a theoretician, the wrong side 
of the elegance/practicality trade-off.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-16 Thread Martijn Tonies
James,


 Also, surely if you left join to a table then if there's no matching row
you get a null value in your result set. So I'm not sure what the difference
is between getting that null because you store it in a column in the primary
table, and deriving it by left joining to a related table with no matching
rows.

Handling nulls in your resultsets makes a heck of a
lot more sense than also storing these NULLs :-)



Martijn Tonies
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



True Propositions (was Re: Accountability with MySQL)

2006-03-16 Thread Martijn Tonies
   Read the literature on how to design databases. What you do
   is storing true propositions. That is, each attribute defines a
   certain true proposition. For example:
  
   Employee Martijn has Employee# 14.
  
   You should be able to derive these sentences from every row.
   Putting a NULL or N/A in there fails to meet this requirement.
  
  Perhaps my theory is a bit rusty but I have never heard of this true
  propositions business before. Do you have a citation where I can read
  about
  this?
 
  I would certainly recommend this one:
 
http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1
 
 Hmm, I had hoped for a citation to a free online document that I could
read.
 I don't fancy the thought of spending $100+ US on a book just to argue the
 fine points of database design

 Then again, I didn't ask for a link to a free source of this information
so
 I can't fault you for your link, it (presumably) answers the question.

You will find the true-thingy also in here:
http://www.dbdebunk.com/page/page/2928212.htm

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Accountability with MySQL

2006-03-16 Thread Gordon
And now we are down to reality. This is a MySQL list. Views are a wonderful
thing for creating an isolation layer between the application and the
database. However, MySQL's current implementation makes it extremely
difficult in many cases to avoid full table scans when you define the
logical view. {Not poking at the development team I think 5.x is a huge step
forward}. 

So you can't hide the underlying structure currently in MySQL. Like all
other things we have to commingle best practices with pragmatism. In other
words do what makes sense while at the same time acquiring a good knowledge
of fundamentals. {Not just slap name address phone1  phone n in some
table with an auto-increment ID field and say your done with the design.}

By the way, Date and Codd {both way above me in math and theory} had this
discussion ~20 years ago over a 6 month period in pages and pages of
Database magazine. In the end I believe they agreed to disagree.

-Original Message-
From: Martijn Tonies [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 16, 2006 5:33 AM
To: mysql@lists.mysql.com
Subject: Re: Accountability with MySQL

From: [EMAIL PROTECTED]
 Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32:

  Well, the question still is if you should store unknown at all ;)
 
  Not according to Date: you should store what is known. See the remarks
  about the true propositions, from which relational databases are
 derived
  (but you probably know that).

 As someone totally unread in the theory of databases, that seems unduly
 puritanical. I assume that what Date would propose is that you have
 another table (related by master key) in which, if you do not know
 something, you do not enter it. But this means that if you have 10
 different pieces of potentially but not necessarily available information
 about a single master record (e.g. a person), you have to do a 10-way join
 in order to retrieve all the information about them. Replacing a
 theoretically ugly null flag with a 10 way join strikes me, as an engineer
 rather than a theoretician, the wrong side of the elegance/practicality
 trade-off.

Using NULLs as well as de-normalization brings the risk of
integrity problems to your storage, storing what is right is only
a good thing.

And when it comes to having to writing JOINs for all your queries,
lo and behold, I bring you the wonder of the VIEW.

;-)



Martijn Tonies
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



On MySQL Views (Re: Accountability with MySQL)

2006-03-16 Thread Martijn Tonies
Hello Gordon,

 And now we are down to reality. This is a MySQL list. Views are a
wonderful
 thing for creating an isolation layer between the application and the
 database. However, MySQL's current implementation makes it extremely
 difficult in many cases to avoid full table scans when you define the
 logical view. {Not poking at the development team I think 5.x is a huge
step
 forward}.

Can you elaborate on that?

Any real world experience with MySQL views?

The thing that annoys the blo*** cr** out of me with the
current implementation is the maintainability of views -
as the complete query structure is lost once you've created
a view, so you need to run/modify it from SQL statement
all the time :-(

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-15 Thread Rhino


- Original Message - 
From: Martijn Tonies [EMAIL PROTECTED]

To: mysql@lists.mysql.com
Sent: Wednesday, March 15, 2006 2:53 AM
Subject: Re: Accountability with MySQL



  Basics of database design: store what you know.
 
  Given that NULLs basically means the absence of a value
  (eg: unknown), you shouldn't be storing NULLs.
 
 Nonsense!!

 That's a bold statement ...

 That's simply wrong. A null means unknown or not applicable and is a

 Ah, so now things become clear, a NULL actually can mean
 two things? So much for clarity then...

Come on; that's not fair. Unknown and not applicable are more like
different senses of the same thing, not two opposite things.


Read the literature on how to design databases. What you do
is storing true propositions. That is, each attribute defines a
certain true proposition. For example:

Employee Martijn has Employee# 14.

You should be able to derive these sentences from every row.
Putting a NULL or N/A in there fails to meet this requirement.

Perhaps my theory is a bit rusty but I have never heard of this true 
propositions business before. Do you have a citation where I can read about 
this?


Frankly, I'm skeptical about your interpretation of this idea. While you 
clearly don't want false information in a database, it isn't false to say 
that student such-and-such's exam mark or such-and-such an employee's 
termination date is unknown or not applicable. It _would_ be false to store 
a grade of 0 for a student who didn't take the exam because it implies that 
he got every question wrong when in fact he didn't write the test at all; 
that would be an example of 0 having two contradicatory meanings. Given a 
phobia about nulls, it is a reasonable design to put students who don't take 
the test in a separate table but it's not the ONLY way to handle the 
situation: a null to indicate a student who didn't take the test is 
perfectly reasonable too.




 perfectly valid value to use in many, many situations. Two classic
 examples:

 Let me first state that there's a difference between storing NULLs
 and handling NULLs in your result.

Yes, of  course there is: so what? You said NULLs should never be

_stored_;
that's what I'm responding to. Using NULLs is not very _difficult_ 
anyway,

just a bit tedious because it is another case to handle.


There's no point in storing what you don't know.

But you _do_ know something: that the student didn't take the exam. That is 
a piece of true knowledge all by itself.



 So, let's debunk these two classic examples...

Please note that I did not say that you MUST use NULLs, just that they

were

perfectly valid to use in a design.


Depends on who you're asking ...

A few years ago, I was working at a company that had developed
an application with an Oracle database with about 400 tables, nothing
too large.

Plenty of tables had NULLs, not because of missing business data,
but rather used internally in the system.


I'm not sure what you mean by that phrase used internally in the system.


The number of problems we had with them, I cannot count on my
two hands, nor can I count them on the hands of all other developers
(5) for that project.

How about enumerating a few of them here? I've dealt with relational 
databases in several capacities for over 20 years and I've never found nulls 
to be a big problem.



Just last week, I spoke to a guy who used to be my collegue there
and we discussed the design a bit (he still does database design and
development for a large company, on Sybase with over 4000
deployed systems) and we both agreed that using NULLs in most
cases as part of the initial design was causing us more problems
then helping us.


Again, how about listing some of these problems?


 1. You have a table containing test scores. Some students get every
 question
 on a given test wrong. Some students don't write the test at all,

perhaps

 because they were sick. In this scenario, I think the correct thing to
 store
 is 0 for the students who got every question wrong and NULL for the
 students
 who didn't write the test.

 IMO, the better thing to do in this particular case is to NOT store a

test

 result for the students that did not make the test.

Okay, that might be acceptable, if it doesn't cause you to lose track of

the

student altogether. But if this table was the only one that even recorded
the _existence_ of the student, you'd have a problem;


Indeed, then I would have a problem, cause you cannot derive from
the test results table that a student exists. This is a rather silly
statement.
If the test results table should ALSO store the existence of a student,
you're design is wrong :-)


if someone tried to
verify that the student had attended this school, you wouldn't know that
they had.


In your own words: nonsense.

Each table should store what is was designed to store. If I would
have to know if a student was enlisted in a certain course or would
be attending this school, I would not be using

Re: Accountability with MySQL

2006-03-15 Thread Martijn Tonies
Hello,

(please read all before replying)

Basics of database design: store what you know.
   
Given that NULLs basically means the absence of a value
(eg: unknown), you shouldn't be storing NULLs.
   
   Nonsense!!
  
   That's a bold statement ...
  
   That's simply wrong. A null means unknown or not applicable and is
a
  
   Ah, so now things become clear, a NULL actually can mean
   two things? So much for clarity then...
  
  Come on; that's not fair. Unknown and not applicable are more like
  different senses of the same thing, not two opposite things.
 
  Read the literature on how to design databases. What you do
  is storing true propositions. That is, each attribute defines a
  certain true proposition. For example:
 
  Employee Martijn has Employee# 14.
 
  You should be able to derive these sentences from every row.
  Putting a NULL or N/A in there fails to meet this requirement.
 
 Perhaps my theory is a bit rusty but I have never heard of this true
 propositions business before. Do you have a citation where I can read
about
 this?

I would certainly recommend this one:
http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1

 Frankly, I'm skeptical about your interpretation of this idea. While you
 clearly don't want false information in a database, it isn't false to say
 that student such-and-such's exam mark or such-and-such an employee's
 termination date is unknown or not applicable. It _would_ be false to
store
 a grade of 0 for a student who didn't take the exam because it implies
that
 he got every question wrong when in fact he didn't write the test at all;

Agreed.

 that would be an example of 0 having two contradicatory meanings. Given a
 phobia about nulls, it is a reasonable design to put students who don't
take
 the test in a separate table but it's not the ONLY way to handle the
 situation: a null to indicate a student who didn't take the test is
 perfectly reasonable too.

Disagreed. Let me explain further:
In this case, a null means (according to you and in your design)
that the student did not take the test.

It could also mean that we lots it's grade for now, and that we
want to fill it in later, as in: unknown, which is what NULL also
means (your own words).

In another case, null would be the employee isn't terminated yet,
but could be in the future.

That's the problems with nulls - there's no actual defined
meaning. You can never say:

This table holds test results, a row will mean:
Student with StudentID 123 made test #12 and scored a 4

Cause you can have rows that can mean this:
Student with StudentID 123 made test #12 and scored a
yet unknown score

or

Student with StudentID 123 made test #12 and did not make
the test

or
Student with StudentID 123 and test #12 doesn't apply to him


In short: NULL is ambigious.

   perfectly valid value to use in many, many situations. Two classic
   examples:
  
   Let me first state that there's a difference between storing NULLs
   and handling NULLs in your result.
  
  Yes, of  course there is: so what? You said NULLs should never be
  _stored_;
  that's what I'm responding to. Using NULLs is not very _difficult_
  anyway,
  just a bit tedious because it is another case to handle.
 
  There's no point in storing what you don't know.
 
 But you _do_ know something: that the student didn't take the exam. That
is
 a piece of true knowledge all by itself.

But that's not what the rows in the table would mean - see above.
In the relational model, the idea is to store true propositions - that is,
each row means the same thing, but with different attributes. See above.

   So, let's debunk these two classic examples...
  
  Please note that I did not say that you MUST use NULLs, just that they
  were
  perfectly valid to use in a design.
 
  Depends on who you're asking ...
 
  A few years ago, I was working at a company that had developed
  an application with an Oracle database with about 400 tables, nothing
  too large.
 
  Plenty of tables had NULLs, not because of missing business data,
  but rather used internally in the system.
 
 I'm not sure what you mean by that phrase used internally in the system.

Missing business data, eg: no middle name. Some people would
put in a NULL in there as well, while NULL can also mean: I do
not KNOW this persons middle name. Ambiguity, see above.

  The number of problems we had with them, I cannot count on my
  two hands, nor can I count them on the hands of all other developers
  (5) for that project.
 
 How about enumerating a few of them here? I've dealt with relational
 databases in several capacities for over 20 years and I've never found
nulls
 to be a big problem.

Not big, but a problem... :-)

Do you often go around looking for (little) problems while you
can easily avoid them?

  Just last week, I spoke to a guy who used to be my collegue there
  and we discussed the design 

Re: Accountability with MySQL

2006-03-15 Thread Rhino


- Original Message - 
From: Martijn Tonies [EMAIL PROTECTED]

To: mysql@lists.mysql.com
Sent: Wednesday, March 15, 2006 10:47 AM
Subject: Re: Accountability with MySQL



Hello,

(please read all before replying)


   Basics of database design: store what you know.
  
   Given that NULLs basically means the absence of a value
   (eg: unknown), you shouldn't be storing NULLs.
  
  Nonsense!!
 
  That's a bold statement ...
 
  That's simply wrong. A null means unknown or not applicable and 
  is

a

 
  Ah, so now things become clear, a NULL actually can mean
  two things? So much for clarity then...
 
 Come on; that's not fair. Unknown and not applicable are more like
 different senses of the same thing, not two opposite things.

 Read the literature on how to design databases. What you do
 is storing true propositions. That is, each attribute defines a
 certain true proposition. For example:

 Employee Martijn has Employee# 14.

 You should be able to derive these sentences from every row.
 Putting a NULL or N/A in there fails to meet this requirement.

Perhaps my theory is a bit rusty but I have never heard of this true
propositions business before. Do you have a citation where I can read

about

this?


I would certainly recommend this one:
http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1

Hmm, I had hoped for a citation to a free online document that I could read. 
I don't fancy the thought of spending $100+ US on a book just to argue the 
fine points of database design


Then again, I didn't ask for a link to a free source of this information so 
I can't fault you for your link, it (presumably) answers the question.




Frankly, I'm skeptical about your interpretation of this idea. While you
clearly don't want false information in a database, it isn't false to say
that student such-and-such's exam mark or such-and-such an employee's
termination date is unknown or not applicable. It _would_ be false to

store

a grade of 0 for a student who didn't take the exam because it implies

that

he got every question wrong when in fact he didn't write the test at all;


Agreed.


that would be an example of 0 having two contradicatory meanings. Given a
phobia about nulls, it is a reasonable design to put students who don't

take

the test in a separate table but it's not the ONLY way to handle the
situation: a null to indicate a student who didn't take the test is
perfectly reasonable too.


Disagreed. Let me explain further:
In this case, a null means (according to you and in your design)
that the student did not take the test.


Yes.


It could also mean that we lots it's grade for now, and that we
want to fill it in later, as in: unknown, which is what NULL also
means (your own words).

In another case, null would be the employee isn't terminated yet,
but could be in the future.

That's the problems with nulls - there's no actual defined
meaning.


I agree that the meaning is not a single, very narrow thing like 'zero' or 
'unknown' in every single case. You have to put a null into some kind of 
context. But that context is usually fairly obvious, at least in my opinion.



You can never say:

This table holds test results, a row will mean:
Student with StudentID 123 made test #12 and scored a 4

Cause you can have rows that can mean this:
Student with StudentID 123 made test #12 and scored a
yet unknown score

or

Student with StudentID 123 made test #12 and did not make
the test

or
Student with StudentID 123 and test #12 doesn't apply to him

If the exact reason that a null is in a particular row is particularly 
important, you can always define an extra column that describes the reason 
for the null. A could mean the score hadn't been entered yet because the 
exam hadn't taken place yet, B could mean the test has already taken place 
but the student didn't write the exam, C could mean that student has no 
score because he was excused from the test for some reason. I would agree 
that this is cumbersome but I think you'd have a very similar problem with 
your design: you would have put some people in a separate table if they 
didn't write the test but you'd have to identify the reason that the test 
wasn't written too, if you cared about that, or have separate tables for 
students who failed to attend the test and students who were excused from 
the test. So I don't see that you're a lot farther ahead with your design.


In fact, your design could be _more_ complicated. In your design a program 
couldn't simply look up the student by his ID in a single table and then 
react to a null; the program would have to look in the main Grades table 
first, then, if it didn't find the student there, would have to look in the 
table that records people who didn't write the test because they were sick. 
If the row wasn't found there, it would have to look in the table that 
records people who were

Re: Accountability with MySQL

2006-03-15 Thread Stephen Cook
A commonly used example (at least at my last 3 jobs) would be a table of 
demographics for people (whether they be employees, clients, whatever).


You can have one table and allow NULLs for some of the fields (id, 
LastName, FirstName, MiddleInitial, Title, NameSuffix, AddressLine1, 
AddressLine2, City, State, Zip, Phone1, Phone2, Fax, Pager), or you can 
have a separate table for each of the nullable fields (I count 8 in 
addition to the base demographics table).


I am not a master DBA but it seems kind of brain damaged to insist on 
using 9 tables left joined together every time I need a line of 
demographics.



Martijn Tonies wrote:

Hello,

(please read all before replying)



Basics of database design: store what you know.

Given that NULLs basically means the absence of a value
(eg: unknown), you shouldn't be storing NULLs.



Nonsense!!


That's a bold statement ...



That's simply wrong. A null means unknown or not applicable and is


a


Ah, so now things become clear, a NULL actually can mean
two things? So much for clarity then...



Come on; that's not fair. Unknown and not applicable are more like
different senses of the same thing, not two opposite things.


Read the literature on how to design databases. What you do
is storing true propositions. That is, each attribute defines a
certain true proposition. For example:

Employee Martijn has Employee# 14.

You should be able to derive these sentences from every row.
Putting a NULL or N/A in there fails to meet this requirement.



Perhaps my theory is a bit rusty but I have never heard of this true
propositions business before. Do you have a citation where I can read


about


this?



I would certainly recommend this one:
http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1



Frankly, I'm skeptical about your interpretation of this idea. While you
clearly don't want false information in a database, it isn't false to say
that student such-and-such's exam mark or such-and-such an employee's
termination date is unknown or not applicable. It _would_ be false to


store


a grade of 0 for a student who didn't take the exam because it implies


that


he got every question wrong when in fact he didn't write the test at all;



Agreed.



that would be an example of 0 having two contradicatory meanings. Given a
phobia about nulls, it is a reasonable design to put students who don't


take


the test in a separate table but it's not the ONLY way to handle the
situation: a null to indicate a student who didn't take the test is
perfectly reasonable too.



Disagreed. Let me explain further:
In this case, a null means (according to you and in your design)
that the student did not take the test.

It could also mean that we lots it's grade for now, and that we
want to fill it in later, as in: unknown, which is what NULL also
means (your own words).

In another case, null would be the employee isn't terminated yet,
but could be in the future.

That's the problems with nulls - there's no actual defined
meaning. You can never say:

This table holds test results, a row will mean:
Student with StudentID 123 made test #12 and scored a 4

Cause you can have rows that can mean this:
Student with StudentID 123 made test #12 and scored a
yet unknown score

or

Student with StudentID 123 made test #12 and did not make
the test

or
Student with StudentID 123 and test #12 doesn't apply to him


In short: NULL is ambigious.



perfectly valid value to use in many, many situations. Two classic


examples:

Let me first state that there's a difference between storing NULLs
and handling NULLs in your result.



Yes, of  course there is: so what? You said NULLs should never be


_stored_;


that's what I'm responding to. Using NULLs is not very _difficult_
anyway,
just a bit tedious because it is another case to handle.


There's no point in storing what you don't know.



But you _do_ know something: that the student didn't take the exam. That


is


a piece of true knowledge all by itself.



But that's not what the rows in the table would mean - see above.
In the relational model, the idea is to store true propositions - that is,
each row means the same thing, but with different attributes. See above.



So, let's debunk these two classic examples...



Please note that I did not say that you MUST use NULLs, just that they


were


perfectly valid to use in a design.


Depends on who you're asking ...

A few years ago, I was working at a company that had developed
an application with an Oracle database with about 400 tables, nothing
too large.

Plenty of tables had NULLs, not because of missing business data,
but rather used internally in the system.



I'm not sure what you mean by that phrase used internally in the system.



Missing business data, eg: no middle name. Some people would
put in a NULL in there as well, while NULL can also mean: I do
not KNOW this 

Re: Accountability with MySQL

2006-03-15 Thread James Harvard
At 7:28 pm +0100 14/3/06, Martijn Tonies wrote:
Given that NULLs basically means the absence of a value (eg: unknown), you 
shouldn't be storing NULLs.

At 1:49 pm -0500 14/3/06, Rhino wrote:
A null means unknown or not applicable and is a perfectly valid value to use 
in many, many situations.


It's not often you can say that two people are quite literally arguing about 
nothing!

James Harvard

(... being flippant because the actual arguments started going over my head 
about half-a-dozen posts ago...)

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Accountability with MySQL

2006-03-14 Thread Bruno B B Magalháes
I am designing a simple accountability system so all the partners can have
direct access to it by intranet.

I was designing the data model, and came up with this:

CREATE TABLE `moviments` (
  `moviment_id` int(20) NOT NULL auto_increment,
  `moviment_date` date NOT NULL default '-00-00',
  `moviment_description` char(200) NOT NULL default '',
  `moviment_assignor` char(80) NOT NULL default '',
  `moviment_drawee` char(80) NOT NULL default '',
  `moviment_amount` int(20) NOT NULL default '0',
  PRIMARY KEY  (`moviment_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

But I am a little bit stuck with this:

Should I use a column to mark if the moviment has been executed?
About income and outcome, should I use a column called moviment_type or
just put a negative value when is an outcome for exampe?

Does anyone ever made something like that, any other idea that could
improve my little system?

Best regards,
Bruno B B Magalhães

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Martijn Tonies
Hello Bruno,

 I am designing a simple accountability system so all the partners can have
 direct access to it by intranet.

 I was designing the data model, and came up with this:

 CREATE TABLE `moviments` (
   `moviment_id` int(20) NOT NULL auto_increment,
   `moviment_date` date NOT NULL default '-00-00',

What a weird default date - it's rather invalid, isn't it?

   `moviment_description` char(200) NOT NULL default '',
   `moviment_assignor` char(80) NOT NULL default '',
   `moviment_drawee` char(80) NOT NULL default '',
   `moviment_amount` int(20) NOT NULL default '0',
   PRIMARY KEY  (`moviment_id`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1


btw, given all the default clauses, did you know you can actually
put a complete empty moviment into this table? Doesn't say a thing.
You might want to change that a bit :-)

 But I am a little bit stuck with this:

 Should I use a column to mark if the moviment has been executed?
 About income and outcome, should I use a column called moviment_type or
 just put a negative value when is an outcome for exampe?

Calculating summaries etc is, I think, much easier if you just do a minus.

 Does anyone ever made something like that, any other idea that could
 improve my little system?

Well, it depends on what you're trying to do and how you want to
query data.

You could also decide to create two tables - one for positives and one
for negatives.

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Bruno B B Magalháes
Hi Martijn,

well about the date default value being invalid, well it´s working here in
my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with
this...

About the business logic I was thinking about showing the current month by
default, and if the user wants he can select diferent days, months and
year, or beetween dates.. And everything will be shown in a simple list,
with some graphics like bills to pay in the next week, income to recieve
in next week, and simple stats of that kind..

To split in two tables (income and outcome) seams a good idea...

Best Regards,
Bruno B B Magalhães

 Hello Bruno,

 I am designing a simple accountability system so all the partners can
 have
 direct access to it by intranet.

 I was designing the data model, and came up with this:

 CREATE TABLE `moviments` (
   `moviment_id` int(20) NOT NULL auto_increment,
   `moviment_date` date NOT NULL default '-00-00',

 What a weird default date - it's rather invalid, isn't it?

   `moviment_description` char(200) NOT NULL default '',
   `moviment_assignor` char(80) NOT NULL default '',
   `moviment_drawee` char(80) NOT NULL default '',
   `moviment_amount` int(20) NOT NULL default '0',
   PRIMARY KEY  (`moviment_id`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1


 btw, given all the default clauses, did you know you can actually
 put a complete empty moviment into this table? Doesn't say a thing.
 You might want to change that a bit :-)

 But I am a little bit stuck with this:

 Should I use a column to mark if the moviment has been executed?
 About income and outcome, should I use a column called moviment_type or
 just put a negative value when is an outcome for exampe?

 Calculating summaries etc is, I think, much easier if you just do a minus.

 Does anyone ever made something like that, any other idea that could
 improve my little system?

 Well, it depends on what you're trying to do and how you want to
 query data.

 You could also decide to create two tables - one for positives and one
 for negatives.

 Martijn Tonies
 Database Workbench - development tool for MySQL, and more!
 Upscene Productions
 http://www.upscene.com
 My thoughts:
 http://blog.upscene.com/martijn/
 Database development questions? Check the forum!
 http://www.databasedevelopmentforum.com


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
 http://lists.mysql.com/[EMAIL PROTECTED]






-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread SGreen
Bruno B B Magalháes [EMAIL PROTECTED] wrote on 03/14/2006 12:41:35 
PM:

 I am designing a simple accountability system so all the partners can 
have
 direct access to it by intranet.
 
 I was designing the data model, and came up with this:
 
 CREATE TABLE `moviments` (
   `moviment_id` int(20) NOT NULL auto_increment,
   `moviment_date` date NOT NULL default '-00-00',
   `moviment_description` char(200) NOT NULL default '',
   `moviment_assignor` char(80) NOT NULL default '',
   `moviment_drawee` char(80) NOT NULL default '',
   `moviment_amount` int(20) NOT NULL default '0',
   PRIMARY KEY  (`moviment_id`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1
 
 But I am a little bit stuck with this:
 
 Should I use a column to mark if the moviment has been executed?
 About income and outcome, should I use a column called moviment_type or
 just put a negative value when is an outcome for exampe?
 
 Does anyone ever made something like that, any other idea that could
 improve my little system?
 
 Best regards,
 Bruno B B Magalhães
 

It all depends on what accounting style you want to use whether you want 
single-entry or double-entry bookkeeping. There are pros and cons to both 
styles. The best answer comes from you: If you were an accountant, how 
would you write this out on paper? Once you figure out what you paper 
books should look like you should be able to manage your electronic data 
almost identically to how you manage your paper data.

If you are used to seeing a list of debits and credits where each line 
represents one account (end) within a single transaction (these lines 
always appear in pairs) then you want to use one type of table. If you 
want to see each transaction on a single line that also lists both ends of 
the transaction, you use a table similar to the one you made. 

If you have a flag for transaction type (debit,credit, etc) then your 
amounts should almost always be listed as positive values. It's the 
position of the account (assignor or drawee) and the transaction type that 
determines the sign (+/-) to apply to the value as you apply the value to 
an account.

Should you have a flag for the status movement complete ? I would say 
yes but instead of a simple checkbox, you could store a date value. That 
gives you two pieces of information

a) if the date is null then the movement is not complete.
b) if the date is NOT null then the movement is complete and you know when 
it finished.

As I said before, how you set up your accounting tables really depends on 
the style of accounting you want to do. Check with a bookkeeper or an 
accountant for more details.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine




Re: Accountability with MySQL

2006-03-14 Thread Martijn Tonies
Hello Bruno,

 well about the date default value being invalid, well it´s working here in
 my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with
 this...

Yes, it's a valid value in MySQL, but it's an invalid date,
that's what I'm trying to say. Why have an invalid date
as the default?


Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread SGreen
Martijn Tonies [EMAIL PROTECTED] wrote on 03/14/2006 01:16:11 PM:

 Hello Bruno,
 
  well about the date default value being invalid, well it´s working 
here in
  my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work 
with
  this...
 
 Yes, it's a valid value in MySQL, but it's an invalid date,
 that's what I'm trying to say. Why have an invalid date
 as the default?
 
 
 Martijn Tonies
 Database Workbench - development tool for MySQL, and more!
 Upscene Productions
 http://www.upscene.com
 My thoughts:
 http://blog.upscene.com/martijn/
 Database development questions? Check the forum!
 http://www.databasedevelopmentforum.com
 

Martin,

If you used MySQL just a bit more often for day-to-day operations (yes, we 
all know that you build useful tools that interoperate with several RDBM 
systems), you would know that you cannot assign anything but constant 
values as default values for any type of column in MySQL (timestamp being 
the singular, partial exception). That means you cannot design a column to 
have more meaningful default date by capturing the current time because 
functions are not permitted as default values (yet). 

With that in mind, a design default of -00-00 is as good as any 
other single, randomly chosen default date, don't you think? That's also 
the default default date if you define a non-null date column and do not 
specify your own default in the definition.


Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Re: Accountability with MySQL

2006-03-14 Thread Martijn Tonies
Should you have a flag for the status movement complete ? I would say
yes but instead of a simple checkbox, you could store a date value. That
gives you two pieces of information

a) if the date is null then the movement is not complete.
b) if the date is NOT null then the movement is complete and you know when
it finished.

There, Shawn and I disagree :-)

In case of a very simple database, one could use NULL
as a flag or signal. But in general, I would avoid
storing NULLs.

Basics of database design: store what you know.

Given that NULLs basically means the absence of a value
(eg: unknown), you shouldn't be storing NULLs.

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Rhino


- Original Message - 
From: Bruno B B Magalháes [EMAIL PROTECTED]

To: mysql@lists.mysql.com
Sent: Tuesday, March 14, 2006 12:41 PM
Subject: Accountability with MySQL



I am designing a simple accountability system so all the partners can have
direct access to it by intranet.

I was designing the data model, and came up with this:

CREATE TABLE `moviments` (
 `moviment_id` int(20) NOT NULL auto_increment,
 `moviment_date` date NOT NULL default '-00-00',
 `moviment_description` char(200) NOT NULL default '',
 `moviment_assignor` char(80) NOT NULL default '',
 `moviment_drawee` char(80) NOT NULL default '',
 `moviment_amount` int(20) NOT NULL default '0',
 PRIMARY KEY  (`moviment_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

One small observation: if your column names are meant to be in English - as 
suggested by your use of amount, drawee, etc. I think you should also 
use an English word where you are currently using moviment. Moviment is 
not an English word. There _is_ an English word movement that might be 
what you want but I suspect that transaction would be the term most 
English-speakers would use in the table you describe. Your email address 
indicates that you are from Brazil so I'll assume you are a 
Portuguese-speaker, not a native English-speaker. This suggestion is only 
meant to be a friendly suggestion, not an insult to your English which is 
very very good. :-)



But I am a little bit stuck with this:

Should I use a column to mark if the moviment has been executed?


That depends. You could decide to only add the row to the table if it has 
been executed; then, you would know implicitly that every entry has been 
executed. On the other hand, if you want to show transactions that have not 
yet been executed, a column that indicates whether the transaction has been 
executed would probably be a good idea.


It might be even better to display details about the transaction that refer 
to its successful execution rather than just showing a yes/no flag. Perhaps 
you could store the timestamp that shows when the transaction was completed 
and maybe the identity of the person or program that completed the 
transaction or even a copy of the document generated by the transaction (or 
a link to this document). For instance, if a receipt was issued for this 
transaction, you might want to store an image of the receipt (or a link to 
the image) in the table.



About income and outcome, should I use a column called moviment_type or
just put a negative value when is an outcome for exampe?

Either approach should work. It would probably be easier to simply store the 
sign of the transaction with the amount.



Does anyone ever made something like that, any other idea that could
improve my little system?

I think the question you need to ask yourself is What do I want to be able 
to find out about a transaction after it has taken place?. If you decide 
you want to know what currency was involved in the transaction, you need to 
record that in your table. If you decide you want to know the serial numbers 
of the banknotes used in the transaction, you need to record that. And so 
on. It might be wise to talk to someone in the bank and ask them what things 
they need to find out while a transaction is taking place and afterwards.


You may want to try asking your users (or their managers) what sorts of 
things people ask for but can't get from the old system; these are prime 
candidates for things that you could add to your new system.


I'm a little surprised that your amount column is an integer. Does the 
Brazilian currency not have a fractional component? Here in Canada, monetary 
amounts are always decimal numbers, like $123.45, meaning one hundred and 
twenty three dollars and forty five cents. Or is your currency like Japanese 
yen, which have no fractional part?


--
Rhino 




--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.385 / Virus Database: 268.2.2/280 - Release Date: 13/03/2006


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Martijn Tonies
Hello Shawn,

  well about the date default value being invalid, well it´s working here
in
  my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with
  this...

 Yes, it's a valid value in MySQL, but it's an invalid date,
 that's what I'm trying to say. Why have an invalid date
 as the default?


If you used MySQL just a bit more often for day-to-day operations (yes, we
all know that you build useful tools that interoperate with several RDBM
systems), you would know that you cannot assign anything but constant
values as default values for any type of column in MySQL (timestamp being
the singular, partial exception).

I know that.

That means you cannot design a column to have more meaningful default date
by capturing the current time because functions are not permitted as default
values (yet).


Same here.

With that in mind, a design default of -00-00 is as good as any other
single, randomly chosen default date, don't you think? That's also the
default default date if you define a non-null date column and do not
specify your own default in the definition.

I'd rather say that having no default (and a NOT NULL) would be
the better thing to do.

What's the use of not null and a default?

Isn't that like saying: well, you have to fill in each column, but
if you don't I'll assign something empty-ish.


Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Rhino


- Original Message - 
From: Martijn Tonies [EMAIL PROTECTED]

To: mysql@lists.mysql.com
Sent: Tuesday, March 14, 2006 1:16 PM
Subject: Re: Accountability with MySQL



Hello Bruno,

well about the date default value being invalid, well it´s working here 
in

my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with
this...


Yes, it's a valid value in MySQL, but it's an invalid date,
that's what I'm trying to say. Why have an invalid date
as the default?



I think there may be a little bit of language confusion here.

Bruno, I think Martijn is suggesting that you use a date that can actually 
exist as the default transaction date; '000-00-00' is accepted by MySQL as a 
date that can be stored in a table but it is not a date that ever really 
existed. There was never a Year 0 in the history of the world.


I think Martijn is right; it is probably a better choice to use a real 
date in your table. The obvious choice would be to store the current date in 
that column. For example, if you add a row to the table, the new row should 
normally contain the date that the transaction was executed; that might be 
yesterday or a week ago or maybe even longer. But if no transaction date is 
known, it would probably be reasonable to store today's date. But sometimes 
that _wouldn't_ be reasonable.


You should probably talk to the people in the bank about this and ask if it 
is ever possible that the transaction date would be unknown; if the 
transaction data _can_ be unknown, you should ask what date they usually 
store in their existing system in that case. Then your new table should 
probably do the exact same thing as the old system does, assuming everyone 
agrees that this is the right thing to do. (The users may say that it is the 
wrong thing to do and suggest a better value to choose for the transaction 
date.) But if the transaction date can never be unknown, it doesn't make 
much difference what you choose as the default date since it will never be 
used.


--
Rhino 




--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.385 / Virus Database: 268.2.2/280 - Release Date: 13/03/2006


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Rhino


- Original Message - 
From: Martijn Tonies [EMAIL PROTECTED]

To: mysql@lists.mysql.com
Sent: Tuesday, March 14, 2006 1:28 PM
Subject: Re: Accountability with MySQL



Should you have a flag for the status movement complete ? I would say

yes but instead of a simple checkbox, you could store a date value. That
gives you two pieces of information

a) if the date is null then the movement is not complete.
b) if the date is NOT null then the movement is complete and you know when
it finished.


There, Shawn and I disagree :-)

In case of a very simple database, one could use NULL
as a flag or signal. But in general, I would avoid
storing NULLs.

Basics of database design: store what you know.

Given that NULLs basically means the absence of a value
(eg: unknown), you shouldn't be storing NULLs.


Nonsense!!

That's simply wrong. A null means unknown or not applicable and is a 
perfectly valid value to use in many, many situations. Two classic examples:


1. You have a table containing test scores. Some students get every question 
on a given test wrong. Some students don't write the test at all, perhaps 
because they were sick. In this scenario, I think the correct thing to store 
is 0 for the students who got every question wrong and NULL for the students 
who didn't write the test.
2. You have a table containing employee records. One of the columns is 
termination date. What value do you store for a new employee's termination 
date? Well, if they are a contractor on a fixed length contract, you could 
calculate the date the contract ends; fair enough. If they are a permanent 
employee and your area has mandatory retirement, you could calculate the 
date they turn 65 (or whatever) and use that. But what if they are a 
permanent employee and you don't have mandatory retirement? I would store a 
NULL to mean I don't know right now. Then, if and when they gave notice 
that they were leaving, I would change the termination date from NULL to 
their last day of work.


--
Rhino 




--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.385 / Virus Database: 268.2.2/280 - Release Date: 13/03/2006


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Bruno B B Magalháes

I am designing a simple accountability system so all the partners can
 have
 direct access to it by intranet.

 I was designing the data model, and came up with this:

 CREATE TABLE `moviments` (
  `moviment_id` int(20) NOT NULL auto_increment,
  `moviment_date` date NOT NULL default '-00-00',
  `moviment_description` char(200) NOT NULL default '',
  `moviment_assignor` char(80) NOT NULL default '',
  `moviment_drawee` char(80) NOT NULL default '',
  `moviment_amount` int(20) NOT NULL default '0',
  PRIMARY KEY  (`moviment_id`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1

 One small observation: if your column names are meant to be in English - as
 suggested by your use of amount, drawee, etc. I think you should
also use an English word where you are currently using moviment.
Moviment is
 not an English word. There _is_ an English word movement that might be
what you want but I suspect that transaction would be the term most
English-speakers would use in the table you describe. Your email address
indicates that you are from Brazil so I'll assume you are a
 Portuguese-speaker, not a native English-speaker. This suggestion is
only meant to be a friendly suggestion, not an insult to your English
which is very very good. :-)

Ups, that´s right, I misspelt movement, you know beeing a partner and also
being the financial director, technology director and CEO some times can
be exhaustive. Well, at least we are growing.. (the opposite of my
weekends, that seams to be getting smaller and smaller).  But thanks, I
also agree transaction is more suitable.
 But I am a little bit stuck with this:

 Should I use a column to mark if the moviment has been executed?

 That depends. You could decide to only add the row to the table if it
has been executed; then, you would know implicitly that every entry has
been executed. On the other hand, if you want to show transactions that
have not
 yet been executed, a column that indicates whether the transaction has been
 executed would probably be a good idea.

 It might be even better to display details about the transaction that refer
 to its successful execution rather than just showing a yes/no flag. Perhaps
 you could store the timestamp that shows when the transaction was completed
 and maybe the identity of the person or program that completed the
transaction or even a copy of the document generated by the transaction
(or
 a link to this document). For instance, if a receipt was issued for this
transaction, you might want to store an image of the receipt (or a link
to the image) in the table.

It´s a good idea, but for now, overkill... I mean here is my new design:

CREATE TABLE `earnings` (
  `transaction_id` int(20) NOT NULL auto_increment,
  `transaction_planned_date` date NOT NULL default '-00-00',
  `transaction_executed_date` date NOT NULL default '-00-00',
  `transaction_description` char(200) NOT NULL default '',
  `transaction_assignor` char(80) NOT NULL default '',
  `transaction_amount` int(20) NOT NULL default '0',
  PRIMARY KEY  (`transaction_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

CREATE TABLE `earnings` (
  `transaction_id` int(20) NOT NULL auto_increment,
  `transaction_planned_date` date NOT NULL default '-00-00',
  `transaction_executed_date` date NOT NULL default '-00-00',
  `transaction_description` char(200) NOT NULL default '',
  `transaction_assignor` char(80) NOT NULL default '',
  `transaction_amount` int(20) NOT NULL default '0',
  PRIMARY KEY  (`transaction_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

I think it is a good model for start...

 About income and outcome, should I use a column called moviment_type or
just put a negative value when is an outcome for example?

 Either approach should work. It would probably be easier to simply store
the
 sign of the transaction with the amount.

 Does anyone ever made something like that, any other idea that could
improve my little system?

 I think the question you need to ask yourself is What do I want to be able
 to find out about a transaction after it has taken place?. If you
decide you want to know what currency was involved in the transaction,
you need to
 record that in your table. If you decide you want to know the serial
numbers
 of the banknotes used in the transaction, you need to record that. And
so on. It might be wise to talk to someone in the bank and ask them what
things
 they need to find out while a transaction is taking place and
afterwards. You may want to try asking your users (or their managers)
what sorts of things people ask for but can't get from the old system;
these are prime candidates for things that you could add to your new
system.

What I want? Well that my partners stop asking me to send the spreadsheet
every week... :) They want to take a look the company´s profitability, if
all payments have been made, if all clients have payed... As I am the one
who make every transaction I simple would input it into DB (direct) and
they would access a simple 

Re: Accountability with MySQL

2006-03-14 Thread Martijn Tonies

  Should you have a flag for the status movement complete ? I would say
 yes but instead of a simple checkbox, you could store a date value. That
 gives you two pieces of information
 
 a) if the date is null then the movement is not complete.
 b) if the date is NOT null then the movement is complete and you know
when
 it finished.
 
  There, Shawn and I disagree :-)
 
  In case of a very simple database, one could use NULL
  as a flag or signal. But in general, I would avoid
  storing NULLs.
 
  Basics of database design: store what you know.
 
  Given that NULLs basically means the absence of a value
  (eg: unknown), you shouldn't be storing NULLs.
 
 Nonsense!!

That's a bold statement ...

 That's simply wrong. A null means unknown or not applicable and is a

Ah, so now things become clear, a NULL actually can mean
two things? So much for clarity then...

 perfectly valid value to use in many, many situations. Two classic
examples:

Let me first state that there's a difference between storing NULLs
and handling NULLs in your result.

So, let's debunk these two classic examples...

 1. You have a table containing test scores. Some students get every
question
 on a given test wrong. Some students don't write the test at all, perhaps
 because they were sick. In this scenario, I think the correct thing to
store
 is 0 for the students who got every question wrong and NULL for the
students
 who didn't write the test.

IMO, the better thing to do in this particular case is to NOT store a test
result for the students that did not make the test.

So, an example table with test scores:

TEST_SCORES
StudentID Int,
TestID Int,
Score TinyInt Unsigned

Now, students who didn't make the test won't have a record in here.

Perfectly valid design AND you avoid storing NULLs.

 2. You have a table containing employee records. One of the columns is
 termination date. What value do you store for a new employee's
termination
 date? Well, if they are a contractor on a fixed length contract, you could
 calculate the date the contract ends; fair enough. If they are a permanent
 employee and your area has mandatory retirement, you could calculate the
 date they turn 65 (or whatever) and use that. But what if they are a
 permanent employee and you don't have mandatory retirement? I would store
a
 NULL to mean I don't know right now. Then, if and when they gave notice
 that they were leaving, I would change the termination date from NULL to
 their last day of work.

Why store a date column if you don't know?

Why not use:

EMPLOYEES
EmployeeID int,
StartingDate Date,
...

TERMINATED_EMPLOYEES (albeit a bit agressive ;) )
EmployeeID
TerminationDate

Once more: perfectly valid design.

Martijn Tonies
Database Workbench - development tool for MySQL, and more!
Upscene Productions
http://www.upscene.com
My thoughts:
http://blog.upscene.com/martijn/
Database development questions? Check the forum!
http://www.databasedevelopmentforum.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Rhino


- Original Message - 
From: Bruno B B Magalháes [EMAIL PROTECTED]

To: mysql@lists.mysql.com
Sent: Tuesday, March 14, 2006 2:07 PM
Subject: Re: Accountability with MySQL





I am designing a simple accountability system so all the partners can
have
direct access to it by intranet.

I was designing the data model, and came up with this:

CREATE TABLE `moviments` (
 `moviment_id` int(20) NOT NULL auto_increment,
 `moviment_date` date NOT NULL default '-00-00',
 `moviment_description` char(200) NOT NULL default '',
 `moviment_assignor` char(80) NOT NULL default '',
 `moviment_drawee` char(80) NOT NULL default '',
 `moviment_amount` int(20) NOT NULL default '0',
 PRIMARY KEY  (`moviment_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

One small observation: if your column names are meant to be in English - 
as

suggested by your use of amount, drawee, etc. I think you should

also use an English word where you are currently using moviment.
Moviment is

not an English word. There _is_ an English word movement that might be

what you want but I suspect that transaction would be the term most
English-speakers would use in the table you describe. Your email address
indicates that you are from Brazil so I'll assume you are a

Portuguese-speaker, not a native English-speaker. This suggestion is

only meant to be a friendly suggestion, not an insult to your English
which is very very good. :-)

Ups, that´s right, I misspelt movement, you know beeing a partner and also
being the financial director, technology director and CEO some times can
be exhaustive. Well, at least we are growing.. (the opposite of my
weekends, that seams to be getting smaller and smaller).  But thanks, I
also agree transaction is more suitable.

But I am a little bit stuck with this:

Should I use a column to mark if the moviment has been executed?


That depends. You could decide to only add the row to the table if it

has been executed; then, you would know implicitly that every entry has
been executed. On the other hand, if you want to show transactions that
have not
yet been executed, a column that indicates whether the transaction has 
been

executed would probably be a good idea.

It might be even better to display details about the transaction that 
refer
to its successful execution rather than just showing a yes/no flag. 
Perhaps
you could store the timestamp that shows when the transaction was 
completed

and maybe the identity of the person or program that completed the

transaction or even a copy of the document generated by the transaction
(or

a link to this document). For instance, if a receipt was issued for this

transaction, you might want to store an image of the receipt (or a link
to the image) in the table.

It´s a good idea, but for now, overkill... I mean here is my new design:

CREATE TABLE `earnings` (
 `transaction_id` int(20) NOT NULL auto_increment,
 `transaction_planned_date` date NOT NULL default '-00-00',
 `transaction_executed_date` date NOT NULL default '-00-00',
 `transaction_description` char(200) NOT NULL default '',
 `transaction_assignor` char(80) NOT NULL default '',
 `transaction_amount` int(20) NOT NULL default '0',
 PRIMARY KEY  (`transaction_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

CREATE TABLE `earnings` (
 `transaction_id` int(20) NOT NULL auto_increment,
 `transaction_planned_date` date NOT NULL default '-00-00',
 `transaction_executed_date` date NOT NULL default '-00-00',
 `transaction_description` char(200) NOT NULL default '',
 `transaction_assignor` char(80) NOT NULL default '',
 `transaction_amount` int(20) NOT NULL default '0',
 PRIMARY KEY  (`transaction_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

I think it is a good model for start...


About income and outcome, should I use a column called moviment_type or

just put a negative value when is an outcome for example?



Either approach should work. It would probably be easier to simply store

the

sign of the transaction with the amount.


Does anyone ever made something like that, any other idea that could

improve my little system?


I think the question you need to ask yourself is What do I want to be 
able

to find out about a transaction after it has taken place?. If you

decide you want to know what currency was involved in the transaction,
you need to

record that in your table. If you decide you want to know the serial

numbers

of the banknotes used in the transaction, you need to record that. And

so on. It might be wise to talk to someone in the bank and ask them what
things

they need to find out while a transaction is taking place and

afterwards. You may want to try asking your users (or their managers)
what sorts of things people ask for but can't get from the old system;
these are prime candidates for things that you could add to your new
system.

What I want? Well that my partners stop asking me to send the spreadsheet
every week... :) They want to take a look the company´s profitability, if
all

Re: Accountability with MySQL

2006-03-14 Thread Rhino


- Original Message - 
From: Martijn Tonies [EMAIL PROTECTED]

To: Rhino [EMAIL PROTECTED]; mysql@lists.mysql.com
Sent: Tuesday, March 14, 2006 2:16 PM
Subject: Re: Accountability with MySQL




 Should you have a flag for the status movement complete ? I would 
 say
yes but instead of a simple checkbox, you could store a date value. 
That

gives you two pieces of information

a) if the date is null then the movement is not complete.
b) if the date is NOT null then the movement is complete and you know

when

it finished.

 There, Shawn and I disagree :-)

 In case of a very simple database, one could use NULL
 as a flag or signal. But in general, I would avoid
 storing NULLs.

 Basics of database design: store what you know.

 Given that NULLs basically means the absence of a value
 (eg: unknown), you shouldn't be storing NULLs.

Nonsense!!


That's a bold statement ...


That's simply wrong. A null means unknown or not applicable and is a


Ah, so now things become clear, a NULL actually can mean
two things? So much for clarity then...

Come on; that's not fair. Unknown and not applicable are more like 
different senses of the same thing, not two opposite things.



perfectly valid value to use in many, many situations. Two classic

examples:

Let me first state that there's a difference between storing NULLs
and handling NULLs in your result.

Yes, of  course there is: so what? You said NULLs should never be _stored_; 
that's what I'm responding to. Using NULLs is not very _difficult_ anyway, 
just a bit tedious because it is another case to handle.



So, let's debunk these two classic examples...

Please note that I did not say that you MUST use NULLs, just that they were 
perfectly valid to use in a design.



1. You have a table containing test scores. Some students get every

question

on a given test wrong. Some students don't write the test at all, perhaps
because they were sick. In this scenario, I think the correct thing to

store

is 0 for the students who got every question wrong and NULL for the

students

who didn't write the test.


IMO, the better thing to do in this particular case is to NOT store a test
result for the students that did not make the test.

Okay, that might be acceptable, if it doesn't cause you to lose track of the 
student altogether. But if this table was the only one that even recorded 
the _existence_ of the student, you'd have a problem; if someone tried to 
verify that the student had attended this school, you wouldn't know that 
they had. Or in a more probable case, if that was the only test for that 
course and the student missed it and then had no row in the table, you might 
not have any way of knowing that they took the course! And if they later 
wanted to write the exam, having recovered from their illness, your query 
might have the effect of keeping them from taking the second exam: your 
query would report that they had never taken the first exam so an 
adminstrator might refuse to let him/her take the second exam because they 
(apparently) had never been scheduled to take the first one.


Now, you could have a second table to record people who were scheduled to 
take tests but failed to take them to cover that situation but I think it 
would be easier to record all students in one table and then simply store a 
null for any test that they fail to take and a zero for every case where a 
student got every answer wrong. When you compute the class average, the 
avg() function would ensure that the students who got every question wrong 
would pull down the class average but that students who failed to write the 
test at all would NOT skew the average because the avg() function ignores 
nulls.



So, an example table with test scores:

TEST_SCORES
StudentID Int,
TestID Int,
Score TinyInt Unsigned

Now, students who didn't make the test won't have a record in here.

Perfectly valid design AND you avoid storing NULLs.


2. You have a table containing employee records. One of the columns is
termination date. What value do you store for a new employee's

termination
date? Well, if they are a contractor on a fixed length contract, you 
could
calculate the date the contract ends; fair enough. If they are a 
permanent

employee and your area has mandatory retirement, you could calculate the
date they turn 65 (or whatever) and use that. But what if they are a
permanent employee and you don't have mandatory retirement? I would store

a

NULL to mean I don't know right now. Then, if and when they gave notice
that they were leaving, I would change the termination date from NULL to
their last day of work.


Why store a date column if you don't know?

Why not use:

EMPLOYEES
EmployeeID int,
StartingDate Date,
...

TERMINATED_EMPLOYEES (albeit a bit agressive ;) )
EmployeeID
TerminationDate

Once more: perfectly valid design.

Yes, that is also a valid design but it means you have to have yet another 
table that you could have avoided simply by permitting a null

Re: Accountability with MySQL

2006-03-14 Thread James Harvard
At 7:48 pm -0500 14/3/06, Rhino wrote:
Ah, so now things become clear, a NULL actually can mean
two things? So much for clarity then...

Come on; that's not fair. Unknown and not applicable are more like 
different senses of the same thing, not two opposite things.

IIRC (but please don't ask for a reference) it has been suggested that there 
should be two different types of NULL (value unknown  not applicable) for just 
this reason.

I suppose what it boils down to is that although the two main reasons one might 
use NULL are *not* the same thing, the outcome (the absence of a value) *is* 
the same.

So you're both right. Sort of.
:-)

James Harvard

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Martijn Tonies


 At 7:48 pm -0500 14/3/06, Rhino wrote:
 Ah, so now things become clear, a NULL actually can mean
 two things? So much for clarity then...
 
 Come on; that's not fair. Unknown and not applicable are more like
different senses of the same thing, not two opposite things.


 IIRC (but please don't ask for a reference) it has been suggested that
there should be two different types of NULL (value unknown  not applicable)
for just this reason.


Yep. unknown and n/a are two different things and SQL happens
to support them by NULLs. This is confusing at times and more
difficult to handle at least...

The problems with NULLs are numerous. Don't say they aren't, cause
they are. In several SQL functions, NULLs make a difference and
they will bite you every now and then. Especially for new people,
which is a clear signal they are confusing.

 I suppose what it boils down to is that although the two main reasons one
might use NULL are *not* the same thing, the outcome (the absence of a
value) *is* the same.

 So you're both right. Sort of.
 :-)

;-)

Martijn Tonies
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Accountability with MySQL

2006-03-14 Thread Martijn Tonies
   Basics of database design: store what you know.
  
   Given that NULLs basically means the absence of a value
   (eg: unknown), you shouldn't be storing NULLs.
  
  Nonsense!!
 
  That's a bold statement ...
 
  That's simply wrong. A null means unknown or not applicable and is a
 
  Ah, so now things become clear, a NULL actually can mean
  two things? So much for clarity then...
 
 Come on; that's not fair. Unknown and not applicable are more like
 different senses of the same thing, not two opposite things.

Read the literature on how to design databases. What you do
is storing true propositions. That is, each attribute defines a
certain true proposition. For example:

Employee Martijn has Employee# 14.

You should be able to derive these sentences from every row.
Putting a NULL or N/A in there fails to meet this requirement.

  perfectly valid value to use in many, many situations. Two classic
  examples:
 
  Let me first state that there's a difference between storing NULLs
  and handling NULLs in your result.
 
 Yes, of  course there is: so what? You said NULLs should never be
_stored_;
 that's what I'm responding to. Using NULLs is not very _difficult_ anyway,
 just a bit tedious because it is another case to handle.

There's no point in storing what you don't know.

  So, let's debunk these two classic examples...
 
 Please note that I did not say that you MUST use NULLs, just that they
were
 perfectly valid to use in a design.

Depends on who you're asking ...

A few years ago, I was working at a company that had developed
an application with an Oracle database with about 400 tables, nothing
too large.

Plenty of tables had NULLs, not because of missing business data,
but rather used internally in the system.

The number of problems we had with them, I cannot count on my
two hands, nor can I count them on the hands of all other developers
(5) for that project.

Just last week, I spoke to a guy who used to be my collegue there
and we discussed the design a bit (he still does database design and
development for a large company, on Sybase with over 4000
deployed systems) and we both agreed that using NULLs in most
cases as part of the initial design was causing us more problems
then helping us.

  1. You have a table containing test scores. Some students get every
  question
  on a given test wrong. Some students don't write the test at all,
perhaps
  because they were sick. In this scenario, I think the correct thing to
  store
  is 0 for the students who got every question wrong and NULL for the
  students
  who didn't write the test.
 
  IMO, the better thing to do in this particular case is to NOT store a
test
  result for the students that did not make the test.
 
 Okay, that might be acceptable, if it doesn't cause you to lose track of
the
 student altogether. But if this table was the only one that even recorded
 the _existence_ of the student, you'd have a problem;

Indeed, then I would have a problem, cause you cannot derive from
the test results table that a student exists. This is a rather silly
statement.
If the test results table should ALSO store the existence of a student,
you're design is wrong :-)

if someone tried to
 verify that the student had attended this school, you wouldn't know that
 they had.

In your own words: nonsense.

Each table should store what is was designed to store. If I would
have to know if a student was enlisted in a certain course or would
be attending this school, I would not be using the test results table.

Or in a more probable case, if that was the only test for that
 course and the student missed it and then had no row in the table, you
might
 not have any way of knowing that they took the course!

See above.

And if they later
 wanted to write the exam, having recovered from their illness, your query
 might have the effect of keeping them from taking the second exam: your
 query would report that they had never taken the first exam so an
 adminstrator might refuse to let him/her take the second exam because they
 (apparently) had never been scheduled to take the first one.

Different problem, see above.

 Now, you could have a second table to record people who were scheduled to
 take tests but failed to take them to cover that situation but I think it

Sounds like a decent design to me.

 would be easier to record all students in one table and then simply store
a
 null for any test that they fail to take and a zero for every case where a
 student got every answer wrong. When you compute the class average, the
 avg() function would ensure that the students who got every question wrong
 would pull down the class average but that students who failed to write
the
 test at all would NOT skew the average because the avg() function ignores
 nulls.

Ah, an excellent example of why NULLs are tedious: they are ignored
by some functions, but not by others.

Without having the record there in the first place, there would have
to be no rule of AVG ignoring