Re: Accountability with MySQL
From: James Harvard At 7:28 pm +0100 14/3/06, Martijn Tonies wrote: Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. At 1:49 pm -0500 14/3/06, Rhino wrote: A null means unknown or not applicable and is a perfectly valid value to use in many, many situations. It's not often you can say that two people are quite literally arguing about nothing! :-) James Harvard (... being flippant because the actual arguments started going over my head about half-a-dozen posts ago...) We're arguing about whether or not your database design (as stored on disk) should contain NULLs. IMO: no, or at least as few as possible. Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Martijn Tonies wrote: We're arguing about whether or not your database design (as stored on disk) should contain NULLs. IMO: no, or at least as few as possible. I believe the answer really is - *it depends*. You are both right, really. Martijn, yes, according to academia and proper database design, you should not really be using NULLs, (Rhino, see the writings of Chris Date and Fabian Pascal somewhere like www.dbdebunk.com, www.thethirdmanifesto.com etc.). Here's a good one: http://www.dbdebunk.citymax.com/page/page/1396241.htm Of course, you're still going to pay $10-$15 for this.. However, what is good for academia is *not* always good for the real world. Why store Unknown in a column for a table that could store millions, or billions, of rows, when you could simply allow a NULL (and save a *bunch* of disk space across your schemas) in the real world. If you can allow for the use of the NULL in the application, and *not* try to interpret it 5 ways from Sunday, then, in my honest opinion, a NULL would be acceptable. Short answer, as long as you do it without trying to get your application to handle them in many different ways, I don't see a great problem (when weighing them against extra costs, in both space and speed). If you can get away with not using NULL as well, then great. This argurment will, however, carry on between academia and the real world for at least the next 4000 years. :) Best regards Mark -- Mark Leith, Support Engineer MySQL AB, Worcester, England, www.mysql.com Are you MySQL certified? www.mysql.com/certification -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
From: Mark Leith Martijn Tonies wrote: We're arguing about whether or not your database design (as stored on disk) should contain NULLs. IMO: no, or at least as few as possible. I believe the answer really is - *it depends*. You are both right, really. Martijn, yes, according to academia and proper database design, you should not really be using NULLs, (Rhino, see the writings of Chris Date and Fabian Pascal somewhere like www.dbdebunk.com, www.thethirdmanifesto.com etc.). Here's a good one: http://www.dbdebunk.citymax.com/page/page/1396241.htm Got that one ;) The book I quoted is from Date, btw, and I enjoy reading it ... Of course, you're still going to pay $10-$15 for this.. However, what is good for academia is *not* always good for the real world. Why store Unknown in a column for a table that could store millions, or billions, of rows, when you could simply allow a NULL (and save a *bunch* of disk space across your schemas) in the real world. Well, the question still is if you should store unknown at all ;) Not according to Date: you should store what is known. See the remarks about the true propositions, from which relational databases are derived (but you probably know that). If you can allow for the use of the NULL in the application, and *not* try to interpret it 5 ways from Sunday, then, in my honest opinion, a NULL would be acceptable. Short answer, as long as you do it without trying to get your application to handle them in many different ways, I don't see a great problem (when weighing them against extra costs, in both space and speed). If you can get away with not using NULL as well, then great. I guess I can agree with that one. This argurment will, however, carry on between academia and the real world for at least the next 4000 years. :) Until it gets replaced by a better model, but I don't expect that anytime soon ;-) Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
But (in at least some situations) is not appropriate to record that you know that you don't have a value? I think the words of Donald Rumsfeld are appropriate here: There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know. At 12:02 pm +0100 16/3/06, Martijn Tonies wrote: Well, the question still is if you should store unknown at all ;) Not according to Date: you should store what is known. snip -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
But (in at least some situations) is not appropriate to record that you know that you don't have a value? Sure it would. It it's needed. But the answer doesn't have to be a NULL. Remember the example about recording the fact that a student did not make a test for whatever reason? It was then said you could store null for the test score. I then replied that that was a bit backward IMO, cause you're using the test scores table to store that a student did not take a test, which is wrong, given that each row in a table should mean exactly the same thing (theory knocking at the door once again) and that makes a design more clear. I think the words of Donald Rumsfeld are appropriate here: There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know. I think we'll have a hard time storing Donalds knows and unknowns in a database... :-) At 12:02 pm +0100 16/3/06, Martijn Tonies wrote: Well, the question still is if you should store unknown at all ;) Not according to Date: you should store what is known. snip Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32: Well, the question still is if you should store unknown at all ;) Not according to Date: you should store what is known. See the remarks about the true propositions, from which relational databases are derived (but you probably know that). As someone totally unread in the theory of databases, that seems unduly puritanical. I assume that what Date would propose is that you have another table (related by master key) in which, if you do not know something, you do not enter it. But this means that if you have 10 different pieces of potentially but not necessarily available information about a single master record (e.g. a person), you have to do a 10-way join in order to retrieve all the information about them. Replacing a theoretically ugly null flag with a 10 way join strikes me, as an engineer rather than a theoretician, the wrong side of the elegance/practicality trade-off. Alec -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Coalesce (was: Re: Accountability with MySQL)
Ah well, SQL (these days) provides plenty of functions to avoid the problems. Coalesce being one of them... How often do people write: (probably not valid MySQL) select firstname || coalesce(middlename || ' ', ' ') || lastname from ... Your syntax isn't quite correct; coalesce doesn't use a concatenation symbol, it uses a list of arguments like this: coalesce(Salary, commission, bonus) Yes, the reason I was using || ' ' there was when middle name was not null, it would already add a space, and when it's null, it would just return a space. Given that NULL || ' ' returns NULL anyways, this would be give me the results for all cases, I think :) -- Martijn Tonies -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
From: [EMAIL PROTECTED] Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32: Well, the question still is if you should store unknown at all ;) Not according to Date: you should store what is known. See the remarks about the true propositions, from which relational databases are derived (but you probably know that). As someone totally unread in the theory of databases, that seems unduly puritanical. I assume that what Date would propose is that you have another table (related by master key) in which, if you do not know something, you do not enter it. But this means that if you have 10 different pieces of potentially but not necessarily available information about a single master record (e.g. a person), you have to do a 10-way join in order to retrieve all the information about them. Replacing a theoretically ugly null flag with a 10 way join strikes me, as an engineer rather than a theoretician, the wrong side of the elegance/practicality trade-off. Using NULLs as well as de-normalization brings the risk of integrity problems to your storage, storing what is right is only a good thing. And when it comes to having to writing JOINs for all your queries, lo and behold, I bring you the wonder of the VIEW. ;-) Martijn Tonies Upscene Productions http://www.upscene.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:32:45: From: [EMAIL PROTECTED] Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32: Well, the question still is if you should store unknown at all ;) Not according to Date: you should store what is known. See the remarks about the true propositions, from which relational databases are derived (but you probably know that). As someone totally unread in the theory of databases, that seems unduly puritanical. I assume that what Date would propose is that you have another table (related by master key) in which, if you do not know something, you do not enter it. But this means that if you have 10 different pieces of potentially but not necessarily available information about a single master record (e.g. a person), you have to do a 10-way join in order to retrieve all the information about them. Replacing a theoretically ugly null flag with a 10 way join strikes me, as an engineer rather than a theoretician, the wrong side of the elegance/practicality trade-off. Using NULLs as well as de-normalization brings the risk of integrity problems to your storage, storing what is right is only a good thing. And when it comes to having to writing JOINs for all your queries, lo and behold, I bring you the wonder of the VIEW. ;-) The VIEW eases the syntax, but does it do anything for performance? Surely it must be much slower to read 11 different tables (Master record containing all NOT NULL fields, and 10 slave records which may or may not contain relevant fields)? Ignoring caching, you are going to have at least one disk access for every NULL field (index lookup which fails) and two for every non NULL field (index lookup, data lookup) for every null field. This means that you have multiplied your number of disk accesses (ignoring caching, again) by 6-11 times (assuming the master record takes two disk accesses). That again seems a very high price to pay for theoretical elegance. Alec -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
in order to retrieve all the information about them. Replacing a theoretically ugly null flag with a 10 way join strikes me, as an engineer rather than a theoretician, the wrong side of the elegance/practicality trade-off. Using NULLs as well as de-normalization brings the risk of integrity problems to your storage, storing what is right is only a good thing. And when it comes to having to writing JOINs for all your queries, lo and behold, I bring you the wonder of the VIEW. ;-) The VIEW eases the syntax, but does it do anything for performance? Surely it must be much slower to read 11 different tables (Master record containing all NOT NULL fields, and 10 slave records which may or may not contain relevant fields)? Ignoring caching, you are going to have at least one disk access for every NULL field (index lookup which fails) and two for every non NULL field (index lookup, data lookup) for every null field. This means that you have multiplied your number of disk accesses (ignoring caching, again) by 6-11 times (assuming the master record takes two disk accesses). That again seems a very high price to pay for theoretical elegance. Keeping your design clean is hardly sacrificing anything. The most important feature of your database and the database design is the integrity of your data. Period. De-normalization, NULLs et all make this task much harder, as well as understanding your design for the guy that takes over your job ... Performance comes next, first thing to do there is getting your indices straight. Next thing is optimizing the slow stuff - if you can go around by keeping, for example, (these much smaller slave) tables in memory, and this works out fine, then do so. If all your queries are optimized, the next thing to do is to throw hardware at it. This is very cheap comparing to your hourly rate. ;-) Obviously, if your queries and design are valid and the database system is still slow, you can also complain with the creators of the database system - performance, after all, is a physical attribute, not a database design issue perse. During the years, many database systems have seen many improvements to bring better performance to its users. Only recently, for example, the Firebird DBMS implemented a new on-disk structure for it's indices greatly enhancing performance for indices with a large number of duplicates. In the past, people said: you have to modify your design such and so, with all risks included. Now, you can keep your design as clear as possible and enjoy greater performance, just because someone thought of something clever. If no-one ever complained, this particular piece of code wouldn't have changed. Martijn Tonies Database Workbench - tool for InterBase, Firebird, MySQL, Oracle MS SQL Server Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Also, surely if you left join to a table then if there's no matching row you get a null value in your result set. So I'm not sure what the difference is between getting that null because you store it in a column in the primary table, and deriving it by left joining to a related table with no matching rows. James Harvard At 11:27 am + 16/3/06, [EMAIL PROTECTED] wrote: As someone totally unread in the theory of databases, that seems unduly puritanical. I assume that what Date would propose is that you have another table (related by master key) in which, if you do not know something, you do not enter it. But this means that if you have 10 different pieces of potentially but not necessarily available information about a single master record (e.g. a person), you have to do a 10-way join in order to retrieve all the information about them. Replacing a theoretically ugly null flag with a 10 way join strikes me, as an engineer rather than a theoretician, the wrong side of the elegance/practicality trade-off. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
James, Also, surely if you left join to a table then if there's no matching row you get a null value in your result set. So I'm not sure what the difference is between getting that null because you store it in a column in the primary table, and deriving it by left joining to a related table with no matching rows. Handling nulls in your resultsets makes a heck of a lot more sense than also storing these NULLs :-) Martijn Tonies Upscene Productions http://www.upscene.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
True Propositions (was Re: Accountability with MySQL)
Read the literature on how to design databases. What you do is storing true propositions. That is, each attribute defines a certain true proposition. For example: Employee Martijn has Employee# 14. You should be able to derive these sentences from every row. Putting a NULL or N/A in there fails to meet this requirement. Perhaps my theory is a bit rusty but I have never heard of this true propositions business before. Do you have a citation where I can read about this? I would certainly recommend this one: http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1 Hmm, I had hoped for a citation to a free online document that I could read. I don't fancy the thought of spending $100+ US on a book just to argue the fine points of database design Then again, I didn't ask for a link to a free source of this information so I can't fault you for your link, it (presumably) answers the question. You will find the true-thingy also in here: http://www.dbdebunk.com/page/page/2928212.htm Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: Accountability with MySQL
And now we are down to reality. This is a MySQL list. Views are a wonderful thing for creating an isolation layer between the application and the database. However, MySQL's current implementation makes it extremely difficult in many cases to avoid full table scans when you define the logical view. {Not poking at the development team I think 5.x is a huge step forward}. So you can't hide the underlying structure currently in MySQL. Like all other things we have to commingle best practices with pragmatism. In other words do what makes sense while at the same time acquiring a good knowledge of fundamentals. {Not just slap name address phone1 phone n in some table with an auto-increment ID field and say your done with the design.} By the way, Date and Codd {both way above me in math and theory} had this discussion ~20 years ago over a 6 month period in pages and pages of Database magazine. In the end I believe they agreed to disagree. -Original Message- From: Martijn Tonies [mailto:[EMAIL PROTECTED] Sent: Thursday, March 16, 2006 5:33 AM To: mysql@lists.mysql.com Subject: Re: Accountability with MySQL From: [EMAIL PROTECTED] Martijn Tonies [EMAIL PROTECTED] wrote on 16/03/2006 11:02:32: Well, the question still is if you should store unknown at all ;) Not according to Date: you should store what is known. See the remarks about the true propositions, from which relational databases are derived (but you probably know that). As someone totally unread in the theory of databases, that seems unduly puritanical. I assume that what Date would propose is that you have another table (related by master key) in which, if you do not know something, you do not enter it. But this means that if you have 10 different pieces of potentially but not necessarily available information about a single master record (e.g. a person), you have to do a 10-way join in order to retrieve all the information about them. Replacing a theoretically ugly null flag with a 10 way join strikes me, as an engineer rather than a theoretician, the wrong side of the elegance/practicality trade-off. Using NULLs as well as de-normalization brings the risk of integrity problems to your storage, storing what is right is only a good thing. And when it comes to having to writing JOINs for all your queries, lo and behold, I bring you the wonder of the VIEW. ;-) Martijn Tonies Upscene Productions http://www.upscene.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
On MySQL Views (Re: Accountability with MySQL)
Hello Gordon, And now we are down to reality. This is a MySQL list. Views are a wonderful thing for creating an isolation layer between the application and the database. However, MySQL's current implementation makes it extremely difficult in many cases to avoid full table scans when you define the logical view. {Not poking at the development team I think 5.x is a huge step forward}. Can you elaborate on that? Any real world experience with MySQL views? The thing that annoys the blo*** cr** out of me with the current implementation is the maintainability of views - as the complete query structure is lost once you've created a view, so you need to run/modify it from SQL statement all the time :-( Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
- Original Message - From: Martijn Tonies [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Wednesday, March 15, 2006 2:53 AM Subject: Re: Accountability with MySQL Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's a bold statement ... That's simply wrong. A null means unknown or not applicable and is a Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. Read the literature on how to design databases. What you do is storing true propositions. That is, each attribute defines a certain true proposition. For example: Employee Martijn has Employee# 14. You should be able to derive these sentences from every row. Putting a NULL or N/A in there fails to meet this requirement. Perhaps my theory is a bit rusty but I have never heard of this true propositions business before. Do you have a citation where I can read about this? Frankly, I'm skeptical about your interpretation of this idea. While you clearly don't want false information in a database, it isn't false to say that student such-and-such's exam mark or such-and-such an employee's termination date is unknown or not applicable. It _would_ be false to store a grade of 0 for a student who didn't take the exam because it implies that he got every question wrong when in fact he didn't write the test at all; that would be an example of 0 having two contradicatory meanings. Given a phobia about nulls, it is a reasonable design to put students who don't take the test in a separate table but it's not the ONLY way to handle the situation: a null to indicate a student who didn't take the test is perfectly reasonable too. perfectly valid value to use in many, many situations. Two classic examples: Let me first state that there's a difference between storing NULLs and handling NULLs in your result. Yes, of course there is: so what? You said NULLs should never be _stored_; that's what I'm responding to. Using NULLs is not very _difficult_ anyway, just a bit tedious because it is another case to handle. There's no point in storing what you don't know. But you _do_ know something: that the student didn't take the exam. That is a piece of true knowledge all by itself. So, let's debunk these two classic examples... Please note that I did not say that you MUST use NULLs, just that they were perfectly valid to use in a design. Depends on who you're asking ... A few years ago, I was working at a company that had developed an application with an Oracle database with about 400 tables, nothing too large. Plenty of tables had NULLs, not because of missing business data, but rather used internally in the system. I'm not sure what you mean by that phrase used internally in the system. The number of problems we had with them, I cannot count on my two hands, nor can I count them on the hands of all other developers (5) for that project. How about enumerating a few of them here? I've dealt with relational databases in several capacities for over 20 years and I've never found nulls to be a big problem. Just last week, I spoke to a guy who used to be my collegue there and we discussed the design a bit (he still does database design and development for a large company, on Sybase with over 4000 deployed systems) and we both agreed that using NULLs in most cases as part of the initial design was causing us more problems then helping us. Again, how about listing some of these problems? 1. You have a table containing test scores. Some students get every question on a given test wrong. Some students don't write the test at all, perhaps because they were sick. In this scenario, I think the correct thing to store is 0 for the students who got every question wrong and NULL for the students who didn't write the test. IMO, the better thing to do in this particular case is to NOT store a test result for the students that did not make the test. Okay, that might be acceptable, if it doesn't cause you to lose track of the student altogether. But if this table was the only one that even recorded the _existence_ of the student, you'd have a problem; Indeed, then I would have a problem, cause you cannot derive from the test results table that a student exists. This is a rather silly statement. If the test results table should ALSO store the existence of a student, you're design is wrong :-) if someone tried to verify that the student had attended this school, you wouldn't know that they had. In your own words: nonsense. Each table should store what is was designed to store. If I would have to know if a student was enlisted in a certain course or would be attending this school, I would not be using
Re: Accountability with MySQL
Hello, (please read all before replying) Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's a bold statement ... That's simply wrong. A null means unknown or not applicable and is a Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. Read the literature on how to design databases. What you do is storing true propositions. That is, each attribute defines a certain true proposition. For example: Employee Martijn has Employee# 14. You should be able to derive these sentences from every row. Putting a NULL or N/A in there fails to meet this requirement. Perhaps my theory is a bit rusty but I have never heard of this true propositions business before. Do you have a citation where I can read about this? I would certainly recommend this one: http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1 Frankly, I'm skeptical about your interpretation of this idea. While you clearly don't want false information in a database, it isn't false to say that student such-and-such's exam mark or such-and-such an employee's termination date is unknown or not applicable. It _would_ be false to store a grade of 0 for a student who didn't take the exam because it implies that he got every question wrong when in fact he didn't write the test at all; Agreed. that would be an example of 0 having two contradicatory meanings. Given a phobia about nulls, it is a reasonable design to put students who don't take the test in a separate table but it's not the ONLY way to handle the situation: a null to indicate a student who didn't take the test is perfectly reasonable too. Disagreed. Let me explain further: In this case, a null means (according to you and in your design) that the student did not take the test. It could also mean that we lots it's grade for now, and that we want to fill it in later, as in: unknown, which is what NULL also means (your own words). In another case, null would be the employee isn't terminated yet, but could be in the future. That's the problems with nulls - there's no actual defined meaning. You can never say: This table holds test results, a row will mean: Student with StudentID 123 made test #12 and scored a 4 Cause you can have rows that can mean this: Student with StudentID 123 made test #12 and scored a yet unknown score or Student with StudentID 123 made test #12 and did not make the test or Student with StudentID 123 and test #12 doesn't apply to him In short: NULL is ambigious. perfectly valid value to use in many, many situations. Two classic examples: Let me first state that there's a difference between storing NULLs and handling NULLs in your result. Yes, of course there is: so what? You said NULLs should never be _stored_; that's what I'm responding to. Using NULLs is not very _difficult_ anyway, just a bit tedious because it is another case to handle. There's no point in storing what you don't know. But you _do_ know something: that the student didn't take the exam. That is a piece of true knowledge all by itself. But that's not what the rows in the table would mean - see above. In the relational model, the idea is to store true propositions - that is, each row means the same thing, but with different attributes. See above. So, let's debunk these two classic examples... Please note that I did not say that you MUST use NULLs, just that they were perfectly valid to use in a design. Depends on who you're asking ... A few years ago, I was working at a company that had developed an application with an Oracle database with about 400 tables, nothing too large. Plenty of tables had NULLs, not because of missing business data, but rather used internally in the system. I'm not sure what you mean by that phrase used internally in the system. Missing business data, eg: no middle name. Some people would put in a NULL in there as well, while NULL can also mean: I do not KNOW this persons middle name. Ambiguity, see above. The number of problems we had with them, I cannot count on my two hands, nor can I count them on the hands of all other developers (5) for that project. How about enumerating a few of them here? I've dealt with relational databases in several capacities for over 20 years and I've never found nulls to be a big problem. Not big, but a problem... :-) Do you often go around looking for (little) problems while you can easily avoid them? Just last week, I spoke to a guy who used to be my collegue there and we discussed the design
Re: Accountability with MySQL
- Original Message - From: Martijn Tonies [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Wednesday, March 15, 2006 10:47 AM Subject: Re: Accountability with MySQL Hello, (please read all before replying) Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's a bold statement ... That's simply wrong. A null means unknown or not applicable and is a Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. Read the literature on how to design databases. What you do is storing true propositions. That is, each attribute defines a certain true proposition. For example: Employee Martijn has Employee# 14. You should be able to derive these sentences from every row. Putting a NULL or N/A in there fails to meet this requirement. Perhaps my theory is a bit rusty but I have never heard of this true propositions business before. Do you have a citation where I can read about this? I would certainly recommend this one: http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1 Hmm, I had hoped for a citation to a free online document that I could read. I don't fancy the thought of spending $100+ US on a book just to argue the fine points of database design Then again, I didn't ask for a link to a free source of this information so I can't fault you for your link, it (presumably) answers the question. Frankly, I'm skeptical about your interpretation of this idea. While you clearly don't want false information in a database, it isn't false to say that student such-and-such's exam mark or such-and-such an employee's termination date is unknown or not applicable. It _would_ be false to store a grade of 0 for a student who didn't take the exam because it implies that he got every question wrong when in fact he didn't write the test at all; Agreed. that would be an example of 0 having two contradicatory meanings. Given a phobia about nulls, it is a reasonable design to put students who don't take the test in a separate table but it's not the ONLY way to handle the situation: a null to indicate a student who didn't take the test is perfectly reasonable too. Disagreed. Let me explain further: In this case, a null means (according to you and in your design) that the student did not take the test. Yes. It could also mean that we lots it's grade for now, and that we want to fill it in later, as in: unknown, which is what NULL also means (your own words). In another case, null would be the employee isn't terminated yet, but could be in the future. That's the problems with nulls - there's no actual defined meaning. I agree that the meaning is not a single, very narrow thing like 'zero' or 'unknown' in every single case. You have to put a null into some kind of context. But that context is usually fairly obvious, at least in my opinion. You can never say: This table holds test results, a row will mean: Student with StudentID 123 made test #12 and scored a 4 Cause you can have rows that can mean this: Student with StudentID 123 made test #12 and scored a yet unknown score or Student with StudentID 123 made test #12 and did not make the test or Student with StudentID 123 and test #12 doesn't apply to him If the exact reason that a null is in a particular row is particularly important, you can always define an extra column that describes the reason for the null. A could mean the score hadn't been entered yet because the exam hadn't taken place yet, B could mean the test has already taken place but the student didn't write the exam, C could mean that student has no score because he was excused from the test for some reason. I would agree that this is cumbersome but I think you'd have a very similar problem with your design: you would have put some people in a separate table if they didn't write the test but you'd have to identify the reason that the test wasn't written too, if you cared about that, or have separate tables for students who failed to attend the test and students who were excused from the test. So I don't see that you're a lot farther ahead with your design. In fact, your design could be _more_ complicated. In your design a program couldn't simply look up the student by his ID in a single table and then react to a null; the program would have to look in the main Grades table first, then, if it didn't find the student there, would have to look in the table that records people who didn't write the test because they were sick. If the row wasn't found there, it would have to look in the table that records people who were
Re: Accountability with MySQL
A commonly used example (at least at my last 3 jobs) would be a table of demographics for people (whether they be employees, clients, whatever). You can have one table and allow NULLs for some of the fields (id, LastName, FirstName, MiddleInitial, Title, NameSuffix, AddressLine1, AddressLine2, City, State, Zip, Phone1, Phone2, Fax, Pager), or you can have a separate table for each of the nullable fields (I count 8 in addition to the base demographics table). I am not a master DBA but it seems kind of brain damaged to insist on using 9 tables left joined together every time I need a line of demographics. Martijn Tonies wrote: Hello, (please read all before replying) Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's a bold statement ... That's simply wrong. A null means unknown or not applicable and is a Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. Read the literature on how to design databases. What you do is storing true propositions. That is, each attribute defines a certain true proposition. For example: Employee Martijn has Employee# 14. You should be able to derive these sentences from every row. Putting a NULL or N/A in there fails to meet this requirement. Perhaps my theory is a bit rusty but I have never heard of this true propositions business before. Do you have a citation where I can read about this? I would certainly recommend this one: http://www.amazon.com/exec/obidos/ASIN/0321197844/databasede095-20/103-0880452-0854264?creative=327641camp=14573adid=0VQ1KDVJ56MV7389RPEQlink_code=as1 Frankly, I'm skeptical about your interpretation of this idea. While you clearly don't want false information in a database, it isn't false to say that student such-and-such's exam mark or such-and-such an employee's termination date is unknown or not applicable. It _would_ be false to store a grade of 0 for a student who didn't take the exam because it implies that he got every question wrong when in fact he didn't write the test at all; Agreed. that would be an example of 0 having two contradicatory meanings. Given a phobia about nulls, it is a reasonable design to put students who don't take the test in a separate table but it's not the ONLY way to handle the situation: a null to indicate a student who didn't take the test is perfectly reasonable too. Disagreed. Let me explain further: In this case, a null means (according to you and in your design) that the student did not take the test. It could also mean that we lots it's grade for now, and that we want to fill it in later, as in: unknown, which is what NULL also means (your own words). In another case, null would be the employee isn't terminated yet, but could be in the future. That's the problems with nulls - there's no actual defined meaning. You can never say: This table holds test results, a row will mean: Student with StudentID 123 made test #12 and scored a 4 Cause you can have rows that can mean this: Student with StudentID 123 made test #12 and scored a yet unknown score or Student with StudentID 123 made test #12 and did not make the test or Student with StudentID 123 and test #12 doesn't apply to him In short: NULL is ambigious. perfectly valid value to use in many, many situations. Two classic examples: Let me first state that there's a difference between storing NULLs and handling NULLs in your result. Yes, of course there is: so what? You said NULLs should never be _stored_; that's what I'm responding to. Using NULLs is not very _difficult_ anyway, just a bit tedious because it is another case to handle. There's no point in storing what you don't know. But you _do_ know something: that the student didn't take the exam. That is a piece of true knowledge all by itself. But that's not what the rows in the table would mean - see above. In the relational model, the idea is to store true propositions - that is, each row means the same thing, but with different attributes. See above. So, let's debunk these two classic examples... Please note that I did not say that you MUST use NULLs, just that they were perfectly valid to use in a design. Depends on who you're asking ... A few years ago, I was working at a company that had developed an application with an Oracle database with about 400 tables, nothing too large. Plenty of tables had NULLs, not because of missing business data, but rather used internally in the system. I'm not sure what you mean by that phrase used internally in the system. Missing business data, eg: no middle name. Some people would put in a NULL in there as well, while NULL can also mean: I do not KNOW this
Re: Accountability with MySQL
At 7:28 pm +0100 14/3/06, Martijn Tonies wrote: Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. At 1:49 pm -0500 14/3/06, Rhino wrote: A null means unknown or not applicable and is a perfectly valid value to use in many, many situations. It's not often you can say that two people are quite literally arguing about nothing! James Harvard (... being flippant because the actual arguments started going over my head about half-a-dozen posts ago...) -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Accountability with MySQL
I am designing a simple accountability system so all the partners can have direct access to it by intranet. I was designing the data model, and came up with this: CREATE TABLE `moviments` ( `moviment_id` int(20) NOT NULL auto_increment, `moviment_date` date NOT NULL default '-00-00', `moviment_description` char(200) NOT NULL default '', `moviment_assignor` char(80) NOT NULL default '', `moviment_drawee` char(80) NOT NULL default '', `moviment_amount` int(20) NOT NULL default '0', PRIMARY KEY (`moviment_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 But I am a little bit stuck with this: Should I use a column to mark if the moviment has been executed? About income and outcome, should I use a column called moviment_type or just put a negative value when is an outcome for exampe? Does anyone ever made something like that, any other idea that could improve my little system? Best regards, Bruno B B Magalhães -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Hello Bruno, I am designing a simple accountability system so all the partners can have direct access to it by intranet. I was designing the data model, and came up with this: CREATE TABLE `moviments` ( `moviment_id` int(20) NOT NULL auto_increment, `moviment_date` date NOT NULL default '-00-00', What a weird default date - it's rather invalid, isn't it? `moviment_description` char(200) NOT NULL default '', `moviment_assignor` char(80) NOT NULL default '', `moviment_drawee` char(80) NOT NULL default '', `moviment_amount` int(20) NOT NULL default '0', PRIMARY KEY (`moviment_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 btw, given all the default clauses, did you know you can actually put a complete empty moviment into this table? Doesn't say a thing. You might want to change that a bit :-) But I am a little bit stuck with this: Should I use a column to mark if the moviment has been executed? About income and outcome, should I use a column called moviment_type or just put a negative value when is an outcome for exampe? Calculating summaries etc is, I think, much easier if you just do a minus. Does anyone ever made something like that, any other idea that could improve my little system? Well, it depends on what you're trying to do and how you want to query data. You could also decide to create two tables - one for positives and one for negatives. Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Hi Martijn, well about the date default value being invalid, well it´s working here in my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with this... About the business logic I was thinking about showing the current month by default, and if the user wants he can select diferent days, months and year, or beetween dates.. And everything will be shown in a simple list, with some graphics like bills to pay in the next week, income to recieve in next week, and simple stats of that kind.. To split in two tables (income and outcome) seams a good idea... Best Regards, Bruno B B Magalhães Hello Bruno, I am designing a simple accountability system so all the partners can have direct access to it by intranet. I was designing the data model, and came up with this: CREATE TABLE `moviments` ( `moviment_id` int(20) NOT NULL auto_increment, `moviment_date` date NOT NULL default '-00-00', What a weird default date - it's rather invalid, isn't it? `moviment_description` char(200) NOT NULL default '', `moviment_assignor` char(80) NOT NULL default '', `moviment_drawee` char(80) NOT NULL default '', `moviment_amount` int(20) NOT NULL default '0', PRIMARY KEY (`moviment_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 btw, given all the default clauses, did you know you can actually put a complete empty moviment into this table? Doesn't say a thing. You might want to change that a bit :-) But I am a little bit stuck with this: Should I use a column to mark if the moviment has been executed? About income and outcome, should I use a column called moviment_type or just put a negative value when is an outcome for exampe? Calculating summaries etc is, I think, much easier if you just do a minus. Does anyone ever made something like that, any other idea that could improve my little system? Well, it depends on what you're trying to do and how you want to query data. You could also decide to create two tables - one for positives and one for negatives. Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Bruno B B Magalháes [EMAIL PROTECTED] wrote on 03/14/2006 12:41:35 PM: I am designing a simple accountability system so all the partners can have direct access to it by intranet. I was designing the data model, and came up with this: CREATE TABLE `moviments` ( `moviment_id` int(20) NOT NULL auto_increment, `moviment_date` date NOT NULL default '-00-00', `moviment_description` char(200) NOT NULL default '', `moviment_assignor` char(80) NOT NULL default '', `moviment_drawee` char(80) NOT NULL default '', `moviment_amount` int(20) NOT NULL default '0', PRIMARY KEY (`moviment_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 But I am a little bit stuck with this: Should I use a column to mark if the moviment has been executed? About income and outcome, should I use a column called moviment_type or just put a negative value when is an outcome for exampe? Does anyone ever made something like that, any other idea that could improve my little system? Best regards, Bruno B B Magalhães It all depends on what accounting style you want to use whether you want single-entry or double-entry bookkeeping. There are pros and cons to both styles. The best answer comes from you: If you were an accountant, how would you write this out on paper? Once you figure out what you paper books should look like you should be able to manage your electronic data almost identically to how you manage your paper data. If you are used to seeing a list of debits and credits where each line represents one account (end) within a single transaction (these lines always appear in pairs) then you want to use one type of table. If you want to see each transaction on a single line that also lists both ends of the transaction, you use a table similar to the one you made. If you have a flag for transaction type (debit,credit, etc) then your amounts should almost always be listed as positive values. It's the position of the account (assignor or drawee) and the transaction type that determines the sign (+/-) to apply to the value as you apply the value to an account. Should you have a flag for the status movement complete ? I would say yes but instead of a simple checkbox, you could store a date value. That gives you two pieces of information a) if the date is null then the movement is not complete. b) if the date is NOT null then the movement is complete and you know when it finished. As I said before, how you set up your accounting tables really depends on the style of accounting you want to do. Check with a bookkeeper or an accountant for more details. Shawn Green Database Administrator Unimin Corporation - Spruce Pine
Re: Accountability with MySQL
Hello Bruno, well about the date default value being invalid, well it´s working here in my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with this... Yes, it's a valid value in MySQL, but it's an invalid date, that's what I'm trying to say. Why have an invalid date as the default? Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Martijn Tonies [EMAIL PROTECTED] wrote on 03/14/2006 01:16:11 PM: Hello Bruno, well about the date default value being invalid, well it´s working here in my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with this... Yes, it's a valid value in MySQL, but it's an invalid date, that's what I'm trying to say. Why have an invalid date as the default? Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com Martin, If you used MySQL just a bit more often for day-to-day operations (yes, we all know that you build useful tools that interoperate with several RDBM systems), you would know that you cannot assign anything but constant values as default values for any type of column in MySQL (timestamp being the singular, partial exception). That means you cannot design a column to have more meaningful default date by capturing the current time because functions are not permitted as default values (yet). With that in mind, a design default of -00-00 is as good as any other single, randomly chosen default date, don't you think? That's also the default default date if you define a non-null date column and do not specify your own default in the definition. Shawn Green Database Administrator Unimin Corporation - Spruce Pine
Re: Accountability with MySQL
Should you have a flag for the status movement complete ? I would say yes but instead of a simple checkbox, you could store a date value. That gives you two pieces of information a) if the date is null then the movement is not complete. b) if the date is NOT null then the movement is complete and you know when it finished. There, Shawn and I disagree :-) In case of a very simple database, one could use NULL as a flag or signal. But in general, I would avoid storing NULLs. Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
- Original Message - From: Bruno B B Magalháes [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Tuesday, March 14, 2006 12:41 PM Subject: Accountability with MySQL I am designing a simple accountability system so all the partners can have direct access to it by intranet. I was designing the data model, and came up with this: CREATE TABLE `moviments` ( `moviment_id` int(20) NOT NULL auto_increment, `moviment_date` date NOT NULL default '-00-00', `moviment_description` char(200) NOT NULL default '', `moviment_assignor` char(80) NOT NULL default '', `moviment_drawee` char(80) NOT NULL default '', `moviment_amount` int(20) NOT NULL default '0', PRIMARY KEY (`moviment_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 One small observation: if your column names are meant to be in English - as suggested by your use of amount, drawee, etc. I think you should also use an English word where you are currently using moviment. Moviment is not an English word. There _is_ an English word movement that might be what you want but I suspect that transaction would be the term most English-speakers would use in the table you describe. Your email address indicates that you are from Brazil so I'll assume you are a Portuguese-speaker, not a native English-speaker. This suggestion is only meant to be a friendly suggestion, not an insult to your English which is very very good. :-) But I am a little bit stuck with this: Should I use a column to mark if the moviment has been executed? That depends. You could decide to only add the row to the table if it has been executed; then, you would know implicitly that every entry has been executed. On the other hand, if you want to show transactions that have not yet been executed, a column that indicates whether the transaction has been executed would probably be a good idea. It might be even better to display details about the transaction that refer to its successful execution rather than just showing a yes/no flag. Perhaps you could store the timestamp that shows when the transaction was completed and maybe the identity of the person or program that completed the transaction or even a copy of the document generated by the transaction (or a link to this document). For instance, if a receipt was issued for this transaction, you might want to store an image of the receipt (or a link to the image) in the table. About income and outcome, should I use a column called moviment_type or just put a negative value when is an outcome for exampe? Either approach should work. It would probably be easier to simply store the sign of the transaction with the amount. Does anyone ever made something like that, any other idea that could improve my little system? I think the question you need to ask yourself is What do I want to be able to find out about a transaction after it has taken place?. If you decide you want to know what currency was involved in the transaction, you need to record that in your table. If you decide you want to know the serial numbers of the banknotes used in the transaction, you need to record that. And so on. It might be wise to talk to someone in the bank and ask them what things they need to find out while a transaction is taking place and afterwards. You may want to try asking your users (or their managers) what sorts of things people ask for but can't get from the old system; these are prime candidates for things that you could add to your new system. I'm a little surprised that your amount column is an integer. Does the Brazilian currency not have a fractional component? Here in Canada, monetary amounts are always decimal numbers, like $123.45, meaning one hundred and twenty three dollars and forty five cents. Or is your currency like Japanese yen, which have no fractional part? -- Rhino -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.385 / Virus Database: 268.2.2/280 - Release Date: 13/03/2006 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Hello Shawn, well about the date default value being invalid, well it´s working here in my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with this... Yes, it's a valid value in MySQL, but it's an invalid date, that's what I'm trying to say. Why have an invalid date as the default? If you used MySQL just a bit more often for day-to-day operations (yes, we all know that you build useful tools that interoperate with several RDBM systems), you would know that you cannot assign anything but constant values as default values for any type of column in MySQL (timestamp being the singular, partial exception). I know that. That means you cannot design a column to have more meaningful default date by capturing the current time because functions are not permitted as default values (yet). Same here. With that in mind, a design default of -00-00 is as good as any other single, randomly chosen default date, don't you think? That's also the default default date if you define a non-null date column and do not specify your own default in the definition. I'd rather say that having no default (and a NOT NULL) would be the better thing to do. What's the use of not null and a default? Isn't that like saying: well, you have to fill in each column, but if you don't I'll assign something empty-ish. Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
- Original Message - From: Martijn Tonies [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Tuesday, March 14, 2006 1:16 PM Subject: Re: Accountability with MySQL Hello Bruno, well about the date default value being invalid, well it´s working here in my system (MySQL 4.1.16, Mac OS X 10.4.5), and all my systens work with this... Yes, it's a valid value in MySQL, but it's an invalid date, that's what I'm trying to say. Why have an invalid date as the default? I think there may be a little bit of language confusion here. Bruno, I think Martijn is suggesting that you use a date that can actually exist as the default transaction date; '000-00-00' is accepted by MySQL as a date that can be stored in a table but it is not a date that ever really existed. There was never a Year 0 in the history of the world. I think Martijn is right; it is probably a better choice to use a real date in your table. The obvious choice would be to store the current date in that column. For example, if you add a row to the table, the new row should normally contain the date that the transaction was executed; that might be yesterday or a week ago or maybe even longer. But if no transaction date is known, it would probably be reasonable to store today's date. But sometimes that _wouldn't_ be reasonable. You should probably talk to the people in the bank about this and ask if it is ever possible that the transaction date would be unknown; if the transaction data _can_ be unknown, you should ask what date they usually store in their existing system in that case. Then your new table should probably do the exact same thing as the old system does, assuming everyone agrees that this is the right thing to do. (The users may say that it is the wrong thing to do and suggest a better value to choose for the transaction date.) But if the transaction date can never be unknown, it doesn't make much difference what you choose as the default date since it will never be used. -- Rhino -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.385 / Virus Database: 268.2.2/280 - Release Date: 13/03/2006 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
- Original Message - From: Martijn Tonies [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Tuesday, March 14, 2006 1:28 PM Subject: Re: Accountability with MySQL Should you have a flag for the status movement complete ? I would say yes but instead of a simple checkbox, you could store a date value. That gives you two pieces of information a) if the date is null then the movement is not complete. b) if the date is NOT null then the movement is complete and you know when it finished. There, Shawn and I disagree :-) In case of a very simple database, one could use NULL as a flag or signal. But in general, I would avoid storing NULLs. Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's simply wrong. A null means unknown or not applicable and is a perfectly valid value to use in many, many situations. Two classic examples: 1. You have a table containing test scores. Some students get every question on a given test wrong. Some students don't write the test at all, perhaps because they were sick. In this scenario, I think the correct thing to store is 0 for the students who got every question wrong and NULL for the students who didn't write the test. 2. You have a table containing employee records. One of the columns is termination date. What value do you store for a new employee's termination date? Well, if they are a contractor on a fixed length contract, you could calculate the date the contract ends; fair enough. If they are a permanent employee and your area has mandatory retirement, you could calculate the date they turn 65 (or whatever) and use that. But what if they are a permanent employee and you don't have mandatory retirement? I would store a NULL to mean I don't know right now. Then, if and when they gave notice that they were leaving, I would change the termination date from NULL to their last day of work. -- Rhino -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.385 / Virus Database: 268.2.2/280 - Release Date: 13/03/2006 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
I am designing a simple accountability system so all the partners can have direct access to it by intranet. I was designing the data model, and came up with this: CREATE TABLE `moviments` ( `moviment_id` int(20) NOT NULL auto_increment, `moviment_date` date NOT NULL default '-00-00', `moviment_description` char(200) NOT NULL default '', `moviment_assignor` char(80) NOT NULL default '', `moviment_drawee` char(80) NOT NULL default '', `moviment_amount` int(20) NOT NULL default '0', PRIMARY KEY (`moviment_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 One small observation: if your column names are meant to be in English - as suggested by your use of amount, drawee, etc. I think you should also use an English word where you are currently using moviment. Moviment is not an English word. There _is_ an English word movement that might be what you want but I suspect that transaction would be the term most English-speakers would use in the table you describe. Your email address indicates that you are from Brazil so I'll assume you are a Portuguese-speaker, not a native English-speaker. This suggestion is only meant to be a friendly suggestion, not an insult to your English which is very very good. :-) Ups, that´s right, I misspelt movement, you know beeing a partner and also being the financial director, technology director and CEO some times can be exhaustive. Well, at least we are growing.. (the opposite of my weekends, that seams to be getting smaller and smaller). But thanks, I also agree transaction is more suitable. But I am a little bit stuck with this: Should I use a column to mark if the moviment has been executed? That depends. You could decide to only add the row to the table if it has been executed; then, you would know implicitly that every entry has been executed. On the other hand, if you want to show transactions that have not yet been executed, a column that indicates whether the transaction has been executed would probably be a good idea. It might be even better to display details about the transaction that refer to its successful execution rather than just showing a yes/no flag. Perhaps you could store the timestamp that shows when the transaction was completed and maybe the identity of the person or program that completed the transaction or even a copy of the document generated by the transaction (or a link to this document). For instance, if a receipt was issued for this transaction, you might want to store an image of the receipt (or a link to the image) in the table. It´s a good idea, but for now, overkill... I mean here is my new design: CREATE TABLE `earnings` ( `transaction_id` int(20) NOT NULL auto_increment, `transaction_planned_date` date NOT NULL default '-00-00', `transaction_executed_date` date NOT NULL default '-00-00', `transaction_description` char(200) NOT NULL default '', `transaction_assignor` char(80) NOT NULL default '', `transaction_amount` int(20) NOT NULL default '0', PRIMARY KEY (`transaction_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 CREATE TABLE `earnings` ( `transaction_id` int(20) NOT NULL auto_increment, `transaction_planned_date` date NOT NULL default '-00-00', `transaction_executed_date` date NOT NULL default '-00-00', `transaction_description` char(200) NOT NULL default '', `transaction_assignor` char(80) NOT NULL default '', `transaction_amount` int(20) NOT NULL default '0', PRIMARY KEY (`transaction_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 I think it is a good model for start... About income and outcome, should I use a column called moviment_type or just put a negative value when is an outcome for example? Either approach should work. It would probably be easier to simply store the sign of the transaction with the amount. Does anyone ever made something like that, any other idea that could improve my little system? I think the question you need to ask yourself is What do I want to be able to find out about a transaction after it has taken place?. If you decide you want to know what currency was involved in the transaction, you need to record that in your table. If you decide you want to know the serial numbers of the banknotes used in the transaction, you need to record that. And so on. It might be wise to talk to someone in the bank and ask them what things they need to find out while a transaction is taking place and afterwards. You may want to try asking your users (or their managers) what sorts of things people ask for but can't get from the old system; these are prime candidates for things that you could add to your new system. What I want? Well that my partners stop asking me to send the spreadsheet every week... :) They want to take a look the company´s profitability, if all payments have been made, if all clients have payed... As I am the one who make every transaction I simple would input it into DB (direct) and they would access a simple
Re: Accountability with MySQL
Should you have a flag for the status movement complete ? I would say yes but instead of a simple checkbox, you could store a date value. That gives you two pieces of information a) if the date is null then the movement is not complete. b) if the date is NOT null then the movement is complete and you know when it finished. There, Shawn and I disagree :-) In case of a very simple database, one could use NULL as a flag or signal. But in general, I would avoid storing NULLs. Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's a bold statement ... That's simply wrong. A null means unknown or not applicable and is a Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... perfectly valid value to use in many, many situations. Two classic examples: Let me first state that there's a difference between storing NULLs and handling NULLs in your result. So, let's debunk these two classic examples... 1. You have a table containing test scores. Some students get every question on a given test wrong. Some students don't write the test at all, perhaps because they were sick. In this scenario, I think the correct thing to store is 0 for the students who got every question wrong and NULL for the students who didn't write the test. IMO, the better thing to do in this particular case is to NOT store a test result for the students that did not make the test. So, an example table with test scores: TEST_SCORES StudentID Int, TestID Int, Score TinyInt Unsigned Now, students who didn't make the test won't have a record in here. Perfectly valid design AND you avoid storing NULLs. 2. You have a table containing employee records. One of the columns is termination date. What value do you store for a new employee's termination date? Well, if they are a contractor on a fixed length contract, you could calculate the date the contract ends; fair enough. If they are a permanent employee and your area has mandatory retirement, you could calculate the date they turn 65 (or whatever) and use that. But what if they are a permanent employee and you don't have mandatory retirement? I would store a NULL to mean I don't know right now. Then, if and when they gave notice that they were leaving, I would change the termination date from NULL to their last day of work. Why store a date column if you don't know? Why not use: EMPLOYEES EmployeeID int, StartingDate Date, ... TERMINATED_EMPLOYEES (albeit a bit agressive ;) ) EmployeeID TerminationDate Once more: perfectly valid design. Martijn Tonies Database Workbench - development tool for MySQL, and more! Upscene Productions http://www.upscene.com My thoughts: http://blog.upscene.com/martijn/ Database development questions? Check the forum! http://www.databasedevelopmentforum.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
- Original Message - From: Bruno B B Magalháes [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Tuesday, March 14, 2006 2:07 PM Subject: Re: Accountability with MySQL I am designing a simple accountability system so all the partners can have direct access to it by intranet. I was designing the data model, and came up with this: CREATE TABLE `moviments` ( `moviment_id` int(20) NOT NULL auto_increment, `moviment_date` date NOT NULL default '-00-00', `moviment_description` char(200) NOT NULL default '', `moviment_assignor` char(80) NOT NULL default '', `moviment_drawee` char(80) NOT NULL default '', `moviment_amount` int(20) NOT NULL default '0', PRIMARY KEY (`moviment_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 One small observation: if your column names are meant to be in English - as suggested by your use of amount, drawee, etc. I think you should also use an English word where you are currently using moviment. Moviment is not an English word. There _is_ an English word movement that might be what you want but I suspect that transaction would be the term most English-speakers would use in the table you describe. Your email address indicates that you are from Brazil so I'll assume you are a Portuguese-speaker, not a native English-speaker. This suggestion is only meant to be a friendly suggestion, not an insult to your English which is very very good. :-) Ups, that´s right, I misspelt movement, you know beeing a partner and also being the financial director, technology director and CEO some times can be exhaustive. Well, at least we are growing.. (the opposite of my weekends, that seams to be getting smaller and smaller). But thanks, I also agree transaction is more suitable. But I am a little bit stuck with this: Should I use a column to mark if the moviment has been executed? That depends. You could decide to only add the row to the table if it has been executed; then, you would know implicitly that every entry has been executed. On the other hand, if you want to show transactions that have not yet been executed, a column that indicates whether the transaction has been executed would probably be a good idea. It might be even better to display details about the transaction that refer to its successful execution rather than just showing a yes/no flag. Perhaps you could store the timestamp that shows when the transaction was completed and maybe the identity of the person or program that completed the transaction or even a copy of the document generated by the transaction (or a link to this document). For instance, if a receipt was issued for this transaction, you might want to store an image of the receipt (or a link to the image) in the table. It´s a good idea, but for now, overkill... I mean here is my new design: CREATE TABLE `earnings` ( `transaction_id` int(20) NOT NULL auto_increment, `transaction_planned_date` date NOT NULL default '-00-00', `transaction_executed_date` date NOT NULL default '-00-00', `transaction_description` char(200) NOT NULL default '', `transaction_assignor` char(80) NOT NULL default '', `transaction_amount` int(20) NOT NULL default '0', PRIMARY KEY (`transaction_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 CREATE TABLE `earnings` ( `transaction_id` int(20) NOT NULL auto_increment, `transaction_planned_date` date NOT NULL default '-00-00', `transaction_executed_date` date NOT NULL default '-00-00', `transaction_description` char(200) NOT NULL default '', `transaction_assignor` char(80) NOT NULL default '', `transaction_amount` int(20) NOT NULL default '0', PRIMARY KEY (`transaction_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 I think it is a good model for start... About income and outcome, should I use a column called moviment_type or just put a negative value when is an outcome for example? Either approach should work. It would probably be easier to simply store the sign of the transaction with the amount. Does anyone ever made something like that, any other idea that could improve my little system? I think the question you need to ask yourself is What do I want to be able to find out about a transaction after it has taken place?. If you decide you want to know what currency was involved in the transaction, you need to record that in your table. If you decide you want to know the serial numbers of the banknotes used in the transaction, you need to record that. And so on. It might be wise to talk to someone in the bank and ask them what things they need to find out while a transaction is taking place and afterwards. You may want to try asking your users (or their managers) what sorts of things people ask for but can't get from the old system; these are prime candidates for things that you could add to your new system. What I want? Well that my partners stop asking me to send the spreadsheet every week... :) They want to take a look the company´s profitability, if all
Re: Accountability with MySQL
- Original Message - From: Martijn Tonies [EMAIL PROTECTED] To: Rhino [EMAIL PROTECTED]; mysql@lists.mysql.com Sent: Tuesday, March 14, 2006 2:16 PM Subject: Re: Accountability with MySQL Should you have a flag for the status movement complete ? I would say yes but instead of a simple checkbox, you could store a date value. That gives you two pieces of information a) if the date is null then the movement is not complete. b) if the date is NOT null then the movement is complete and you know when it finished. There, Shawn and I disagree :-) In case of a very simple database, one could use NULL as a flag or signal. But in general, I would avoid storing NULLs. Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's a bold statement ... That's simply wrong. A null means unknown or not applicable and is a Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. perfectly valid value to use in many, many situations. Two classic examples: Let me first state that there's a difference between storing NULLs and handling NULLs in your result. Yes, of course there is: so what? You said NULLs should never be _stored_; that's what I'm responding to. Using NULLs is not very _difficult_ anyway, just a bit tedious because it is another case to handle. So, let's debunk these two classic examples... Please note that I did not say that you MUST use NULLs, just that they were perfectly valid to use in a design. 1. You have a table containing test scores. Some students get every question on a given test wrong. Some students don't write the test at all, perhaps because they were sick. In this scenario, I think the correct thing to store is 0 for the students who got every question wrong and NULL for the students who didn't write the test. IMO, the better thing to do in this particular case is to NOT store a test result for the students that did not make the test. Okay, that might be acceptable, if it doesn't cause you to lose track of the student altogether. But if this table was the only one that even recorded the _existence_ of the student, you'd have a problem; if someone tried to verify that the student had attended this school, you wouldn't know that they had. Or in a more probable case, if that was the only test for that course and the student missed it and then had no row in the table, you might not have any way of knowing that they took the course! And if they later wanted to write the exam, having recovered from their illness, your query might have the effect of keeping them from taking the second exam: your query would report that they had never taken the first exam so an adminstrator might refuse to let him/her take the second exam because they (apparently) had never been scheduled to take the first one. Now, you could have a second table to record people who were scheduled to take tests but failed to take them to cover that situation but I think it would be easier to record all students in one table and then simply store a null for any test that they fail to take and a zero for every case where a student got every answer wrong. When you compute the class average, the avg() function would ensure that the students who got every question wrong would pull down the class average but that students who failed to write the test at all would NOT skew the average because the avg() function ignores nulls. So, an example table with test scores: TEST_SCORES StudentID Int, TestID Int, Score TinyInt Unsigned Now, students who didn't make the test won't have a record in here. Perfectly valid design AND you avoid storing NULLs. 2. You have a table containing employee records. One of the columns is termination date. What value do you store for a new employee's termination date? Well, if they are a contractor on a fixed length contract, you could calculate the date the contract ends; fair enough. If they are a permanent employee and your area has mandatory retirement, you could calculate the date they turn 65 (or whatever) and use that. But what if they are a permanent employee and you don't have mandatory retirement? I would store a NULL to mean I don't know right now. Then, if and when they gave notice that they were leaving, I would change the termination date from NULL to their last day of work. Why store a date column if you don't know? Why not use: EMPLOYEES EmployeeID int, StartingDate Date, ... TERMINATED_EMPLOYEES (albeit a bit agressive ;) ) EmployeeID TerminationDate Once more: perfectly valid design. Yes, that is also a valid design but it means you have to have yet another table that you could have avoided simply by permitting a null
Re: Accountability with MySQL
At 7:48 pm -0500 14/3/06, Rhino wrote: Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. IIRC (but please don't ask for a reference) it has been suggested that there should be two different types of NULL (value unknown not applicable) for just this reason. I suppose what it boils down to is that although the two main reasons one might use NULL are *not* the same thing, the outcome (the absence of a value) *is* the same. So you're both right. Sort of. :-) James Harvard -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
At 7:48 pm -0500 14/3/06, Rhino wrote: Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. IIRC (but please don't ask for a reference) it has been suggested that there should be two different types of NULL (value unknown not applicable) for just this reason. Yep. unknown and n/a are two different things and SQL happens to support them by NULLs. This is confusing at times and more difficult to handle at least... The problems with NULLs are numerous. Don't say they aren't, cause they are. In several SQL functions, NULLs make a difference and they will bite you every now and then. Especially for new people, which is a clear signal they are confusing. I suppose what it boils down to is that although the two main reasons one might use NULL are *not* the same thing, the outcome (the absence of a value) *is* the same. So you're both right. Sort of. :-) ;-) Martijn Tonies Upscene Productions http://www.upscene.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Accountability with MySQL
Basics of database design: store what you know. Given that NULLs basically means the absence of a value (eg: unknown), you shouldn't be storing NULLs. Nonsense!! That's a bold statement ... That's simply wrong. A null means unknown or not applicable and is a Ah, so now things become clear, a NULL actually can mean two things? So much for clarity then... Come on; that's not fair. Unknown and not applicable are more like different senses of the same thing, not two opposite things. Read the literature on how to design databases. What you do is storing true propositions. That is, each attribute defines a certain true proposition. For example: Employee Martijn has Employee# 14. You should be able to derive these sentences from every row. Putting a NULL or N/A in there fails to meet this requirement. perfectly valid value to use in many, many situations. Two classic examples: Let me first state that there's a difference between storing NULLs and handling NULLs in your result. Yes, of course there is: so what? You said NULLs should never be _stored_; that's what I'm responding to. Using NULLs is not very _difficult_ anyway, just a bit tedious because it is another case to handle. There's no point in storing what you don't know. So, let's debunk these two classic examples... Please note that I did not say that you MUST use NULLs, just that they were perfectly valid to use in a design. Depends on who you're asking ... A few years ago, I was working at a company that had developed an application with an Oracle database with about 400 tables, nothing too large. Plenty of tables had NULLs, not because of missing business data, but rather used internally in the system. The number of problems we had with them, I cannot count on my two hands, nor can I count them on the hands of all other developers (5) for that project. Just last week, I spoke to a guy who used to be my collegue there and we discussed the design a bit (he still does database design and development for a large company, on Sybase with over 4000 deployed systems) and we both agreed that using NULLs in most cases as part of the initial design was causing us more problems then helping us. 1. You have a table containing test scores. Some students get every question on a given test wrong. Some students don't write the test at all, perhaps because they were sick. In this scenario, I think the correct thing to store is 0 for the students who got every question wrong and NULL for the students who didn't write the test. IMO, the better thing to do in this particular case is to NOT store a test result for the students that did not make the test. Okay, that might be acceptable, if it doesn't cause you to lose track of the student altogether. But if this table was the only one that even recorded the _existence_ of the student, you'd have a problem; Indeed, then I would have a problem, cause you cannot derive from the test results table that a student exists. This is a rather silly statement. If the test results table should ALSO store the existence of a student, you're design is wrong :-) if someone tried to verify that the student had attended this school, you wouldn't know that they had. In your own words: nonsense. Each table should store what is was designed to store. If I would have to know if a student was enlisted in a certain course or would be attending this school, I would not be using the test results table. Or in a more probable case, if that was the only test for that course and the student missed it and then had no row in the table, you might not have any way of knowing that they took the course! See above. And if they later wanted to write the exam, having recovered from their illness, your query might have the effect of keeping them from taking the second exam: your query would report that they had never taken the first exam so an adminstrator might refuse to let him/her take the second exam because they (apparently) had never been scheduled to take the first one. Different problem, see above. Now, you could have a second table to record people who were scheduled to take tests but failed to take them to cover that situation but I think it Sounds like a decent design to me. would be easier to record all students in one table and then simply store a null for any test that they fail to take and a zero for every case where a student got every answer wrong. When you compute the class average, the avg() function would ensure that the students who got every question wrong would pull down the class average but that students who failed to write the test at all would NOT skew the average because the avg() function ignores nulls. Ah, an excellent example of why NULLs are tedious: they are ignored by some functions, but not by others. Without having the record there in the first place, there would have to be no rule of AVG ignoring