Another question on Cardinality??
Hello all, Another question on cardinality. Consider a table with 1000 rows and columnns. Details of the columns are as below: FLD_1 - int - cardinality 1000 - PRIMARY KEY FLD_2 - tinyint- cardinality 400 FLD_3 - varchar - cardinality 10 FLD_4 - varchar - cardinality 2 FLD_5 - varchar - cardinality 5 Assuming that cardinality exactly is the number of distinct values for that column, Which are the fields that is best for indexing for the table. Is it meaningful for me to index FLD_3, FLD_4 and FLD_5 knowing that their cardinality is always going to be the same? There are certain scenarios wherein I have queries on the tables as below: 1) Select * from table where FLD_4 = 1; 2) Select * from table where FLD_5 = 3; 3) Select * from table where FLD_3 1 AND FLD_5 6; considering all the above cases, what should I conclude? should I have indexes on these three fields? Looking for a specific answer than a depend on situation kind of an answer. Thanks Ratheesh Bhat K J
Re: Another question on Cardinality??
Have you considered reading up on basic database management? There are plenty of good material on the web for you to read where you can actually learn how to manage databases, so you don't have to ask others about every single detail. Ratheesh K J wrote: Hello all, Another question on cardinality. Consider a table with 1000 rows and columnns. Details of the columns are as below: FLD_1 - int - cardinality 1000 - PRIMARY KEY FLD_2 - tinyint- cardinality 400 FLD_3 - varchar - cardinality 10 FLD_4 - varchar - cardinality 2 FLD_5 - varchar - cardinality 5 Assuming that cardinality exactly is the number of distinct values for that column, Which are the fields that is best for indexing for the table. Is it meaningful for me to index FLD_3, FLD_4 and FLD_5 knowing that their cardinality is always going to be the same? There are certain scenarios wherein I have queries on the tables as below: 1) Select * from table where FLD_4 = 1; 2) Select * from table where FLD_5 = 3; 3) Select * from table where FLD_3 1 AND FLD_5 6; considering all the above cases, what should I conclude? should I have indexes on these three fields? Looking for a specific answer than a depend on situation kind of an answer. Thanks Ratheesh Bhat K J -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Another question on Cardinality??
It depends on the data, not the situation. How's that? FLD_4 is doubtful that you would want/need and index on it. This assumes an even distribution of both values (ie. male/female). Since you would be scanning half the table anyway, an index won't really help. Now if it's not an even distribution, like people under/over 70, and you will always be searching on people above 70, then an index may help since it will narrow the records down considerably. You really have to match cardinality with distribution of values. An index should allow you to quickly narrow the set of records that need to be analyzed. Cutting out half the records isn't going to help much, it's quicker just to read through the entire file sequentially than jump around to 50% of the records individually. - Original Message - From: Ratheesh K J [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Friday, August 04, 2006 6:15 AM Subject: Another question on Cardinality?? Hello all, Another question on cardinality. Consider a table with 1000 rows and columnns. Details of the columns are as below: FLD_1 - int - cardinality 1000 - PRIMARY KEY FLD_2 - tinyint- cardinality 400 FLD_3 - varchar - cardinality 10 FLD_4 - varchar - cardinality 2 FLD_5 - varchar - cardinality 5 Assuming that cardinality exactly is the number of distinct values for that column, Which are the fields that is best for indexing for the table. Is it meaningful for me to index FLD_3, FLD_4 and FLD_5 knowing that their cardinality is always going to be the same? There are certain scenarios wherein I have queries on the tables as below: 1) Select * from table where FLD_4 = 1; 2) Select * from table where FLD_5 = 3; 3) Select * from table where FLD_3 1 AND FLD_5 6; considering all the above cases, what should I conclude? should I have indexes on these three fields? Looking for a specific answer than a depend on situation kind of an answer. Thanks Ratheesh Bhat K J -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Another question on Cardinality??
Brent, Given that... You really have to match cardinality with distribution of values. ...sounds like hard work (well you actually have to think about it) and... considering all the above cases, what should I conclude? should I have indexes on these three fields? Looking for a specific answer than a depend on situation kind of an answer. Frankly I'd ask Why shouldn't I be indexing these fields? not the other way around, you said to consider a table with only a thousand rows right? Unless I'm missing something an index would cost a trivial amount of disk space. If the real case is 10's of millions rows then compared to the data set it's still going to trivial anyway. Just index everything and throw some more hard drives at it. As someone else pointed out however there's plenty of comparative info all over the web let alone MySQL's own site, you could probably find some approximate numbers to play with out there. Regards, Phil -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Another question on Cardinality??
I didn't say consider a table with a thousand rows, must have been someone elses response. But anyway, it comes down to knowing your data. If you know your data, then you can create the best set of indexes. I would almost never create an index on a field with a cardinality of 2. However, I would most likely create a compound index that would include that field and others. Knowing which fields to includes under a single index requires knowing the data. Just index everything is probably and ok policy for many of the databases out there. But when you are adding/updating millions of records a day, performance is a concern. Indexing everything with have a noticeable slow down on inserts, updates and deletes because all the indexes also need to be updated. - Original Message - From: Philip Mather [EMAIL PROTECTED] To: Brent Baisley [EMAIL PROTECTED] Cc: Ratheesh K J [EMAIL PROTECTED]; mysql@lists.mysql.com Sent: Friday, August 04, 2006 1:31 PM Subject: Re: Another question on Cardinality?? Brent, Given that... You really have to match cardinality with distribution of values. ...sounds like hard work (well you actually have to think about it) and... considering all the above cases, what should I conclude? should I have indexes on these three fields? Looking for a specific answer than a depend on situation kind of an answer. Frankly I'd ask Why shouldn't I be indexing these fields? not the other way around, you said to consider a table with only a thousand rows right? Unless I'm missing something an index would cost a trivial amount of disk space. If the real case is 10's of millions rows then compared to the data set it's still going to trivial anyway. Just index everything and throw some more hard drives at it. As someone else pointed out however there's plenty of comparative info all over the web let alone MySQL's own site, you could probably find some approximate numbers to play with out there. Regards, Phil -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]