Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
2014/11/04 11:04 -0800, Jan Steinman I was not suggesting it for dates. The OP appeared to have a well-defined set of strings in a VARCHAR field THAT is what I suggested ENUMs for! What is the update frequency of those VARCHARs? If you're adding them often or if you need to occasionally change their value I'd use another table with a reference. If they're immutable and new ones are not added often, there's no design cost at all to using ENUMs. I'd argue there's a higher maintenance cost to NOT using them! Ugh--I missed the discussion shift from the DATEs to the VARCHAR labels ... and now I wholeheartedly agree with you. As for the DATEs, I yet suspect that for performance maybe TIMESTAMP is slightly better than DATE. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
2014-11-06 21:49 GMT+01:00, Roberta Jaskólski h...@tbbs.net: Ugh--I missed the discussion shift from the DATEs to the VARCHAR labels ... and now I wholeheartedly agree with you. As for the DATEs, I yet suspect that for performance maybe TIMESTAMP is slightly better than DATE. Well what I'm interested in - and I was asking my original question about - is SIGNIFICANT difference. If everything I can count for is just slight improvements (by which I understand difference that can be detected only by benchmarks, but not really during database operation), then actually it doesn't make much sense, it seems. -- Zbig -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
- Original Message - From: Zbigniew zbigniew2...@gmail.com Subject: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access? What about using ENUMs? They have nearly the performance of INTEGERs, but you don't have to maintain a string mapping in your programming logic. But are you able to estimate, what boost can i notice? 5% - or 50%, or maybe even 500%? I'll give you an interesting reason to switch to ENUM (or smallint, if so inclined): Your data fields will be smaller. That not only means more records in a page (might be negligable), but more importantly, it'll make the index on that field smaller, meaning a) more of it will remain in memory and b) lookups on it will be marginally faster, too. I have no hard data on how it'll impact index performance (your dataset is yours to benchmark), but on one million of records (and you were talking several), a each byte saved is a megabyte of memory that can be used for other purposes, like data cache, which will speed up other things, too. -- Unhappiness is discouraged and will be corrected with kitten pictures. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
From: (Halász Sándor) h...@tbbs.net 2014/10/29 20:56 +0100, Zbigniew if instead of textual labels I'll use SMALLINT as simple integer code for each different label (and in my program a translation table)? This, in effect, is ENUM... Well, not really! With INTEGERs, your referential integrity is external to MySQL, and has to be managed. ... and maybe is useful if the set of dates is well determined... I was not suggesting it for dates. The OP appeared to have a well-defined set of strings in a VARCHAR field — THAT is what I suggested ENUMs for! There is a design cost in using ENUM: If you find that your first set of dates is too small, later, with ALTER TABLE, you have to change the type. Again, the suggestion for ENUM was to replace a constrained set of VARCHARs, and yet, you raise a valid point. What is the update frequency of those VARCHARs? If you're adding them often — or if you need to occasionally change their value — I'd use another table with a reference. If they're immutable and new ones are not added often, there's no design cost at all to using ENUMs. I'd argue there's a higher maintenance cost to NOT using them! Jan Steinman, EcoReality Co-op -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
2014/11/02 13:19 +0100, Zbigniew So you guys (Jan and hsv) suggest, that switching from DATE to more numeric data type may not be necessary, but using ENUM instead of VARCHAR can be real performance gain, right? But are you able to estimate, what boost can i notice? 5% - or 50%, or maybe even 500%? _Maybe_! but I will not venture to measure it. I doubt that it would be big. I just looked ENUM up: it allows 65535 distinct values, which sounds like 16 bits, usually two bytes, and numeric operators would be used. 2014/11/02 11:19 -0800, Jan Steinman I would hope that the query optimizer converts '2014-11-02' to the three-bytes internal representation of DATE before doing the query, in which case, DATE should actually be a tiny bit faster than TIMESTAMP. That is doubtful. In the processors that I know, one built-in numeric operation is enough for either 2 bytes of ENUM or 4 bytes of TIMESTAMP, but three are needed for DATE. In any case, the C-overhead, since MySQL is implemented in C, overwhelms the cost of any single comparison. The equality comparison is at least simpler than an ordered comparison. 2014/10/29 20:56 +0100, Zbigniew if instead of textual labels I'll use SMALLINT as simple integer code for each different label (and in my program a translation table)? This, in effect, is ENUM, and maybe is useful if the set of dates is well determined. If you use ENUM at least the overhead of translation is built in into MySQL, and, one hopes, costs less than doing it for oneself. There is a design cost in using ENUM: If you find that your first set of dates is too small, later, with ALTER TABLE, you have to change the type. If you add the new string to the end, there is, maybe, no great cost to the adding, but if in the middle If every new date-string is added to the end, it will, maybe, be in random order. The same applys to your translation table. I suggested TIMESTAMP because I suspect that one built-in comparison (after the optimizer is done with it) is enough, and it allows the appearance of real dates. (If the processor fetches 32 bits at a time (nowadays 64 or more is likly) then a 32-bit type is fetched as swiftly as any other. Both shorter and longer types take longer.) The more I debate this, the better I like TIMESTAMP for your problem. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
2014-10-31 5:29 GMT+01:00, Jan Steinman j...@ecoreality.org: What about using ENUMs? They have nearly the performance of INTEGERs, but you don't have to maintain a string mapping in your programming logic. So you guys (Jan and hsv) suggest, that switching from DATE to more numeric data type may not be necessary, but using ENUM instead of VARCHAR can be real performance gain, right? But are you able to estimate, what boost can i notice? 5% - or 50%, or maybe even 500%? -- Zbig -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
From: Zbigniew zbigniew2...@gmail.com switching from DATE to more numeric data type may not be necessary... I would hope that the query optimizer converts '2014-11-02' to the three-bytes internal representation of DATE before doing the query, in which case, DATE should actually be a tiny bit faster than TIMESTAMP. using ENUM instead of VARCHAR can be real performance gain, right? Not just in performance, but it appears to simply be The Right Thing To Do(TM) in your case. (Codd Rule #10: referential integrity.) Consider an insert into a day-of-week column (for instance) that somehow got Sudnay in the VARCHAR field instead of Sunday. Using an ENUM eliminates the possibility of a typo at a more fundamental level than your programming logic. If you do a massive insert with Sudnay in the ENUM field, the entire transaction will fail, which is really what you want rather than having to track down bad data after the fact, no? If it REALLY is one value out of a known set, it SHOULD be either an ENUM, or a reference to another table. Use the latter technique if you need to add new values very often. But are you able to estimate, what boost can i notice? 5% - or 50%, or maybe even 500%? Very hard to say. That's like saying, If I eat well and get enough exercise, will I live 5% or 50% or 500% longer? Probably more like 5%, but it may FEEL like 500%! :-) If the value is constrained to a set, having it as an ENUM (or reference to another table) will save you grief in many other ways besides boosting performance. Private enterprise, indeed, became too private. It became privileged enterprise, not private enterprise. -- Franklin Delano Roosevelt Jan Steinman, EcoReality Co-op -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
2014/10/29 20:56 +0100, Zbigniew Now to the point: considering, that the second column shall contain about 100-200 different labels - so in average many of such labels can be repeated one million times (or even more) - will it speed-up the selection done with something like ...WHERE label='xyz' AND date='2013-02-25' (and maybe the insertion as well?), if instead of textual labels I'll use SMALLINT as simple integer code for each different label (and in my program a translation table)? If so - will I have further performance gain, if instead of human-readable DATE for the third column I'll use Unix time put into INTEGER type column, not DATE-type? I really cannot answer your real question, but say only that DATE and other time types are numeric, although mostly constructed. DATE takes three bytes with range '1000-01-01' to '-12-31'. TIMESTAMP (which has special features that one can suppress) is a four-byte integer that is a 31-bit Unix timestamp with range '1970-01-01 00:00:01.00' UTC to '2038-01-19 03:14:07.99' UTC. Maybe TIMESTAMP, which doubtless uses the underlying integer mechanism for comparison, is best for you. Consider also the functions UNIX_TIMESTAMP and FROM_UNIXTIME. The zone is not involved in DATE, but is involved in the rest aforesaid. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
From: Zbigniew zbigniew2...@gmail.com Now to the point: considering, that the second column shall contain about 100-200 different labels - so in average many of such labels can be repeated one million times (or even more) What about using ENUMs? They have essentially the performance of INTEGERs, but you don't have to maintain a string mapping in your programming logic. Yes'm, old friends is always best, 'less you can catch a new one that's fit to make an old one out of. -- Sarah Jewett Jan Steinman, EcoReality Co-op -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
From: Zbigniew zbigniew2...@gmail.com Now to the point: considering, that the second column shall contain about 100-200 different labels - so in average many of such labels can be repeated one million times (or even more) What about using ENUMs? They have nearly the performance of INTEGERs, but you don't have to maintain a string mapping in your programming logic. Yes'm, old friends is always best, 'less you can catch a new one that's fit to make an old one out of. -- Sarah Jewett Jan Steinman, EcoReality Co-op -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?
I'm going to establish a large database (ISAM) of simple structure, say the table with only 3 columns: - VARCHAR(80) - VARCHAR(40) - DATE The number of rows can be quite large, about hundred million or so. The first column, containing actual information, will contain unique values, unlike the two others - but the two others shall be used for data selection (and I'll index them). Now to the point: considering, that the second column shall contain about 100-200 different labels - so in average many of such labels can be repeated one million times (or even more) - will it speed-up the selection done with something like ...WHERE label='xyz' AND date='2013-02-25' (and maybe the insertion as well?), if instead of textual labels I'll use SMALLINT as simple integer code for each different label (and in my program a translation table)? If so - will I have further performance gain, if instead of human-readable DATE for the third column I'll use Unix time put into INTEGER type column, not DATE-type? And the final question: even, if so - is it worthy? I mean: will the supposed performance gain be significant (e.g. 2-3 times faster selection) - and not, say, just 5% faster (only possible to detect by using benchmarking tools)? Thanks in advance for your opinions. -- Zbig -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql