Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-11-06 Thread Roberta Jask�lski
 2014/11/04 11:04 -0800, Jan Steinman 
I was not suggesting it for dates. The OP appeared to have a well-defined set 
of strings in a VARCHAR field — THAT is what I suggested ENUMs for!



What is the update frequency of those VARCHARs? If you're adding them often — 
or if you need to occasionally change their value — I'd use another table with 
a reference.

If they're immutable and new ones are not added often, there's no design cost 
at all to using ENUMs. I'd argue there's a higher maintenance cost to NOT using 
them! 

Ugh--I missed the discussion shift from the DATEs to the VARCHAR labels ... and 
now I wholeheartedly agree with you.

As for the DATEs, I yet suspect that for performance maybe TIMESTAMP is 
slightly better than DATE.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-11-06 Thread Zbigniew
2014-11-06 21:49 GMT+01:00, Roberta Jaskólski h...@tbbs.net:

 Ugh--I missed the discussion shift from the DATEs to the VARCHAR labels ...
 and now I wholeheartedly agree with you.

 As for the DATEs, I yet suspect that for performance maybe TIMESTAMP is
 slightly better than DATE.

Well what I'm interested in - and I was asking my original question
about - is SIGNIFICANT difference. If everything I can count for is
just slight improvements (by which I understand difference that can
be detected only by benchmarks, but not really during database
operation), then actually it doesn't make much sense, it seems.
-- 
Zbig

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-11-04 Thread Johan De Meersman


- Original Message -
 From: Zbigniew zbigniew2...@gmail.com
 Subject: Using INTEGER instead of VARCHAR/DATE - is this a way to faster 
 access?

 What about using ENUMs? They have nearly the performance of INTEGERs, but
 you don't have to maintain a string mapping in your programming logic.
 
 But are you able to estimate, what boost can i notice? 5% - or 50%,
 or maybe even 500%?

I'll give you an interesting reason to switch to ENUM (or smallint, if so 
inclined): Your data fields will be smaller. That not only means more records 
in a page (might be negligable), but more importantly, it'll make the index on 
that field smaller, meaning a) more of it will remain in memory and b) lookups 
on it will be marginally faster, too.

I have no hard data on how it'll impact index performance (your dataset is 
yours to benchmark), but on one million of records (and you were talking 
several), a each byte saved is a megabyte of memory that can be used for other 
purposes, like data cache, which will speed up other things, too.


-- 
Unhappiness is discouraged and will be corrected with kitten pictures.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-11-04 Thread Jan Steinman
 From: (Halász Sándor) h...@tbbs.net
 
 2014/10/29 20:56 +0100, Zbigniew 
 if instead of textual labels I'll use SMALLINT as simple integer code for 
 each different label (and in my program a translation table)? 
 
 This, in effect, is ENUM...

Well, not really! With INTEGERs, your referential integrity is external to 
MySQL, and has to be managed.

 ... and maybe is useful if the set of dates is well determined...

I was not suggesting it for dates. The OP appeared to have a well-defined set 
of strings in a VARCHAR field — THAT is what I suggested ENUMs for!

 There is a design cost in using ENUM: If you find that your first set of 
 dates is too small, later, with ALTER TABLE, you have to change the type.

Again, the suggestion for ENUM was to replace a constrained set of VARCHARs, 
and yet, you raise a valid point.

What is the update frequency of those VARCHARs? If you're adding them often — 
or if you need to occasionally change their value — I'd use another table with 
a reference.

If they're immutable and new ones are not added often, there's no design cost 
at all to using ENUMs. I'd argue there's a higher maintenance cost to NOT using 
them!

 Jan Steinman, EcoReality Co-op 


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-11-03 Thread Hal�sz S�ndor
 2014/11/02 13:19 +0100, Zbigniew 
So you guys (Jan and hsv) suggest, that switching from DATE to more
numeric data type may not be necessary, but using ENUM instead of
VARCHAR can be real performance gain, right?

But are you able to estimate, what boost can i notice? 5% - or 50%,
or maybe even 500%? 

_Maybe_! but I will not venture to measure it. I doubt that it would be big. I 
just looked ENUM up: it allows 65535 distinct values, which sounds like 16 
bits, usually two bytes, and numeric operators would be used.

 2014/11/02 11:19 -0800, Jan Steinman 
I would hope that the query optimizer converts '2014-11-02' to the three-bytes 
internal representation of DATE before doing the query, in which case, DATE 
should actually be a tiny bit faster than TIMESTAMP. 

That is doubtful. In the processors that I know, one built-in numeric operation 
is enough for either 2 bytes of ENUM or 4 bytes of TIMESTAMP, but three are 
needed for DATE. In any case, the C-overhead, since MySQL is implemented in C, 
overwhelms the cost of any single comparison. The equality comparison is at 
least simpler than an ordered comparison.

 2014/10/29 20:56 +0100, Zbigniew 
if instead of textual labels I'll use SMALLINT as simple integer code for 
each different label (and in my program a translation table)? 

This, in effect, is ENUM, and maybe is useful if the set of dates is well 
determined. If you use ENUM at least the overhead of translation is built in 
into MySQL, and, one hopes, costs less than doing it for oneself.

There is a design cost in using ENUM: If you find that your first set of dates 
is too small, later, with ALTER TABLE, you have to change the type. If you add 
the new string to the end, there is, maybe, no great cost to the adding, but if 
in the middle If every new date-string is added to the end, it will, maybe, 
be in random order. The same applys to your translation table.

I suggested TIMESTAMP because I suspect that one built-in comparison (after the 
optimizer is done with it) is enough, and it allows the appearance of real 
dates. (If the processor fetches 32 bits at a time (nowadays 64 or more is 
likly) then a 32-bit type is fetched as swiftly as any other. Both shorter and 
longer types take longer.) The more I debate this, the better I like TIMESTAMP 
for your problem.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-11-02 Thread Zbigniew
2014-10-31 5:29 GMT+01:00, Jan Steinman j...@ecoreality.org:

 What about using ENUMs? They have nearly the performance of INTEGERs, but
 you don't have to maintain a string mapping in your programming logic.

So you guys (Jan and hsv) suggest, that switching from DATE to more
numeric data type may not be necessary, but using ENUM instead of
VARCHAR can be real performance gain, right?

But are you able to estimate, what boost can i notice? 5% - or 50%,
or maybe even 500%?
-- 
Zbig

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-11-02 Thread Jan Steinman
 From: Zbigniew zbigniew2...@gmail.com
 
 switching from DATE to more numeric data type may not be necessary...

I would hope that the query optimizer converts '2014-11-02' to the three-bytes 
internal representation of DATE before doing the query, in which case, DATE 
should actually be a tiny bit faster than TIMESTAMP.

 using ENUM instead of VARCHAR can be real performance gain, right?

Not just in performance, but it appears to simply be The Right Thing To Do(TM) 
in your case. (Codd Rule #10: referential integrity.)

Consider an insert into a day-of-week column (for instance) that somehow got 
Sudnay in the VARCHAR field instead of Sunday. Using an ENUM eliminates the 
possibility of a typo at a more fundamental level than your programming logic. 
If you do a massive insert with Sudnay in the ENUM field, the entire 
transaction will fail, which is really what you want rather than having to 
track down bad data after the fact, no?

If it REALLY is one value out of a known set, it SHOULD be either an ENUM, or a 
reference to another table. Use the latter technique if you need to add new 
values very often.

 But are you able to estimate, what boost can i notice? 5% - or 50%, or 
 maybe even 500%?

Very hard to say. That's like saying, If I eat well and get enough exercise, 
will I live 5% or 50% or 500% longer? Probably more like 5%, but it may FEEL 
like 500%! :-)

If the value is constrained to a set, having it as an ENUM (or reference to 
another table) will save you grief in many other ways besides boosting 
performance.

 Private enterprise, indeed, became too private. It became privileged 
enterprise, not private enterprise. -- Franklin Delano Roosevelt
 Jan Steinman, EcoReality Co-op 


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-10-30 Thread hsv
 2014/10/29 20:56 +0100, Zbigniew 
Now to the point: considering, that the second column shall contain
about 100-200 different labels - so in average many of such labels
can be repeated one million times (or even more) - will it speed-up
the selection done with something like ...WHERE label='xyz' AND
date='2013-02-25'  (and maybe the insertion as well?), if instead of
textual labels I'll use SMALLINT as simple integer code for each
different label (and in my program a translation table)?

If so - will I have further performance gain, if instead of
human-readable DATE for the third column I'll use Unix time put into
INTEGER type column, not DATE-type? 

I really cannot answer your real question, but say only that DATE and other 
time types are numeric, although mostly constructed.
DATE takes three bytes with range '1000-01-01' to '-12-31'.
TIMESTAMP (which has special features that one can suppress) is a four-byte 
integer that is a 31-bit Unix timestamp with range '1970-01-01 00:00:01.00' 
UTC to '2038-01-19 03:14:07.99' UTC.

Maybe TIMESTAMP, which doubtless uses the underlying integer mechanism for 
comparison, is best for you. Consider also the functions UNIX_TIMESTAMP and 
FROM_UNIXTIME.

The zone is not involved in DATE, but is involved in the rest aforesaid.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-10-30 Thread Jan Steinman
 From: Zbigniew zbigniew2...@gmail.com
 
 Now to the point: considering, that the second column shall contain
 about 100-200 different labels - so in average many of such labels
 can be repeated one million times (or even more) 

What about using ENUMs?

They have essentially the performance of INTEGERs, but you don't have to 
maintain a string mapping in your programming logic.

 Yes'm, old friends is always best, 'less you can catch a new one that's 
fit to make an old one out of. -- Sarah Jewett
 Jan Steinman, EcoReality Co-op 


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-10-30 Thread Jan Steinman
 From: Zbigniew zbigniew2...@gmail.com
 
 Now to the point: considering, that the second column shall contain
 about 100-200 different labels - so in average many of such labels
 can be repeated one million times (or even more) 

What about using ENUMs? They have nearly the performance of INTEGERs, but you 
don't have to maintain a string mapping in your programming logic.

 Yes'm, old friends is always best, 'less you can catch a new one that's 
fit to make an old one out of. -- Sarah Jewett
 Jan Steinman, EcoReality Co-op 


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Using INTEGER instead of VARCHAR/DATE - is this a way to faster access?

2014-10-29 Thread Zbigniew
I'm going to establish a large database (ISAM) of simple structure,
say the table with only 3 columns:

- VARCHAR(80)
- VARCHAR(40)
- DATE

The number of rows can be quite large, about hundred million or so.
The first column, containing actual information, will contain unique
values, unlike the two others - but the two others shall be used for
data selection (and I'll index them).

Now to the point: considering, that the second column shall contain
about 100-200 different labels - so in average many of such labels
can be repeated one million times (or even more) - will it speed-up
the selection done with something like ...WHERE label='xyz' AND
date='2013-02-25'  (and maybe the insertion as well?), if instead of
textual labels I'll use SMALLINT as simple integer code for each
different label (and in my program a translation table)?

If so - will I have further performance gain, if instead of
human-readable DATE for the third column I'll use Unix time put into
INTEGER type column, not DATE-type?

And the final question: even, if so - is it worthy? I mean: will the
supposed performance gain be significant (e.g. 2-3 times faster
selection) - and not, say, just 5% faster (only possible to detect by
using benchmarking tools)?

Thanks in advance for your opinions.
-- 
Zbig

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql