On Sat, Apr 23, 2016 at 12:50 PM, Keith Medcalf <kmedcalf at dessus.com> wrote:
> When a file is a bag-o-bytes rather than a collection-of-records (fixed
> length records at that), there is no longer any need to keep the records
> the same length, hence no need to specify the "reserved length" of things
> (contrary to common mis-belief, varchar(100) does not mean a "variable
> length character field with a length up to 100 characters", it means "a
> character field with a length of 100 into which a smaller amount of data
> may be stored that will be padded to 100 characters with special
> discarded-upon-retrieval padding characters" -- that is, the difference
> between CHAR(100) and VARCHAR(100) is that in the former case data less
> than 100 characters is padded with spaces, and in the latter case it is
> padded with nulls. The column is still, however, 100 characters wide.

That's not necessarily true in the case of VARCHAR. It could be done that way, but it could just as easily, and probably much more efficiently, be done in a truly variable-length fashion. Microsoft SQL Server (as far as I understand, being the only other SQL engine I've spent much time with) stores VARCHAR columns as the actual number of bytes in the field plus two bytes of overhead. So a 100-byte string stored in a VARCHAR(10000) column needs only 102 bytes of storage, not 10000 or 10002. My source (because I haven't used MSSQL for a few years) is https://msdn.microsoft.com/en-us/library/ms176089.aspx, which I used to refresh my memory.

For any SQL datastore, how the data is stored is entirely an implementation detail. The SQL engine would be free to serialize all values to text and store them in 4K pages if it wanted to, then deserialize them on the way back. I certainly don't know of any that do that, but the impetus for the creation of VARCHAR fields (I imagine) was specifically to avoid storing padding for data that does not require it.

-- 
Scott Robison
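
P.S. To make the storage arithmetic above concrete, here is a rough C sketch of the two layouts being discussed. It is only a hypothetical model, not the actual record format of SQL Server, SQLite, or any other engine: a fixed-width CHAR(n) slot always costs n bytes, while a length-prefixed VARCHAR value costs its actual length plus a two-byte prefix.

#include <stdio.h>
#include <string.h>

/* Rough model only -- not any engine's real on-disk format.
 * A CHAR(n) slot always occupies n bytes (short values are padded
 * out), while the VARCHAR scheme described above stores the actual
 * bytes plus a 2-byte length prefix. */

static size_t char_slot_size(size_t declared_width)
{
    return declared_width;        /* padding fills the rest of the slot */
}

static size_t varchar_slot_size(const char *value)
{
    return strlen(value) + 2;     /* actual data plus 2 bytes of overhead */
}

int main(void)
{
    char value[101];
    memset(value, 'x', 100);      /* a 100-byte string */
    value[100] = '\0';

    printf("CHAR(10000):    %zu bytes\n", char_slot_size(10000));
    printf("VARCHAR(10000): %zu bytes\n", varchar_slot_size(value));
    return 0;
}

Running it prints 10000 bytes for the CHAR layout and 102 bytes for the length-prefixed layout, which is where the 102-versus-10000 figure above comes from.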