Thanks both. A STRING has a maximum length of 2 GB, so in MapReduce with a 128 MB block size a maximum-length value would span 16 blocks. A VARCHAR(30) value fits within 1 block. I have not really experimented with this; however, I assume a table of 100k rows with VARCHAR columns will have a smaller footprint in HDFS than the same table with STRING columns?
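As a rough check of the arithmetic above, here is a minimal Python sketch. It is not a measurement of Hive or Parquet: `blocks_needed` is a hypothetical helper, the 30-byte row size assumes one byte per character, and file format overhead, encoding and compression are ignored.

```python
import math

BLOCK_SIZE = 128 * 1024**2  # 128 MB HDFS block size, as quoted above


def blocks_needed(num_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks needed to hold num_bytes of data (at least 1)."""
    return max(1, math.ceil(num_bytes / block_size))


max_string = 2 * 1024**3    # one STRING value at its 2 GB upper bound
one_varchar = 30            # one VARCHAR(30) value, assuming 1 byte per character
table_100k = 100_000 * 30   # 100k rows of one 30-byte value each, raw bytes only

print(blocks_needed(max_string))   # 16
print(blocks_needed(one_varchar))  # 1
print(blocks_needed(table_100k))   # 1
```

In this sketch the block count is driven only by the number of bytes passed in, so 100k rows of 30-byte values come to about 3 MB. Whether the declared column type changes the bytes Hive actually writes is the open question here.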
Cheers

Dr Mich Talebzadeh

LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On 16 January 2017 at 15:49, sreebalineni . <sreebalin...@gmail.com> wrote:

> How is that efficient storage-wise? As far as I can see, the data is in
> HDFS and storage is based on your block size.
>
> Am I missing something here?
>
> On Jan 16, 2017 9:07 PM, "Mich Talebzadeh" <mich.talebza...@gmail.com>
> wrote:
>
> > Coming from a DBMS background, I tend to treat the columns in Hive
> > similarly to those of an RDBMS table. For example, if a table is created
> > in Hive as Parquet, I will use VARCHAR(30) for a column that has been
> > defined as VARCHAR(30) at source. If a column is defined as TEXT in the
> > RDBMS table, I use STRING in Hive, with a max size of 2GB I believe.
> >
> > My view is that it is more efficient storage-wise to have a Hive table
> > created with VARCHAR as opposed to STRING.
> >
> > I have not really seen any performance difference whether one uses
> > VARCHAR or STRING. However, I believe there is a reason why one has
> > VARCHAR in Hive as opposed to STRING.
> >
> > What is the thread's view on this?
> >
> > Thanks
> >
> > Dr Mich Talebzadeh