Re: Make hive support various charsets
Hi,

I have created an issue, HIVE-2917, and submitted a patch through Phabricator.
Would anyone like to review it?

Thanks,
Kai Zhang

2012/3/29 Namit Jain

> Kai,
>
> That would be great.
>
> Please file a JIRA and submit a patch.
> We would definitely like to get it for the whole community.
>
> Thanks,
> -namit
>
> On 3/28/12 8:46 PM, "Zhang Kai" wrote:
>
> >Hi all,
> >
> >I've been working with Hive for some time.
> >
> >At my company, we use Hive to query large datasets, and we have found it
> >very easy to use.
> >
> >However, we also found that Hive lacks support for various charsets, so we
> >have to manually convert data files to UTF-8 before loading them into Hive.
> >
> >So I have made a patch that lets Hive set a charset when creating a table.
> >The charset property is then used by the SerDe when it serializes or
> >deserializes data.
> >
> >The modified HQL looks like this:
> >
> >CREATE TABLE tbl1 (col1 STRING) ROW FORMAT CHARSET "GBK" DELIMITED FIELDS
> >TERMINATED BY '\t';
> >
> >I'm happy to contribute this to the community and look forward to your
> >feedback.
> >
> >Thanks,
> >Kai Zhang
Re: Make hive support various charsets
Kai,

That would be great.

Please file a JIRA and submit a patch.
We would definitely like to get it for the whole community.

Thanks,
-namit

On 3/28/12 8:46 PM, "Zhang Kai" wrote:

>Hi all,
>
>I've been working with Hive for some time.
>
>At my company, we use Hive to query large datasets, and we have found it
>very easy to use.
>
>However, we also found that Hive lacks support for various charsets, so we
>have to manually convert data files to UTF-8 before loading them into Hive.
>
>So I have made a patch that lets Hive set a charset when creating a table.
>The charset property is then used by the SerDe when it serializes or
>deserializes data.
>
>The modified HQL looks like this:
>
>CREATE TABLE tbl1 (col1 STRING) ROW FORMAT CHARSET "GBK" DELIMITED FIELDS
>TERMINATED BY '\t';
>
>I'm happy to contribute this to the community and look forward to your
>feedback.
>
>Thanks,
>Kai Zhang
Make hive support various charsets
Hi all,

I've been working with Hive for some time.

At my company, we use Hive to query large datasets, and we have found it very easy to use.

However, we also found that Hive lacks support for various charsets, so we have to manually convert data files to UTF-8 before loading them into Hive.

So I have made a patch that lets Hive set a charset when creating a table. The charset property is then used by the SerDe when it serializes or deserializes data.

The modified HQL looks like this:

CREATE TABLE tbl1 (col1 STRING) ROW FORMAT CHARSET "GBK" DELIMITED FIELDS TERMINATED BY '\t';

I'm happy to contribute this to the community and look forward to your feedback.

Thanks,
Kai Zhang
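
For readers following the thread, here is a rough sketch of the charset handling the message describes, written in plain Java. It is not the HIVE-2917 patch and uses no Hive classes: CharsetTranscoder and its methods are hypothetical names chosen for illustration. It only shows what "using a charset property when serializing or deserializing" could look like, next to the manual UTF-8 conversion the patch is meant to avoid.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: CharsetTranscoder is a hypothetical name, not a class
// from Hive or from the HIVE-2917 patch.
public class CharsetTranscoder {

    // Deserialize side: interpret raw file bytes using the table's charset.
    public static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Serialize side: write text back out in the table's charset.
    public static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // The manual workaround: convert file bytes to UTF-8 before loading.
    public static byte[] toUtf8(byte[] raw, String charsetName) {
        return decode(raw, charsetName).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String charsetName = "GBK";
        // GBK is an optional charset, so check before using it.
        if (!Charset.isSupported(charsetName)) {
            System.out.println(charsetName + " is not available on this JVM");
            return;
        }
        // Two Chinese characters encoded as GBK (two bytes each).
        byte[] gbkBytes = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        String text = decode(gbkBytes, charsetName);
        System.out.println(text.length() + " characters decoded");
        System.out.println(toUtf8(gbkBytes, charsetName).length + " bytes as UTF-8");
    }
}
```

With a per-table charset, the decode and encode steps would presumably happen inside the SerDe, so the separate toUtf8 pass over the data files would no longer be needed.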