Re: Make hive support various charsets

2012-04-08 Thread Zhang Kai
Hi

I have created issue HIVE-2917 and submitted a patch through Phabricator.

Is there anyone who would like to review it?

Thanks,
Kai Zhang


Re: Make hive support various charsets

2012-03-28 Thread Namit Jain
Kai,

That would be great.

Please file a JIRA and submit a patch.
We would definitely like to get it for the whole community.


Thanks,
-namit


Make hive support various charsets

2012-03-28 Thread Zhang Kai
Hi all

I've been working with Hive for some time.

At my company, we use Hive to query large datasets and have found it
very easy to use.

However, we also found that Hive lacks support for charsets other than
UTF-8, so we have to manually convert data files to UTF-8 before loading
them into Hive.
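
For readers unfamiliar with the workaround, the manual conversion step could look roughly like the sketch below. It is only an illustration, not part of the patch: the class name `ToUtf8` and its command-line arguments are hypothetical, and it assumes the source charset of each file is already known.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: re-encode a text file to UTF-8 before loading it.
public class ToUtf8 {

    // Reads `in` using the given source charset and writes `out` as UTF-8.
    static void convert(Path in, Path out, Charset source) throws IOException {
        try (Reader reader = Files.newBufferedReader(in, source);
             Writer writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            int n;
            while ((n = reader.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ToUtf8 <input> <output> <source-charset>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]), Charset.forName(args[2]));
    }
}
```

Running a step like this over every incoming file is the extra work that a table-level charset setting would remove.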

So I have written a patch that lets Hive set a charset when a table is
created.
The charset property is then used by the SerDe when it serializes or
deserializes data.

The modified HQL looks like this:

CREATE TABLE tbl1 (col1 STRING) ROW FORMAT CHARSET "GBK" DELIMITED FIELDS
TERMINATED BY '\t';
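
To make the idea concrete, here is a minimal Java sketch of what charset-aware serialization and deserialization amount to. This is not the code from the patch and does not use Hive's SerDe interfaces; `CharsetAwareDecoder`, `decodeRow`, and `encodeRow` are hypothetical names, and the charset name is assumed to come from the table definition. The example also assumes the JVM ships the GBK charset, which standard JDK builds do.

```java
import java.nio.charset.Charset;

// Hypothetical illustration of decoding and encoding rows with a table-level charset.
public class CharsetAwareDecoder {

    // Deserialize: interpret the raw bytes of a row using the declared charset.
    static String decodeRow(byte[] rawRow, String charsetName) {
        return new String(rawRow, Charset.forName(charsetName));
    }

    // Serialize: turn a row back into bytes in the declared charset.
    static byte[] encodeRow(String row, String charsetName) {
        return row.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The two characters "中文" encoded in GBK.
        byte[] gbkBytes = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};

        String row = decodeRow(gbkBytes, "GBK");
        System.out.println(row);

        // The same text takes a different number of bytes in UTF-8.
        System.out.println(encodeRow(row, "UTF-8").length);
    }
}
```

Decoding the same bytes as UTF-8 instead would produce replacement characters, which is why the charset has to be known at read time.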

I'm very happy to contribute this to the community and look forward to
your feedback.

Thanks,
Kai Zhang