Re: Re: create a hive table: always a tab space before each line

2013-01-14 Thread Dean Wampler
Hadoop supports SequenceFiles natively. Hadoop: The Definitive Guide
discusses the details.
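
If a custom job ends up consuming the text rendering of these files rather than the SequenceFile records themselves, dropping the leading tab is a small step. The following is only a minimal sketch, assuming each dumped line has the form key<TAB>value with an empty key and \001-separated fields. parse_dumped_line is a hypothetical helper, not part of Hadoop or Hive. A job that reads the files through Hadoop's SequenceFile support receives key and value as separate objects, so it should not need this step at all.

```python
from typing import List

FIELD_SEP = "\x01"  # the '\001' delimiter from the table definition


def parse_dumped_line(line: str) -> List[str]:
    """Return the fields of one 'key<TAB>value' line from a text dump."""
    # partition() splits on the first tab only, so tabs inside the value survive.
    key, tab, value = line.rstrip("\n").partition("\t")
    if not tab:
        # No tab at all: treat the whole line as the value.
        value = key
    return value.split(FIELD_SEP)


print(parse_dumped_line("\tf1\x01f2\x01f3\n"))  # prints ['f1', 'f2', 'f3']
```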

dean

On Mon, Jan 14, 2013 at 8:58 AM, Richard wrote:

> Thanks.
> It seems that as long as I use sequencefile as the storage format, there
> will be a \t before the first column. If this output is only ever consumed
> by Hive, that is fine. The problem is that I may use a custom MapReduce
> job to read these files. Does that mean I have to handle this \t myself?
>
> Is there any option to disable this \t in Hive?


-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330


Re: Re: create a hive table: always a tab space before each line

2013-01-14 Thread Richard
Thanks.
It seems that as long as I use sequencefile as the storage format, there
will be a \t before the first column. If this output is only ever consumed
by Hive, that is fine. The problem is that I may use a custom MapReduce
job to read these files. Does that mean I have to handle this \t myself?

Is there any option to disable this \t in Hive?




At 2013-01-09 22:38:11, "Dean Wampler" wrote:

> To add to what Nitin said, there is no key output by Hive in front of the tab.





Re: create a hive table: always a tab space before each line

2013-01-09 Thread Dean Wampler
To add to what Nitin said: Hive does not output a key, which is why nothing
appears in front of the tab.

On Wed, Jan 9, 2013 at 3:07 AM, Nitin Pawar wrote:

> You may want to look at the SequenceFile format:
>
> http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432
>
> That tab separates the key from the value in each record (I may be wrong,
> but this is how I interpreted it).



-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330


Re: Re: create a hive table: always a tab space before each line

2013-01-09 Thread Richard
I am creating the table and then populating it with insert overwrite, so the
data is generated by the query.






At 2013-01-09 17:17:06, "Anurag Tangri" wrote:

> Hi Richard,
> You should set the format in the create external table command based on the
> format of your data on HDFS.
>
> Is your data a text file or a sequence file on HDFS?
>
> Thanks,
> Anurag Tangri








Re: create a hive table: always a tab space before each line

2013-01-09 Thread Anurag Tangri
Hi Richard,
You should set the format in the create external table command based on the
format of your data on HDFS.

Is your data a text file or a sequence file on HDFS?

Thanks,
Anurag Tangri


On Jan 9, 2013, at 12:49 AM, Richard wrote:

> More information:
>
> If I set the format to textfile, there is no leading tab.
> If I set the format to sequencefile and view the content via hadoop fs -text,
> I see a tab at the start of each line.


Re: create a hive table: always a tab space before each line

2013-01-09 Thread Nitin Pawar
You may want to look at the SequenceFile format:
http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432

That tab separates the key from the value in each record (I may be wrong, but
this is how I interpreted it).
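
The key/value reading above can be illustrated with a small sketch. This is not Hadoop or Hive code. It only mimics what a text dump looks like when the key is empty, under the assumption that a dump such as hadoop fs -text prints each record as the key, a tab, then the value. The function name render_record and the sample field values are made up for illustration.

```python
FIELD_SEP = "\x01"  # the '\001' field delimiter from the table definition


def render_record(key: str, value: str) -> str:
    """Mimic a key<TAB>value text rendering of one record (assumed layout)."""
    return f"{key}\t{value}"


# Hypothetical row with three fields. The empty key stands in for a record
# whose key carries no data.
row = FIELD_SEP.join(["f1", "f2", "f3"])
line = render_record("", row)

print(repr(line))  # prints '\tf1\x01f2\x01f3'
```

With an empty key the tab is the first character of the line, which matches the \t f1 \001 f2 \001 f3 pattern reported in the original question.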


On Wed, Jan 9, 2013 at 12:49 AM, Richard wrote:

> More information:
>
> If I set the format to textfile, there is no leading tab.
> If I set the format to sequencefile and view the content via hadoop fs
> -text, I see a tab at the start of each line.


-- 
Nitin Pawar


create a hive table: always a tab space before each line

2013-01-08 Thread Richard
Hi there,

I have a problem with creating a Hive table.
No matter which field delimiter I use, I always get a tab character at the
start of each line (a line is a record).
Something like this:
\t f1 \001 f2 \001 f3 ...
where f1, f2, f3 denote the field values and \001 is the field separator.


Here is the statement I used:
create external table if not exists ${HIVETBL_my_table}
(
  nid string,
  userid string,
  spv bigint,
  sipv bigint,
  pay bigint,
  spay bigint,
  ipv bigint,
  sellerid string,
  cate string
)
partitioned by (ds string)
-- fields are separated by \001 and records by newlines
row format delimited fields terminated by '\001' lines terminated by '\n'
-- the leading tab only shows up with this storage format, not with textfile
stored as sequencefile
location '${HADOOP_PATH_4_MY_HIVE}/${HIVETBL_my_table}';


Thanks for your help.


Richard