Re: Re: create a hive table: always a tab space before each line
Hadoop supports SequenceFiles natively. "Hadoop: The Definitive Guide" discusses the details.

dean

On Mon, Jan 14, 2013 at 8:58 AM, Richard wrote:
> Thanks.
> It seems that as long as I use sequencefile as the storage format, there
> will be a \t before the first column. If this output is only consumed by
> Hive, that is fine. The problem is that I may use a custom MapReduce
> job to read these files. Does that mean I have to handle
> this \t myself?
>
> Is there any option to disable this \t in Hive?

--
Dean Wampler, Ph.D.
thinkbiganalytics.com
+1-312-339-1330
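On the question of a custom job: a minimal sketch, assuming (as the replies in this thread suggest) that the tab only separates an empty key from the value when the file is printed as text, so a reader that receives the record value directly sees just the \001-delimited row. The function names are hypothetical; the column list is copied from the CREATE TABLE statement in the original post. The partition column ds is left out because Hive keeps partition values in the directory path, not in the row.

```python
FIELD_SEP = "\x01"  # the '\001' field terminator from the table definition

# Column names copied from the CREATE TABLE statement in the original post.
COLUMNS = ["nid", "userid", "spv", "sipv", "pay", "spay", "ipv", "sellerid", "cate"]


def parse_row(value: str) -> dict:
    """Split one record value into named fields (hypothetical helper)."""
    fields = value.rstrip("\n").split(FIELD_SEP)
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


def parse_displayed_line(line: str) -> dict:
    """Parse a line captured from text output, where the key and a tab come first.

    Everything up to the first tab is treated as the key and discarded.
    """
    _key, _tab, value = line.partition("\t")
    return parse_row(value)


value = FIELD_SEP.join(["n1", "u1", "1", "2", "3", "4", "5", "s1", "c1"])
from_value = parse_row(value)
from_text = parse_displayed_line("\t" + value)
print(from_value == from_text)
```

If the job instead parses lines dumped with hadoop fs -text, parse_displayed_line shows the one extra step needed: drop everything up to and including the first tab.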
Re: Re: create a hive table: always a tab space before each line
Thanks.

It seems that as long as I use sequencefile as the storage format, there will be a \t before the first column. If this output is only consumed by Hive, that is fine. The problem is that I may use a custom MapReduce job to read these files. Does that mean I have to handle this \t myself?

Is there any option to disable this \t in Hive?

At 2013-01-09 22:38:11, "Dean Wampler" wrote:
> To add to what Nitin said, there is no key output by Hive in front of the
> tab.
Re: create a hive table: always a tab space before each line
To add to what Nitin said, there is no key output by Hive in front of the tab.

On Wed, Jan 9, 2013 at 3:07 AM, Nitin Pawar wrote:
> You may want to look at the sequencefile format:
>
> http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432
>
> That tab separates the key from the value in each record (I may be wrong,
> but this is how I interpreted it).

--
Dean Wampler, Ph.D.
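The two replies above fit together as follows. This is a small model, not Hadoop's actual code: it assumes that a text dump of a key/value file prints each record as the key, a tab, then the value, and that the key is empty. The function name render_record is hypothetical.

```python
def render_record(key: str, value: str) -> str:
    """Model one key/value record printed as text: key, a tab, then the value."""
    return f"{key}\t{value}"


# A row as described in the thread: field values joined by \001.
row = "\x01".join(["f1", "f2", "f3"])

# Assumption from the reply above: nothing is written as the key,
# so the tab separator ends up as the first character of the line.
line = render_record("", row)
print(repr(line))  # '\tf1\x01f2\x01f3'
```

Under this model the leading tab belongs to the printed representation of the record, which matches the observation that a textfile table, having no key, shows no tab.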
Re: Re: create a hive table: always a tab space before each line
I am creating the table and then populating it with INSERT OVERWRITE, so the data is generated by that insert.

At 2013-01-09 17:17:06, "Anurag Tangri" wrote:
> Hi Richard,
> You should set the format in the create external table command based on the
> format of your data on HDFS.
>
> Is your data a text file or a sequence file on HDFS?
>
> Thanks,
> Anurag Tangri
Re: create a hive table: always a tab space before each line
Hi Richard,

You should set the format in the create external table command based on the format of your data on HDFS.

Is your data a text file or a sequence file on HDFS?

Thanks,
Anurag Tangri

On Jan 9, 2013, at 12:49 AM, Richard wrote:
> More information:
>
> If I set the format to textfile, there is no tab.
> If I set the format to sequencefile and view the content via hadoop fs -text,
> I see a tab at the start of each line.
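One way to answer the "text file or sequence file" question for an existing file is to look at its first bytes: a SequenceFile begins with the ASCII bytes SEQ followed by a version byte. The sketch below checks a local copy of a file (for example one fetched with hadoop fs -get). The function name and the sample file contents are hypothetical, and the check is only a heuristic, since a plain text file could also happen to start with those three letters.

```python
import os
import tempfile

SEQ_MAGIC = b"SEQ"  # leading bytes of a SequenceFile header


def looks_like_sequencefile(path: str) -> bool:
    """Return True if the file at `path` starts with the SequenceFile magic bytes."""
    with open(path, "rb") as f:
        return f.read(len(SEQ_MAGIC)) == SEQ_MAGIC


def demo() -> tuple:
    """Build two made-up sample files and classify them."""
    with tempfile.TemporaryDirectory() as tmp:
        seq_path = os.path.join(tmp, "part-seq")
        txt_path = os.path.join(tmp, "part-txt")
        with open(seq_path, "wb") as f:
            # Stand-in header: magic bytes, a version byte, then filler.
            f.write(SEQ_MAGIC + b"\x06" + b"made-up remainder")
        with open(txt_path, "wb") as f:
            # A plain delimited text row, as a textfile table would store it.
            f.write(b"f1\x01f2\x01f3\n")
        return looks_like_sequencefile(seq_path), looks_like_sequencefile(txt_path)


print(demo())
```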
Re: create a hive table: always a tab space before each line
You may want to look at the sequencefile format:

http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432

That tab separates the key from the value in each record (I may be wrong, but this is how I interpreted it).

On Wed, Jan 9, 2013 at 12:49 AM, Richard wrote:
> More information:
>
> If I set the format to textfile, there is no tab.
> If I set the format to sequencefile and view the content via hadoop fs
> -text, I see a tab at the start of each line.

--
Nitin Pawar
create a hive table: always a tab space before each line
Hi there,

I have a problem with creating a Hive table.

No matter what field delimiter I use, I always get a tab at the start of each line (a line is a record).

Something like this:

    \t f1 \001 f2 \001 f3 ...

where f1, f2, f3 denote the field values and \001 is the field separator.

Here is the statement I used:

    create external table if not exists ${HIVETBL_my_table}
    (
        nid string,
        userid string,
        spv bigint,
        sipv bigint,
        pay bigint,
        spay bigint,
        ipv bigint,
        sellerid string,
        cate string
    )
    partitioned by (ds string)
    row format delimited fields terminated by '\001' lines terminated by '\n'
    stored as sequencefile
    location '${HADOOP_PATH_4_MY_HIVE}/${HIVETBL_my_table}';

Thanks for the help.

Richard