Hadoop supports SequenceFiles natively. Hadoop: The Definitive Guide discusses the details.
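
To make the point about the tab concrete, here is a rough sketch in plain Python (no Hadoop involved). It assumes, as discussed further down the thread, that each record is a key/value pair, that Hive leaves the key empty, and that a text dump such as hadoop fs -text prints key, tab, value. The function names are made up for illustration and are not part of any Hadoop or Hive API. If that assumption holds, a custom job that reads the SequenceFile directly gets the value on its own and should never see the tab; only a consumer of the text dump has to skip it.

```python
from typing import List

# '\001' is the field delimiter declared in the table definition.
FIELD_SEP = "\x01"


def render_as_text(key: str, value: str) -> str:
    """Mimic a key<TAB>value text dump of one record.

    Assumption: this is how a text view of a SequenceFile record looks.
    """
    return f"{key}\t{value}"


def split_fields(value: str) -> List[str]:
    """Split a record value (the Hive row) into its fields."""
    return value.split(FIELD_SEP)


def fields_from_text_dump(line: str) -> List[str]:
    """Drop everything up to the first tab (the empty key), then split the row."""
    _key, _tab, value = line.partition("\t")
    return split_fields(value)


# One row with three fields, as Hive would store it in the record value.
row = FIELD_SEP.join(["f1", "f2", "f3"])

# The text dump of an empty key plus the row starts with a tab.
dumped = render_as_text("", row)

# Reading the value directly needs no special handling of the tab.
direct_fields = split_fields(row)

# Reading the text dump means skipping the empty key first.
dumped_fields = fields_from_text_dump(dumped)
```

In a real MapReduce job the equivalent of split_fields would run on the value handed to the mapper, and the key would simply be ignored.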
dean

On Mon, Jan 14, 2013 at 8:58 AM, Richard <codemon...@163.com> wrote:

> thanks.
> it seems that as long as I use sequencefile as the storage format, there
> will be \t before the first column. If this output is continuously used by
> hive, it is fine. The problem is that I may use a self-defined map-reduce
> job to read these files. Does that mean I have to take care of
> this \t by myself?
>
> is there any option that I can disable this \t in hive?
>
> At 2013-01-09 22:38:11,"Dean Wampler" <dean.wamp...@thinkbiganalytics.com>
> wrote:
>
> To add to what Nitin said, there is no key output by Hive in front of the
> tab.
>
> On Wed, Jan 9, 2013 at 3:07 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> you may want to look at the sequencefile format
>>
>> http://my.safaribooksonline.com/book/databases/hadoop/9780596521974/file-based-data-structures/id3555432
>>
>> that tab is to separate key from values in the record (I may be wrong but
>> this is how I interpreted it)
>>
>> On Wed, Jan 9, 2013 at 12:49 AM, Richard <codemon...@163.com> wrote:
>>
>>> more information:
>>>
>>> if I set the format as textfile, there is no tab space.
>>> if I set the format as sequencefile and view the content via hadoop fs
>>> -text, I saw a tab space in the head of each line.
>>>
>>> At 2013-01-09 15:44:00,Richard <codemon...@163.com> wrote:
>>>
>>> hi there
>>>
>>> I have a problem with creating a hive table.
>>>
>>> no matter what field delimiter I used, I always got a tab space in the head
>>> of each line (a line is a record).
>>>
>>> something like this:
>>>
>>> \t f1 \001 f2 \001 f3 ...
>>>
>>> where f1 , f2 , f3 denote the field values and \001 is the field separator.
>>>
>>> here is the clause I used
>>>
>>> create external table if not exists ${HIVETBL_my_table}
>>> (
>>> nid string,
>>> userid string,
>>> spv bigint,
>>> sipv bigint,
>>> pay bigint,
>>> spay bigint,
>>> ipv bigint,
>>> sellerid string,
>>> cate string
>>> )
>>> partitioned by(ds string)
>>> row format delimited fields terminated by '\001' lines terminated by '\n'
>>> stored as sequencefile
>>> location '${HADOOP_PATH_4_MY_HIVE}/${HIVETBL_my_table}';
>>>
>>> thanks for help.
>>>
>>> Richard
>>
>> --
>> Nitin Pawar

--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330