Thank you Greg, you are absolutely right; text and ORC are also available options.
However, if Sqoop generates the sequence file format, it should indicate that <K, V> is needed; otherwise it contradicts the definition of a sequence file and its format. If a single field is acceptable (apparently that is what Sqoop has been generating), then the definition of the sequence file ( https://wiki.apache.org/hadoop/SequenceFile ) should be revised so that it is not misleading. I have seen too many people stuck on the same question as me. Don't you think so?

*------------------------------------------------*
*Sincerely yours,*
*Raymond*

On Thu, Mar 22, 2018 at 10:35 AM, Greg Lindholm <[email protected]> wrote:

> Why are you using Sequence files?
> Sequence files are binary key/value stores, I haven't used them but it
> sounds from the docs that each 'value' is a record, so one type sounds
> correct.
> You might consider trying Textfile or ORC? You might get better results.
>
> /Greg
>
> On Thu, Mar 22, 2018 at 10:06 AM, Raymond Xie <[email protected]>
> wrote:
>
>> I have a sequence file generated using sqoop, why is only one type seen
>> in the file?
>>
>> sqoop import -m 1 \
>> --connect=jdbc:mysql://ms.itversity.com/retail_db \
>> --username=retail_user \
>> --password=itversity \
>> --table=orders \
>> --as-sequencefile \
>> --target-dir=order20180320_seq
>>
>> The head part of the sequence is as below:
>>
>> [paslechoix@gw03 ~]$ hdfs dfs -cat order20180320_seq/part-m-00000 |head
>> SEQ!org.apache.hadoop.io.LongWritableorders7▒▒P▒
>> U3▒3▒$@▒▒-OCLOSED@▒▒PENDING_PAYMENT@▒▒/COMPLETE@▒▒"{CLOSED@▒
>> ▒,COMPLETE@▒COMPLETE@▒▒COMPLET@▒▒
>>
>> As you can see, there is only one type in the sequence file's head:
>> LongWritable.
>>
>> According to this Hadoop Wiki about sequence file format:
>> https://wiki.apache.org/hadoop/SequenceFile
>>
>> sequence file's header should contain:
>>
>> version - A byte array: 3 bytes of magic header 'SEQ', followed by 1 byte of
>> actual version no. (e.g. SEQ4 or SEQ6)
>> keyClassName - String
>> valueClassName - String
>>
>> However, all the sequence files I generated with sqoop
>> (1.4.6.2.5.0.0-1245) contain only one Class.
>>
>> Is there anything missing in the sqoop command? How can I generate a
>> sequence file with the right info in its head?
>>
>> Thank you very much.
>>
>>
>> *------------------------------------------------*
>> *Sincerely yours,*
>>
>>
>> *Raymond*
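For anyone who wants to check the header fields listed above byte by byte rather than reading them off `cat` output, here is a minimal sketch in Python. It assumes (my understanding of the on-disk format, not something stated on the wiki page) that each class name is stored as a length prefix followed by UTF-8 bytes, and that the prefix is a single byte for names shorter than 128 bytes. The function name `read_seq_header` and the sample bytes are made up for illustration; the sample only imitates the output quoted above and is not taken from a real file.

```python
def read_seq_header(data: bytes):
    """Return (version, key_class_name, value_class_name) from header bytes.

    Sketch only: handles the single-byte length prefix case and ignores
    everything after the two class names (compression flags, metadata, sync).
    """
    if len(data) < 4 or data[:3] != b"SEQ":
        raise ValueError("not a SequenceFile: missing 'SEQ' magic")
    version = data[3]  # one byte of version number after the magic

    def read_string(pos: int):
        # Assumption: length prefix is one byte for names under 128 bytes.
        length = data[pos]
        if length > 127:
            raise ValueError("multi-byte length prefix not handled in this sketch")
        start = pos + 1
        end = start + length
        if end > len(data):
            raise ValueError("header bytes are truncated")
        return data[start:end].decode("utf-8"), end

    key_class, pos = read_string(4)
    value_class, pos = read_string(pos)
    return version, key_class, value_class


# Hypothetical header bytes, built by hand to resemble the quoted output.
key_name = b"org.apache.hadoop.io.LongWritable"
value_name = b"orders"
sample = (b"SEQ" + bytes([6])
          + bytes([len(key_name)]) + key_name
          + bytes([len(value_name)]) + value_name)

print(read_seq_header(sample))
```

To try it on a real file, the first bytes can be saved with `hdfs dfs -cat order20180320_seq/part-m-00000 | head -c 200 > header.bin` and then passed in as `open("header.bin", "rb").read()`. Note that the length prefixes are non-printable bytes (or, for a 33-byte name, the character `!`), which is why the two names can run together when the file is dumped to a terminal.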
