I generated a sequence file with Sqoop. Why does only one type appear in
the file?
sqoop import -m 1 \
--connect=jdbc:mysql://ms.itversity.com/retail_db \
--username=retail_user \
--password=itversity \
--table=orders \
--as-sequencefile \
--target-dir=order20180320_seq
The beginning of the sequence file is shown below:
[paslechoix@gw03 ~]$ hdfs dfs -cat order20180320_seq/part-m-00000 |head
SEQ!org.apache.hadoop.io.LongWritableorders7▒▒P▒
U3▒3▒$@▒▒-OCLOSED@▒▒PENDING_PAYMENT@▒▒/COMPLETE@▒▒"{CLOSED@▒▒
,COMPLETE@▒COMPLETE@▒▒COMPLET@▒▒
As you can see, only one type appears in the sequence file's header:
LongWritable.
According to the Hadoop wiki page on the SequenceFile format:
https://wiki.apache.org/hadoop/SequenceFile
a sequence file's header should contain:
version - A byte array: 3 bytes of magic header 'SEQ', followed by 1
byte of actual version no. (e.g. SEQ4 or SEQ6)
keyClassName - String
valueClassName - String
However, all the sequence files I generated with Sqoop
(1.4.6.2.5.0.0-1245) contain only one class.
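For reference, here is a minimal Python sketch of reading the header fields
listed above. It assumes each class name is stored as a length prefix
followed by UTF-8 bytes, and it only handles single-byte length prefixes
(names shorter than 128 bytes). The helper names and the synthetic sample
header are hypothetical; the value class name "orders" is an assumption
taken from the text that follows the key class name in the output above.

```python
import io


def read_short_string(stream):
    """Read a length-prefixed UTF-8 string.

    Simplification (assumption): only single-byte length prefixes
    (0..127) are handled in this sketch.
    """
    first = stream.read(1)
    if len(first) != 1:
        raise ValueError("unexpected end of data")
    length = first[0]
    if length > 127:
        raise ValueError("multi-byte length prefix not handled in this sketch")
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of data")
    return data.decode("utf-8")


def read_header(stream):
    """Return (version, key class name, value class name)."""
    magic = stream.read(3)
    if magic != b"SEQ":
        raise ValueError("not a SequenceFile: missing 'SEQ' magic")
    version_byte = stream.read(1)
    if len(version_byte) != 1:
        raise ValueError("unexpected end of data")
    key_class = read_short_string(stream)
    value_class = read_short_string(stream)
    return version_byte[0], key_class, value_class


def make_header(version, key_class, value_class):
    """Build a synthetic header for the demo (hypothetical helper)."""
    key = key_class.encode("utf-8")
    value = value_class.encode("utf-8")
    return (b"SEQ" + bytes([version])
            + bytes([len(key)]) + key
            + bytes([len(value)]) + value)


# Synthetic sample; not the bytes of the real part-m-00000 file.
sample = make_header(6, "org.apache.hadoop.io.LongWritable", "orders")
print(read_header(io.BytesIO(sample)))
# -> (6, 'org.apache.hadoop.io.LongWritable', 'orders')
```

In this sample the length prefix of the key class name is 33, which is the
ASCII code for "!", while the version byte and the length prefix of "orders"
are both 6, a non-printable control character that a terminal may not
display.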
Is anything missing from the sqoop command? How can I generate a sequence
file with the right information in its header?
Thank you very much.
*------------------------------------------------*
*Sincerely yours,*
*Raymond*