Hi Amir,

Sqoop generates a special class when importing a table (even one with only a single column) and uses this class as a key for the SequenceFile. I'm not familiar with Mahout, so I'm not sure whether it can consume this format.

Jarcec

On Mon, Dec 16, 2013 at 01:42:55PM +0000, Amir Mohammad Saied wrote:
> Hi,
>
> I'm using Sqoop to import (only one column of) a table from MySQL to HDFS.
> I'd like records to be stored as SequenceFiles so I can run Mahout's
> "seq2sparse" to generate Vectors from them later.
>
> I've two questions regarding the import process:
>
> 1) Dumping SequenceFiles generated by sqoop-import, I realized the row
> "Key" is automatically generated by Sqoop, and is not the "id" column of
> the MySQL table row. Can I ask sqoop-import to use the row's "id" field as
> Key?
>
> 2) If its possible to set row "Key" (above question), can I cast it to a
> specific class using sqoop-import?
>
> Thanks,
>
> amir
