Hi,

I did the same search a few weeks back and found that there is nothing in
the current API to do that from the command line.

However, I did write a Java program that transforms a CSV file into a
SequenceFile, which can be used to train a Naive Bayes classifier (amongst
other things).

Here are the sources :
https://gist.github.com/kmoulart/9616125

You'll find everything you need to build a runnable jar with dependencies
and a proper command line (using JCommander).
Both the sequential version and the MapReduce one are in the given files.
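
For illustration only, here is a minimal sketch of the parsing half of such a
converter. It is not the code from the gist: the class and method names
(CsvVectorParser, parseLine, parseAll) are hypothetical, and it assumes every
line holds only comma-separated numbers, as in the data.txt quoted below. To
stay dependency-free it stops before the write step; a real converter would
then append each vector to a Hadoop SequenceFile, typically wrapped in
Mahout's VectorWritable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not taken from the gist: turns CSV rows into double[] vectors.
class CsvVectorParser {

    // Parse one comma-separated line such as "8,9" into a dense vector.
    static double[] parseLine(String line) {
        String[] fields = line.trim().split(",");
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        return values;
    }

    // Parse every non-blank line, keeping the input order.
    static List<double[]> parseAll(List<String> lines) {
        List<double[]> vectors = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                vectors.add(parseLine(line));
            }
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,1", "2,1", "8,9");
        List<double[]> vectors = parseAll(lines);
        for (int i = 0; i < vectors.size(); i++) {
            // Assumption: a real converter would write (key, vector) to a
            // SequenceFile at this point instead of printing it.
            System.out.println(i + " -> " + Arrays.toString(vectors.get(i)));
        }
    }
}
```

Using the row index as the key is just a placeholder; for Naive Bayes
training the key usually carries the label instead.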

If you're lazy, I'll put the whole Maven project on my GitHub later today.

Hope it helps you

Kévin Moulart


2014-03-18 9:41 GMT+01:00 Margusja <mar...@roo.ee>:

> Hi
>
> I am looking for a simple way to convert vectors to a sequence file
> from the command line.
> For example, I have a data.txt file that contains vectors:
> 1,1
> 2,1
> 1,2
> 2,2
> 3,3
> 8,8
> 8,9
> 9,8
> 9,9
>
> So is there a way to convert that into a sequence file from the command line?
>
> I tried mahout seqdirectory, but afterwards hdfs dfs -text
> output2/part-m-00000 gives me something like:
> /data.txt    1,1
> 2,1
> 1,2
> 2,2
> 3,3
> 8,8
> 8,9
> 9,8
> 9,9
>
> and that is not the sequence file format, as I understand it.
>
> I know there is a Java API, but I am looking for a command-line option.
>
>
> --
> Best regards, Margus (Margusja) Roo
> +372 51 48 780
> http://margus.roo.ee
> http://ee.linkedin.com/in/margusroo
> skype: margusja
> ldapsearch -x -h ldap.sk.ee -b c=EE "(serialNumber=37303140314)"
> -----BEGIN PUBLIC KEY-----
> MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCvbeg7LwEC2SCpAEewwpC3ajxE
> 5ZsRMCB77L8bae9G7TslgLkoIzo9yOjPdx2NN6DllKbV65UjTay43uUDyql9g3tl
> RhiJIcoAExkSTykWqAIPR88LfilLy1JlQ+0RD8OXiWOVVQfhOHpQ0R/jcAkM2lZa
> BjM8j36yJvoBVsfOHQIDAQAB
> -----END PUBLIC KEY-----
>
>
