For example, kmeans input:
1,2,3,4  (text file)
kmeans output:
point1, point2, point3  (text file of center points)


I just thought of one reason. The input data has to be stored in
vector (dense or sparse) format, so a conversion step
needs to be done before the algorithms can process the data. Is that right?
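
To make the conversion step concrete, here is a minimal sketch in plain
Python, not Mahout's own tooling. The helper names parse_dense and
to_sparse are hypothetical, and the comma-separated layout is assumed from
the example input above. It only illustrates turning text lines into dense
and sparse vectors, not writing sequencefiles.

```python
def parse_dense(line, sep=","):
    """Parse one delimited text line into a dense vector (list of floats)."""
    return [float(tok) for tok in line.split(sep) if tok.strip()]


def to_sparse(dense):
    """Keep only the non-zero entries as an {index: value} mapping."""
    return {i: v for i, v in enumerate(dense) if v != 0.0}


# Hypothetical input lines, in the same style as the text file above.
lines = ["1,2,3,4", "0,0,5,0"]

dense_vectors = [parse_dense(line) for line in lines]
sparse_vectors = [to_sparse(vec) for vec in dense_vectors]
```

The sparse form only pays off when most entries are zero, which is why
both representations are shown.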

2014-11-04 23:56 GMT+08:00 Ted Dunning <ted.dunn...@gmail.com>:

> What should the input be?
>
>
>
> On Tue, Nov 4, 2014 at 12:28 AM, Lee S <sle...@gmail.com> wrote:
>
> > Hi all:
> >   I'm wondering why the input and output of most algorithms, like
> > kmeans and naivebayes, are all sequencefiles. One more conversion step
> > needs to be done before the algorithm will work, and I think that step
> > is time consuming because it is also a mapreduce job.
> >   Is the reason to deal with small files and to compress the data to
> > save disk space?
> >
>
