Yes, the indices within each row need to be sorted: "where the indices are one-based and **in ascending order**". -Xiangrui
On Tue, Oct 21, 2014 at 1:10 PM, Sameer Tilak <ssti...@live.com> wrote:
> Hi All,
>
> I have a question regarding the ordering of indices. The document says that
> the indices are one-based and in ascending order. However, do the
> indices within a row need to be sorted in ascending order?
>
> Sparse data
>
> It is very common in practice to have sparse training data. MLlib supports
> reading training examples stored in LIBSVM format, which is the default
> format used by LIBSVM and LIBLINEAR. It is a text format in which each line
> represents a labeled sparse feature vector using the following format:
>
> label index1:value1 index2:value2 ...
>
> where the indices are one-based and in ascending order. After loading, the
> feature indices are converted to zero-based.
>
> For example, I have indices ranging from 1 to 1000. Is this OK as a libsvm
> data file?
>
> 1 110:1.0 80:0.5 310:0.0
> 0 890:0.5 20:0.0 200:0.5 400:1.0 82:0.0
>
> and so on.
>
> Or do I need to sort them as:
>
> 1 80:0.5 110:1.0 310:0.0
> 0 20:0.0 82:0.0 200:0.5 400:1.0 890:0.5
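
For a file whose rows are not yet sorted, the ascending order discussed in this thread can be produced with a small preprocessing step before loading. The following is only a minimal sketch in plain Python. It assumes every line has the form `label index:value index:value ...` with integer indices and no trailing comments. The helper name `sort_libsvm_line` is hypothetical and is not part of MLlib or LIBSVM.

```python
def sort_libsvm_line(line: str) -> str:
    """Return a LIBSVM-format line with its index:value pairs sorted by index.

    Hypothetical helper for illustration; assumes "label idx:val idx:val ...".
    """
    parts = line.split()
    if not parts:
        # Leave blank lines untouched.
        return line
    label, pairs = parts[0], parts[1:]
    parsed = []
    for pair in pairs:
        index, value = pair.split(":", 1)
        # Compare indices numerically, not as strings ("80" < "110").
        parsed.append((int(index), value))
    parsed.sort(key=lambda item: item[0])
    return " ".join([label] + [f"{index}:{value}" for index, value in parsed])


# The two unsorted rows from the question above.
unsorted_lines = [
    "1 110:1.0 80:0.5 310:0.0",
    "0 890:0.5 20:0.0 200:0.5 400:1.0 82:0.0",
]
sorted_lines = [sort_libsvm_line(line) for line in unsorted_lines]

for line in sorted_lines:
    print(line)
```

Applied to the two unsorted rows from the question, this yields the sorted rows given as the second alternative. The values are kept as strings so that their original formatting is preserved.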