Re: mllib vector templates

2014-05-12 Thread Debasish Das
Hi, I see ALS is still using Array[Int], but for the other mllib algorithms we moved to Vector[Double] so that they can support both dense and sparse formats... I know ALS can stay in Array[Int] due to the Netflix format for input datasets, which is well defined, but it helps if we move ALS to

Re: mllib vector templates

2014-05-11 Thread Debasish Das
Hi, I see ALS is still using Array[Int], but for the other mllib algorithms we moved to Vector[Double] so that they can support both dense and sparse formats... ALS can stay in Array[Int] due to the Netflix format for input datasets, which is well defined, but it helps if we move ALS to Vector[Double]

mllib vector templates

2014-05-05 Thread Debasish Das
Hi, why is the mllib Vector using Double as the default?

/**
 * Represents a numeric vector, whose index type is Int and value type is Double.
 */
trait Vector extends Serializable {
  /**
   * Size of the vector.
   */
  def size: Int

  /**
   * Converts the instance to a double array.

Re: mllib vector templates

2014-05-05 Thread DB Tsai
+1 It would be nice if we could use different types in Vector. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On Mon, May 5, 2014 at 2:41 PM, Debasish Das debasish.da...@gmail.com wrote: Hi,

Re: mllib vector templates

2014-05-05 Thread Debasish Das
Is this a Breeze issue, or can Breeze take templates on Float / Double? If Breeze can take templates, then it is a minor fix for Vectors.scala, right? Thanks. Deb On Mon, May 5, 2014 at 2:45 PM, DB Tsai dbt...@stanford.edu wrote: +1 Would be nice that we can use different type in Vector.

Re: mllib vector templates

2014-05-05 Thread DB Tsai
Breeze could take any type (Int, Long, Double, and Float) in the matrix template. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On Mon, May 5, 2014 at 2:56 PM, Debasish Das

Re: mllib vector templates

2014-05-05 Thread Debasish Das
Is anyone facing issues due to this? If not, then I guess doubles are fine... For me it's not a big deal as there is enough memory available... On Mon, May 5, 2014 at 3:06 PM, David Hall d...@cs.berkeley.edu wrote: Lbfgs and other optimizers would not work immediately, as they require

Re: mllib vector templates

2014-05-05 Thread Xiangrui Meng
I fixed index type and value type to make things simple, especially when we need to provide Java and Python APIs. For raw features and feature transformations, we should allow generic types. -Xiangrui On Mon, May 5, 2014 at 3:40 PM, DB Tsai dbt...@stanford.edu wrote: David, Could we use Int,
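
A minimal sketch of the split described in this message, assuming a hypothetical RawFeatures wrapper that is not part of MLlib: raw features keep a generic value type, and the values are widened to Double only at the boundary where a fixed-type vector is expected. It uses only the Scala standard library.

```scala
// Hypothetical illustration only; RawFeatures is not an MLlib class.
object FeatureSketch {
  // Raw features keep whatever numeric type the data came in (Int, Long, Float, ...).
  final case class RawFeatures[T](values: Array[T])(implicit num: Numeric[T]) {
    // Widen every value to Double at the API boundary, where the value type is fixed.
    def toDoubleArray: Array[Double] = values.map(v => num.toDouble(v))
  }
}
```

With this shape, FeatureSketch.RawFeatures(Array(1, 2, 3)) and FeatureSketch.RawFeatures(Array(0.5f, 1.5f)) both yield an Array[Double] through the same call, so only the raw-feature side needs to be generic.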

Re: mllib vector templates

2014-05-05 Thread David Hall
On Mon, May 5, 2014 at 3:40 PM, DB Tsai dbt...@stanford.edu wrote:
David, Could we use Int, Long, Float as the data feature spaces, and Double for optimizer?

Yes. Breeze doesn't allow operations on mixed types, so you'd need to convert the double vectors to Floats if you wanted, e.g. dot
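
The explicit conversion step mentioned in this reply can be sketched in plain Scala. This is not Breeze code: MixedTypeDot, toDouble, and dot are hypothetical names, and arrays stand in for vectors. The sketch widens the Float features to Double (the opposite direction, narrowing the weights to Float, works the same way) so that both operands share one type before the dot product.

```scala
// Hypothetical illustration using plain arrays; not the Breeze API.
object MixedTypeDot {
  // Features are stored compactly as Float; widen them before mixing with Double weights.
  def toDouble(features: Array[Float]): Array[Double] = features.map(_.toDouble)

  // Dot product over two Double arrays of equal length.
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      sum += a(i) * b(i)
      i += 1
    }
    sum
  }
}
```

A call would look like MixedTypeDot.dot(MixedTypeDot.toDouble(features), weights). The conversion allocates a new array on every call, which is the cost of keeping the stored features in a narrower type.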