Re: how to construct parameter for model.transform() from datafile

2017-03-14 Thread Yuhao Yang
Hi Jinhong, Based on the error message, your second collection of vectors has a dimension of 804202, while the dimension of your training vectors was 144109. Please make sure your test dataset is of the same dimension as the training data. From the test dataset you posted, the vector
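The usual cause of this mismatch is featurizing the test set with a separately fitted transformer. A minimal sketch, assuming the vectors come from a fitted featurizer such as CountVectorizer; the actual featurizer in the thread is not shown, and the toy DataFrames below are hypothetical:

```scala
import org.apache.spark.ml.feature.CountVectorizer
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("dim-check").getOrCreate()
import spark.implicits._

// Hypothetical stand-ins for the real train/test sets.
val trainDF = Seq(Seq("a", "b", "c"), Seq("b", "c")).toDF("tokens")
val testDF  = Seq(Seq("a", "c", "d")).toDF("tokens")

// Fit the featurizer ONCE on the training data; this fixes the vocabulary,
// and therefore the vector dimension, for every later transform().
val cvModel = new CountVectorizer()
  .setInputCol("tokens")
  .setOutputCol("features")
  .fit(trainDF)

val trainVecs = cvModel.transform(trainDF)
val testVecs  = cvModel.transform(testDF) // same dimension as the training vectors

// Re-fitting a second CountVectorizer on testDF would instead produce vectors
// in a different-sized space and trigger exactly this dimension mismatch.
```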

Re: Implementation of RNN/LSTM in Spark

2017-02-27 Thread Yuhao Yang
You are welcome to try out and contribute to our BigDL: https://github.com/intel-analytics/BigDL It runs natively on Spark and is fast thanks to Intel MKL. 2017-02-23 4:51 GMT-08:00 Joeri Hermans : > Hi Nikita, > > We are actively working on this:
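For a flavor of what that looks like, a minimal sketch of an LSTM defined with BigDL's Scala layer API (Sequential, Recurrent, LSTM, TimeDistributed, Linear); the layer sizes are hypothetical and the exact imports may vary between BigDL versions:

```scala
import com.intel.analytics.bigdl.nn.{Sequential, Recurrent, LSTM, TimeDistributed, Linear}

// Hypothetical sizes: 10 input features per time step, 20 hidden units, 5 outputs.
val inputSize  = 10
val hiddenSize = 20
val outputSize = 5

// Recurrent wraps a cell and unrolls it over the time dimension;
// TimeDistributed applies the same Linear layer at every time step.
val model = Sequential[Float]()
  .add(Recurrent[Float]().add(LSTM[Float](inputSize, hiddenSize)))
  .add(TimeDistributed[Float](Linear[Float](hiddenSize, outputSize)))
```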

Re: Question about using collaborative filtering in MLlib

2016-11-02 Thread Yuhao Yang
Hi Zak, Indeed, the function is missing from the DataFrame-based API. I can probably provide a quick prototype to see if we can merge the function into the next release. I will send an update here; feel free to give it a try. Regards, Yuhao 2016-11-01 10:00 GMT-07:00 Zak H
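In the meantime, a minimal workaround sketch using the RDD-based recommendation API, which already exposes recommendProductsForUsers; the toy ratings and the ALS hyperparameters are hypothetical:

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Hypothetical toy ratings: (userId, productId, rating); sc is the SparkContext.
val ratings = sc.parallelize(Seq(
  Rating(1, 10, 4.0), Rating(1, 20, 1.0),
  Rating(2, 10, 5.0), Rating(2, 30, 3.0)
))

// Illustrative hyperparameters: rank 8, 10 iterations, lambda 0.01.
val model = ALS.train(ratings, 8, 10, 0.01)

// Top-10 product recommendations for every user, the call that the
// DataFrame-based API did not yet expose at the time of this thread.
val recs = model.recommendProductsForUsers(10)
```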

Re: MinMaxScaler With features include category variables

2016-07-06 Thread Yuhao Yang
You may also find VectorSlicer and SQLTransformer useful in your case. Just out of curiosity, how do you typically handle categorical features, other than with OneHotEncoder? Regards, Yuhao 2016-07-01 4:00 GMT-07:00 Yanbo Liang : > You can combine the columns which are need
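A minimal sketch of the VectorSlicer route: slice the continuous entries out of the assembled vector, scale only that slice, then reassemble. The column names and the index split are hypothetical:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.{MinMaxScaler, VectorAssembler, VectorSlicer}

// Assume "features" is laid out as [cont0, cont1, cat0, cat1] (hypothetical).
val contSlicer = new VectorSlicer()
  .setInputCol("features").setOutputCol("cont")
  .setIndices(Array(0, 1))                 // continuous positions, to be scaled
val catSlicer = new VectorSlicer()
  .setInputCol("features").setOutputCol("cat")
  .setIndices(Array(2, 3))                 // categorical positions, left as-is

val scaler = new MinMaxScaler()
  .setInputCol("cont").setOutputCol("contScaled")

val assembler = new VectorAssembler()
  .setInputCols(Array("contScaled", "cat")).setOutputCol("featuresScaled")

val pipeline = new Pipeline()
  .setStages(Array(contSlicer, catSlicer, scaler, assembler))
// pipeline.fit(df).transform(df) yields "featuresScaled" with only the
// continuous entries rescaled to [0, 1].
```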

Re: Welcoming Yanbo Liang as a committer

2016-06-05 Thread Yuhao Yang
Congratulations Yanbo 2016-06-04 23:43 GMT-07:00 Hyukjin Kwon : > Congratulations! > > 2016-06-04 11:48 GMT+09:00 Matei Zaharia : > >> Hi all, >> >> The PMC recently voted to add Yanbo Liang as a committer. Yanbo has been >> a super active

Need suggestions on monitor Spark progress

2015-11-29 Thread Yuhao Yang
Hi all, I have a simple processing job for 20,000 accounts on 8 partitions, so roughly 2500 accounts per partition. Each account takes about 1s to compute, which means each partition will take about 2500 seconds to finish the batch. My question is how can I get the
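One possible approach, sketched below: a named accumulator bumped once per account gives a progress counter finer than task-level granularity. accountsRDD and process() are hypothetical stand-ins for the actual job, and sc.longAccumulator is the Spark 2.x API (on 1.x, sc.accumulator(0L) plays the same role):

```scala
// Named accumulator as a per-account progress counter.
val done = sc.longAccumulator("accountsDone")

val results = accountsRDD.map { account =>
  val r = process(account)   // the ~1 s per-account computation
  done.add(1)
  r
}
results.count()              // trigger the job

// While the job runs, the running value of a named accumulator is shown on
// the stage page of the Spark web UI; done.value on the driver reflects
// updates from completed tasks.
```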