Jiusheng Chen chenjiush...@gmail.com wrote:
Hi Xiangrui,
A side question about MLlib.
It looks like the current LBFGS in MLlib (version 1.0.2, and even v1.1) only
supports L2 regularization out of the box; the doc explains that L1
regularization is obtained by using L1Updater
http://spark.apache.org/docs/latest/api
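For anyone skimming the thread: as I understand the docs, an L1 updater imposes L1 regularization by soft-thresholding the weights after each gradient step. A minimal stand-alone sketch of that proximal step (plain Python, not the MLlib API; names are illustrative):

```python
# Hedged sketch: the soft-thresholding (proximal) step that an L1 updater
# conceptually applies after each gradient step. Not MLlib code; the
# function name and `shrinkage` parameter are illustrative.

def soft_threshold(weights, shrinkage):
    """Shrink each weight toward zero by `shrinkage`, zeroing small weights.

    Zeroing weights whose magnitude is below the shrinkage is what makes
    L1 regularization produce sparse models.
    """
    out = []
    for w in weights:
        if w > shrinkage:
            out.append(w - shrinkage)
        elif w < -shrinkage:
            out.append(w + shrinkage)
        else:
            out.append(0.0)
    return out

# Small weights (-0.05, 0.03) are zeroed; large ones move toward zero.
print(soft_threshold([0.8, -0.05, -1.2, 0.03], 0.1))
```

The shrinkage would typically be the regularization parameter times the step size, so stronger regularization zeroes more coefficients.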
How about increasing the HDFS block size? The current value is 128M; we
could make it 512M or bigger.
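For what it's worth, a sketch of where that change would go (assuming the Hadoop 2.x property name; verify against your distribution's docs):

```xml
<!-- hdfs-site.xml: default block size for newly written files (512 MB).
     Existing files keep the block size they were written with. -->
<property>
  <name>dfs.blocksize</name>
  <value>536870912</value>
</property>
```

A per-write alternative is passing `-D dfs.blocksize=536870912` to `hdfs dfs -put` when writing the training file, which leaves the cluster default untouched.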
On Tue, Aug 12, 2014 at 11:46 AM, ZHENG, Xu-dong dong...@gmail.com wrote:
Hi all,
We are trying to use Spark MLlib to train on super large data (100M features
and 5B rows). The input data in HDFS
It seems MLlib doesn't support weighted training right now; all training
samples have equal importance. Weighted training can be very useful to reduce
data size and speed up training.
Do you have plans to support it in the future? The data format would be
something like:
label:weight index1:value1
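To make the motivation concrete, here is a minimal sketch (plain Python, not an MLlib API; function names are illustrative) of how a per-sample weight would enter a logistic-regression gradient. A sample with weight k contributes exactly as if it were duplicated k times, which is why weights let you collapse repeated rows and shrink the data:

```python
import math

# Hedged sketch: per-sample weights in a logistic-regression gradient.
# Each sample's gradient contribution is scaled by its weight, so one
# row with weight 3.0 behaves like the same row repeated three times.

def logistic_grad(w, x, y):
    """Gradient of logistic loss for a single sample (label y in {0, 1})."""
    margin = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-margin))
    return [(p - y) * xi for xi in x]

def weighted_grad(w, samples):
    """Sum of per-sample gradients, each scaled by its weight.

    `samples` is a list of (weight, features, label) triples, mirroring a
    hypothetical label:weight index:value input format.
    """
    g = [0.0] * len(w)
    for weight, x, y in samples:
        for i, gi in enumerate(logistic_grad(w, x, y)):
            g[i] += weight * gi
    return g

w = [0.2, -0.1]
x, y = [1.0, 3.0], 1
# One sample with weight 3.0 gives the same gradient as three copies of it.
a = weighted_grad(w, [(3.0, x, y)])
b = weighted_grad(w, [(1.0, x, y)] * 3)
```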