It is possible that the answers (the final solution vectors x) given by two
different algorithms (such as the ones in MLlib and in R) are different, as
the problem may not be strictly convex and multiple global optima may
exist. However, these answers should yield the same objective value. Can
you
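A toy least-squares sketch (illustrative only, not from the thread) shows how two different minimizers can share one objective value when the problem is not strictly convex, e.g. with collinear features:

```python
# Two collinear features: any (a, b) with a + b == 1 fits y = x exactly,
# so this least-squares problem has infinitely many global optima.
X = [(1.0, 1.0), (2.0, 2.0)]
y = [1.0, 2.0]

def objective(w):
    # Sum of squared residuals for weight vector w.
    return sum((row[0] * w[0] + row[1] * w[1] - t) ** 2
               for row, t in zip(X, y))

w1 = (1.0, 0.0)  # one global optimum
w2 = (0.0, 1.0)  # a different global optimum
print(objective(w1), objective(w2))  # 0.0 0.0
```

Two solvers may legitimately return w1 and w2 respectively; comparing objective values, not weight vectors, is the right sanity check.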
Hi,
Are you suggesting that computing simple vector dot products or applying the
sigmoid function to 10K * 1M data takes 5 hours?
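For scale, a linear/logistic prediction pass is just one matrix-vector product plus an elementwise sigmoid. A small NumPy sketch (shapes are illustrative stand-ins, not the thread's actual 10K * 1M data):

```python
import numpy as np

# Illustrative shapes: rows = examples, cols = features.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 100))
w = rng.standard_normal(100)

scores = X @ w                          # one dot product per example
preds = 1.0 / (1.0 + np.exp(-scores))   # elementwise sigmoid
print(preds.shape)  # (1000,)
```

Even at much larger scale this is a single dense (or sparse) matvec, which is why multi-hour prediction times usually point at something other than the arithmetic itself.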
On Thu, Jul 17, 2014 at 3:59 PM, m3.sharma sharm...@umn.edu wrote:
We are using the RegressionModel implementations that come with the *mllib* package in Spark.
Hi Koert,
Just curious, did you find any messages like CANNOT FIND ADDRESS
after clicking into some stage? I've seen similar problems caused by lost
executors.
Best,
On Fri, Jul 11, 2014 at 4:42 PM, Koert Kuipers ko...@tresata.com wrote:
I just tested a long lived application (that we
Hi, just wondering: does anybody know how to set the number of workers (and
the amount of memory) in Mesos while launching spark-shell? I was trying to
edit conf/spark-env.sh, but it looks like those environment variables are
for YARN or standalone. Thanks!
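A hedged sketch, assuming a Spark 1.x setup on Mesos (the master host/port and values are placeholders): on Mesos, per-application resource limits are controlled by Spark properties such as spark.executor.memory and spark.cores.max rather than the standalone/YARN env vars, e.g. in conf/spark-defaults.conf:

```
# conf/spark-defaults.conf (sketch; values are illustrative)
spark.master           mesos://host:5050
spark.executor.memory  4g
spark.cores.max        16
```

With these in place, spark-shell picks the limits up at launch without touching spark-env.sh.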
If I'm understanding correctly, you want to use MLlib for offline training
and then deploy the learned model to Storm? In that case I don't think
there is any problem. However, if you are looking for online model
updates/training, that can be complicated, and I guess quite a few algorithms
in mllib
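For the offline-train/online-score split, the serving side only needs the learned parameters. A minimal sketch in plain Python (the weights and intercept are hypothetical values standing in for parameters exported from an offline-trained linear model; nothing here is Storm-specific):

```python
import math

# Hypothetical parameters exported from an offline-trained logistic model.
weights = [0.5, -1.2, 0.3]
intercept = 0.1

def score(features):
    """Logistic-regression prediction from exported parameters only."""
    margin = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-margin))

p = score([1.0, 0.0, 2.0])
print(0.0 < p < 1.0)  # True
```

Since scoring needs no Spark runtime, a Storm bolt (or any other serving layer) can hold the parameters and call a function like this directly; only online *updating* of the parameters is the hard part.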
commonly, the result of a stage may be used in a later
calculation and has to be recalculated. This happens when some of the
results were evicted from the cache.
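A toy illustration of that recomputation effect (plain Python, not Spark itself): once a cached result is evicted, the work that produced it runs again, which is also why a stage can report more succeeded tasks than its total task count.

```python
# Count how many times the underlying computation actually runs.
compute_count = 0
cache = {}

def compute(part):
    global compute_count
    compute_count += 1
    return part * 2

def get(part):
    # Serve from cache when present; otherwise recompute and cache.
    if part not in cache:
        cache[part] = compute(part)
    return cache[part]

get(0)
get(0)         # cached: still computed only once
cache.pop(0)   # eviction
get(0)         # forces a recomputation
print(compute_count)  # 2
```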
On Wed, Jun 11, 2014 at 2:23 AM, Shuo Xiang shuoxiang...@gmail.com
wrote:
Hi,
I've run into some confusion regarding
replication but still seeing this.
On Wednesday, June 11, 2014, Shuo Xiang shuoxiang...@gmail.com wrote:
Daniel,
Thanks for the explanation.
On Wed, Jun 11, 2014 at 8:57 AM, Daniel Darabos
daniel.dara...@lynxanalytics.com wrote:
About more succeeded tasks than total tasks
Xiangrui, clicking into the RDD link gives the same message: only
96 of 100 partitions are cached. The disk/memory usage is unchanged and
far below the limit.
Is this what you wanted to check, or is it another issue?
On Wed, Jun 11, 2014 at 4:38 PM, Xiangrui Meng men...@gmail.com wrote: