Re: Problem with using Spark ML

2015-04-23 Thread Staffan
I got the tip to try reducing the step size, and that finally gave more decent results. I had hoped the default params would give at least OK results, so I assumed the problem must be somewhere else in the code. Problem solved!
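For anyone hitting the same thing, here is a minimal sketch of why a step size that is too large can ruin a gradient-descent fit while a smaller one works. It is plain batch gradient descent on a toy one-parameter least-squares problem, not Spark's optimizer; the data, the function names, and the two step sizes are made up for illustration.

```python
# Toy illustration only: plain batch gradient descent on y = w * x.
# This is NOT Spark's SGD implementation; names and data are hypothetical.

def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, step_size, num_iterations=100):
    """Fit w by repeatedly stepping against the gradient of the MSE."""
    w = 0.0
    n = len(xs)
    for _ in range(num_iterations):
        # d/dw of mean((w*x - y)^2) is 2 * mean(x * (w*x - y))
        grad = 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / n
        w -= step_size * grad
    return w


# Unscaled features: the true weight is 2.0.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x for x in xs]

w_large = gradient_descent(xs, ys, step_size=1.0)   # overshoots on every step
w_small = gradient_descent(xs, ys, step_size=0.01)  # shrinks the error each step

print("step 1.0  -> w =", w_large, "mse =", mse(w_large, xs, ys))
print("step 0.01 -> w =", w_small, "mse =", mse(w_small, xs, ys))
```

On this toy data each update multiplies the error in w by (1 - 15 * step_size), so any step size above roughly 0.13 makes the error grow instead of shrink. Scaling the features changes that threshold, which is one reason a default step size can work on one data set and fail on another.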

Problem with using Spark ML

2015-04-21 Thread Staffan
Hi, I've written an application that performs some machine learning on some data. I've validated that the data _should_ give a good output with a decent RMSE by using LIBSVM: Mean squared error = 0.00922063 (regression), Squared correlation coefficient = 0.9987 (regression). When I try to use

Pipelines for controlling workflow

2015-04-07 Thread Staffan
on type-safety, but I'm confused about how to create a branching pipeline using only type-declarations. Thanks, Staffan -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Pipelines-for-controlling-workflow-tp22403.html Sent from the Apache Spark User List

How to efficiently control concurrent Spark jobs

2015-02-25 Thread Staffan
this in a better way or perhaps if there is a higher-level workflow tool that I can use on top of Spark? (The cool solution would have been to use nested RDDs and just map over them in a high-level way, but this is not supported afaik.) Thanks! Staffan

Re: Issues when combining Spark and a third party java library

2015-01-27 Thread Staffan
To clarify: I'm currently working on this locally, running on a laptop, and I do not use spark-submit (I'm using Eclipse to run my applications for now). I've tried running both on Mac OS X and in a VM running Ubuntu. Furthermore, I've got the VM from a fellow worker who has no issues running his

Re: Issues when combining Spark and a third party java library

2015-01-27 Thread Staffan
Okay, I finally tried changing the Hadoop-client version from 2.4.0 to 2.5.2, and that mysteriously fixed everything. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Issues-when-combining-Spark-and-a-third-party-java-library-tp21367p21387.html

Issues when combining Spark and a third party java library

2015-01-26 Thread Staffan
I'm using Maven and Eclipse to build my project. I'm letting Maven download all the things I need for running everything, which has worked fine up until now. I need to use the CDK library (https://github.com/egonw/cdk, http://sourceforge.net/projects/cdk/) and once I add the dependencies to my