Re: RDD Partitions not distributed evenly to executors

2016-11-21 Thread Thunder Stumpges
Has anyone figured this out yet!? I have gone looking for this exact problem (Spark 1.6.1) and I cannot get my partitions to be distributed evenly across executors no matter what I've tried. It has been mentioned several other times in the user group as well as the dev group (as mentioned by Mike

Re: Spark transformations

2016-09-12 Thread Thunder Stumpges
On Mon, Sep 12, 2016 at 9:43 AM, Thunder Stumpges <thunder.stump...@gmail.com> wrote: >> Hi Janardhan, >> >> I have run into similar issues and asked similar questions. I also ran into many problems with private code when trying to write my own

Re: Spark transformations

2016-09-12 Thread Thunder Stumpges
Hi Janardhan, I have run into similar issues and asked similar questions. I also ran into many problems with private code when trying to write my own Model/Transformer/Estimator. (You might be able to find my question to the group regarding this, I can't really tell if my emails are getting

Re: Getting figures from spark streaming

2016-09-12 Thread Thunder Stumpges
Just a guess, but doesn't the '.apply(0)' at the end of each of your print statements take just the first one of the returned list? On Wed, Sep 7, 2016 at 12:36 AM Ashok Kumar wrote: > Any help on this warmly appreciated. > On Tuesday, 6 September 2016, 21:31,

Re: Complex RDD operation as DataFrame UDF ?

2016-09-09 Thread Thunder Stumpges
Bump, check if this is actually going to the group? I can't see my recent posts on the archives: http://apache-spark-user-list.1001560.n3.nabble.com/ Is there a reason it would not show up here? Thanks! On Tue, Sep 6, 2016 at 11:28 AM Thunder Stumpges <thunder.stump...@gmail.com> wrote:

Complex RDD operation as DataFrame UDF ?

2016-09-06 Thread Thunder Stumpges
Hi guys, Spark 1.6.1 here. I am trying to "DataFrame-ize" a complex function I have that currently operates on a DataSet, and returns another DataSet with a new "column" added to it. I'm trying to fit this into the new ML "Model" format where I can receive a DataFrame, ensure the input column

Re: Coding in the Spark ml "ecosystem" why is everything private?!

2016-08-29 Thread Thunder Stumpges
from a library. > If there's a clear opportunity to expose something cleanly you can bring it up for discussion. But it's never just a matter of making something public. Making it public means committing others' time to supporting it as-is for years. It would have to be worth

Coding in the Spark ml "ecosystem" why is everything private?!

2016-08-29 Thread Thunder Stumpges
Hi all, I'm not sure if this belongs here in users or over in dev as I guess it's somewhere in between. We have been starting to implement some machine learning pipelines, and it seemed from the documentation that Spark had a fairly well thought-out platform (see:

MLLib : Math on Vector and Matrix

2014-07-02 Thread Thunder Stumpges
I am upgrading from Spark 0.9.0 to 1.0 and I had a pretty good amount of code working with internals of MLLib. One of the big changes was the move from the old jblas.Matrix to the Vector/Matrix classes included in MLLib. However I don't see how we're supposed to use them for ANYTHING other than a

Re: MLLib : Math on Vector and Matrix

2014-07-02 Thread Thunder Stumpges
. Thunder On Wed, Jul 2, 2014 at 2:05 PM, Koert Kuipers ko...@tresata.com wrote: i did the second option: re-implemented .toBreeze as .breeze using pimp classes On Wed, Jul 2, 2014 at 5:00 PM, Thunder Stumpges thunder.stump...@gmail.com wrote: I am upgrading from Spark 0.9.0 to 1.0