Fwd: Writing streaming data to cassandra creates duplicates

2015-08-04 Thread Priya Ch
Yes, union would be one solution. I am not doing any aggregation, hence reduceByKey would not be useful. If I use groupByKey, messages with the same key would end up in the same partition. But groupByKey is a very expensive operation, as it involves a shuffle. My ultimate goal is to write the

Re: How to help for 1.5 release?

2015-08-04 Thread Akhil Das
I think you can start from here https://issues.apache.org/jira/browse/SPARK/fixforversion/12332078/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-summary-panel Thanks Best Regards On Tue, Aug 4, 2015 at 12:02 PM, Meihua Wu rotationsymmetr...@gmail.com wrote: I think the team is

How to help for 1.5 release?

2015-08-04 Thread Meihua Wu
I think the team is preparing for the 1.5 release. Is there anything I can help with, such as QA, testing, etc.? Thanks, MW

Re: Have Friedman's glmnet algo running in Spark

2015-08-04 Thread Patrick
I have a follow-up on this: I see on JIRA that the idea of having a GLMNET implementation was more or less abandoned, since an OWLQN implementation was chosen to construct a model using L1/L2 regularization. However, GLMNET has the property of returning a multitude of models (corresponding to

Re: Have Friedman's glmnet algo running in Spark

2015-08-04 Thread mike
My friends and I are continuing work on the algorithm. You are right that there are two elements to Friedman's glmnet algorithm. One is the use of coordinate descent for minimizing penalized regression with an absolute value penalty and the other is managing the regularization parameters.
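
The two elements named above can be sketched in plain Python. This is only an illustration, not the code discussed in the thread: it assumes the objective (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (L1 penalty only, no intercept, no standardization), and the names `soft_threshold`, `lasso_cd`, and `lasso_path` are hypothetical. `lasso_cd` does cyclic coordinate descent with soft-thresholding; `lasso_path` handles the regularization parameters by solving over a decreasing grid of `lam` values with warm starts, returning one model per value.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, beta=None, n_sweeps=200, tol=1e-10):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of targets. `beta` is an optional warm start.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p if beta is None else list(beta)
    # Residual r = y - X beta, kept up to date after every coordinate update.
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    # Mean squared value of each column, the denominator of the update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot carry a coefficient
            # Correlation of column j with the partial residual
            # (the residual with the contribution of beta[j] added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break  # no coordinate moved noticeably in this sweep
    return beta


def lasso_path(X, y, n_lambdas=5, ratio=0.01):
    """Solve on a log-spaced, decreasing grid of lam values with warm starts.

    Requires n_lambdas >= 2. The grid starts at the smallest lam for which
    all coefficients are zero and ends at ratio times that value.
    """
    n, p = len(X), len(X[0])
    lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n)) / n) for j in range(p))
    path, beta = [], None
    for k in range(n_lambdas):
        lam = lam_max * ratio ** (k / (n_lambdas - 1))
        beta = lasso_cd(X, y, lam, beta)  # previous solution is the warm start
        path.append((lam, beta))
    return path
```

Calling `lasso_path(X, y)` returns a list of `(lam, beta)` pairs, from the all-zero model at the largest `lam` down to the least-regularized one. Reusing the previous solution as the starting point is what keeps the whole path cheap compared with solving each `lam` from scratch.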

Re: How to help for 1.5 release?

2015-08-04 Thread Patrick Wendell
Hey Meihua, If you are a user of Spark, one thing that is really helpful is to run Spark 1.5 on your workload and report any issues, performance regressions, etc. - Patrick On Mon, Aug 3, 2015 at 11:49 PM, Akhil Das ak...@sigmoidanalytics.com wrote: I think you can start from here

shane will be OOO 8-5-15 through 8-18-15

2015-08-04 Thread shane knapp
so i done gone and got myself hitched, and will be disappearing into the rainy island of kol chang in thailand for the next ~2 weeks. :) this means i will be completely out of contact, and have to leave jenkins in the gentle hands of jon kuroda (a sysadmin here at the lab) and matt massie (my