MLPC model can not be saved

2016-03-20 Thread HanPan
Hi Guys, I built an ML pipeline that includes a multilayer perceptron classifier, and I got the following error message when I tried to save the pipeline model. It seems the MLPC model cannot be saved, which means I have no way to save the trained model. Is there any way to save the

Re: graceful shutdown in external data sources

2016-03-20 Thread Hamel Kothari
Dan, You could probably just register a JVM shutdown hook yourself: https://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread) This at least would let you close the connections when the application as a whole has completed (in standalone) or when your
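For illustration, a minimal Java sketch of the shutdown-hook approach suggested above. ConnectionPool and registerCloseOnExit are hypothetical names invented for this example and are not part of Spark or of any data source API; the only real call is the JDK's Runtime.addShutdownHook from the linked documentation.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownHookSketch {

    /** Hypothetical stand-in for the connections a data source holds open. */
    static final class ConnectionPool implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        @Override
        public void close() {
            // compareAndSet makes close() idempotent if it is called twice.
            if (closed.compareAndSet(false, true)) {
                System.out.println("connections closed");
            }
        }

        boolean isClosed() {
            return closed.get();
        }
    }

    /** Registers a JVM shutdown hook that closes the pool; returns the hook thread. */
    static Thread registerCloseOnExit(ConnectionPool pool) {
        Thread hook = new Thread(pool::close, "pool-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool();
        registerCloseOnExit(pool);
        System.out.println("application work done");
        // When main returns and the JVM begins shutting down, the hook runs.
    }
}
```

Shutdown hooks run on normal JVM exit and on interruptions such as SIGTERM, but not if the process is killed forcibly (for example SIGKILL or Runtime.halt), so this is a best-effort cleanup rather than a guarantee.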

Re: Spark 1.6.1 Hadoop 2.6 package on S3 corrupt?

2016-03-20 Thread Nicholas Chammas
I'm seeing the same. :( On Fri, Mar 18, 2016 at 10:57 AM Ted Yu wrote: > I tried again this morning : > > $ wget > https://s3.amazonaws.com/spark-related-packages/spark-1.6.1-bin-hadoop2.6.tgz > --2016-03-18 07:55:30-- >

[POWERED BY] Please add our organization

2016-03-20 Thread Craig Lukasik
Name: Zaloni's Bedrock & Mica URL: http://www.zaloni.com/products/ Description: Zaloni's data lake management platform (Bedrock) and self-service data preparation solution (Mica) leverage Spark for fast execution of transformations and data exploration.

Re: Spark build with scala-2.10 fails ?

2016-03-20 Thread Josh Rosen
It looks like the Scala 2.10 Jenkins build is working: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-sbt-scala-2.10/ Can you share more details about how you're compiling with 2.10 (e.g. which commands you ran, git SHA, etc)? On Wed, Mar 16, 2016 at

Re: SPARK-13843 and future of streaming backends

2016-03-20 Thread Marcelo Vanzin
Hi Reynold, thanks for the info. On Thu, Mar 17, 2016 at 2:18 PM, Reynold Xin wrote: > If one really feels strongly that we should go through all the overhead to > setup an ASF subproject for these modules that won't work with the new > structured streaming, and want to

[discuss] making SparkEnv private in Spark 2.0

2016-03-20 Thread Reynold Xin
Any objections? Please articulate your use case. SparkEnv is a weird one because it was documented as "private" but not marked as such in class visibility. * NOTE: This is not intended for external use. This is exposed for Shark and may be made private * in a future release. I do see Hive

Re: graceful shutdown in external data sources

2016-03-20 Thread Dan Burkert
After further thought, I think following both of your suggestions (adding a shutdown hook and making the threads non-daemon) may have the result I'm looking for. I'll check and see if there are other reasons not to use daemon threads in our networking internals. More generally though, what do

Re: [discuss] making SparkEnv private in Spark 2.0

2016-03-20 Thread Reynold Xin
On Wed, Mar 16, 2016 at 3:29 PM, Mridul Muralidharan wrote: > b) Shuffle manager (to get shuffle reader) > What's the use case for shuffle manager/reader? This seems like using super internal APIs in applications.