Re: Google Summer of Code 2017 is coming

2017-02-05 Thread Nick Pentreath
I think Sean raises valid points - that the result is highly dependent on the particular student, project and mentor involved, and that the actual required time investment is very significant. Having said that, it's certainly not all bad. Scikit-learn started as a GSoC project 10 years ago!

subscribe

2017-02-05 Thread 李昀樵

FileNotFoundException, while file is actually available

2017-02-05 Thread Evgenii Morozov
Hi, I see a lot of exceptions like the following during our machine learning pipeline calculation. Spark version 2.0.2. Sometimes it’s just a few executors that fail with this message, but the job is successful. I’d appreciate any hint you might have. Thank you. 2017-02-05 07:56:47.022

Re: ml word2vec findSynonyms return type

2017-02-05 Thread Asher Krim
It took me a while, but I finally got around to this: https://github.com/apache/spark/pull/16811/files On Fri, Jan 6, 2017 at 4:03 AM, Asher Krim wrote: > Felix - I'm not sure I understand your example about pipeline models, > could you elaborate? I'm talking about the

Re: specifying schema on dataframe

2017-02-05 Thread Michael Armbrust
-dev You can use withColumn to change the type after the data has been loaded. On Sat, Feb 4, 2017 at 6:22 AM, Sam Elamin
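
The withColumn suggestion above could look roughly like the following in PySpark. This is only a sketch: the helper name cast_columns, the column names and the sample row are made up for illustration and do not come from the thread.

```python
def cast_columns(df, target_types):
    """Return df with each named column cast to the given Spark SQL type name."""
    for name, type_name in target_types.items():
        # withColumn with an existing column name replaces that column
        df = df.withColumn(name, df[name].cast(type_name))
    return df


def demo():
    # Needs a local Spark installation; data and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([("1", "2.5")], ["id", "price"])  # loaded as strings
    cast_columns(df, {"id": "int", "price": "double"}).printSchema()
    spark.stop()
```

Calling demo() should print a schema in which the two columns carry the new types; values that cannot be cast become null rather than raising an error.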

Is there any plan to have a predict method for single instance on PipelineModel?

2017-02-05 Thread Aseem Bansal
Hi, I searched JIRA but could not find any issue about supporting a predict method for a single instance on PipelineModel. Is there anything that I may have missed?

Re: Is there any plan to have a predict method for single instance on PipelineModel?

2017-02-05 Thread Holden Karau
I'm on mobile right now but there is a JIRA to add it to the models first and on that JIRA people are discussing single element transform as a possibility - https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-10413 There might be others as well that just aren't as fresh in my

Re: How to checkpoint an RDD after a stage and before reaching an action?

2017-02-05 Thread Liang-Chi Hsieh
Hi Leo, The checkpointing of an RDD is performed after a job that uses this RDD has completed. Since you have only one job, rdd1 will only be checkpointed after that job has finished. To checkpoint rdd1 earlier, you can simply materialize it (and maybe cache it first to avoid recomputation), e.g., with rdd1.count
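
A minimal PySpark sketch of the approach described above, assuming a checkpoint directory has already been set on the SparkContext. The helper name checkpoint_now, the sample data and the directory path are hypothetical.

```python
def checkpoint_now(rdd):
    """Mark rdd for checkpointing, then run a small job so it is written right away."""
    rdd.cache()       # keep computed partitions so the checkpoint write need not recompute them
    rdd.checkpoint()  # only marks the RDD; nothing is written yet
    rdd.count()       # an action: the checkpoint is saved once this job completes
    return rdd


def demo():
    # Needs a local Spark installation; the path below is a made-up example.
    from pyspark import SparkContext

    sc = SparkContext("local[1]", "checkpoint-sketch")
    sc.setCheckpointDir("/tmp/spark-checkpoints")
    rdd1 = checkpoint_now(sc.parallelize(range(100)).map(lambda x: x * 2))
    print(rdd1.isCheckpointed())
    sc.stop()
```

The order matters: checkpoint() has to be called before the first job runs on the RDD, and caching beforehand avoids computing rdd1 a second time when the checkpoint is written.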