Re: The driver hangs at DataFrame.rdd in Spark 2.1.0

2017-02-23 Thread StanZhai
Thanks for Cheng's help. There must be something wrong with InferFiltersFromConstraints; I just removed InferFiltersFromConstraints from org/apache/spark/sql/catalyst/optimizer/Optimizer.scala to avoid this issue. I will analyze this issue with the method you provided. --

Re: [Spark Namespace]: Expanding Spark ML under Different Namespace?

2017-02-23 Thread Joseph Bradley
+1 for Nick's comment about discussing APIs which need to be made public in https://issues.apache.org/jira/browse/SPARK-19498 ! On Thu, Feb 23, 2017 at 2:36 AM, Steve Loughran wrote: > > On 22 Feb 2017, at 20:51, Shouheng Yi > wrote: > > Hi

unsubscribe

2017-02-23 Thread Donam Kim
unsubscribe

Re: Feedback on MLlib roadmap process proposal

2017-02-23 Thread Tim Hunter
As Sean wrote very nicely above, the changes made to Spark are decided in an organic fashion based on the interests and motivations of the committers and contributors. The case of deep learning is a good example. There is a lot of interest, and the core algorithms could be implemented without too

Re: The driver hangs at DataFrame.rdd in Spark 2.1.0

2017-02-23 Thread Cheng Lian
This one seems to be relevant, but it's already fixed in 2.1.0. One way to debug is to turn on trace logging and check how the analyzer/optimizer behaves. On 2/22/17 11:11 PM, StanZhai wrote: Could this be related to https://issues.apache.org/jira/browse/SPARK-17733 ? --
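Turning on trace logging for Catalyst, as Cheng suggests, can be done through Spark's log4j configuration. A minimal sketch of the relevant `log4j.properties` entries (Spark 2.1 ships with log4j 1.x; the exact logger package names below are assumptions, not confirmed by the thread):

```properties
# Trace the analyzer and optimizer to see which rules fire and how plans change
log4j.logger.org.apache.spark.sql.catalyst.analysis=TRACE
log4j.logger.org.apache.spark.sql.catalyst.optimizer=TRACE
# Keep the rest of the driver log at the default level
log4j.rootCategory=WARN, console
```

With these entries in `conf/log4j.properties`, each rule application (including InferFiltersFromConstraints) should appear in the driver log, which helps pinpoint where the optimizer loops or hangs.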

Re: Feedback on MLlib roadmap process proposal

2017-02-23 Thread Nick Pentreath
Sorry for being late to the discussion. I think Joseph, Sean and others have covered the issues well. Overall I like the proposed cleaned up roadmap & process (thanks Joseph!). As for the actual critical roadmap items mentioned on SPARK-18813, I think it makes sense and will comment a bit further

Re: Support for decimal separator (comma or period) in spark 2.1

2017-02-23 Thread Hyukjin Kwon
Please take a look at https://issues.apache.org/jira/browse/SPARK-18359. 2017-02-23 21:53 GMT+09:00 Arkadiusz Bicz : > Thank you Sam for answer, I have solved problem by loading all decimals > columns as string and replacing all commas with dots but this solution is >

unsubscribe

2017-02-23 Thread नितेश
unsubscribe

Re: Support for decimal separator (comma or period) in spark 2.1

2017-02-23 Thread Arkadiusz Bicz
Thank you Sam for the answer. I have solved the problem by loading all decimal columns as strings and replacing all commas with dots, but this solution lacks automatic schema inference, which is quite nice functionality. I can work on adding a new option to DataFrameReader for localization, like:

RE: Implementation of RNN/LSTM in Spark

2017-02-23 Thread Joeri Hermans
Hi Nikita, We are actively working on this: https://github.com/cerndb/dist-keras This will allow you to run Keras on Spark (with distributed optimization algorithms) through pyspark. I recommend you check the examples at https://github.com/cerndb/dist-keras/tree/master/examples. However, you

Re: Implementation of RNN/LSTM in Spark

2017-02-23 Thread Nick Pentreath
The short answer is that there is none, and it is highly unlikely to be inside Spark MLlib any time in the near future. The best bets are to look at other DL libraries - for the JVM there are Deeplearning4J and BigDL (there are others, but these seem to be the most comprehensive I have come across) - that run

Re: Support for decimal separator (comma or period) in spark 2.1

2017-02-23 Thread Sam Elamin
Hi Arkadiusz, I'm not sure if there is a localisation ability, but I'm sure others will correct me if I'm wrong. What you could do is write a UDF that replaces the commas with a dot, assuming you know the column in question. Regards, Sam On Thu, 23 Feb 2017 at 12:31, Arkadiusz Bicz
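The comma-to-dot workaround Sam describes can be sketched as a plain Scala function; the column name and the assumption that values always use a comma separator are hypothetical, not from the thread:

```scala
// Core conversion: parse a comma-decimal string ("0,32") as a Double.
// Assumes the input uses a comma as the decimal separator and no
// thousands grouping.
def commaToDot(s: String): Double =
  s.replace(',', '.').toDouble

// In a Spark job this would be wrapped as a UDF, roughly:
//   import org.apache.spark.sql.functions.udf
//   val parseDecimal = udf(commaToDot _)
//   df.withColumn("price", parseDecimal(df("price")))   // "price" is a made-up column
```

This handles parsing but, as Arkadiusz notes later in the thread, it forces the column to be read as a string first, so automatic schema inference is lost.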

Re: Support for decimal separator (comma or period) in spark 2.1

2017-02-23 Thread Arkadiusz Bicz
> > Hi Team, > > I would like to know if it is possible to specify decimal localization for > DataFrameReader for csv? > > I have csv files from a localization where the decimal separator is a comma, like > 0,32, instead of the US way, like 0.32 > > Is there a way to specify in the current version of spark to provide

Re: Implementation of RNN/LSTM in Spark

2017-02-23 Thread n1kt0
Hi, can anyone tell me what the current status about RNNs in Spark is? -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Implementation-of-RNN-LSTM-in-Spark-tp14866p21060.html Sent from the Apache Spark Developers List mailing list archive at

Re: Filestream can not recognize the copied file

2017-02-23 Thread ??????????
hi all, I checked the code just now. Spark scans the dir and filters files by the timestamp of the file. I copied files to the dir but did not change the timestamp. I know the reason now. Thanks. ---Original--- From: "??"<1427357...@qq.com> Date: 2017/2/23 17:07:33 To: "dev";
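Given the timestamp filtering described above, one common workaround is to bump the copied file's modification time so the stream treats it as newly arrived. A hypothetical helper, using only `java.nio.file` (the name `copyAsNew` and its signature are assumptions for illustration):

```scala
import java.nio.file.{Files, Paths, StandardCopyOption}
import java.nio.file.attribute.FileTime

// Copy a file into the monitored directory, then set its modification
// time to "now" so a timestamp-based scan (like textFileStream's)
// sees it as a new file rather than an old one.
def copyAsNew(src: String, dstDir: String): Unit = {
  val srcPath = Paths.get(src)
  val dstPath = Paths.get(dstDir).resolve(srcPath.getFileName)
  Files.copy(srcPath, dstPath, StandardCopyOption.REPLACE_EXISTING)
  Files.setLastModifiedTime(dstPath, FileTime.fromMillis(System.currentTimeMillis()))
}
```

An equivalent shell-level fix is to `touch` the file after copying it into the watched directory.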

Re: [Spark Namespace]: Expanding Spark ML under Different Namespace?

2017-02-23 Thread Steve Loughran
On 22 Feb 2017, at 20:51, Shouheng Yi > wrote: Hi Spark developers, Currently my team at Microsoft is extending Spark’s machine learning functionalities to include new learners and transformers. We would like users to use

unsubscribe

2017-02-23 Thread Donam Kim
unsubscribe

Filestream can not recognize the copied file

2017-02-23 Thread ??????????
Hi all, I tested filestream today; my code looks like: val fs = ssc.textFileStream(*) val erroelines = fs.filter(_.contains("erroe")) erroelines.print() ssc.start() When I edit a file and save it to the dir, it works well. If I copy a file to the dir, it does not work. My issues are: 1, is it OK