Re: How to scale livy servers?

2017-12-18 Thread amarouni
On 18/12/2017 04:03, Meisam Fathi wrote: >   > > 1) I have a couple of Livy servers that are submitting jobs, and > if one of them crashes, its session IDs start again from 0, which > can collide with those of the non-faulty running Livy servers. I think it > would be nice to have session

[GitHub] bahir pull request #42: [BAHIR-116] Add spark streaming connector to Google ...

2017-05-03 Thread amarouni
Github user amarouni commented on a diff in the pull request: https://github.com/apache/bahir/pull/42#discussion_r114502066 --- Diff: streaming-pubsub/examples/src/main/scala/org.apache.spark.examples.streaming.pubsub/PubsubWordCount.scala --- @@ -0,0 +1,150

Re: Beam spark 2.x runner status

2017-03-16 Thread amarouni
ired. >>> I'd be happy to do that, or guide anyone who wants to (I did most of it on >>> my branch for Spark 2 anyway) but since it's a branch and not on master (I >>> don't believe it "deserves" a place on master), it would always be a bit >>> behi

Re: Beam spark 2.x runner status

2017-03-15 Thread amarouni
+1 for Spark runners based on different APIs (RDD/Dataset) and keeping the Spark versions as a deployment dependency. The RDD API is stable and mature enough, so it makes sense to have it on master; the Dataset API still has some work to do and from our own experience it just reached a comparable RDD

Re: Graduation!

2017-01-11 Thread amarouni
Congratulations to everyone on this important milestone. On 11/01/2017 11:52, Neelesh Salian wrote: > Congratulations to the community. :) > > On Jan 11, 2017 3:37 PM, "Stephan Ewen" wrote: > >> Very nice :-) >> >> Good to see this happening! >> >> On Tue, Jan 10, 2017 at

Re: DataFrame Sort gives Cannot allocate a page with more than 17179869176 bytes

2016-10-06 Thread amarouni
You can get more insight by using the Spark history server (http://spark.apache.org/docs/latest/monitoring.html). It can show you which task is failing, along with other information that might help you debug the issue. On 05/10/2016 19:00, Babak Alipour wrote: > The issue seems to lie in

[GitHub] incubator-beam pull request: [BEAM-313] Enable the use of an existing spark ...

2016-05-31 Thread amarouni
GitHub user amarouni opened a pull request: https://github.com/apache/incubator-beam/pull/401 [BEAM-313] Enable the use of an existing spark context with the SparkPipelineRunner The general use case is that the SparkPipelineRunner creates its own Spark context and uses

Spark ML Interaction

2016-03-08 Thread amarouni
Hi, Did anyone here manage to write an example of the following ML feature transformer http://spark.apache.org/docs/latest/api/java/org/apache/spark/ml/feature/Interaction.html ? It's not documented on the official Spark ML features pages but it can be found in the package API javadocs. Thanks,

Dynamic jar loading

2015-12-17 Thread amarouni
Hello guys, Do you know if the method SparkContext.addJar("file:///...") can be used on a running context (an already started spark-shell)? And if so, does it add the jar to the classpath of the Spark workers (YARN containers in the case of yarn-client)? Thanks,

Re: Database does not exist: (Spark-SQL ===> Hive)

2015-12-15 Thread amarouni
Can you test with the latest version of Spark? I had the same issue with 1.3 and it was resolved in 1.5. On 15/12/2015 04:31, Jeff Zhang wrote: > Do you put hive-site.xml on the classpath ? > > On Tue, Dec 15, 2015 at 11:14 AM, Gokula Krishnan D > >

Re: Save RandomForest Model from ML package

2015-10-23 Thread amarouni
It's an open issue: https://issues.apache.org/jira/browse/SPARK-4587 That being said, you can work around the issue by serializing the model (simple Java serialization) and then restoring it before calling the prediction job. Best Regards, On 22/10/2015 14:33, Sebastian Kuepers wrote: >
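
The workaround mentioned in this reply (plain Java serialization of the model, then restoring it before prediction) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the thread: ToyModel is a hypothetical stand-in for a trained model, and the helper names serialize and deserialize are invented here. It only applies to model classes that implement java.io.Serializable, and the bytes would normally be written to a file or other storage rather than kept in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ModelSerializationSketch {

    // Hypothetical stand-in for a trained model; this is NOT a Spark class.
    static final class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] weights;

        ToyModel(double[] weights) {
            this.weights = weights.clone();
        }

        // Dot product of the weights and the feature vector.
        double predict(double[] features) {
            double sum = 0.0;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * features[i];
            }
            return sum;
        }
    }

    // Serialize any Serializable object to a byte array (assumed helper name).
    static byte[] serialize(Serializable model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return bytes.toByteArray();
    }

    // Restore an object from bytes produced by serialize (assumed helper name).
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ToyModel trained = new ToyModel(new double[] {0.5, 2.0});
        byte[] saved = serialize(trained);                 // "save" step
        ToyModel restored = (ToyModel) deserialize(saved); // "restore" step
        System.out.println(restored.predict(new double[] {2.0, 1.0})); // prints 3.0
    }
}
```

Java serialization ties the saved bytes to the class definition, so a model restored this way should be loaded with the same library version that produced it.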