Re: submitting spark job with kerberized Hadoop issue

2016-08-06 Thread Wojciech Pituła
What I can say is that we successfully use Spark on YARN with a kerberized cluster. One of my coworkers also tried using it the same way you are (Spark standalone with a kerberized cluster), but as far as I remember, he didn't succeed. I may be wrong, because I was not personally involved in this

Dynamic (de)allocation with Spark Streaming

2015-11-04 Thread Wojciech Pituła
Hi, I have some doubts about dynamic resource allocation with Spark Streaming. If Spark has allocated 5 executors for me, it will dispatch each batch's tasks across all of them equally. So if the batch interval < spark.dynamicAllocation.executorIdleTimeout, Spark will never free any executor.
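To make the concern above concrete, here is a minimal sketch in plain Python (not Spark code). It assumes a simplified model in which every executor receives tasks in every batch, so the longest an executor can sit idle is the batch interval minus the time its tasks take. The function name `executor_can_be_released` and its parameters are hypothetical and exist only for illustration; the real scheduler is more involved.

```python
def executor_can_be_released(batch_interval_s: float,
                             idle_timeout_s: float,
                             task_time_s: float = 0.0) -> bool:
    """Simplified model: return True if an executor that gets work in every
    batch can stay idle for longer than the idle timeout.

    batch_interval_s -- streaming batch interval in seconds
    idle_timeout_s   -- value of spark.dynamicAllocation.executorIdleTimeout
    task_time_s      -- assumed time the executor is busy in each batch
    """
    # The longest gap between two pieces of work on the same executor.
    longest_idle_gap_s = max(batch_interval_s - task_time_s, 0.0)
    # An executor is only a removal candidate once it has been idle
    # for more than the timeout.
    return longest_idle_gap_s > idle_timeout_s


if __name__ == "__main__":
    # 10 s batches with a 60 s idle timeout: no executor is ever idle long enough.
    print(executor_can_be_released(10, 60))
    # 120 s batches, 5 s of work per batch, 60 s timeout: the gap exceeds the timeout.
    print(executor_can_be_released(120, 60, task_time_s=5))
```

Under this model, executors are only released when the idle gap between batches exceeds the timeout, which matches the situation described in the message.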

Re: Java 8 vs Scala

2015-07-16 Thread Wojciech Pituła
IMHO only Scala is an option. Once you're familiar with it, you just can't even look at Java code. Thu, 16.07.2015 at 07:20, user spark user spark_u...@yahoo.com.invalid wrote: I struggle a lot in Scala, almost 10 days with no improvement, but when I switch to Java 8, things are so smooth,

Number of runJob at SparkPlan.scala:122 in Spark SQL

2015-07-09 Thread Wojciech Pituła
Hey, I was wondering if it is possible to tune the number of jobs generated by Spark SQL? Currently my query generates over 80 "runJob at SparkPlan.scala:122" jobs; each of them executes in ~4 sec and contains only 5 tasks. As a result, most of my cores do nothing.

Re: Spark streaming on standalone cluster

2015-07-01 Thread Wojciech Pituła
Hi, https://spark.apache.org/docs/latest/streaming-programming-guide.html Points to remember - When running a Spark Streaming program locally, do not use “local” or “local[1]” as the master URL. Either of these means that only one thread will be used for running tasks locally. If
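As a rough illustration of the point quoted above, the following plain-Python sketch (no Spark dependency) works out how many threads a local master URL provides. The helper names `local_threads` and `has_thread_for_processing` are hypothetical. The rule that the thread count should exceed the number of receivers is my recollection of what the linked guide goes on to say, so treat it as an assumption rather than a quote.

```python
import os
import re
from typing import Optional

# Matches "local", "local[N]" and "local[*]" only; other forms are not handled.
_LOCAL_MASTER = re.compile(r"^local(?:\[(\*|\d+)\])?$")


def local_threads(master: str) -> Optional[int]:
    """Return the number of worker threads implied by a local master URL,
    or None if the URL is not one of the local forms handled here."""
    match = _LOCAL_MASTER.match(master)
    if match is None:
        return None
    spec = match.group(1)
    if spec is None:
        return 1  # plain "local" means a single thread
    if spec == "*":
        return os.cpu_count() or 1  # one thread per available core
    return int(spec)


def has_thread_for_processing(master: str, num_receivers: int) -> bool:
    """Assumed rule: there must be more threads than receivers, otherwise
    no thread is left to process the received data."""
    threads = local_threads(master)
    if threads is None:
        raise ValueError(f"not a local master URL: {master!r}")
    return threads > num_receivers


if __name__ == "__main__":
    for url in ("local", "local[1]", "local[2]"):
        print(url, has_thread_for_processing(url, num_receivers=1))
```

With a single receiver, "local" and "local[1]" leave no thread for processing, while "local[2]" does.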

Re: Spark-Submit / Spark-Shell Error Standalone cluster

2015-06-28 Thread Wojciech Pituła
I assume that /usr/bin/load-spark-env.sh exists. Do you have permission to execute it? Sun, 28.06.2015 at 04:53, user Ashish Soni asoni.le...@gmail.com wrote: Not sure what the issue is, but when I run spark-submit or spark-shell I am getting the below error /usr/bin/spark-class:

Re: Spark Streaming: limit number of nodes

2015-06-24 Thread Wojciech Pituła
on each node *From:* Wojciech Pituła [mailto:w.pit...@gmail.com] *Sent:* Tuesday, June 23, 2015 12:38 PM *To:* user@spark.apache.org *Subject:* Spark Streaming: limit number of nodes I have set up a small standalone cluster: 5 nodes, every node has 5GB of memory and 8 cores. As you can see

Re: Spark Streaming: limit number of nodes

2015-06-23 Thread Wojciech Pituła
Best Regards On Tue, Jun 23, 2015 at 5:07 PM, Wojciech Pituła w.pit...@gmail.com wrote: I have set up a small standalone cluster: 5 nodes, every node has 5GB of memory and 8 cores. As you can see, a node doesn't have much RAM. I have 2 streaming apps; the first one is configured to use 3GB of memory

Spark Streaming: limit number of nodes

2015-06-23 Thread Wojciech Pituła
I have set up a small standalone cluster: 5 nodes, every node has 5GB of memory and 8 cores. As you can see, a node doesn't have much RAM. I have 2 streaming apps; the first one is configured to use 3GB of memory per node and the second one uses 2GB per node. My problem is that the smaller app could easily run