Re: Environment tab meaning

2016-06-07 Thread satish saley
Thank you Jacek.
In case of YARN, I see that the Hadoop jars are present in the system
classpath for the Driver. Will it be the same for all Executors?

On Tue, Jun 7, 2016 at 11:22 AM, Jacek Laskowski <ja...@japila.pl> wrote:

> Ouch, I made a mistake :( Sorry.
>
> You've asked about spark **history** server. It's pretty much the same.
>
> HistoryServer is a web interface for completed and running (aka
> incomplete) Spark applications. It uses EventLoggingListener to
> collect events as JSON using org.apache.spark.util.JsonProtocol
> object.
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Tue, Jun 7, 2016 at 8:18 PM, Jacek Laskowski <ja...@japila.pl> wrote:
> > Hi,
> >
> > It is the driver - see the port. Is this 4040 or similar? It's started
> > when SparkContext starts and is controlled by spark.ui.enabled.
> >
> > spark.ui.enabled (default: true) = controls whether the web UI is
> > started or not.
> >
> > It's through JobProgressListener which is the SparkListener for web UI
> > that the console knows what happens under the covers (and can
> > calculate the stats).
> >
> > BTW, spark.ui.port (default: 4040) controls the port Web UI binds to.
> >
> > Pozdrawiam,
> > Jacek Laskowski
> > 
> > https://medium.com/@jaceklaskowski/
> > Mastering Apache Spark http://bit.ly/mastering-apache-spark
> > Follow me at https://twitter.com/jaceklaskowski
> >
> >
> > On Tue, Jun 7, 2016 at 8:11 PM, satish saley <satishsale...@gmail.com>
> wrote:
> >> Hi,
> >> In the Spark history server, we see the Environment tab. Does it show the
> >> environment of the Driver, the Executors, or both?
> >>
> >> Jobs
> >> Stages
> >> Storage
> >> Environment
> >> Executors
> >>
>
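
Putting the settings from this thread together, here is a minimal sketch (the app name and event-log path are placeholders) of how they can be set on a SparkConf before the SparkContext, and therefore the web UI, starts:

import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch of the web UI / history server settings discussed above.
val conf = new SparkConf()
  .setAppName("ui-settings-sketch")                 // placeholder app name
  .set("spark.ui.enabled", "true")                  // start the web UI (default: true)
  .set("spark.ui.port", "4040")                     // port the web UI binds to (default: 4040)
  .set("spark.eventLog.enabled", "true")            // have EventLoggingListener record events as JSON
  .set("spark.eventLog.dir", "hdfs:///spark-logs")  // placeholder path the history server reads back
val sc = new SparkContext(conf)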


Environment tab meaning

2016-06-07 Thread satish saley
Hi,
In the Spark history server, we see the Environment tab. Does it show the
environment of the Driver, the Executors, or both?


   - Jobs
   - Stages
   - Storage
   - Environment
   - Executors


duplicate jar problem in yarn-cluster mode

2016-05-17 Thread satish saley
Hello,
I am executing a simple job in yarn-cluster mode with the following arguments:

--master
yarn-cluster
--name
Spark-FileCopy
--class
my.example.SparkFileCopy
--properties-file
spark-defaults.conf
--queue
saleyq
--executor-memory
1G
--driver-memory
1G
--conf
spark.john.snow.is.back=true
--jars
hdfs://myclusternn.com:8020/tmp/saley/examples/examples-new.jar
--conf
spark.executor.extraClassPath=examples-new.jar
--conf
spark.driver.extraClassPath=examples-new.jar
--verbose
examples-new.jar
hdfs://myclusternn.com:8020/tmp/saley/examples/input-data/text/data.txt
hdfs://myclusternn.com:8020/tmp/saley/examples/output-data/spark


I am facing:

java.io.IOException: Resource hdfs://myclusternn.com/user/saley/.sparkStaging/application_5181/examples-new.jar
changed on src filesystem (expected 1463440119942, was 1463440119989)

I see a JIRA for this:
https://issues.apache.org/jira/browse/SPARK-1921

Is this yet to be fixed, or was it fixed as part of another JIRA and just
needs some additional config?
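
In case it helps while the JIRA is open: one workaround that has been reported for this error (an assumption on my part, not a confirmed fix) is to reference examples-new.jar only once, as the application jar, instead of also shipping it through --jars and the extraClassPath settings, so that YARN localizes a single copy. Roughly:

--master
yarn-cluster
--name
Spark-FileCopy
--class
my.example.SparkFileCopy
--queue
saleyq
--executor-memory
1G
--driver-memory
1G
--verbose
hdfs://myclusternn.com:8020/tmp/saley/examples/examples-new.jar
hdfs://myclusternn.com:8020/tmp/saley/examples/input-data/text/data.txt
hdfs://myclusternn.com:8020/tmp/saley/examples/output-data/spark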


Re: System memory 186646528 must be at least 4.718592E8.

2016-05-13 Thread satish saley
Thank you. Looking at the source code helped :)

I set spark.testing.memory to 512 MB and it worked :)

private def getMaxMemory(conf: SparkConf): Long = {
  // systemMemory falls back to the JVM max heap (bytes) unless spark.testing.memory overrides it
  val systemMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
  val reservedMemory = conf.getLong("spark.testing.reservedMemory",
    if (conf.contains("spark.testing")) 0 else RESERVED_SYSTEM_MEMORY_BYTES)
  val minSystemMemory = reservedMemory * 1.5
  if (systemMemory < minSystemMemory) {
    throw new IllegalArgumentException(s"System memory $systemMemory must " +
      s"be at least $minSystemMemory. Please use a larger heap size.")
  }
  // ...
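
For reference, the numbers in that check are plain byte counts (Runtime.getRuntime.maxMemory returns bytes, and 4.718592E8 is 1.5 x the 300 MB reserved memory), so 512 MB has to be passed as 536870912. A minimal sketch, assuming a Scala driver, of setting it on the SparkConf; with spark-submit the equivalent would be --conf spark.testing.memory=536870912:

import org.apache.spark.{SparkConf, SparkContext}

// 512 MB expressed in bytes; spark.testing.memory is read with getLong, i.e. raw bytes
val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("PysparkExample")
  .set("spark.testing.memory", (512L * 1024 * 1024).toString)  // 536870912
val sc = new SparkContext(conf)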


On Fri, May 13, 2016 at 12:51 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Here is related code:
>
>   val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
>   if (executorMemory < minSystemMemory) {
>     throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
>
> On Fri, May 13, 2016 at 12:47 PM, satish saley <satishsale...@gmail.com>
> wrote:
>
>> Hello,
>> I am running the
>> https://github.com/apache/spark/blob/branch-1.6/examples/src/main/python/pi.py
>> example, but am facing the following exception.
>>
>> What is the unit of memory pointed out in the error?
>>
>> Following are the configs:
>>
>> --master
>> local[*]
>> --deploy-mode
>> client
>> --name
>> PysparkExample
>> --py-files
>> py4j-0.9-src.zip,pyspark.zip,
>> --verbose
>>
>>
>> pi.py/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
>> py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
>> : java.lang.IllegalArgumentException: System memory 186646528 must be at least 4.718592E8. Please use a larger heap size.
>>     at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)
>>     at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)
>>     at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
>>     at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
>>     at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
>>     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
>>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
>>     at py4j.Gateway.invoke(Gateway.java:214)
>>     at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
>>     at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
>>     at py4j.GatewayConnection.run(GatewayConnection.java:209)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>
>


System memory 186646528 must be at least 4.718592E8.

2016-05-13 Thread satish saley
Hello,
I am running the
https://github.com/apache/spark/blob/branch-1.6/examples/src/main/python/pi.py
example, but am facing the following exception.

What is the unit of memory pointed out in the error?

Following are the configs:

--master
local[*]
--deploy-mode
client
--name
PysparkExample
--py-files
py4j-0.9-src.zip,pyspark.zip,
--verbose


pi.py/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.IllegalArgumentException: System memory 186646528 must be at least 4.718592E8. Please use a larger heap size.
    at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)
    at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)


pyspark.zip and py4j-0.9-src.zip

2016-05-15 Thread satish saley
Hi,
Is there any way to pull pyspark.zip and py4j-0.9-src.zip into a Maven
project?


mesos cluster mode

2016-05-05 Thread satish saley
Hi,
The Spark documentation says that "cluster mode is currently not supported for
Mesos clusters." But below it we can see a Mesos example with cluster deploy
mode. I don't have a Mesos cluster to try it out. Which one is true? Should I
interpret it as "cluster mode is currently not supported for Mesos clusters
*for Python applications*"?

"Alternatively, if your application is submitted from a machine far from
the worker machines (e.g. locally on your laptop), it is common to use
cluster mode to minimize network latency between the drivers and the
executors. Note that cluster mode is currently not supported for Mesos
clusters. Currently only YARN supports cluster mode for Python
applications."



# Run on a Mesos cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000
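
Reading the two statements together, my interpretation (not verified on a Mesos cluster) is the one you suggest: cluster deploy mode does exist for JVM applications on Mesos, as the example above shows, while for Python applications cluster mode is only available on YARN, along the lines of:

# Run a Python application in cluster deploy mode (per the docs, only YARN supports this)
./bin/spark-submit \
  --master yarn-cluster \
  examples/src/main/python/pi.py \
  1000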


Re: killing spark job which is submitted using SparkSubmit

2016-05-06 Thread satish saley
Hi Anthony,

I am passing

--master
yarn-cluster
--name
pysparkexample
--executor-memory
1G
--driver-memory
1G
--conf
spark.yarn.historyServer.address=http://localhost:18080
--conf
spark.eventLog.enabled=true

--verbose

pi.py


I am able to run the job successfully. I just want to get it killed
automatically whenever I kill my application.


On Fri, May 6, 2016 at 11:58 AM, Anthony May <anthony...@gmail.com> wrote:

> Greetings Satish,
>
> What are the arguments you're passing in?
>
> On Fri, 6 May 2016 at 12:50 satish saley <satishsale...@gmail.com> wrote:
>
>> Hello,
>>
>> I am submitting a spark job using SparkSubmit. When I kill my
>> application, it does not kill the corresponding spark job. How would I kill
>> the corresponding spark job? I know, one way is to use SparkSubmit again
>> with appropriate options. Is there any way through which I can tell
>> SparkSubmit at the time of job submission itself? Here is my code:
>>
>>
>> import org.apache.spark.deploy.SparkSubmit;
>>
>> class MyClass {
>>     public static void main(String[] args) {
>>         // preparing args
>>         SparkSubmit.main(args);
>>     }
>> }
>>
>>


Re: killing spark job which is submitted using SparkSubmit

2016-05-06 Thread satish saley
Thank you Anthony. I am clearer on yarn-cluster and yarn-client now.

On Fri, May 6, 2016 at 1:05 PM, Anthony May <anthony...@gmail.com> wrote:

> Making the master yarn-cluster means that the driver is then running on
> YARN not just the executor nodes. It's then independent of your application
> and can only be killed via YARN commands, or if it's batch and completes.
> The simplest way to tie the driver to your app is to pass in yarn-client as
> master instead.
>
> On Fri, May 6, 2016 at 2:00 PM satish saley <satishsale...@gmail.com>
> wrote:
>
>> Hi Anthony,
>>
>> I am passing
>>
>> --master
>> yarn-cluster
>> --name
>> pysparkexample
>> --executor-memory
>> 1G
>> --driver-memory
>> 1G
>> --conf
>> spark.yarn.historyServer.address=http://localhost:18080
>> --conf
>> spark.eventLog.enabled=true
>>
>> --verbose
>>
>> pi.py
>>
>>
>> I am able to run the job successfully. I just want to get it killed 
>> automatically whenever I kill my application.
>>
>>
>> On Fri, May 6, 2016 at 11:58 AM, Anthony May <anthony...@gmail.com>
>> wrote:
>>
>>> Greetings Satish,
>>>
>>> What are the arguments you're passing in?
>>>
>>> On Fri, 6 May 2016 at 12:50 satish saley <satishsale...@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am submitting a spark job using SparkSubmit. When I kill my
>>>> application, it does not kill the corresponding spark job. How would I kill
>>>> the corresponding spark job? I know, one way is to use SparkSubmit again
>>>> with appropriate options. Is there any way through which I can tell
>>>> SparkSubmit at the time of job submission itself? Here is my code:
>>>>
>>>>
>>>> import org.apache.spark.deploy.SparkSubmit;
>>>>
>>>> class MyClass {
>>>>     public static void main(String[] args) {
>>>>         // preparing args
>>>>         SparkSubmit.main(args);
>>>>     }
>>>> }
>>>>
>>>>
>>


killing spark job which is submitted using SparkSubmit

2016-05-06 Thread satish saley
Hello,

I am submitting a spark job using SparkSubmit. When I kill my application,
it does not kill the corresponding spark job. How would I kill the
corresponding spark job? I know, one way is to use SparkSubmit again with
appropriate options. Is there any way through which I can tell SparkSubmit
at the time of job submission itself? Here is my code:

import org.apache.spark.deploy.SparkSubmit;

class MyClass {
    public static void main(String[] args) {
        // preparing args
        SparkSubmit.main(args);
    }
}
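
A related sketch, assuming Spark 1.6+ where SparkLauncher.startApplication and SparkAppHandle are available: instead of calling SparkSubmit.main directly, the launcher API hands back a handle that the parent application can use to kill the Spark job when it shuts down (the class name and argument values below are placeholders):

import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

object MyLauncher {
  def main(args: Array[String]): Unit = {
    // Submit the job through the launcher API instead of SparkSubmit.main(args)
    val handle: SparkAppHandle = new SparkLauncher()
      .setAppResource("pi.py")                 // placeholder application
      .setMaster("yarn-client")                // yarn-client keeps the driver tied to this JVM
      .setConf("spark.driver.memory", "1g")
      .startApplication()

    // Kill the Spark job when this application exits, so it does not outlive it
    sys.addShutdownHook {
      handle.kill()
    }
  }
}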


Redirect from yarn to spark history server

2016-05-02 Thread satish saley
Hello,

I am running a pyspark job in yarn-cluster mode. I can see the Spark job in
YARN, but I am not able to go from any "log history" link in YARN to the Spark
history server. How would I keep track of a YARN log and its corresponding
log in the Spark history server? Is there any setting in YARN/Spark that lets
me redirect to the Spark history server from YARN?

Best,
Satish
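
For what it's worth, the combination below (the settings already used in the "killing spark job which is submitted using SparkSubmit" thread, plus spark.eventLog.dir, with placeholder values) is, as far as I know, what makes the YARN UI's history link for a finished application point at the Spark history server:

--conf
spark.eventLog.enabled=true
--conf
spark.eventLog.dir=hdfs:///spark-logs
--conf
spark.yarn.historyServer.address=http://historyserver-host:18080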


Unsubscribe

2017-02-05 Thread satish saley
Unsubscribe


Sent from Yahoo Mail for iPhone