Re: spark executor memory, jvm config

2017-03-08 Thread TheGeorge1918 .
OK, I found the problem. There was a typo in my configuration; as a result, executor dynamic allocation was not actually disabled, so executors were being killed and requested again from time to time. All good now.
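
A quick illustration of the fix being described: dynamic allocation is controlled by a single boolean property, so a typo in its name silently leaves the cluster default in effect. The sketch below is a minimal, assumed PySpark setup, not the poster's actual configuration; the executor count is made up.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("static-executors")
        # Misspelling this key is what leaves dynamic allocation enabled by accident.
        .config("spark.dynamicAllocation.enabled", "false")
        # With dynamic allocation off, pin a fixed number of executors (illustrative value).
        .config("spark.executor.instances", "8")
        .getOrCreate()
    )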

spark executor memory, jvm config

2017-03-08 Thread TheGeorge1918 .
Hello all, I was running a Spark job and some executors failed without any error info. The executors died and new executors were requested, but no failure shows up on the Spark web UI. Normally, if it were a memory issue, I could find an OOM there, but not this time. Configuration: 1. each executor has
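
For context on the kind of settings this thread is about: when executors disappear with no OOM in the UI, the container (off-heap) limit is a common culprit rather than the JVM heap. The snippet below is only a hedged sketch of Spark 1.x/2.x executor memory and JVM options; all sizes are illustrative, not the poster's values.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("executor-memory-example")
        .config("spark.executor.memory", "8g")                      # JVM heap per executor
        .config("spark.yarn.executor.memoryOverhead", "2048")       # off-heap headroom in MB (pre-2.3 property name)
        .config("spark.executor.extraJavaOptions", "-XX:+UseG1GC")  # example JVM flag
        .getOrCreate()
    )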

Re: How to run a spark on Pycharm

2017-03-03 Thread TheGeorge1918 .
Hey, it depends on your configuration. I build my Dockerfile with Spark 2.0 installed; in PyCharm, configure the interpreter to use Docker and add the following env variables to your script's run configuration. You can check the Dockerfile here: https://github.com/zhangxuan1918/spark2.0 PYSPARK_PYTHON /u
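
Roughly what that env setup amounts to in code (a sketch, assuming pyspark is importable in the PyCharm interpreter; the paths are guesses, not the values from the linked Dockerfile, and in PyCharm these variables normally go into the run configuration rather than the script):

    import os

    # Tell PySpark which interpreter the workers should use (hypothetical path).
    os.environ["PYSPARK_PYTHON"] = "/usr/bin/python"
    # Where Spark lives inside the Docker image (hypothetical path).
    os.environ["SPARK_HOME"] = "/usr/local/spark"

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("pycharm-smoke-test")
        .getOrCreate()
    )
    print(spark.range(10).count())  # quick check that the session works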

configure spark with openblas, thanks

2016-09-29 Thread TheGeorge1918 .
Hi all, I’m trying to properly configure OpenBLAS for Spark ML. I use CentOS 7, Hadoop 2.7.2, Spark 2.0 and Python 2.7 (I use PySpark to build the ML pipeline). At first I get the following warnings: *WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS* *WARN BLAS: Fa
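
For readers hitting the same warning: it comes from netlib-java failing to find a native BLAS. One common remedy (an assumption here, not necessarily the poster's eventual fix) is to pull in the fommil netlib native wrappers and make sure the system's libblas/liblapack point at OpenBLAS, e.g. via `alternatives` on CentOS 7. A minimal PySpark sketch:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("openblas-check")
        # Standard netlib-java "all" artifact that bundles the native system/ref wrappers.
        .config("spark.jars.packages", "com.github.fommil.netlib:all:1.1.2")
        .getOrCreate()
    )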

spark lda runs out of disk space

2016-02-29 Thread TheGeorge1918 .
Hi guys, I was running LDA with 2000 topics on 6G of compressed data, roughly 1.2 million docs, using 3 AWS r3.8xlarge machines as core nodes. It turned out the Spark application crashed after 3 or 4 iterations. Ganglia indicated that the disk space was all consumed. I believe it’s the shuffle data
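
One lever that is often relevant for this symptom (a hedged sketch, assuming the RDD-based MLlib LDA of that era, not the poster's actual job): checkpointing truncates the lineage between iterations so old shuffle files can be cleaned up. The path and parameters below are illustrative.

    from pyspark import SparkContext
    from pyspark.mllib.clustering import LDA
    from pyspark.mllib.linalg import Vectors

    sc = SparkContext(appName="lda-checkpoint-example")
    sc.setCheckpointDir("hdfs:///tmp/lda-checkpoints")   # hypothetical HDFS path

    # Toy corpus: RDD of (doc_id, term-count vector) pairs.
    corpus = sc.parallelize([
        (0, Vectors.dense([1.0, 2.0, 0.0])),
        (1, Vectors.dense([0.0, 1.0, 3.0])),
    ])

    # checkpointInterval keeps the EM optimizer's lineage (and shuffle data) short.
    model = LDA.train(corpus, k=2, maxIterations=20, checkpointInterval=5)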

spark 1.5.0 mllib lda eats up all the disk space

2015-11-06 Thread TheGeorge1918 .
Hi all, *PROBLEM:* I'm using Spark 1.5.0 distributed LDA to do topic modelling. It looks like after 20 iterations, the whole disk space is exhausted and the application breaks down. *DETAILS:* I'm using 4 m3.2xlarge machines (each with 30G memory and 2x80G disk space) as data nodes. I monitored th
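
A related knob for the 1.5.0 setup described here (again a sketch under assumptions, not the poster's fix): pointing spark.local.dir at both ephemeral volumes of the m3.2xlarge nodes spreads shuffle spill across the full 2x80G instead of a single partition. The mount points are hypothetical.

    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setAppName("lda-disk-usage")
        # Comma-separated list of scratch directories used for shuffle/spill files.
        .set("spark.local.dir", "/mnt/spark,/mnt2/spark")
    )
    sc = SparkContext(conf=conf)
    sc.setCheckpointDir("hdfs:///tmp/lda-checkpoints")   # used by LDA's checkpointInterval to truncate lineage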