Spark Streaming Shuffle to Disk

2015-12-04 Thread spearson23
I'm running a Spark Streaming job on Spark 1.3.1 that uses updateStateByKey. The job works perfectly fine, but at some point (after a few runs) it starts shuffling to disk no matter how much memory I give the executors. I have tried changing --executor-memory on spark-submit, but it still shuffles to disk.
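
For reference, a sketch of the kind of memory knobs available in Spark 1.x that one might try alongside --executor-memory; the 4g value and the two fraction settings are illustrative rather than values taken from the original post, and CLASS_NAME and the jar path are the placeholders used elsewhere in this thread:

    spark-submit \
      --master yarn-cluster \
      --executor-memory 4g \
      --conf spark.storage.memoryFraction=0.5 \
      --conf spark.shuffle.memoryFraction=0.3 \
      --class CLASS_NAME \
      /PATH/TO/CODE.JAR

In Spark 1.x the storage and shuffle fractions (plus their safety margins) share the executor heap, so raising one generally means lowering the other; if the updateStateByKey shuffle still outgrows the shuffle fraction, spilling to disk is the expected fallback.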

Re: JMXSink for YARN deployment

2015-12-04 Thread spearson23
We use a metrics.properties file on YARN by submitting applications like this:

    spark-submit --conf spark.metrics.conf=metrics.properties --class CLASS_NAME --master yarn-cluster --files /PATH/TO/metrics.properties /PATH/TO/CODE.JAR /PATH/TO/CONFIG.FILE APP_NAME
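
For context, a minimal metrics.properties that turns on the JMX sink might look like the lines below; the exact sink and source selection is an assumption, since the original post does not show the file contents:

    # Enable the JMX sink for every instance (driver, executors, master, worker)
    *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
    # Optionally expose JVM metrics through the same sink
    driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

The --files flag ships the file into each YARN container's working directory, which is why spark.metrics.conf can refer to it by its bare name here.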

Re: JMXSink for YARN deployment

2015-12-04 Thread spearson23
Run "spark-submit --help" to see all available options. To get JMX to work you need to: spark-submit --driver-java-options "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=JMX_PORT"