Re: Spark UI consuming lots of memory

2015-10-15 Thread Nicholas Pritchard
> … there may be a bug that leaks memory in SQLListener.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-10-13 11:44 GMT+08:00 Nicholas Pritchard <nicholas.pritch...@falkonry.com>:
>> As an update, I did try disabling the ui with "spark.u…
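
For readers of the thread: the preview cuts off the config name, but the standard flag for turning the web UI off is spark.ui.enabled. A minimal sketch, assuming that flag (the app name is illustrative, not from the thread):

    // Sketch: disable the web UI so its listeners stop retaining
    // job/stage/SQL-execution state in driver memory.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("ui-disabled-example") // illustrative name
      .set("spark.ui.enabled", "false")  // assumed completion of the truncated "spark.u…"
    val sc = new SparkContext(conf)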

Re: Spark UI consuming lots of memory

2015-10-12 Thread Nicholas Pritchard
… 2015 at 8:42 PM, Nicholas Pritchard <nicholas.pritch...@falkonry.com> wrote:
> I set those configurations by passing them to the spark-submit script:
> "bin/spark-submit --conf spark.ui.retainedJobs=20 ...". I have verified
> that these configurations are being passed…

Re: Spark UI consuming lots of memory

2015-10-12 Thread Nicholas Pritchard
I set those configurations by passing them to the spark-submit script: "bin/spark-submit --conf spark.ui.retainedJobs=20 ...". I have verified that these configurations are being passed correctly because they are listed in the Environment tab, and also by counting the number of jobs/stages that are listed.
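
For context, the same retention limits can be set programmatically instead of via --conf. A minimal sketch using standard Spark settings (the retainedStages value is illustrative; the thread only mentions retainedJobs=20):

    // Sketch: cap how much job/stage history the UI keeps on the driver.
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.ui.retainedJobs", "20")    // matches the spark-submit example above
      .set("spark.ui.retainedStages", "100") // illustrative value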

Re: Force RDD evaluation

2015-02-23 Thread Nicholas Pritchard
Thanks, Sean! Yes, I agree that this logging would still have some cost and so would not be used in production.

On Sat, Feb 21, 2015 at 1:37 AM, Sean Owen <so...@cloudera.com> wrote:
> I think the cheapest possible way to force materialization is something
> like rdd.foreachPartition(i => None). I…
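
For reference, a self-contained sketch of the trick Sean describes: iterate every partition and keep nothing, which forces the full lineage to run. The timing wrapper is illustrative and not from the thread:

    // Sketch: force materialization of an RDD without collecting it.
    import org.apache.spark.{SparkConf, SparkContext}

    val sc  = new SparkContext(new SparkConf().setAppName("force-eval").setMaster("local[*]"))
    val rdd = sc.parallelize(1 to 1000000).map(_ * 2)

    val start = System.nanoTime()
    rdd.foreachPartition(_ => ()) // evaluates every element, discards results
    println(s"materialization took ${(System.nanoTime() - start) / 1e6} ms")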

Creating time-sequential pairs

2014-05-10 Thread Nicholas Pritchard
Hi Spark community, I have a design/algorithm question that I assume is common enough for someone else to have tackled before. I have an RDD of time-series data formatted as time-value tuples, RDD[(Double, Double)], and am trying to extract threshold crossings. In order to do so, I first want to…
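
The message is truncated, but one common way to build consecutive (time, value) pairs from an RDD[(Double, Double)] is to sort by time, index the elements, and join each index with its successor. A minimal sketch under that assumption (the sample data, variable names, and the 0.5 threshold are illustrative, not from the thread):

    // Sketch: pair each sample with the next one in time order,
    // then keep the pairs where the value crosses a threshold.
    import org.apache.spark.{SparkConf, SparkContext}

    val sc     = new SparkContext(new SparkConf().setAppName("pairs").setMaster("local[*]"))
    val series = sc.parallelize(Seq((1.0, 0.2), (2.0, 0.9), (3.0, 0.4))) // (time, value)

    val indexed = series.sortByKey().zipWithIndex().map { case (tv, i) => (i, tv) }
    val shifted = indexed.map { case (i, tv) => (i - 1, tv) }
    val pairs   = indexed.join(shifted).values // ((t1, v1), (t2, v2)) for adjacent samples

    val crossings = pairs.filter { case ((_, v1), (_, v2)) => (v1 < 0.5) != (v2 < 0.5) }
    crossings.collect().foreach(println)

If your Spark version provides it, the sliding(2) helper in org.apache.spark.mllib.rdd.RDDFunctions is a join-free alternative, though it is a developer API.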