Python memory included in YARN-monitored memory?

2016-05-27 Thread Mike Sukmanowsky
Hi everyone. More of a YARN/OS question than a Spark one, but it would be good to clarify this in the docs somewhere once I get an answer. We use PySpark for all our Spark applications running on EMR. Like many users, we're accustomed to seeing the occasional ExecutorLostFailure after YARN kills a…
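YARN's container monitor counts the whole process tree, so memory used by forked Python workers sits on top of the executor JVM heap and has to fit inside the container's overhead allowance. A minimal PySpark configuration sketch follows; the property names are standard Spark settings of this era, and the sizes are purely illustrative.

from pyspark import SparkConf, SparkContext

# Illustrative sizing only: YARN enforces executor memory + overhead per
# container, and the forked Python workers count toward that same limit,
# so they have to fit inside the overhead headroom.
conf = (SparkConf()
        .setAppName("yarn-memory-example")
        .set("spark.executor.memory", "4g")                  # executor JVM heap
        .set("spark.yarn.executor.memoryOverhead", "1024")   # MB of off-heap/Python headroom
        .set("spark.python.worker.memory", "512m"))          # per-Python-worker spill threshold
sc = SparkContext(conf=conf)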

Re: Spark Metrics Framework?

2016-04-01 Thread Mike Sukmanowsky
…request if there isn't one already. Thanks, Silvio. From: Mike Sukmanowsky <mike.sukmanow...@gmail.com> Date: Friday, March 25, 2016 at 10:48 AM To: Silvio Fiorito <silvio.fior...@granturing.com>, "user@spark.apache.org" <user@spa…

Re: Spark Metrics Framework?

2016-03-25 Thread Mike Sukmanowsky
Pinging again - any thoughts? On Wed, 23 Mar 2016 at 09:17 Mike Sukmanowsky <mike.sukmanow...@gmail.com> wrote: Thanks Ted and Silvio. I think I'll need a bit more hand holding here, sorry. The way we use ES Hadoop is in pyspark via org.elasticsearch.hadoop.mr…
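Using elasticsearch-hadoop from PySpark through the org.elasticsearch.hadoop.mr classes goes through the Hadoop OutputFormat API. A minimal write sketch, assuming the elasticsearch-hadoop jar is available to the driver and executors (e.g. via --jars); the index name, node address, and sample record are illustrative.

# Assumes `sc` is an existing SparkContext.
docs = sc.parallelize([("doc-1", {"title": "example", "views": 42})])

docs.saveAsNewAPIHadoopFile(
    path="-",  # ignored by EsOutputFormat but required by the API
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.nodes": "localhost:9200",   # illustrative cluster address
        "es.resource": "example/docs",  # index/type to write to
    })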

Re: Spark Metrics Framework?

2016-03-23 Thread Mike Sukmanowsky
o.fior...@granturing.com> wrote: > Hi Mike, > > It’s been a while since I worked on a custom Source but I think all you > need to do is make your Source in the org.apache.spark package. > > Thanks, > Silvio > > From: Mike Sukmanowsky <mike.sukmanow...@gmail.com&

Re: Spark Metrics Framework?

2016-03-22 Thread Mike Sukmanowsky
…use the metric sources and sinks described here: http://spark.apache.org/docs/latest/monitoring.html#metrics If you want to push the metrics to another system you can define a custom sink. You can also extend the metrics by defining a custom source. From: Mike Sukmanow…

Spark Metrics Framework?

2016-03-21 Thread Mike Sukmanowsky
We make extensive use of the elasticsearch-hadoop library for Hadoop/Spark. In trying to troubleshoot our Spark applications, it'd be very handy to have access to some of the many metrics that the library makes available

PySpark concurrent jobs using single SparkContext

2015-08-20 Thread Mike Sukmanowsky
…a single PySpark context? -- Mike Sukmanowsky, Aspiring Digital Carpenter, e: mike.sukmanow...@gmail.com, LinkedIn: http://www.linkedin.com/profile/view?id=10897143 | GitHub: https://github.com/msukmanowsky
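Submitting several jobs against one SparkContext works because job submission is thread-safe; pairing it with the FAIR scheduler keeps long jobs from starving short ones. A minimal sketch from Python threads, with illustrative pool names and a toy computation (thread-local scheduler properties have had rough edges on older PySpark releases, so treat this as a starting point rather than a guarantee).

import threading
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("concurrent-jobs").set("spark.scheduler.mode", "FAIR")
sc = SparkContext(conf=conf)

def run_job(pool_name, data):
    # Assign this thread's jobs to their own scheduler pool so the jobs
    # share the context but are scheduled independently.
    sc.setLocalProperty("spark.scheduler.pool", pool_name)
    result = sc.parallelize(data).map(lambda x: x * 2).collect()
    print("%s finished with %d records" % (pool_name, len(result)))

threads = [threading.Thread(target=run_job, args=("pool-%d" % i, list(range(100))))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()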

PySpark and Cassandra 2.1 Examples

2014-10-29 Thread Mike Sukmanowsky
…structs to Cassandra. Comments or questions are welcome. Will update the group again when we have support for the DataStax connector. -- Mike Sukmanowsky, Aspiring Digital Carpenter, p: +1 (416) 953-4248, e: mike.sukmanow...@gmail.com, Facebook: http://facebook.com/mike.sukmanowsky | Twitter: http…

Using the DataStax Cassandra Connector from PySpark

2014-10-21 Thread Mike Sukmanowsky
…The correct response from the GatewayServer should be: In [22]: gateway.jvm.CassandraRow() Out[22]: JavaObject id=o0. Also tried using --jars option instead and that doesn't seem to work either. Is there something I'm missing as to why the classes aren't available? -- Mike Sukmanowsky
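The symptom described here is usually a driver-classpath question: jars passed with --jars get shipped to executors, but the Py4J gateway can only resolve classes that the driver JVM itself can see, which on Spark releases of this era often meant also passing --driver-class-path (or SPARK_CLASSPATH). A hedged sketch of the check, with an illustrative jar path; the class name is the connector's com.datastax.spark.connector.CassandraRow.

# Launch example (illustrative path), so the assembly is visible to both
# the driver JVM (for the Py4J gateway) and the executors:
#
#   pyspark --driver-class-path /path/to/connector-assembly.jar \
#           --jars /path/to/connector-assembly.jar
#
from pyspark import SparkContext

sc = SparkContext(appName="cassandra-gateway-check")

# If the jar is on the driver classpath, this resolves to a JavaClass you can
# construct through the gateway; if not, Py4J silently hands back a
# non-callable JavaPackage, which matches the failure described in this thread.
CassandraRow = sc._jvm.com.datastax.spark.connector.CassandraRow
print(CassandraRow)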