Re: Spark Profiler

2019-03-26 Thread manish ranjan
I have found Ganglia very helpful for understanding network I/O, CPU, and memory usage in a given Spark cluster. I have not used it, but I have heard good things about Dr. Elephant (which I think was contributed by LinkedIn, though I am not 100% sure). On Tue, Mar 26, 2019, 5:59 AM Jack Kolokasis wrote: >

Re: Spark Tuning Tool

2018-01-23 Thread manish ranjan
This is awesome work, Rohit. I am interested not only as a user but also in contributing to solving this pain point of my daily work. ~Manish On Mon, Jan 22, 2018 at 9:21 PM, lucas.g...@gmail.com wrote: > I'd be very interested in anything I can send

Re: Monitoring the User Metrics for a long running Spark Job

2016-12-05 Thread manish ranjan
http://spark.apache.org/docs/latest/monitoring.html You can also install tools like dstat, iostat, iotop, and *collectd*, which can provide fine-grained profiling on individual nodes. If

Re: Spark Website

2016-07-13 Thread manish ranjan
Working for me. What do you mean by 'as supposed to'? ~Manish On Wed, Jul 13, 2016 at 11:45 AM, Benjamin Kim wrote: > Has anyone noticed that the spark.apache.org is not working as supposed to?

queup jobs in spark cluster

2015-09-26 Thread manish ranjan
Dear All, I have a small Spark cluster for academic purposes and would like to open it up so that a group of friends can all submit and queue up jobs. How is that possible? What is the solution to this problem? Any blog, software, or link will be very helpful. Thanks ~Manish

Need clarification on spark on cluster set up instruction

2015-06-29 Thread manish ranjan
Hi All, here goes my first question. Here is my use case: I have 1 TB of data that I want to process on EC2 using Spark. I have uploaded the data to an EBS volume. The instructions for Amazon EC2 setup explain: *If your application needs to access large datasets, the fastest way to do that is to load them from