subject:"balancing RDDs"

Re: balancing RDDs

2014-06-25 Thread Sean McNamara

Yep, exactly! I’m not sure how complicated it would be to pull off. If someone could point me in the right direction, I would be happy to look into it and contribute this functionality. I imagine this would be implemented in the scheduler codebase and there would be some s

Re: balancing RDDs

2014-06-24 Thread Mayur Rustagi

This would be really useful, especially for Shark, where a shift in partitioning affects all subsequent queries unless task scheduling time beats spark.locality.wait. This can cause low overall performance for all subsequent tasks. Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur

balancing RDDs

2014-06-23 Thread Sean McNamara

We have a use case where we’d like something to execute once on each node, and I thought it would be good to ask here. Currently we achieve this by setting the parallelism to the number of nodes and using a mod partitioner: val balancedRdd = sc.parallelize( (0 until Settings.parallelism)
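
The snippet above is cut off in the archive, so the following is only a rough sketch of what a "mod partitioner" approach might look like, not the original poster's code. `ModPartitioner`, `BalancedRddSketch`, and the local `parallelism` value (standing in for `Settings.parallelism`) are hypothetical names, and the sketch assumes Spark 1.3 or later, where pair-RDD operations such as `partitionBy` are available without extra imports. Note that it only guarantees one key per partition; which node each partition lands on is still decided by the scheduler, which is the gap this thread is about.

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Hypothetical partitioner: sends integer key k to partition (k mod numPartitions).
class ModPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = key match {
    // The double mod keeps the result non-negative for negative keys.
    case k: Int => ((k % numPartitions) + numPartitions) % numPartitions
    case other  => throw new IllegalArgumentException(s"Expected Int key, got: $other")
  }
}

object BalancedRddSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: stands in for Settings.parallelism (the number of nodes).
    val parallelism = 4
    val conf = new SparkConf()
      .setAppName("balanced-rdd-sketch")
      .setMaster(s"local[$parallelism]") // local mode, for illustration only
    val sc = new SparkContext(conf)
    try {
      // One key per partition: keys 0 until parallelism, partitioned by key mod parallelism.
      val balancedRdd = sc
        .parallelize(0 until parallelism, parallelism)
        .map(i => (i, i))
        .partitionBy(new ModPartitioner(parallelism))

      // Count the elements in each partition to confirm the keys are spread evenly.
      val sizes = balancedRdd.mapPartitions(it => Iterator(it.size)).collect()
      println(sizes.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

Running work "once per node" would then amount to calling something like `foreachPartition` on `balancedRdd`, with the caveat above that Spark does not promise each partition is scheduled on a distinct node.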

3 matches
