What kind of query are you performing? You should aim for something like 2 partitions per core, which would be about 400 MB per partition. Since you have plenty of RAM, I suggest caching the whole table; performance will improve a lot.
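A quick back-of-the-envelope check of those numbers (a sketch assuming a single 4-core worker scans the 3.2 GB table; the figures are illustrative, not measured):

```python
# Rough partition-sizing arithmetic for the 3.2 GB Hive table.
# Assumption: one 4-core worker node does the scan.
table_size_mb = 3.2 * 1024      # ~3276.8 MB of data
cores = 4                       # cores on the worker node
partitions = 2 * cores          # rule of thumb: ~2 partitions per core
mb_per_partition = table_size_mb / partitions

print(partitions)                    # 8
print(round(mb_per_partition))       # ~410 MB per partition
```

With both 4-core nodes working (8 cores total), the same rule of thumb would give 16 partitions of roughly 200 MB each; either is in a reasonable range.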
Paolo

Sent from my Windows Phone

________________________________
From: vdiwakar.malladi<mailto:vdiwakar.mall...@gmail.com>
Sent: 05/12/2014 18:52
To: u...@spark.incubator.apache.org<mailto:u...@spark.incubator.apache.org>
Subject: Optimized spark configuration

Hi

Could anyone suggest a better / optimized configuration for driver memory, worker memory, degree of parallelism, and similar parameters when running 1 master node (which also acts as a slave node) and 1 slave node? Both have 32 GB RAM and 4 cores. I loaded approx. 17M rows of data (3.2 GB) into the Hive store, and when I execute a query on it from the JDBC Thrift server it takes about 10-12 sec to retrieve the data, which I think is too long. Alternatively, please point me to a tutorial that explains these optimized configurations.

Thanks in advance.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Optimized-spark-configuration-tp20495.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
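As a rough starting point for the settings the question asks about, a hedged spark-defaults.conf sketch for two 32 GB / 4-core nodes (the values are illustrative assumptions, not measured recommendations, and should be tuned against the actual workload):

```properties
# spark-defaults.conf — assumed starting values, not tuned recommendations.
# Leave headroom for the OS and other processes on each 32 GB node.
spark.executor.memory      24g
spark.driver.memory        4g
# Rule of thumb: ~2 tasks per core; 8 cores total across the two nodes.
spark.default.parallelism  16
```

To cache the table as suggested in the reply, you can issue `CACHE TABLE my_table;` (table name here is a placeholder) through the same JDBC Thrift server session before running the query, so subsequent scans read from memory.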