Hi all,

I have the following Spark configuration:

spark.app.name=Test
spark.cassandra.connection.host=127.0.0.1
spark.cassandra.connection.keep_alive_ms=5000
spark.cassandra.connection.port=10000
spark.cassandra.connection.timeout_ms=30000
spark.cleaner.ttl=3600
spark.default.parallelism=4
spark.master=local[2]
spark.ui.enabled=false
spark.ui.showConsoleProgress=false

Because I am setting spark.default.parallelism to 4, I was expecting
only 4 Spark partitions, but that does not appear to be the case.

When I run the following:

    df.foreachPartition { partition =>
      val groupedPartition = partition.toList.grouped(3).toList
      println("Grouped partition " + groupedPartition)
    }

the output starts with many print statements showing an empty list.
Only the partitions that actually contain data appear at the bottom.
Is there a way to control the number of partitions?
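
For reference, here is a minimal sketch of how the partition count might
be inspected and capped before calling foreachPartition. It assumes df is
an ordinary Spark DataFrame. The object PartitionControl and the helper
capPartitions are hypothetical names used only for illustration. One
possible cause, not verified against this setup, is that
spark.default.parallelism only governs RDD operations, while the
partitioning of a DataFrame comes from its data source or from
spark.sql.shuffle.partitions.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper object, not part of Spark or the Cassandra connector.
object PartitionControl {

  // Prints the current partition count of `df` and, if it exceeds
  // `maxPartitions`, returns a DataFrame coalesced down to that number.
  // coalesce merges existing partitions and avoids a full shuffle.
  def capPartitions(df: DataFrame, maxPartitions: Int): DataFrame = {
    val current = df.rdd.partitions.length
    println(s"DataFrame currently has $current partitions")
    if (current > maxPartitions) df.coalesce(maxPartitions) else df
  }
}
```

Calling PartitionControl.capPartitions(df, 4) and then running
foreachPartition on the result should visit at most 4 partitions. If an
even spread of rows matters more than avoiding a shuffle,
df.repartition(4) is an alternative.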

Regards,
Noorul
