Thanks, Sean. I'll try to explain what I'm trying to do. The native component I'm talking about is native code that I call using JNI. I wrote a small test:
Here, I traverse the collection to call the native component N (1000) times. The result means that I'm able to get 10 req/sec by calling the native component, and I would like to achieve the same result (not less) on a single node using Spark.

Then I started a 1-node cluster and ran the following code on it:

Here I set partitions = 1000, but the response time was not the same; it was a lot worse:

The operation filtered.top(10)(Ordering.Double) is blocking, as I understand it. At that point the closure inside the map transformation starts to execute, and the call to the native component blocks there. If I could make it non-blocking, I would expect an increase in performance.

What do you think? How would you improve the code? Or which Spark configurations should I look at? (Sorry, I'm quite new to Spark.)
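To make the "non-blocking" idea concrete, here is a minimal sketch of one possible approach: keep the native call blocking, but issue several calls at once from a bounded thread pool inside each partition. Everything named here is an assumption for illustration. `nativeScore` is a hypothetical stand-in for the JNI call (the sleep only simulates blocking), `scorePartition` and `parallelism` are made-up names, and the sketch assumes the native component is safe to call from several threads at once. It is plain Scala with no Spark dependency, shaped like the function one would pass to `RDD.mapPartitions`.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingCalls {
  // Hypothetical stand-in for the JNI call; the sleep only simulates a blocking native call.
  def nativeScore(x: Double): Double = {
    Thread.sleep(10)
    x * 2.0
  }

  // Same shape as the function passed to RDD.mapPartitions: Iterator in, Iterator out.
  def scorePartition(items: Iterator[Double], parallelism: Int = 8): Iterator[Double] = {
    val pool = Executors.newFixedThreadPool(parallelism)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Submit every call up front so that up to `parallelism` of them block concurrently.
      val futures = items.map(x => Future(nativeScore(x))).toList
      // Wait for all results; Future.sequence preserves the input order.
      Await.result(Future.sequence(futures), 5.minutes).iterator
    } finally {
      // Release the threads once the partition has been processed.
      pool.shutdown()
    }
  }
}
```

With Spark, this could be applied as `rdd.mapPartitions(it => BlockingCalls.scorePartition(it))` on an `RDD[Double]`. Note that the sketch materializes a whole partition in memory, so it only suits partitions of moderate size. It also works best when each partition holds many elements: with 1000 partitions for 1000 elements, each task processes a single element, and per-task scheduling overhead may account for much of the slowdown.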