Re: How do you perform blocking IO in apache spark job?

DrKhu Mon, 08 Sep 2014 09:50:41 -0700

Thanks, Sean. I'll try to explain what I'm trying to do.

The native component I'm talking about is native code that I call
using JNI.
I've written a small test:




Here I traverse the collection to call the native component N
(1000) times.
Then I get this result:

This means that I'm able to get 10 req/sec by calling the native component.
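
A minimal sketch of a test of this kind, for illustration only. The
callNative stub is hypothetical: it stands in for the real JNI call and just
sleeps to simulate a blocking call, so the numbers it prints say nothing
about the real component.

```scala
object NativeCallBenchmark {
  // Hypothetical stand-in for the JNI call (assumption, not the real native method).
  def callNative(x: Int): Double = {
    Thread.sleep(1) // simulate a blocking native call
    x * 0.5
  }

  // Calls f once per input, sequentially, and returns the results
  // together with the measured throughput in calls per second.
  def measure(inputs: Seq[Int])(f: Int => Double): (Seq[Double], Double) = {
    val start = System.nanoTime()
    val results = inputs.map(f)
    val elapsedSec = (System.nanoTime() - start) / 1e9
    (results, inputs.size / elapsedSec)
  }

  def main(args: Array[String]): Unit = {
    val (results, reqPerSec) = measure(1 to 1000)(callNative)
    println(f"calls: ${results.size}, throughput: $reqPerSec%.1f req/sec")
  }
}
```

Measuring the plain loop this way gives a baseline to compare the Spark job
against.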

I would like to achieve the same result (not less) on a single node
using Spark.
So I started a 1-node cluster and ran the following code on it:



Here I set partitions = 1000, but the response time was not the
same; it was much worse:



As I understand it, filtered.top(10)(Ordering.Double) is a blocking
operation: that is when the closure inside the map transformation starts to
execute, and the call to the native component blocks there. If I could make
that call non-blocking, I would expect performance to improve.
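
One possible way to overlap the blocking calls, sketched under assumptions:
the work for one partition is handed to a small thread pool, so several
calls are in flight at once. The callNative stub, the processPartition
helper and the pool size of 8 are all hypothetical, and this only helps if
the native component can safely be called from several threads.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ConcurrentBlockingCalls {
  // Hypothetical stand-in for the blocking JNI call (assumption).
  def callNative(x: Int): Double = {
    Thread.sleep(10) // simulate blocking work
    x.toDouble
  }

  // Processes one partition, running up to `parallelism` blocking calls at a time.
  // Results come back in input order. The whole partition is held in memory.
  def processPartition(it: Iterator[Int], parallelism: Int = 8): Iterator[Double] = {
    val pool = Executors.newFixedThreadPool(parallelism)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // toList forces the iterator, so every call is submitted before we wait.
      val futures = it.map(x => Future(callNative(x))).toList
      Await.result(Future.sequence(futures), 10.minutes).iterator
    } finally {
      pool.shutdown()
    }
  }
}
```

In a Spark job a function like this could be passed to mapPartitions instead
of calling the native component inside map, for example
rdd.mapPartitions(it => ConcurrentBlockingCalls.processPartition(it)).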

What do you think?
How would you improve the code? Or which Spark configuration settings should
I look at? (Sorry, I'm quite new to Spark.)



