OK, I did some uber-basic testing of the Python ALS example and the Scala ALS
example (I wouldn't call this real benchmarking, given the casual nature of
the test and the configuration).

CPU: i5-2500K
Memory allotted to each example with -Dspark.executor.memory=2g
I've got one master and one slave running.

I'm listing the results in <API language> <params: movies users features
iterations slices> : <time taken> format.

<Scala>  500 2000 100 5 2 : 1m21s
<Scala>  500 2000 100 5 4 : 0m50s
<Scala>  700 2000 100 5 2 : 1m41s
<Scala>  700 2000 100 5 4 : 1m14s
<Python> 500 2000 100 5 4 : 8m18s
(Sorry, no more for Python, I'm pressed for time at the moment.)
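In case anyone wants to reproduce: I'm just running the stock examples, so the
invocations look roughly like the following (the script paths, class name, and
master URL are from my setup and memory of this Spark vintage, so adjust for
your checkout):

```shell
# Scala ALS example: <master> <movies> <users> <features> <iterations> <slices>
./bin/run-example org.apache.spark.examples.SparkALS spark://master:7077 500 2000 100 5 4

# Python ALS example, same parameter order after the master URL
./bin/pyspark python/examples/als.py spark://master:7077 500 2000 100 5 4
```

Timings above are wall-clock (`time` in front of the command).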

I noticed that average CPU utilization on the quad-core was always 99%+
during the Scala runs (except for drops to ~90% between iterations). During
the Python runs, however, it hovered around 55-67%, with the rest spent in
I/O wait. Evidently a huge amount of time was being wasted somewhere (on
I/O? slow loops?).

And a stranger thing: the RMSE over the 5 iterations for Scala started at
0.82 and ended at 0.73, while the Python version started at an RMSE of
1294.1236 and ended at 210.2984. That's a pretty huge gap. Can someone
verify all this, at least on a single node?
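Incidentally, numbers in the hundreds-to-thousands are in the right ballpark
for a root-*sum*-squared error rather than a root-*mean*-squared error: with
500 x 2000 = 10^6 matrix entries, the two differ by a factor of sqrt(10^6) =
1000. The figures above don't line up with that factor exactly, so this is
only a guess at the discrepancy, but the scale effect is easy to see (purely
illustrative, not taken from either example's code):

```python
import math
import random

# Illustrative only: compare RMSE against the un-normalized root-sum-squared
# error on a synthetic error matrix the size of the runs above
# (500 movies x 2000 users).
M, U = 500, 2000
random.seed(0)
errors = [random.gauss(0.0, 0.8) for _ in range(M * U)]

sse = sum(e * e for e in errors)
rmse = math.sqrt(sse / (M * U))  # root-MEAN-squared error, close to 0.8 here
rsse = math.sqrt(sse)            # root-SUM-squared: exactly sqrt(M*U) = 1000x larger

print(rmse, rsse)
```

So if one implementation skipped the division by the number of entries, its
"RMSE" would be about three orders of magnitude too big, which is roughly the
kind of gap seen here.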

I haven't modified any code, so the Scala example is using the usual Colt
library.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Python-API-Performance-tp1048p1109.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
