Hi,

I observed a strange performance issue when using Spark in combination with
Theano, and I have no real explanation for it. To illustrate the issue I am
using the pi.py example from Spark, which estimates pi.

When I modify the function from the example: 

# unmodified code (sc, n and partitions are defined as in pi.py)
from operator import add
from random import random

def f(_):
    x = random() * 2 - 1
    y = random() * 2 - 1

    return 1 if x ** 2 + y ** 2 < 1 else 0

count = sc.parallelize(xrange(1, n + 1), partitions).map(f).reduce(add)
#

by adding a very simple dummy function that just computes the product of two
floats, the execution slows down massively (about 100x slower). 

Here is the slow code:

# define a simple function in Theano that computes the product
import theano
import theano.tensor as T

x = T.dscalar()
y = T.dscalar()
dummyFun = theano.function([x, y], y * x)
broadcast_dummyFun = sc.broadcast(dummyFun)

def f(_):
    x = random() * 2 - 1
    y = random() * 2 - 1

    # compute product (the result is not used)
    tmp = broadcast_dummyFun.value(x, y)

    return 1 if x ** 2 + y ** 2 < 1 else 0


Any idea why it slows down so much? Using a plain Python function (or a
lambda) that computes the product instead runs at full speed again.
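
To narrow this down, the per-call cost could be measured outside Spark, to see
whether the slowdown already shows up when the callable is invoked in a plain
loop. Below is a minimal sketch of such a measurement. The helper name
time_per_call and the number of calls are assumptions made for illustration;
only the plain Python baseline is actually run, and the Theano comparison is
left as a comment because it needs the dummyFun defined above.

```python
from random import random
from timeit import default_timer as timer


def time_per_call(fn, calls=100000):
    """Return the average wall-clock seconds per call of fn(x, y).

    Hypothetical helper: the measured time includes the two random() draws,
    which are the same for every callable being compared.
    """
    start = timer()
    for _ in range(calls):
        fn(random() * 2 - 1, random() * 2 - 1)
    return (timer() - start) / calls


def python_product(x, y):
    # Plain Python stand-in for the dummy product function.
    return x * y


baseline = time_per_call(python_product)
print("plain Python product: %.3e s/call" % baseline)

# Assumed comparison, to be run where Theano is installed and dummyFun exists:
# print("Theano product: %.3e s/call" % time_per_call(dummyFun))
```

If the Theano call is already much slower per call in this loop, the overhead
would be in invoking the compiled function rather than in Spark itself.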

I would appreciate some help on that.

-Tassilo



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Performance-issue-tp21194.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
