RE: Spark SQL udf(ScalaUdf) is very slow

2015-03-23 Thread Cheng, Hao
on the improvement soon.
From: zzcclp [mailto:441586...@qq.com]
Sent: Monday, March 23, 2015 5:10 PM
To: user@spark.apache.org
Subject: Spark SQL udf(ScalaUdf) is very slow
My test env:
1. Spark version is 1.3.0
2. 3 nodes, each 80G/20C
3. Read 250G of Parquet files from HDFS
Test case:
1. register floor

Spark SQL udf(ScalaUdf) is very slow

2015-03-23 Thread zzcclp
-list.1001560.n3.nabble.com/Spark-SQL-udf-ScalaUdf-is-very-slow-tp22185.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: RE: Spark SQL udf(ScalaUdf) is very slow

2015-03-23 Thread ??o0/ka????
user@spark.apache.org; Subject: RE: Spark SQL udf(ScalaUdf) is very slow
This is a very interesting issue. The root cause of the lower performance is probably that, in a Scala UDF, Spark SQL converts the data types from its internal representation to the Scala representation via Scala reflection