Hi,

I am speaking at Spark Summit Europe on exploiting GPUs for columnar
DataFrame operations
<https://spark-summit.org/eu-2015/events/exploiting-gpus-for-columnar-dataframe-operations/>.
I have been going through the various blogs, talks, and JIRAs from the key
Spark folks, trying to figure out where to make the changes for this
proposal.

First of all, I must thank everyone involved in the recent progress on
Project Tungsten, which has made my job easier. The changes for code
generation
<https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala>
make it possible for me to generate OpenCL code for expressions instead of
the existing Java/Scala code, and to run that OpenCL code on GPUs through
the Java library JavaCL.
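
To make this concrete, below is a minimal, self-contained sketch of the kind
of code I intend to generate and run, following the JavaCL getting-started
examples; the kernel string stands in for what a modified CodeGenerator
would emit for a simple expression like a + b:

    import com.nativelibs4java.opencl._
    import com.nativelibs4java.opencl.CLMem.Usage
    import org.bridj.Pointer.allocateFloats

    object GpuExprSketch {
      def main(args: Array[String]): Unit = {
        // Stand-in for generated code: an OpenCL kernel for (a + b).
        val kernelSrc =
          """__kernel void add_cols(__global const float* a,
            |                       __global const float* b,
            |                       __global float* out, int n) {
            |  int i = get_global_id(0);
            |  if (i < n) out[i] = a[i] + b[i];
            |}""".stripMargin

        val context = JavaCL.createBestContext() // prefers a GPU device
        val queue   = context.createDefaultQueue()
        val n       = 1024

        // Fill host-side column vectors.
        val aPtr = allocateFloats(n).order(context.getByteOrder)
        val bPtr = allocateFloats(n).order(context.getByteOrder)
        for (i <- 0 until n) { aPtr.set(i, i.toFloat); bPtr.set(i, 2f * i) }

        // Send the columns to GPU RAM, invoke the kernel, read results back.
        val a   = context.createFloatBuffer(Usage.Input, aPtr)
        val b   = context.createFloatBuffer(Usage.Input, bPtr)
        val out = context.createFloatBuffer(Usage.Output, n.toLong)

        val kernel = context.createProgram(kernelSrc).createKernel("add_cols")
        kernel.setArgs(a, b, out, Integer.valueOf(n))
        val evt    = kernel.enqueueNDRange(queue, Array(n))
        val outPtr = out.read(queue, evt) // blocks until the kernel finishes

        println(s"out(0) = ${outPtr.get(0)}, out(n-1) = ${outPtr.get(n - 1)}")
        context.release()
      }
    }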

However, before starting the work, I have a few questions:


   1. From the blogs
   https://databricks.com/blog/2014/06/02/exciting-performance-improvements-on-the-horizon-for-spark-sql.html,
   https://databricks.com/blog/2015/04/13/deep-dive-into-spark-sqls-catalyst-optimizer.html
   and
   https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html,
   I found where code generation happens in the Spark code
   <https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala>.
   However, I could not find where the generated code is executed. A major
   part of my changes will be there, since this executor will now have to
   send vectors of columns to GPU RAM, invoke execution, and get the results
   back to CPU RAM. Thus, the existing executor will change significantly.
   2. On the Project Tungsten blog
   <https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html>,
   in the third section, Code Generation, it is mentioned that you plan to
   increase the level of code generation from record-at-a-time expression
   evaluation to vectorized expression evaluation. Has this been implemented?
   If not, how do I implement it? I will need access to the columnar
   ByteBuffer objects in the DataFrame to do this; row-by-row access to the
   data would defeat the purpose of this exercise (see the first sketch
   after this list). In particular, I need access to
   https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala
   in the executor of the generated code.
   3. One thing that confuses me is the change from 1.4 to 1.5, possibly due
   to the JIRA https://issues.apache.org/jira/browse/SPARK-7956 and pull
   request https://github.com/apache/spark/pull/6479/files. This changed the
   code generation from quasiquotes (q) to the string interpolation (s)
   operator, which makes it simpler for me to generate OpenCL code, since
   that code is string based (see the second sketch after this list). The
   question is: is this branch stable now? Should I make my changes on the
   Spark 1.4 branch, the 1.5 branch, or master?
   4. How do I tune the batch size (the number of rows in the ByteBuffer)?
   Is it through the property spark.sql.inMemoryColumnarStorage.batchSize
   (see the third sketch after this list)?
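
Regarding question 2, here is a toy illustration (the function names are
mine, not Spark's) of the two evaluation styles for an expression like
a + b; only the vectorized form hands me whole column vectors of the shape
a GPU kernel consumes, one work item per element:

    // Record-at-a-time: the generated code is invoked once per row.
    def evalRow(a: Float, b: Float): Float = a + b

    // Vectorized: the generated code receives whole column vectors.
    def evalBatch(a: Array[Float], b: Array[Float], out: Array[Float]): Unit = {
      var i = 0
      while (i < a.length) { out(i) = a(i) + b(i); i += 1 }
    }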

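Regarding question 3, the reason this change helps me: with the s
interpolator the generated code is just a plain string, so the same
machinery can emit OpenCL as easily as Java. A rough sketch (the function
names and the a + b expression are illustrative only, not Spark's API):

    // A Java-style fragment, the kind of string the 1.5 code generator builds:
    def genJava(lhs: String, rhs: String): String =
      s"final float result = $lhs + $rhs;"

    // The same string-based approach emitting an OpenCL kernel instead:
    def genOpenCL(lhs: String, rhs: String): String =
      s"""__kernel void eval(__global const float* $lhs,
         |                   __global const float* $rhs,
         |                   __global float* out, int n) {
         |  int i = get_global_id(0);
         |  if (i < n) out[i] = $lhs[i] + $rhs[i];
         |}""".stripMargin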

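Regarding question 4, I am assuming this is the same property that controls
the in-memory columnar batch size (default 10000) and that it can simply be
set on the SQLContext before caching:

    // Rows per columnar batch; larger batches would mean larger vectors
    // per GPU transfer (my assumption, to be validated by profiling).
    sqlContext.setConf("spark.sql.inMemoryColumnarStorage.batchSize", "100000")
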
Thanks in anticipation,

Kiran
PS:

Other things I found useful were:

*Spark DataFrames*: https://www.brighttalk.com/webcast/12891/166495
*Apache Spark 1.5*: https://www.brighttalk.com/webcast/12891/168177

The links to JavaCL/ScalaCL:

*Library to execute OpenCL code through Java*:
https://github.com/nativelibs4java/JavaCL
*Library to convert Scala code to OpenCL and execute on GPUs*:
https://github.com/nativelibs4java/ScalaCL
