Hi Li,

The DataFrame integration supports arbitrary SELECT statements. Column pruning and predicate filtering are pushed down to Phoenix, while aggregate functions are executed within Spark.

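In case a concrete starting point helps, here is a minimal sketch of the DataFrame route. It assumes Spark 1.4 or later with the phoenix-spark jar on the classpath. The table name TABLE1, its columns ID and COL1, and the ZooKeeper address phoenix-server:2181 are placeholders for illustration, not anything from your setup.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object PhoenixSelectSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("phoenix-select-sketch")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Load the Phoenix table as a DataFrame.
    // "TABLE1" and the zkUrl value are placeholders (assumptions).
    val df = sqlContext.read
      .format("org.apache.phoenix.spark")
      .options(Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181"))
      .load()

    // Register the DataFrame so it can be queried with plain SQL.
    df.registerTempTable("TABLE1")

    // The column selection and the WHERE filter are candidates for
    // pushdown to Phoenix; the GROUP BY / COUNT runs inside Spark.
    val result = sqlContext.sql(
      "SELECT COL1, COUNT(*) AS cnt FROM TABLE1 WHERE ID > 1 GROUP BY COL1")

    result.show()
    sc.stop()
  }
}
```

Running this needs a reachable Phoenix cluster, so treat it as an outline to adapt rather than something tested against your environment.
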
When using RDDs directly, you can specify a table name, columns, and an optional WHERE predicate for basic filtering. Aggregate functions, however, are not supported.

The integration tests have a reasonably thorough set of examples for both DataFrames and RDDs with Phoenix. [1]

Good luck,

Josh

[1] https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala

On Tue, Dec 15, 2015 at 5:57 PM, Li Gao <g...@marinsoftware.com> wrote:
> Hi community,
>
> Does Phoenix Spark support arbitrary SELECT statements for generating DF
> or RDD?
>
> From this reading: https://phoenix.apache.org/phoenix_spark.html I am not
> sure how to do that.
>
> Thanks,
> Li