Hi,

I am trying to rewrite my program to use DataFrames. I see that I can
perform a mapPartitions and a foreachPartition, but can I perform a
partitionBy or set a partitioner? Or is there some other way to make my
data land in the right partition for the *Partition operations to use?
(I see that partitionBy is only available on pair RDDs, which might have
something to do with it.)

I am using the Spark master branch. The error:
[error] /home/th/spark-1.5.0/spark/IBM_ARL_teraSort_v4-01/src/main/scala/IBM_ARL_teraSort.scala:107: value partitionBy is not a member of org.apache.spark.sql.DataFrame
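
A reduced sketch of the situation, not the actual teraSort code: every name
below (PartitionBySketch, pairs, df, the column names, the partition count)
is hypothetical, and it assumes Spark 1.5-era APIs with spark-core and
spark-sql on the classpath and a local master. It shows partitionBy compiling
on a pair RDD, while the same call on a DataFrame gives the error above.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical reduced example; assumes a local master and Spark 1.5-era APIs.
object PartitionBySketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("partitionBy-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // enables toDF on an RDD of tuples

    // An RDD of (key, value) tuples picks up PairRDDFunctions implicitly,
    // so partitionBy with an explicit Partitioner compiles here.
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
    val partitioned = pairs.partitionBy(new HashPartitioner(4))
    println(partitioned.partitioner) // the HashPartitioner set above

    // The same data as a DataFrame has no partitionBy member.
    val df = pairs.toDF("key", "value")
    // df.partitionBy(new HashPartitioner(4))
    //   -> value partitionBy is not a member of org.apache.spark.sql.DataFrame

    // mapPartitions and foreachPartition are available on the DataFrame.
    df.foreachPartition(rows => println(rows.size))

    sc.stop()
  }
}
```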

Thanks,

Tom Hubregtsen




