Re: RDD[internalRow] -> DataSet

2017-12-12 Thread Vadim Semenov
Not possible directly, but you can add your own object to Spark's org.apache.spark.sql
package in your own project, which gives you access to its private methods:

package org.apache.spark.sql

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.execution.LogicalRDD
import org.apache.spark.sql.types.StructType

object DataFrameUtil {
  /**
   * Creates a DataFrame out of an RDD[InternalRow] that you can get
   * using `df.queryExecution.toRdd`.
   */
  def createFromInternalRows(sparkSession: SparkSession, schema: StructType,
                             rdd: RDD[InternalRow]): DataFrame = {
    val logicalPlan = LogicalRDD(schema.toAttributes, rdd)(sparkSession)
    Dataset.ofRows(sparkSession, logicalPlan)
  }
}
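
A minimal usage sketch, assuming Spark 2.x and that the DataFrameUtil object above
is compiled into your project (the example data is just for illustration):

import org.apache.spark.sql.{DataFrameUtil, SparkSession}

val spark = SparkSession.builder().master("local[*]").appName("internal-rows").getOrCreate()
import spark.implicits._

val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

// Grab the raw internal rows (no deserialization to external Row objects)
val internalRdd = df.queryExecution.toRdd

// Rebuild a DataFrame from them, reusing the original schema
val rebuilt = DataFrameUtil.createFromInternalRows(spark, df.schema, internalRdd)
rebuilt.show()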


Re: RDD[internalRow] -> DataSet

2017-12-09 Thread Jacek Laskowski
Hi Satyajit,

That's exactly what Dataset.rdd does -->
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala?utf8=%E2%9C%93#L2916-L2921
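
At the call site that corresponds to (a rough sketch, Spark 2.x, `spark` being a SparkSession):

val ds = spark.range(3)                 // Dataset[java.lang.Long]
val external = ds.rdd                   // RDD[java.lang.Long], deserialized from the internal rows
val internal = ds.queryExecution.toRdd  // RDD[InternalRow], raw internal rows, no deserialization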

Pozdrawiam,
Jacek Laskowski

https://about.me/JacekLaskowski
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Fri, Dec 8, 2017 at 5:25 AM, satyajit vegesna  wrote:

> Hi All,
>
> Is there a way to convert RDD[InternalRow] to a Dataset from outside the
> spark sql package?
>
> Regards,
> Satyajit.
>


RDD[internalRow] -> DataSet

2017-12-07 Thread satyajit vegesna
Hi All,

Is there a way to convert RDD[InternalRow] to a Dataset from outside the
spark sql package?

Regards,
Satyajit.