You have to register a Cassandra table in spark as dataframes
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/14_data_frames.md Thanks Sathish On Mon, Jan 22, 2018 at 7:43 AM Conconscious <conconsci...@gmail.com> wrote: > Hi list, > > I have a Cassandra table with two fields; id bigint, kafka text > > My goal is to read only the kafka field (that is a JSON) and infer the > schema > > Hi have this skeleton code (not working): > > sc.stop > import org.apache.spark._ > import com.datastax.spark._ > import org.apache.spark.sql.functions.get_json_object > > import org.apache.spark.sql.functions.to_json > import org.apache.spark.sql.functions.from_json > import org.apache.spark.sql.types._ > > val conf = new SparkConf(true) > .set("spark.cassandra.connection.host", "127.0.0.1") > .set("spark.cassandra.auth.username", "cassandra") > .set("spark.cassandra.auth.password", "cassandra") > val sc = new SparkContext(conf) > > val sqlContext = new org.apache.spark.sql.SQLContext(sc) > val df = sqlContext.sql("SELECT kafka from table1") > df.printSchema() > > I think at least I have two problems; is missing the keyspace, is not > recognizing the table and for sure is not going to infer the schema from > the text field. > > I have a working solution for json files, but I can't "translate" this > to Cassandra: > > import org.apache.spark.sql.SparkSession > import spark.implicits._ > val spark = SparkSession.builder().appName("Spark SQL basic > example").getOrCreate() > val redf = spark.read.json("/usr/local/spark/examples/cqlsh_r.json") > redf.printSchema > redf.count > redf.show > redf.createOrReplaceTempView("clicks") > val clicksDF = spark.sql("SELECT * FROM clicks") > clicksDF.show() > > My Spark version is 2.2.1 and Cassandra version is 3.11.1 > > Thanks in advance > > > > --------------------------------------------------------------------- > To unsubscribe e-mail: user-unsubscr...@spark.apache.org > >