Spark SQL Json Parse
Hi, I have a problem with JSON parsing. I am using Spark Streaming with a HiveContext to store tweets in JSON format. Flume collects the tweets and sinks them to an HDFS path. My Spark Streaming job watches that HDFS path, converts the incoming JSON tweets, and inserts them into a Hive table. My problem is that some tweets carry a location name, in-reply-to information, etc., while others are missing those fields. In the Spark Streaming job I am trying to write a generic JSON handler: for example, if a tweet has no location information, Spark should write a null value into the location column of the Hive table. Is there any way to do this? I am using Spark 1.3.0. Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Json-Parse-tp26391.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
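One common answer for Spark 1.3 is to give `jsonRDD` an explicit schema: every declared field that is absent from a given tweet then comes back as null instead of breaking the job. A minimal sketch under that assumption (the column names, the `jsonLinesRDD` input, and the `tweets` table name here are hypothetical placeholders, not from the original post):

```scala
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.types.{StructField, StructType, StringType}

val hiveContext = new HiveContext(sc)

// Declare every column of interest; nullable = true means a tweet
// that lacks the field simply yields null in that column.
val schema = StructType(Seq(
  StructField("text", StringType, nullable = true),
  StructField("location_name", StringType, nullable = true),   // hypothetical field name
  StructField("in_reply_to", StringType, nullable = true)      // hypothetical field name
))

// Spark 1.3 API: parse each JSON line against the fixed schema.
val tweetsDF = hiveContext.jsonRDD(jsonLinesRDD, schema)
tweetsDF.registerTempTable("incoming_tweets")
hiveContext.sql("INSERT INTO TABLE tweets SELECT * FROM incoming_tweets")
```

With the schema fixed up front, a batch containing only tweets without location data still produces rows with the full set of columns, so the Hive table layout stays stable across micro-batches.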
Spark Average
Hi, I have the case class described below:

    case class weatherCond(dayOfdate: String, minDeg: Int, maxDeg: Int, meanDeg: Int)

I am reading the data from a CSV file and loading it into the weatherCond class with this code:

    val weathersRDD = sc.textFile("weather.csv").map { line =>
      val Array(dayOfdate, minDeg, maxDeg, meanDeg) = line.replaceAll("\"", "").trim.split(",")
      weatherCond(dayOfdate, minDeg.toInt, maxDeg.toInt, meanDeg.toInt)
    }

The question is: how can I average the minDeg, maxDeg, and meanDeg values for each month? Example data set:

    day, min, max, mean
    2014-03-17,-3,5,5
    2014-03-18,6,7,7
    2014-03-19,6,14,10

The result has to be (2014-03, 3, 8.6, 7.3) -- the averages for 2014-03. Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Avarage-tp22391.html
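One way to compute per-month averages with the RDD API (a sketch assuming the `weathersRDD` above; untested): key each record by its `yyyy-MM` prefix, sum the three values together with a count via `reduceByKey`, then divide by the count.

```scala
// Key by month ("2014-03-17" -> "2014-03"); carry (minSum, maxSum, meanSum, count).
val monthlyAvg = weathersRDD
  .map(w => (w.dayOfdate.substring(0, 7),
             (w.minDeg.toDouble, w.maxDeg.toDouble, w.meanDeg.toDouble, 1)))
  .reduceByKey { case ((mn1, mx1, me1, n1), (mn2, mx2, me2, n2)) =>
    (mn1 + mn2, mx1 + mx2, me1 + me2, n1 + n2)
  }
  .mapValues { case (mn, mx, me, n) => (mn / n, mx / n, me / n) }
```

On the three sample rows this should produce one pair per month, roughly ("2014-03", (3.0, 8.67, 7.33)), matching the rounded figures in the question. Using `reduceByKey` rather than `groupByKey` keeps the shuffle small, since sums and counts are combined on each partition before data moves.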
Streaming Linear Regression
Hi, I tried to run streaming linear regression locally:

    val trainingData = ssc.textFileStream("/home/barisakgu/Desktop/Spark/train").map(LabeledPoint.parse)

textFileStream is not seeing the new files. I searched on the Internet and saw that somebody had the same issue, but no solution was found. Does anyone have an opinion on this? Has anybody managed to run streaming linear regression? Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-Linear-Regression-tp21726.html
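For reference, the Spark Streaming docs note that a file stream only picks up files whose modification time falls after the stream has started and which appear in the monitored directory atomically (written elsewhere and then moved or renamed in); files already present, or appended to in place, are ignored, which is the usual cause of "textFileStream sees nothing". A minimal end-to-end sketch for Spark 1.3 MLlib (the test directory path and the feature count are placeholder assumptions):

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(10))

// Files must be *moved* into these directories after ssc.start() is called.
val trainingData = ssc.textFileStream("/home/barisakgu/Desktop/Spark/train")
  .map(LabeledPoint.parse)
val testData = ssc.textFileStream("/home/barisakgu/Desktop/Spark/test") // hypothetical path
  .map(LabeledPoint.parse)

val numFeatures = 3 // placeholder: must match the dimensionality of the data
val model = new StreamingLinearRegressionWithSGD()
  .setInitialWeights(Vectors.zeros(numFeatures))

// Update the model on each training batch; score each test batch as it arrives.
model.trainOn(trainingData)
model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

ssc.start()
ssc.awaitTermination()
```

A quick way to test locally is to write each training file to a temporary directory on the same filesystem and `mv` it into the watched directory once a batch interval, so the file appears atomically with a fresh timestamp.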