Hi All,
I am trying to read a table from a relational database using Spark 2.x.
I am using code like the following:
sparkSession.read().jdbc(url, table,
connectionProperties).select("SELECT_COLUMN").where(whereClause);
Now, what's happening is that Spark is actually the SQL query which Spark is
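In Spark 2.x the select/where on a JDBC DataFrame is normally pushed down by Catalyst, but you can also force the projection and filter into the database yourself by passing a parenthesized subquery as the table argument. A minimal sketch, where the URL, credentials, and column/table names are placeholders:

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("JdbcRead").getOrCreate()

val props = new Properties()
props.setProperty("user", "dbuser")       // placeholder credentials
props.setProperty("password", "dbpass")

// A parenthesized subquery as the "table" makes the database run the
// projection and filter, instead of Spark pulling the whole table.
val pushdown = "(SELECT select_column FROM my_table WHERE some_col > 100) AS t"
val df = spark.read.jdbc("jdbc:mysql://localhost:3306/mydb", pushdown, props)
```

You can confirm what was pushed down by checking the generated JDBC query in df.explain() output.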
Dear All,
I would like to know how, in Spark 2.0, I can split a dataframe into two
dataframes when I know the exact counts the two dataframes should have. I
tried using limit but got quite weird results. Also, I am looking for exact
counts in the child dataframes, not an approximate percentage-based split.
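One way to get an exact split (rather than randomSplit's approximate weights) is to index the rows with zipWithIndex and filter on the index. A sketch, assuming `spark` is the active SparkSession and `n` is the desired size of the first child (`splitExact` is a hypothetical helper name):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Split df into two DataFrames of exactly n and (count - n) rows.
// zipWithIndex assigns a stable 0-based index to every row, so the
// split is exact, unlike randomSplit's weight-based sampling.
def splitExact(spark: SparkSession, df: DataFrame, n: Long): (DataFrame, DataFrame) = {
  val schema  = df.schema
  val indexed = df.rdd.zipWithIndex().cache()   // (Row, index) pairs
  val first   = spark.createDataFrame(indexed.filter(_._2 < n).map(_._1), schema)
  val second  = spark.createDataFrame(indexed.filter(_._2 >= n).map(_._1), schema)
  (first, second)
}
```

The cache() avoids recomputing the indexed RDD for the second filter pass.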
Hi Everyone,
I am getting the following error while running a Spark Streaming example on
my local machine; the file being ingested is only 506 KB.
16/11/23 03:05:54 INFO MappedDStream: Slicing from 1479850537180 ms to
1479850537235 ms (aligned to 1479850537180 ms and 1479850537235 ms)
Exception
Hi Raghav,
Please refer to the following code:
SparkConf sparkConf = new
SparkConf().setMaster("local[2]").setAppName("PersonApp");
//creating java spark context
JavaSparkContext sc = new JavaSparkContext(sparkConf);
//reading file from HDFS into a Spark RDD; the name node is localhost
JavaRDD
Hi All,
After creating a direct stream like below:
val events = KafkaUtils.createDirectStream[String, String,
StringDecoder, StringDecoder](
ssc, kafkaParams, topicsSet)
I would like to convert the above stream into data frames, so that I could
run hive queries over it. Could anyone
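A common pattern is to convert each micro-batch to a DataFrame inside foreachRDD and register it as a temp view so it can be queried with SQL. A sketch against the direct stream above; the column names "key" and "value" are my assumption for the (key, value) pairs the stream yields:

```scala
import org.apache.spark.sql.SparkSession

// events is the DStream[(String, String)] created above
events.foreachRDD { rdd =>
  val spark = SparkSession.builder().config(rdd.sparkContext.getConf).getOrCreate()
  import spark.implicits._
  val df = rdd.toDF("key", "value")       // each micro-batch becomes a DataFrame
  df.createOrReplaceTempView("events")    // queryable via spark.sql(...)
  spark.sql("SELECT value FROM events").show()
}
```

For queries against actual Hive tables, build the session with .enableHiveSupport().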
Any idea anyone?
On Fri, Aug 14, 2015 at 10:11 AM, Mohit Durgapal durgapalmo...@gmail.com
wrote:
Hi All,
After creating a direct stream like below:
val events = KafkaUtils.createDirectStream[String, String,
StringDecoder, StringDecoder](
ssc, kafkaParams, topicsSet)
I would
Hi All,
I just wanted to know how the direct API for Spark Streaming compares with
the earlier receiver-based API. Has anyone used the direct-API-based approach
in production, or is it still only being used for PoCs?
Also, since I'm new to Spark, could anyone share a starting point from
where I could find a
I want to write a Spark Streaming consumer for Kafka in Java. I want to
process the data in real time as well as store the data in HDFS in
year/month/day/hour/ format. I am not sure how to achieve this. Should I
write separate Kafka consumers, one for writing data to HDFS and one for
Spark
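A single Spark Streaming job can usually cover both needs: inside the same foreachRDD you can process each micro-batch and also save it under a time-bucketed HDFS path, so separate consumers are not required. The helper below (hourlyPath is a hypothetical name, and the namenode URL is a placeholder) builds a year/month/day/hour path that an rdd.saveAsTextFile call could use:

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

// Build an HDFS-style output path bucketed by year/month/day/hour,
// intended for use inside foreachRDD, e.g.
//   rdd.saveAsTextFile(hourlyPath("hdfs://namenode/events", System.currentTimeMillis))
def hourlyPath(base: String, epochMillis: Long): String = {
  val fmt = new SimpleDateFormat("yyyy/MM/dd/HH")
  fmt.setTimeZone(TimeZone.getTimeZone("UTC"))  // fixed zone keeps buckets stable
  s"$base/${fmt.format(new Date(epochMillis))}"
}
```

Using the batch time rather than wall-clock time for epochMillis keeps reprocessed batches in their original buckets.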
Hi All,
I have a requirement where I need to consume messages from ActiveMQ and do
live stream processing as well as batch processing using Spark. Is there a
spark-plugin or library that can enable this? If not, then do you know any
other way this could be done?
Regards
Mohit
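As far as I know there is no ActiveMQ connector bundled with Spark itself, but the streaming side can be covered with a custom Receiver over ActiveMQ's JMS client. A sketch, assuming the activemq-client and spark-streaming dependencies are on the classpath; the broker URL and queue name are placeholders:

```scala
import javax.jms.{Message, MessageListener, Session, TextMessage}
import org.apache.activemq.ActiveMQConnectionFactory
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Custom receiver that pushes ActiveMQ text messages into Spark Streaming.
class ActiveMQReceiver(brokerUrl: String, queueName: String)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  def onStart(): Unit = {
    val connection = new ActiveMQConnectionFactory(brokerUrl).createConnection()
    val session  = connection.createSession(false, Session.AUTO_ACKNOWLEDGE)
    val consumer = session.createConsumer(session.createQueue(queueName))
    consumer.setMessageListener(new MessageListener {
      def onMessage(msg: Message): Unit = msg match {
        case t: TextMessage => store(t.getText)  // hand the payload to Spark
        case _              =>                   // ignore non-text messages
      }
    })
    connection.start()
  }

  def onStop(): Unit = { /* close the JMS connection here */ }
}

// Usage:
// val stream = ssc.receiverStream(new ActiveMQReceiver("tcp://localhost:61616", "events"))
```

The live-stream path then uses `stream` like any other DStream, while batch jobs can read whatever the streaming job persists to HDFS.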