Structured Streaming to Kafka Topic

2019-03-06 Thread Pankaj Wahane
Hi, I am using structured streaming for ETL.
    val data_stream = spark
      .readStream // constantly expanding dataframe
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "sms_history")
      .option("startingOffsets", "earliest") // begin from start of
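The preview above is cut off before the stream is loaded or written anywhere. As a rough sketch of what a Kafka-to-Kafka pipeline of this shape might look like, the following assumes the spark-sql-kafka-0-10 package is on the classpath; the output topic name, the checkpoint path, and the object name are hypothetical and do not come from the original message.

```scala
import org.apache.spark.sql.SparkSession

object KafkaEtlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-etl-sketch").getOrCreate()

    // Source: same options as in the message preview.
    val dataStream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "sms_history")
      .option("startingOffsets", "earliest")
      .load()

    // The Kafka sink expects a `value` column (and optionally `key`),
    // so the binary columns are cast to strings here.
    val query = dataStream
      .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
      .writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("topic", "sms_history_out") // hypothetical output topic
      .option("checkpointLocation", "/tmp/checkpoints/sms_history_out") // hypothetical path
      .start()

    query.awaitTermination()
  }
}
```

Any real ETL step would sit between `load()` and `writeStream`; it is omitted because the original message does not show it.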

Re: how to add colum to dataframe

2016-12-06 Thread Pankaj Wahane
You may want to try using df2.na.fill(…)
From: lk_spark
Date: Tuesday, 6 December 2016 at 3:05 PM
To: "user.spark"
Subject: how to add colum to dataframe
Hi all, my Spark version is 2.0. I have a parquet file with one column, name url, type is
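For readers unfamiliar with the suggestion, here is a minimal sketch of `na.fill` on a small DataFrame. The data, the `label` column, and the fill value are made up for illustration; the original question is truncated, so this may not match the poster's actual schema.

```scala
import org.apache.spark.sql.SparkSession

object NaFillSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("na-fill-sketch")
      .master("local[*]") // local mode, for illustration only
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data: `label` is null for the second row.
    val df2 = Seq(
      ("http://a.example", Some("seen")),
      ("http://b.example", None)
    ).toDF("url", "label")

    // Replace nulls in the string column `label` with a default value.
    val filled = df2.na.fill("unknown", Seq("label"))
    filled.show()

    spark.stop()
  }
}
```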

Re: Spark Streaming: java.lang.NoClassDefFoundError: org/apache/kafka/common/message/KafkaLZ4BlockOutputStream

2016-03-11 Thread Pankaj Wahane
The next thing you may want to check is whether the jar has been provided to all the executors in your cluster. Most of my class-not-found errors were resolved after making the required jars available in the SparkContext. Thanks.
From: Ted Yu > Date:
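One way to make jars available through the SparkContext, sketched under the assumption that the application is launched with spark-submit (which supplies the master URL). Both jar paths are placeholders, not paths from the original thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object JarsSketch {
  def main(args: Array[String]): Unit = {
    // Jars listed here are distributed to the executors when the context starts.
    val conf = new SparkConf()
      .setAppName("jars-sketch")
      .setJars(Seq("/path/to/kafka-clients.jar")) // placeholder path

    val sc = new SparkContext(conf)

    // A jar can also be added after the context has been created.
    sc.addJar("/path/to/another-dependency.jar") // placeholder path

    sc.stop()
  }
}
```

Passing the same paths to spark-submit with `--jars` is a common alternative that avoids hard-coding them in the application.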

Re: Question on take function - Spark Java API

2015-08-26 Thread Pankaj Wahane
Technologies http://www.nubetech.co/ Check out Reifier at Spark Summit 2015 https://spark-summit.org/2015/events/real-time-fuzzy-matching-with-spark-and-elastic-search/ http://in.linkedin.com/in/sonalgoyal On Wed, Aug 26, 2015 at 8:25 AM, Pankaj Wahane pankaj.wah...@qiotec.com

Question on take function - Spark Java API

2015-08-25 Thread Pankaj Wahane
Hi community members, Apache Spark is fantastic and very easy to learn. Awesome work! Question: I have multiple files in a folder, and the first line in each file is the name of the asset that the file belongs to. The second line is the CSV header row, and data starts from the third row. Ex: