Re: Spark streaming receivers

2020-08-09 Thread Dark Crusader
Hi Russell, This is super helpful, thank you so much. Can you elaborate on the differences between Structured Streaming and DStreams? How would the number of receivers required, etc., change? On Sat, 8 Aug 2020, 10:28 pm Russell Spitzer wrote: > Note, none of this applies to Direct streaming
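A minimal sketch of the receiver-less model under discussion, for readers following along: the Structured Streaming Kafka source (like the direct DStream approach Russell mentions) reads Kafka partitions on the executors themselves, so read parallelism follows the topic's partition count and there is no receiver count to size. The broker address and topic name below are hypothetical.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("receiverless-demo").getOrCreate()

    # No receivers: each executor pulls its assigned Kafka partitions directly.
    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")  # hypothetical broker
          .option("subscribe", "events")                        # hypothetical topic
          .load())

    query = (df.selectExpr("CAST(value AS STRING)")
             .writeStream
             .format("console")
             .start())
    query.awaitTermination()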

Re: [Spark-Kafka-Streaming] Verifying the approach for multiple queries

2020-08-09 Thread tianlangstudio
Hello, Sir! What about processing and grouping the data first, then writing the grouped data to Kafka topics A and B? Then read topic A or B from another Spark application and process it further, in the ETL sense. TianlangStudio Some of the biggest lies: I will start tomorrow / Others are better
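A sketch of the two-stage routing TianlangStudio describes, assuming the JSON records carry a "type" field (hypothetical, as are the topic names): Spark's Kafka sink routes each row to the topic named in its "topic" column when no fixed topic option is set, so one query can feed topics A and B, and a second application can then consume either topic.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("route-by-type").getOrCreate()

    src = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9092")  # hypothetical broker
           .option("subscribe", "raw-events")                    # hypothetical source topic
           .load())

    # Hypothetical rule: records whose JSON "type" is "a" go to topicA, the rest to topicB.
    routed = (src.selectExpr("CAST(value AS STRING) AS value")
              .withColumn("topic",
                          F.when(F.get_json_object("value", "$.type") == "a", "topicA")
                           .otherwise("topicB")))

    query = (routed.selectExpr("topic", "value")
             .writeStream
             .format("kafka")
             .option("kafka.bootstrap.servers", "localhost:9092")
             .option("checkpointLocation", "/tmp/chk-route")     # hypothetical path
             .start())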

[Spark-Kafka-Streaming] Verifying the approach for multiple queries

2020-08-09 Thread Amit Joshi
Hi, I have a scenario where a Kafka topic is being written with different types of JSON records. I have to regroup the records based on the type, then fetch the schema, parse them, and write them as parquet. I have tried Structured Streaming, but the dynamic schema is a constraint. So I have used
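One common workaround when a single stream mixes schemas, sketched below under assumptions (a JSON "type" field, two hypothetical schemas and output paths): split each micro-batch by type inside foreachBatch, parse each slice with its own schema via from_json, and write each slice to its own parquet path.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, LongType

    spark = SparkSession.builder.appName("per-type-parquet").getOrCreate()

    # Hypothetical schemas keyed by the record's "type" field.
    schemas = {
        "order": StructType([StructField("id", LongType()),
                             StructField("item", StringType())]),
        "click": StructType([StructField("id", LongType()),
                             StructField("url", StringType())]),
    }

    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9092")  # hypothetical broker
           .option("subscribe", "mixed-events")                  # hypothetical topic
           .load()
           .selectExpr("CAST(value AS STRING) AS json")
           .withColumn("type", F.get_json_object("json", "$.type")))

    def write_by_type(batch_df, batch_id):
        # Parse and persist each record type with its own schema.
        for t, schema in schemas.items():
            (batch_df.filter(F.col("type") == t)
                     .select(F.from_json("json", schema).alias("rec"))
                     .select("rec.*")
                     .write.mode("append")
                     .parquet(f"/tmp/out/{t}"))  # hypothetical output path

    query = (raw.writeStream
             .foreachBatch(write_by_type)
             .option("checkpointLocation", "/tmp/chk-types")     # hypothetical path
             .start())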

regexp_extract regex for extracting the columns from string

2020-08-09 Thread anbutech
Hi All, I have the following info in the data column: <1000> date=2020-08-01 time=20:50:04 name=processing id=123 session=new packt=20 orgin=null address=null dest=fgjglgl Here I want to create a separate column for each of the above key=value pairs after the integer <1000>, separated by spaces. Is there
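A sketch of one regexp_extract answer, assuming the column is named "data": since each pair is space-separated, the pattern key=(\S+) captures everything up to the next space, one extract per desired key.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kv-extract").getOrCreate()

    df = spark.createDataFrame(
        [("<1000> date=2020-08-01 time=20:50:04 name=processing id=123",)],
        ["data"])  # "data" is an assumed column name

    keys = ["date", "time", "name", "id"]  # extend with session, packt, orgin, address, dest
    out = df.select(
        "data",
        # regexp_extract returns "" when a key is absent from the row.
        *[F.regexp_extract("data", rf"{k}=(\S+)", 1).alias(k) for k in keys])
    out.show(truncate=False)

An alternative worth comparing is SQL's str_to_map after stripping the "<1000> " prefix, which yields all pairs in a single pass.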

Re: Spark batch job chaining

2020-08-09 Thread Jun Zhu
Hi, I am using Airflow in such a scenario.
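A minimal sketch of the Airflow chaining Jun describes, assuming Airflow's apache-spark provider is installed; the application paths and connection id are hypothetical. Each SparkSubmitOperator launches one Spark batch job, and the >> dependency makes the second job wait for the first to succeed.

    from datetime import datetime
    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    with DAG("spark_job_chain",
             start_date=datetime(2020, 8, 1),
             schedule_interval="@daily",
             catchup=False) as dag:
        extract = SparkSubmitOperator(
            task_id="extract",
            application="/jobs/extract.py",    # hypothetical Spark job
            conn_id="spark_default")
        transform = SparkSubmitOperator(
            task_id="transform",
            application="/jobs/transform.py",  # hypothetical Spark job
            conn_id="spark_default")
        extract >> transform  # transform runs only after extract succeeds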