Data ingestion into elastic failing using pyspark

2024-03-11 Thread Karthick Nk
Hi @all, I am using a PySpark program to write data into an Elasticsearch index using an upsert operation (sample code snippet below). def writeDataToES(final_df): write_options = { "es.nodes": elastic_host, "es.net.ssl": "false", "es.nodes.wan.only": "true",
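
A minimal sketch of how an upsert write like the one described above is commonly configured with the elasticsearch-hadoop connector. The original snippet is cut off, so everything beyond the three options it shows is an assumption: the helper names, the `es_port` and `id_column` parameters, and the choice of `es.mapping.id` as the document key are illustrative, not the poster's code.

```python
def build_es_upsert_options(es_host, es_port="9200", id_column="id"):
    """Return elasticsearch-hadoop write options for an upsert.

    es_port and id_column are assumed defaults for illustration only.
    """
    return {
        "es.nodes": es_host,
        "es.port": str(es_port),
        "es.net.ssl": "false",
        "es.nodes.wan.only": "true",
        # Upsert: update the document if the id exists, otherwise insert it.
        "es.write.operation": "upsert",
        # An upsert needs a column that supplies the document id.
        "es.mapping.id": id_column,
    }


def write_data_to_es(final_df, index_name, options):
    """Write a Spark DataFrame to an index (needs the es-hadoop jar on the classpath)."""
    (final_df.write
        .format("org.elasticsearch.spark.sql")
        .options(**options)
        .mode("append")
        .save(index_name))
```

With a running cluster this would be called as `write_data_to_es(final_df, "my_index", build_es_upsert_options(elastic_host))`, where `my_index` is a placeholder name. A missing or null id column is one frequent cause of failed upserts, so it is worth checking first.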

Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.

2024-01-09 Thread Mich Talebzadeh
024 at 19:11, ashok34...@yahoo.com wrote: > Hey Mich, > > Thanks for this introduction on your forthcoming proposal "Spark > Structured Streaming and Flask REST API for Real-Time Data Ingestion and > Analytics". I recently came across an article by Databricks with title

Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.

2024-01-09 Thread ashok34...@yahoo.com.INVALID
Hey Mich, Thanks for this introduction on your forthcoming proposal "Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics". I recently came across an article by Databricks with title Scalable Spark Structured Streaming for REST API Destinations.

Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.

2024-01-08 Thread Mich Talebzadeh
uction. On Mon, 8 Jan 2024 at 19:30, Mich Talebzadeh wrote: > Thought it might be useful to share my idea with fellow forum members. During > the breaks, I worked on the *seamless integration of Spark Structured > Streaming with Flask REST API for real-time data ingestion and analytics*.

Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.

2024-01-08 Thread Mich Talebzadeh
Thought it might be useful to share my idea with fellow forum members. During the breaks, I worked on the *seamless integration of Spark Structured Streaming with Flask REST API for real-time data ingestion and analytics*. The use case revolves around a scenario where data is generated through

Re: Data ingestion

2022-08-18 Thread Pasha Finkelshtein
to both mysql and hive > fluently. > > regards. > > > Akash Vellukai wrote: > > How we could do data ingestion from MySQL to Hive with the help of Spark > > streaming and not with Kafka

Re: Data ingestion

2022-08-17 Thread pengyh
From my experience, Spark can read/write from/to both MySQL and Hive fluently. Regards. Akash Vellukai wrote: How we could do data ingestion from MySQL to Hive with the help of Spark streaming and not with Kafka

Re: Data ingestion

2022-08-17 Thread Yuri Oleynikov (‫יורי אולייניקוב‬‎)
If you are on AWS, you can use RDS + AWS DMS to save data to S3 and then read streaming data with Spark Structured Streaming from S3 into Hive. Best regards > On 17 Aug 2022, at 20:51, Akash Vellukai wrote: > > Dear Sir, > > > How we could do data ingesti

Re: Data ingestion

2022-08-17 Thread Pasha Finkelshtein
asm0dey> <https://instagram.com/asm0dey> Pasha Finkelshteyn Developer Advocate for Data Engineering JetBrains asm0...@jetbrains.com https://linktr.ee/asm0dey Find out more <https://jetbrains.com> On Wed, 17 Aug 2022 at 19:51, Akash Vellukai wrote: > Dea

Data ingestion

2022-08-17 Thread Akash Vellukai
Dear Sir, How could we do data ingestion from MySQL to Hive with the help of Spark Streaming and not with Kafka? Thanks and regards, Akash

Re: [EXTERNAL] Re: Spark streaming - Data Ingestion

2022-08-17 Thread Akash Vellukai
> *Sent:* 17 August 2022 16:53 > *To:* Akash Vellukai > *Cc:* user@spark.apache.org > *Subject:* [EXTERNAL] Re: Spark streaming - Data Ingestion > > *Caution! This email originated outside of FedEx. Please do not open > attachments or click links from an unknown or suspicious origi

Re: [EXTERNAL] Re: Spark streaming - Data Ingestion

2022-08-17 Thread Gibson
>> Regards >> -- >> *From:* Gibson >> *Sent:* 17 August 2022 16:53 >> *To:* Akash Vellukai >> *Cc:* user@spark.apache.org >> *Subject:* [EXTERNAL] Re: Spark streaming - Data Ingestion >> >> *Caution! This email originat

Re: [EXTERNAL] Re: Spark streaming - Data Ingestion

2022-08-17 Thread Saurabh Gulati
> Hive Regards From: Gibson Sent: 17 August 2022 16:53 To: Akash Vellukai Cc: user@spark.apache.org Subject: [EXTERNAL] Re: Spark streaming - Data Ingestion Caution! This email originated outside of FedEx. Please do not open attachments or click links from an unknown or su

Re: Spark streaming - Data Ingestion

2022-08-17 Thread Gibson
If you have space for a message log like, then you should try: MySQL -> Kafka (via CDC) -> Spark (Structured Streaming) -> HDFS/S3/ADLS -> Hive On Wed, Aug 17, 2022 at 5:40 PM Akash Vellukai wrote: > Dear sir > > I have tried a lot on this could you help me with this?
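
The Kafka -> Spark (Structured Streaming) -> HDFS/S3/ADLS leg of the pipeline suggested above might look roughly like the sketch below. It assumes a CDC tool is already publishing MySQL changes to a Kafka topic and that the `spark-sql-kafka` package is available; the function names, the Parquet output format, and all paths and topic names are placeholders, not anything stated in the thread.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="earliest"):
    """Options for Spark's built-in Kafka source (argument values are placeholders)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def start_cdc_stream(spark, source_options, output_path, checkpoint_path):
    """Stream CDC events from Kafka into Parquet files that a Hive
    external table can be defined over. Returns the StreamingQuery."""
    raw = (spark.readStream
           .format("kafka")
           .options(**source_options)
           .load())
    # Kafka delivers key and value as binary; cast them to strings first.
    events = raw.selectExpr("CAST(key AS STRING) AS key",
                            "CAST(value AS STRING) AS value")
    return (events.writeStream
            .format("parquet")
            .option("path", output_path)
            .option("checkpointLocation", checkpoint_path)
            .outputMode("append")
            .start())
```

The file sink needs a checkpoint location to track which offsets have been written. Parsing the CDC payload in `value` into table columns depends on the CDC tool's message format and is left out here.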

Spark streaming - Data Ingestion

2022-08-17 Thread Akash Vellukai
Dear sir, I have tried a lot on this; could you help me with this? Data ingestion from MySQL to Hive with Spark Streaming? Could you give me an overview? Thanks and regards, Akash P

Re: Dynamic data ingestion into SparkSQL - Interesting question

2017-11-21 Thread Aakash Basu
Yes, I did the same. It's working. Thanks! On 21-Nov-2017 4:04 PM, "Fernando Pereira" wrote: > Did you consider doing string processing to build the SQL expression which > you can execute with spark.sql(...)? > Some examples: https://spark.apache.org/docs/latest/sql- >

Re: Dynamic data ingestion into SparkSQL - Interesting question

2017-11-21 Thread Fernando Pereira
Did you consider doing string processing to build the SQL expression which you can execute with spark.sql(...)? Some examples: https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables Cheers On 21 November 2017 at 03:27, Aakash Basu wrote: > Hi all,
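
A small sketch of the string-processing approach suggested above, using the four columns named in the original question of this thread (Expression, filter_condition, from_clause, group_by_columns). The function name and the sample values are made up for illustration, and the sketch assumes the table rows come from a trusted source, since the strings are pasted into SQL unescaped.

```python
def build_kpi_query(expression, from_clause, filter_condition=None, group_by_columns=None):
    """Assemble one SQL statement from a row of the KPI definition table.

    Empty or missing filter / group-by values are simply skipped.
    """
    parts = [f"SELECT {expression}", f"FROM {from_clause}"]
    if filter_condition:
        parts.append(f"WHERE {filter_condition}")
    if group_by_columns:
        parts.append(f"GROUP BY {group_by_columns}")
    return " ".join(parts)


# Hypothetical row from the definition table.
query = build_kpi_query("SUM(amount) AS total", "sales", "year = 2017", "region")
print(query)
```

Each generated string can then be passed to `spark.sql(query)`, one call per row of the definition table.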

Re: Dynamic data ingestion into SparkSQL - Interesting question

2017-11-20 Thread Aakash Basu
Hi all, Any help? PFB. Thanks, Aakash. On 20-Nov-2017 6:58 PM, "Aakash Basu" wrote: > Hi all, > > I have a table which will have 4 columns - > > | Expression|filter_condition| from_clause| > group_by_columns| > > > This file may have variable

Dynamic data ingestion into SparkSQL - Interesting question

2017-11-20 Thread Aakash Basu
Hi all, I have a table which will have 4 columns - | Expression|filter_condition| from_clause| group_by_columns| This file may have variable number of rows depending on the no. of KPIs I need to calculate. I need to write a SparkSQL program which will have to read this

Re: jdbcRDD for data ingestion from RDBMS

2016-10-18 Thread Mich Talebzadeh
Hi, If we are talking about billions of records and depending on your network and RDBMS with parallel connections, from my experience it works OK for Dimension tables of moderate size, in that you can have parallel connections to RDBMS (assuming the RDBMS has a primary key/unique column) to
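
The parallel connections mentioned above are usually obtained through the partitioning options of Spark's JDBC data source, which split the read into ranges of a numeric (or date/timestamp) column. A rough sketch follows; the helper names are illustrative, and the URL, table, column, and bounds are values the caller must supply for their own database.

```python
def build_jdbc_read_options(url, table, partition_column,
                            lower_bound, upper_bound, num_partitions):
    """Options for a partitioned read with Spark's JDBC data source.

    Spark issues num_partitions queries, each covering a slice of
    partition_column; the bounds set the stride and do not filter rows.
    """
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": partition_column,
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
    }


def read_table_in_parallel(spark, options):
    """Load the table as a DataFrame (the JDBC driver jar must be on the classpath)."""
    return spark.read.format("jdbc").options(**options).load()
```

`numPartitions` also caps the number of concurrent connections, so it should be sized to what the database can tolerate.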

Re: jdbcRDD for data ingestion from RDBMS

2016-10-18 Thread Teng Qiu
Hi Ninad, I believe the purpose of jdbcRDD is to use an RDBMS as an additional data source during data processing; the main goal of Spark is still analyzing data from an HDFS-like file system. Using Spark as a data integration tool to transfer billions of records from RDBMS to HDFS etc. could work, but

Fwd: jdbcRDD for data ingestion from RDBMS

2016-10-17 Thread Ninad Shringarpure
Hi Team, One of my client teams is trying to see if they can use Spark to source data from RDBMS instead of Sqoop. Data would be substantially large, on the order of billions of records. I am not sure, from reading the documentation, whether jdbcRDD by design is going to be able to scale well for this

Re: Schedule lunchtime today for a free webinar IoT data ingestion in Spark Streaming using Kaa 11 a.m. PDT (2 p.m. EDT)

2015-08-04 Thread orozvadovskyy
Hi there! If you missed our webinar on IoT data ingestion in Spark with KaaIoT, see the video and slides here: http://goo.gl/VMyQ1M We recorded our webinar on “IoT data ingestion in Spark Streaming using Kaa” for those who couldn’t see it live or who would like to refresh what they have

Schedule lunchtime today for a free webinar IoT data ingestion in Spark Streaming using Kaa 11 a.m. PDT (2 p.m. EDT)

2015-07-23 Thread Oleh Rozvadovskyy
Hi there! Only a couple of hours are left until our first webinar on *IoT data ingestion in Spark Streaming using Kaa*. During the webinar we will build a solution that ingests real-time data from Intel Edison into Apache Spark for stream processing. This solution includes a client, middleware

Re: How to speed up data ingestion with Spark

2015-05-12 Thread Akhil Das
and partitions affects the throughput. And this article https://www.sigmoid.com/creating-sigview-a-real-time-analytics-dashboard/ has a use-case. Thanks Best Regards On Tue, May 12, 2015 at 8:25 PM, dgoldenberg dgoldenberg...@gmail.com wrote: Hi, I'm looking at a data ingestion implementation

How to speed up data ingestion with Spark

2015-05-12 Thread dgoldenberg
Hi, I'm looking at a data ingestion implementation which streams data out of Kafka with Spark Streaming, then uses a multi-threaded pipeline engine to process the data in each partition. Have folks looked at ways of speeding up this type of ingestion? Let's say the main part of the ingest