Re: How to programmatically pause and resume Spark/Kafka structured streaming?

2019-08-06 Thread Gourav Sengupta
Hi, there is a method to run a streaming query only once per invocation in Spark. I use it for reading files with streaming; maybe you can try that. Regards, Gourav On Tue, 6 Aug 2019, 21:50 kant kodali, wrote: > If I stop and start while processing the batch what will happen? will that > batch gets canceled and gets
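The "run only once" approach Gourav describes sounds like Trigger.Once, which processes everything available as one batch, commits offsets to the checkpoint, and stops; re-running picks up only new data. A minimal sketch (paths and schema source are hypothetical, not from the thread):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object RunOnceSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("run-once-sketch").getOrCreate()

    // Streaming file sources need an explicit schema; borrowing it from a
    // batch read of a sample file is one common trick (hypothetical path).
    val schema = spark.read.json("/data/in/sample.json").schema

    val input = spark.readStream
      .format("json")
      .schema(schema)
      .load("/data/in") // hypothetical input directory

    // Trigger.Once: process all available data, checkpoint, then stop.
    val query = input.writeStream
      .format("parquet")
      .option("path", "/data/out")               // hypothetical path
      .option("checkpointLocation", "/data/chk") // hypothetical path
      .trigger(Trigger.Once())
      .start()

    query.awaitTermination()
  }
}
```

Scheduling this job externally (e.g. cron) gives a pause/resume-like cadence without keeping a query running.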

unsubscribe

2019-08-06 Thread Information Technologies
unsubscribe

Re: Announcing Delta Lake 0.3.0

2019-08-06 Thread Nicolas Paris
> • Scala/Java APIs for DML commands - You can now modify data in Delta Lake tables using programmatic APIs for Delete, Update and Merge. These APIs mirror the syntax and semantics of their corresponding SQL commands and are great for many workloads, e.g., Slowly Changing
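For reference, the Scala DML APIs announced in 0.3.0 hang off io.delta.tables.DeltaTable. A hedged sketch (table paths and column names below are invented for illustration):

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

val spark  = SparkSession.builder.appName("delta-dml-sketch").getOrCreate()
val target = DeltaTable.forPath(spark, "/delta/events") // hypothetical path

// DELETE rows matching a SQL predicate.
target.delete("date < '2019-01-01'")

// UPDATE: condition plus a map of column -> SQL expression string.
target.updateExpr("eventType = 'clck'", Map("eventType" -> "'click'"))

// MERGE (upsert) from a batch DataFrame of updates.
val updates = spark.read.parquet("/staging/events") // hypothetical path
target.as("t")
  .merge(updates.as("s"), "t.eventId = s.eventId")
  .whenMatched.updateAll()
  .whenNotMatched.insertAll()
  .execute()
```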

How to read configuration file parameters in Spark without mapping each parameter

2019-08-06 Thread Mich Talebzadeh
Hi, assume that I have a configuration file as below with static parameters, some Strings, Integers and Doubles: md_AerospikeAerospike { dbHost = "rhes75" dbPort = "3000" dbConnection = "trading_user_RW" namespace = "trading" dbSetRead = "MARKETDATAAEROSPIKEBATCH" dbSetWrite =
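That syntax looks like HOCON, which the Typesafe Config library parses directly, so each parameter can be fetched by dotted path, or the whole block iterated, without hand-mapping every field. A sketch assuming the block is named md_Aerospike and lives in app.conf (both assumptions, not from the message):

```scala
import com.typesafe.config.ConfigFactory
import java.io.File
import scala.collection.JavaConverters._

val config = ConfigFactory.parseFile(new File("app.conf")).resolve()

// Typed access by path; Typesafe Config coerces "3000" to an Int.
val dbHost    = config.getString("md_Aerospike.dbHost")
val dbPort    = config.getInt("md_Aerospike.dbPort")
val namespace = config.getString("md_Aerospike.namespace")

// Or iterate every key under the block instead of naming each one:
val allParams: Map[String, Any] =
  config.getConfig("md_Aerospike").entrySet().asScala
    .map(e => e.getKey -> e.getValue.unwrapped())
    .toMap
```

If you do want strongly typed mapping into a case class, libraries such as pureconfig can derive it automatically, but that is optional.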

unsubscribe

2019-08-06 Thread Peter Willis
unsubscribe - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Re: How to programmatically pause and resume Spark/Kafka structured streaming?

2019-08-06 Thread kant kodali
If I stop and start while processing a batch, what will happen? Will that batch get canceled and reprocessed when I call start? Does that mean I need to worry about duplicates downstream? Kafka consumers have pause and resume and they work just fine, so I am not sure why
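On the duplicate concern: with a checkpoint location configured, a query stopped mid-batch replays that batch on restart, so exactly-once delivery depends on the sink being idempotent. One common pattern (a sketch, not the only answer) is foreachBatch, keying the write on the replayed batchId; the output layout below is hypothetical:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// ... given a streaming DataFrame `events` read from Kafka ...
val query = events.writeStream
  .option("checkpointLocation", "/chk/events") // hypothetical path
  .foreachBatch { (batchDf: DataFrame, batchId: Long) =>
    // After an interrupted run, Spark replays the same batchId with the
    // same data, so overwriting a batchId-keyed location makes the
    // replay a harmless no-op instead of a duplicate.
    batchDf.write
      .mode("overwrite")
      .parquet(s"/out/events/batch=$batchId") // hypothetical layout
  }
  .start()
```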

CVE-2019-10099: Apache Spark unencrypted data on local disk

2019-08-06 Thread Imran Rashid
Severity: Important. Vendor: The Apache Software Foundation. Versions affected: all Spark 1.x, 2.0.x, 2.1.x and 2.2.x versions; Spark 2.3.0 to 2.3.2. Description: Prior to Spark 2.3.3, in certain situations Spark would write user data to local disk unencrypted, even if
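The usual remediation for this class of advisory is upgrading to a fixed release (2.3.3 or later here) and, for deployments handling sensitive data, enabling local-disk I/O encryption. A hedged sketch of the relevant spark-defaults.conf entries (verify the exact names and defaults against the security documentation for your version):

```
# spark-defaults.conf
spark.io.encryption.enabled       true
spark.io.encryption.keySizeBits   128
```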

Re: Hive external table not working in sparkSQL when subdirectories are present

2019-08-06 Thread Mich Talebzadeh
Which versions of Spark and Hive are you using? What happens if you use Parquet tables instead? HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Hive external table not working in sparkSQL when subdirectories are present

2019-08-06 Thread Rishikesh Gawade
Hi. I have built a Hive external table on top of a directory 'A' which has data stored in ORC format. This directory has several subdirectories inside it, each of which contains the actual ORC files. These subdirectories are actually created by spark jobs which ingest data from other sources and
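A common workaround for Hive external tables whose ORC files sit in nested subdirectories is enabling recursive directory listing; whether each setting takes effect depends on the Spark and Hive versions in play, so treat these as options to try rather than a guaranteed fix:

```scala
// Ask the Hive input format to descend into subdirectories:
spark.sql("SET mapreduce.input.fileinputformat.input.dir.recursive=true")
spark.sql("SET hive.mapred.supports.subdirectories=true")

// Alternatively, make Spark read the table through Hive's ORC serde
// instead of its native ORC reader, which lists paths differently:
spark.conf.set("spark.sql.hive.convertMetastoreOrc", "false")
```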