Hi
There is a method to iterate only once in Spark. I use it for reading files
using streaming. Maybe you can try that.
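The "iterate only once" approach is presumably Structured Streaming's Trigger.Once, which processes whatever input is currently available as a single batch and then stops. A minimal sketch, assuming a Spark 2.2+ classpath and hypothetical input/output/checkpoint paths:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder().appName("read-files-once").getOrCreate()

// The text file source needs no user-supplied schema; the paths are hypothetical.
val lines = spark.readStream.format("text").load("/data/incoming")

// Trigger.Once() runs one batch over all currently available files and then the
// query terminates on its own; a later rerun picks up only files added since,
// because the checkpoint remembers which files were already processed.
lines.writeStream
  .format("parquet")
  .option("path", "/data/processed")
  .option("checkpointLocation", "/data/checkpoints/read-once")
  .trigger(Trigger.Once())
  .start()
  .awaitTermination()
```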
Regards,
Gourav
On Tue, 6 Aug 2019, 21:50 kant kodali wrote:
> If I stop and start while processing the batch what will happen? Will that
> batch get canceled and get
--
This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are
addressed. They may not be disseminated or distributed to
> • Scala/Java APIs for DML commands - You can now modify data in Delta Lake
> tables using programmatic APIs for Delete, Update and Merge. These APIs
> mirror the syntax and semantics of their corresponding SQL commands and
> are great for many workloads, e.g., Slowly Changing Dimensions.
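A minimal sketch of those DML APIs, assuming the Delta Lake (delta-core) dependency is on the classpath; the table path, column names and the /staging source used here are hypothetical:

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit}

val spark = SparkSession.builder().appName("delta-dml").getOrCreate()

// Hypothetical Delta table stored at this path.
val events = DeltaTable.forPath(spark, "/delta/events")

// Delete: remove all rows older than a cutoff date.
events.delete(col("date") < "2019-01-01")

// Update: rewrite a column value for matching rows.
events.update(col("eventType") === "clck", Map("eventType" -> lit("click")))

// Merge (upsert): apply a source DataFrame of changes keyed on eventId.
val updates = spark.read.parquet("/staging/updates")
events.as("t")
  .merge(updates.as("s"), "t.eventId = s.eventId")
  .whenMatched().updateAll()
  .whenNotMatched().insertAll()
  .execute()
```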
Hi,
Assume that I have a configuration file as below with static parameters
(some Strings, Integers and Doubles):
md_AerospikeAerospike {
  dbHost = "rhes75"
  dbPort = "3000"
  dbConnection = "trading_user_RW"
  namespace = "trading"
  dbSetRead = "MARKETDATAAEROSPIKEBATCH"
  dbSetWrite =
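A file in this shape is normally loaded with the Typesafe Config (HOCON) library via ConfigFactory.load(), which gives typed getters such as getString and getInt. As a standalone illustration of pulling typed values out of such a block (a real project should use the library, not this), here is a minimal regex-based sketch:

```scala
// Standalone illustration only: a real project would load this file with the
// Typesafe Config library (com.typesafe.config.ConfigFactory.load()).
object MiniConf {
  // Matches key = "value" entries inside a HOCON-like block.
  private val entry = "(\\w+)\\s*=\\s*\"([^\"]*)\"".r

  def parse(text: String): Map[String, String] =
    entry.findAllMatchIn(text).map(m => m.group(1) -> m.group(2)).toMap
}

val conf = MiniConf.parse(
  """
    |md_AerospikeAerospike {
    |  dbHost = "rhes75"
    |  dbPort = "3000"
    |  namespace = "trading"
    |}
  """.stripMargin)

val dbHost: String = conf("dbHost")
val dbPort: Int    = conf("dbPort").toInt // quoted in the file, converted to Int here
```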
If I stop and start while processing a batch, what will happen? Will that
batch get canceled and get reprocessed again when I click start? Does
that mean I need to worry about duplicates downstream? Kafka
consumers have a pause and resume and they work just fine, so I am not sure
why
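On the stop/start question: with Structured Streaming, a query restarted against the same checkpointLocation replays the interrupted micro-batch from the offsets recorded in the checkpoint, so whether duplicates reach the downstream depends on the sink being idempotent or transactional (the built-in file sink is per-batch exactly-once; a plain foreach sink is not). A sketch with hypothetical broker, topic and paths:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("restartable-query").getOrCreate()

val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // hypothetical broker
  .option("subscribe", "events")                       // hypothetical topic
  .load()

val query = df.writeStream
  .format("parquet")                                   // file sink: exactly-once per batch
  .option("path", "/data/out")
  .option("checkpointLocation", "/data/checkpoint")    // offsets + commit log live here
  .start()

// stop() interrupts the in-flight micro-batch; on the next start() with the
// same checkpointLocation, Spark replays that batch from the recorded offsets
// rather than losing or silently skipping it.
query.stop()
```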
Severity: Important
Vendor: The Apache Software Foundation
Versions affected:
All Spark 1.x, Spark 2.0.x, Spark 2.1.x, and 2.2.x versions
Spark 2.3.0 to 2.3.2
Description:
Prior to Spark 2.3.3, in certain situations Spark would write user data to
local disk unencrypted, even if
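For reference, the setting involved is spark.io.encryption.enabled, which tells Spark to encrypt temporary data (shuffle and spill files) written to local disk; per the advisory it is only honored reliably from the fixed versions onward. A spark-defaults.conf sketch:

```properties
# spark-defaults.conf (sketch): encrypt temporary data spilled to local disk.
spark.io.encryption.enabled        true
spark.io.encryption.keySizeBits    128
```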
Which versions of Spark and Hive are you using?
What will happen if you use Parquet tables instead?
HTH
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
Hi.
I have built a Hive external table on top of a directory 'A' which has data
stored in ORC format. This directory has several subdirectories inside it,
each of which contains the actual ORC files.
These subdirectories are actually created by spark jobs which ingest data
from other sources and
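Reading those nested ORC files through Hive usually requires telling Hive to list the table's input directories recursively; the session settings below exist in stock Hive, though exact behavior varies by version. A sketch, with a hypothetical table name:

```sql
-- Make Hive traverse subdirectories of the external table's location.
SET mapreduce.input.fileinputformat.input.dir.recursive=true;
SET hive.mapred.supportSubDirectories=true;

-- Hypothetical external table built over directory 'A'.
SELECT COUNT(*) FROM orc_table_on_A;
```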