Hi Will,
Have you tried using S3 as the state store with the EMR option for faster
file sync enabled? There is also now the option of using FSx for Lustre.
Thanks and Regards,
Gourav Sengupta
On Wed, Jan 15, 2020 at 5:17 AM William Briggs wrote:
Hi all, I've got a problem that really has me stumped. I'm running a
Structured Streaming query that reads from Kafka, performs some
transformations and stateful aggregations (using flatMapGroupsWithState),
and outputs any updated aggregates to another Kafka topic.
I'm running this job using Spark
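For readers unfamiliar with this kind of pipeline, a minimal sketch of a Kafka-to-Kafka stateful aggregation with flatMapGroupsWithState follows. All names here (topics, broker address, record format, checkpoint path) are made-up placeholders, not Will's actual code:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

case class Event(key: String, value: Long)
case class Agg(key: String, total: Long)

val spark = SparkSession.builder.appName("stateful-agg").getOrCreate()
import spark.implicits._

// Read from Kafka; each record's value is assumed to be "key,value".
val events = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // assumed address
  .option("subscribe", "input-topic")               // assumed topic
  .load()
  .selectExpr("CAST(value AS STRING) AS raw")
  .as[String]
  .map { raw =>
    val Array(k, v) = raw.split(",")
    Event(k, v.toLong)
  }

// Keep a running total per key; emit the updated aggregate each batch.
val updated = events
  .groupByKey(_.key)
  .flatMapGroupsWithState(OutputMode.Update, GroupStateTimeout.NoTimeout) {
    (key: String, rows: Iterator[Event], state: GroupState[Long]) =>
      val total = state.getOption.getOrElse(0L) + rows.map(_.value).sum
      state.update(total)
      Iterator(Agg(key, total))
  }

// Write updated aggregates back to Kafka as "key,total" strings.
updated
  .map(a => s"${a.key},${a.total}")
  .toDF("value")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("topic", "output-topic")
  .option("checkpointLocation", "/tmp/checkpoints") // assumed path
  .outputMode("update")
  .start()
```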
It only makes sense if the underlying file is also splittable, and even
then it doesn't really do anything for you if you don't explicitly tell
Spark about the split boundaries.
On Tue, Jan 14, 2020 at 7:36 PM Someshwar Kale wrote:
I would suggest using another compression technique that is splittable,
e.g. bzip2, LZO, or LZ4.
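To illustrate the difference: a large bzip2-compressed text file can be split across several partitions out of the box, whereas a gzip file of the same size yields a single partition. Paths here are assumptions:

```scala
// bzip2 is splittable, so Spark can divide one large .bz2 file across
// multiple partitions; gzip, in contrast, gives exactly one partition.
val df = spark.read.text("/data/large-file.txt.bz2") // assumed path
println(df.rdd.getNumPartitions)

// Writing with a splittable codec:
df.write
  .option("compression", "bzip2")
  .text("/data/out") // assumed path
```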
On Wed, Jan 15, 2020, 1:32 AM Enrico Minack wrote:
Hi,
Spark does not support 7z natively, but you can read any file in Spark:
import org.apache.spark.input.PortableDataStream
import spark.implicits._ // for .toDF

def read(stream: PortableDataStream): Iterator[String] =
  Seq(stream.getPath()).iterator

spark.sparkContext
  .binaryFiles("*.7z")
  .flatMap(file => read(file._2))
  .toDF("path")
  .show(false)
This scales with the
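The read function in Enrico's snippet only returns the file path. A sketch of a version that actually extracts text lines from each archive, assuming Apache Commons Compress (1.20 or later) is on the classpath and that each archive fits in executor memory:

```scala
import scala.io.Source
import org.apache.commons.compress.archivers.sevenz.SevenZFile
import org.apache.commons.compress.utils.SeekableInMemoryByteChannel
import org.apache.spark.input.PortableDataStream

// Decompress every non-directory entry of the 7z archive and yield its
// text lines. stream.toArray() loads the whole archive into memory,
// which SevenZFile needs because 7z requires seekable input.
def read(stream: PortableDataStream): Iterator[String] = {
  val sevenZ = new SevenZFile(new SeekableInMemoryByteChannel(stream.toArray()))
  Iterator
    .continually(sevenZ.getNextEntry)
    .takeWhile(_ != null)
    .filterNot(_.isDirectory)
    .flatMap(entry => Source.fromInputStream(sevenZ.getInputStream(entry)).getLines())
}
```

This keeps the driver-side plumbing identical — only the per-file reader changes.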
Regards
Sanjiv Singh
Mob : +1 571-599-5236
Hello everyone!
I am trying to get data from a DB2 table whose columns have names with
non-ASCII (Cyrillic) symbols, but I get an error from the JDBC driver with
"SQLCODE=-206" (object-name IS NOT VALID IN THE CONTEXT WHERE IT IS
USED), and SQLERRMC consists of the name of this column and the added parts
";N*.N*" lik
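One possible workaround worth trying, sketched below: instead of letting Spark generate the column list from the table name, pass a subquery through the dbtable option with the identifiers explicitly quoted (delimited). Host, database, schema, table, and column names here are made-up placeholders:

```scala
// Select the columns yourself with quoted (delimited) identifiers, so
// the non-ASCII names reach DB2 exactly as written.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:db2://host:50000/MYDB")          // assumed URL
  .option("dbtable", """(SELECT "КОЛОНКА" AS col1 FROM MYSCHEMA.MYTABLE)""")
  .option("user", "user")                               // assumed credentials
  .option("password", "password")
  .load()
```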