Are there any alternatives to Hive's "stored by" clause, as Spark 2.0 does not support it?

2018-02-07 Thread Pralabh Kumar
Hi, Spark 2.0 doesn't support STORED BY. Is there any alternative to achieve the same?
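
For context, Hive's STORED BY delegates a table to a storage-handler class; the closest analogue in Spark SQL is the data source API's USING clause. A minimal sketch, assuming a SparkSession named spark is in scope (the table name and path are made up):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("using-clause-demo").getOrCreate()

// Spark SQL has no STORED BY, but CREATE TABLE ... USING plugs a data
// source into a similar role. Table name and path are hypothetical.
spark.sql("""
  CREATE TABLE events
  USING parquet
  OPTIONS (path '/warehouse/events')
""")

For a storage handler like Hive's HBase handler, the equivalent would be a third-party Spark connector for that system rather than Spark SQL syntax.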

[CFP] DataWorks Summit, San Jose, 2018

2018-02-07 Thread Yanbo Liang
Hi All, DataWorks Summit, San Jose, 2018 is a good place to share your experience with advanced analytics, data science, machine learning, and deep learning. We have an Artificial Intelligence and Data Science track covering technologies such as Apache Spark, Scikit-learn, TensorFlow, Keras,

unsubscribe

2018-02-07 Thread dmp
unsubscribe - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Issue with EFS checkpoint

2018-02-07 Thread Khan, Obaidur Rehman
Hello, we have a Spark cluster with 3 worker nodes running as EC2 instances on AWS. The Spark application runs in cluster mode and the checkpoints are stored in EFS. The Spark version used is 2.2.0. We noticed the error below coming up; our understanding was that this intermittent checkpoint issue
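
For reference, a minimal sketch of the setup being described: a Structured Streaming query checkpointing to an EFS mount. The mount point and the query itself are assumptions, not from the original report; EFS must be mounted at the same path on every node so driver and executors see the same checkpoint directory.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("efs-checkpoint-demo")
  .getOrCreate()

// Built-in "rate" test source emits one row per second (Spark 2.2+).
val counts = spark.readStream
  .format("rate")
  .load()
  .groupBy("value")
  .count()

// checkpointLocation points at the EFS mount (path is an assumption).
counts.writeStream
  .format("console")
  .option("checkpointLocation", "file:///mnt/efs/checkpoints/demo")
  .outputMode("complete")
  .start()
  .awaitTermination()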

Re: Sharing Spark executor pool across multiple long-running Spark applications

2018-02-07 Thread Vadim Semenov
The other way might be to launch a single SparkContext and then run jobs inside of it. You can take a look at these projects:
- https://github.com/spark-jobserver/spark-jobserver#persistent-context-mode---faster--required-for-related-jobs
- http://livy.incubator.apache.org
Problems with this
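
A minimal sketch of the single-SparkContext approach Vadim describes: jobs submitted from separate threads, with FAIR scheduling so they share the executor pool rather than queueing FIFO. Pool names and job bodies here are made up for illustration.

import org.apache.spark.sql.SparkSession

// One shared context for every job; FAIR mode lets concurrent jobs
// share executors instead of running one after another.
val spark = SparkSession.builder()
  .appName("shared-context")
  .config("spark.scheduler.mode", "FAIR")
  .getOrCreate()

// Pool assignment is per-thread, via a local property on SparkContext.
def runInPool(pool: String)(job: => Unit): Thread = {
  val t = new Thread(new Runnable {
    def run(): Unit = {
      spark.sparkContext.setLocalProperty("spark.scheduler.pool", pool)
      job
    }
  })
  t.start()
  t
}

val etl   = runInPool("etl")   { spark.range(1000000L).count() }
val adhoc = runInPool("adhoc") { spark.range(1000L).count() }
Seq(etl, adhoc).foreach(_.join())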

How to preserve the order of parquet files?

2018-02-07 Thread Kevin Jung
Hi all, in Spark 2.2.1, when I load parquet files, the result comes back in a different order from the original dataset. It seems the FileSourceScanExec.createNonBucketedReadRDD method sorts the parquet file splits by their lengths: val splitFiles = selectedPartitions.flatMap { partition =>
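
Since split scheduling (and parallel reads generally) make row order non-deterministic, the usual workaround is to make the ordering explicit in the data and sort after loading. A hedged sketch; the path and the "seq" ordering column are assumptions about the dataset:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("parquet-order").getOrCreate()

// Never rely on file or split order; re-impose order from a column.
val df = spark.read.parquet("/data/events").orderBy("seq")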

Spark CEP with files and no streams?

2018-02-07 Thread Esa Heikkinen
Hello, I am trying to use Spark's CEP on log files (as a batch job), not on streams (in real time). Is that possible? If yes, do you know of example Scala code for that? Or should I convert the log files (with timestamps) into streams? But how would I handle the timestamps in Spark? If I can
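
One way to treat timestamped log files as a stream is Structured Streaming's file source, taking event time from the log's own timestamp field. A hedged sketch; the directory, line layout, and timestamp format are assumptions, and this shows event-time windowing rather than a full CEP pattern library:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{to_timestamp, window}

val spark = SparkSession.builder().appName("log-cep").getOrCreate()
import spark.implicits._

// The file source replays files dropped into a directory as a stream,
// so batch log files get streaming (event-time) semantics. Assumes each
// line starts with a "yyyy-MM-dd HH:mm:ss" timestamp.
val logs = spark.readStream
  .text("/logs/incoming")
  .select(
    to_timestamp($"value".substr(1, 19), "yyyy-MM-dd HH:mm:ss").as("ts"),
    $"value".as("line"))
  .withWatermark("ts", "10 minutes")

// Example: count events per 1-minute event-time window.
val counts = logs.groupBy(window($"ts", "1 minute")).count()

counts.writeStream
  .format("console")
  .outputMode("update")
  .start()
  .awaitTermination()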