Re: [UPDATE] Apache Spark 3.1.0 Release Window

2020-10-12 Thread Xiao Li
Thank you, Dongjoon.

Xiao

On Mon, Oct 12, 2020 at 4:19 PM Dongjoon Hyun wrote:
> Hi, All.
>
> Apache Spark 3.1.0 Release Window is adjusted like the following today.
> Please check the latest information on the official website.
>
> -

[UPDATE] Apache Spark 3.1.0 Release Window

2020-10-12 Thread Dongjoon Hyun
Hi, All.

Apache Spark 3.1.0 Release Window is adjusted like the following today. Please check the latest information on the official website.

- https://github.com/apache/spark-website/commit/0cd0bdc80503882b4737db7e77cc8f9d17ec12ca
- https://spark.apache.org/versioning-policy.html

The simplest syntax for Spark/Scala collect.foreach(println) in PySpark

2020-10-12 Thread Mich Talebzadeh
Hi,

In Spark/Scala one can do:

scala> println("\nStarted at"); spark.sql("SELECT from_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss')").collect.foreach(println)

Started at
[12/10/2020 22:29:19.19]

I believe foreach(println) is a special syntax in this case. I can also do a verbose one
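A minimal PySpark sketch of one possible equivalent (assuming an existing SparkSession; the builder call below just recreates the one spark-shell provides automatically): collect() returns a plain Python list of Row objects, so an ordinary for loop plays the role of foreach(println):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("foreach-println").getOrCreate()

    print("\nStarted at")
    # collect() pulls the result rows to the driver as a Python list of Row objects
    for row in spark.sql(
            "SELECT from_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss')").collect():
        print(row)

For a one-liner closer to the Scala original, a comprehension such as [print(r) for r in df.collect()] also works, though the explicit loop is the more idiomatic Python.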

Re: Spark as computing engine vs spark cluster

2020-10-12 Thread Kushagra Deep
Hi Santosh,

Spark is a distributed computation engine. You may ask: why distributed? The answer is that when work is distributed, memory and cores can be added so that data is processed in parallel at scale. Since it is difficult to scale vertically, we scale horizontally.

Thanks and Regards

Re: Spark as computing engine vs spark cluster

2020-10-12 Thread Mich Talebzadeh
Hi Santosh,

Generally speaking, there are two ways of making a process faster:

1. Do more intelligent work, by creating indexes, cubes etc., thus reducing the processing time
2. Throw hardware and memory at it, using something like a Spark multi-node cluster with a fully managed cloud service

Re: Spark as computing engine vs spark cluster

2020-10-12 Thread Jeff Evans
Spark is a computation engine that runs on a set of distributed nodes. You must "bring your own" hardware, although of course there are hosted solutions available.

On Sat, Oct 10, 2020 at 9:24 AM Santosh74 wrote:
> Is spark compute engine only or it's also cluster which comes with set of
>
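To make the "bring your own hardware" point concrete, here is a small sketch (the cluster master URL is a hypothetical placeholder, not a real endpoint): the same application code runs unchanged whether Spark executes on local threads or on executors running on hardware you provide:

    from pyspark.sql import SparkSession

    # Local mode: Spark acts purely as a computation engine inside this one
    # process, using local threads as executors -- no cluster hardware at all.
    spark = SparkSession.builder.master("local[*]").appName("engine-demo").getOrCreate()

    # Cluster mode: identical application code, but the master URL points at
    # infrastructure you supply (hypothetical standalone master shown here):
    # spark = (SparkSession.builder
    #          .master("spark://master-host:7077")  # assumption: your own cluster
    #          .appName("engine-demo")
    #          .getOrCreate())

    # The engine does the computing; where it runs depends only on the master.
    print(spark.range(10).count())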