Thank you, Dongjoon
Xiao
On Mon, Oct 12, 2020 at 4:19 PM Dongjoon Hyun wrote:
> Hi, All.
>
> The Apache Spark 3.1.0 release window was adjusted today as follows.
> Please check the latest information on the official website.
>
> -
>
Hi, All.
The Apache Spark 3.1.0 release window was adjusted today as follows.
Please check the latest information on the official website.
- https://github.com/apache/spark-website/commit/0cd0bdc80503882b4737db7e77cc8f9d17ec12ca
- https://spark.apache.org/versioning-policy.html
Hi
In Spark/Scala one can do:
scala> println("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss')").collect.foreach(println)
Started at
[12/10/2020 22:29:19.19]
I believe foreach(println) is a special syntax in this case.
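For what it is worth, the same call shape can be tried on a plain Scala array without a Spark session. The sketch below is only an illustration: the object name ForeachPrintln and the hard-coded row string are made up here, standing in for whatever collect returns. It suggests that foreach(println) behaves like passing the println method where a function is expected, equivalent to writing the lambda out by hand.

```scala
object ForeachPrintln {
  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for the array returned by collect.
    val rows = Array("[12/10/2020 22:29:19.19]")

    // Short form: the println method is converted to a function value.
    rows.foreach(println)

    // Long form: the same call with the lambda written explicitly.
    rows.foreach(row => println(row))
  }
}
```

Both calls print the same line, once each.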
I can also do a verbose one
Hi Santosh,
Spark is a distributed computation engine. Why distributed? Because when
work is distributed, memory and cores can be added to process data in
parallel at scale. Since it is difficult to scale vertically, we scale
horizontally.
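As a rough illustration of the idea only (this is plain Scala, not the Spark API), the sketch below splits a data set into partitions, sums each partition independently, and combines the partial results. The names PartitionedSum and partitionedSum are hypothetical, and Futures on a local thread pool merely stand in for worker nodes.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object PartitionedSum {
  // Split the data into roughly equal partitions, sum each one in its own
  // Future (a stand-in for a worker), then add up the partial sums.
  def partitionedSum(data: Seq[Long], numPartitions: Int): Long = {
    val size = math.max(1, math.ceil(data.size.toDouble / numPartitions).toInt)
    val partials = data.grouped(size).map(chunk => Future(chunk.sum)).toList
    Await.result(Future.sequence(partials), 30.seconds).sum
  }

  def main(args: Array[String]): Unit = {
    println(partitionedSum(1L to 1000000L, 4)) // prints 500000500000
  }
}
```

Adding more partitions (and, in a real cluster, more machines to run them) is the horizontal scaling described above.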
Thanks And Regards
Hi Santosh,
Generally speaking, there are two ways of making a process faster:
1. Do more intelligent work by creating indexes, cubes, etc., thus reducing
the processing time
2. Throw hardware and memory at it, using something like a Spark
multi-cluster setup with a fully managed cloud service
Spark is a computation engine that runs on a set of distributed nodes. You
must "bring your own" hardware, although of course there are hosted
solutions available.
On Sat, Oct 10, 2020 at 9:24 AM Santosh74 wrote:
> Is spark compute engine only or it's also cluster which comes with set of
>