Hi Santosh,

Generally speaking, there are two ways of making a process faster:


   1. Do more intelligent work by creating indexes, cubes, etc., thus
   reducing the processing time (see the sketch after this list)
   2. Throw hardware and memory at it, e.g. a multi-node Spark cluster on a
   fully managed cloud service such as Google Dataproc
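For option 1, the idea is to do the expensive aggregation once and reuse
it. A minimal Scala sketch (the Parquet path and column names here are
made up for illustration):

import org.apache.spark.sql.SparkSession

object PreAggregateSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pre-aggregate-sketch")
      .getOrCreate()

    // Hypothetical raw fact table
    val sales = spark.read.parquet("/data/sales")

    // Build the "cube" once and cache it; subsequent queries hit the
    // small aggregated dataset instead of rescanning the raw table.
    val byUser = sales.groupBy("user_id").sum("amount").cache()
    byUser.show()

    spark.stop()
  }
}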

So the framework is the computational engine (Spark), and the physical
realisation is a Spark cluster: multiple nodes (VM hosts) that work in
tandem to provide parallel processing. I suggest that you look at the
Spark docs <https://spark.apache.org/>
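To make the engine-vs-cluster distinction concrete: the same program runs
unchanged on one laptop or on many nodes; only the master URL given at
submission changes. A minimal sketch (the spark:// address in the comment
is a placeholder, not a real host):

import org.apache.spark.sql.SparkSession

object ParallelSumSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("parallel-sum-sketch")
      // local[*] = all cores on this machine; on a real cluster you
      // would submit with --master spark://host:7077 or yarn instead
      .master("local[*]")
      .getOrCreate()

    // The range is split into 100 partitions and summed in parallel
    // across whatever executors the cluster manager provides.
    val total = spark.sparkContext.parallelize(1L to 1000000L, 100).sum()
    println(s"total = $total")

    spark.stop()
  }
}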

HTH



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw





Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sat, 10 Oct 2020 at 15:24, Santosh74 <sardesaisant...@gmail.com> wrote:

> Is Spark a compute engine only, or is it also a cluster that comes with a
> set of hardware/nodes? What exactly is a Spark cluster?
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
