Spark isn't a storage system -- it's a batch processing system at heart. To "serve" something in Spark means running a distributed computation that scans partitions for an element, collects it to the driver, and returns it. While that could be fast enough for some definition of fast, it will be orders of magnitude slower than a technology designed for point lookups of data -- a NoSQL store. Consider that to sustain, say, 1,000 qps, you'd be launching 1,000 Spark jobs, each with N tasks, every second. Spark just isn't designed for that.
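To make the cost difference concrete, here's a minimal sketch -- plain Python rather than Spark, with illustrative names and a simulated dataset -- contrasting per-request scanning (which is roughly what a filter-and-collect Spark job amounts to) with the keyed point lookup a NoSQL store gives you:

```python
import time

# Simulated dataset: 1,000,000 (key, value) records.
records = [(i, f"value-{i}") for i in range(1_000_000)]

# "Scan" serving: every request walks the whole dataset looking
# for the key -- analogous to a Spark job scanning all partitions.
def serve_by_scan(key):
    for k, v in records:
        if k == key:
            return v
    return None

# "Point lookup" serving: the data is indexed by key up front,
# so each request is a single hash lookup -- what a key-value
# store is built to do.
index = dict(records)

def serve_by_lookup(key):
    return index.get(key)

start = time.perf_counter()
scan_result = serve_by_scan(999_999)      # O(N) work per request
scan_s = time.perf_counter() - start

start = time.perf_counter()
lookup_result = serve_by_lookup(999_999)  # O(1) work per request
lookup_s = time.perf_counter() - start

print(f"scan: {scan_s:.4f}s  lookup: {lookup_s:.6f}s")
```

And that's before counting Spark's per-job overhead (scheduling, task launch, result collection), which dwarfs the in-memory scan cost simulated here.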
On Wed, Dec 21, 2016 at 3:29 PM Enrico DUrso <enrico.du...@everis.com> wrote:

> Hello,
>
> I had a discussion today with a colleague who was saying the following:
> "We can use Spark as a fast serving layer in our architecture, that is, we
> can compute an RDD or even a Dataset using Spark SQL, then we can cache it
> and offer the front-end layer access to our application in order to show
> them the content of the RDD/Dataset."
>
> This way of using Spark is new to me -- does anyone here have experience
> with this use case?
>
> Cheers,
>
> Enrico