Spark isn't a storage system -- it's a batch processing system at heart. To "serve" something in Spark means running a distributed computation that scans partitions for an element, collects it to the driver, and returns it. While that could be fast enough for some definition of fast, it will be orders of magnitude slower than a technology designed for point lookups of data -- a NoSQL store. Consider that to sustain, say, 1,000 qps, you'd be launching 1,000 Spark jobs, each with N tasks, every second. Spark just isn't designed for that.
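To make the cost difference concrete, here's a minimal sketch -- plain Python rather than Spark, with illustrative names and a simulated dataset -- contrasting per-request scanning (which is roughly what a filter-and-collect Spark job amounts to) with the keyed point lookup a NoSQL store gives you:

```python
import time

# Simulated dataset: 1,000,000 (key, value) records.
records = [(i, f"value-{i}") for i in range(1_000_000)]

# "Scan" serving: every request walks the whole dataset looking
# for the key -- analogous to a Spark job scanning all partitions.
def serve_by_scan(key):
    for k, v in records:
        if k == key:
            return v
    return None

# "Point lookup" serving: the data is indexed by key up front,
# so each request is a single hash lookup -- what a key-value
# store is built to do.
index = dict(records)

def serve_by_lookup(key):
    return index.get(key)

start = time.perf_counter()
scan_result = serve_by_scan(999_999)      # O(N) work per request
scan_s = time.perf_counter() - start

start = time.perf_counter()
lookup_result = serve_by_lookup(999_999)  # O(1) work per request
lookup_s = time.perf_counter() - start

print(f"scan: {scan_s:.4f}s  lookup: {lookup_s:.6f}s")
```

And that's before counting Spark's per-job overhead (scheduling, task launch, result collection), which dwarfs the in-memory scan cost simulated here.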
On Wed, Dec 21, 2016 at 3:29 PM Enrico DUrso <enrico.du...@everis.com> wrote:

> Hello,
>
> I had a discussion today with a colleague who was saying the following:
> "We can use Spark as a fast serving layer in our architecture, that is, we
> can compute an RDD or even a Dataset using Spark SQL, then we can cache it
> and offer the front-end layer access to our application in order to show
> them the content of the RDD/Dataset."
>
> This way of using Spark is new to me -- does anyone here have experience
> with this use case?
>
> Cheers,
>
> Enrico