Yes and no. The idea of n-tier architecture is about 20 years older than Spark and doesn't really apply to Spark as n-tier was originally conceived. If the n-tier model helps you make sense of some things related to Spark, then use it; but don't get hung up on trying to force a Spark architecture into an outdated model.

On Tue, Mar 29, 2016 at 5:02 PM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:

> Thank you both.
>
> So am I correct that Spark fits within the application tier in N-tier
> architecture?
>
> On Tuesday, 29 March 2016, 23:50, Alexander Pivovarov
> <apivova...@gmail.com> wrote:
>
> Spark is a distributed data processing engine plus a distributed
> in-memory / disk data cache.
>
> spark-jobserver provides a REST API to your Spark applications. It allows
> you to submit jobs to Spark and get results in sync or async mode.
>
> It can also create a long-running Spark context to cache RDDs in memory
> under some name (namedRDD) and then use it to serve requests from multiple
> users. Because the RDD is in memory, responses should be very fast
> (seconds).
>
> https://github.com/spark-jobserver/spark-jobserver
>
> On Tue, Mar 29, 2016 at 2:50 PM, Mich Talebzadeh
> <mich.talebza...@gmail.com> wrote:
>
> Interesting question.
>
> The most widely used application of N-tier is the traditional three-tier
> architecture that has been the backbone of client-server architecture,
> with a presentation layer, an application layer and a data layer. This is
> primarily for performance, scalability and maintenance. The most profound
> change that the Big Data space has introduced to N-tier architecture is
> the concept of horizontal scaling, as opposed to the previous tiers that
> relied on vertical scaling. HDFS is an example of horizontal scaling at
> the data tier by adding more JBODs to storage. Similarly, adding more
> nodes to a Spark cluster should result in better performance.
>
> Bear in mind that these tiers are logical, which means that there may or
> may not be as many physical layers. For example, multiple virtual servers
> can be hosted on the same physical server.
>
> With regard to Spark, it is effectively a powerful query tool that sits
> in between the presentation layer (say Tableau) and HDFS or Hive, as you
> alluded. In that sense you can think of Spark as part of the application
> layer that communicates with the backend via a number of protocols,
> including standard JDBC. The line is rather blurred here as to whether
> Spark is a database or a query tool. IMO it is a query tool in the sense
> that Spark by itself does not have its own storage concept or metastore.
> Thus it relies on others to provide that service.
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 29 March 2016 at 22:07, Ashok Kumar <ashok34...@yahoo.com.invalid>
> wrote:
>
> Experts,
>
> One of the terms I hear used within Big Data is N-tier architecture, used
> for availability, performance, etc. I also hear that Spark, by means of
> its query engine and in-memory caching, fits into the middle tier
> (application layer), with HDFS and Hive perhaps providing the data tier.
> Can someone elaborate on the role of Spark here? For example, a Scala
> program that we write uses JDBC to talk to databases, so in that sense is
> Spark a middle-tier application?
>
> I hope that someone can clarify this and, if so, what the best practice
> would be in using Spark as a middle tier within Big Data.
>
> Thanks