Our difference is mostly over whether n-tier means what it meant long ago, or whether it is a malleable concept that can be stretched without breaking to cover newer architectures. As I said before, if n-tier helps you think about Spark, then use it; if it doesn't, don't force it.
On Tue, Mar 29, 2016 at 5:44 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Mark,
>
> I beg to differ on the interpretation of N-tier architecture.
> Agreed that 3-tier and, by extrapolation, N-tier have been around since the
> days of client-server architecture. However, they are as valid today as 20
> years ago. I believe the main recent expansion of n-tier has been on
> horizontal scaling, and Spark, by means of its clustering capability,
> contributes to this model.
>
> Cheers
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 30 March 2016 at 00:22, Mark Hamstra <m...@clearstorydata.com> wrote:
>
>> Yes and no. The idea of n-tier architecture is about 20 years older than
>> Spark and doesn't really apply to Spark as n-tier was originally conceived.
>> If the n-tier model helps you make sense of some things related to Spark,
>> then use it; but don't get hung up on trying to force a Spark architecture
>> into an outdated model.
>>
>> On Tue, Mar 29, 2016 at 5:02 PM, Ashok Kumar <
>> ashok34...@yahoo.com.invalid> wrote:
>>
>>> Thank you both.
>>>
>>> So am I correct that Spark fits within the application tier in N-tier
>>> architecture?
>>>
>>>
>>> On Tuesday, 29 March 2016, 23:50, Alexander Pivovarov <
>>> apivova...@gmail.com> wrote:
>>>
>>> Spark is a distributed data processing engine plus a distributed
>>> in-memory / disk data cache.
>>>
>>> spark-jobserver provides a REST API to your Spark applications. It allows
>>> you to submit jobs to Spark and get results in sync or async mode.
>>>
>>> It can also create a long-running Spark context to cache RDDs in memory
>>> under some name (namedRDD) and then use them to serve requests from
>>> multiple users.
>>> Because the RDD is in memory, the response should be super fast (seconds).
>>>
>>> https://github.com/spark-jobserver/spark-jobserver
>>>
>>>
>>> On Tue, Mar 29, 2016 at 2:50 PM, Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>> Interesting question.
>>>
>>> The most widely used application of N-tier is the traditional three-tier
>>> architecture that has been the backbone of client-server architecture,
>>> with a presentation layer, an application layer and a data layer. This is
>>> primarily for performance, scalability and maintenance. The most profound
>>> change that the Big Data space has introduced to N-tier architecture is
>>> the concept of horizontal scaling, as opposed to the previous tiers that
>>> relied on vertical scaling. HDFS is an example of horizontal scaling at
>>> the data tier by adding more JBODs to storage. Similarly, adding more
>>> nodes to a Spark cluster should result in better performance.
>>>
>>> Bear in mind that these tiers are at the logical level, which means that
>>> there may or may not be as many physical layers. For example, multiple
>>> virtual servers can be hosted on the same physical server.
>>>
>>> With regard to Spark, it is effectively a powerful query tool that sits
>>> in between the presentation layer (say Tableau) and HDFS or Hive, as you
>>> alluded. In that sense you can think of Spark as part of the application
>>> layer that communicates with the backend via a number of protocols,
>>> including standard JDBC. There is a rather blurred view here of whether
>>> Spark is a database or a query tool. IMO it is a query tool in the sense
>>> that Spark by itself does not have its own storage concept or metastore.
>>> Thus it relies on others to provide that service.
>>>
>>> HTH
>>>
>>> Dr Mich Talebzadeh
>>>
>>> LinkedIn:
>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>> On 29 March 2016 at 22:07, Ashok Kumar <ashok34...@yahoo.com.invalid>
>>> wrote:
>>>
>>> Experts,
>>>
>>> One of the terms I hear used within Big Data is N-tier architecture, for
>>> availability, performance etc. I also hear that Spark, by means of its
>>> query engine and in-memory caching, fits into the middle tier
>>> (application layer), with HDFS and Hive maybe providing the data tier.
>>> Can someone elaborate on the role of Spark here? For example, a Scala
>>> program that we write uses JDBC to talk to databases, so in that sense is
>>> Spark a middle-tier application?
>>>
>>> I hope that someone can clarify this and, if so, what the best practice
>>> would be in using Spark as a middle tier within Big Data.
>>>
>>> Thanks
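
For anyone who finds the three-tier picture in this thread useful, here is a toy sketch of it in plain Python. It does not use Spark, HDFS or JDBC: the in-memory PARTITIONS list merely stands in for a data tier, a thread pool stands in for cluster nodes, and every name in it (PARTITIONS, read_partition, partial_totals, total_by_key, render) is hypothetical and chosen only for illustration. It shows logical tiers and scale-out in the middle tier, nothing more.

```python
from concurrent.futures import ThreadPoolExecutor

# Data tier stand-in: each inner list plays the role of one stored partition.
# The rows are made-up (key, amount) pairs.
PARTITIONS = [
    [("uk", 10), ("us", 5)],
    [("uk", 7), ("de", 3)],
    [("us", 8), ("de", 1)],
]


def read_partition(index):
    """Data tier: return the rows of one partition."""
    return PARTITIONS[index]


def partial_totals(index):
    """Application tier worker: sum the amounts per key within one partition."""
    totals = {}
    for key, amount in read_partition(index):
        totals[key] = totals.get(key, 0) + amount
    return totals


def total_by_key(workers):
    """Application tier: fan out over partitions, then merge the partial sums.

    Changing `workers` changes how much runs in parallel, not the result.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_totals, range(len(PARTITIONS))))
    merged = {}
    for part in partials:
        for key, amount in part.items():
            merged[key] = merged.get(key, 0) + amount
    return merged


def render(totals):
    """Presentation tier: format the totals for display, sorted by key."""
    return "\n".join(f"{key}: {amount}" for key, amount in sorted(totals.items()))


if __name__ == "__main__":
    print(render(total_by_key(workers=3)))
```

All three "tiers" here live in one process, which is the point made above about tiers being logical rather than physical. Whether that mapping carries over to a real Spark deployment is exactly the judgment call this thread is about.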