Hi David,

Could you point me to an example where SparkSQL is used in Stratio Deep?
Rgds

On Mon, Dec 15, 2014 at 2:20 PM, David Morales <dmora...@stratio.com> wrote:
>
> Hi there,
>
> For sure, the new release does support SparkSQL, so you can use SparkSQL
> and Stratio Deep together just out of the box.
>
> About Crossdata, it's not itself related to Spark but it can use Spark-Deep.
> It's an interactive SQL engine, like Hive, for example.
>
> Regards.
>
> 2014-12-12 21:29 GMT+01:00 Niranda Perera <nira...@wso2.com>:
>>
>> Hi David,
>>
>> I have been going through the Deep-Spark examples. It looks very
>> promising.
>>
>> On a follow-up query, does Deep-Spark/Deep-Cassandra support SQL-like
>> operations on RDDs (like SparkSQL)?
>>
>> Example (from the Datastax Cassandra connector demos):
>>
>> object SQLDemo extends DemoApp {
>>
>>   val cc = new CassandraSQLContext(sc)
>>
>>   CassandraConnector(conf).withSessionDo { session =>
>>     session.execute("CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 }")
>>     session.execute("DROP TABLE IF EXISTS test.sql_demo")
>>     session.execute("CREATE TABLE test.sql_demo (key INT PRIMARY KEY, grp INT, value DOUBLE)")
>>     session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (1, 1, 1.0)")
>>     session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (2, 1, 2.5)")
>>     session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (3, 1, 10.0)")
>>     session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (4, 2, 4.0)")
>>     session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (5, 2, 2.2)")
>>     session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (6, 2, 2.8)")
>>   }
>>
>>   val rdd = cc.cassandraSql("SELECT grp, max(value) AS mv FROM test.sql_demo GROUP BY grp ORDER BY mv")
>>   rdd.collect().foreach(println) // [2, 4.0] [1, 10.0]
>>
>>   sc.stop()
>> }
>>
>> I also read about Stratio Crossdata. Does Crossdata serve this purpose?
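[Editor's note: the demo above runs SQL through the Datastax-specific CassandraSQLContext. For comparison, plain Spark SQL can run the same grouping query over any RDD, whichever connector produced it. A minimal sketch against the Spark 1.1/1.2-era API; the case class, table name, and local master setting are illustrative, and this assumes a Spark runtime on the classpath:]

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object RddSqlSketch {

  // Schema is inferred by reflection from this case class.
  case class Measurement(key: Int, grp: Int, value: Double)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("rdd-sql-sketch").setMaster("local[2]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD // implicit RDD -> SchemaRDD conversion

    // Any RDD of case classes can be registered as a temporary table...
    val rdd = sc.parallelize(Seq(
      Measurement(1, 1, 1.0), Measurement(2, 1, 2.5), Measurement(3, 1, 10.0),
      Measurement(4, 2, 4.0), Measurement(5, 2, 2.2), Measurement(6, 2, 2.8)))
    rdd.registerTempTable("sql_demo")

    // ...and queried with plain Spark SQL, independent of the data source.
    sqlContext.sql(
      "SELECT grp, max(value) AS mv FROM sql_demo GROUP BY grp ORDER BY mv")
      .collect().foreach(println)

    sc.stop()
  }
}
```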
>> Rgds
>>
>> On Tue, Dec 2, 2014 at 11:14 PM, David Morales <dmora...@stratio.com> wrote:
>>>
>>> Hi!
>>>
>>> Please, check the develop branch if you want to see a more realistic view
>>> of our development path. Last commit was about two hours ago :)
>>>
>>> Stratio Deep is one of our core modules, so there is a core team in Stratio
>>> fully devoted to Spark + NoSQL integration. In these last months, for example,
>>> we have added MongoDB, ElasticSearch and Aerospike to Stratio Deep, so you can
>>> talk to these databases from Spark just like you do with HDFS.
>>>
>>> Furthermore, we are working on more backends, such as Neo4j or CouchBase,
>>> for example.
>>>
>>> About our benchmarks, you can check out some results at this link:
>>> http://www.stratio.com/deep-vs-datastax/
>>>
>>> Please, keep in mind that Spark integration with a datastore could be done in
>>> two ways: HCI or native. We are now working on improving native integration
>>> because it is much more performant. Along the same lines, we are running some
>>> other tests with even more impressive results.
>>>
>>> Here you can find a technical overview of our whole platform:
>>>
>>> http://www.slideshare.net/Stratio/stratio-platform-overview-v41
>>>
>>> Regards
>>>
>>> 2014-12-02 11:14 GMT+01:00 Niranda Perera <nira...@wso2.com>:
>>>
>>>> Hi David,
>>>>
>>>> Sorry to re-initiate this thread, but may I know if you have done any
>>>> benchmarking of the Datastax Spark-Cassandra connector against the Stratio
>>>> Deep-Spark Cassandra integration? Would love to take a look at it.
>>>>
>>>> I recently checked the deep-spark GitHub repo and noticed that there has
>>>> been no activity since Oct 29th. May I know what your future plans are for
>>>> this particular project?
>>>>
>>>> Cheers
>>>>
>>>> On Tue, Aug 26, 2014 at 9:12 PM, David Morales <dmora...@stratio.com> wrote:
>>>>
>>>>> Yes, it is already included in our benchmarks.
>>>>>
>>>>> It could be a nice idea to share our findings; let me talk about it here.
>>>>> Meanwhile, you can ask us any question by using my mail or this thread,
>>>>> we are glad to help you.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> 2014-08-24 15:49 GMT+02:00 Niranda Perera <nira...@wso2.com>:
>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> Thank you for your detailed reply.
>>>>>>
>>>>>> It was great to hear about Stratio-Deep and I must say, it looks very
>>>>>> interesting. Storage handlers for databases such as Cassandra, MongoDB etc.
>>>>>> would be very helpful. We will definitely look into Stratio-Deep.
>>>>>>
>>>>>> I came across the Datastax Spark-Cassandra connector (
>>>>>> https://github.com/datastax/spark-cassandra-connector ). Have you done any
>>>>>> comparison between your implementation and Datastax's connector?
>>>>>>
>>>>>> And, yes, please do share the performance results with us once they are
>>>>>> ready.
>>>>>>
>>>>>> On a different note, is there any way for us to interact with the Stratio
>>>>>> dev community, in the form of dev mailing lists etc., so that we could
>>>>>> mutually share our findings?
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> On Fri, Aug 22, 2014 at 2:07 PM, David Morales <dmora...@stratio.com> wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> *1. About the size of deployments.*
>>>>>>>
>>>>>>> It depends on your use case... especially when you combine Spark with a
>>>>>>> datastore. We usually deploy Spark with Cassandra or MongoDB, instead of
>>>>>>> using HDFS, for example.
>>>>>>>
>>>>>>> Spark will be faster if you put the data in memory, so if you need a lot
>>>>>>> of speed (interactive queries, for example), you should have enough memory.
>>>>>>>
>>>>>>> *2. About storage handlers.*
>>>>>>>
>>>>>>> We have developed the first tight integration between Cassandra and Spark,
>>>>>>> called Stratio Deep, announced at the first Spark Summit.
You can check Stratio Deep out here: https://github.com/Stratio/stratio-deep (open, Apache2 license).
>>>>>>>
>>>>>>> *Deep is a thin integration layer between Apache Spark and several NoSQL
>>>>>>> datastores. We currently support Apache Cassandra and MongoDB, but in the
>>>>>>> near future we will add support for several other datastores.*
>>>>>>>
>>>>>>> Datastax announced its own driver for Spark at the last Spark Summit, but
>>>>>>> we have been working on our solution for almost a year.
>>>>>>>
>>>>>>> Furthermore, we are working to extend this solution to work with other
>>>>>>> databases as well... MongoDB integration is complete right now and
>>>>>>> ElasticSearch will be ready in a few weeks.
>>>>>>>
>>>>>>> And that is not all: we have also developed an integration between
>>>>>>> Cassandra and Lucene for indexing data (open source, Apache2).
>>>>>>>
>>>>>>> *Stratio Cassandra is a fork of Apache Cassandra <http://cassandra.apache.org/>
>>>>>>> where index functionality has been extended to provide near-real-time search,
>>>>>>> as in ElasticSearch or Solr, including full text search
>>>>>>> <http://en.wikipedia.org/wiki/Full_text_search> capabilities and free
>>>>>>> multivariable search. It is achieved through an Apache Lucene
>>>>>>> <http://lucene.apache.org/> based implementation of Cassandra secondary
>>>>>>> indexes, where each node of the cluster indexes its own data.*
>>>>>>>
>>>>>>> We will publish some benchmarks in two weeks, so I will share our results
>>>>>>> here if you are interested.
>>>>>>>
>>>>>>> If you are more interested in distributed file systems, you should take a
>>>>>>> look at Tachyon: http://tachyon-project.org/index.html
>>>>>>>
>>>>>>> *3. Spark - Hive compatibility*
>>>>>>>
>>>>>>> Spark will support anything with the Hadoop InputFormat interface.
>>>>>>>
>>>>>>> *4. Performance*
>>>>>>>
>>>>>>> We are working a lot with Cassandra and MongoDB and the performance is
>>>>>>> quite nice. We are finishing some benchmarks right now comparing Hadoop +
>>>>>>> HDFS vs Spark + HDFS vs Spark + Cassandra (using Stratio Deep and even our
>>>>>>> fork of Cassandra).
>>>>>>>
>>>>>>> Let me share these results with you when they are ready, ok?
>>>>>>>
>>>>>>> Regards.
>>>>>>>
>>>>>>> 2014-08-22 7:53 GMT+02:00 Niranda Perera <nira...@wso2.com>:
>>>>>>>
>>>>>>>> Hi Srinath,
>>>>>>>> Yes, I am working on deploying it on a multi-node cluster with the DEBS
>>>>>>>> dataset. I will keep architecture@ posted on the progress.
>>>>>>>>
>>>>>>>> Hi David,
>>>>>>>> Thank you very much for the detailed insight you've provided.
>>>>>>>> A few quick questions:
>>>>>>>> 1. Do you have experience using storage handlers in Spark?
>>>>>>>> 2. Would a storage handler used in Hive be directly compatible with Spark?
>>>>>>>> 3. How do you grade the performance of Spark with other databases such as
>>>>>>>> Cassandra, HBase, H2, etc.?
>>>>>>>>
>>>>>>>> Thank you very much again for your interest. Looking forward to hearing
>>>>>>>> from you.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> On Thu, Aug 21, 2014 at 7:02 PM, Srinath Perera <srin...@wso2.com> wrote:
>>>>>>>>
>>>>>>>>> Niranda, we need to test Spark in multi-node mode before making a
>>>>>>>>> decision. Spark is very fast, I think there is no doubt about that. We
>>>>>>>>> need to make sure it is stable.
>>>>>>>>>
>>>>>>>>> David, thanks for a detailed email! How big (nodes) is the Spark setup
>>>>>>>>> you guys are running?
>>>>>>>>>
>>>>>>>>> --Srinath
>>>>>>>>>
>>>>>>>>> On Thu, Aug 21, 2014 at 1:34 PM, David Morales <dmora...@stratio.com> wrote:
>>>>>>>>>
>>>>>>>>>> Sorry for jumping into this thread, but I think I can help clarify a few
>>>>>>>>>> things (we attended the last Spark Summit, we were also speakers there,
>>>>>>>>>> and we work very closely with Spark).
>>>>>>>>>>
>>>>>>>>>> *> Hive/Shark and others benchmark*
>>>>>>>>>>
>>>>>>>>>> You can find a nice comparison and benchmark on this site:
>>>>>>>>>> https://amplab.cs.berkeley.edu/benchmark/
>>>>>>>>>>
>>>>>>>>>> *> Shark and SparkSQL*
>>>>>>>>>>
>>>>>>>>>> SparkSQL is the natural replacement for Shark, but SparkSQL is still
>>>>>>>>>> young at this moment. If you are looking for Hive compatibility, you have
>>>>>>>>>> to execute SparkSQL with a specific context.
>>>>>>>>>>
>>>>>>>>>> Quoted from the Spark website:
>>>>>>>>>>
>>>>>>>>>> *> Note that Spark SQL currently uses a very basic SQL parser. Users that
>>>>>>>>>> want a more complete dialect of SQL should look at the HiveSQL support
>>>>>>>>>> provided by HiveContext.*
>>>>>>>>>>
>>>>>>>>>> So, just note that SparkSQL is a work in progress. If you want SparkSQL
>>>>>>>>>> you run a SQLContext; if you want Hive, you will have a different context
>>>>>>>>>> (HiveContext)...
>>>>>>>>>>
>>>>>>>>>> *> Spark - Hadoop: the future*
>>>>>>>>>>
>>>>>>>>>> Most Hadoop distributions are including Spark: Cloudera, Hortonworks,
>>>>>>>>>> MapR... and are contributing to migrating the whole Hadoop ecosystem
>>>>>>>>>> to Spark.
>>>>>>>>>>
>>>>>>>>>> Spark is a bit more than Map/Reduce...
as you can read here:
>>>>>>>>>> http://gigaom.com/2014/06/28/4-reasons-why-spark-could-jolt-hadoop-into-hyperdrive/
>>>>>>>>>>
>>>>>>>>>> *> Spark Streaming / Spark SQL*
>>>>>>>>>>
>>>>>>>>>> Spark Streaming is built on Spark and provides stream processing through
>>>>>>>>>> an abstraction called DStreams (a collection of RDDs in a window of time).
>>>>>>>>>>
>>>>>>>>>> There are some efforts to make SparkSQL compatible with Spark Streaming
>>>>>>>>>> (something similar to Trident for Storm), as you can see here:
>>>>>>>>>>
>>>>>>>>>> *StreamSQL (https://github.com/thunderain-project/StreamSQL) is a POC
>>>>>>>>>> project based on Spark to combine the power of Catalyst and Spark
>>>>>>>>>> Streaming, to offer people the ability to manipulate SQL on top of DStream
>>>>>>>>>> as you wanted, this keep the same semantics with SparkSQL as offer a
>>>>>>>>>> SchemaDStream on top of DStream. You don't need to do tricky thing like
>>>>>>>>>> extracting rdd to register as a table. Besides other parts are the same
>>>>>>>>>> as Spark.*
>>>>>>>>>>
>>>>>>>>>> So, you can apply SQL to a data stream, but it is very simple at the
>>>>>>>>>> moment... you can expect a bunch of improvements in this area in the next
>>>>>>>>>> months (I guess SparkSQL will work on Spark Streaming streams before the
>>>>>>>>>> end of this year).
>>>>>>>>>>
>>>>>>>>>> *> Spark Streaming / Spark SQL and CEP*
>>>>>>>>>>
>>>>>>>>>> There is no relationship at this moment between (your absolutely amazing)
>>>>>>>>>> Siddhi CEP and Spark. As far as I know, you are working on distributed CEP
>>>>>>>>>> with Storm and Siddhi.
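[Editor's note: the context distinction and the "register each RDD as a table" workaround mentioned above can be sketched against the Spark 1.x-era APIs. The socket source, table name, and local master are illustrative, and this assumes a Spark/Spark Streaming runtime on the classpath:]

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ContextsSketch {

  case class Event(word: String)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("contexts-sketch").setMaster("local[2]"))

    // Basic SQL parser only:
    val sqlContext = new SQLContext(sc)
    // Fuller HiveQL dialect, Hive metastore, and Hive UDFs:
    val hiveContext = new HiveContext(sc)
    hiveContext.sql("SHOW TABLES")

    // Without something like StreamSQL, running SQL over a DStream means
    // manually registering each micro-batch RDD as a temporary table:
    import sqlContext.createSchemaRDD
    val ssc = new StreamingContext(sc, Seconds(5))
    val lines = ssc.socketTextStream("localhost", 9999) // illustrative source
    lines.flatMap(_.split(" ")).map(Event(_)).foreachRDD { rdd =>
      rdd.registerTempTable("events")
      sqlContext.sql("SELECT word, COUNT(*) FROM events GROUP BY word")
        .collect().foreach(println)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```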
>>>>>>>>>>
>>>>>>>>>> We are currently working on an interactive CEP built with Kafka + Spark
>>>>>>>>>> Streaming + Siddhi, with some features such as an API, an interactive
>>>>>>>>>> shell, built-in statistics and auditing, and built-in functions
>>>>>>>>>> (save2cassandra, save2mongo, save2elasticsearch...).
>>>>>>>>>>
>>>>>>>>>> If you are interested we can talk about this project, I think it would
>>>>>>>>>> be a nice idea!
>>>>>>>>>>
>>>>>>>>>> Anyway, I don't think that SparkSQL will evolve into something like a
>>>>>>>>>> CEP. Patterns and sequences, for example, would be very complex to do
>>>>>>>>>> with Spark Streaming (at least for now).
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> 2014-08-21 6:18 GMT+02:00 Sriskandarajah Suhothayan <s...@wso2.com>:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 20, 2014 at 1:36 PM, Niranda Perera <nira...@wso2.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> @Maninda,
>>>>>>>>>>>>
>>>>>>>>>>>> +1 for suggesting Spark SQL.
>>>>>>>>>>>>
>>>>>>>>>>>> Quoting Databricks:
>>>>>>>>>>>> "Spark SQL provides state-of-the-art SQL performance and maintains
>>>>>>>>>>>> compatibility with Shark/Hive. In particular, like Shark, Spark SQL
>>>>>>>>>>>> supports all existing Hive data formats, user-defined functions (UDF),
>>>>>>>>>>>> and the Hive metastore." [1]
>>>>>>>>>>>>
>>>>>>>>>>>> But I am not entirely sure if Spark SQL and Siddhi are comparable,
>>>>>>>>>>>> because SparkSQL (like Hive) is designed for batch processing, whereas
>>>>>>>>>>>> Siddhi does real-time processing. But if there are implementations where
>>>>>>>>>>>> Siddhi is run on top of Spark, it would be very interesting.
>>>>>>>>>>>>
>>>>>>>>>>> Yes, Siddhi's current way of operation does not support this. But with
>>>>>>>>>>> partitions we can achieve this to some extent.
>>>>>>>>>>>
>>>>>>>>>>> Suho
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Spark supports either Hadoop 1 or 2. But I think we should see what is
>>>>>>>>>>>> best: MR1 or YARN+MR2.
>>>>>>>>>>>>
>>>>>>>>>>>> [image: Hadoop Architecture] [2]
>>>>>>>>>>>>
>>>>>>>>>>>> [1] http://databricks.com/blog/2014/07/01/shark-spark-sql-hive-on-spark-and-the-future-of-sql-on-spark.html
>>>>>>>>>>>> [2] http://www.tomsitpro.com/articles/hadoop-2-vs-1,2-718.html
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Aug 20, 2014 at 1:13 PM, Lasantha Fernando <lasan...@wso2.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Maninda,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 20 August 2014 12:02, Maninda Edirisooriya <mani...@wso2.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the case of the discontinuation of the Shark project, IMO we should
>>>>>>>>>>>>>> not move to Shark at all. And it seems better to go with Spark SQL as we
>>>>>>>>>>>>>> are already using Spark for CEP. But I am not sure of the difference
>>>>>>>>>>>>>> between Spark SQL and the Siddhi queries on the Spark engine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Currently, we are doing the integration with CEP using Apache Storm, not
>>>>>>>>>>>>> Spark... :-). Spark Streaming is a possible candidate for integrating
>>>>>>>>>>>>> with CEP, but we have opted for Storm. I think there has been some
>>>>>>>>>>>>> independent work on integrating Kafka + Spark Streaming + Siddhi.
>>>>>>>>>>>>> Please refer to the thread on arch@, "[Architecture] A few questions
>>>>>>>>>>>>> about WSO2 CEP/Siddhi".
>>>>>>>>>>>>>
>>>>>>>>>>>>>> And we have to figure out how Spark SQL is used for historical data,
>>>>>>>>>>>>>> and whether it can execute incremental processing by default, which
>>>>>>>>>>>>>> would cover all our existing BAM use cases.
>>>>>>>>>>>>>> On the other hand, in Hadoop 2 [1] they are using a completely
>>>>>>>>>>>>>> different platform for resource allocation known as YARN. Sometimes
>>>>>>>>>>>>>> this may be more suitable for batch jobs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://www.youtube.com/watch?v=RncoVN0l6dc
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Lasantha
>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Maninda Edirisooriya*
>>>>>>>>>>>>>> Senior Software Engineer
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *WSO2, Inc. *lean.enterprise.middleware.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Blog* : http://maninda.blogspot.com/
>>>>>>>>>>>>>> *E-mail* : mani...@wso2.com
>>>>>>>>>>>>>> *Skype* : @manindae
>>>>>>>>>>>>>> *Twitter* : @maninda
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Aug 20, 2014 at 11:33 AM, Niranda Perera <nira...@wso2.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Anjana and Srinath,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After the discussion I had with Anjana, I researched more on the
>>>>>>>>>>>>>>> continuation of the Shark project by Databricks.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's what I found out:
>>>>>>>>>>>>>>> - Shark was built on the Hive codebase and achieved performance
>>>>>>>>>>>>>>> improvements by swapping out the physical execution engine part of
>>>>>>>>>>>>>>> Hive. While this approach enabled Shark users to speed up their Hive
>>>>>>>>>>>>>>> queries, Shark inherited a large, complicated code base from Hive that
>>>>>>>>>>>>>>> made it hard to optimize and maintain.
>>>>>>>>>>>>>>> Hence, Databricks has announced that they are halting the development
>>>>>>>>>>>>>>> of Shark as of July 2014 (Shark 0.9 will be the last release). [1]
>>>>>>>>>>>>>>> - Shark will be replaced by Spark SQL. It beats Shark in TPC-DS
>>>>>>>>>>>>>>> performance
>>>>>>>>>>>>>>> <http://databricks.com/blog/2014/06/02/exciting-performance-improvements-on-the-horizon-for-spark-sql.html>
>>>>>>>>>>>>>>> by almost an order of magnitude. It also supports all existing Hive
>>>>>>>>>>>>>>> data formats, user-defined functions (UDF), and the Hive metastore. [2]
>>>>>>>>>>>>>>> - Following is the Shark to Spark SQL migration plan:
>>>>>>>>>>>>>>> http://spark-summit.org/wp-content/uploads/2014/07/Future-of-Spark-Patrick-Wendell.pdf
>>>>>>>>>>>>>>> - For legacy Hive and MapReduce users, they have proposed a new 'Hive
>>>>>>>>>>>>>>> on Spark' project. [3], [4]
>>>>>>>>>>>>>>> But, given the performance enhancement, it is quite certain that Hive
>>>>>>>>>>>>>>> and MR will be replaced by engines built on top of Spark (e.g. Spark SQL).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In my opinion there are a few matters to figure out if we are migrating
>>>>>>>>>>>>>>> from Hive:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. whether we are changing the query engine only? (Then, we can replace
>>>>>>>>>>>>>>> Hive with Shark.)
>>>>>>>>>>>>>>> 2. whether we are changing the existing Hadoop/MapReduce framework to
>>>>>>>>>>>>>>> Spark? (Then we can replace Hive and Hadoop with Spark and Spark SQL.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In my opinion, considering the long-term impact and the availability of
>>>>>>>>>>>>>>> support, it is best to migrate from Hive/Hadoop to Spark.
>>>>>>>>>>>>>>> It is open for discussion!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the meantime, I have already tried Spark SQL, and Databricks' claims
>>>>>>>>>>>>>>> of improved performance seem to be true. I will work more on this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] http://databricks.com/blog/2014/07/01/shark-spark-sql-hive-on-spark-and-the-future-of-sql-on-spark.html
>>>>>>>>>>>>>>> [2] http://databricks.com/blog/2014/06/02/exciting-performance-improvements-on-the-horizon-for-spark-sql.html
>>>>>>>>>>>>>>> [3] https://issues.apache.org/jira/browse/HIVE-7292
>>>>>>>>>>>>>>> [4] https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 12:16 PM, Anjana Fernando <anj...@wso2.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Srinath,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> No, this has not been tested on multiple nodes. I told Niranda in my
>>>>>>>>>>>>>>>> last mail to test a cluster with the same set of hardware we are using
>>>>>>>>>>>>>>>> to test our large data set with Hive. As for the effort to make the
>>>>>>>>>>>>>>>> change, we still have to figure out the MT aspects of Shark here.
>>>>>>>>>>>>>>>> Sinthuja was working on making the latest Hive version MT ready, and
>>>>>>>>>>>>>>>> most probably we can make the same changes to the Hive version Shark is
>>>>>>>>>>>>>>>> using. So after we do that, the integration should be seamless. And
>>>>>>>>>>>>>>>> also, as I mentioned earlier here, we are also going to test this with
>>>>>>>>>>>>>>>> the APIM Hive script, to check if there are any unforeseen
>>>>>>>>>>>>>>>> incompatibilities.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Anjana.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 11:53 AM, Srinath Perera <srin...@wso2.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This looks great.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We need to test Spark with multiple nodes. Did we do that? Please
>>>>>>>>>>>>>>>>> create a few VMs in the performance cloud (talk to Lakmal) and test
>>>>>>>>>>>>>>>>> with at least 5 nodes. We need to make sure it works OK in a
>>>>>>>>>>>>>>>>> distributed setup as well.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What does it take to change to Spark? Anjana... how much work is it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --Srinath
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Aug 13, 2014 at 7:06 PM, Niranda Perera <nira...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you Anjana.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes, I am working on it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In the meantime, I found this in the Hive documentation [1]. It talks
>>>>>>>>>>>>>>>>>> about Hive on Spark, and compares Hive, Shark and Spark SQL at a
>>>>>>>>>>>>>>>>>> higher architectural level.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Additionally, it is said that the in-memory performance of Shark can
>>>>>>>>>>>>>>>>>> be improved by introducing Tachyon [2]. I guess we can consider this
>>>>>>>>>>>>>>>>>> later on.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-1.3ComparisonwithSharkandSparkSQL
>>>>>>>>>>>>>>>>>> [2] http://tachyon-project.org/Running-Tachyon-Locally.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Aug 13, 2014 at 3:17 PM, Anjana Fernando <anj...@wso2.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Niranda,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Excellent analysis of Hive vs Shark! This gives a lot of insight into
>>>>>>>>>>>>>>>>>>> how both operate in different scenarios. As the next step, we will
>>>>>>>>>>>>>>>>>>> need to run this on an actual cluster of computers. Since you've used
>>>>>>>>>>>>>>>>>>> a subset of the dataset of the 2014 DEBS challenge, we should use the
>>>>>>>>>>>>>>>>>>> full data set in a clustered environment and check this. Gokul is
>>>>>>>>>>>>>>>>>>> already working on the Hive based setup for this; after that is done,
>>>>>>>>>>>>>>>>>>> you can create a Shark cluster on the same hardware and run the tests
>>>>>>>>>>>>>>>>>>> there, to get a clear comparison of how these two match up in a
>>>>>>>>>>>>>>>>>>> cluster. Until the setup is ready, do continue with your next steps on
>>>>>>>>>>>>>>>>>>> checking the RDD support and Spark SQL use.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> After these are done, we should also do a trial run of our own APIM
>>>>>>>>>>>>>>>>>>> Hive scripts, migrated to Shark.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> Anjana.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Aug 11, 2014 at 12:21 PM, Niranda Perera < >>>>>>>>>>>>>>>>>>> nira...@wso2.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I have been evaluating the performance of >>>>>>>>>>>>>>>>>>>> Shark (distributed SQL query engine for Hadoop) against >>>>>>>>>>>>>>>>>>>> Hive. This is with >>>>>>>>>>>>>>>>>>>> the objective of seeing the possibility to move the WSO2 >>>>>>>>>>>>>>>>>>>> BAM data >>>>>>>>>>>>>>>>>>>> processing (which currently uses Hive) to Shark (and >>>>>>>>>>>>>>>>>>>> Apache Spark) for >>>>>>>>>>>>>>>>>>>> improved performance. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am sharing my findings herewith. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> *AMP Lab Shark* >>>>>>>>>>>>>>>>>>>> Shark can execute Hive QL queries up to 100 times >>>>>>>>>>>>>>>>>>>> faster than Hive without any modification to the existing >>>>>>>>>>>>>>>>>>>> data or queries. >>>>>>>>>>>>>>>>>>>> It supports Hive's QL, metastore, serialization formats, >>>>>>>>>>>>>>>>>>>> and user-defined >>>>>>>>>>>>>>>>>>>> functions, providing seamless integration with existing >>>>>>>>>>>>>>>>>>>> Hive deployments >>>>>>>>>>>>>>>>>>>> and a familiar, more powerful option for new ones. [1] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> *Apache Spark*Apache Spark is an open-source data >>>>>>>>>>>>>>>>>>>> analytics cluster computing framework. It fits into the >>>>>>>>>>>>>>>>>>>> Hadoop open-source >>>>>>>>>>>>>>>>>>>> community, building on top of the HDFS and promises >>>>>>>>>>>>>>>>>>>> performance up to 100 >>>>>>>>>>>>>>>>>>>> times faster than Hadoop MapReduce for certain >>>>>>>>>>>>>>>>>>>> applications. 
[2]
>>>>>>>>>>>>>>>>>>>> Official documentation: [3]
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I carried out the comparison between the following Hive and Shark
>>>>>>>>>>>>>>>>>>>> releases, with input files ranging from 100 to 1 billion entries:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - QL engine: Apache Hive 0.11 vs. Shark 0.9.1 (latest release), which
>>>>>>>>>>>>>>>>>>>> uses Scala 2.10.3, Spark 0.9.1 and AMPLab's Hive 0.9.0
>>>>>>>>>>>>>>>>>>>> - Framework: Hadoop 1.0.4 vs. Spark 0.9.1
>>>>>>>>>>>>>>>>>>>> - File system: HDFS for both
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Attached herewith is a report which describes in detail the performance
>>>>>>>>>>>>>>>>>>>> comparison between Shark and Hive.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> hive_vs_shark
>>>>>>>>>>>>>>>>>>>> <https://docs.google.com/a/wso2.com/folderview?id=0B1GsnfycTl32QTZqUktKck1Ucjg&usp=drive_web>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> hive_vs_shark_report.odt
>>>>>>>>>>>>>>>>>>>> <https://docs.google.com/a/wso2.com/file/d/0B1GsnfycTl32X3J5dTh6Slloa0E/edit?usp=drive_web>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In summary, the following conclusions can be derived from the evaluation:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - Shark performs on par with Hive in DDL operations (CREATE/DROP TABLE,
>>>>>>>>>>>>>>>>>>>> DATABASE).
Both engines show a fairly constant
>>>>>>>>>>>>>>>>>>>> performance as the input size increases.
>>>>>>>>>>>>>>>>>>>> - Shark performs on par with Hive in plain DML operations (LOAD,
>>>>>>>>>>>>>>>>>>>> INSERT), but when a DML operation is called in conjunction with a data
>>>>>>>>>>>>>>>>>>>> retrieval operation (e.g. INSERT <TBL> SELECT <PROP> FROM <TBL>), Shark
>>>>>>>>>>>>>>>>>>>> significantly outperforms Hive, with a performance factor of 10x+
>>>>>>>>>>>>>>>>>>>> (ranging from 10x to 80x in some instances). Shark's performance factor
>>>>>>>>>>>>>>>>>>>> decreases as the input size increases, while Hive's performance stays
>>>>>>>>>>>>>>>>>>>> fairly constant.
>>>>>>>>>>>>>>>>>>>> - Shark clearly outperforms Hive in data retrieval operations (FILTER,
>>>>>>>>>>>>>>>>>>>> ORDER BY, JOIN). Hive's performance is fairly constant across the data
>>>>>>>>>>>>>>>>>>>> retrieval operations, while Shark's performance drops as the input size
>>>>>>>>>>>>>>>>>>>> increases. But in every instance Shark outperformed Hive with a minimum
>>>>>>>>>>>>>>>>>>>> performance factor of 5x+ (ranging from 5x to 80x in some instances).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please refer to the 'hive_vs_shark_report'; it presents all the
>>>>>>>>>>>>>>>>>>>> information about the queries and timings pictographically.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The code repository can also be found at
>>>>>>>>>>>>>>>>>>>> https://github.com/nirandaperera/hiveToShark/tree/master/hiveVsShark
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Moving forward, I am currently working on the following.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> - Apache Spark's resilient distributed dataset >>>>>>>>>>>>>>>>>>>> (RDD) abstraction (which is a collection of elements >>>>>>>>>>>>>>>>>>>> partitioned across the >>>>>>>>>>>>>>>>>>>> nodes of the cluster that can be operated on in >>>>>>>>>>>>>>>>>>>> parallel). The use of RDDs >>>>>>>>>>>>>>>>>>>> and its impact to the performance. >>>>>>>>>>>>>>>>>>>> - Spark SQL - Use of this Spark SQL over Shark on >>>>>>>>>>>>>>>>>>>> Spark framework >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> [1] https://github.com/amplab/shark/wiki >>>>>>>>>>>>>>>>>>>> [2] http://en.wikipedia.org/wiki/Apache_Spark >>>>>>>>>>>>>>>>>>>> [3] http://spark.apache.org/docs/latest/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Would love to have your feedback on this. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Best regards >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>> *Niranda Perera* >>>>>>>>>>>>>>>>>>>> Software Engineer, WSO2 Inc. >>>>>>>>>>>>>>>>>>>> Mobile: +94-71-554-8430 >>>>>>>>>>>>>>>>>>>> Twitter: @n1r44 <https://twitter.com/N1R44> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>> *Anjana Fernando* >>>>>>>>>>>>>>>>>>> Senior Technical Lead >>>>>>>>>>>>>>>>>>> WSO2 Inc. | http://wso2.com >>>>>>>>>>>>>>>>>>> lean . enterprise . middleware >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> *Niranda Perera* >>>>>>>>>>>>>>>>>> Software Engineer, WSO2 Inc. >>>>>>>>>>>>>>>>>> Mobile: +94-71-554-8430 >>>>>>>>>>>>>>>>>> Twitter: @n1r44 <https://twitter.com/N1R44> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> ============================ >>>>>>>>>>>>>>>>> Srinath Perera, Ph.D. 
--
*Niranda Perera*
Software Engineer, WSO2 Inc.
Mobile: +94-71-554-8430
Twitter: @n1r44 <https://twitter.com/N1R44>
_______________________________________________
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture