Hi there, For sure, the new release does support Spark SQL, so you can use Spark SQL and Stratio Deep together just out of the box.
About Crossdata: it's not itself related to Spark, but it can use Stratio Deep. It's an interactive SQL engine, like Hive, for example. Regards.

2014-12-12 21:29 GMT+01:00 Niranda Perera <nira...@wso2.com>:

> Hi David,
>
> I have been going through the Deep-Spark examples. It looks very promising.
>
> On a follow-up query, does Deep-Spark / Deep-Cassandra support SQL-like operations on RDDs (like SparkSQL)?
>
> Example (from the Datastax Cassandra connector demos):
>
>     object SQLDemo extends DemoApp {
>
>       val cc = new CassandraSQLContext(sc)
>
>       CassandraConnector(conf).withSessionDo { session =>
>         session.execute("CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 }")
>         session.execute("DROP TABLE IF EXISTS test.sql_demo")
>         session.execute("CREATE TABLE test.sql_demo (key INT PRIMARY KEY, grp INT, value DOUBLE)")
>         session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (1, 1, 1.0)")
>         session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (2, 1, 2.5)")
>         session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (3, 1, 10.0)")
>         session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (4, 2, 4.0)")
>         session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (5, 2, 2.2)")
>         session.execute("INSERT INTO test.sql_demo(key, grp, value) VALUES (6, 2, 2.8)")
>       }
>
>       val rdd = cc.cassandraSql("SELECT grp, max(value) AS mv FROM test.sql_demo GROUP BY grp ORDER BY mv")
>       rdd.collect().foreach(println) // [2, 4.0] [1, 10.0]
>
>       sc.stop()
>     }
>
> I also read about Stratio Crossdata. Does Crossdata serve this purpose?
>
> Rgds
>
> On Tue, Dec 2, 2014 at 11:14 PM, David Morales <dmora...@stratio.com> wrote:
>>
>> Hi!
>>
>> Please, check the develop branch if you want to see a more realistic view of our development path.
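(Editor's note: what the cassandraSql query in the demo above computes can be sketched with plain Scala collections, no Spark or Cassandra required. The six rows are copied from the demo's INSERT statements; this is a conceptual sketch of the query semantics, not part of either connector's API.)

```scala
// The same six (key, grp, value) rows inserted in the demo above.
val rows = Seq((1, 1, 1.0), (2, 1, 2.5), (3, 1, 10.0),
               (4, 2, 4.0), (5, 2, 2.2), (6, 2, 2.8))

// SELECT grp, max(value) AS mv FROM test.sql_demo GROUP BY grp ORDER BY mv
val result = rows
  .groupBy(_._2)                                      // GROUP BY grp
  .map { case (grp, rs) => (grp, rs.map(_._3).max) }  // max(value) per group
  .toSeq
  .sortBy(_._2)                                       // ORDER BY mv

result.foreach(println) // (2,4.0) then (1,10.0), matching the demo's output
```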
Last commit was about two hours ago :) >> >> Stratio Deep is one of our core modules so there is a core team in >> Stratio fully devoted to spark + noSQL integration. In these last months, >> for example, we have added mongoDB, ElasticSearch and Aerospike to Stratio >> Deep, so you can talk to these databases from Spark just like you do with >> HDFS. >> >> Furthermore, we are working on more backends, such as neo4j or couchBase, >> for example. >> >> >> About our benchmarks, you can check out some results in this link: >> http://www.stratio.com/deep-vs-datastax/ >> >> Please, keep in mind that spark integration with a datastore could be >> done in two ways: HCI or native. We are now working on improving native >> integration because it's quite more performant. In this way, we are just >> working on some other tests with even more impressive results. >> >> >> Here you can find a technical overview of all our platform. >> >> >> http://www.slideshare.net/Stratio/stratio-platform-overview-v41 >> >> Regards >> >> 2014-12-02 11:14 GMT+01:00 Niranda Perera <nira...@wso2.com>: >> >>> Hi David, >>> >>> Sorry to re-initiate this thread. But may I know if you have done any >>> benchmarking on Datastax Spark cassandra connector and Stratio Deep-spark >>> cassandra integration? Would love to take a look at it. >>> >>> I recently checked deep-spark github repo and noticed that there is no >>> activity since Oct 29th. May I know what your future plans on this >>> particular project? >>> >>> Cheers >>> >>> On Tue, Aug 26, 2014 at 9:12 PM, David Morales <dmora...@stratio.com> >>> wrote: >>> >>>> Yes, it is already included in our benchmarks. >>>> >>>> It could be a nice idea to share our findings, let me talk about it >>>> here. Meanwhile, you can ask us any question by using my mail or this >>>> thread, we are glad to help you. >>>> >>>> >>>> Best regards. 
>>>> >>>> >>>> 2014-08-24 15:49 GMT+02:00 Niranda Perera <nira...@wso2.com>: >>>> >>>>> Hi David, >>>>> >>>>> Thank you for your detailed reply. >>>>> >>>>> It was great to hear about Stratio-Deep and I must say, it looks very >>>>> interesting. Storage handlers for databases such Cassandra, MongoDB etc >>>>> would be very helpful. We will definitely look up on Stratio-Deep. >>>>> >>>>> I came across with the Datastax Spark-Cassandra connector ( >>>>> https://github.com/datastax/spark-cassandra-connector ). Have you >>>>> done any comparison with your implementation and Datastax's connector? >>>>> >>>>> And, yes, please do share the performance results with us once it's >>>>> ready. >>>>> >>>>> On a different note, is there any way for us to interact with Stratio >>>>> dev community, in the form of dev mail lists etc, so that we could >>>>> mutually >>>>> share our findings? >>>>> >>>>> Best regards >>>>> >>>>> >>>>> >>>>> On Fri, Aug 22, 2014 at 2:07 PM, David Morales <dmora...@stratio.com> >>>>> wrote: >>>>> >>>>>> Hi there, >>>>>> >>>>>> *1. About the size of deployments.* >>>>>> >>>>>> It depends on your use case... specially when you combine spark with >>>>>> a datastore. We use to deploy spark with cassandra or mongodb, instead of >>>>>> using HDFS for example. >>>>>> >>>>>> Spark will be faster if you put the data in memory, so if you need a >>>>>> lot of speed (interactive queries, for example), you should have enough >>>>>> memory. >>>>>> >>>>>> >>>>>> *2. About storage handlers.* >>>>>> >>>>>> We have developed the first tight integration between Cassandra and >>>>>> Spark, called Stratio Deep, announced in the first spark summit. You can >>>>>> check Stratio Deep out here: https://github.com/Stratio/stratio-deep >>>>>> (open, >>>>>> apache2 license). >>>>>> >>>>>> *Deep is a thin integration layer between Apache Spark and several >>>>>> NoSQL datastores. 
We actually support Apache Cassandra and MongoDB, but in the near future we will add support for several other datastores.* >>>>>> >>>>>> Datastax announced its own driver for Spark at the last Spark Summit, but we have been working on our solution for almost a year. >>>>>> >>>>>> Furthermore, we are working to extend this solution in order to work also with other databases... MongoDB integration is completed right now and ElasticSearch will be ready in a few weeks. >>>>>> >>>>>> And that is not all: we have also developed an integration with Cassandra and Lucene for indexing data (open source, Apache 2 license). >>>>>> >>>>>> *Stratio Cassandra is a fork of Apache Cassandra >>>>>> <http://cassandra.apache.org/> where index functionality has been extended >>>>>> to provide near real time search such as ElasticSearch or Solr, >>>>>> including full text search >>>>>> <http://en.wikipedia.org/wiki/Full_text_search> capabilities and free >>>>>> multivariable search. It is achieved through an Apache Lucene >>>>>> <http://lucene.apache.org/> based implementation of Cassandra secondary >>>>>> indexes, where each node of the cluster indexes its own data.* >>>>>> >>>>>> We will publish some benchmarks in two weeks, so I will share our results here if you are interested. >>>>>> >>>>>> If you are more interested in distributed file systems, you should take a look at Tachyon: http://tachyon-project.org/index.html >>>>>> >>>>>> *3. Spark - Hive compatibility* >>>>>> >>>>>> Spark will support anything with the Hadoop InputFormat interface. >>>>>> >>>>>> *4. Performance* >>>>>> >>>>>> We are working a lot with Cassandra and MongoDB and the performance is quite nice. We are finishing right now some benchmarks comparing Hadoop >>>>>> + HDFS vs Spark + HDFS vs Spark + Cassandra (using Stratio Deep and even >>>>>> our fork of Cassandra).
>>>>>> Let me share these results with you when they are ready, ok? >>>>>> >>>>>> Regards. >>>>>> >>>>>> 2014-08-22 7:53 GMT+02:00 Niranda Perera <nira...@wso2.com>: >>>>>> >>>>>>> Hi Srinath, >>>>>>> Yes, I am working on deploying it on a multi-node cluster with the DEBS dataset. I will keep architecture@ posted on the progress. >>>>>>> >>>>>>> Hi David, >>>>>>> Thank you very much for the detailed insight you've provided. >>>>>>> A few quick questions: >>>>>>> 1. Do you have experience in using storage handlers in Spark? >>>>>>> 2. Would a storage handler used in Hive be directly compatible with Spark? >>>>>>> 3. How do you grade the performance of Spark with other databases such as Cassandra, HBase, H2, etc.? >>>>>>> >>>>>>> Thank you very much again for your interest. Look forward to hearing from you. >>>>>>> >>>>>>> Regards >>>>>>> >>>>>>> On Thu, Aug 21, 2014 at 7:02 PM, Srinath Perera <srin...@wso2.com> wrote: >>>>>>> >>>>>>>> Niranda, we need to test Spark in multi-node mode before making a decision. Spark is very fast, I think there is no doubt about that. We need to make sure it is stable. >>>>>>>> >>>>>>>> David, thanks for a detailed email! How big (nodes) is the Spark setup you guys are running?
>>>>>>>> >>>>>>>> --Srinath >>>>>>>> >>>>>>>> On Thu, Aug 21, 2014 at 1:34 PM, David Morales <dmora...@stratio.com> wrote: >>>>>>>> >>>>>>>>> Sorry for disturbing this thread, but I think I can help clarify a few things (we attended the last Spark Summit, where we were also speakers, and we work very closely with Spark). >>>>>>>>> >>>>>>>>> *> Hive/Shark and other benchmarks* >>>>>>>>> >>>>>>>>> You can find a nice comparison and benchmark on this site: https://amplab.cs.berkeley.edu/benchmark/ >>>>>>>>> >>>>>>>>> *> Shark and SparkSQL* >>>>>>>>> >>>>>>>>> SparkSQL is the natural replacement for Shark, but SparkSQL is still young at this moment. If you are looking for Hive compatibility, you have to execute SparkSQL with a specific context. >>>>>>>>> >>>>>>>>> Quoted from the Spark website: >>>>>>>>> >>>>>>>>> *> Note that Spark SQL currently uses a very basic SQL parser. Users that want a more complete dialect of SQL should look at the HiveQL support provided by HiveContext.* >>>>>>>>> >>>>>>>>> So, just note that SparkSQL is a work in progress. If you want SparkSQL you have to run a SQLContext; if you want Hive, you will have a different context (HiveContext)... >>>>>>>>> >>>>>>>>> *> Spark - Hadoop: the future* >>>>>>>>> >>>>>>>>> Most Hadoop distributions are including Spark: Cloudera, Hortonworks, MapR... and contributing to migrating the whole Hadoop ecosystem to Spark. >>>>>>>>> >>>>>>>>> Spark is a bit more than Map/Reduce... as you can read here: >>>>>>>>> http://gigaom.com/2014/06/28/4-reasons-why-spark-could-jolt-hadoop-into-hyperdrive/ >>>>>>>>> >>>>>>>>> *> Spark Streaming / Spark SQL* >>>>>>>>> >>>>>>>>> Spark Streaming is built on Spark and provides stream processing through an abstraction called DStreams (a collection of RDDs in a window of time). >>>>>>>>> >>>>>>>>> There are some efforts to make SparkSQL compatible with Spark Streaming (something similar to Trident for Storm), as you can see here: >>>>>>>>> >>>>>>>>> *StreamSQL (https://github.com/thunderain-project/StreamSQL <https://github.com/thunderain-project/StreamSQL>) is a POC project based on Spark to combine the power of Catalyst and Spark Streaming, to offer people the ability to manipulate SQL on top of DStream as you wanted; this keeps the same semantics as SparkSQL and offers a SchemaDStream on top of DStream. You don't need to do tricky things like extracting RDDs to register as a table. Besides that, other parts are the same as Spark.* >>>>>>>>> >>>>>>>>> So, you can apply SQL to a data stream, but it is very simple at the moment... you can expect a bunch of improvements in this area in the next months (I guess that SparkSQL will work on Spark Streaming streams before the end of this year). >>>>>>>>> >>>>>>>>> *> Spark Streaming / Spark SQL and CEP* >>>>>>>>> >>>>>>>>> There is no relationship at this moment between (your absolutely amazing) Siddhi CEP and Spark. As far as I know, you are working on distributed CEP with Storm and Siddhi.
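(Editor's note: the DStream idea described above — a stream chopped into a sequence of small batches, each processed like a static collection — can be illustrated in plain Scala without Spark. This is a conceptual sketch only, not the Spark Streaming API; the batch size and event values are made up for illustration.)

```scala
// A DStream is, conceptually, a sequence of micro-batches ("RDDs"),
// each covering one batch interval and processed with the same
// collection-style transformation. Plain Scala stand-in below.
val incoming = (1 to 10).toList   // pretend event stream
val batchInterval = 3             // events per micro-batch (illustrative)

val microBatches = incoming.grouped(batchInterval).toList
// List(List(1,2,3), List(4,5,6), List(7,8,9), List(10))

// Apply one transformation uniformly to every micro-batch,
// the way a DStream transformation applies to each underlying RDD.
val perBatchSums = microBatches.map(_.sum)
println(perBatchSums) // List(6, 15, 24, 10)
```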
>>>>>>>>> We are currently working on an interactive CEP built with Kafka + Spark Streaming + Siddhi, with features such as an API, an interactive shell, built-in statistics and auditing, and built-in functions (save2cassandra, save2mongo, save2elasticsearch...). >>>>>>>>> >>>>>>>>> If you are interested we can talk about this project; I think it would be a nice idea! >>>>>>>>> >>>>>>>>> Anyway, I don't think that SparkSQL will evolve into something like a CEP. Patterns and sequences, for example, would be very complex to do with Spark Streaming (at least for now). >>>>>>>>> >>>>>>>>> Thanks. >>>>>>>>> >>>>>>>>> 2014-08-21 6:18 GMT+02:00 Sriskandarajah Suhothayan <s...@wso2.com>: >>>>>>>>> >>>>>>>>>> On Wed, Aug 20, 2014 at 1:36 PM, Niranda Perera <nira...@wso2.com> wrote: >>>>>>>>>> >>>>>>>>>>> @Maninda, >>>>>>>>>>> >>>>>>>>>>> +1 for suggesting Spark SQL. >>>>>>>>>>> >>>>>>>>>>> Quoting Databricks: >>>>>>>>>>> "Spark SQL provides state-of-the-art SQL performance and maintains compatibility with Shark/Hive. In particular, like Shark, Spark SQL supports all existing Hive data formats, user-defined functions (UDF), and the Hive metastore." [1] >>>>>>>>>>> >>>>>>>>>>> But I am not entirely sure if Spark SQL and Siddhi are comparable, because SparkSQL (like Hive) is designed for batch processing, whereas Siddhi does real-time processing. But if there are implementations where Siddhi is run on top of Spark, it would be very interesting.
>>>>>>>>>>> >>>>>>>>>> Yes Siddhi's current way of operation does not support this. But >>>>>>>>>> with partitions and we can achieve this to some extent. >>>>>>>>>> >>>>>>>>>> Suho >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Spark supports either Hadoop1 or 2. But I think we should see, >>>>>>>>>>> what is best, MR1 or YARN+MR2 >>>>>>>>>>> >>>>>>>>>>> [image: Hadoop Architecture] >>>>>>>>>>> [2] >>>>>>>>>>> >>>>>>>>>>> [1] >>>>>>>>>>> http://databricks.com/blog/2014/07/01/shark-spark-sql-hive-on-spark-and-the-future-of-sql-on-spark.html >>>>>>>>>>> [2] http://www.tomsitpro.com/articles/hadoop-2-vs-1,2-718.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Aug 20, 2014 at 1:13 PM, Lasantha Fernando < >>>>>>>>>>> lasan...@wso2.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Maninda, >>>>>>>>>>>> >>>>>>>>>>>> On 20 August 2014 12:02, Maninda Edirisooriya <mani...@wso2.com >>>>>>>>>>>> > wrote: >>>>>>>>>>>> >>>>>>>>>>>>> In the case of discontinuity of Shark project, IMO we should >>>>>>>>>>>>> not move to Shark at all. >>>>>>>>>>>>> And it seems better to go with Spark SQL as we are already >>>>>>>>>>>>> using Spark for CEP. But I am not sure the difference between >>>>>>>>>>>>> Spark SQL and >>>>>>>>>>>>> the Siddhi queries on the Spark engine. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Currently, we are doing integration with CEP using Apache >>>>>>>>>>>> Storm, not Spark... :-). Spark Streaming is a possible candidate >>>>>>>>>>>> for >>>>>>>>>>>> integrating with CEP, but we have opted with Storm. I think there >>>>>>>>>>>> has been >>>>>>>>>>>> some independent work on integrating Kafka + Spark Streaming + >>>>>>>>>>>> Siddhi. 
>>>>>>>>>>>> Please refer to thread on arch@ "[Architecture] A few >>>>>>>>>>>> questions about WSO2 CEP/Siddhi" >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> And we have to figure out how Spark SQL is used for historical >>>>>>>>>>>>> data, whether it can execute incremental processing by default >>>>>>>>>>>>> which will >>>>>>>>>>>>> implement all out existing BAM use cases. >>>>>>>>>>>>> On the other hand in Hadoop 2 [1] they are using a completely >>>>>>>>>>>>> different platform for resource allocation known as Yarn. >>>>>>>>>>>>> Sometimes this >>>>>>>>>>>>> may be more suitable for batch jobs. >>>>>>>>>>>>> >>>>>>>>>>>>> [1] https://www.youtube.com/watch?v=RncoVN0l6dc >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Lasantha >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> *Maninda Edirisooriya* >>>>>>>>>>>>> Senior Software Engineer >>>>>>>>>>>>> >>>>>>>>>>>>> *WSO2, Inc. *lean.enterprise.middleware. >>>>>>>>>>>>> >>>>>>>>>>>>> *Blog* : http://maninda.blogspot.com/ >>>>>>>>>>>>> *E-mail* : mani...@wso2.com >>>>>>>>>>>>> *Skype* : @manindae >>>>>>>>>>>>> *Twitter* : @maninda >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Aug 20, 2014 at 11:33 AM, Niranda Perera < >>>>>>>>>>>>> nira...@wso2.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Anjana and Srinath, >>>>>>>>>>>>>> >>>>>>>>>>>>>> After the discussion I had with Anjana, I researched more on >>>>>>>>>>>>>> the continuation of Shark project by Databricks. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Here's what I found out, >>>>>>>>>>>>>> - Shark was built on the Hive codebase and achieved >>>>>>>>>>>>>> performance improvements by swapping out the physical execution >>>>>>>>>>>>>> engine part >>>>>>>>>>>>>> of Hive. While this approach enabled Shark users to speed up >>>>>>>>>>>>>> their Hive >>>>>>>>>>>>>> queries, Shark inherited a large, complicated code base from >>>>>>>>>>>>>> Hive that made >>>>>>>>>>>>>> it hard to optimize and maintain. 
>>>>>>>>>>>>>> Hence, Databricks has announced that they are halting the >>>>>>>>>>>>>> development of Shark from July, 2014. (Shark 0.9 would be the >>>>>>>>>>>>>> last release) >>>>>>>>>>>>>> [1] >>>>>>>>>>>>>> - Shark will be replaced by Spark SQL. It beats Shark in TPC-DS >>>>>>>>>>>>>> performance >>>>>>>>>>>>>> <http://databricks.com/blog/2014/06/02/exciting-performance-improvements-on-the-horizon-for-spark-sql.html> >>>>>>>>>>>>>> by almost an order of magnitude. It also supports all existing >>>>>>>>>>>>>> Hive data >>>>>>>>>>>>>> formats, user-defined functions (UDF), and the Hive metastore. >>>>>>>>>>>>>> [2] >>>>>>>>>>>>>> - Following is the Shark, Spark SQL migration plan >>>>>>>>>>>>>> http://spark-summit.org/wp-content/uploads/2014/07/Future-of-Spark-Patrick-Wendell.pdf >>>>>>>>>>>>>> >>>>>>>>>>>>>> - For the legacy Hive and MapReduce users, they have proposed >>>>>>>>>>>>>> a new 'Hive on Spark Project' [3], [4] >>>>>>>>>>>>>> But, given the performance enhancement, it is quite certain >>>>>>>>>>>>>> that Hive and MR would be replaced by engines build on top of >>>>>>>>>>>>>> Spark (ex: >>>>>>>>>>>>>> Spark SQL) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> In my opinion there are a few matters to figure out if we are >>>>>>>>>>>>>> migrating from Hive, >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. whether we are changing the query engine only? (Then, we >>>>>>>>>>>>>> can replace Hive by Shark) >>>>>>>>>>>>>> 2. whether we are changing the existing Hadoop/ MapReduce >>>>>>>>>>>>>> framework to Spark? (Then we can replace Hive and Hadoop with >>>>>>>>>>>>>> Spark and >>>>>>>>>>>>>> Spark SQL) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> In my opinion, considering the longterm impact and the >>>>>>>>>>>>>> availability of support, it is best to migrate the Hive/Hadoop >>>>>>>>>>>>>> to Spark. >>>>>>>>>>>>>> It is open for discussion! 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> In the mean time, I've already tried Spark SQL, and >>>>>>>>>>>>>> Databricks claims on improved performance seems to be true. I >>>>>>>>>>>>>> will work >>>>>>>>>>>>>> more on this. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] >>>>>>>>>>>>>> http://databricks.com/blog/2014/07/01/shark-spark-sql-hive-on-spark-and-the-future-of-sql-on-spark.html >>>>>>>>>>>>>> [2] >>>>>>>>>>>>>> http://databricks.com/blog/2014/06/02/exciting-performance-improvements-on-the-horizon-for-spark-sql.html >>>>>>>>>>>>>> [3] https://issues.apache.org/jira/browse/HIVE-7292 >>>>>>>>>>>>>> [4] >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 12:16 PM, Anjana Fernando < >>>>>>>>>>>>>> anj...@wso2.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Srinath, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> No, this has not been tested in multiple nodes. I told >>>>>>>>>>>>>>> Niranda here in my last mail, to test a cluster with the same >>>>>>>>>>>>>>> set of >>>>>>>>>>>>>>> hardware we have, that we are using to test our large data set >>>>>>>>>>>>>>> with Hive. >>>>>>>>>>>>>>> As for the effort to make the change, we still have to figure >>>>>>>>>>>>>>> out the MT >>>>>>>>>>>>>>> aspects of Shark here. Sinthuja was working on making the >>>>>>>>>>>>>>> latest Hive >>>>>>>>>>>>>>> version MT ready, and most probably, we can do the same changes >>>>>>>>>>>>>>> to the Hive >>>>>>>>>>>>>>> version Shark is using. So after we do that, the integration >>>>>>>>>>>>>>> should be >>>>>>>>>>>>>>> seamless. And also, as I mentioned earlier here, we are also >>>>>>>>>>>>>>> going to test >>>>>>>>>>>>>>> this with the APIM Hive script, to check if there are any >>>>>>>>>>>>>>> unforeseen >>>>>>>>>>>>>>> incompatibilities. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>> Anjana. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Aug 14, 2014 at 11:53 AM, Srinath Perera < >>>>>>>>>>>>>>> srin...@wso2.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This look great. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> We need to test Spark with multiple nodes? Did we do that. >>>>>>>>>>>>>>>> Please create few VMs in performance could (talk to Lakmal) >>>>>>>>>>>>>>>> and test with >>>>>>>>>>>>>>>> at least 5 nodes. We need to make sure it works OK with >>>>>>>>>>>>>>>> distributed setup >>>>>>>>>>>>>>>> as well. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> What does it take to change to spark? Anjana .. how much >>>>>>>>>>>>>>>> work is it? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> --Srinath >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, Aug 13, 2014 at 7:06 PM, Niranda Perera < >>>>>>>>>>>>>>>> nira...@wso2.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thank you Anjana. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yes, I am working on it. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In the mean time, I found this in Hive documentation [1]. >>>>>>>>>>>>>>>>> It talks about Hive on Spark, and compares Hive, Shark and >>>>>>>>>>>>>>>>> Spark SQL at an >>>>>>>>>>>>>>>>> higher architectural level. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Additionally, it is said that the in-memory performance of >>>>>>>>>>>>>>>>> Shark can be improved by introducing Tachyon [2]. I guess we >>>>>>>>>>>>>>>>> can consider >>>>>>>>>>>>>>>>> this later on. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cheers. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-1.3ComparisonwithSharkandSparkSQL >>>>>>>>>>>>>>>>> [2] >>>>>>>>>>>>>>>>> http://tachyon-project.org/Running-Tachyon-Locally.html >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Wed, Aug 13, 2014 at 3:17 PM, Anjana Fernando < >>>>>>>>>>>>>>>>> anj...@wso2.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi Niranda, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Excellent analysis of Hive vs Shark! .. This gives a lot >>>>>>>>>>>>>>>>>> of insight into how both operates in different scenarios. As >>>>>>>>>>>>>>>>>> the next step, >>>>>>>>>>>>>>>>>> we will need to run this in an actual cluster of computers. >>>>>>>>>>>>>>>>>> Since you've >>>>>>>>>>>>>>>>>> used a subset of the dataset of 2014 DEBS challenge, we >>>>>>>>>>>>>>>>>> should use the full >>>>>>>>>>>>>>>>>> data set in a clustered environment and check this. Gokul is >>>>>>>>>>>>>>>>>> already >>>>>>>>>>>>>>>>>> working on the Hive based setup for this, after that is >>>>>>>>>>>>>>>>>> done, you can >>>>>>>>>>>>>>>>>> create a Shark cluster in the same hardware and run the >>>>>>>>>>>>>>>>>> tests there, to get >>>>>>>>>>>>>>>>>> a clear comparison on how these two match up in a cluster. >>>>>>>>>>>>>>>>>> Until the setup >>>>>>>>>>>>>>>>>> is ready, do continue with your next steps on checking the >>>>>>>>>>>>>>>>>> RDD support and >>>>>>>>>>>>>>>>>> Spark SQL use. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> After these are done, we should also do a trial run of >>>>>>>>>>>>>>>>>> our own APIM Hive scripts, migrated to Shark. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>> Anjana. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Aug 11, 2014 at 12:21 PM, Niranda Perera < >>>>>>>>>>>>>>>>>> nira...@wso2.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I have been evaluating the performance of >>>>>>>>>>>>>>>>>>> Shark (distributed SQL query engine for Hadoop) against >>>>>>>>>>>>>>>>>>> Hive. This is with >>>>>>>>>>>>>>>>>>> the objective of seeing the possibility to move the WSO2 >>>>>>>>>>>>>>>>>>> BAM data >>>>>>>>>>>>>>>>>>> processing (which currently uses Hive) to Shark (and Apache >>>>>>>>>>>>>>>>>>> Spark) for >>>>>>>>>>>>>>>>>>> improved performance. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I am sharing my findings herewith. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> *AMP Lab Shark* >>>>>>>>>>>>>>>>>>> Shark can execute Hive QL queries up to 100 times faster >>>>>>>>>>>>>>>>>>> than Hive without any modification to the existing data or >>>>>>>>>>>>>>>>>>> queries. It >>>>>>>>>>>>>>>>>>> supports Hive's QL, metastore, serialization formats, and >>>>>>>>>>>>>>>>>>> user-defined >>>>>>>>>>>>>>>>>>> functions, providing seamless integration with existing >>>>>>>>>>>>>>>>>>> Hive deployments >>>>>>>>>>>>>>>>>>> and a familiar, more powerful option for new ones. [1] >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> *Apache Spark*Apache Spark is an open-source data >>>>>>>>>>>>>>>>>>> analytics cluster computing framework. It fits into the >>>>>>>>>>>>>>>>>>> Hadoop open-source >>>>>>>>>>>>>>>>>>> community, building on top of the HDFS and promises >>>>>>>>>>>>>>>>>>> performance up to 100 >>>>>>>>>>>>>>>>>>> times faster than Hadoop MapReduce for certain >>>>>>>>>>>>>>>>>>> applications. [2] >>>>>>>>>>>>>>>>>>> Official documentation: [3] >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I carried out the comparison between the following Hive >>>>>>>>>>>>>>>>>>> and Shark releases with input files ranging from 100 to 1 >>>>>>>>>>>>>>>>>>> billion entries. 
>>>>>>>>>>>>>>>>>>> The setups were as follows. QL engine: Apache Hive 0.11 for Hive, and Shark 0.9.1 (latest release, which uses Scala 2.10.3, Spark 0.9.1 and AMPLab's Hive 0.9.0) for Shark. Framework: Hadoop 1.0.4 for Hive, Spark 0.9.1 for Shark. File system: HDFS for both. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Attached herewith is a report which describes the performance comparison between Shark and Hive in detail. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> hive_vs_shark <https://docs.google.com/a/wso2.com/folderview?id=0B1GsnfycTl32QTZqUktKck1Ucjg&usp=drive_web> >>>>>>>>>>>>>>>>>>> hive_vs_shark_report.odt <https://docs.google.com/a/wso2.com/file/d/0B1GsnfycTl32X3J5dTh6Slloa0E/edit?usp=drive_web> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In summary, the following conclusions can be drawn from the evaluation. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> - Shark is on par with Hive in DDL operations (CREATE, DROP .. TABLE, DATABASE). Both engines show fairly constant performance as the input size increases. >>>>>>>>>>>>>>>>>>> - Shark is on par with Hive in plain DML operations (LOAD, INSERT), but when a DML operation is called in conjunction with a data retrieval operation (ex. INSERT <TBL> SELECT <PROP> FROM <TBL>), Shark significantly outperforms Hive, with a performance factor of 10x+ (ranging from 10x to 80x in some instances). Shark's performance factor decreases as the input size increases, while Hive's performance stays fairly constant. >>>>>>>>>>>>>>>>>>> - Shark clearly outperforms Hive in data retrieval operations (FILTER, ORDER BY, JOIN). Hive's performance is fairly constant across the data retrieval operations, while Shark's performance drops as the input size increases. But in every instance Shark outperformed Hive with a minimum performance factor of 5x+ (ranging from 5x to 80x in some instances). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Please refer to the 'hive_vs_shark_report'; it has all the information about the queries and timings, presented graphically. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The code repository can also be found at >>>>>>>>>>>>>>>>>>> https://github.com/nirandaperera/hiveToShark/tree/master/hiveVsShark >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Moving forward, I am currently working on the following. >>>>>>>>>>>>>>>>>>> - Apache Spark's resilient distributed dataset (RDD) abstraction (a collection of elements partitioned across the nodes of the cluster that can be operated on in parallel): the use of RDDs and their impact on performance.
>>>>>>>>>>>>>>>>>>> - Spark SQL - Use of this Spark SQL over Shark on >>>>>>>>>>>>>>>>>>> Spark framework >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> [1] https://github.com/amplab/shark/wiki >>>>>>>>>>>>>>>>>>> [2] http://en.wikipedia.org/wiki/Apache_Spark >>>>>>>>>>>>>>>>>>> [3] http://spark.apache.org/docs/latest/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Would love to have your feedback on this. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Best regards >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>> *Niranda Perera* >>>>>>>>>>>>>>>>>>> Software Engineer, WSO2 Inc. >>>>>>>>>>>>>>>>>>> Mobile: +94-71-554-8430 >>>>>>>>>>>>>>>>>>> Twitter: @n1r44 <https://twitter.com/N1R44> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> *Anjana Fernando* >>>>>>>>>>>>>>>>>> Senior Technical Lead >>>>>>>>>>>>>>>>>> WSO2 Inc. | http://wso2.com >>>>>>>>>>>>>>>>>> lean . enterprise . middleware >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> *Niranda Perera* >>>>>>>>>>>>>>>>> Software Engineer, WSO2 Inc. >>>>>>>>>>>>>>>>> Mobile: +94-71-554-8430 >>>>>>>>>>>>>>>>> Twitter: @n1r44 <https://twitter.com/N1R44> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> ============================ >>>>>>>>>>>>>>>> Srinath Perera, Ph.D. >>>>>>>>>>>>>>>> http://people.apache.org/~hemapani/ >>>>>>>>>>>>>>>> http://srinathsview.blogspot.com/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> *Anjana Fernando* >>>>>>>>>>>>>>> Senior Technical Lead >>>>>>>>>>>>>>> WSO2 Inc. | http://wso2.com >>>>>>>>>>>>>>> lean . enterprise . middleware >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Niranda Perera* >>>>>>>>>>>>>> Software Engineer, WSO2 Inc. 
--

David Morales de Frías :: +34 607 010 411 :: @dmoralesdf <https://twitter.com/dmoralesdf>

<http://www.stratio.com/>
Avenida de Europa, 26. Ática 5. 2ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 352 59 42 // *@stratiobd <https://twitter.com/StratioBD>*
_______________________________________________ Architecture mailing list Architecture@wso2.org https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture