Hi Iker, and welcome! It's nice to have more people getting involved in the project and bringing in new ideas, feedback, and code!

I'd like to touch on a couple of differences between Ignite and Spark, but I am sure other people will add their views as well.

- The main difference is, of course, that Ignite is an in-memory computing system, i.e. one that treats RAM as the primary storage facility, whereas others - Spark included - only use RAM for processing.
- Ignite's MapReduce is fully compatible with the Hadoop MR APIs, which lets everyone simply reuse existing legacy MR code yet run it with a >30x performance improvement.
- Also, unlike Spark's, streaming in Ignite isn't quantized by the size of an RDD. In other words, you don't need to form an RDD first before processing it; you can actually do real streaming.
- Unlike Spark, Ignite doesn't have the issue with data spill-overs to disk (which Tachyon was an attempt to address).
- As one of its components, Ignite provides a first-class file-system caching layer. Note, there's a Tachyon project and I have already addressed the differences between that and Ignite in [1], but it looks like my post got deleted for some reason. I wonder why? ;) [2]
- Ignite uses off-heap memory to avoid GC pauses, etc., and does so highly efficiently.
- Ignite guarantees strong consistency.
- Ignite supports full SQL99 as one of the ways to process the data, with full support for ACID transactions (as you have pointed out).
- With Ignite, a Java programmer doesn't have to learn the ropes of Scala. And I will withhold my professional opinion about the latter in order to keep this thread polite and concise ;)

I can keep on rambling for a long time, but you might consider reading [3] and [4], where Nikita Ivanov - one of the founders of this project - has a good reflection on the key differences.

[1] http://bit.ly/1JvTAB6
[2] https://twitter.com/c0sin/status/592825217606688768
[3] http://www.infoq.com/articles/gridgain-apache-ignite
[4] http://www.odbms.org/blog/2015/02/interview-nikita-ivanov/

Hope it helps to clarify the differences a bit.

Cos

On Mon, Apr 27, 2015 at 05:20PM, Iker Huerga wrote:
> Hi Ignite team,
>
> My name is Iker Huerga, I'm a Software Engineer, Data Scientist and
> entrepreneur with more than 8 years of experience in Java, I was a
> Lucene/Solr contributor in the past, and have been using Hadoop in
> production for more than 3 years now.
>
> After being contacted by one of the members of this community I got intrigued
> by the project you guys are working on. I took a look at the code and
> documentation, and would like to say 'kudos' to all of you. It's clear that
> there is a huge amount of work behind Ignite.
>
> I would like to see whether I can be a contributor to Ignite, but there's
> been a question in the back of my mind since I started reading about
> Ignite: what is the main difference with Apache Spark?
>
> Please note that I've already read the proposal [1], and I get the point
> that Ignite is a more general in-memory engine. But Spark also provides
> streaming processing, mapreduce computations, etc. Would you say the main
> difference is ACID trx in memory?
>
> Also, what is the roadmap for Ignite? Is it production ready?
>
> Sorry for so many questions..... in exchange for an answer I can take care
> of https://issues.apache.org/jira/browse/IGNITE-640 if you guys want to
> assign it to me
>
> Thanks in advance!
> Iker
>
>
> [1] https://wiki.apache.org/incubator/IgniteProposal
>
> --
> Iker Huerga
> http://www.ikerhuerga.com/
