Hi Shahani,

We haven't tried this feature with a graph database yet, but we expect graph databases to perform better than Cassandra here, because they are designed for exactly this kind of problem. To support this kind of relation search on Cassandra, we would have to do extra calculation and indexing for each data product when storing and retrieving information. This would be a good piece of research, though, and we will run some tests comparing Cassandra with a graph database.
We did some research on several graph databases. Selecting one may be a problem, however, given licensing issues and community support. We can't use Neo4J because it is licensed under GPLv3, which conflicts with the Apache License [0], and we found problems with some of the other databases as well. OrientDB [1][2] may be a good option, because it is licensed under the Apache License v2.0 and seems to have good community support.

If we use a graph database with MetCat, it will become a dependency of MetCat. Using two databases (i.e. Cassandra to store metadata and a graph database to store relations) may also be a problem. So we need your suggestions on this issue.

Thanks,
Hasitha

[0] - http://www.apache.org/licenses/GPL-compatibility.html
[1] - http://www.orientdb.org/orient-db.htm
[2] - http://code.google.com/p/orient/

On Sat, Aug 18, 2012 at 7:01 AM, Shahani Markus Weerawarana <[email protected]> wrote:

> Hi Hasitha,
>
> This would be an interesting exploration.
> Have you tried out your idea with something like Neo4J? Have you come
> across any performance comparison articles/papers of Graph DBs such as
> Neo4J, FlockDB, InfiniteGraph with Column DBs such as Cassandra?
>
> Shahani
>
> On Fri, Aug 17, 2012 at 8:24 PM, Hasitha Aravinda
> <[email protected]> wrote:
>
> > Hi Devs,
> >
> > In MetCat, it is a requirement to store relationships between data
> > entities and to query them. It might not be possible to achieve optimum
> > efficiency for such queries, since we are using a non-relational
> > database, i.e. Cassandra.
> >
> > As an example, take the following entities A, B, C, D, E, etc.
> > *If A->C->B->D->E and B->F->G->K*
> > We might want to know whether there exists a relationship between A and F
> > (that is, a path between A and F).
> >
> > In the real world there may be thousands of entities and relationships,
> > which may well degrade query efficiency if we use conventional databases
> > to address such requirements. These kinds of requirements are hard to
> > meet with Cassandra, but graph databases are built for them.
> >
> > So for the above requirement, we might need to use a graph database too.
> > We'd like to know your opinions on this.
> >
> > Thanks,
> > Hasitha.
>
> --
> *Shahani Markus Weerawarana, Ph.D.*
> *Computer Scientist*
> Visiting Lecturer, University of Moratuwa, Sri Lanka.
> Visiting Scientist, Indiana University, USA.
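
To make the question in the quoted example concrete (is there a path between A and F?), here is a minimal sketch of a reachability check using breadth-first search. It is only an illustration in Python over an in-memory edge list, not MetCat code and not tied to any particular database; the names EDGES, build_adjacency and path_exists are hypothetical. In a Cassandra-backed store, each neighbour lookup in the loop would presumably become a separate read, whereas a graph database would handle the traversal natively.

```python
from collections import deque

# Hypothetical edge list for the example in the thread:
# A->C->B->D->E and B->F->G->K
EDGES = [
    ("A", "C"), ("C", "B"), ("B", "D"), ("D", "E"),
    ("B", "F"), ("F", "G"), ("G", "K"),
]


def build_adjacency(edges):
    """Map each entity to the list of entities it points to."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])  # make sure sinks are present too
    return adjacency


def path_exists(adjacency, start, goal):
    """Return True if goal is reachable from start along directed edges."""
    if start not in adjacency or goal not in adjacency:
        return False
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        # In a key-value or column store, this lookup would be one read per node.
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return False


graph = build_adjacency(EDGES)
print(path_exists(graph, "A", "F"))  # A reaches F via C and B
print(path_exists(graph, "E", "A"))  # E has no outgoing edges
```

The search visits each entity and relationship at most once, so the cost grows with the size of the part of the graph reachable from the start entity; the open question in the thread is how expensive each of those visits is in the underlying store.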
