Hi Michael,

Thanks for this. I tried breaking everything down with a limit of 1,000, but it still takes forever to run.

Do you know of any way to create a repeating script or loop that can parse a CSV file with one permutation per line (each answer in a different column), concatenate these numbers into the various queries, and repeat until the CSV file is finished?

I have mapped out all the potential combinations for 8 parameters, and I am creating individual relationships based on a line (array) received from the CSV file, for example:

quA, quB, quC, quD, quE, quF, quG, quH, 1, 1
quA, quB, quC, quD, quE, quF, quG, 1, quI, 1
quA, quB, quC, quD, quE, quF, 1, quH, quI, 1

through to....

5, 5, quC, quD, quE, quF, quG, quH, quI, quJ

(where each qu* represents a column in the CSV file)

and the merge query would look like

MATCH (a1:Profile)
MATCH (b1:Profile)
WHERE a1.profileID = 1111111111 AND b1.profileId = 1111111122
MERGE (a1)-[rel:SIMILAR]-(b1)
ON CREATE SET rel.strength = 8

There are 720 of these in total, and if I could feed each of the 562500 rows into this as a batch, it would probably work and not cause me a bunch of headaches, so I can then get on with testing the ideas behind the application. Being on a self-educating path is really showing its limitations now.

Dave

On Thursday, 23 March 2017 21:30:47 UTC, Michael Hunger wrote:
>
> Hi Dave,
>
> would be good to look at a sample first of all:
>
> you should create about 10k-100k relationships per transaction.
>
> For "joining" nodes, which is not an optimized graph operation, you should
> have at least the very selective properties indexed.
>
> Before running the queries I suggest using EXPLAIN / PROFILE
>
> e.g.
>
> MATCH (a1:Profile), (b1:Profile)
> WHERE a1.profileID < b1.profileId AND a1.quA = b1.quA AND a1.quB = b1.quB
> AND a1.quC = b1.quC AND a1.quD = b1.quD AND a1.quE = b1.quE
> AND a1.quF = b1.quF AND a1.quG = b1.quG
> CREATE UNIQUE (a1)-[:SIMILAR {strength: 7} ]->(b1)
>
> PROFILE / EXPLAIN
> MATCH (a1:Profile)
> WITH a1 LIMIT 1000 // sample
> MATCH (b1:Profile)
> WHERE a1.profileID < b1.profileId AND a1.quA = b1.quA AND a1.quB = b1.quB
> AND a1.quC = b1.quC AND a1.quD = b1.quD AND a1.quE = b1.quE
> AND a1.quF = b1.quF AND a1.quG = b1.quG
> MERGE (a1)-[rel:SIMILAR]-(b1) ON CREATE SET rel.strength = 7
>
> you should see at least one index lookup for b1, ideally on the most
> selective property.
>
> Michael
>
>
> On Thu, Mar 23, 2017 at 3:35 PM, Dave Clissold <dave.cli...@gmail.com> wrote:
>
>> I am fairly new to programming and this is my first time using graph
>> databases, Cypher and Neo4j. I am learning as I go, testing to see if
>> each stage is a viable route to final development and trying to gain
>> enough of a basic understanding of each element needed for the
>> application, so I can hire and communicate with a full-time team, as
>> well as be able to do grunt work when needed, rather than be the
>> entrepreneur who has no clue about what is happening and just expects
>> things to happen. Any assistance would be greatly appreciated.
>>
>> I am trying to create a database which will allow users with similar
>> profiles to match. They have answered questions, and I have been able
>> to create the nodes that represent each profile possibility by
>> assigning a numerical value to each answer, so I have:
>>
>> :Profile
>> quA: 1, quB: 1, quC: 1, quD: 1, quE: 1, quF: 1, quG: 1, quH: 1, quI: 1,
>> quJ: 1
>> ....
>> all the way to
>> ....
>> quA: 5, quB: 5, quC: 5, quD: 5, quE: 5, quF: 5, quG: 3, quH: 3, quI: 2,
>> quJ: 2
>>
>> where each numerical value is stored as an integer. This has resulted in
>> 562500 nodes imported by CSV, which created a 515 MB database. I have
>> also concatenated the answers to create a unique ID for each node so
>> that I can run the following query.
>>
>> MATCH (a1:Profile), (b1:Profile)
>> WHERE a1.profileID < b1.profileId AND a1.quA = b1.quA AND a1.quB = b1.quB
>> AND a1.quC = b1.quC AND a1.quD = b1.quD AND a1.quE = b1.quE
>> AND a1.quF = b1.quF AND a1.quG = b1.quG
>> CREATE UNIQUE (a1)-[:SIMILAR {strength: 7} ]->(b1)
>>
>>
>> and so on, so that I have every combination from 7 parameters matching
>> up to 9 parameters matching. I know that will eventually create 175
>> relationships per node, so a massive total of 98,437,500 relationships.
>>
>>
>> I have set this up in a Docker container on a Google Compute 8-core,
>> 52 GB machine (the max on the free trial option), with a 65500 MB heap
>> size (based on the calculator).
>>
>> I am trying to find out if there is a more efficient way to create these
>> relationships. On this setup I have tried running the first query
>> (above); it has so far taken over 5 hours and has not finished. Can
>> anyone suggest a better query or workflow to create such a large number
>> of relationships? The last thing I want to do is try to create
>> individual relationships and input them, unless someone can suggest a
>> way of doing this via a script that sends the queries via JSON.
>>
>> Regards
>>
>>
>> Dave
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to neo4j+un...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
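
For the repeating script asked about at the top of the thread, one possible shape is sketched below in Python. It is only a sketch under several assumptions: the CSV has ten integer columns (quA to quJ) and no header row; a "template" is a hypothetical list of ten entries where None means "keep this row's answer" and a number means "use this fixed answer"; the ID property is spelled profileID on every node (property names are case-sensitive in Cypher, and the queries above mix profileID and profileId); and strength is taken to be the number of answers the two profiles share. The names build_batches, pairs_for_row and apply_template are made up for illustration. The script only builds (query, parameters) batches of the 10k size suggested above; sending them is left to whichever driver or HTTP client is in use, and older Neo4j releases may need {pairs} instead of $pairs.

```python
import csv
from itertools import islice

# One statement per batch; $pairs is a list of {a, b, strength} maps.
# Assumption: every node stores its ID as `profileID` and that property is indexed.
MERGE_QUERY = """
UNWIND $pairs AS pair
MATCH (a:Profile {profileID: pair.a})
MATCH (b:Profile {profileID: pair.b})
MERGE (a)-[rel:SIMILAR]-(b)
ON CREATE SET rel.strength = pair.strength
"""


def profile_id(answers):
    """Concatenate single-digit answers into one integer ID, e.g. 1111111122."""
    return int("".join(str(a) for a in answers))


def apply_template(row, template):
    """None keeps the row's own answer; a number replaces it."""
    return [r if t is None else t for r, t in zip(row, template)]


def pairs_for_row(row, templates):
    """Yield one {a, b, strength} map per template that yields a different profile."""
    for template in templates:
        other = apply_template(row, template)
        if other == row:
            continue  # the template reproduced the same profile, nothing to link
        # Assumption: strength = number of answers the two profiles share.
        strength = sum(1 for x, y in zip(row, other) if x == y)
        yield {"a": profile_id(row), "b": profile_id(other), "strength": strength}


def chunked(items, size):
    """Split any iterable into lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def build_batches(csv_lines, templates, batch_size=10_000):
    """Yield (query, params) tuples, one per batch of relationship pairs."""
    rows = ([int(v) for v in record] for record in csv.reader(csv_lines) if record)
    all_pairs = (p for row in rows for p in pairs_for_row(row, templates))
    for chunk in chunked(all_pairs, batch_size):
        yield MERGE_QUERY, {"pairs": chunk}


if __name__ == "__main__":
    # Two hypothetical templates: fix quI and quJ, or fix quH and quJ.
    demo_templates = [[None] * 8 + [1, 1], [None] * 7 + [1, None, 1]]
    demo_lines = ["1, 1, 1, 1, 1, 1, 1, 1, 2, 2"]
    for query, params in build_batches(demo_lines, demo_templates):
        print(len(params["pairs"]), "pairs in this batch")
```

To read from a file, an open file handle can be passed as csv_lines, and each yielded tuple would then be run in its own transaction. Because the MERGE pattern is undirected, a pair generated once from each side should resolve to the same relationship rather than a duplicate.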