The first query returns 999996, which is the number of rows in the file, and the second one returns Neo.DatabaseError.Statement.ExecutionFailure, probably because of the null values. But then I run the following command:

LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
MATCH (city:City { Id: toInt(c.CityId)})
WHERE coalesce(c.CityId,"") <> ""
RETURN count(*)
and I get 992980.

On Tuesday, 17 June 2014, 17:55:56 UTC+3, Michael Hunger wrote:
> No, you can just filter out the lines with no CityId.
>
> Did you run my suggested commands?
>
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
> MATCH (client: Client { Id: toInt(c.Id)})
> RETURN count(*)
>
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
> MATCH (city: City { Id: toInt(c.CityId)})
> RETURN count(*)
>
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
> RETURN c
> LIMIT 10
>
> On 17.06.2014 at 16:37, Paul Damian <paulda...@gmail.com> wrote:
>
> In the file I only have 2 columns: one for the client id, which is never null, and CityId, which may sometimes be null. Should I export the records from the SQL database leaving out the columns that contain null values?
>
> On Tuesday, 17 June 2014, 15:39:14 UTC+3, Michael Hunger wrote:
>>
>> If they don't have a value for city id, do they still have empty columns there? Like "user-id,,
>>
>> You probably want to filter these rows?
>>
>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>> WITH c
>> WHERE coalesce(c.CityId,"") <> ""
>> ...
>>
>> On 17.06.2014 at 11:23, Paul Damian <paulda...@gmail.com> wrote:
>>
>> Well, the csv file contains some rows that do not have a value for CityId, and the rows are unique regarding the clientID. There are 11M clients living in 14K cities. Is there a limit on links per node?
>> Now I've created a piece of code that reads from the file and creates each relationship but, as you can imagine, it works really slowly in this scenario.
>>
>>> Did you create an index on :Client(Id) and :City(Id)?
>>>
>>> What happens if you do:
>>>
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>> MATCH (client: Client { Id: toInt(c.Id)})
>>> RETURN count(*)
>>>
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>> MATCH (city: City { Id: toInt(c.CityId)})
>>> RETURN count(*)
>>>
>>> Each count should be equal to the number of rows in the file.
>>>
>>> Michael
>>>
>>> On 16.06.2014 at 17:47, Paul Damian <paulda...@gmail.com> wrote:
>>>
>>> Somehow I've managed to load all the nodes and now I'm trying to load the links as well. I read the nodes from the csv file and create the relation between them. I run the following command:
>>> USING PERIODIC COMMIT 100
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>>> CREATE (client)-[r:LOCATED_IN]->(city)
>>>
>>> Running with a smaller commit size returns the error Neo.DatabaseError.Statement.ExecutionFailure, while increasing the commit size to 10000 throws Neo.DatabaseError.General.UnknownFailure.
>>> Can you help me with this?
>>>
>>> On Thursday, 5 June 2014, 12:05:18 UTC+3, Michael Hunger wrote:
>>>>
>>>> Perhaps something with field or line terminators?
>>>>
>>>> I assume it blows up the field separation.
>>>>
>>>> Try to run:
>>>>
>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS c
>>>> RETURN { Id: toInt(c.Id), FirstName: c.FirstName, LastName: c.Lastname,
>>>>   Address: c.Address, ZipCode: toInt(c.ZipCode), Email: c.Email, Phone: c.Phone,
>>>>   Fax: c.Fax, BusinessName: c.BusinessName, URL: c.URL,
>>>>   Latitude: toFloat(c.Latitude), Longitude: toFloat(c.Longitude),
>>>>   AgencyId: toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)} AS data, c AS line
>>>> LIMIT 3
>>>>
>>>> On Thu, Jun 5, 2014 at 10:51 AM, Paul Damian <paulda...@gmail.com> wrote:
>>>>
>>>>> I've tried using the shell and I get the same results: nodes with no properties.
>>>>> I've created the csv file using MsSQL Server Export. Is it relevant?
>>>>>
>>>>> About your curiosity: I figured I would import the nodes first, then the relationships from the connection tables. Am I doing it wrong?
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Thursday, 5 June 2014, 09:54:31 UTC+3, Michael Hunger wrote:
>>>>>>
>>>>>> I'd probably use a commit size of 50k or 100k in your case.
>>>>>>
>>>>>> Try to use the neo4j-shell and not the web interface.
>>>>>>
>>>>>> Connect to neo4j using bin/neo4j-shell
>>>>>>
>>>>>> Then run your commands, ending each with a semicolon.
>>>>>>
>>>>>> Just curious: your data is imported as one node per row? That's not really a graph structure.
>>>>>>
>>>>>> On Wed, Jun 4, 2014 at 6:56 PM, Paul Damian <paulda...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I'm experimenting with Neo4j while benchmarking a bunch of NoSQL databases for my graduation paper.
>>>>>>> I'm using the web interface to populate the database. I've been able to load the smaller tables from my SQL database and LOAD CSV works fine.
>>>>>>> By small, I mean a few columns (4-5) and some rows (1 million).
>>>>>>> However, when I try to upload a larger table (15 columns, 12 million rows), it creates the nodes but it doesn't set any properties.
>>>>>>> I've tried to reduce the number of records (to 100) and also the number of columns (just the Id property), but no luck so far.
>>>>>>>
>>>>>>> The Cypher command used is this one:
>>>>>>> USING PERIODIC COMMIT 100
>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS c
>>>>>>> CREATE (:Client { Id: toInt(c.Id), FirstName: c.FirstName, LastName: c.Lastname,
>>>>>>>   Address: c.Address, ZipCode: toInt(c.ZipCode), Email: c.Email, Phone: c.Phone,
>>>>>>>   Fax: c.Fax, BusinessName: c.BusinessName, URL: c.URL,
>>>>>>>   Latitude: toFloat(c.Latitude), Longitude: toFloat(c.Longitude),
>>>>>>>   AgencyId: toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)})
>>>>>>>
>>>>>>> Any help or pointers are welcome,
>>>>>>> Paul
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google Groups "Neo4j" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
>>>>>>> For more options, visit https://groups.google.com/d/optout.
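
Putting the advice from this thread together, the relationship import could be sketched roughly as below. This is only a sketch under assumptions, not a tested fix: it assumes the Neo4j 2.x-era Cypher syntax used elsewhere in the thread (toInt, CREATE INDEX ON, USING PERIODIC COMMIT), it takes the commit size of 50000 from the "50k or 100k" suggestion, and it assumes the rows with an empty CityId are what makes the original statement fail, which the thread does not confirm. Run it from neo4j-shell, ending each statement with a semicolon.

```cypher
// Indexes so that each MATCH below is an index lookup instead of a label scan
// (Neo4j 2.x syntax; run these once, before the import).
CREATE INDEX ON :Client(Id);
CREATE INDEX ON :City(Id);

// Drop rows whose CityId is missing or empty before it is converted with toInt(),
// then link each client to its city.
USING PERIODIC COMMIT 50000
LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
WITH c
WHERE coalesce(c.CityId, "") <> ""
MATCH (client:Client { Id: toInt(c.Id) })
MATCH (city:City { Id: toInt(c.CityId) })
CREATE (client)-[:LOCATED_IN]->(city);
```

Replacing the final CREATE with RETURN count(*) (and removing the USING PERIODIC COMMIT line) gives a dry run; the count it reports is the number of relationships the import would create.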