Re: [Neo4j] LOAD CSV creates nodes but does not set properties

Paul Damian Mon, 23 Jun 2014 05:20:05 -0700

I did. Do you have any new ideas on the current topic?

On Monday, 23 June 2014, 13:22:38 UTC+3, Michael Hunger wrote:
>
> Please start a new thread for this discussion.
>
> On 23.06.2014 at 11:02, Paul Damian <paulda...@gmail.com> wrote:
>
> Hey, 
> I'm trying to run a query that returns 10 clients and the companies they 
> work for. I've used a query like this:
> match (c: Client)-[WORKS_FOR]->(co: Company)  return c, co limit 10
> However, it keeps failing with a Java heap space error. Neo4j is installed 
> on a VM running Windows Server 2012 R2 with an Intel Xeon @ 2.27 GHz and 
> 8 GB of RAM. The graph database takes up over 30 GB (which is also odd, 
> since the SQL database that was used to populate the graph is only 13 GB). 
> What can I do to improve the query performance besides adding indexes?
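
One detail in the query quoted above may be worth checking. In Cypher, a name inside the square brackets without a leading colon, as in [WORKS_FOR], is bound as a variable and matches relationships of any type, whereas [:WORKS_FOR] restricts the match to that relationship type. A minimal sketch of the typed form, assuming the relationship type in the graph really is called WORKS_FOR; the thread does not establish whether this alone avoids the heap space error on this data set:

```cypher
// Assumption: the relationship type is WORKS_FOR, as the name in the posted query suggests.
// The leading colon makes WORKS_FOR a type filter rather than a variable name.
MATCH (c:Client)-[:WORKS_FOR]->(co:Company)
RETURN c, co
LIMIT 10;
```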
>
>
>
> On Wednesday, 18 June 2014, 16:34:10 UTC+3, Michael Hunger wrote:
>>
>> For me it sounds as if there is a big cross product happening.
>>
>> I.e. many Verticals with the same Id
>>
>> What happens if you do:
>>
>> MATCH (v:Vertical)
>> RETURN v.Id, count(*) 
>>
>> Michael
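
The check suggested above can be narrowed to show only the Ids that occur more than once. A sketch, using the Vertical label and Id property from the thread; the aliases id and n are illustrative:

```cypher
// List Vertical Ids shared by more than one node, most frequent first.
MATCH (v:Vertical)
WITH v.Id AS id, count(*) AS n
WHERE n > 1
RETURN id, n
ORDER BY n DESC
LIMIT 10;
```

If each Id is meant to identify a single node, then once any existing duplicates are removed, a uniqueness constraint (CREATE CONSTRAINT ON (v:Vertical) ASSERT v.Id IS UNIQUE, in the Neo4j 2.x syntax used throughout this thread) would reject new duplicates at load time.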
>>
>> On 18.06.2014 at 15:26, Paul Damian <paulda...@gmail.com> wrote:
>>
>> Hi,
>>
>> I've tried with another file, which contains ClientId and VerticalId. 
>> The thing is, there are only 7 verticals and 11M clients, so there is an 
>> obvious one-to-many relationship there.
>> When I run 
>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Vertical.csv" AS c
>> WITH c LIMIT 100
>> MATCH (cli: Client { Id: toInt(c.ClientId)}), (vert: Vertical { Id: toInt(c.VerticalId)})
>> Return count(*)
>> it returns Neo.DatabaseError.Statement.ExecutionFailure. 
>> I get the same result when I only match the verticals. 
>> However, if I run 
>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Vertical.csv" AS c
>> WITH c LIMIT 100
>> MATCH (cli: Client { Id: toInt(c.ClientId)})
>> Return count(*)
>> it returns 100.
>> I think it has something to do with the fact that the first 100 rows 
>> all have the same VerticalId.
>>
>> On Wednesday, 18 June 2014, 14:20:57 UTC+3, Michael Hunger wrote:
>>>
>>> sorry
>>>
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>> WITH c
>>> LIMIT 100
>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>>> Return count(*)
>>>
>>>
>>> On 18.06.2014 at 11:44, Paul Damian <paulda...@gmail.com> wrote:
>>>
>>> I cannot run this command. It returns an invalid syntax error. The only 
>>> way I could run it was 
>>>
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>>> Return count(*) Limit 100
>>>
>>> Also, I think a skype call would be great.
>>>
>>> On Tuesday, 17 June 2014, 21:36:05 UTC+3, Michael Hunger wrote:
>>>>
>>>> Then something is really wrong.
>>>>
>>>> What happens if you do
>>>>
>>>>  
>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>> Limit 100
>>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>>>> Return count(*)
>>>>
>>>> I'm at a conference in Amsterdam this week
>>>> but perhaps we can do a skype call next week?
>>>>
>>>> Michael
>>>>
>>>>
>>>>
>>>> Sent from mobile device
>>>>
>>>> On 17.06.2014 at 18:48, Paul Damian <paulda...@gmail.com> wrote:
>>>>
>>>> Yes, I do. I keep getting a Java heap space error now. I'm using a commit 
>>>> size of 100.
>>>>
>>>> On Tuesday, 17 June 2014, 19:28:05 UTC+3, Michael Hunger wrote:
>>>>>
>>>>> Ok, cool and you have the indexes for both :City(Id) and :Client(Id) ?
>>>>>
>>>>>
>>>>> Michael
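
For readers following along, the indexes asked about above could be created as sketched below. This uses the Neo4j 2.x schema syntax that matches the rest of the thread; later Neo4j versions use a different CREATE INDEX form.

```cypher
// Index the lookup properties used by the MATCH clauses in this thread.
CREATE INDEX ON :Client(Id);
CREATE INDEX ON :City(Id);
```

Indexes are populated in the background, so a lookup issued immediately after creating them may not benefit from them yet.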
>>>>>
>>>>> On 17.06.2014 at 18:15, Paul Damian <paulda...@gmail.com> wrote:
>>>>>
>>>>> The first query returns 999996, which is the number of rows in the file, 
>>>>> and the second one returns 
>>>>> Neo.DatabaseError.Statement.ExecutionFailure, 
>>>>> probably because of the null values. But then I ran the following 
>>>>> command:
>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>> MATCH (city:City { Id: toInt(c.CityId)})
>>>>> WHERE coalesce(c.CityId,"") <> ""
>>>>> RETURN count(*)
>>>>>
>>>>> and I get 992980
>>>>>
>>>>>
>>>>> On Tuesday, 17 June 2014, 17:55:56 UTC+3, Michael Hunger wrote:
>>>>>
>>>>>> No, you can just filter out the lines with no CityId.
>>>>>>
>>>>>> Did you run my suggested commands?
>>>>>>
>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>> MATCH (client: Client { Id: toInt(c.Id)})
>>>>>> RETURN count(*)
>>>>>>
>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>> MATCH (city: City { Id: toInt(c.CityId)})
>>>>>> RETURN count(*)
>>>>>>
>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>> return c
>>>>>> limit 10
>>>>>>
>>>>>>
>>>>>> On 17.06.2014 at 16:37, Paul Damian <paulda...@gmail.com> wrote:
>>>>>>
>>>>>> In the file I only have 2 columns: one for the client Id, which is 
>>>>>> never null, and CityId, which is sometimes null. Should I export the 
>>>>>> records from the SQL database leaving out the columns that contain null 
>>>>>> values?
>>>>>>
>>>>>> On Tuesday, 17 June 2014, 15:39:14 UTC+3, Michael Hunger wrote:
>>>>>>>
>>>>>>> If they don't have a value for CityId, do those rows still have an 
>>>>>>> empty column there, like "user-id,,"?
>>>>>>>
>>>>>>> You probably want to filter these rows?
>>>>>>>
>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>> WHERE coalesce(c.CityId,"") <> ""
>>>>>>> ...
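
A sketch of how that filter might be combined with the relationship import discussed elsewhere in the thread. In Cypher, WHERE attaches to a preceding clause such as MATCH or WITH, so a WITH c is placed between LOAD CSV and WHERE here. The commit size of 10000 is an arbitrary assumption, not a tuned value; the file path, labels, and property names are the ones used in the thread.

```cypher
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
// Skip rows whose CityId is missing or empty before any lookup happens.
WITH c
WHERE coalesce(c.CityId, "") <> ""
MATCH (client:Client { Id: toInt(c.Id) }), (city:City { Id: toInt(c.CityId) })
CREATE (client)-[:LOCATED_IN]->(city);
```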
>>>>>>>
>>>>>>> On 17.06.2014 at 11:23, Paul Damian <paulda...@gmail.com> wrote:
>>>>>>>
>>>>>>> Well, the csv file contains some rows that do not have a value for 
>>>>>>> CityId, and the rows are unique with respect to the client Id. There 
>>>>>>> are 11M clients living in 14K cities. Is there a limit on the number 
>>>>>>> of links per node?
>>>>>>> For now I've written a piece of code that reads from the file and 
>>>>>>> creates each relationship but, as you can imagine, it runs really 
>>>>>>> slowly in this scenario.
>>>>>>>  
>>>>>>>
>>>>>>>> Did you create an index on :Client(Id) and :City(Id)?
>>>>>>>>
>>>>>>>> What happens if you do:
>>>>>>>>
>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>>> MATCH (client: Client { Id: toInt(c.Id)})
>>>>>>>> RETURN count(*)
>>>>>>>>
>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>>> MATCH (city: City { Id: toInt(c.CityId)})
>>>>>>>> RETURN count(*)
>>>>>>>>
>>>>>>>> Each count should equal the number of rows in the file.
>>>>>>>>
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> On 16.06.2014 at 17:47, Paul Damian <paulda...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Somehow I've managed to load all the nodes and now I'm trying to 
>>>>>>>> load the links as well. I read the nodes from csv file and create the 
>>>>>>>> relation between them. I run the following command:
>>>>>>>> USING PERIODIC COMMIT 100
>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>>>>>>>> CREATE (client)-[r:LOCATED_IN]->(city)
>>>>>>>>
>>>>>>>> Running with a smaller commit size returns this error 
>>>>>>>> Neo.DatabaseError.Statement.ExecutionFailure, while increasing the 
>>>>>>>> commit size to 10000 throws 
>>>>>>>> Neo.DatabaseError.General.UnknownFailure. 
>>>>>>>> Can you help me with this?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thursday, 5 June 2014, 12:05:18 UTC+3, Michael Hunger wrote:
>>>>>>>>>
>>>>>>>>> Perhaps something with field or line terminators?
>>>>>>>>>
>>>>>>>>> I assume it blows up the field separation.
>>>>>>>>>
>>>>>>>>> Try to run:
>>>>>>>>>
>>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS c
>>>>>>>>> RETURN { Id: toInt(c.Id), FirstName: c.FirstName, LastName: c.Lastname,
>>>>>>>>>   Address: c.Address, ZipCode: toInt(c.ZipCode), Email: c.Email,
>>>>>>>>>   Phone: c.Phone, Fax: c.Fax, BusinessName: c.BusinessName, URL: c.URL,
>>>>>>>>>   Latitude: toFloat(c.Latitude), Longitude: toFloat(c.Longitude),
>>>>>>>>>   AgencyId: toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)} as data, c as line
>>>>>>>>> LIMIT 3
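
If the export turns out to use a delimiter other than a comma, LOAD CSV accepts a FIELDTERMINATOR option in Neo4j versions that support it. A sketch, assuming (hypothetically) that the file is semicolon-separated; the thread does not say which delimiter the SQL Server export actually produced. Returning the raw row shows whether the header names arrived as separate keys or as one long string:

```cypher
// Assumption: fields are separated by ';' rather than ','.
LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS c
FIELDTERMINATOR ';'
RETURN c
LIMIT 3;
```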
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jun 5, 2014 at 10:51 AM, Paul Damian <paulda...@gmail.com>
>>>>>>>>>  wrote:
>>>>>>>>>
>>>>>>>>>> I've tried using the shell and I get the same results: nodes with 
>>>>>>>>>> no properties.
>>>>>>>>>> I've created the csv file using MsSQL Server Export. Is it 
>>>>>>>>>> relevant?
>>>>>>>>>>
>>>>>>>>>> About your curiosity: I figured I would import the nodes first, 
>>>>>>>>>> then the relationships from the connection tables. Am I doing it 
>>>>>>>>>> wrong?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> On Thursday, 5 June 2014, 09:54:31 UTC+3, Michael Hunger wrote:
>>>>>>>>>>>
>>>>>>>>>>> I'd probably use a commit size in your case of 50k or 100k.
>>>>>>>>>>>
>>>>>>>>>>> Try to use the neo4j-shell and not the web-interface.
>>>>>>>>>>>
>>>>>>>>>>> Connect to neo4j using bin/neo4j-shell
>>>>>>>>>>>
>>>>>>>>>>> Then run your commands ending with a semicolon.
>>>>>>>>>>>
>>>>>>>>>>> Just curious: Your data is imported as one node per row? That's 
>>>>>>>>>>> not really a graph structure.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 4, 2014 at 6:56 PM, Paul Damian <paulda...@gmail.com
>>>>>>>>>>> > wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi there,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm experimenting with Neo4j while benchmarking a bunch of 
>>>>>>>>>>>> NoSQL databases for my graduation paper. 
>>>>>>>>>>>> I'm using the web interface to populate the database. I've been 
>>>>>>>>>>>> able to load the smaller tables from my SQL database and LOAD CSV 
>>>>>>>>>>>> works 
>>>>>>>>>>>> fine.
>>>>>>>>>>>> By small, I mean a few columns (4-5) and some rows (1 million). 
>>>>>>>>>>>> However, when I try to upload a larger table (15 columns, 12 
>>>>>>>>>>>> million rows), 
>>>>>>>>>>>> it creates the nodes but it doesn't set any properties.
>>>>>>>>>>>> I've tried to reduce the number of records (to 100) and also 
>>>>>>>>>>>> the number of columns( just the Id property ), but no luck so far.
>>>>>>>>>>>>
>>>>>>>>>>>> The Cypher command used is this one:
>>>>>>>>>>>> USING PERIODIC COMMIT 100
>>>>>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS c
>>>>>>>>>>>> CREATE (:Client { Id: toInt(c.Id), FirstName: c.FirstName, LastName: c.Lastname,
>>>>>>>>>>>>   Address: c.Address, ZipCode: toInt(c.ZipCode), Email: c.Email,
>>>>>>>>>>>>   Phone: c.Phone, Fax: c.Fax, BusinessName: c.BusinessName, URL: c.URL,
>>>>>>>>>>>>   Latitude: toFloat(c.Latitude), Longitude: toFloat(c.Longitude),
>>>>>>>>>>>>   AgencyId: toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)})
>>>>>>>>>>>>
>>>>>>>>>>>> Any help or pointers are welcome,
>>>>>>>>>>>> Paul
>>>>>>>>>>>>
>>>>>>>>>>>> -- 
>>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>>> Google Groups "Neo4j" group.
>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>>> it, send an email to neo4j+un...@googlegroups.com.
>>>>>>>>>>>>
>>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>>>

