Re: [Neo4j] performance when deleting large numbers of nodes

2016-06-20 Thread 'Chris Vest' via Neo4j
If you perform deletes in parallel, it can be worth investing some time in 
making the code smart enough to choose disjoint data sets for the transactions 
that run in parallel; e.g. no node should be the start or end node of a 
relationship being deleted in more than one parallel transaction at a time. 
That way they won’t contend on locks, or worse, run into deadlocks and roll back.
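
One possible way to pick such disjoint sets, as a rough sketch and not something Neo4j provides: plan the work in rounds, and within a round give a relationship to a batch only if neither of its end nodes is already owned by a different batch. Relationships that straddle two batches are deferred to a later round. The `DisjointBatches` class, the `Rel` type, the `plan` method and the `workers` parameter are made-up names for illustration, and the sketch assumes the start and end node ids of each relationship have been read up front.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DisjointBatches {

    /** Hypothetical value type: a relationship id plus the ids of its two end nodes. */
    static final class Rel {
        final long id, start, end;
        Rel(long id, long start, long end) { this.id = id; this.start = start; this.end = end; }
    }

    /**
     * Splits rels into rounds. Each round holds `workers` batches (some may be
     * empty), and within a round no node id appears in more than one batch.
     */
    static List<List<List<Rel>>> plan(List<Rel> rels, int workers) {
        if (workers < 1) throw new IllegalArgumentException("workers must be >= 1");
        List<List<List<Rel>>> rounds = new ArrayList<>();
        List<Rel> pending = new ArrayList<>(rels);
        while (!pending.isEmpty()) {
            Map<Long, Integer> owner = new HashMap<>();   // node id -> batch index in this round
            List<List<Rel>> batches = new ArrayList<>();
            for (int w = 0; w < workers; w++) batches.add(new ArrayList<>());
            List<Rel> deferred = new ArrayList<>();
            int next = 0;                                  // round-robin target for unclaimed nodes
            for (Rel r : pending) {
                Integer a = owner.get(r.start);
                Integer b = owner.get(r.end);
                int target;
                if (a == null && b == null) {
                    target = next;
                    next = (next + 1) % workers;
                } else if (a == null) {
                    target = b;
                } else if (b == null || a.equals(b)) {
                    target = a;
                } else {
                    deferred.add(r);                       // end nodes owned by two different batches
                    continue;
                }
                owner.put(r.start, target);
                owner.put(r.end, target);
                batches.get(target).add(r);
            }
            rounds.add(batches);
            pending = deferred;
        }
        return rounds;
    }
}
```

The batches of one round could then be run in parallel, one transaction per batch, with the rounds executed one after another.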

--
Chris Vest
System Engineer, Neo Technology
[ skype: mr.chrisvest, twitter: chvest ]



-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to neo4j+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [Neo4j] performance when deleting large numbers of nodes

2016-06-18 Thread 'Michael Hunger' via Neo4j
It shouldn't be slow. A faster disk would help, and so would running the batches concurrently.
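
A rough sketch of what running batches concurrently could look like with a plain `ExecutorService`, offered as one option and not as a tested recipe. The `ConcurrentBatches` class, `runInBatches`, `batchSize`, `threads` and the `deleteBatch` callback are assumed names; the callback is expected to open its own transaction, delete the relationships for the ids it receives, and commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

final class ConcurrentBatches {

    /**
     * Cuts ids into consecutive slices of at most batchSize and hands each
     * slice to deleteBatch on a fixed-size thread pool. Blocks until every
     * slice has been processed; a failing slice surfaces as an exception.
     */
    static void runInBatches(List<Long> ids, int batchSize, int threads,
                             Consumer<List<Long>> deleteBatch) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int from = 0; from < ids.size(); from += batchSize) {
                // subList is a read-only view here; ids must not be modified meanwhile
                List<Long> batch = ids.subList(from, Math.min(from + batchSize, ids.size()));
                pending.add(pool.submit(() -> deleteBatch.accept(batch)));
            }
            for (Future<?> f : pending) {
                f.get();                                   // wait, and rethrow batch failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for batches", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Batches that touch the same nodes can still contend on locks, so the ids may need to be grouped with that in mind before they are sliced.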

Sent from my iPhone



Re: [Neo4j] performance when deleting large numbers of nodes

2016-06-18 Thread Clark Richey
Yes. That's a lot to delete; doing it in parallel will definitely help.

Sent from my iPhone



Re: [Neo4j] performance when deleting large numbers of nodes

2016-06-18 Thread John Fry

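Clark - this works. It is still slow. I guess multithreading may help some.

    int txc = 0;
    Transaction tx = db.beginTx();
    //try ( Transaction tx = db.beginTx() ) {
    for (int i = 0; i < deletedLinks.size(); i++) {
        Relationship rel = db.getRelationshipById(deletedLinks.get(i));
        rel.delete();
        txc++;
        if (txc > 5) {
            // commit this batch (every 6 deletes) and open a new transaction
            txc = 0;
            tx.success();
            tx.close();
            tx = db.beginTx();
        }
    }
    // commit whatever is left in the final batch
    tx.success();
    tx.close();
    //}
    //catch (Exception e) {
    //    System.out.println("Exception link deletion: " + e.getMessage());
    //}
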


Re: [Neo4j] performance when deleting large numbers of nodes

2016-06-18 Thread Clark Richey
Don't nest them. Just create a counter, and every x deletes commit the 
transaction and open a new one.
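
A minimal sketch of that pattern, written so it compiles without Neo4j on the classpath. `Tx` and `Store` are hypothetical stand-ins for `Transaction` and for the parts of `GraphDatabaseService` used in this thread (`beginTx()`, plus a delete that would be `db.getRelationshipById(id).delete()` in the real API); `PeriodicCommit`, `deleteAll` and `batchSize` are likewise made-up names. Instead of a running counter it slices the id list, which amounts to the same thing: one short transaction per `batchSize` deletes.

```java
import java.util.List;

final class PeriodicCommit {

    /** Hypothetical stand-in for org.neo4j.graphdb.Transaction. */
    interface Tx extends AutoCloseable {
        void success();
        @Override void close();
    }

    /** Hypothetical stand-in for the GraphDatabaseService calls used here. */
    interface Store {
        Tx beginTx();
        void deleteRelationship(long id);
    }

    /**
     * Deletes the given relationship ids, committing after every batchSize
     * deletes. Returns the number of transactions that were committed.
     */
    static int deleteAll(Store db, List<Long> ids, int batchSize) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        int commits = 0;
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            try (Tx tx = db.beginTx()) {
                for (long id : ids.subList(from, to)) {
                    db.deleteRelationship(id);
                }
                tx.success();      // mark as successful; close() then commits
            }
            commits++;
        }
        return commits;
    }
}
```

With try-with-resources a transaction that fails partway is closed without `success()` having been called, so only that batch is rolled back and earlier batches stay committed.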

Sent from my iPhone



Re: [Neo4j] performance when deleting large numbers of nodes

2016-06-18 Thread John Fry
Thanks Clark - is there any good/recommended way to nest the commits?

Thx JF



Re: [Neo4j] performance when deleting large numbers of nodes

2016-06-18 Thread Clark Richey
You need to commit periodically. Holding that many changes in memory in a 
single transaction isn't efficient.

Sent from my iPhone

> On Jun 18, 2016, at 16:41, John Fry wrote:
> 
> Hello All,
> 
> I have a graph of about 200M relationships and often I need to delete a 
> large number of them.
> For the proxy code below I am seeing huge memory usage and memory thrashing 
> when deleting about 15M relationships.
> 
> When it hits tx.close() I see all CPU cores start working at close to 100% 
> utilization and thrash for more than 30 minutes.
> I need this to work in under 5 minutes, ideally.
> 
> (Note: when I make large numbers of property changes or create large 
> numbers of new properties, I don't have such issues.)
> 
> Any advice? Why is this happening?
> 
> Regards, John.
> 
>     int txc = 0;
>     // serially delete the links
>     try ( Transaction tx = db.beginTx() ) {
>         for (int i = 0; i < deletedLinks.size(); i++) {
>             Relationship rel = db.getRelationshipById(deletedLinks.get(i));
>             rel.delete();
>             txc++;
>             if (txc > 5) {
>                 txc = 0;
>                 // only marks the transaction successful; nothing commits until close()
>                 tx.success();
>             }
>         }
>         tx.success();
>         tx.close();
>     }
>     catch (Exception e) {
>         System.out.println("Exception link deletion: " + e.getMessage());
>     }
> 