Re: [Neo4j] Loading nodes matching an indexed UUID

'Michael Hunger' via Neo4j Tue, 30 Jan 2018 10:34:46 -0800

Hi Vincent,


On Tue, Jan 30, 2018 at 4:27 PM, Vincent Mooser <vincent.moo...@gmail.com>
wrote:

> Hi,
>
> How much memory does the machine have?
>
> The machine has 64g of memory, so I think I can increase my page cache.
> But I should have at least twice this memory to be able to load the whole
> graph in the page cache.
>

I would definitely increase the page-cache.

If it's only 100k nodes that you're loading, it should be fine.
Pages are evicted from the page-cache based on usage (LRU-K), so if those
100k nodes keep getting used, their pages stay in.
If a lot of other data is loaded, though, they might still get evicted.
There is no idle eviction.
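
A warm-up along those lines could look roughly like the sketch below. It
assumes you pass the ~100k oids that Solr knows about as the uuidList
parameter (the name is reused from your query; the warmedNodes alias is just
illustrative). Reading the properties, not only matching the nodes, is what
should pull the property pages in as well.

```cypher
// Warm-up sketch: touch every FOLDER node known to Solr once at startup.
// {uuidList} is assumed to hold all ~100k oids, not just one page of 200.
MATCH (node:FOLDER)
WHERE node.oid IN {uuidList}
// properties(node) forces the property records to be read, not just the node records
WITH properties(node) AS props
RETURN count(props) AS warmedNodes
```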

For the node-properties there are separate pages.
From your description it would be 2 or at most 3 property-records per node.

The disk is the biggest issue. If you can compensate with a larger
page-cache to avoid disk hits, that will help (at least for reads).

Switch to 3.3.2
Use 12G heap
Use 48G page-cache

Then this should be better.
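
In neo4j.conf the two memory settings would look something like this. It is
a sketch for the 64g machine you describe: it leaves roughly 4g for the OS
and everything else, so adjust it if other processes run on the box. Setting
the initial heap equal to the max is my own choice here, not a requirement.

```properties
# neo4j.conf (Neo4j 3.x setting names)
dbms.memory.heap.initial_size=12g
dbms.memory.heap.max_size=12g
dbms.memory.pagecache.size=48g
```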
Also try my query suggestion.
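
For the constraint I mentioned in my earlier mail (point 3, quoted below),
the switch could look like the following sketch. It assumes oid is unique
across all FOLDER nodes, which I haven't verified for your data; if there
are duplicates, creating the constraint will fail. Run the two statements
separately.

```cypher
// Assumes the existing index is a plain index on :FOLDER(oid).
// It has to be dropped first, since the constraint brings its own index.
DROP INDEX ON :FOLDER(oid);

// Uniqueness constraint (3.x syntax); only valid if no two FOLDER nodes share an oid.
CREATE CONSTRAINT ON (f:FOLDER) ASSERT f.oid IS UNIQUE;
```

With 110M FOLDER nodes, building the backing index will likely take a while,
and lookups by oid have no index in the meantime.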

Cheers, Michael


> In my use case, as Solr only contains a subset of the FOLDER nodes (about
> 100000 nodes), I was thinking of executing a query that selects these
> 100000 nodes at start, to warm up the cache and to be sure that the
> page cache contains (at least) these nodes. Will they be evicted from the
> page cache after a certain amount of time?
>
> Which properties of the nodes do you need to be returned? the full nodes?
>
> Yes, the full nodes have to be returned. They contain 1 oid (String), 1
> property 'name' (String), 4 boolean properties used as flags for business
> tasks and 2 long properties (creation and modification date)
>
> Thank you,
> Vincent
>
> On Tuesday, January 30, 2018 at 3:04:50 AM UTC+1, Michael Hunger wrote:
>>
>> Hi,
>> this query should be better:
>>
>> match(node : FOLDER) where node.oid IN {uuidList} return node
>>
>> Your system is definitely underpowered for this graph size.
>> How much memory does the machine have?
>>
>> 0. Switch to Neo4j Enterprise 3.3.2 which is more memory efficient
>> 1. *use an SSD*
>> 2. use more memory
>> 3. use a constraint instead of an index
>>
>> Otherwise you are effectively measuring disk speed.
>>
>> The problem is that the nodes might be distributed across the disk, so it
>> might have to load up to 200 pages, with the HDD having to seek to each of
>> the blocks.
>>
>> Which properties of the nodes do you need to be returned? the full nodes?
>>
>>
>> On Mon, Jan 29, 2018 at 5:11 PM, Vincent Mooser <vincent...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> I am currently facing some performance problems when loading nodes using
>>> an indexed UUID. My use case is the following:
>>>
>>> - I initiate a search query in Apache Solr which returns a list of up to
>>> 200 UUIDs
>>> - I load the 200 nodes corresponding to the UUIDs with the following
>>> Cypher query:
>>>
>>> unwind {uuidList} as uuid
>>> match(node : FOLDER { oid : uuid}) return node
>>>
>>> The uuidList is a query parameter containing the list of UUIDs (strings).
>>>
>>> When the query has no page faults, it takes about 10-20ms to load the 200
>>> nodes. But when page faults appear in the query log, the query
>>> can take up to 4 seconds. I understand that some nodes have to be loaded
>>> directly from the disk, but for 200 nodes, it looks very slow to me.
>>>
>>> The FOLDER nodes are organized like folders in a filesystem and are
>>> attached together with a 'PARENT' relationship. The only folder that does
>>> not have any parent is the root folder.
>>>
>>> Environment specs are:
>>> - 300M nodes
>>> - 600M relationships
>>> - 110M nodes with the label 'FOLDER'
>>> - all FOLDER nodes have a property 'oid' whose index is online
>>> - the graph.db directory is about 125g (without transaction logs)
>>> - neo4j enterprise 3.2.6 and java driver 1.4.4
>>> - 8g of Heap
>>> - 32g of page cache
>>> - no SSD
>>>
>>> Any hints for improving performance?
>>>
>>> Thank you
>>> Vincent
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to neo4j+un...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
