Re: off heap indexes - setSqlOnheapRowCacheSize - how does it improve efficiency ?

Alexei Scherbakov Thu, 26 May 2016 09:32:47 -0700

WriteBehind is for writing to the CacheStore in async mode.

We are talking about count(*) query performance, right?
If yes, do the following:

1) Set OFFHEAP_TIERED mode and reduce the max heap size, for example to 4 GB.
2) Update to Ignite 1.6.
3) Measure query performance. Run the query several times and use the average
value as the estimate.
4) If it's not as expected, show me the GC logs.
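
A minimal sketch of step 3, assuming the query is wrapped in a Callable: it runs the task a few times for warm-up, averages the timed runs, and reports how much GC time accumulated meanwhile. The class name QueryBenchmark, the helper names, and the placeholder workload are hypothetical and not from this thread. Note that the GC figure covers only the JVM running the test (the client); pauses on the server node still have to be read from the server's GC log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Callable;

public class QueryBenchmark {
    /** Accumulated GC time in ms, summed over all collectors of this JVM. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the task 'warmup' times untimed, then 'runs' times timed; returns the average in ms. */
    public static double averageMillis(Callable<?> task, int warmup, int runs) throws Exception {
        for (int i = 0; i < warmup; i++) {
            task.call();
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload (assumption): replace with the real JDBC query
        // plus the full result set traversal.
        Callable<Long> query = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        };

        long gcBefore = totalGcMillis();
        double avg = averageMillis(query, 2, 5);
        long gcDuring = totalGcMillis() - gcBefore;
        System.out.printf("average: %.1f ms, GC time during test: %d ms%n", avg, gcDuring);
    }
}
```

Timing the same Callable against both the Ignite and the PostgreSQL JDBC URL keeps the comparison like for like, and the warm-up runs keep JIT compilation and connection setup out of the average.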



2016-05-26 18:28 GMT+03:00 Tomek W <[email protected]>:

> No, I am using ONHEAP_TIERED.
>
> Maybe WriteBehind should be turned on?
> My app does exactly one thing: it initiates hot loading.
>
> As for the JDBC client, I showed a fragment of the code in a previous post.
>
> 2016-05-26 16:15 GMT+02:00 Alexei Scherbakov <[email protected]
> >:
>
>> I see long pauses in your GC log (> 3 seconds).
>> This means your app puts high pressure on the heap.
>> It's hard to tell why without knowing what your app is doing.
>>
>> Are you using OFFHEAP_TIERED?
>> If yes, try reducing the sqlOnheapRowCacheSize value.
>>
>>
>>
>>
>> 2016-05-26 14:57 GMT+03:00 Tomek W <[email protected]>:
>>
>>> OK,
>>> I am going to add new machines to the Ignite cluster. First, please look
>>> at my GC log file in the previous message.
>>>
>>> 2016-05-26 13:39 GMT+02:00 Alexei Scherbakov <
>>> [email protected]>:
>>>
>>>> Hi,
>>>>
>>>> The initial question was about setSqlOnheapRowCacheSize, and I think
>>>> it is now clear how to improve SQL performance using this parameter.
>>>>
>>>> If you are dissatisfied with Ignite's performance, I suggest you start
>>>> a new thread on this,
>>>> providing detailed info about your performance test, such as
>>>> cluster configuration, server GC settings, and test sources.
>>>>
>>>> As already mentioned, the Ignite SQL engine (H2) has the same or slightly
>>>> lower performance than PostgreSQL.
>>>> Ignite really starts to shine when used as a distributed data grid holding
>>>> a large amount of data in memory on several nodes.
>>>>
>>>> SELECT count(*) FROM table is not a very good test query.
>>>> Postgres may have the result cached, whereas Ignite always does a full
>>>> table traversal.
>>>> I recently implemented an improvement for this case.
>>>> See https://issues.apache.org/jira/browse/IGNITE-2751 for details.
>>>>
>>>> I strongly recommend testing Ignite performance on a real use case.
>>>> Don't forget to configure GC properly [1].
>>>>
>>>> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2016-05-26 2:09 GMT+03:00 Tomek W <[email protected]>:
>>>>
>>>>> | Also it would be interesting to see result of
>>>>> | SELECT count(*) from the query above in both cases.
>>>>> (number of rows = 2 798 685)
>>>>> SELECT count(*) FROM postgresTable;
>>>>>  456 ms
>>>>> SELECT count(*) FROM postgresTable;
>>>>> 314 ms
>>>>>
>>>>> SELECT count(*) FROM igniteTable;
>>>>> 9746 ms
>>>>> SELECT count(*) FROM igniteTable;
>>>>> 9664 ms
>>>>>
>>>>>
>>>>> Code of the JDBC client (the same code for Ignite and PostgreSQL; the
>>>>> connection URL is given on the command line):
>>>>> http://pastebin.com/mYDSjziN
>>>>> My start sh file:
>>>>> http://pastebin.com/VmRM2sPQ
>>>>>
>>>>> My GC log file (following Magda's hint),
>>>>> generated during hot loading and querying via JDBC:
>>>>> http://pastebin.com/XicnNczV
>>>>>
>>>>>
>>>>> If you would like to see something else, let me know.
>>>>>
>>>>> PS: How do I launch the H2 debug console? I followed the docs, but it
>>>>> doesn't help.
>>>>> I set the environment variable:
>>>>> echo $IGNITE_H2_DEBUG_CONSOLE
>>>>> true
>>>>> then ran ./ignite.sh conf.xml
>>>>>
>>>>> sudo netstat -tulpn | grep 61214
>>>>> No open ports.
>>>>>
>>>>> BTW, when starting, Ignite gives me this information:
>>>>> [01:03:02]  Performance suggestions for grid 'turbines_table_cluster' (fix if possible)
>>>>> [01:03:02] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
>>>>> [01:03:02]   ^-- Disable grid events (remove 'includeEventTypes' from configuration)
>>>>> [01:03:02]   ^-- Enable ATOMIC mode if not using transactions (set 'atomicityMode' to ATOMIC)
>>>>> [01:03:02]   ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)
>>>>>
>>>>>
>>>>> 2016-05-25 12:23 GMT+02:00 Alexei Scherbakov <
>>>>> [email protected]>:
>>>>>
>>>>>> For postgres test I mean initial jdbc query and result set traversal.
>>>>>> For Ignite I mean sql query and iterator traversal.
>>>>>> Also it would be interesting to see result of
>>>>>> *SELECT count(*) from the query above in both cases.*
>>>>>>
>>>>>> 2016-05-25 12:00 GMT+03:00 Tomek W <[email protected]>:
>>>>>>
>>>>>>> [image: Inline image 1]
>>>>>>>
>>>>>>> What code do you mean ? JDBC client ?
>>>>>>>
>>>>>>> 2016-05-25 10:25 GMT+02:00 Alexei Scherbakov <
>>>>>>> [email protected]>:
>>>>>>>
>>>>>>>> What's the batch size for postgresql ?
>>>>>>>> What's the size of one entry ?
>>>>>>>> Could you provide the test code for both postgres and Ignite (just
>>>>>>>> the query + read with the time estimation) ?
>>>>>>>>
>>>>>>>> 2016-05-25 11:13 GMT+03:00 Tomek W <[email protected]>:
>>>>>>>>
>>>>>>>>> | How many entries are downloaded to the client in both cases?
>>>>>>>>> 3000 000
>>>>>>>>>
>>>>>>>>> | Do both queries involve network I/O?
>>>>>>>>> No, I have only one local server (for testing purposes).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2016-05-25 9:59 GMT+02:00 Alexei Scherbakov <
>>>>>>>>> [email protected]>:
>>>>>>>>>
>>>>>>>>>> SELECT * is not really a good test query.
>>>>>>>>>> Its result can be affected by more than just engine performance.
>>>>>>>>>>
>>>>>>>>>> How many entries are downloaded to the client in both cases?
>>>>>>>>>> Do both queries involve network I/O?
>>>>>>>>>>
>>>>>>>>>> 2016-05-25 7:58 GMT+03:00 Denis Magda <[email protected]>:
>>>>>>>>>>
>>>>>>>>>>> In general, Ignite is designed for distributed environments where
>>>>>>>>>>> gigabytes or terabytes of data are spread across many cluster nodes.
>>>>>>>>>>> SQL queries executed across the cluster should complete faster because
>>>>>>>>>>> the resources of all the machines are used. In your scenario you have
>>>>>>>>>>> only a single cluster node, so you are in fact comparing the performance
>>>>>>>>>>> of PostgreSQL and H2 (the engine used by Ignite SQL). I can accept that
>>>>>>>>>>> Ignite SQL may be slightly slower there, but this is not the intended
>>>>>>>>>>> Ignite usage scenario.
>>>>>>>>>>>
>>>>>>>>>>> However, if you create a cluster of several nodes running
>>>>>>>>>>> on different physical machines, pre-load gigabytes of data there, and
>>>>>>>>>>> compare Ignite SQL and PostgreSQL, you should see performance
>>>>>>>>>>> improvements on the Ignite side.
>>>>>>>>>>>
>>>>>>>>>>> In any case, taking the advice above into account, do the
>>>>>>>>>>> following:
>>>>>>>>>>> - execute an “EXPLAIN” query to check that the index is chosen
>>>>>>>>>>> properly [1];
>>>>>>>>>>> - use the H2 console to see how fast the query actually
>>>>>>>>>>> executes on a single node, bypassing several Ignite layers [2];
>>>>>>>>>>> - check whether you have any GC pauses during query execution, since
>>>>>>>>>>> they can affect execution time [3].
>>>>>>>>>>>
>>>>>>>>>>> Also share the objects you use as keys and values.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://apacheignite.readme.io/docs/sql-queries#using-explain
>>>>>>>>>>> [2]
>>>>>>>>>>> https://apacheignite.readme.io/docs/sql-queries#using-h2-debug-console
>>>>>>>>>>> [3]
>>>>>>>>>>> https://apacheignite.readme.io/v1.6/docs/jvm-and-system-tuning#section-detailed-garbage-collection-stats
>>>>>>>>>>>
>>>>>>>>>>> —
>>>>>>>>>>> Denis
>>>>>>>>>>>
>>>>>>>>>>> On May 25, 2016, at 3:23 AM, Tomek W <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> +==============================================================================================+
>>>>>>>>>>> |     Node ID8(@), IP      | CPUs | Heap Used | CPU Load |   Up Time   |  Size   | Hi/Mi/Rd/Wr |
>>>>>>>>>>> +==============================================================================================+
>>>>>>>>>>> | 0F0AAF99(@n0), 127.0.0.1 | 8    | 54.50 %   | 3.23 %   | 00:13:13:49 | 3000000 | Hi: 0       |
>>>>>>>>>>> |                          |      |           |          |             |         | Mi: 0       |
>>>>>>>>>>> |                          |      |           |          |             |         | Rd: 0       |
>>>>>>>>>>> |                          |      |           |          |             |         | Wr: 0       |
>>>>>>>>>>> +----------------------------------------------------------------------------------------------+
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I followed your hints. The client actually no longer requires as
>>>>>>>>>>> much memory as before. Thanks for that.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> As for the server configuration, I also followed your
>>>>>>>>>>> hints. Results:
>>>>>>>>>>>
>>>>>>>>>>> Querying is done by a JDBC client. In Ignite and PostgreSQL I
>>>>>>>>>>> have a single index on column A.
>>>>>>>>>>>
>>>>>>>>>>> Ignite: SELECT * FROM table WHERE A > 1345 takes 6s.
>>>>>>>>>>> Postgres: SELECT * FROM table WHERE A > 1345 takes 4s.
>>>>>>>>>>>
>>>>>>>>>>> As you can see, Postgres is still better than Ignite. Here are
>>>>>>>>>>> the significant fragments of my configuration:
>>>>>>>>>>> http://pastebin.com/EQC4JPWR
>>>>>>>>>>>
>>>>>>>>>>> And the XML file for the server:
>>>>>>>>>>> http://pastebin.com/enR9h5J4
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Please help me work out why PostgreSQL is still better.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Alexei Scherbakov
>>>>>>>>>>


-- 

Best regards,
Alexei Scherbakov
