Re: off heap indexes - setSqlOnheapRowCacheSize - how does it improve efficiency ?

Denis Magda Thu, 26 May 2016 13:08:41 -0700

Here is a link to a rough estimation guide:
https://apacheignite.readme.io/docs/capacity-planning
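
A rough version of such an estimate can be sketched in Java. This is only a sketch: the formula (entries times entry size, times one plus the number of backups, plus index overhead, plus a fixed amount per cache and per node) is a simplified reading of the capacity-planning page, and the 200-byte entry size, 30% index overhead, 20 MB per cache and 300 MB per node are placeholder assumptions, not measured values. Only the row count (2 798 685) comes from this thread. `CapacityEstimate` and `estimateBytes` are hypothetical names.

```java
/** Back-of-the-envelope memory estimate; every constant here is an illustrative assumption. */
final class CapacityEstimate {
    private static final long MB = 1024L * 1024L;

    /**
     * @param entries       total number of cache entries
     * @param entryBytes    assumed size of one entry in bytes (measure your own)
     * @param backups       number of backup copies per entry
     * @param indexOverhead assumed extra fraction for indexes, e.g. 0.30 for 30%
     * @param caches        number of caches (assumed 20 MB fixed cost each)
     * @param nodes         number of nodes (assumed 300 MB fixed cost each)
     */
    static long estimateBytes(long entries, long entryBytes, int backups,
                              double indexOverhead, int caches, int nodes) {
        long primary = entries * entryBytes;                 // primary copies only
        long withBackups = primary * (1L + backups);         // add backup copies
        long withIndexes = Math.round(withBackups * (1.0 + indexOverhead));
        long perCache = 20L * MB * caches;                   // assumed fixed cost per cache
        long perNode = 300L * MB * nodes;                    // assumed fixed cost per node
        return withIndexes + perCache + perNode;
    }

    public static void main(String[] args) {
        // Row count from this thread; the 200-byte entry size is a placeholder.
        long bytes = estimateBytes(2_798_685L, 200L, 0, 0.30, 1, 1);
        System.out.printf("Estimated memory: %.1f MB%n", bytes / (double) MB);
    }
}
```

Compare the result with the memory actually available to the node (heap for ONHEAP_TIERED, off-heap for OFFHEAP_TIERED), and measure a real entry size rather than relying on the placeholder.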


—
Denis

> On May 26, 2016, at 8:09 PM, Tomek W <[email protected]> wrote:
> 
> | Make sure you have enough memory for your dataset.
> How can I check that?
> 
> 2016-05-26 18:46 GMT+02:00 Alexei Scherbakov <[email protected]>:
> You should measure performance on real-life cases and see if it's enough 
> for you.
> Ignite performs well in both modes.
> If you really want to use ONHEAP_TIERED, you must tune GC and heap size, as 
> described here [1].
> Make sure you have enough memory for your dataset.
> The goal is to avoid long GC pauses.
> 
> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning 
> 
> 2016-05-26 19:40 GMT+03:00 Tomek W <[email protected]>:
> Ok, I will try it. However, why OFFHEAP_TIERED? It does not seem to be as 
> fast as on-heap.
> 
> 2016-05-26 18:32 GMT+02:00 Alexei Scherbakov <[email protected]>:
> We are talking about count(*) query performance, right?
> WriteBehind is for writing to the CacheStore in async mode.
> 
> If yes, do the following:
> 
> 1) Set OFFHEAP_TIERED mode and reduce the max heap size, for example to 4 GB.
> 2) Update to Ignite 1.6
> 3) Measure query performance. Run the query several times and use average 
> value as the estimation.
> 4) If it's not as expected, show me GC logs.
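
Step 3 can be sketched with plain Java, without any Ignite API. `QueryTimer` and `averageMillis` are hypothetical names, and the warm-up runs are an addition that the steps above do not mention. The stand-in workload is a placeholder for the real JDBC query plus result set traversal.

```java
/** Runs a piece of work several times and reports the average wall-clock time. */
final class QueryTimer {
    static double averageMillis(Runnable query, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        // Warm-up runs are executed but not timed (an assumption, see lead-in).
        for (int i = 0; i < warmupRuns; i++) {
            query.run();
        }
        long totalNanos = 0L;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            query.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the real query and result traversal.
        Runnable standIn = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };
        double avg = averageMillis(standIn, 2, 5);
        System.out.printf("Average over 5 runs: %.3f ms%n", avg);
    }
}
```

In a real test the `Runnable` would wrap a `Statement.executeQuery(...)` call and a loop over the `ResultSet`, so that both the query and the read are inside the timed section.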
> 
> 
> 
> 2016-05-26 18:28 GMT+03:00 Tomek W <[email protected]>:
> No, I am using ONHEAP_TIERED.
> 
> Maybe WriteBehind should be turned on?
> My app does exactly one thing: it initializes hot loading.
> 
> As for the JDBC client, I showed a fragment of the code in my previous post.
> 
> 2016-05-26 16:15 GMT+02:00 Alexei Scherbakov <[email protected]>:
> I see long pauses in your GC log (> 3 seconds).
> This means your app puts high pressure on the heap.
> It's hard to tell why without knowing what your app is doing.
> 
> Are you using OFFHEAP_TIERED?
> If yes, try to reduce the sqlOnheapRowCacheSize value.
> 
> 
> 
> 
> 2016-05-26 14:57 GMT+03:00 Tomek W <[email protected]>:
> Ok,
> I am going to add new machines to the Ignite cluster. First, please look at 
> my GC log file in the previous message.
> 
> 2016-05-26 13:39 GMT+02:00 Alexei Scherbakov <[email protected]>:
> Hi,
> 
> The initial question was about setSqlOnheapRowCacheSize, and I think
> it is now clear how to improve SQL performance using this parameter.
> 
> If you are dissatisfied with Ignite performance, I suggest you start a new 
> thread on this,
> providing detailed info about your performance test, such as
> cluster configuration, server GC settings, and test sources.
> 
> As already mentioned, the Ignite SQL engine (H2) has the same or slightly 
> lower performance than PostgreSQL.
> Ignite really starts to shine when used as a distributed data grid holding 
> a large amount of data in memory on several nodes.
> 
> SELECT count(*) from table is not a very good test query.
> Postgres may have the result cached, whereas Ignite always does a full table 
> traversal.
> Recently I implemented an improvement for this case.
> See https://issues.apache.org/jira/browse/IGNITE-2751 for details.
> 
> I strongly recommend testing Ignite performance on a real case.
> Don't forget to configure GC properly [1].
> 
> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
> 
> 
> 
> 
> 
> 
> 2016-05-26 2:09 GMT+03:00 Tomek W <[email protected]>:
> | Also it would be interesting to see result of 
> | SELECT count(*) from the query above in both cases.
> (number of rows = 2 798 685)
> SELECT count(*) FROM postgresTable;
>  456 ms
> SELECT count(*) FROM postgresTable;
> 314 ms
> 
> SELECT count(*) FROM igniteTable;
> 9746 ms
> SELECT count(*) FROM igniteTable;
> 9664 ms
> 
> 
> Code of the JDBC client (the same code for Ignite and PostgreSQL; the 
> connection URL is given on the command line):
> http://pastebin.com/mYDSjziN
> My start sh file:
> http://pastebin.com/VmRM2sPQ
> 
> My GC log file (following Magda's hint):
> (file generated during hot loading and querying via JDBC).
> http://pastebin.com/XicnNczV
> 
> 
> If you would like to see something else, let me know.
> 
> PS How do I launch the H2 debug console? I followed the docs, but it didn't 
> help.
> I set the environment variable:
> echo $IGNITE_H2_DEBUG_CONSOLE
> true
> Then: ./ignite.sh conf.xml
> 
> sudo netstat -tulpn | grep 61214
> No open ports.
> 
> BTW, on startup Ignite gives me this information:
> [01:03:02]  Performance suggestions for grid 'turbines_table_cluster' (fix if possible)
> [01:03:02] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
> [01:03:02]   ^-- Disable grid events (remove 'includeEventTypes' from configuration)
> [01:03:02]   ^-- Enable ATOMIC mode if not using transactions (set 'atomicityMode' to ATOMIC)
> [01:03:02]   ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)
> 
> 
> 2016-05-25 12:23 GMT+02:00 Alexei Scherbakov <[email protected]>:
> For the Postgres test I mean the initial JDBC query and result set traversal.
> For Ignite I mean the SQL query and iterator traversal.
> Also it would be interesting to see result of 
> SELECT count(*) from the query above in both cases.
> 
> 2016-05-25 12:00 GMT+03:00 Tomek W <[email protected]>:
> <image.png>
> 
> What code do you mean? The JDBC client?
> 
> 2016-05-25 10:25 GMT+02:00 Alexei Scherbakov <[email protected]>:
> What's the batch size for PostgreSQL?
> What's the size of one entry?
> Could you provide the test code for both Postgres and Ignite (just the query 
> + read with the time estimation)?
> 
> 2016-05-25 11:13 GMT+03:00 Tomek W <[email protected]>:
> | How many entries are downloaded to the client in both cases?
> 3 000 000
> 
> | Do both queries involve network I/O?
> No, I have only one local server (for testing purposes).
> 
> 
> 2016-05-25 9:59 GMT+02:00 Alexei Scherbakov <[email protected]>:
> SELECT * is not really a good test query.
> Its result can be affected by more than engine performance.
> 
> How many entries are downloaded to the client in both cases?
> Do both queries involve network I/O?
> 
> 2016-05-25 7:58 GMT+03:00 Denis Magda <[email protected]>:
> In general, Ignite is designed to be used in a distributed environment where 
> gigabytes or terabytes of data are spread across many cluster nodes. SQL 
> queries executed across the cluster should complete faster there, since the 
> resources of all the machines are used. In your scenario you have only a 
> single cluster node, so you are in fact comparing the performance of 
> PostgreSQL and H2 (the engine used by Ignite SQL). I would expect Ignite SQL 
> to be slightly slower in that setup, but this is not a typical Ignite usage 
> scenario.
> 
> However, if you create a cluster of several nodes running on different 
> physical machines, pre-load gigabytes of data there, and compare Ignite SQL 
> and PostgreSQL, you should see performance improvements on the Ignite side.
> 
> In any case, taking the advice above into account, do the following:
> - execute an “EXPLAIN” query to check that the index is chosen properly [1];
> - use the H2 console to see how fast the query currently executes on 
> a single node, bypassing several Ignite layers [2];
> - check whether you have any GC pauses during query execution, since they 
> can affect execution time [3].
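
For the last point, one rough way to see how much time a JVM spent in garbage collection while a piece of work ran is to sample the standard `GarbageCollectorMXBean` counters before and after it. This is a sketch under assumptions: `GcTimeProbe` and its methods are hypothetical names, the counters report accumulated collection time rather than individual pause lengths, and they only cover the JVM the code runs in, so for a server node the GC logs described in [3] remain the reference.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Samples accumulated GC time of the current JVM around a piece of work. */
final class GcTimeProbe {
    /** Sum of accumulated collection time (ms) over all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0L;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time (ms) accumulated by this JVM while the given work was running. */
    static long gcMillisDuring(Runnable work) {
        long before = totalGcMillis();
        work.run();
        return totalGcMillis() - before;
    }

    public static void main(String[] args) {
        long gcMs = gcMillisDuring(() -> {
            // Stand-in workload that allocates short-lived objects.
            for (int i = 0; i < 200_000; i++) {
                byte[] chunk = new byte[1024];
                chunk[0] = (byte) i;
            }
        });
        System.out.println("GC time during workload: " + gcMs + " ms");
    }
}
```

A large value relative to the query time suggests that GC is a significant part of the measured latency. Other threads in the same JVM also contribute to the counters, so the number is an upper bound for the work itself.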
> 
> Also share the objects you use as keys and values.
> 
> [1] https://apacheignite.readme.io/docs/sql-queries#using-explain
> [2] https://apacheignite.readme.io/docs/sql-queries#using-h2-debug-console
> [3] https://apacheignite.readme.io/v1.6/docs/jvm-and-system-tuning#section-detailed-garbage-collection-stats
> 
> —
> Denis
> 
>> On May 25, 2016, at 3:23 AM, Tomek W <[email protected]> wrote:
>> 
>> +==============================================================================================+
>> |     Node ID8(@), IP      | CPUs | Heap Used | CPU Load |   Up Time   |  Size   | Hi/Mi/Rd/Wr |
>> +==============================================================================================+
>> | 0F0AAF99(@n0), 127.0.0.1 | 8    | 54.50 %   | 3.23 %   | 00:13:13:49 | 3000000 | Hi: 0       |
>> |                          |      |           |          |             |         | Mi: 0       |
>> |                          |      |           |          |             |         | Rd: 0       |
>> |                          |      |           |          |             |         | Wr: 0       |
>> +----------------------------------------------------------------------------------------------+
>> 
>> 
>> I followed your hints. The client no longer requires as much memory as 
>> before, thanks for that.
>> 
>> 
>> As for the server configuration, I also followed your hints. The 
>> results:
>> 
>> Querying is done by a JDBC client. In both Ignite and PostgreSQL I have a 
>> single index on column A.
>> 
>> Ignite: SELECT * FROM table WHERE A > 1345 takes 6s.
>> Postgres: SELECT * FROM table WHERE A > 1345 takes 4s.
>> 
>> As you can see, Postgres is still faster than Ignite. Here are the 
>> significant fragments of my configuration:
>> http://pastebin.com/EQC4JPWR
>> 
>> And the XML file for the server:
>> http://pastebin.com/enR9h5J4
>> 
>> 
>> Please help me understand why PostgreSQL is still faster.
>> 
> 
> 
> 
> 
> -- 
> 
> Best regards,
> Alexei Scherbakov
> 
> 
> 
> 
>
