Re: off heap indexes - setSqlOnheapRowCacheSize - how does it improve efficiency ?

Tomek W Thu, 26 May 2016 04:58:25 -0700

OK,
I am going to add new machines to the Ignite cluster. First, please look at
my GC log file from my previous message.


2016-05-26 13:39 GMT+02:00 Alexei Scherbakov <[email protected]>:

> Hi,
>
> The initial question was about setSqlOnheapRowCacheSize, and I think
> it is now clear how to improve SQL performance using this parameter.
>
> If you are dissatisfied with Ignite performance, I suggest you start a
> new thread on this,
> providing detailed info about your performance test, such as
> cluster configuration, server GC settings, and test sources.
>
> As already mentioned, the Ignite SQL engine (H2) has the same or slightly
> lower performance than PostgreSQL.
> Ignite really starts to shine when used as a distributed data grid holding
> a large amount of data in memory on several nodes.
>
> SELECT count(*) from table is not a very good test query.
> Postgres may have the result cached, whereas Ignite always does a full
> table traversal.
> Recently I implemented an improvement for this case.
> See https://issues.apache.org/jira/browse/IGNITE-2751 for details.
>
> I strongly recommend testing Ignite performance on a real use case.
> Don't forget to configure GC properly [1]
>
> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
>
>
>
>
>
>
> 2016-05-26 2:09 GMT+03:00 Tomek W <[email protected]>:
>
>> | Also it would be interesting to see result of
>> | SELECT count(*) from the query above in both cases.
>> (number of rows = 2 798 685)
>> SELECT count(*) FROM postgresTable;
>>  456 ms
>> SELECT count(*) FROM postgresTable;
>> 314 ms
>>
>> SELECT count(*) FROM igniteTable;
>> 9746 ms
>> SELECT count(*) FROM igniteTable;
>> 9664 ms
>>
>>
>> Code using the JDBC driver (the same code for Ignite and PostgreSQL; the
>> connection URL is given on the command line):
>> http://pastebin.com/mYDSjziN
>> My start sh file:
>> http://pastebin.com/VmRM2sPQ
>>
>> My GC log file (following Denis Magda's hint),
>> generated during hot loading and querying via JDBC:
>> http://pastebin.com/XicnNczV
>>
>>
>> If you would like to see something else let me know.
>>
>> PS: How do I launch the H2 debug console? I followed the docs, but it
>> doesn't work.
>> I set the environment variable:
>> echo $IGNITE_H2_DEBUG_CONSOLE
>> true
>> then started the node with ./ignite.sh conf.xml
>>
>> sudo netstat -tulpn | grep 61214
>> No open ports.
>>
>> BTW, when Ignite starts it prints the following:
>> [01:03:02]  Performance suggestions for grid 'turbines_table_cluster'
>> (fix if possible)
>> [01:03:02] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
>> [01:03:02]   ^-- Disable grid events (remove 'includeEventTypes' from
>> configuration)
>> [01:03:02]   ^-- Enable ATOMIC mode if not using transactions (set
>> 'atomicityMode' to ATOMIC)
>> [01:03:02]   ^-- Enable write-behind to persistent store (set
>> 'writeBehindEnabled' to true)
>>
>>
>> 2016-05-25 12:23 GMT+02:00 Alexei Scherbakov <
>> [email protected]>:
>>
>>> For postgres test I mean initial jdbc query and result set traversal.
>>> For Ignite I mean sql query and iterator traversal.
>>> Also it would be interesting to see result of
>>> *SELECT count(*) from the query above in both cases.*
>>>
>>> 2016-05-25 12:00 GMT+03:00 Tomek W <[email protected]>:
>>>
>>>> [image: Inline image 1]
>>>>
>>>> What code do you mean? The JDBC client?
>>>>
>>>> 2016-05-25 10:25 GMT+02:00 Alexei Scherbakov <
>>>> [email protected]>:
>>>>
>>>>> What's the batch size for PostgreSQL?
>>>>> What's the size of one entry?
>>>>> Could you provide the test code for both Postgres and Ignite (just the
>>>>> query + read with the time measurement)?
>>>>>
>>>>> 2016-05-25 11:13 GMT+03:00 Tomek W <[email protected]>:
>>>>>
>>>>>> | How many entries are downloaded to the client in both cases?
>>>>>> 3 000 000
>>>>>>
>>>>>> | Do both queries involve network I/O?
>>>>>> No, I have only one local server (for testing purposes).
>>>>>>
>>>>>>
>>>>>> 2016-05-25 9:59 GMT+02:00 Alexei Scherbakov <
>>>>>> [email protected]>:
>>>>>>
>>>>>>> SELECT * is not really a good test query.
>>>>>>> Its result can be affected by more than engine performance alone.
>>>>>>>
>>>>>>> How many entries are downloaded to the client in both cases?
>>>>>>> Do both queries involve network I/O?
>>>>>>>
>>>>>>> 2016-05-25 7:58 GMT+03:00 Denis Magda <[email protected]>:
>>>>>>>
>>>>>>>> In general, Ignite is designed to be used in a distributed
>>>>>>>> environment where gigabytes or terabytes of data are spread across
>>>>>>>> many cluster nodes. SQL queries executed across the cluster should
>>>>>>>> complete faster because the resources of all the machines are used.
>>>>>>>> In your scenario you have only a single cluster node, so you are in
>>>>>>>> fact comparing the performance of PostgreSQL and H2 (the engine used
>>>>>>>> by Ignite SQL). I would expect Ignite SQL to be slightly slower
>>>>>>>> there, but this is not the intended Ignite usage scenario.
>>>>>>>>
>>>>>>>> However, if you create a cluster of several nodes running on
>>>>>>>> different physical machines, pre-load gigabytes of data there, and
>>>>>>>> compare Ignite SQL and PostgreSQL, you should see performance
>>>>>>>> improvements on the Ignite side.
>>>>>>>>
>>>>>>>> In any case, taking the advice above into account, do the following:
>>>>>>>> - execute an “EXPLAIN” query to check that the index is chosen
>>>>>>>> properly [1];
>>>>>>>> - use the H2 console to see how fast a query executes on a single
>>>>>>>> node, bypassing several Ignite layers [2];
>>>>>>>> - check whether you have any GC pauses during query execution, since
>>>>>>>> they can affect execution time [3]
>>>>>>>>
>>>>>>>> Also share the objects you use as keys and values.
>>>>>>>>
>>>>>>>> [1] https://apacheignite.readme.io/docs/sql-queries#using-explain
>>>>>>>> [2]
>>>>>>>> https://apacheignite.readme.io/docs/sql-queries#using-h2-debug-console
>>>>>>>> [3]
>>>>>>>> https://apacheignite.readme.io/v1.6/docs/jvm-and-system-tuning#section-detailed-garbage-collection-stats
>>>>>>>>
>>>>>>>> —
>>>>>>>> Denis
>>>>>>>>
>>>>>>>> On May 25, 2016, at 3:23 AM, Tomek W <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> +==============================================================================================+
>>>>>>>> |     Node ID8(@), IP      | CPUs | Heap Used | CPU Load |   Up Time   |  Size   | Hi/Mi/Rd/Wr |
>>>>>>>> +==============================================================================================+
>>>>>>>> | 0F0AAF99(@n0), 127.0.0.1 | 8    | 54.50 %   | 3.23 %   | 00:13:13:49 | 3000000 | Hi: 0       |
>>>>>>>> |                          |      |           |          |             |         | Mi: 0       |
>>>>>>>> |                          |      |           |          |             |         | Rd: 0       |
>>>>>>>> |                          |      |           |          |             |         | Wr: 0       |
>>>>>>>> +----------------------------------------------------------------------------------------------+
>>>>>>>>
>>>>>>>>
>>>>>>>> I followed your hints. The client no longer requires as much
>>>>>>>> memory as before - thanks for that.
>>>>>>>>
>>>>>>>>
>>>>>>>> As for the server configuration, I also followed your
>>>>>>>> hints. Results:
>>>>>>>>
>>>>>>>> Querying is done by a JDBC client. In both Ignite and PostgreSQL I
>>>>>>>> have a single index on column A.
>>>>>>>>
>>>>>>>> Ignite: SELECT * FROM table WHERE A > 1345 takes 6s.
>>>>>>>> Postgres: SELECT * FROM table WHERE A > 1345 takes 4s.
>>>>>>>>
>>>>>>>> As you can see, Postgres is still faster than Ignite. Here are the
>>>>>>>> significant fragments of my configuration:
>>>>>>>> http://pastebin.com/EQC4JPWR
>>>>>>>>
>>>>>>>> And the server XML file:
>>>>>>>> http://pastebin.com/enR9h5J4
>>>>>>>>
>>>>>>>>
>>>>>>>> Please help me work out why PostgreSQL is still faster.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Alexei Scherbakov
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Best regards,
>>>>> Alexei Scherbakov
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Best regards,
>>> Alexei Scherbakov
>>>
>>
>>
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>
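
For reference, the measurement requested earlier in this thread (the initial
JDBC query plus a full result set traversal) can be sketched roughly as below.
This is a minimal sketch, not the code from the pastebin links above: the
class name QueryTimer, its methods, and the command-line arguments are
hypothetical, and it assumes the relevant JDBC driver jar is on the classpath
(older drivers may also need an explicit Class.forName call).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical timing harness; class and method names are illustrative only. */
public class QueryTimer {

    /** Runs the query, reads every column of every row, and returns the row count. */
    static long runAndTraverse(Connection conn, String sql) throws SQLException {
        long rows = 0;
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int cols = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                for (int i = 1; i <= cols; i++) {
                    rs.getObject(i); // read each value so fetching is part of the timing
                }
                rows++;
            }
        }
        return rows;
    }

    /** Converts a System.nanoTime() interval to whole milliseconds. */
    static long elapsedMillis(long startNanos, long endNanos) {
        return (endNanos - startNanos) / 1_000_000L;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.out.println("usage: QueryTimer <jdbc-url> <sql> [runs]");
            return;
        }
        int runs = args.length > 2 ? Integer.parseInt(args[2]) : 5;
        try (Connection conn = DriverManager.getConnection(args[0])) {
            // Repeat the query on one connection so warm-up and steady state are both visible.
            for (int run = 1; run <= runs; run++) {
                long start = System.nanoTime();
                long rows = runAndTraverse(conn, args[1]);
                long ms = elapsedMillis(start, System.nanoTime());
                System.out.println("run " + run + ": " + rows + " rows, " + ms + " ms");
            }
        }
    }
}
```

Running it several times against each URL and setting aside the first run
helps separate warm-up effects from steady-state timings.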
