Congratulation Neeraj and team ...it much appreciated On Mon, Oct 24, 2016 at 11:08 AM, Neeraj Gupta <neeraj.h...@gmail.com> wrote:
> Dear Team, Thanks for all your suggestions. > > Now the time of analytic is reduced to 10 hours 41 minutes. > > We tried to VACUUM as Sam suggested but it didn't help then we upgraded > postgres from 9.4 to 9.5.4 and as Calle and Bob suggested we made some > changes in configuration file of postgres and it reduced the time. But the > database size is still same. > > Thanks for all your help! > > Thanks, > Neeraj > > On Wed, Oct 19, 2016 at 6:46 PM, Calle Hedberg <calle.hedb...@gmail.com> > wrote: > >> Neeraj, >> >> It's always an element of uncertainty linked to database sizes - ref >> Sam's post over. So indicating the number of records you have in the >> datavalue table & key meta-data tables would be useful + indicating whether >> you are running other instances on the same server. Some comments - I've >> been doing a lot of similar optimising work recently: >> >> 1. Upgrading to 9.5.4 is strongly recommended (and don't use 9.6 before >> the worst bugs are fixed and it has stabilised). >> >> 2. Carefully check your postgres.conf against the recommended settings. >> The guide is a bit superficial in the sense that it has recommended "fixed" >> values only and no explanations around ranges below or above those, but you >> can experiment a bit yourself (e.g. the recommended "max_connections = 200" >> might not be sufficient for a really large system like what you have. >> >> 3. If your server is running that single instance only, then 48GB or RAM >> should be sufficient. Our servers are all having 128GB RAM so we >> experimented quite a bit earlier this year with giving a DHIS2 instance >> large amounts or RAM (up to 60-70gb), with negligible impact on >> performance. According to Lars, the DHIS2 cannot really utilize more than >> around 16gb RAM (at least that is how I understood his communication at the >> time). So 48GB should be sufficient for a single instance. >> >> 4. I've been doing performance optimizing recently on an instance with >> - 4-core server with 2x 512gb ssd, 12gb allocated to DHIS2 >> - 31,000 Orgunits >> - 420 data elements >> - 250 indicators >> - around 100 mill datavalue records >> - total size around 140gb with analytics tables. >> >> So the size is only 25% of your 500GB, but RUNNING ANALYTICS ON THAT >> DATABASE INSTANCE IS TAKING JUST OVER 1 HOUR. Fundamentally, if the >> analytics engine is designed well, I would expect a nearly linear >> relationship between database size and the time analytics takes to run. So >> running analytics on your database on our server should in theory take 4-5 >> hours. >> >> We are obviously comparing oranges and nectarines here, in the sense that >> there might be other aspects of our server and database that is different >> from yours (type of CPU, no of OUs, no of DEs/Indicators, whether your >> instance have lots of tracker data, etc etc). I have not seen any >> scientific/quantified comparative performance values related to specific >> parameters like number of CPUs and/or number of cores, but 12 cores SHOULD >> improve analytics performance quite a bit - assuming around 30% then it >> means running analytics on your database/server should take around 3 >> hours...... >> >> I tried getting comparative, quantitative data on various configurations >> of hardware and software (e.g. some users prefer CentOS, others Ubuntu) >> during the academy in August, but did not get much - it seems most >> users/providers have found a setup that works for them for now and nobody >> is doing any systematic performance testing (some of the international >> NGOs/companies using DHIS2 might have, but as with internally developed >> apps they are not that keen on sharing). So it would be highly appreciated >> if you would post the results on analytics time with every upgrade / tweak >> you do - starting with the upgrade to Pg 9.5.4 >> >> Best regards >> Calle >> >> On 19 October 2016 at 13:28, Sam Johnson <samuel.john...@qebo.co.uk> >> wrote: >> >>> Hi Neeraj, >>> >>> >>> >>> *Using VACUUM and ANALYZE* >>> >>> >>> >>> Like Brajesh, my background is MySQL, and one database admin task that >>> is often overlooked in MySQL is OPTIMIZE TABLEs. This reclaims unused >>> space (we’ve had 100Gb databases files drop to half their size) and >>> refreshes index statistics (if the shape of your data has changed over >>> time, this can make indices run faster). >>> >>> >>> >>> I’m new to PostgreSQL, but the core principles are the same, and a quick >>> bit of Googling shows that the equivalents in PostgreSQL are the VACUUM and >>> ANALYZE commands. If your database isn’t set to automatically do VACUUMs >>> (the default DHIS2 postgres config doesn’t seem to be), you might want to >>> try VACUUM *FULL*, which will literally rewrite all of your database >>> tables and indices into smaller, more efficient files (note, however, that >>> on a 500Gb database this could take a *looong* time – perhaps test on a >>> backup first?). The following forum post is a really nice, plain-English >>> explanation of what VACUUM does: >>> >>> http://dba.stackexchange.com/questions/126258/what-is-table- >>> bloating-in-databases >>> >>> >>> >>> As I mentioned, my background is MySQL rather than Postgres, so someone >>> with more specific Postgres experience might like to also chime in here. >>> >>> >>> >>> Cheers, Sam. >>> >>> >>> >>> >>> >>> *From: *Dhis2-users <dhis2-users-bounces+samuel.johnson= >>> qebo.co...@lists.launchpad.net> on behalf of Brajesh Murari < >>> brajesh.mur...@gmail.com> >>> *Date: *Wednesday, 19 October 2016 at 08:28 >>> *To: *Knut Staring <knu...@gmail.com> >>> *Cc: *DHIS 2 Users list <dhis2-users@lists.launchpad.net>, DHIS2 >>> Developers <dhis2-d...@lists.launchpad.net> >>> *Subject: *Re: [Dhis2-users] [Dhis2-devs] 25 hours in completing >>> Analytic >>> >>> >>> >>> Dear Neeraj, >>> >>> >>> >>> The physical database size doesn't matter much, even the number of >>> records don't matter. In my experience the biggest problem that one can >>> going to run in to is not size, but the number of queries you can handle at >>> a time instance specially during analytic functionality execution. Most >>> probably you should going to have to move to a master/slave configuration >>> of your database, so that the read queries can run against the slaves and >>> the write queries run against the master. However, if you and your database >>> management team are not ready for this than, you can tweak your indexes for >>> the queries you are running to speed up the response times. Also there is a >>> lot of tweaking you can do to the network stack and kernel in Linux where >>> MySQL Server has been installed that will help.Perhaps, I would focus first >>> on your indexes, then have a server admin look at your OS, and if all that >>> doesn't help it might be time to implement a master/slave configuration. >>> The most important scalability factor is RAM. If the indexes of your tables >>> fit into memory and your queries are highly optimized in analytic >>> functionality, you can serve a reasonable amount of requests with a average >>> machine. The number of records do matter, depending of how your tables look >>> like. It's a difference to have a lot of varchar fields or only a couple of >>> ints or longs. The physical size of the database matters as well, think of >>> backups, for instance. Depending on your engine, your physical db files on >>> grow, but don't shrink, for instance with innodb. So deleting a lot of >>> rows, doesn't help to shrink your physical files. Thus the database size >>> does matter. If you have more than one table with more than a million >>> records, then performance starts indeed to degrade. Indexig is one of the >>> important stand need to take care, If you hit one million records you will >>> get performance problems, if the indices are not set right (for example no >>> indices for fields in "WHERE statements" or "ON conditions" in joins). If >>> you hit 10 million records, you will start to get performance problems even >>> if you have all your indices right. Hardware upgrades - adding more memory >>> and more processor power, especially memory - often help to reduce the most >>> severe problems by increasing the performance again, at least to a certain >>> degree. >>> >>> >>> >>> On Wed, Oct 19, 2016 at 12:35 PM, Knut Staring <knu...@gmail.com> wrote: >>> >>> Just a heads-up that there seems to be a JDBC issue with Postgres 9.6, >>> so perhaps you should try upgrading to 9.5 first. >>> >>> >>> >>> On Wed, Oct 19, 2016 at 8:58 AM, Lars Helge Øverland <l...@dhis2.org> >>> wrote: >>> >>> >>> >>> Hi Neeraj, >>> >>> >>> >>> what usually helps to improve runtime is to improve/increase: >>> >>> >>> >>> - ssd (read and write speed) >>> >>> - number of CPUs >>> >>> - using latest postgresql (9.6 claims to have even better indexing >>> performance >>> <https://www.postgresql.org/docs/9.6/static/release-9-6.html> than 9.5) >>> >>> - tuning >>> <https://dhis2.github.io/dhis2-docs/master/en/implementer/html/install_server_setup.html#install_postgresql_performance_tuning> >>> of postgresql >>> >>> >>> >>> >>> >>> regards, >>> >>> >>> >>> Lars >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Lars Helge Øverland >>> >>> Lead developer, DHIS 2 >>> >>> University of Oslo >>> >>> Skype: larshelgeoverland >>> >>> l...@dhis2.org >>> >>> http://www.dhis2.org <https://www.dhis2.org/> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Mailing list: https://launchpad.net/~dhis2-users >>> Post to : dhis2-users@lists.launchpad.net >>> Unsubscribe : https://launchpad.net/~dhis2-users >>> More help : https://help.launchpad.net/ListHelp >>> >>> >>> >>> >>> >>> -- >>> >>> Knut Staring >>> >>> Dept. of Informatics, University of Oslo >>> >>> Norway: +4791880522 >>> >>> Skype: knutstar >>> >>> http://dhis2.org >>> >>> >>> _______________________________________________ >>> Mailing list: https://launchpad.net/~dhis2-devs >>> Post to : dhis2-d...@lists.launchpad.net >>> Unsubscribe : https://launchpad.net/~dhis2-devs >>> More help : https://help.launchpad.net/ListHelp >>> >>> >>> >>> >>> >>> -- >>> >>> Best Regards, >>> >>> >>> >>> Brajesh Murari, >>> >>> Postgraduate, Department of Computer Science and Engineering, >>> >>> Chaudhary Devi Lal University, Sirsa, >>> >>> India. >>> >>> >>> >>> The three basic dimensions of human development: a long and healthy >>> life, access to knowledge, and a decent standard of living. >>> >>> _______________________________________________ >>> Mailing list: https://launchpad.net/~dhis2-devs >>> Post to : dhis2-d...@lists.launchpad.net >>> Unsubscribe : https://launchpad.net/~dhis2-devs >>> More help : https://help.launchpad.net/ListHelp >>> >>> >> >> >> -- >> >> ******************************************* >> >> Calle Hedberg >> >> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA >> >> Tel/fax (home): +27-21-685-6472 >> >> Cell: +27-82-853-5352 >> >> Iridium SatPhone: +8816-315-19119 >> >> Email: calle.hedb...@gmail.com >> >> Skype: calle_hedberg >> >> ******************************************* >> >> >> _______________________________________________ >> Mailing list: https://launchpad.net/~dhis2-users >> Post to : dhis2-users@lists.launchpad.net >> Unsubscribe : https://launchpad.net/~dhis2-users >> More help : https://help.launchpad.net/ListHelp >> >> > > > -- > Thanks, > Neeraj Gupta > > _______________________________________________ > Mailing list: https://launchpad.net/~dhis2-users > Post to : dhis2-users@lists.launchpad.net > Unsubscribe : https://launchpad.net/~dhis2-users > More help : https://help.launchpad.net/ListHelp > > -- Best Regards, Brajesh Murari, Postgraduate, Dept of CSE, Chaudhary Devi Lal University, Sirsa, India. The three basic dimensions of human development: a long and healthy life, access to knowledge, and a decent standard of living.
_______________________________________________ Mailing list: https://launchpad.net/~dhis2-users Post to : dhis2-users@lists.launchpad.net Unsubscribe : https://launchpad.net/~dhis2-users More help : https://help.launchpad.net/ListHelp