Re: [Dhis2-users] [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

Dapo Adejumo Sun, 05 Oct 2014 02:56:53 -0700

Hi Jason and Robin (and Devs),

I decided to raise this question here since it is remotely related to the 
discussions below.


I want an indicator that needs inputs beyond what is currently available in 
indicator definitions, for example:

- Percentage of health facilities with BCG coverage below 50%

The number of health facilities can be pulled in using the orgunit count, but 
the challenge is the numerator (the number of health facilities with coverage 
below 50%). BCG coverage is calculated as a separate indicator and could 
technically be recalculated in the numerator definition, but how can the 50% 
logic be introduced in the numerator formula? The only workaround I have 
thought of is to create a data element such as "BCG Coverage less than 50%", 
populated by a script with the value 1 when the facility's coverage is below 
50%, and then to use it as the numerator in the indicator calculation.
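
A minimal sketch of that workaround in Python, assuming the per-facility coverage figures have already been exported from DHIS2. The facility names, coverage values, and the function name are hypothetical; only the "flag 1 when below 50%" logic comes from the description above.

```python
def pct_facilities_below(coverage_by_facility, threshold=50.0):
    """Return the percentage of facilities whose coverage is below threshold."""
    if not coverage_by_facility:
        raise ValueError("no facilities supplied")
    # Flag each facility with 1 when its coverage is below the threshold,
    # mirroring the proposed "BCG Coverage less than 50%" data element.
    flags = {
        facility: 1 if coverage < threshold else 0
        for facility, coverage in coverage_by_facility.items()
    }
    # Numerator: flagged facilities. Denominator: all facilities (orgunit count).
    return 100.0 * sum(flags.values()) / len(flags)


# Hypothetical coverage figures (in percent) for four facilities.
coverage = {
    "Facility A": 42.0,
    "Facility B": 75.5,
    "Facility C": 49.9,
    "Facility D": 50.0,
}
result = pct_facilities_below(coverage)
print(result)  # 50.0
```

A facility sitting exactly at 50% is not counted here, since the indicator is worded as "below 50%".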

Jason and Robin have discussed below the possibility of extending the current 
configuration options for indicators, perhaps by including some logic 
functions similar to those in the validation rules.

 

Has anybody dealt with scenarios similar to the example above, or does anyone 
have ideas on possible solutions?

Thanks!

 

 

………………………………………

Regards,

Dapo Adejumo

+2348033683677

Skype : dapojorge

 

From: Dhis2-devs 
[mailto:dhis2-devs-bounces+dapo_adejumo=yahoo....@lists.launchpad.net] On 
Behalf Of Jason Pickering
Sent: 16 September, 2014 9:06 AM
To: Robin Martens
Cc: dhis2-users@lists.launchpad.net; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

 

I think that is the real issue: you are applying DHIS2 in a domain that is 
slightly different from its typical domain, namely health. I have been 
involved in some other projects that use DHIS2 on the fringes of what it can 
do out of the box, in food security, water and sanitation, and even for 
recording golf handicap scores. What I have seen in each of these domains is 
that there are some challenges with the way the data is aggregated. Lots of 
things work out of the box, like data collection, user management and 
security, etc. But sometimes the analysis needs to be done externally through 
other means. Of course, it would be great if DHIS2 could do all of this for all 
domains, but since its primary focus is the collection and management of health 
data, that is where things work most often (although there are some challenges 
there as well, particularly with data that needs to be averaged or handled 
differently over time or across orgunits, such as ART current count). 
Contributions from the community are of course welcome! :)

 

Regards,

Jason

 

 

On Fri, Sep 12, 2014 at 8:55 AM, Robin Martens <mart...@sher.be> wrote:

Hi Jason,

 

Thanks for taking the time to read through my email.

 

I'll have a look at the different possibilities you proposed, and we'll be 
looking forward to any future upgrade of the calculation method (for now or 
later). I guess it's just that some sectors need more complex indicators than 
others (our project is in forest management).

 

Have a nice day,

 

Robin

 

From: Jason Pickering [mailto:jason.p.picker...@gmail.com]
Sent: 11 September 2014 19:00
To: Robin Martens
Cc: Lars Helge Øverland; dhis2-users@lists.launchpad.net; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

 

Your mail is dense and will need some digestion. :)

 

However, you give a very good level of detail about your problem in this mail, 
and it will be very useful when this type of functionality is eventually 
implemented.

 

To respond immediately to how you might be able to solve the issue: you should 
consider using the WebAPI to extract your data, process it as you need, and 
then inject it back into DHIS2. The WebAPI is described in detail here: 
<https://www.dhis2.org/doc/snapshot/en/user/html/ch32.html>. I have also 
written a chapter on using the R programming language with DHIS2, which is 
particularly well suited to the type of custom calculations you are 
describing. It is available here: 
<https://www.dhis2.org/doc/snapshot/en/user/html/apc.html>. Of course, other 
languages or methods, such as Python, may suit your situation better. 
Lastly, you can have a look at the DHIS2 Ad-hoc tool 
<http://bazaar.launchpad.net/~dhis2-devs-core/dhis2/trunk/files/head:/tools/dhis-adhoc/>, 
which allows interaction with the service layer of DHIS2. Another approach 
could be SQL, which interacts directly with the database. I am sure there are 
many other means as well. So the short answer is that, right now, I think there 
is no built-in way to achieve what you need, and it will take some coding on 
your side.
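
As a rough illustration of the "process externally, then inject back" step, the sketch below turns externally computed figures into a data value set payload in JSON. It assumes the Web API's data value set import accepts a top-level "dataValues" list with dataElement, orgUnit, period and value fields; check the Web API chapter linked above for the exact resource and format in your version. The UIDs, values, and function name are hypothetical, and the HTTP calls for extraction and import are left out.

```python
import json


def build_data_value_set(computed, data_element_uid):
    """Turn {(org_unit_uid, period): value} into a data value set payload."""
    return {
        "dataValues": [
            {
                "dataElement": data_element_uid,
                "orgUnit": org_unit,
                "period": period,
                "value": str(value),
            }
            # Sort by (org unit, period) so the payload order is deterministic.
            for (org_unit, period), value in sorted(computed.items())
        ]
    }


# Hypothetical UIDs and values, standing in for figures computed externally.
computed = {
    ("OU_DISTRICT_1", "2014"): 20,
    ("OU_DISTRICT_2", "2014"): 15,
}
payload = build_data_value_set(computed, "DE_TOTAL_CONSUMPTION")
print(json.dumps(payload, indent=2))
```

The resulting JSON would then be sent to the server with whatever HTTP client you prefer.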

 

We have run into similar issues in the water and sanitation sector, where we 
need to work with the "latest reported data", which DHIS2 does not really 
handle. We pull out the data via the WebAPI, do the aggregation externally, and 
then inject everything back into the system to get the figures we need. It 
would be nice if the system did this automatically, but given the nature of the 
project, there are many feature requests and limited resources. Contributions 
are of course welcome.

 

The current aggregation engine handles the "easy" cases of sums and averages 
pretty well, but for more complex stuff, external routes may be the only 
solution for now. 

 

We should certainly try and distill some of your ideas into a concrete 
blueprint. 

 

Best regards,

Jason

 

 

On Thu, Sep 11, 2014 at 6:15 PM, Robin Martens <mart...@sher.be> wrote:

Hi Jason,

 

I appreciate your help as this is very important for our project, thanks.

 

Some of our indicators are indeed quite complex and might need some custom 
coding if not too complicated. However, can you give some basic steps on how to 
achieve this (and on how hard this is in terms of programming as we're not 
experts here)?

 

---

 

The rest of this mail is about the specific issue I'm having here. It's 
basically related to three things:

1. The absence of "cross-product" calculations in DHIS2 (I think it's what you 
call compulsory pairs of data).

2. The fact that when no data exists on a disaggregated level, the value is 
taken to be zero instead of the aggregated value (for custom dimensions only, 
I think).

3. The average function only exists over the time dimension (as discussed by 
Lars previously this week).

 

A simple example:

 


 

            Population   Conso pp   Total
District 1          10          2      20
District 2           5          3      15
Total               15          5      35

 

When calculating the total national consumption, DHIS2 will do: aggregated 
population (=15) times aggregated consumption per person (=5) makes 75, which 
is wrong. In reality, the two mistakes are:

 

1. The calculation should happen on district level before aggregating to the 
national value (20 for District 1 plus 15 for District 2 makes 35, which is 
the correct answer). -> Cross product

2. DHIS2 always sums over orgunits (to be corrected soon according to Lars, so 
I won't go into further detail here).
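
The difference between the two orders of calculation can be sketched in a few lines of Python. The figures are the ones from the table above; the dictionary layout and function names are illustrative only and do not reflect how DHIS2 stores or computes anything.

```python
# Figures from the example table; the dictionary layout is illustrative.
districts = {
    "District 1": {"population": 10, "conso_pp": 2},
    "District 2": {"population": 5, "conso_pp": 3},
}


def product_of_totals(rows):
    """Aggregate each input first, then multiply (the approach that gives 75)."""
    total_population = sum(r["population"] for r in rows.values())
    total_conso_pp = sum(r["conso_pp"] for r in rows.values())
    return total_population * total_conso_pp


def total_of_products(rows):
    """Multiply within each district, then aggregate (the cross product)."""
    return sum(r["population"] * r["conso_pp"] for r in rows.values())


print(product_of_totals(districts))  # 75
print(total_of_products(districts))  # 35
```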

 

The cross-product issue can actually be "solved" by a workaround: obliging the 
user to explicitly show the disaggregation level (i.e. the level at which the 
cross product happens) in the report tables. Interestingly enough, when 
calculating the total in a report without showing districts, DHIS2 returns 75, 
but when the districts are shown it returns 35.

 

Imagine now that the consumption covers three products (a custom category): A, 
B, and C. The table would look like this:

 


 

            Population  Conso pp A  Conso pp B  Conso pp C  Total A  Total B  Total C  Total
District 1          10           2           1           1       20       10       10     40
District 2           5           3           1           0       15        5        0     20
Total               15           5           2           1       35       15       10     60

 

The same principle applies: calculating per district and per product, and then 
aggregating over the Product category and orgunit dimension, gives the correct 
result of 60. This is how DHIS2 would calculate instead:

 

1. When not showing the Product category in the table: total population (15) x 
total aggregated consumption (5+2+1=8) is 120.

2. When showing the Product category in the table: total population (no value 
is found per product, so zero is returned) x consumption is 0!

 

Indeed, the workaround does work for orgunits but not for custom dimensions 
when not all data (in this case the population) has the same custom dimensions. 
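
The three outcomes (60, 120 and 0) can be reproduced with the table's figures. This is only a model of the behaviour described in this mail, not DHIS2's actual aggregation code; in particular, the empty per-product population lookup is an assumption used to mimic "no value found, so zero".

```python
# Figures from the table above; the data layout is illustrative.
population = {"District 1": 10, "District 2": 5}
conso_pp = {
    ("District 1", "A"): 2, ("District 1", "B"): 1, ("District 1", "C"): 1,
    ("District 2", "A"): 3, ("District 2", "B"): 1, ("District 2", "C"): 0,
}

# Correct: multiply per district and product, then aggregate.
correct = sum(population[d] * c for (d, _product), c in conso_pp.items())

# Product category hidden: both inputs are aggregated before multiplying.
totals_first = sum(population.values()) * sum(conso_pp.values())

# Product category shown: population has no per-product values, so every
# lookup falls through to zero (modelled here with an empty dictionary).
population_by_product = {}
disaggregated = sum(
    population_by_product.get(key, 0) * c for key, c in conso_pp.items()
)

print(correct, totals_first, disaggregated)  # 60 120 0
```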

 

I guess these are things that won't be solved quickly so I might need to do 
some coding myself. As a conclusion, to increase calculation power in DHIS2 I'd 
say:

 

1. Use the aggregated value when no disaggregated value exists (such as for 
population in the previous example).

2. Aggregation operators (sum, average,...) should be defined per custom 
category and per data element. In other words, when creating a data element and 
adding categories, you have to add the operator for each category.

3. Indicators should be available for re-use in other indicators. This lets 
you build complex indicators piece by piece and gives more flexibility in 
intermediate calculations (on disaggregated level).
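
The first suggestion can be sketched as a lookup that falls back to the aggregated value when no disaggregated one is defined. This is a hypothetical helper written for illustration, not an existing DHIS2 function; the key layout (orgunit, category option), with None standing for "no category", is an assumption.

```python
def lookup(values, org_unit, category_option=None):
    """Return the disaggregated value if defined, else the aggregated one."""
    key = (org_unit, category_option)
    if key in values:
        return values[key]
    # The dimension is not defined for this data element: use the total.
    return values[(org_unit, None)]


# Population has no Product category; consumption per person does.
population = {("District 1", None): 10, ("District 2", None): 5}
conso_pp = {
    ("District 1", "A"): 2, ("District 1", "B"): 1, ("District 1", "C"): 1,
    ("District 2", "A"): 3, ("District 2", "B"): 1, ("District 2", "C"): 0,
}

totals = {}
for (district, product), conso in conso_pp.items():
    pop = lookup(population, district, product)
    totals[product] = totals.get(product, 0) + pop * conso

print(totals)                # {'A': 35, 'B': 15, 'C': 10}
print(sum(totals.values()))  # 60
```

With the fallback in place, the per-product totals match the "Total A", "Total B" and "Total C" columns of the table.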

 

I hope this is somewhat more clear.

 

Kind regards,

 

Robin

 

From: Jason Pickering [mailto:jason.p.picker...@gmail.com]
Sent: 11 September 2014 16:30
To: Robin Martens
Cc: Lars Helge Øverland; dhis2-users@lists.launchpad.net; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

You lost me. Could you maybe give a somewhat simpler example of what you mean 
by an "intermediary calculation"?

 

I am not sure exactly what you are trying to achieve, but what I can say is 
that in certain cases, I have had to write my own calculation methods for 
certain indicators which are basically impossible to calculate with the current 
implementation in DHIS2. It works fine for simple sums, averages, and other 
types of statistical things (standard deviation, etc), but if, for instance, 
you want to calculate other statistical properties (skewness, kurtosis) of a 
given set of values, there is no way to do it directly with DHIS2. Also, 
certain indicators depend on component parts and cannot be calculated the way 
DHIS2 does it, which is to first sum up the numerator and denominator and then 
divide, as opposed to calculating a non-weighted average of compulsory pairs 
of data. What I am getting at is that you may have to write your own 
calculation methods, depending on how complex they are.
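
To make the distinction concrete, here is a small Python sketch contrasting the two calculations: summing numerators and denominators before dividing, versus taking the non-weighted average of the per-pair ratios. The pairs are illustrative values only, and the function names are hypothetical.

```python
from statistics import mean

# (numerator, denominator) pairs, one per orgunit; illustrative values only.
pairs = [(20, 10), (15, 5)]


def ratio_of_sums(pairs):
    """Sum numerators and denominators first, then divide."""
    return sum(n for n, _ in pairs) / sum(d for _, d in pairs)


def mean_of_ratios(pairs):
    """Divide within each pair, then take the non-weighted average."""
    return mean(n / d for n, d in pairs)


print(ratio_of_sums(pairs))   # 35 / 15, roughly 2.33
print(mean_of_ratios(pairs))  # 2.5
```

The two results only agree when every pair has the same denominator, which is why the choice matters for indicators built from component parts.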

 

Regards,

Jason

 

 

On Thu, Sep 11, 2014 at 4:20 PM, Robin Martens <mart...@sher.be> wrote:

Hi Jason, 

 

To pick up the point again, there's an additional question I've been looking 
at. 

 

Even if disaggregated indicator reporting is burdensome (as you explain below), 
correct aggregated indicator calculations sometimes need "intermediary 
calculations" according to dimensions in the indicator calculation (the most 
obvious case being the use of weighted averages). These can then be aggregated 
over the whole table to obtain the total aggregated indicator value. Even in 
these intermediary calculations, however, the data is not available for 
calculation, returning zero as a result.

 

The conclusion is that the current way of indicator calculation not only 
complicates (if not makes impossible in many cases) the calculation of 
indicators per custom dimension, but also makes impossible the correct 
calculation of indicators over the period and orgunit dimensions whenever an 
intermediary calculation over custom dimensions is necessary.

 

Can you confirm this?

 

If true, is it hard to modify the calculation method to simply pick the 
one-level-higher value of a data element whenever no disaggregated value 
exists? By "not existing" I don't mean NULL or zero, but rather not defined 
(the dimension does not exist).

 

Robin 

 

From: Jason Pickering [mailto:jason.p.picker...@gmail.com] 
Sent: 10 September 2014 17:55
To: Robin Martens
Cc: Lars Helge Øverland; dhis2-users@lists.launchpad.net; dhis2-devs
Subject: Re: [Dhis2-devs] DHIS2 - Indicator calculation over dimensions

 

Hi Robin,

This has been discussed, and it is certainly not a bug. See a related thread 
here (https://lists.launchpad.net/dhis2-devs/msg27571.html) for a similar 
discussion on validation rules. It is essentially the same for indicators. What 
you will have to do is create a separate indicator for each and every 
combination you need. It can be painful, but it is really the only way I know 
of at the moment.

 

Feel free to file a blueprint here: https://blueprints.launchpad.net/dhis2

 

Regards,

Jason

 

 

On Wed, Sep 10, 2014 at 5:37 PM, Robin Martens <mart...@sher.be> wrote:

Dear all,

 

I've been testing the indicator calculation algorithm and noticed something 
particular of which I'm not sure if it's a bug or a deliberate development 
choice.

 

Indicators are not explicitly defined per category the way data elements are, 
but the reporting tools allow a disaggregated indicator calculation, which is 
definitely very useful. In a specific example, I want to know how many people 
were vaccinated this year and I have 3 kinds of vaccinations: A, B, and C. I 
have two data elements: the total population and the national vaccination 
levels (in %), with a custom category "vaccination type" which can be A, B, or 
C.

 

My indicator would be "total population" x "national vaccination level 
(total)". That works fine when put in a pivot table.

 

However, when trying to disaggregate the indicator calculation by adding my 
custom category to the pivot table, I no longer get any values. It seems 
the reason is that the "total population" data element does not have the 
"vaccination type" category (which seems logical) and therefore isn't found by 
the calculation algorithm. As a result, my table is empty. It would seem useful 
for the algorithm to take the available aggregated value (for population) in 
such cases.

 

Another example is over the period dimension: my population is a yearly value, 
so when calculating an indicator on a monthly basis, instead of taking the 
available yearly value, it takes zero.
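
The period case can be sketched in the same spirit: look for a value in the requested month and, if none exists, use the value of the enclosing year. This is a hypothetical helper, not DHIS2 behaviour; it assumes periods are written as "yyyyMM" for months and "yyyy" for years, and the population and vaccination figures are made up.

```python
def value_for_period(values, period):
    """Return the value for a period, falling back to its enclosing year."""
    if period in values:
        return values[period]
    # Assumed period format: "yyyyMM" for months, "yyyy" for years.
    return values.get(period[:4])


# Hypothetical data: yearly population, monthly vaccination level in percent.
population = {"2014": 1200}
vaccination_level_pct = {"201409": 80}

period = "201409"
pop = value_for_period(population, period)
level = value_for_period(vaccination_level_pct, period)
vaccinated = pop * level / 100
print(vaccinated)  # 960.0
```

Whether a yearly figure may be reused unchanged for each month depends on the data element, so this fallback is only appropriate for values such as a population baseline.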

 

So my question: is this a deliberate choice in the development, a bug, or an 
idea for a future system improvement?

 

Kind regards,

 

Robin

 

 


_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to     : dhis2-d...@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help   : https://help.launchpad.net/ListHelp





 

-- 

Jason P. Pickering
email: jason.p.picker...@gmail.com
tel:+46764147049





 


_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to     : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help   : https://help.launchpad.net/ListHelp
