Re: [Wikitech-l] [Analytics] Wikiscan statistics tool for Wikimedia projects

2017-07-31 Thread Akeron
Thanks Erik, it looks interesting. Currently I am able to maintain a full
dataset for users but not for pages on big wikis, so it may be a good
alternative for displaying the approximate number of edited pages over a
month or more.
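
To make the idea quoted below concrete, here is a minimal sketch of
HyperLogLog counters kept per day and merged for a longer period. It is not
Wikiscan's code: the HyperLogLog class, the precision p=12 and the user names
are illustrative assumptions, and a real deployment would more likely rely
on an existing library.

```python
import hashlib
import math


class HyperLogLog:
    """Tiny HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the item.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        bits = 64 - self.p
        idx = h >> bits                  # first p bits select a register
        rest = h & ((1 << bits) - 1)     # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = bits - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Merging two counters is a register-wise maximum.
        if self.p != other.p:
            raise ValueError("precision mismatch")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b)
                            for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


# Hypothetical data: two daily counters with 1000 users in common.
day1 = HyperLogLog()
for i in range(3000):
    day1.add(f"user{i}")

day2 = HyperLogLog()
for i in range(2000, 5000):
    day2.add(f"user{i}")

both_days = day1.merge(day2)
print(round(both_days.count()))  # approximate number of distinct users
```

The merged counter has the same registers as a counter built directly over
both days, so monthly or yearly figures can be derived from stored daily
counters without rescanning the edits. The price is a small relative error
(on the order of 1-2% with this many registers).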

2017-07-31 17:22 GMT+02:00 Erik Bernhardson:

> On Mon, Jul 31, 2017 at 7:18 AM, Akeron wrote:
>
>> Hi Igal,
>> All suggestions are welcome :)
>> Supporting this feature shouldn't be too difficult in theory because the
>> tool already works with this kind of aggregation (months are built from
>> days, years from months...). The main problem is scalability for stats
>> that require uniqueness, like the number of users or the number of edits
>> *per page*. That's why yearly stats may currently be disabled on some big
>> wikis. So it would be feasible, but with a limit on the number of edits in
>> the range (around 3-5 million), and it would be very slow to load with
>> lots of edits.
>>
>
> One way to handle the scalability problem is to use HyperLogLog counters.
> HyperLogLog is an approximate counting algorithm: you can store daily
> counters and then merge them to get weekly, monthly, etc. counts, avoiding
> the cost of doing the calculation over something like an entire year just
> for the one stat. Of course, because the results are approximate, they may
> not be exactly what you are looking for; just an idea.
>
>
>>
>> Akeron
>>
>> 2017-07-31 14:29 GMT+02:00 יגאל חיטרון :
>>
>>> Hello. It's amazing, thank you very much!
>>> Could I suggest one more feature, please? With it, the tool will be
>>> perfect. I'm talking about aggregation. Any kind of historical statistics
>>> for some day, month or year could also be shown for a range of time. For
>>> example, if we have monthly statistics, we could set the From field to Jan
>>> 2008 and the To field to May 2011, and get the aggregated numbers for this
>>> range. Is it possible?
>>> Thank you very much again,
>>> Igal (User:IKhitron)
>>>
>>> On Jul 30, 2017 22:18, "Pine W"  wrote:
>>>
>>> > Wikiscan is an interesting tool for statistics fans. I suggest briefly
>>> > reading this IEG page
>>> > <https://meta.wikimedia.org/wiki/Grants:IEG/Wikiscan_multi-wiki>, then
>>> > playing with the tool on https://wikiscan.org/
>>> >
>>> > Pine
>>> > ___
>>> > Wikitech-l mailing list
>>> > Wikitech-l@lists.wikimedia.org
>>> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>> ___
>> Analytics mailing list
>> analyt...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>

Re: [Wikitech-l] Wikiscan statistics tool for Wikimedia projects

2017-07-31 Thread Akeron
It depends on the size of the dataset. If you already know the pages or
users you want to compare (a limited dataset of reasonable size), then
there is no scalability issue and it should not be too difficult to
implement. Otherwise, it requires a lot of resources to merge the numbers for
every page or user over the period with a large dataset.
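
For the limited case, where the pages to compare are known in advance, the
summing and sorting asked about in the quoted message below can be sketched
like this. The page names, the numbers and the rank_pages helper are
hypothetical and only illustrate the idea; they are not taken from Wikiscan.

```python
from collections import Counter

# Hypothetical per-year view counts for a few pages chosen in advance.
yearly_views = {
    2015: {"Page A": 1200, "Page B": 900},
    2016: {"Page A": 800, "Page B": 1500},
    2017: {"Page A": 700, "Page B": 400, "Page C": 50},
}


def rank_pages(yearly, pages):
    """Sum the yearly counts of the given pages and sort by total, descending."""
    totals = Counter()
    for counts in yearly.values():
        for page in pages:
            # A page with no entry for a year contributes zero.
            totals[page] += counts.get(page, 0)
    return totals.most_common()


ranking = rank_pages(yearly_views, ["Page A", "Page B", "Page C"])
for page, total in ranking:
    print(page, total)
```

The cost grows with the number of pages times the number of periods, which
is why this stays cheap for a handful of pages but not for every page of a
big wiki.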


2017-07-31 16:59 GMT+02:00 יגאל חיטרון :

> Thank you. I won't claim that I understood your explanation, but I'll try:
> if you have the number of viewers of some page for every year, can't you
> sum them and compare with another page to sort them? And the same for the
> number of a user's edits?
> Igal
>
> On Jul 31, 2017 17:19, "Akeron"  wrote:
>
> > Hi Igal,
> > All suggestions are welcome :)
> > Supporting this feature shouldn't be too difficult in theory because the
> > tool already works with this kind of aggregation (months are built from
> > days, years from months...). The main problem is scalability for stats
> > that require uniqueness, like the number of users or the number of edits
> > *per page*. That's why yearly stats may currently be disabled on some big
> > wikis. So it would be feasible, but with a limit on the number of edits
> > in the range (around 3-5 million), and it would be very slow to load with
> > lots of edits.
> >
> > Akeron
> >
> > 2017-07-31 14:29 GMT+02:00 יגאל חיטרון :
> >
> > > Hello. It's amazing, thank you very much!
> > > Could I suggest one more feature, please? With it, the tool will be
> > > perfect. I'm talking about aggregation. Any kind of historical
> > > statistics for some day, month or year could also be shown for a range
> > > of time. For example, if we have monthly statistics, we could set the
> > > From field to Jan 2008 and the To field to May 2011, and get the
> > > aggregated numbers for this range. Is it possible?
> > > Thank you very much again,
> > > Igal (User:IKhitron)
> > >
> > > On Jul 30, 2017 22:18, "Pine W"  wrote:
> > >
> > > > Wikiscan is an interesting tool for statistics fans. I suggest
> > > > briefly reading this IEG page
> > > > <https://meta.wikimedia.org/wiki/Grants:IEG/Wikiscan_multi-wiki>,
> > > > then playing with the tool on https://wikiscan.org/
> > > >
> > > > Pine

Re: [Wikitech-l] Wikiscan statistics tool for Wikimedia projects

2017-07-31 Thread Akeron
Hi Igal,
All suggestions are welcome :)
Supporting this feature shouldn't be too difficult in theory because the
tool already works with this kind of aggregation (months are built from
days, years from months...). The main problem is scalability for stats that
require uniqueness, like the number of users or the number of edits *per
page*. That's why yearly stats may currently be disabled on some big wikis.
So it would be feasible, but with a limit on the number of edits in the
range (around 3-5 million), and it would be very slow to load with lots of
edits.
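
A small sketch of why uniqueness is the hard part: plain totals can be added
across months, but distinct users have to be merged as sets. The monthly
data, the field names and aggregate_range are made up for illustration and
do not reflect how Wikiscan stores its data.

```python
# Hypothetical monthly stats keyed by (year, month).
monthly = {
    (2008, 1): {"edits": 5, "users": {"Ann", "Bob"}},
    (2008, 2): {"edits": 3, "users": {"Bob", "Cid"}},
    (2008, 3): {"edits": 4, "users": {"Ann"}},
}


def aggregate_range(stats, start, end):
    """Aggregate stats for all months with start <= month <= end."""
    edits = 0
    users = set()
    for month, values in stats.items():
        if start <= month <= end:
            edits += values["edits"]   # additive: partial sums are enough
            users |= values["users"]   # unique: needs the full member sets
    return edits, len(users)


total_edits, unique_users = aggregate_range(monthly, (2008, 1), (2008, 3))

# Adding up the per-month user counts would count Ann and Bob twice.
naive_users = sum(len(v["users"]) for v in monthly.values())
print(total_edits, unique_users, naive_users)
```

Keeping the full sets for every period is what becomes expensive on big
wikis, since the sets grow with the number of users or pages involved.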

Akeron

2017-07-31 14:29 GMT+02:00 יגאל חיטרון :

> Hello. It's amazing, thank you very much!
> Could I suggest one more feature, please? With it, the tool will be
> perfect. I'm talking about aggregation. Any kind of historical statistics
> for some day, month or year could also be shown for a range of time. For
> example, if we have monthly statistics, we could set the From field to Jan
> 2008 and the To field to May 2011, and get the aggregated numbers for this
> range. Is it possible?
> Thank you very much again,
> Igal (User:IKhitron)
>
> On Jul 30, 2017 22:18, "Pine W"  wrote:
>
> > Wikiscan is an interesting tool for statistics fans. I suggest briefly
> > reading this IEG page
> > <https://meta.wikimedia.org/wiki/Grants:IEG/Wikiscan_multi-wiki>, then
> > playing with the tool on https://wikiscan.org/
> >
> > Pine