Re: [Wikitech-l] Page view stats
Hi,

On Mon, 2014-06-02 at 11:36 +0900, ikuyamada wrote:
> It seems that the page view stats have not been uploaded for several days.
> http://dumps.wikimedia.org/other/pagecounts-raw/2014/
> Are there any plans to fix this?

See https://bugzilla.wikimedia.org/show_bug.cgi?id=65978

andre
-- 
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Page view stats
Hello, It seems that the page view stats have not been uploaded for several days. http://dumps.wikimedia.org/other/pagecounts-raw/2014/ Are there any plans to fix this? Thanks. Ikuya
[Wikitech-l] Page view stats failure
Hello, It seems that the page view statistics data does not contain the actual data for the last few hours. http://dumps.wikimedia.org/other/pagecounts-raw/2013/2013-07/ Are there any failures on the server-side? Thanks, Ikuya
Re: [Wikitech-l] Page view stats failure
On Jul 24, 2013 12:43 AM, Ikuya Yamada ik...@sfc.keio.ac.jp wrote:
> It seems that the page view statistics data does not contain the actual data for the last few hours.
> http://dumps.wikimedia.org/other/pagecounts-raw/2013/2013-07/
> Are there any failures on the server-side?

Just looking at file sizes, I can see that 15, 16, and 20-05 (the current hour) UTC all look smaller than normal. (Yes, something's broken.)

-Jeremy
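The by-eye size check Jeremy describes could be automated roughly as follows. This is only a sketch under assumptions: the hourly files are taken to be mirrored into a local directory and named pagecounts-*, and both the helper name flag_small_files and the default threshold (half the median size) are illustrative choices, not anything known to run on the dumps server.

```shell
#!/bin/sh
# flag_small_files DIR [RATIO]
# Hypothetical helper: print the names of pagecounts-* files in DIR whose
# size is below RATIO times the median file size (RATIO defaults to 0.5).
flag_small_files() {
    dir=$1
    ratio=${2:-0.5}
    for f in "$dir"/pagecounts-*; do
        [ -f "$f" ] || continue
        # Emit "<size in bytes> <file name>" for every hourly file.
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "${f##*/}"
    done | sort -n | awk -v r="$ratio" '
        { size[NR] = $1; name[NR] = $2 }
        END {
            if (NR == 0) exit
            # Input is sorted by size, so the middle row holds the median.
            median = size[int((NR + 1) / 2)]
            for (i = 1; i <= NR; i++)
                if (size[i] < r * median) print name[i]
        }'
}
```

Calling `flag_small_files ./2013-07` would list the suspicious hours. A median is used rather than a mean so that a few truncated files do not drag the baseline down, though normal day/night traffic swings may still call for a lower ratio.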
[Wikitech-l] Page view stats we can believe in
I stumbled on the Danish Wiktionary, of all projects. Danish is the 68th biggest language of Wiktionary, and has a little more than 8,000 articles in total. Most of these articles are very short and provide no value to a reader. There is no reason to link to them, and so it is very unlikely that the next user should stumble upon them unless they are me.

Yet, wikistats tries to make me believe that this tiny project has 400,000 or 500,000 page views each month, and has had for a long time:
http://stats.wikimedia.org/wiktionary/EN/TablesPageViewsMonthly.htm
(I'm not talking about January 2012, which seems to have been an error, and reports 2-3 times that many views.)

My guess is that da.wiktionary has 4,000 page views per month, not 400,000. It's more likely that 400,000 is some background noise, an offset number that should be subtracted from the number of page views for any project. If you look at the log files for just one day, you should see my IP address (85.228.something) and 3-4 other users who have been editing lately, and not many more people, but perhaps a bunch of interwiki bots.

We need an explanation for these vastly inflated page view statistics.

-- 
Lars Aronsson (l...@aronsson.se)
Aronsson Datateknik - http://aronsson.se
Re: [Wikitech-l] Page view stats we can believe in
according to http://stats.grok.se/da.d/latest90/mandag, "mandag" has been viewed 127 times in the last 3 months, and ranks 927th.

the raw pagecount files are here:
http://dumps.wikimedia.org/other/pagecounts-raw/

i then took an arbitrary file and looked into it, at midnight, i guess UTC, feb 1st. as all projects are in this file, let's grep for danish wiktionary, "da.d" at the beginning of the line:

    grep '^da\.d\s' pagecounts-20130201-00 | wc
        569    2276   19572

this means 569 pages accessed in this hour, at least once. so let's sort by the third column, which is the page accesses. the largest access counts are at the bottom, so let's take the last 20 lines:

    grep '^da\.d\s' pagecounts-20130201-00 | sort -k3n,3 | tail -20
    da.d pony 2 30008
    da.d skak 2 44151
    da.d Speciel:Eksporter/engelsk 2 7818
    da.d Speciel:Eksporter/hyle 2 4630
    da.d Speciel:Eksporter/krog 2 4632
    da.d Speciel:Eksporter/skaml%C3%A6ber 2 4632
    da.d Forside 3 96050
    da.d horse 3 54974
    da.d interessant 3 9339
    da.d Speciel:Eksporter/arrang%C3%B8rer 3 6948
    da.d Speciel:Eksporter/b%C3%B8ger 3 6948
    da.d Speciel:Eksporter/forg%C3%A6ves 3 6946
    da.d Speciel:Eksporter/hensigtsm%C3%A6ssig 3 6946
    da.d Speciel:Eksporter/hvad 3 9900
    da.d Speciel:Eksporter/indvendig 3 6948
    da.d Speciel:Eksporter/k%C3%A6le 3 6948
    da.d Speciel:Eksporter/monogame 3 6944
    da.d Speciel:Eksporter/revet 3 6946
    da.d Speciel:Eksporter/topstykke 3 6944
    da.d springer 3 45292

this means that e.g. springer was supposedly accessed 3 times in that hour. the article does not exist, but there is a red link out of http://da.wiktionary.org/wiki/Wiktionary:Top_1_(Dansk).

rupert.

On Wed, Feb 13, 2013 at 10:18 PM, Lars Aronsson l...@aronsson.se wrote:
> I stumbled on the Danish Wiktionary, of all projects. [...]
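The grep/wc step above counts rows, that is, distinct titles requested at least once, not requests. A rough way to get both numbers in one pass is sketched below. It assumes the four-column row layout visible in the listing (project, title, request count, bytes) and an already decompressed file; project_summary is a hypothetical helper name. It matches the project column exactly with awk instead of relying on grep's \s, which is a GNU extension.

```shell
#!/bin/sh
# project_summary PROJECT FILE
# Hypothetical helper: report how many distinct titles (rows) a project has
# in one hourly pagecounts file and the sum of their request counts.
# Assumed row layout: project title requests bytes
project_summary() {
    awk -v p="$1" '
        $1 == p { pages++; views += $3 }   # column 3 = requests in this hour
        END { printf "%d pages, %d views\n", pages, views }
    ' "$2"
}
```

Usage would look like `project_summary da.d pagecounts-20130201-00`. Multiplying the hourly sum by the number of hours in a month gives only a crude estimate, since traffic is not constant across the day.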
Re: [Wikitech-l] Page view stats we can believe in
On 02/14/2013 12:03 AM, rupert THURNER wrote:
> this means 569 pages accessed in this hour, at least once.

Thanks for taking the time to do this check! This number already is unreasonable for an obscure project with 8000 articles.

> da.d Speciel:Eksporter/engelsk 2 7818

Should Special:Export ever count as page views? Anyway, there are no humans using Special:Export on da.wiktionary in the middle of the night.

> this means that e.g. springer was supposedly accessed 3 times in that hour. the article does not exist, but there is a red link out of http://da.wiktionary.org/wiki/Wiktionary:Top_1_(Dansk).

So are there some stupid bots that follow red links? There could be a large number of such accesses on Wiktionary (in any language) because there are so many red links. But bots should never be counted among the page views.

-- 
Lars Aronsson (l...@aronsson.se)
Aronsson Datateknik - http://aronsson.se
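To get a feel for how much of the total comes from Special:Export-style rows, one could sum the request column while skipping titles in the special namespace. The sketch below is an assumption-laden illustration: count_content_views is a hypothetical name, the default prefix pattern only covers the English "Special:" and the Danish "Speciel:" seen in the quoted row, and other languages would need their own localized prefixes. It cannot separate bots from humans, because the rows quoted in this thread carry no client information.

```shell
#!/bin/sh
# count_content_views PROJECT FILE [SKIP_REGEX]
# Hypothetical helper: sum the request column for PROJECT, ignoring titles
# that match SKIP_REGEX (an awk extended regular expression).
# Assumed row layout: project title requests bytes
count_content_views() {
    awk -v p="$1" -v skip="${3:-^(Special|Speciel):}" '
        $1 == p && $2 !~ skip { views += $3 }
        END { printf "%d\n", views }
    ' "$2"
}
```

Comparing this figure with the unfiltered sum for the same hour shows the share of requests that went to special pages.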
Re: [Wikitech-l] Page view stats we can believe in
Hi all,

Lars, Rupert, thanks for flagging this; you are quite right: the numbers are too high because webstatscollector, the software that does the counts, just counts every request as a hit, including bots, error pages, etc.

I am planning on running a sprint at the Amsterdam Hackathon to build an easily queryable datastore with clean pageview counts. Please let me know if you are interested in this so I can pitch it.

Best,
Diederik

On Wed, Feb 13, 2013 at 3:36 PM, Lars Aronsson l...@aronsson.se wrote:
> Should Special:Export ever count as page views? [...]
> But bots should never be counted among the page views.
[Wikitech-l] 'Page View' Stats for the Timed Media Handler
Greetings all,

Victor is releasing a video tomorrow for Valentine's Day, and whilst I was discussing it with him, the topic of how many users actually watch our videos came up. Do we currently have a way of collecting play/click stats for content played by the TimedMediaHandler off of Commons? If not, does anyone have any ideas on how to go about getting this information?

~Matt Walker
Wikimedia Foundation
Fundraising Technology Team
Re: [Wikitech-l] page view stats redux
2011/11/13 Ariel T. Glenn ar...@wikimedia.org:
> The files should now be available automatically within the hour.
> Ariel

Thanks! But it seems that the update of the pagecounts files has been stopped for the past few hours. Is this a temporary problem?

Thanks, Ikuya
Re: [Wikitech-l] page view stats redux
> Thanks! But it seems that the update of pagecounts files is stopped for the past few hours. Is this a temporary problem?
> Thanks, Ikuya

Yes, very temporary. A mistaken side-effect of taking Domas' server out of the loop; fixed.

Ariel
Re: [Wikitech-l] page view stats redux
Very cool! Is there a readme or project page somewhere that explains what all these files are?

On Wed, Nov 16, 2011 at 1:27 PM, Ariel T. Glenn ar...@wikimedia.org wrote:
> Yes, very temporary. A mistaken side-effect of taking Domas' server out of the loop; fixed.
Re: [Wikitech-l] page view stats redux
Yes, the index page :-P ;-)
http://dumps.wikimedia.org/other/pagecounts-raw/

Perhaps you have specific questions that aren't answered there? If so, spill 'em and we'll try to add that information or links to it.

Ariel

On 16-11-2011, Wed, at 13:58 -0500, Fred Zimmerman wrote:
> very cool! is there a readme or project page somewhere that explains what all these files are?
Re: [Wikitech-l] page view stats redux
blush. I just found that page. I was spending all my time looking at the directory of the derived products. Got it!

On Wed, Nov 16, 2011 at 2:05 PM, Ariel T. Glenn ar...@wikimedia.org wrote:
> Yes, the index page :-P ;-)
> http://dumps.wikimedia.org/other/pagecounts-raw/
Re: [Wikitech-l] page view stats redux
Has anyone already done the work of determining which pages are the fastest movers, in a way that can be shared? Comparing the hour-over-hour stats for each page would require a lot of resources ...

On Wed, Nov 16, 2011 at 2:05 PM, Ariel T. Glenn ar...@wikimedia.org wrote:
> Yes, the index page :-P ;-)
> http://dumps.wikimedia.org/other/pagecounts-raw/
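For a single pair of hours, the hour-over-hour comparison asked about here can be sketched with standard tools. The example below is illustrative only: top_movers is a hypothetical helper, it assumes two decompressed hourly files in the four-column layout (project, title, requests, bytes) with no spaces in titles, and it ranks by absolute increase in requests, which is just one possible definition of "fastest mover". It keeps the whole earlier hour in memory, so the resource concern raised above still applies to full-size files.

```shell
#!/bin/sh
# top_movers PREV_FILE CUR_FILE [N]
# Hypothetical helper: print the N rows (default 10) with the largest
# increase in requests between two hourly files, as "delta project title".
# Titles that appear only in PREV_FILE are not reported.
top_movers() {
    awk '
        # First file: remember each page and its request count.
        FILENAME == ARGV[1] { prev[$1 " " $2] = $3; next }
        # Second file: pages unseen in the first file count from zero.
        { print $3 - prev[$1 " " $2], $1, $2 }
    ' "$1" "$2" | sort -k1,1nr | head -n "${3:-10}"
}
```

A call such as `top_movers pagecounts-A pagecounts-B 20` (file names here are placeholders) would list the twenty largest jumps. Ranking by ratio instead of difference would surface small pages that suddenly spike, at the cost of more noise.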
Re: [Wikitech-l] page view stats redux
On 09-11-2011, Wed, at 10:07 -0500, Sean Timm wrote:
> On 11/9/2011 8:21 AM, Ikuya Yamada wrote:
>>> I had thought to do a daily update. If it turns out that hourly updates are indeed useful, I'll set that up. I don't know of anyone else that has a current mirror.
>> I had been using the hourly updated data previously provided in dammit.lt in order to detect the real-time trending topics in Wikipedia. It is highly accurate and it seems that the data can be used for various use cases. So, I'd greatly appreciate it if you set it up.
>> Thanks, Ikuya
> That is my use case as well.
> Thanks, Sean

The files should now be available automatically within the hour.

Ariel
Re: [Wikitech-l] page view stats redux
> I had thought to do a daily update. If it turns out that hourly updates are indeed useful, I'll set that up. I don't know of anyone else that has a current mirror.

I had been using the hourly updated data previously provided in dammit.lt in order to detect the real-time trending topics in Wikipedia. It is highly accurate and it seems that the data can be used for various use cases. So, I'd greatly appreciate it if you set it up.

Thanks, Ikuya
Re: [Wikitech-l] page view stats redux
On 11/9/2011 8:21 AM, Ikuya Yamada wrote:
>> I had thought to do a daily update. If it turns out that hourly updates are indeed useful, I'll set that up. I don't know of anyone else that has a current mirror.
> I had been using the hourly updated data previously provided in dammit.lt in order to detect the real-time trending topics in Wikipedia. It is highly accurate and it seems that the data can be used for various use cases. So, I'd greatly appreciate it if you set it up.
> Thanks, Ikuya

That is my use case as well.

Thanks, Sean
Re: [Wikitech-l] page view stats redux
Ariel T. Glenn ariel at wikimedia.org writes:
> I think we finally have a complete copy from December 2007 through August 2011 of the pageview stats scrounged from various sources, now available on our dumps server. See http://dumps.wikimedia.org/other/pagecounts-raw/
> Ariel

This is very cool. Thanks for the work, Ariel. I'm interested to look at the historical data. It appears that page view data is pushed to dumps.wikimedia.org daily. dammit.lt used to push page view stats hourly, but it appears to be down now. Are hourly pushes still available somewhere?

Thanks, Sean
Re: [Wikitech-l] page view stats redux
On 07-11-2011, Mon, at 18:41 +, Sean Timm wrote:
> Ariel T. Glenn ariel at wikimedia.org writes:
>> I think we finally have a complete copy from December 2007 through August 2011 of the pageview stats scrounged from various sources, now available on our dumps server. See http://dumps.wikimedia.org/other/pagecounts-raw/
>> Ariel
> This is very cool. Thanks for the work, Ariel. I'm interested to look at the historical data. It appears that page view data is pushed to dumps.wikimedia.org daily. dammit.lt used to push page view stats hourly, but it appears to be down now. Are hourly pushes still available somewhere?
> Thanks, Sean

I had thought to do a daily update. If it turns out that hourly updates are indeed useful, I'll set that up. I don't know of anyone else that has a current mirror.

Ariel
Re: [Wikitech-l] page view stats redux
Hi!

> I had thought to do a daily update. If it turns out that hourly updates are indeed useful, I'll set that up. I don't know of anyone else that has a current mirror.

Yeh, don't believe anything I say, wait for someone on mailing list to tell you the same to make conclusions.

Domas
Re: [Wikitech-l] page view stats redux
Ariel T. Glenn wrote:
> I think we finally have a complete copy from December 2007 through August 2011 of the pageview stats scrounged from various sources, now available on our dumps server.

Great news! I do think there should be a note about the systemic under-reporting that made statistics from the last quarter of 2009 and first half of 2010 unreliable, however.

-- 
Harry (User:Jarry1250)
Re: [Wikitech-l] page view stats redux
Yes, and I've already been getting the information on that together so it can be documented. :-)

Ariel

On 18-09-2011, Sun, at 11:55 +0100, Harry Burt wrote:
> Great news! I do think there should be a note about the systemic under-reporting that made statistics from the last quarter of 2009 and first half of 2010 unreliable, however.
Re: [Wikitech-l] page view stats redux
Thanks Ariel. That is important data to preserve.

2011/9/15 Ariel T. Glenn ar...@wikimedia.org:
> I think we finally have a complete copy from December 2007 through August 2011 of the pageview stats scrounged from various sources, now available on our dumps server. See http://dumps.wikimedia.org/other/pagecounts-raw/
> Ariel
Re: [Wikitech-l] page view stats redux
This is really cool! Thanks Ariel and team for making this available.

best,
Diederik

On Thu, Sep 15, 2011 at 5:16 PM, MZMcBride z...@mzmcbride.com wrote:
> This is a great step in the right direction! Thanks!
> MZMcBride
[Wikitech-l] page view stats redux
I think we finally have a complete copy from December 2007 through August 2011 of the pageview stats scrounged from various sources, now available on our dumps server. See http://dumps.wikimedia.org/other/pagecounts-raw/ Ariel
Re: [Wikitech-l] page view stats redux
Ariel T. Glenn wrote:
> I think we finally have a complete copy from December 2007 through August 2011 of the pageview stats scrounged from various sources, now available on our dumps server. See http://dumps.wikimedia.org/other/pagecounts-raw/

This is a great step in the right direction! Thanks!

MZMcBride