Re: [Wiki-research-l] Unique visitor stats

2014-09-07 Thread Andrew G. West
I have spoken with Erik Zachte about this, who confirmed that mobile 
views are not being counted on a per-article basis.


Indeed, that documentation does point out the existence of *.mw 
key(s) for such views. However, if you download one of those raw 
pagecount files and 'grep' for that string, you'll find it appears 
exactly once, where the aggregate number of mobile views over all 
articles is counted (i.e., one mega-aggregate number, not the several 
million article-granularity counts we would expect/like).
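
For anyone who wants to reproduce that check, here is a minimal Python 
sketch. It assumes a decompressed hourly pagecounts file in the standard 
"project page_title count bytes" format; the filename and function name 
are illustrative only.

import sys

# Scan a decompressed pagecounts-raw hourly file and print every line whose
# project code carries the ".mw" (mobile) suffix. In practice only a handful
# of aggregate rows appear, not per-article mobile counts.
def mobile_lines(path):
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            fields = line.split(" ")
            if len(fields) >= 3 and fields[0].endswith(".mw"):
                yield line.rstrip("\n")

if __name__ == "__main__":
    for line in mobile_lines(sys.argv[1]):   # e.g. pagecounts-20140907-140000
        print(line)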


Thanks, -AW


On 09/07/2014 02:34 PM, Oliver Keyes wrote:

Er. That's not true, I don't think. See the notes at
http://dumps.wikimedia.org/other/pagecounts-raw/ - Wikimedia mobile
pageviews are counted, just in a distinct way, because webstatscollector
(the software that powers page-by-page pageview collection) looks primarily
at aggregating entire URLs.

On 7 September 2014 13:41, Andrew G. West west.andre...@gmail.com wrote:

Related:


https://en.wikipedia.org/wiki/User_talk:West.andrew.g/Popular_pages#STICKY:_On_the_Non-Reporting_of_Mobile_Views

--
Andrew G. West, PhD
Research Scientist
Verisign Labs - Reston, VA
Website: http://www.andrew-g-west.com


On 09/07/2014 03:27 AM, Pine W wrote:

Good to hear. I note that, according to the Wikimedia Report Card, total
pageviews are holding fairly steady even as Comscore reports a decline in
unique visitors. If pageviews are holding steady despite the reuse of
Wikipedia article summaries in search engines, I think this is a net
positive. If we have more confidence in the pageview data than in the
Comscore data, then I am inclined to believe that the net situation is
significantly better than what the Comscore data alone would suggest.

Pine

On Sat, Sep 6, 2014 at 9:16 AM, Aaron Halfaker ahalfa...@wikimedia.org wrote:

 FYI, this plan will involve a public proposal and discussion. :)
 More to come.


 On Sat, Sep 6, 2014 at 6:52 AM, Oliver Keyes oke...@wikimedia.org wrote:

 The key word is developing: we don't have it yet. We'd like to have a
 proper discussion of the privacy implications around it before we do so.


 On 6 September 2014 03:40, Pine W wiki.p...@gmail.com wrote:

 Dario and company,

 I heard a portion of the discussion during the September metrics
 meeting about Comscore saying that Wikimedia globally has a
 significant decline in unique visitors, but this does not take into
 account mobile users.

 I thought that Wikimedia was developing an internal way of measuring
 unique visitors and was using Comscore data mainly to validate the
 internal data.

 Can you give an update on what the internal data shows about global
 uniques?

 Pine

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l




 --
 Oliver Keyes
 Research Analyst
 Wikimedia Foundation

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l



___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: [Wiki-research-l] How fast is Wikipedia?

2014-04-23 Thread Andrew G. West

http://collablab.northwestern.edu/pubs/ABS2013_Keegan.pdf


On 04/23/2014 03:55 AM, Johannes Hoffart wrote:

Hi everybody,

I was wondering if there is any work answering the question of how up-to-date 
Wikipedia is.

For some high-impact news, like Snowden's revelation of the PRISM program, 
articles are written in mere hours. For others, e.g. business people taking on 
important posts in companies and thus becoming Wikipedia-relevant, it sometimes 
takes weeks until an article is written (Ian Robertson of BMW is an example).

Is there some work trying to answer this question of how long it takes for 
Wikipedia articles to be created after an event becomes newsworthy (and 
eventually ends up in Wikipedia)?

Cheers
Johannes

--
Doctoral Researcher

Max-Planck-Institut fuer Informatik
Databases and Information Systems
Campus E1.4
66123 Saarbruecken

http://www.mpi-inf.mpg.de/~jhoffart/


___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l



--
Andrew G. West, PhD
Research Scientist
Verisign Labs - Reston, VA
Phone:   (304)-415-5824 (personal)
 (571)-455-6161 (mobile)
 (703)-948-4431 (office)
Email:   west.andre...@gmail.com
 aw...@verisign.com
Website: http://www.andrew-g-west.com

Please direct all correspondence not germane to Verisign's business
purposes to my personal phone and/or email addresses.

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] data about failed searches

2013-12-19 Thread Andrew G. West

Greetings,

I am the individual who provided code to Gerard. Regarding the Bugzilla 
entry serving as a blocker for this and many other inquiries, I will 
note that my code fires nightly to obtain one day's worth of pageview 
stats and does write them to an SQL database. I have been persistently 
storing all pageview statistics for en.wp in this query-able format for 
2+ years at this point. I then use this data in my research, as well as 
reports such as [https://en.wikipedia.org/wiki/Wikipedia:Top_5000_pages] 
and [https://en.wikipedia.org/wiki/Wikipedia:TOPRED]. Is it production 
ready? Probably not, but it works for me as research code.


My limitations with this are primarily hardware-based. I do it all on a 
single commodity server that also runs services like [[WP:STiki]]. Thus: 
(a) I don't particularly have the storage to do all languages/projects. 
CPU cycles would also become an issue at this scale. It can take up to 3 
hours to parse a day's worth of en.wp stats. It could be done quicker, 
but with my query-driven indices and scalable format, this is how it 
goes. (b) I am not in a position to open this as a private or public 
API. It would be trivial to DOS this server with some pretty simple 
queries (en.wp sees 10 million+ article titles daily, I think, as this 
data includes attempted URL accesses that don't exist and there is all 
manner of muck in that regard).


I am not sure what Gerard is chasing in particular with missing 
searches, but regardless, I get an overwhelming number of requests to 
do popular pages or redlinks reports for various projects/languages. My 
code could do this by changing a small handful of strings; what it 
really needs is a place to run and someone to oversee it. More than a 
dev, this seems to be in the realm of someone like Erik Zachte, not that 
I am trying to add to anyone's responsibilities. -AW



On 12/19/2013 06:14 AM, Gerard Meijssen wrote:

Hoi,

As I said, there is software that does basically what we need it to do.
I am asking for access for Magnus so that he can modify that software
and make it more useful.

Waiting for perfection takes too long. The need for this functionality
exists and the arguments are in my initial mail.
Thanks,
GerardM


On 19 December 2013 12:10, Federico Leva (Nemo) nemow...@gmail.com wrote:

Gerard Meijssen, 19/12/2013 12:06:

Hoi,
Sorry .. the link [1] and the blog post [2] I wrote when I
learned about it.
Thanks,
   Gerard


[1] https://en.wikipedia.org/wiki/User:West.andrew.g/Popular_redlinks
[2] http://ultimategerardm.blogspot.nl/2013/11/a-brilliant-idea-barnstar.html


Ah. Those are not searches, they're direct URL accesses (where
enabled, wdsearch.js shows Wikidata search results for those too).
So again that would require the good old
https://bugzilla.wikimedia.org/show_bug.cgi?id=42259 , our usual
blocker. :( Actual search result misses are quite a bit harder
to get.


Nemo

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l




___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l



--
Andrew G. West, PhD
Research Scientist
Verisign Labs - Reston, VA
Website: http://www.andrew-g-west.com

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


[Wiki-research-l] Possible project funding opportunity

2013-02-16 Thread Andrew G. West

Greetings fellow [Wiki-research-l] subscribers,

I just wanted to notify everyone of the new flow funding initiative 
[1] in which I am participating. Essentially, the program decentralizes 
the grant process for small amounts ($2000), giving plentiful 
discretion to individual flow funders (i.e., myself and others).


Since my interests tend to be research-based -- pertaining to quantitative 
methods, wiki security, and editing tools (as opposed to, say, local 
meetups and community organizing) -- I thought some others on the list 
might have some proposals that fall within my scope.


I imagined a grant of this size might be useful -- for example -- in the 
context of a Senior/Summer project; financing computing resources or a 
corpus-labeling task on MTurk. The project is in its pilot stages, so we 
encourage ideas that test/stretch our current expectations.


To submit an idea my way, visit [2]. The main portal [1] also lists the 
other members who might entertain your ideas.


Thanks, -AW

[1] http://meta.wikimedia.org/wiki/FF_portal
[2] http://en.wikipedia.org/wiki/User:West.andrew.g/Flow_funding

--
Andrew G. West, Doctoral Candidate
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Email:   west...@cis.upenn.edu
Website: http://www.andrew-g-west.com

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] 2012 top pageview list

2013-01-03 Thread Andrew G. West

The Google Doodle often explains some of the most unusual:

http://en.wikipedia.org/wiki/List_of_Google_Doodles_in_2012

Thanks, -AW


On 01/03/2013 04:06 PM, Kerry Raymond wrote:

Sorry, I meant the referrer stats for the top pages of 2012 in the hope
that some unusual patterns might shed some light on why some of these pages
are so popular (contrary to what common sense might suggest).

Kerry

-Original Message-
From: Federico Leva (Nemo) [mailto:nemow...@gmail.com]
Sent: Thursday, 3 January 2013 10:26 PM
To: kerry.raym...@gmail.com; Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] 2012 top pageview list

Kerry Raymond, 02/01/2013 22:46:

The problem (as always) is that there is a difference between pages served
(by the web server) and pages actually wanted and read by the user.

It would be interesting to have referrer statistics. I'm guessing that
many Wikipedia pages are being referred by Google (and other general search
engines).


See http://stats.wikimedia.org/wikimedia/squids/SquidReportGoogle.htm

Nemo


___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l



--
Andrew G. West, Doctoral Candidate
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Email:   west...@cis.upenn.edu
Website: http://www.andrew-g-west.com

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] 2012 top pageview list

2013-01-01 Thread Andrew G. West
I got a couple of private replies to this thread, so I figured I would 
just answer them publicly for the benefit of the list:



(1) Do I only parse/store English Wikipedia?

Yes; for scalability reasons and because that is my research focus. I'd 
consider opening my database to users with specific academic uses, but 
it's probably not the most efficient way to do a lot of computations (see 
below). Plus, I transfer the older tables to offline drives, so I 
probably only have ~6 months of the most recent data online.



(2) Can you provide some insights into your parsing?

First, I began collecting this data for the purposes of:

http://repository.upenn.edu/cis_papers/470/

Where I knew the revision IDs of damaging revisions and wanted to reason 
about how many people saw that article/RID in its damaged state. This 
involved storing data on EVERY article at the finest granularity 
possible (hourly) and then assuming uniform intra-hour distributions.


See the URL below for my code (with the SQL server credentials blanked 
out) that does this work. A nightly [cron] task fires the Java code. It 
goes and downloads an entire day's worth of files (24) and parses them. 
These files contain data for ALL WMF projects and languages, but I use a 
simple string match to only handle en.wp lines. Each column in the 
database represents a single day and contains a binary object wrapping 
(hour, hits) pairs. Each table contains 10 consecutive days of data. 
Much of this design was chosen to accommodate the extremely long tail 
and sparseness of the view distribution; filling a DB with billions of 
NULL values didn't prove to be too efficient in my first attempts. I 
think I use ~1TB yearly for the English Wikipedia data.
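
In rough pseudocode terms, the nightly pass looks something like the 
Python sketch below (this is not the actual Java code; the filenames, 
helper names, and blob layout are illustrative only): filter the en.wp 
lines from the 24 hourly files and pack each article's non-zero 
(hour, hits) pairs into a compact per-day binary blob.

import struct
from collections import defaultdict

def parse_hour(path, hour, counts):
    """Add one hourly file's en.wp counts to counts[title][hour]."""
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            fields = line.split(" ")
            if len(fields) >= 3 and fields[0] == "en":      # en.wp lines only
                counts[fields[1]][hour] += int(fields[2])

def pack_day(counts):
    """Pack each article's sparse hourly counts into a (hour, hits) byte blob."""
    return {title: b"".join(struct.pack("!BI", h, n)
                            for h, n in sorted(by_hour.items()))
            for title, by_hour in counts.items()}

counts = defaultdict(lambda: defaultdict(int))
for hour in range(24):                                      # one day = 24 files
    parse_hour(f"pagecounts-20121228-{hour:02d}0000", hour, counts)
blobs = pack_day(counts)   # one blob per title; stored as one DB column per day

Storing only the non-zero hours is what keeps the long, sparse tail of the 
view distribution from turning into billions of NULLs.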


I would appreciate it if anyone who ends up using this code would 
cite/acknowledge my original work above. However, I imagine most 
will want to do a bit more aggregation, and hopefully this can provide a 
baseline for doing that.


Thanks, -AW


CODE LINK:
http://www.cis.upenn.edu/~westand/docs/wp_stats.zip



On 12/29/2012 11:06 PM, Andrew G. West wrote:

The WMF aggregates them as (page,views) pairs on an hourly basis:

http://dumps.wikimedia.org/other/pagecounts-raw/

I've been parsing these and storing them in a query-able DB format (for
en.wp exclusively; though the files are available for all projects I
think) for about two years. If you want to maintain such a fine
granularity, it can quickly become a terabyte-scale task that eats up a
lot of processing time.

If you're looking for more coarse-granularity reports (like top views for
day, week, month) a lot of efficient aggregation can be done.

See also: http://en.wikipedia.org/wiki/Wikipedia:5000

Thanks, -AW


On 12/28/2012 07:28 PM, John Vandenberg wrote:

There is a steady stream of blogs and 'news' about these lists

https://encrypted.google.com/search?client=ubuntuchannel=fsq=%22Sean+hoyland%22ie=utf-8oe=utf-8#q=wikipedia+top+2012hl=ensafe=offclient=ubuntutbo=dchannel=fstbm=nwssource=lnttbs=qdr:wsa=Xpsj=1ei=GzjeUOPpAsfnrAeQk4DgCgved=0CB4QpwUoAwbav=on.2,or.r_gc.r_pw.r_cp.r_qf.bvm=bv.1355534169,d.aWMfp=4e60e761ee133369bpcl=40096503biw=1024bih=539


How does a researcher go about obtaining access logs with useragents
in order to answer some of these questions?





--
Andrew G. West, Doctoral Candidate
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Email:   west...@cis.upenn.edu
Website: http://www.andrew-g-west.com

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] 2012 top pageview list

2012-12-29 Thread Andrew G. West

The WMF aggregates them as (page,views) pairs on an hourly basis:

http://dumps.wikimedia.org/other/pagecounts-raw/

I've been parsing these and storing them in a query-able DB format (for 
en.wp exclusively; though the files are available for all projects I 
think) for about two years. If you want to maintain such a fine 
granularity, it can quickly become a terabyte-scale task that eats up a 
lot of processing time.

If you're looking for more coarse-granularity reports (like top views for 
day, week, month) a lot of efficient aggregation can be done.
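
For example, a daily top-N report boils down to something like the 
following Python sketch (filenames are illustrative, en.wp lines only; 
weekly or monthly reports just sum more files):

from collections import Counter

daily = Counter()
for hour in range(24):
    # one decompressed hourly file per loop iteration
    with open(f"pagecounts-20121228-{hour:02d}0000",
              encoding="utf-8", errors="replace") as f:
        for line in f:
            fields = line.split(" ")
            if len(fields) >= 3 and fields[0] == "en":   # en.wp lines only
                daily[fields[1]] += int(fields[2])

for title, views in daily.most_common(50):               # top 50 for the day
    print(f"{views:>10}  {title}")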


See also: http://en.wikipedia.org/wiki/Wikipedia:5000

Thanks, -AW


On 12/28/2012 07:28 PM, John Vandenberg wrote:

There is a steady stream of blogs and 'news' about these lists

https://encrypted.google.com/search?client=ubuntuchannel=fsq=%22Sean+hoyland%22ie=utf-8oe=utf-8#q=wikipedia+top+2012hl=ensafe=offclient=ubuntutbo=dchannel=fstbm=nwssource=lnttbs=qdr:wsa=Xpsj=1ei=GzjeUOPpAsfnrAeQk4DgCgved=0CB4QpwUoAwbav=on.2,or.r_gc.r_pw.r_cp.r_qf.bvm=bv.1355534169,d.aWMfp=4e60e761ee133369bpcl=40096503biw=1024bih=539

How does a researcher go about obtaining access logs with useragents
in order to answer some of these questions?



--
Andrew G. West, Doctoral Candidate
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Website: http://www.andrew-g-west.com

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Pageviews and ratings

2012-02-07 Thread Andrew G. West

Hi Ben,

If you are interested in pageviews, the best available public resource is:

[http://dumps.wikimedia.org/other/pagecounts-raw/]

which provides an aggregate count of views for a page, by hour (and I 
have a parser to store all this to a MySQL DB if it interests you). 
However, this does not map views to a particular identifier (username or 
IP address) or an exacting time-stamp, as you seem to desire. This might 
be tough because:


* The WMF treats the IP addresses of registered editors as confidential 
information. IP addresses are used for unregistered editing. Regardless, 
no data pertaining to simple access is available in a public-facing 
fashion to my knowledge (and if it were, it would be trivial to 
determine the IP addresses of registered editors)

* Assuming you were allowed to view it, even for an hour's time, the 
apache-like log of en:wp access would be LARGE. Consider that the terse 
and aggregate format they make available is already on the order of 
~80MB/hour zipped.


I am not terribly familiar with the article ratings tool and its 
operation, but I assume it would incur the same privacy concerns. 
Ratings data does seem to be accessible via the API:


[http://en.wikipedia.org/w/api.php]

But there are no fields describing the user/IP that left that feedback.

-

Of course, I speak only of publicly available data. If you are able to 
convince the administration to collect and confidentially share this 
data, it would become more feasible (although you'd be trying to trace 
user click-paths from a -ton- of data).


It's not my intention to discourage you, but have you thought about 
looking at this in a more aggregate fashion (i.e., average daily 
talk-page views vs. article quality rating)? -AW




On 02/07/2012 03:03 PM, W. Ben Towne wrote:

Hello,
Does the English Wikipedia currently track pageviews?

I'm doing a study looking at the page ratings, and how that is (or
isn't) affected by a reader's understanding of the discussion process
that went on behind the scenes. We'd really like to be able to know if
the rater saw the talk page before they rated the article. As secondary
goals, we'd like to see if they edited the article and/or talk page, and
as a tertiary goal, we'd like to measure how familiar they are with
Wikipedia and talk pages in general (e.g. do they even know Talk pages
exist, are they a frequent discussant on them, etc.).
If it is possible to get the information about ratings and pageviews
(esp. common fields/links between them), can somebody guide me on how to?
If the data is currently not collected but there is a way to start doing
so (i.e. no philosophical objection or significant tech/performance
issue b/c of the caching layers), who's the right person to work with
for that?

Thanks!

Grace and peace,
Ben



--
Andrew G. West, Doctoral Student
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Email:   [last name] + and @cis.upenn.edu
Website: http://www.cis.upenn.edu/~westand

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] New toolbox Wikipedia pages

2011-01-25 Thread Andrew G. West
I'll add another note to this article view discussion:

I have parsed the hourly, per-page statistics at 
[http://dammit.lt/wikistats/]. If one assumes uniform intra-hour 
distributions, this makes it possible to arrive at highly accurate view 
estimates for arbitrary pages, for arbitrary time intervals.

I have found this useful to measure how many people saw a particular 
revision and used this heavily in my anti-vandalism research.
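
Concretely, the estimate is just a weighted sum of the hourly buckets 
that overlap the interval of interest. A small Python sketch of the idea 
(the function and variable names are illustrative, not my actual code):

from datetime import datetime, timedelta

def estimate_views(hourly_counts, start, end):
    """Estimate views in [start, end) given {hour_start: hits},
    assuming views are spread uniformly within each hour."""
    total = 0.0
    hour = start.replace(minute=0, second=0, microsecond=0)
    while hour < end:
        nxt = hour + timedelta(hours=1)
        overlap = (min(end, nxt) - max(start, hour)).total_seconds() / 3600.0
        total += overlap * hourly_counts.get(hour, 0)
        hour = nxt
    return total

# e.g. a damaged revision live from 13:40 to 15:10 is credited with 1/3 of
# the 13:00 hour's views, all of the 14:00 hour, and 1/6 of the 15:00 hour:
print(estimate_views({datetime(2011, 1, 25, 13): 300,
                      datetime(2011, 1, 25, 14): 360,
                      datetime(2011, 1, 25, 15): 240},
                     datetime(2011, 1, 25, 13, 40),
                     datetime(2011, 1, 25, 15, 10)))   # -> 500.0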

I believe this is the same data source all these other services are 
using -- but I don't do any aggregation. I've got data for all of 2010 
for en.wiki (some 400+GB). I'd imagine this volume of parsing and 
storage isn't something all Wiki researchers are capable of.

So, while I have yet to develop this into a formal public-facing API -- 
I'd be willing to run queries for interested researchers -- and they 
should feel free to contact me.

Thanks, -Andrew G. West


On 01/25/2011 04:22 PM, Carlos d'Andréa wrote:
 Hi, Felipe,

 these tools are really useful!

 I also like the Wikipedia Page History Statistics tool:
 http://vs.aka-online.de/cgi-bin/wppagehiststat.pl

 Here in Brazil I've developed (with a computer science student) a tool
 that extracts other interesting data from page histories, like the number
 of protections and the duration of each, the number of reversions and
 edits undone, and the number and percentage of edits made by
 administrators, bots, IPs, etc.

 Unfortunately it only works on the Portuguese Wikipedia, but we are very
 interested in opening the code and making it better.

 BTW, as this is my first message here, let me introduce myself: I'm a
 journalist, a teacher at the Federal University of Viçosa, and a PhD
 student in Applied Linguistics at the Federal University of Minas Gerais.
 In summary, I'm studying the editorial process of biographies of living
 persons on the Portuguese Wikipedia.

 Best,

 --
 Carlos d'Andréa
 carlosdand.com
 novasm.blogspot.com



 On Tue, Jan 25, 2011 at 6:30 PM, Felipe Ortega glimmer_phoe...@yahoo.es wrote:

 Hi all.

 I just discovered this, it may be potentially interesting for the
 Wikipedia
 research community.

 In short, now for any Wikipedia page, not only articles, e.g.

 http://en.wikipedia.org/wiki/History_of_free_and_open_source_software

 You can access, from the corresponding View history page:

 * Nice stats (via soxred93 tool in Toolserver):
 http://toolserver.org/~soxred93/articleinfo/index.php?article=History_of_Free_Software&lang=en&wiki=wikipedia


 * Ranked contributors (Daniel's tool in Toolserver):
 http://toolserver.org/~daniel/WikiSense/Contributors.php?wikilang=en&wikifam=.wikipedia.org&grouped=on&page=History_of_Free_Software


 * Revision history search (WikiBlame):
 http://wikipedia.ramselehof.de/wikiblame.php?lang=en&article=History_of_Free_Software


 * Page view statistics:
 http://stats.grok.se/en/201101/History_of_Free_Software

 And... incredible:

 * Number of watchers (!!!) (mzmcbride tool in Toolserver):
 http://toolserver.org/~mzmcbride/cgi-bin/watcher.py?db=enwiki_p&titles=History_of_Free_Software


 I don't know when (exactly) these services were activated.

 I've also found some (still inactive) API links. Does anybody have any
 further info about this?

 Cheers,
 Felipe.
 ___
 Wiki-research-l mailing list
 Wiki-research-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

 ___
 Wiki-research-l mailing list
 Wiki-research-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

-- 
Andrew G. West, Doctoral Student
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Phone:   (304)-415-5824
Email:   west...@cis.upenn.edu
Website: http://www.cis.upenn.edu/~westand

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] New toolbox Wikipedia pages

2011-01-25 Thread Andrew G. West
Dario,

Yes, it is certainly the same data source.

First, I wasn't aware there was a JSON API for [http://stats.grok.se] -- 
can you provide everyone a link to it?

Second, at least in visual form, that site presents only daily totals. 
The actual data uses hourly dumps -- and I was thinking my contribution 
could be finer granularity for those who need it (assuming I am not 
mistaken).

Thanks, -AW


On 01/25/2011 06:06 PM, Dario Taraborelli wrote:
 apologies – that's obviously just an interface to Domas Mituzas' raw data!

 Dario

 On 25 Jan 2011, at 23:02, Dario Taraborelli wrote:

 Andrew,

 So, while I'm yet to develop this into a formal public-facing API -- I'd
 be willing to run queries for interested researchers -- and they should
 feel free to contact me.

 are you aware of this tool based on your data: http://stats.grok.se ?

 It also has a JSON interface, which is really handy (I used it with a simple 
 python script to download view stats for a sample of pages in a given 
 timeframe)

 Dario


 ___
 Wiki-research-l mailing list
 Wiki-research-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

-- 
Andrew G. West, Doctoral Student
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Phone:   (304)-415-5824
Email:   west...@cis.upenn.edu
Website: http://www.cis.upenn.edu/~westand

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l