Re: [Wiki-research-l] [Analytics] [Release]

2015-02-25 Thread Federico Leva (Nemo)

Erik Zachte, 25/02/2015 23:34:

Compare https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/  and
http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerLanguageBreakdown.htm


Ironholds' looks more vulnerable to bots; it's easier to see in small 
wikis (though, kudos! many more small wikis are included than in 
wikistats). For instance, 20 more percentage points for USA on Breton 
and Bavarian Wikipedias, 30 on Welsh, 40 on Alemannic, almost 50 on 
Kurdish. For Chinese bots they look similar, though in some cases I'm 
not sure what's going on: for instance als.wiki also sees CH and RO emerge.


Will the new pageviews definition use the same bot filtering method?

Nemo

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] [Release]

2015-02-25 Thread Giovanni Luca Ciampaglia
This is really, really cool, great job guys!

G


Giovanni Luca Ciampaglia

✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA
☞ http://www.glciampaglia.com/
✆ +1 812 855-7261
✉ gciam...@indiana.edu

2015-02-25 16:06 GMT-05:00 Oliver Keyes :

> Hey all!
>
> We've released a highly-aggregated dataset of readership data -
> specifically, data about where, geographically, traffic to each of our
> projects (and all of our projects) comes from. The data can be found
> at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
> put together an exploration tool for it at
> https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/
>
> Hope it's useful to people!
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>


Re: [Wiki-research-l] Signpost readership survey results

2015-02-25 Thread phoebe ayers
On Wed, Feb 25, 2015 at 2:03 PM, Pine W  wrote:
> Hello all,
>
> I have uploaded the results from the Signpost readership survey to Wikimedia
> Commons in PDF format:
> https://commons.wikimedia.org/wiki/File:Signpost_February_2015_survey_results.pdf
>
> Thanks very much to the WMF Learning and Evaluation Team for letting us use
> Qualtrics.


Thanks for doing this and sending it around, Pine. I just read through
all the comments and it's fascinating -- some people love the op-eds
and want more coverage of debates and disputes, but another large
group of people want the Signpost to be neutral and stay away from
drama!

I was also a little disheartened by the lackluster response about what
would motivate readers to contribute -- it seems everyone agrees the
Signpost is useful, but few people want to put the time into making it
that way. It's true that it's a lot of work -- I wrote News & Notes
for a couple of years and it was hugely time-consuming. But it was
also a lot of fun!

Regardless, congratulations on keeping up the 'Post and trying to make
it better.

best,
Phoebe


-- 
* I use this address for lists; send personal messages to phoebe.ayers
 gmail.com *



Re: [Wiki-research-l] [Analytics] [Release]

2015-02-25 Thread Oliver Keyes
Yours is looking at just December, while mine is looking at the entire
year, for starters. Also, what's the apps/mobile web inclusion for
that report?

On 25 February 2015 at 17:34, Erik Zachte  wrote:
> I am surprised that the new data, with crawlers excluded, show more wp:en 
> traffic from US (43%) than the old data (36.4% for 2014), which contained 
> much crawler traffic, presumably most of that from US.
>
> Compare https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/ and
> http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerLanguageBreakdown.htm
>
> Any thoughts?
>
> Erik
>
> -Original Message-
> From: analytics-boun...@lists.wikimedia.org 
> [mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of Oliver Keyes
> Sent: Wednesday, February 25, 2015 22:37
> To: Research into Wikimedia content and communities
> Cc: A mailing list for the Analytics Team at WMF and everybody who has an 
> interest in Wikipedia and analytics.
> Subject: Re: [Analytics] [Wiki-research-l] [Release]
>
> The one major caveat, I think, is that the danger of proportionate data is 
> that it makes small projects very vulnerable to artificial traffic spikes. 
> I'd go out on a limb and say that some of the massive bumps in popularity we 
> see in particular combinations are likely due to either undetected automata 
> or simply a project having so little traffic that a small number of people 
> can sway the results outlandishly.
>
> On 25 February 2015 at 16:32, Andrew Lih  wrote:
>> Great job.
>>
>> Who knew Esperanto was big in Japan and China at #2 and #3?
>>
>>
>>
>> On Wed, Feb 25, 2015 at 4:06 PM, Oliver Keyes  wrote:
>>>
>>> Hey all!
>>>
>>> We've released a highly-aggregated dataset of readership data -
>>> specifically, data about where, geographically, traffic to each of
>>> our projects (and all of our projects) comes from. The data can be
>>> found at http://dx.doi.org/10.6084/m9.figshare.1317408 -
>>> additionally, I've put together an exploration tool for it at
>>> https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/
>>>
>>> Hope it's useful to people!
>>>
>>> --
>>> Oliver Keyes
>>> Research Analyst
>>> Wikimedia Foundation
>>>
>>> ___
>>> Wiki-research-l mailing list
>>> Wiki-research-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>
>>
>>
>> ___
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation



[Wiki-research-l] Signpost readership survey results

2015-02-25 Thread Pine W
Hello all,

I have uploaded the results from the *Signpost* readership survey to
Wikimedia Commons in PDF format:
https://commons.wikimedia.org/wiki/File:Signpost_February_2015_survey_results.pdf

Thanks very much to the WMF Learning and Evaluation Team for letting us use
Qualtrics.

The *Signpost* management team recently agreed to cross-post selected
content from the Wikimedia Blog into the *Signpost*. By doing this we can
both increase the exposure of Blog content (many *Signpost* readers don't
read the blog) and enhance the value of the *Signpost* to its current
readers (some of whom would like to see more coverage of sister projects
and other, diverse parts of the Wikimedia ecosystem).

Your comments on the survey results would be appreciated. The *Signpost*
management team will have more to say after we study these results in more
detail, and we will publish our comments in a future *Signpost* issue.

Cheers,

Pine
*Signpost* Publication and Newsroom Manager

*This is an Encyclopedia*

*One gateway to the wide garden of knowledge, where lies*
*The deep rock of our past, in which we must delve*
*The well of our future,*
*The clear water we must leave untainted for those who come after us,*
*The fertile earth, in which truth may grow in bright places, tended by many hands,*
*And the broad fall of sunshine, warming our first steps toward knowing how much we do not know.*

*—Catherine Munro*


Re: [Wiki-research-l] [Release]

2015-02-25 Thread Oliver Keyes
The one major caveat, I think, is that the danger of proportionate
data is that it makes small projects very vulnerable to artificial
traffic spikes. I'd go out on a limb and say that some of the massive
bumps in popularity we see in particular combinations are likely due
to either undetected automata or simply a project having so little
traffic that a small number of people can sway the results
outlandishly.
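
To make that caveat concrete, here is a minimal sketch of the arithmetic. All of the counts below are invented for illustration and are not taken from the released dataset; the function names are hypothetical too. It only shows how the same number of extra (say, undetected automated) requests moves a country's share on a small project versus a large one.

```python
def country_share(views_by_country, country):
    """Return the percentage of a project's views that come from `country`."""
    total = sum(views_by_country.values())
    return 100.0 * views_by_country.get(country, 0) / total


def share_after_spike(views_by_country, country, extra_views):
    """Return the share for `country` after adding `extra_views` to it."""
    spiked = dict(views_by_country)  # copy, so the input is left untouched
    spiked[country] = spiked.get(country, 0) + extra_views
    return country_share(spiked, country)


# Hypothetical per-country view counts (made up for this example).
small_project = {"FR": 800, "US": 100, "DE": 100}              # 1,000 views
large_project = {"US": 400_000, "GB": 300_000, "IN": 300_000}  # 1,000,000 views

extra = 1_000  # the same artificial spike applied to both projects

small_before = country_share(small_project, "US")
small_after = share_after_spike(small_project, "US", extra)
large_before = country_share(large_project, "US")
large_after = share_after_spike(large_project, "US", extra)

print(f"small project: {small_before:.1f}% -> {small_after:.1f}%")
print(f"large project: {large_before:.2f}% -> {large_after:.2f}%")
```

With these made-up numbers the small project's US share jumps by tens of percentage points, while the large project's barely moves, which is the effect described above.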

On 25 February 2015 at 16:32, Andrew Lih  wrote:
> Great job.
>
> Who knew Esperanto was big in Japan and China at #2 and #3?
>
>
>
> On Wed, Feb 25, 2015 at 4:06 PM, Oliver Keyes  wrote:
>>
>> Hey all!
>>
>> We've released a highly-aggregated dataset of readership data -
>> specifically, data about where, geographically, traffic to each of our
>> projects (and all of our projects) comes from. The data can be found
>> at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
>> put together an exploration tool for it at
>> https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/
>>
>> Hope it's useful to people!
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>>
>> ___
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
>
>
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation



Re: [Wiki-research-l] [Release]

2015-02-25 Thread Andrew Lih
Great job.

Who knew Esperanto was big in Japan and China at #2 and #3?



On Wed, Feb 25, 2015 at 4:06 PM, Oliver Keyes  wrote:

> Hey all!
>
> We've released a highly-aggregated dataset of readership data -
> specifically, data about where, geographically, traffic to each of our
> projects (and all of our projects) comes from. The data can be found
> at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
> put together an exploration tool for it at
> https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/
>
> Hope it's useful to people!
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>


Re: [Wiki-research-l] [Analytics] [Release]

2015-02-25 Thread Pine W
Excellent!

Pine
On Feb 25, 2015 1:26 PM, "Oliver Keyes"  wrote:

> Totally! I'm also going to get together with some NEU hackers tomorrow
> and work on actually visualising the data on *drumroll* maps, which'd
> probably be more interesting eye candy than infinite bar plots :)
>
> On 25 February 2015 at 16:19, Pine W  wrote:
> > Very nice. Do you think that you could pick out a few of your favorite
> > graphs and add them to this week's Recent Research report in a gallery?
> >
> > Thanks!
> > Pine
> >
> > Hey all!
> >
> > We've released a highly-aggregated dataset of readership data -
> > specifically, data about where, geographically, traffic to each of our
> > projects (and all of our projects) comes from. The data can be found
> > at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
> > put together an exploration tool for it at
> > https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/
> >
> > Hope it's useful to people!
> >
> > --
> > Oliver Keyes
> > Research Analyst
> > Wikimedia Foundation
> >
> > ___
> > Analytics mailing list
> > analyt...@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/analytics
> >
> > ___
> > Analytics mailing list
> > analyt...@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/analytics
> >
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>


Re: [Wiki-research-l] [Analytics] [Release]

2015-02-25 Thread Oliver Keyes
Totally! I'm also going to get together with some NEU hackers tomorrow
and work on actually visualising the data on *drumroll* maps, which'd
probably be more interesting eye candy than infinite bar plots :)

On 25 February 2015 at 16:19, Pine W  wrote:
> Very nice. Do you think that you could pick out a few of your favorite
> graphs and add them to this week's Recent Research report in a gallery?
>
> Thanks!
> Pine
>
> Hey all!
>
> We've released a highly-aggregated dataset of readership data -
> specifically, data about where, geographically, traffic to each of our
> projects (and all of our projects) comes from. The data can be found
> at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
> put together an exploration tool for it at
> https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/
>
> Hope it's useful to people!
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation



Re: [Wiki-research-l] [Analytics] [Release]

2015-02-25 Thread Pine W
Very nice. Do you think that you could pick out a few of your favorite
graphs and add them to this week's Recent Research report in a gallery?

Thanks!
Pine
Hey all!

We've released a highly-aggregated dataset of readership data -
specifically, data about where, geographically, traffic to each of our
projects (and all of our projects) comes from. The data can be found
at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
put together an exploration tool for it at
https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/

Hope it's useful to people!

--
Oliver Keyes
Research Analyst
Wikimedia Foundation

___
Analytics mailing list
analyt...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


[Wiki-research-l] ICWSM Workshop Announcement and Call for Papers

2015-02-25 Thread Leila Zia
Hi,

   Bob West, Jure Leskovec, and I are organizing a workshop at ICWSM
focused on the challenges and opportunities of Wikipedia. You can find more
information about the workshop and the call for papers below.

   Looking forward to seeing many of you in person at the workshop.

Best,
Leila


*Call for Workshop Papers*
Workshop on Wikipedia, a Social Pedia: Research Challenges and Opportunities
May 26, Oxford, England
co-located with the 9th International Conference on Weblogs and Social
Media (ICWSM 2015)
http://snap.stanford.edu/wiki-icwsm15/
Deadline for papers: Tuesday, March 24, 2015, 23:59 AoE

Wikipedia is one of the most popular sites on the Web, a main source of
knowledge for a large fraction of Internet users, and, in the light of its
collaborative nature, an inherently social medium. Therefore, and since not
only all content but also many activity logs are available to the public,
Wikipedia has become an important object of study for researchers across
many subfields of the computational and social sciences, such as
social-network analysis, social psychology, education, anthropology,
political science, human-computer interaction, cognitive science,
artificial intelligence, linguistics, and natural-language processing.
This workshop is a venue for all researchers exploring social aspects of
Wikipedia. The workshop will feature high-profile speakers from academia
and the Wikimedia Foundation and aims to create a forum where participants
can connect both among each other and with researchers at the Wikimedia
Foundation.
Topics of interest include, but are not limited to:

   - Collaborative content creation
   - Consensus-finding and conflict resolution on editorial issues
   - Content consumption on Wikipedia
   - Participation in discussions and their dynamics
   - Collaborative task management
   - Evolution of hierarchies
   - Wikipedia as a sensor for real-world events, culture, etc.
   - Demographics of Wikipedia readers and editors
   - Engagement and incentivization of editors

We invite the submission of regular research papers (6–8 pages) as well as
position papers (2–4 pages). Authors whose papers are accepted to the
workshop will have the opportunity to participate in a poster session.

*Submission instructions*
Regular and position papers should be formatted according to AAAI
formatting guidelines (http://www.aaai.org/Publications/Author/author.php).
Please submit papers using EasyChair at
https://easychair.org/conferences/?conf=wikiicwsm2015

*Review and the archival of papers*
Authors will be notified of acceptance or rejection on or before Tuesday,
March 31, 2015.
The accepted papers will be published on the workshop webpage (unless the
authors object), and authors whose papers are accepted will have the
opportunity to participate in a poster session.

*Organizing committee*
Robert West, Stanford University
Jure Leskovec, Stanford University
Leila Zia, Wikimedia Foundation


[Wiki-research-l] [Release]

2015-02-25 Thread Oliver Keyes
Hey all!

We've released a highly-aggregated dataset of readership data -
specifically, data about where, geographically, traffic to each of our
projects (and all of our projects) comes from. The data can be found
at http://dx.doi.org/10.6084/m9.figshare.1317408 - additionally, I've
put together an exploration tool for it at
https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/

Hope it's useful to people!

-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation



Re: [Wiki-research-l] [Analytics] [Offline-l] Fwd: Reasons you use the XML dumps or want to, but can't?

2015-02-25 Thread Toby Negrin
Thanks for doing that Andrew!

On Tue, Feb 24, 2015 at 1:41 PM, Andrew Otto  wrote:

> I also added some Hadoop-based use cases to that document.
>
>
> https://www.mediawiki.org/w/index.php?title=Wikimedia_MediaWiki_Core_Team%2FBacklog%2FImprove_dumps&diff=1422073&oldid=1421455
>
>
> > On Feb 21, 2015, at 05:03, Emmanuel Engelhart  wrote:
> >
> > Hi
> >
> > Thank you Nemo for advertising that interesting page about how to improve
> Wikimedia dumping processes. This topic is of course a primary concern for
> the Kiwix developer team.
> >
> > Here is my contribution:
> >
> https://www.mediawiki.org/w/index.php?title=Wikimedia_MediaWiki_Core_Team%2FBacklog%2FImprove_dumps&diff=1417187&oldid=1415717
> >
> > Hope to see things going forward on this, I will help as much as I can.
> >
> > Regards
> > Emmanuel
> >
> > On 21.02.2015 08:44, Federico Leva (Nemo) wrote:
> >> FYI
> >>
> >>
> >>  Forwarded message 
> >> Subject: [Xmldatadumps-l] Your comments needed (long term dumps
> >> rewrite?)
> >> Date: Thu, 19 Feb 2015 12:30:01 +0200
> >> From: Ariel Glenn WMF 
> >> To: xmldatadump...@lists.wikimedia.org
> >>
> >>
> >>
> >> The MediaWiki Core team has opened a discussion about getting more
> >> involved in and maybe redoing the dumps infrastructure.  A good starting
> >> point is to understand how folks use the dumps already or want to use
> >> them but can't, and some questions about that are listed here:
> >>
> https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog/Improve_dumps
> >>
> >> I've added some notes but please go weigh in.  Don't be shy about what
> >> you do/what you need, this is the time to get it all on the table.
> >>
> >> Ariel
> >>
> >>
> >>
> >>
> >> ___
> >> Offline-l mailing list
> >> offlin...@lists.wikimedia.org
> >> https://lists.wikimedia.org/mailman/listinfo/offline-l
> >>
> >
> >
> > --
> > Kiwix - Wikipedia Offline & more
> > * Web: http://www.kiwix.org
> > * Twitter: https://twitter.com/KiwixOffline
> > * more: http://www.kiwix.org/wiki/Communication
> >
> > ___
> > Analytics mailing list
> > analyt...@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
> ___
> Analytics mailing list
> analyt...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>