[Wiki-research-l] Re: Digitally-disadvantaged language

2024-06-25 Thread Federico Leva (Nemo)
Thank you for sharing and nice idea, making the acknowledgements section a Unicode test. :) Cc languages list. Congrats, Federico On 25/06/24 12:29, Biyanto wrote: Hi everyone, I hope you are doing great! I just finished my master's degree and would like to share my thesis about

[Wiki-research-l] Re: Call for Papers: Semantic Web Journal Special Issue on Wikidata: Construction, Evaluation and Applications

2023-01-26 Thread Federico Leva (Nemo)
On 24/01/23 23:59, Simon Razniewski wrote: Thanks for asking! The link you shared contains early-access PDFs that are not yet in the final publication layout. Final PDFs contain the appropriate CC BY 4.0 copyright clause, see the PDFs on this official page: Thanks for the clarification. Th

[Wiki-research-l] Re: Call for Papers: Semantic Web Journal Special Issue on Wikidata: Construction, Evaluation and Applications

2023-01-24 Thread Federico Leva (Nemo)
On 24/01/23 13:10, Lucie Kaffee wrote: Also note that the Semantic Web journal is open access. Is it? I don't see any license on recent papers, let alone a free license: " © 0 – IOS Press and the authors. All rights re

Re: [Wiki-research-l] Interesting Wikipedia studies

2020-12-18 Thread Federico Leva (Nemo)
Morten Wang, 18/12/20 17:23: One thing I've noticed is that all the papers I'm referencing focus on the English Wikipedia. When it comes to studies of other language editions, or across multiple ones, I've struggled to come up with a key paper to point to. For this I usually reference Felipe O

Re: [Wiki-research-l] Editor surveys on race/ethnicity/religion

2020-09-21 Thread Federico Leva (Nemo)
Su-Laine Brodsky, 21/09/20 08:19: > I’m wondering if any large-scale surveys have been done that ask Wikipedia > editors about their race, ethnicity, or religion? What international standards exist to phrase such questions? Denominations commonly used in surveys in one country may be considered

Re: [Wiki-research-l] WikiHist.html: English Wikipedia's Full Revision History in HTML Format

2020-09-11 Thread Federico Leva (Nemo)
Robert West, 11/09/20 11:29: > local instances of MediaWiki, > enhanced with the capacity of correct historical macro expansion. Interesting. I see this doesn't include deleted templates. Have you considered using historical dumps? «We emphasize that the limitation of deleted pages, templates,

Re: [Wiki-research-l] Number of registered editors per country

2020-08-22 Thread Federico Leva (Nemo)
Thomas Stieve, 22/08/20 23:59: > Hope all is well. Does anyone know of published statistics for the number > of registered editors per country? No. It's not possible to calculate, because IP addresses are only kept for 90 days. Federico ___ Wiki-resear

Re: [Wiki-research-l] Research, most influential and most surprising?

2020-07-24 Thread Federico Leva (Nemo)
Sven Andersson, 24/07/20 03:53: > Hi everyone! I'm looking for some reading. I hope this is an acceptable use > of this list. Sure. > > Which are the core Wikipedia papers, that shaped understanding Wikipedia and > Wikipedia research? Hard to tell, but there was an attempt to have some select

Re: [Wiki-research-l] Blocking Wikipedia by Country in 2016

2020-07-12 Thread Federico Leva (Nemo)
Thomas Stieve, 13/07/20 01:47: > Does anyone have or know where I can obtain information > about the official blocking of Wikipedia by the government per country > worldwide in Oct - Dec 2016? Why 2016 specifically? If by "official" you mean "publicly known or announced", you can check sources lik

Re: [Wiki-research-l] Tool request

2020-06-20 Thread Federico Leva (Nemo)
Neville Borg, 20/06/20 17:06: > get a batch of users (in this case, users who > participated in a Wiki Loves Monuments contest) and trace whether they are > still active or the date of the last time they used their account. The second part of this used to be simple with Wikimetrics, now Event Metr

Re: [Wiki-research-l] Measuring gender bias in contributors to the French-language Wikipedia

2020-05-24 Thread Federico Leva (Nemo)
Baptiste Fontaine, 22/05/20 15:13: very low level of information we have on contributors’ genders: on WP:FR, 60-70% of contributors have not changed their gender in their user settings. Does anyone have any pointer on this? The gender preference has nothing whatever to do with people's gender.

Re: [Wiki-research-l] Wikipedia Trademark

2020-04-13 Thread Federico Leva (Nemo)
Thomas Stieve, 13/04/20 19:11: I am writing an academic article. For this research, I am creating an index, which I would like to call the "Wikipedia Global Consciousness Index." The relevant section is "nominative use". M

Re: [Wiki-research-l] Wikipedia DOI

2020-03-08 Thread Federico Leva (Nemo)
Thomas Stieve, 09/03/20 01:33: I just wanted to ask, does Wikipedia have a DOI? No. I have done research by data mining the titles and IP edits in Wikipedia. The journal is asking if there is a DOI. In such a case, the most useful thing you can do is to attribute the authors of any softwar

Re: [Wiki-research-l] [Wikimedia-l] Remember Wikipedia Zero.. Where is the research about the effects of its demise?

2019-12-09 Thread Federico Leva (Nemo)
Gerard Meijssen, 09/12/19 08:16: This is most helpful, the information shows abundantly clear that we lost an audience. Or that we had (temporarily) gained one. Federico

Re: [Wiki-research-l] Does wikipedians feel like commoners ?

2019-12-07 Thread Federico Leva (Nemo)
Juergen Fenn, 08/12/19 01:22: Commoners used to draw on Wikipedia as an outstanding example of commons goods, but Wikipedians usually do not refer to Ostrom's works Indeed, although this has changed a bit after she won a Nobel. It was 2007 when "Understanding knowledge as a commons" made the l

Re: [Wiki-research-l] Why is Wikipedia still so often frowned upon in academic circles?

2019-12-04 Thread Federico Leva (Nemo)
Gerard Meijssen, 04/12/19 11:23: Free licenses are not always possible, it is not as if a single scientist is the only one signing a paper and determining the license. Nearly everyone can deposit at least some of their works as preprints under a free license. What helps a LOT is for scient

Re: [Wiki-research-l] Why is Wikipedia still so often frowned upon in academic circles?

2019-12-04 Thread Federico Leva (Nemo)
Kerry Raymond, 04/12/19 08:52: I think if we want to turn around academic perception, we need to: 1. make academics welcome on Wikipedia (apart from the usual conflict of interests) Yes, but I would argue the easiest and most impactful way for academics to help Wikipedia is to release their

Re: [Wiki-research-l] Generalizability of research across different language versions

2019-10-02 Thread Federico Leva (Nemo)
Jan Dittrich, 02/10/19 14:35: - How would you deal with such criticism, particularly of the "if it is not about 'my' wp it is useless"-kind [2]? At a minimum, the research needs to have used methods which could extend to multiple wikis. Being about 2 languages is ten times better than being a

Re: [Wiki-research-l] gender balance of wikipedia citations

2019-09-02 Thread Federico Leva (Nemo)
WereSpielChequers, 02/09/19 17:10: If you can come up with software that identifies sources that we aren't using but should then that would make for some interesting reports on Wikiprojects, or an interesting opportunity for the wiki library. Good point. This also came up at a research meetup a

Re: [Wiki-research-l] gender balance of Wikipedia citations

2019-08-31 Thread Federico Leva (Nemo)
Greg, 31/08/19 05:17: Thanks, Federico. Do you mean that examining gender bias is more relevant to google than wikipedia? Or necessary before any work can be done here? I'm saying that any gender bias of citations on Wikipedia articles will compound a number of factors, including the underlyi

Re: [Wiki-research-l] gender balance of Wikipedia citations

2019-08-29 Thread Federico Leva (Nemo)
Kerry Raymond, 29/08/19 01:26: > So I think a specific tag to encourage the expansion of "Bloggs et al" > citations to full author listings might work. But it's easier to fix it yourself, using the citation bot: https://en.wikipedia.org/wiki/WP:UCB Greg, 30/08/19 07:48: If the Wikipedia communi

Re: [Wiki-research-l] gender balance of wikipedia citations

2019-08-26 Thread Federico Leva (Nemo)
Greg, 22/08/19 06:19: I do not know the current status of wikicite or if/when this could be used for this inquiry--either to examine all, or a sensible subset of the citations. If I see correctly, you still did not receive an answer on the data available. It's true that the Figshare item for

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-26 Thread Federico Leva (Nemo)
Please everyone avoid using jargon specific to the English Wikipedia on this cross-language and cross-wiki mailing list. Aaron Halfaker, 23/08/19 17:36: I think embeddings[1] would be a nice way to create a signature. There is some discussion of acceptable user fingerprinting (presumably to

Re: [Wiki-research-l] Database of all users

2019-06-07 Thread Federico Leva (Nemo)
Kiril Simeonovski, 07/06/19 09:57: with their contributions to the Wikimedia projects Do you mean the *number* of their contributions, or literally all their contributions? Filtering the stub dumps would be one systematic way to get all the metadata about edits. If you just need aggregate
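A minimal sketch of what filtering a stub dump for edit metadata could look like, assuming the usual MediaWiki export layout (page, title, revision, timestamp, contributor). The function name, the dictionary keys and the tiny sample document are hypothetical and only serve the illustration; a real stub dump would be streamed from a decompressed file instead.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace so the export schema version does not matter."""
    return tag.rsplit("}", 1)[-1]


def iter_revision_metadata(stream):
    """Yield one dict per revision with page title, revision id, timestamp and user."""
    title = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text  # the title closes before any revision of the page
        elif name == "revision":
            meta = {"title": title, "id": None, "timestamp": None, "user": None}
            for child in elem:
                cname = local(child.tag)
                if cname == "id":
                    meta["id"] = child.text
                elif cname == "timestamp":
                    meta["timestamp"] = child.text
                elif cname == "contributor":
                    for sub in child:
                        # registered users carry <username>, unregistered ones <ip>
                        if local(sub.tag) in ("username", "ip"):
                            meta["user"] = sub.text
            yield meta
            elem.clear()  # free memory, dumps are large
        elif name == "page":
            elem.clear()


# Hypothetical two-revision sample, not real dump data.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<page><title>Example</title><id>1</id>
<revision><id>10</id><timestamp>2019-06-01T00:00:00Z</timestamp>
<contributor><username>Alice</username><id>5</id></contributor></revision>
<revision><id>11</id><timestamp>2019-06-02T00:00:00Z</timestamp>
<contributor><ip>192.0.2.1</ip></contributor></revision>
</page></mediawiki>"""

rows = list(iter_revision_metadata(io.BytesIO(SAMPLE)))
```

Streaming with `iterparse` and clearing elements keeps memory use flat, which matters because the history dumps do not fit in memory.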

Re: [Wiki-research-l] Transferring CC-BY scientific literature into WP

2019-04-17 Thread Federico Leva (Nemo)
Alexandre Hocquet, 17/04/19 20:40: My point is : as it can be imagined that the number of CC-BY scientific papers will likely sky-rocket in the next years, would not it be relevant to try to organise "CC-BY scientific papers" driven edit-a-thons Importing text and images from freely licensed p

Re: [Wiki-research-l] User type context sensitivity to introduction sections.

2019-02-10 Thread Federico Leva (Nemo)
This has been discussed many times, see also: https://strategy.wikimedia.org/wiki/Proposal:Multi-level_Articles_(By_Difficulty) https://strategy.wikimedia.org/wiki/Proposal:Filter_content_based_on_desired_level_of_detail https://strategy.wikimedia.org/wiki/Proposal:Divide_Wikipedia https://strateg

Re: [Wiki-research-l] Readers of Wikipedia

2018-12-13 Thread Federico Leva (Nemo)
Ziko van Dijk, 13/12/18 12:02: Regionally important content: Should a Wikipedia language version concentrate on regional topics, or try to cover a large variety of topics? This question is automatically solved if instead of focusing on Wikipedia you do Wikisource. Wikisource will only contain

Re: [Wiki-research-l] where did I read about predicting user conflicts?

2018-09-16 Thread Federico Leva (Nemo)
Kerry Raymond, 16/09/2018 12:27: Some time in the last few months (possibly at Wikimania) someone pointed me at some research about predicting the outcome of Wikipedia consensus building from the language they were using in Talk. Maybe these?

Re: [Wiki-research-l] New viz.: Wikipedias, participation per language

2018-09-13 Thread Federico Leva (Nemo)
Always nice to see language data presented in an appealing way! Samuel Klein, 10/09/2018 23:27: Do we have data on "# of speakers of language X who don't speak a better-covered lang as a secondary language"? I usually have a very hard time finding such data from official/reliable sources, eve

Re: [Wiki-research-l] Wiki-Sul

2018-08-03 Thread Federico Leva (Nemo)
Felipe da Fonseca, 03/08/2018 15:38: I write to ask for help because we intend to establish a research group on Wikipedia in the area of Human Sciences, we would like to know what the current state of the art, so ask for opinion and research ideas. Have you browsed the research newsletter? Did

Re: [Wiki-research-l] Wikimedia Commons data structure - public?

2018-07-17 Thread Federico Leva (Nemo)
Trilce Navarrete, 17/07/2018 14:52: Where could I try to find the Wikimedia Commons data structure? or who may I ask further on this matter? There is no such thing as a "data structure" really. Some concise background: https://commons.wikimedia.org/wiki/Commons:Structured_data/About/FAQ Fede

Re: [Wiki-research-l] Nazi propaganda material

2018-06-28 Thread Federico Leva (Nemo)
For those who may not want to venture to click the link, it's a WMNL newsletter which links this Commons category: Federico

Re: [Wiki-research-l] Analysis on the "thanks" feature, location of revision data?

2018-06-13 Thread Federico Leva (Nemo)
Maximilian Klein, 13/06/2018 21:12: I don't see recorded which *revision* was being thanked. Does anyone know where I might find this data? It's not public, as noted. See Federico

Re: [Wiki-research-l] Reader use of Wikipedia and Commons categories

2018-05-24 Thread Federico Leva (Nemo)
Ziko van Dijk, 24/05/2018 23:08: When it comes to Commons, I would be very interested to learn how many readers (or recipients) are actually non Wikipedia editors. It would be useful to consider less common but high value usage, for instance people looking for illustrations for a publication.

Re: [Wiki-research-l] Wiki Dialogue Systems

2018-05-21 Thread Federico Leva (Nemo)
Adam Sobieski, 21/05/2018 14:07: WIKI DIALOGUE SYSTEMS Exploration into the collaborative authoring and debugging of dialogue systems could result in new wiki technologies. Wiki dialogue systems could resemble spoken language dialogue systems with transcript-based user interfaces, users able

Re: [Wiki-research-l] Terms of use for wmf dump metadata?

2018-05-16 Thread Federico Leva (Nemo)
Edward L Platt, 16/05/2018 23:16: The derivatives in this case are coeditor networks for each WikiProject, based on which editors have edited the same articles. Is this something you produce yourself? I cannot find such a dataset in . Are you in EU? Feder

Re: [Wiki-research-l] Terms of use for wmf dump metadata?

2018-05-16 Thread Federico Leva (Nemo)
Edward L Platt, 16/05/2018 19:23: We're using the pages-meta-history XML files (user ids, timestamps, article ids, etc). Everything I can find on the WMF site refers to "textual content" which is a bit unclear about metadata. The legal page has been added only recently and it's probably uncle

Re: [Wiki-research-l] Terms of use for wmf dump metadata?

2018-05-16 Thread Federico Leva (Nemo)
Edward L Platt, 16/05/2018 18:57: I can find information on the copyright/terms-of-use for text and image data, but nothing explicit about the metadata. Which metadata are you talking about? The copyright license applies to the whole XML text. Federico

Re: [Wiki-research-l] Digital Infrastructure Research RFP

2018-05-09 Thread Federico Leva (Nemo)
Leila Zia, 09/05/2018 21:22: The Sloan and Ford Foundations have made a request for proposals with the goal of funding a set of research projects to further our understanding of economics, maintenance, and sustainability of digital infrastructures (especially as they rely heavily on volunteer res

Re: [Wiki-research-l] trip report: Wiki Indaba 2018 [partial]

2018-03-26 Thread Federico Leva (Nemo)
I'm glad that language support was flagged. Leila Zia, 26/03/2018 21:37: I don't have much more to add at the moment on this front. One thing on my todo list is to educate myself more about this space and work within our team in the coming year(s) to see where we can make a difference. There a

Re: [Wiki-research-l] [Analytics] A new landing page for the Wikimedia Research team

2018-02-07 Thread Federico Leva (Nemo)
Will it be translatable with standard tools? Federico

Re: [Wiki-research-l] [discovery] Discovery Weekly Update for the week starting 2017-09-18

2017-09-26 Thread Federico Leva (Nemo)
Chris Koerner, 25/09/2017 23:32: * Mikhail created a dashboard to track the prevalence of sister project search results on fulltext search result pages on desktop, broken up by language. For example, it turns out that nearly 80% of fulltext searches show sister projects on enwiki. [30] [30] http

Re: [Wiki-research-l] Finding what is said about a topic in other articles

2017-06-14 Thread Federico Leva (Nemo)
Kerry Raymond, 14/06/2017 00:45: > What is motivating this > is because I often find that "what links here" often points to some > surprising articles which can reveal new insights into a topic. Indeed. I always teach the "what links here" feature at all my wiki courses. Kerry Raymond, 14/06/20

Re: [Wiki-research-l] is there a way to get the list of edits containing specific tags in the summary

2017-03-02 Thread Federico Leva (Nemo)
Kerry Raymond, 03/03/2017 04:24: For example, can I get all the edits with 1Lib1Ref in them or some other tag to be used at an event next Monday? Since the question was already answered by Pru, I'll just provide an alternative experience: for in-person events, I prefer to let users follow the

Re: [Wiki-research-l] surveying Wiki editors?

2017-03-02 Thread Federico Leva (Nemo)
Misha Teplitskiy, 02/03/2017 15:56: Does anyone have experience surveying Wikipedia editors? Can someone point me to literature that has done this or discuss how one might go about doing it? See https://meta.wikimedia.org/wiki/Editor_Survey Nemo

Re: [Wiki-research-l] Time Between edits - difference between RevisionID and {{NUMBEROFEDITS}}

2017-01-25 Thread Federico Leva (Nemo)
Statistics of total (content) edit rate are also available on WikiStats at https://stats.wikimedia.org/EN/TablesDatabaseEdits.htm etc. Edit and revert trends charts tend to be more useful: https://stats.wikimedia.org/EN/PlotsPngEditHistoryTop.htm WereSpielChequers, 25/01/2017 16:10: One area

[Wiki-research-l] Fwd: Divide XML dumps by page.page_namespace (and figure out what to do with the "pages-articles" dump)

2017-01-17 Thread Federico Leva (Nemo)
Input requested: https://lists.wikimedia.org/pipermail/wikitech-l/2017-January/087393.html , https://phabricator.wikimedia.org/T99483 Personally I think that the main issue is the slowness of some of the tools people use (including dumps.wikimedia.org itself), so I tried to improve the docs a

Re: [Wiki-research-l] Global footprint for carbon

2017-01-12 Thread Federico Leva (Nemo)
Gerard Meijssen, 12/01/2017 07:48: Has anyone ever calculated what the footprint of Wikipedia is in terms of the production of carbondioxide? WMF is no longer as transparent as it used to be about which servers are used etc., but someone tried some calculations: https://meta.wikimedia.org/wik

Re: [Wiki-research-l] another pageview db to download

2016-12-11 Thread Federico Leva (Nemo)
Alex Druk, 12/12/2016 08:32: For a few years I have maintained a web site wikipediatrends.com . For variety of reasons I cannot do it any more and the site will be closed in January. However, our DB of English wikipedia pageviews from 2007 can be used for other project

Re: [Wiki-research-l] Year-by-year statistics for unregistered contributors (IPs)

2016-12-08 Thread Federico Leva (Nemo)
Ofer Arazy, 08/12/2016 07:50: Namely, I'm looking for stats regarding the count of these IP users at the end of each calendar year, as well as their activity levels (e.g. avg. monthly edits). Something like https://web.archive.org/web/20091018082131/http://stats.wikimedia.org/wikiquote/EN/Tabl

Re: [Wiki-research-l] Data on the lifespan of Wikipedia articles

2016-11-28 Thread Federico Leva (Nemo)
Stella Yu, 29/11/2016 07:00: Where could I find data on the lifespan of different types of Wikipedia articles? What do you mean by "lifespan"? Does http://wikipapers.referata.com/wiki/Revision_history help? Nemo

Re: [Wiki-research-l] Wiki-editors' activity

2016-10-17 Thread Federico Leva (Nemo)
Alex Yarovoy, 17/10/2016 18:58: Does Wikipedia store any metadata, logs or anything useful to track one's activity? https://wikitech.wikimedia.org/wiki/Logs ? Nemo

Re: [Wiki-research-l] [Analytics] Identifying bots and bot edit decline

2016-10-11 Thread Federico Leva (Nemo)
Wikistats knows about 8017 bot usernames according to https://dumps.wikimedia.org/other/pagecounts-ez/wikistats/csv_wp_main.zip (cut -f2 -d, StatisticsBots.csv | sort -u | wc -l ). Given active editors tend to complain a lot if they get counted as bots, a comprehensive list should probably be a
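For readers without a Unix shell, the counting pipeline quoted above (`cut -f2 -d, StatisticsBots.csv | sort -u | wc -l`) can be approximated in Python. This is only a sketch: like `cut`, it splits naively on commas and ignores CSV quoting, and it assumes, as the command implies, that the second field holds the bot username. The sample rows are made up for illustration.

```python
def count_unique_second_field(lines, delimiter=","):
    """Count distinct values of the second delimiter-separated field."""
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        parts = line.split(delimiter)
        # cut passes through lines that contain no delimiter at all
        seen.add(parts[1] if len(parts) > 1 else line)
    return len(seen)


# Hypothetical rows; the real file comes from csv_wp_main.zip.
sample = ["en,ExampleBot\n", "de,ExampleBot\n", "fr,OtherBot\n"]
unique_bots = count_unique_second_field(sample)
```

With the real file one would pass an open file handle, for example `count_unique_second_field(open("StatisticsBots.csv", encoding="utf-8"))`.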

Re: [Wiki-research-l] index of current research on wikipedia?

2016-09-11 Thread Federico Leva (Nemo)
Gerard Meijssen, 11/09/2016 09:42: I wonder if it would make sense to include the data of Wikipapers in Wikidata like any other Wiki so far. Anyone is free to attempt that if they bother. Personally I won't: Semantic MediaWiki is way easier for this sort of thing (e.g. Data Transfer allows ve

Re: [Wiki-research-l] index of current research on wikipedia?

2016-09-10 Thread Federico Leva (Nemo)
Guillaume Paumier, 10/09/2016 16:43: WikiPapers is the main wiki-based curation platform for wiki-related academic publications, but it's down at the moment: http://wikipapers.referata.com/ Up now. I thought https://en.wikipedia.org/wiki/Wikipedia:Academic_studies_of_Wikipedia#Peer_reviewed w

Re: [Wiki-research-l] Thinking big: scaling up Wikimedia's contributor population by two orders of magnitude

2016-08-27 Thread Federico Leva (Nemo)
Dario Taraborelli, 27/08/2016 22:49: How making the edit button 10x larger is not a solution to this problem is a topic I'll reserve to a separate thread. You might want to include screenshots of the popups which are currently run to point people to the edit button. Nemo

Re: [Wiki-research-l] Thinking big: scaling up Wikimedia's contributor population by two orders of magnitude

2016-08-27 Thread Federico Leva (Nemo)
Pine W, 27/08/2016 09:13: What would we need in order to stimulate and nourish this kind of growth? https://strategy.wikimedia.org/wiki/Proposal:Make_Wikimedia_projects_scale Nemo

Re: [Wiki-research-l] Research on automatically created articles

2016-08-13 Thread Federico Leva (Nemo)
It's worth noting that research exists which *actively* sought to change real-life behaviour of Wikipedia visitors, such as https://www.econstor.eu/handle/10419/127472 whose authors expanded articles about certain Spanish cities in order to make tourists visit those cities more. Nemo

Re: [Wiki-research-l] Research on automatically created articles

2016-08-09 Thread Federico Leva (Nemo)
Denny Vrandečić, 09/08/2016 20:29: 1) the fact that so many of these articles have survived for half a year indicates that there are some problems with our review processes. Does someone want to make an investigation why these articles survived in the given state? Looks like the good old trick

Re: [Wiki-research-l] Alternatives to google forms for surveys

2016-08-02 Thread Federico Leva (Nemo)
LimeSurvey is easy to use, has good i18n and has affordable hosted solutions. See https://phabricator.wikimedia.org/T94807 for a discussion of the topic with some comparisons (John, you may want to add Open Data Kit). Nemo

Re: [Wiki-research-l] link trails in different languages

2016-07-31 Thread Federico Leva (Nemo)
(Context: https://www.mediawiki.org/wiki/Linktrail ) Amir E. Aharoni, 31/07/2016 08:58: In other languages they will be different. Note that the $linkTrail variable in each language's MessagesXx.php file needs to be checked, to know which strings actually display as linktrails. Nemo
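To make the point about per-language linktrails concrete, here is a small sketch of how such a pattern splits the text that follows a link. The pattern below is an assumption modelled on the English default (ASCII lowercase letters only); the actual `$linkTrail` value must be read from the relevant MessagesXx.php file, and the function name is hypothetical.

```python
import re

# Assumed pattern resembling the English default; other languages differ.
LINK_TRAIL_EN = re.compile(r"^([a-z]+)(.*)\Z", re.S)


def split_linktrail(text_after_link, pattern=LINK_TRAIL_EN):
    """Return (trail, rest): the part absorbed into the link and the remainder."""
    match = pattern.match(text_after_link)
    if match is None:
        return "", text_after_link
    return match.group(1), match.group(2)


plural = split_linktrail("s are nice")   # text right after [[cat]]
spaced = split_linktrail(" are nice")    # a space ends the trail immediately
accent = split_linktrail("és")           # non-ASCII letter, not covered by [a-z]
```

The last call shows why the language-specific pattern matters: with an ASCII-only pattern, an accented suffix is not displayed as part of the link.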

Re: [Wiki-research-l] pagecounts and stub-meta-history

2016-07-28 Thread Federico Leva (Nemo)
Definitely consider the redirect :) https://mako.cc/copyrighteous/consider-the-redirect Bruno Goncalves, 28/07/2016 22:00: I've been trying to match edit activity with pagecounts The first question is how much data you need. If a few months are enough, https://wikitech.wikimedia.org/wiki/Pag

Re: [Wiki-research-l] Multi year page views statistics

2016-07-11 Thread Federico Leva (Nemo)
Avner Kantor, 11/07/2016 13:43: Can it be done by https://tools.wmflabs.org/pageviews No. https://wikitech.wikimedia.org/wiki/Analytics/PageviewAPI#Updates_and_backfilling or any other tool? Sure. Preferably by using https://dumps.wikimedia.org/other/pagecounts-ez/ , but most people end u

Re: [Wiki-research-l] WMF Open Access Policy and Independent Researchers

2016-07-09 Thread Federico Leva (Nemo)
Piotr Konieczny, 29/06/2016 07:38: The problem is that most of those are not indexed in top tier indexes. For example, my career requires me to publish in SSCI index, and in my field, sociology, do you know how many out of ~120 journals indexed in SSCI are green open access? Zero. Uh? I sampled

Re: [Wiki-research-l] [Xmldatadumps-l] New mirror of 'other' datasets

2016-05-15 Thread Federico Leva (Nemo)
Ariel Glenn WMF, 04/05/2016 14:33: You can access it at http://wikimedia.crc.nd.edu/other/ so please do! Great news, especially because it's ten times faster than dumps.wikimedia.org! Finally, every time I need a dataset to quickly verify a sudden idea I have, the download becomes a matter of

Re: [Wiki-research-l] Wiki-research-l Digest, Vol 127, Issue 18

2016-03-20 Thread Federico Leva (Nemo)
Alex Druk, 20/03/2016 07:50: Requests with "Special:Random" / total number of requests for each project. Note that such requests are no longer counted in the pageviews data because they are not HTTP 200. https://meta.wikimedia.org/wiki/Research:Page_view Nemo P.s.: https://meta.wikimedia.o

Re: [Wiki-research-l] citing female academics

2016-02-28 Thread Federico Leva (Nemo)
Stuart A. Yeates, 28/02/2016 20:04: Data has been sucked from GND to wikidata via a number of routes, principally VIAF. See Wikidata:Bot_requests#Import_GND_identifiers_from_VIAF_dump for example for a discussion of an instance of this. In https://www.wikidata.org/?oldid=308216259#Import_GND_i

Re: [Wiki-research-l] citing female academics

2016-02-28 Thread Federico Leva (Nemo)
Stuart A. Yeates, 28/02/2016 18:10: Finding reliable sources on this facet of private people is very, very hard Why even bother publishing original research? Nemo

Re: [Wiki-research-l] "Quick" request

2016-02-23 Thread Federico Leva (Nemo)
Bruno Goncalves, 23/02/2016 00:19: wikipedia_en_all_nopic_2015-05.zim 17-May-2015 10:27 15G Humm... It seems like they are all several months old? As you can see, Kelson recently focused on other things like the "wp1" releases. The ZIM dump production is now orders of magnit

Re: [Wiki-research-l] "Quick" request

2016-02-22 Thread Federico Leva (Nemo)
Bruno Goncalves, 22/02/2016 22:58: There used to be official HTML dumps https://dumps.wikimedia.org/other/static_html_dumps/ but they haven't been updated in almost a decade :) The job is effectively done by Kiwix now. http://download.kiwix.org/zim/wikipedia/ For instance: wikipedia_en_all_

Re: [Wiki-research-l] Community health statistics of Wikiprojects

2016-01-07 Thread Federico Leva (Nemo)
Jonathan Cardy, 08/01/2016 06:45: If I were trying to judge the health of a wikiproject in terms of whether they are a good thing to direct newbies to I would be more interested in questions such as: How many active editors are watchlisting that wikiproject? action=info now gives a better in
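As a small illustration of the `action=info` view mentioned above, the sketch below only builds the URL of a page's information view; it performs no request. The helper name, the default base URL and the example page title are assumptions chosen for the illustration.

```python
from urllib.parse import urlencode


def info_url(title, base="https://en.wikipedia.org/w/index.php"):
    """Build the URL of the ?action=info view for a page title."""
    query = urlencode({"title": title.replace(" ", "_"), "action": "info"})
    return f"{base}?{query}"


# Hypothetical WikiProject page, used only as an example title.
url = info_url("Wikipedia:WikiProject Mathematics")
```

Opening the resulting URL in a browser shows the page information view, which is where watcher figures are displayed when the wiki exposes them.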

Re: [Wiki-research-l] Download of pageviews dataset

2015-11-11 Thread Federico Leva (Nemo)
Cristian Consonni, 11/11/2015 15:09: I am working with a student on scientific citation on Wikipedia and, very simply put, we would like to use the pageview dataset to have a rough measure of how many times a paper was viewed thanks to Wikipedia.[*] The full dataset is, as of now, ~ 4.7TB in siz

Re: [Wiki-research-l] Any Norwegian academics writing about Wikipedia?

2015-10-22 Thread Federico Leva (Nemo)
Laura Hale, 22/10/2015 11:16: I was wondering if any one on the list had any contacts with Norwegian academics doing research on Wikipedia, particularly from a gender gap perspective? http://wikipapers.referata.com/w/index.php?title=Special%3ALinkSearch&target=*.no finds two authors, though t

Re: [Wiki-research-l] Looking for help finding tools to measure UNESCO project

2015-10-06 Thread Federico Leva (Nemo)
Amir E. Aharoni, 06/10/2015 15:12: This raises a wider question: What is the comfortable way to compare the coverage of a topic in different languages? https://tools.wmflabs.org/mix-n-match/ . Example: https://tools.wmflabs.org/mix-n-match/?mode=sitestats&catalog=17 Nemo

Re: [Wiki-research-l] Editor Activity Analysis & Graphs

2015-09-18 Thread Federico Leva (Nemo)
jeph, 18/09/2015 06:37: I put together a presentation for the research team yesterday. Sharing it here, http://slides.com/cosmiclattes/edit-activity-graphs-analysis/. Is there a downloadable PDF on Commons? Thanks, Nemo

[Wiki-research-l] Anyone using the user_daily_contribs table/API?

2015-09-05 Thread Federico Leva (Nemo)
See https://phabricator.wikimedia.org/T85984 The user_daily_contribs table (and associated API) is sometimes used for * JavaScript (e.g. CentralNotice) targeting users based on activity in a certain timeframe, * simplification of SQL queries (e.g. [1]), * other? If you use this data/feature or

Re: [Wiki-research-l] Editorial Bias in Crowd-Sourced Political Information

2015-09-03 Thread Federico Leva (Nemo)
Andrew Lih, 03/09/2015 19:12: four randomized field experiments At least, for once, this is not a euphemism for "random acts of vandalism": they say «one true positive and one true negative fact from a reputable news source on each senator that was not currently mentioned on that senator’s W

Re: [Wiki-research-l] Has the recent increase in English wikipedia's core community gone beyond a statistical blip?

2015-08-29 Thread Federico Leva (Nemo)
Pine W, 29/08/2015 01:00: By the way, is there an easy way to get info from https://stats.wikimedia.org about editor activity levels that excludes bots? They all exclude bots unless otherwise specified, see docs. Nemo

Re: [Wiki-research-l] Has the recent increase in English wikipedia's core community gone beyond a statistical blip?

2015-08-25 Thread Federico Leva (Nemo)
Kerry Raymond, 25/08/2015 02:57: It would be interesting to have some coarse characterisation of edits to see if any growth in edit count is spread uniformly against all contribution types or if the growth is disproportionate some way https://stats.wikimedia.org/EN/TablesWikipediaZZ.htm#editor_

Re: [Wiki-research-l] Has the recent increase in English wikipedia's core community gone beyond a statistical blip?

2015-08-23 Thread Federico Leva (Nemo)
WereSpielChequers, 15/08/2015 15:12: With 8% more editors contributing over 100 edits in June 2015 than in June 2014 , we have now had six consecutive months where this particular metric of the core community is looking positive. I'm not sur

Re: [Wiki-research-l] Tracking authorship of wiki content

2015-08-22 Thread Federico Leva (Nemo)
Luca de Alfaro, 22/08/2015 01:51: So I got inspired, and I cleaned up some code that Michael Shavlovsky and I had written for this: https://github.com/lucadealfaro/authorship-tracking Great! It's always good when code behind a paper is published, it's never too late. If you can please add a l

Re: [Wiki-research-l] citations to articles cited on wikipedia?

2015-08-21 Thread Federico Leva (Nemo)
Andrew Gray, 20/08/2015 14:21: They worked on a journal basis, classing them as "OA" or "not OA". Weird, why didn't they just use DOAJ? https://doaj.org/ Nemo

Re: [Wiki-research-l] Looking for comparisons of the goals of Wikimedia and other organisations

2015-08-20 Thread Federico Leva (Nemo)
john cummings, 20/08/2015 10:30: I'm currently working as Wikimedian in Residence at UNESCO, does anyone know of any work done to compare the goals of Wikimedia with other organisations who work in education? What part of Wikimedia's activities do you classify as "work in education"? Nemo

Re: [Wiki-research-l] "Wikipedia live monitor" for identifying breaking news on Wikipedia

2015-07-19 Thread Federico Leva (Nemo)
Scott Hale, 19/07/2015 11:30: what was popular in different language editions Do you know https://tools.wmflabs.org/wikitrends/ ? Collaboration would probably be welcome. as a way to possibly increase multilingual editing/consumption. Generally, a moment of high popularity of a subject (and he

Re: [Wiki-research-l] Aaron Swartz Hypothesis on Wikipedia Authorship

2015-06-23 Thread Federico Leva (Nemo)
Krzysztof Gajewski, 23/06/2015 16:46: I wonder if you know if somebody verified and / or further researched Aaron Swartz's thesis on structure of Wikipedia participation. You can find it here: http://www.aaronsw.com/weblog/whowriteswikipedia http://wikipapers.referata.com/wiki/Authorship has so

Re: [Wiki-research-l] Community health (retitled thread)

2015-06-04 Thread Federico Leva (Nemo)
Juergen Fenn, 04/06/2015 16:50: Reduced traffic on Wikimedia-l is mostly due to list moderation. That's plausible. "Most" people on wikimedia-l are moderated by now; I and others unsubscribed due to tyrannical moderation, too. Nemo

Re: [Wiki-research-l] How to explain drop in random searches

2015-05-12 Thread Federico Leva (Nemo)
Alex Druk, 12/05/2015 07:56: Going from 86,000,000 a month to 31,000 a month is quite a drop, and the shift is pretty dramatic. It goes from 1.7 million one day to 715 the next and stays flat (http://stats.grok.se/en/201410/Special:Random). That's expected. The new data excludes redirecting URL

Re: [Wiki-research-l] Fwd: Traffic to the portal from Zero providers

2015-05-07 Thread Federico Leva (Nemo)
Scott Hale, 07/05/2015 09:51: The accept-language header is the obvious place to start, but there is ample scope to combine multiple approaches together. Which is what UniversalLanguageSelector / jquery.uls, used on all Wikimedia projects, exists for. :) In addition to accept-language and

Re: [Wiki-research-l] Fwd: Traffic to the portal from Zero providers

2015-05-07 Thread Federico Leva (Nemo)
Thanks for looking into www.wikipedia.org traffic from India; I've been "complaining" about it for a while. :) See also: * https://phabricator.wikimedia.org/T26767 * https://phabricator.wikimedia.org/T5665 Mark J. Nelson, 07/05/2015 04:24: But for the average Copenhagener, the following order i

Re: [Wiki-research-l] Waray-Waray language Wikipedia

2015-05-01 Thread Federico Leva (Nemo)
Pine W, 01/05/2015 10:44: Hi researchers, One would think that you've learnt to use WikiStats by now for trivial questions. * https://stats.wikimedia.org/EN/TablesWikipediaWAR.htm#bots * https://stats.wikimedia.org/EN/BotActivityMatrixCreates.htm Nemo

Re: [Wiki-research-l] Wikilink referral statistics

2015-04-29 Thread Federico Leva (Nemo)
This is a popular topic in this period. * https://meta.wikimedia.org/wiki/Research:Improving_link_coverage * https://meta.wikimedia.org/wiki/Research:Hovercards * https://meta.wikimedia.org/wiki/Research:Increasing_article_coverage Nemo

Re: [Wiki-research-l] Motivations for editing Wikipedia

2015-04-17 Thread Federico Leva (Nemo)
http://wikipapers.referata.com/wiki/Motivation Nemo

Re: [Wiki-research-l] stream.wikimedia.org

2015-04-02 Thread Federico Leva (Nemo)
Ed Summers, 02/04/2015 22:30: I was wondering if anyone had an example of using stream.wikimedia.org handy? I listed the examples I know of (from a previous thread) at https://wikitech.wikimedia.org/wiki/Stream.wikimedia.org#Clients_and_alternative_access_points Nemo

Re: [Wiki-research-l] [Analytics] [Release]

2015-02-25 Thread Federico Leva (Nemo)
Erik Zachte, 25/02/2015 23:34: Compare https://ironholds.shinyapps.io/WhereInTheWorldIsWikipedia/ and http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerLanguageBreakdown.htm Ironholds' looks more vulnerable to bots, it's easier to see in small wikis (though, kudos! many more

[Wiki-research-l] Fwd: Reasons you use the XML dumps or want to, but can't?

2015-02-20 Thread Federico Leva (Nemo)
FYI Forwarded message Subject: [Xmldatadumps-l] Your comments needed (long term dumps rewrite?) Date: Thu, 19 Feb 2015 12:30:01 +0200 From: Ariel Glenn WMF To: xmldatadump...@lists.wikimedia.org The MediaWiki Core team has opened a discussion about

Re: [Wiki-research-l] a cautious note on gender stats Re: Fwd: [Gendergap] Wikipedia readers

2015-02-16 Thread Federico Leva (Nemo)
aaron shaw, 17/02/2015 05:50: If we want to have a more precise sense of the demographics of participants the biggest need in this space is simply higher quality survey data. My paper with Mako has a lot of detail about why the 2008 editor survey (and all subsequent editor surveys, to my knowled

[Wiki-research-l] New dumps for 268 902 Wikia wikis: most complete ever

2015-02-06 Thread Federico Leva (Nemo)
I just published https://archive.org/details/wikia_dump_20141219 : Snapshot of all the known Wikia dumps. Where a Wikia public dump was missing, we produced one ourselves. 9 broken wikis, as well as lyricswikia and some wikis for which dumpgenerator.py failed, are still missing; some Wik

Re: [Wiki-research-l] Fwd: $55 million raised in 2014

2015-01-02 Thread Federico Leva (Nemo)
Denny Vrandečić, 02/01/2015 21:17: I have one or two ideas about what to do with 3 billion US Dollar. It would be a huge step towards some of my stretch goals for the movement. :) 3 billion is not that much money, for instance it's only enough to pay 6 months of operating costs of a poor

Re: [Wiki-research-l] Pageviews, mobile versus desktop

2014-12-15 Thread Federico Leva (Nemo)
Oliver Keyes, 13/12/2014 21:15: http://ironholds.org/misc/pageviews_year_and_week.png - fascinating! It reveals a lot of seasonality in the desktop views - again, not replicated on mobile (at least, not so strongly) Does this graph also go from 2013-02-01 to 2014-12-01? Nemo

Re: [Wiki-research-l] StackExchange editor decline (serverfault)

2014-12-12 Thread Federico Leva (Nemo)
Andrew Lih, 12/12/2014 18:42: I wish we had the slides for this, but Jack Herrick of WikiHow presented at Wikimania 2012 on the features put in to promote more community growth. Of course anyone can verify the numbers with the dump WikiTeam made ;) https://archive.org/details/wiki-wikihowcom j
