Yes, that's more or less the idea, but I'm not sure that our search engine
can search for headings, though I might be wrong. I suspect, however, that
it will be necessary to process the dumps article by article (or at least a
random sample of articles), and in big projects this could be extremely
time-consuming. But maybe there's a faster way that I'm not aware of?
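
To make the dump-based approach concrete, here is a rough sketch of what I
have in mind. It is only an illustration: the function names are made up,
the heading pattern is my own approximation of wikitext heading syntax (it
ignores comments, templates and nowiki blocks), and the dump reader assumes
a bz2-compressed XML export whose page text sits in "text" elements. It
also does not filter by namespace, and lines starting with ";" are counted
whole, including any ": definition" part.

```python
import bz2
import re
import xml.etree.ElementTree as ET
from collections import Counter

# "== Title ==" up to "====== Title ======", same number of "=" on both sides.
HEADING_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")
# Lines starting with ";" (definition-list terms often used as pseudo-headings).
PSEUDO_RE = re.compile(r"^;\s*(.+?)\s*$")


def section_titles(wikitext):
    """Yield heading-like titles found in one page of wikitext."""
    for line in wikitext.splitlines():
        m = HEADING_RE.match(line)
        if m:
            yield m.group(2)
            continue
        m = PSEUDO_RE.match(line)
        if m:
            yield m.group(1)


def count_titles(pages):
    """Count how often each title occurs across an iterable of page texts."""
    counts = Counter()
    for text in pages:
        counts.update(section_titles(text))
    return counts


def iter_dump_texts(path):
    """Stream page texts from a bz2 XML dump (assumed layout, see above)."""
    with bz2.open(path, "rb") as f:
        for _, elem in ET.iterparse(f):
            tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace
            if tag == "text" and elem.text:
                yield elem.text
            elif tag == "page":
                elem.clear()  # release the finished page's children
```

Something like count_titles(iter_dump_texts(path)).most_common(50) would
then give the top titles for one project, where path is a placeholder for
a local dump file. It still reads every page once, so it does not solve the
speed problem; it only shows that the counting itself is simple.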


--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬

2015-07-13 23:41 GMT+03:00 Pine W <wiki.p...@gmail.com>:

> Would it be possible to run a search on the full text of Wikipedias for
> lines that start and end with "==", "===", "====", and lines that start
> with ";", then make a list of those strings, and count the number of times
> that each title appears in the list?
>
> Pine
> On Jul 13, 2015 10:29 AM, "Jonathan Morgan" <jmor...@wikimedia.org> wrote:
>
>> Cross-posting this request to wiki-research-l. Anyone have data on
>> frequently used section titles in articles (any language), or know of
>> datasets/publications that examined this?
>>
>> I'm not aware of any off the top of my head, Amir.
>>
>> - Jonathan
>>
>> ---------- Forwarded message ----------
>> From: Amir E. Aharoni <amir.ahar...@mail.huji.ac.il>
>> Date: Sat, Jul 11, 2015 at 3:29 AM
>> Subject: [Wikitech-l] statistics about frequent section titles
>> To: Wikimedia developers <wikitec...@lists.wikimedia.org>
>>
>>
>> Hi,
>>
>> Did anybody ever try to collect statistics about frequent section titles
>> in Wikimedia projects?
>>
>> For Wikipedia, for example, titles such as "Biography", "Early life",
>> "Bibliography", "External links", "References", "History", etc., appear in
>> a lot of articles, and their counterparts appear in a lot of languages.
>>
>> There are probably similar things in Wikivoyage, Wiktionary and possibly
>> other projects.
>>
>> Did anybody ever try to collect statistics of the most frequent section
>> titles in each language and project?
>>
>> --
>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
>> http://aharoni.wordpress.com
>> ‪“We're living in pieces,
>> I want to live in peace.” – T. Moore‬
>> _______________________________________________
>> Wikitech-l mailing list
>> wikitec...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
>>
>>
>> --
>> Jonathan T. Morgan
>> Senior Design Researcher
>> Wikimedia Foundation
>> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
>>
>>
>> _______________________________________________
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>
>>
