Hi, What would be the best Wikimedia interface to try to get this moving?
Thanks for any suggestions,
--
Sérgio Nunes

On Mon, 7 Jul 2025 at 13:23, Sérgio Nunes <[email protected]> wrote:

> Hi all,
>
> I would like to suggest a new *highly valuable* data dump for Wikipedia:
> the release of aggregated search query logs. I am aware that a previous
> release of search data was retracted due to privacy concerns. However, I
> believe there is a privacy-preserving approach that could still provide
> great value to researchers.
>
> My proposal is to release only aggregated query data, specifically queries
> that have been observed more than X times within a given day or week. The
> dataset could follow a simple format such as:
>
> [day or week] [query text] [frequency]
>
> This method would eliminate the risk of exposing personal or unique search
> queries. The dataset would be especially useful if released regularly
> (e.g., monthly) and broken down by language-specific Wikipedias.
>
>
> Is this the best forum for posting this suggestion?
>
> If you have suggestions for where to direct this proposal, or ideas for an
> alternative approach, I would be grateful.
>
> Best regards,
> --
> Sérgio Nunes
>
_______________________________________________
Wiki-research-l mailing list -- [email protected]
To unsubscribe send an email to [email protected]
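
To make the proposed aggregation concrete, here is a minimal sketch of the thresholding step described in the quoted message. It is only an illustration under stated assumptions: the function name `aggregate_queries`, the parameter `min_times` (standing in for X), the input shape of (date, query) pairs, and the whitespace/case normalization are all hypothetical and do not come from any existing Wikimedia pipeline.

```python
from collections import Counter
from datetime import date


def aggregate_queries(log, min_times):
    """Count queries per day and keep those seen more than min_times times.

    log: iterable of (date, query_text) pairs (hypothetical input shape).
    min_times: the threshold X from the proposal.
    Returns sorted rows of (day, query text, frequency).
    """
    # Assumption: queries are normalized by trimming whitespace and lowercasing.
    counts = Counter(
        (day.isoformat(), query.strip().lower()) for day, query in log
    )
    # Keep only queries observed strictly more than min_times times that day.
    return sorted(
        (day, query, n) for (day, query), n in counts.items() if n > min_times
    )


# Made-up sample log, for illustration only.
sample_log = (
    [(date(2025, 7, 7), "Lisbon")] * 3
    + [(date(2025, 7, 7), "rare query")]
    + [(date(2025, 7, 8), "lisbon ")] * 2
)

for day, query, frequency in aggregate_queries(sample_log, min_times=2):
    print(day, query, frequency)
```

Grouping by week instead of day would only change the key used for counting; whether a simple frequency threshold is sufficient for privacy is a separate question that this sketch does not address.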
