Hi,

What would be the best Wikimedia interface to try to get this moving?

Thanks for any suggestions.
--
Sérgio Nunes


On Mon, 7 Jul 2025 at 13:23, Sérgio Nunes <[email protected]> wrote:

> Hi all,
>
> I would like to suggest a new *highly valuable* data dump for Wikipedia:
> the release of aggregated search query logs. I am aware that a previous
> release of search data was retracted due to privacy concerns. However, I
> believe there is a privacy-preserving approach that could still provide
> great value to researchers.
>
> My proposal is to release only aggregated query data—specifically, queries
> that have been observed more than X times within a given day or week. The
> dataset could follow a simple format such as:
>
> [day or week] [query text] [frequency]
>
> This method would eliminate the risk of exposing personal or unique search
> queries. The dataset would be especially useful if released regularly
> (e.g., monthly) and broken down by language-specific Wikipedias.
>
>
> Is this the best forum for posting this suggestion?
>
> If you have suggestions for where to direct this proposal, or ideas for an
> alternative approach, I would be grateful.
>
> Best regards,
> --
> Sérgio Nunes
>
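
To make the quoted proposal concrete, the thresholded aggregation could be sketched roughly as below. This is only a minimal illustration, not an existing Wikimedia tool: the function name `aggregate_queries`, the parameter `min_count` (the "X" in the proposal), the input shape of (day, query) pairs, and the lowercase/whitespace normalization are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def aggregate_queries(
    records: Iterable[Tuple[str, str]], min_count: int
) -> List[Tuple[str, str, int]]:
    """Return (day, query, frequency) rows for queries seen more than min_count times.

    `records` is a hypothetical stream of (day, raw query text) pairs.
    """
    counts: Counter = Counter()
    for day, query in records:
        # Assumed normalization: lowercase and collapse runs of whitespace.
        normalized = " ".join(query.lower().split())
        if normalized:
            counts[(day, normalized)] += 1

    # Keep only queries observed strictly more than min_count times on a given day.
    rows = [(day, q, n) for (day, q), n in counts.items() if n > min_count]
    # Sort by day, then by descending frequency, then alphabetically.
    rows.sort(key=lambda row: (row[0], -row[2], row[1]))
    return rows


if __name__ == "__main__":
    sample = [
        ("2025-07-07", "Lisbon"),
        ("2025-07-07", "lisbon "),
        ("2025-07-07", "Lisbon"),
        ("2025-07-07", "a one-off query"),
        ("2025-07-08", "lisbon"),
    ]
    for day, query, frequency in aggregate_queries(sample, min_count=2):
        print(day, query, frequency)
```

Rare queries never reach the output because they are dropped before any row is emitted; choosing a suitable threshold, and deciding whether a threshold alone is sufficient, would still need a proper privacy review.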
_______________________________________________
Wiki-research-l mailing list -- [email protected]
To unsubscribe send an email to [email protected]
