**Apologies for cross-posting**
Third Call for Papers: Joint Workshop on Legal and Ethical Issues in Human
Language Technologies (LEGAL2026) and Computational Approaches to Language Data
Pseudonymization, Anonymization, De-identification, and Data Privacy
(CALD-pseudo 2026)
Website: https://legal2026.mobileds.de/
Submission: https://softconf.com/lrec2026/LEGAL2026/
We invite submissions to the Joint Workshop on Legal and Ethical Issues in
Human Language Technologies (LEGAL2026) and Computational Approaches to
Language Data Pseudonymization, Anonymization, De-identification, and Data
Privacy (CALD-pseudo 2026), to be held at LREC 2026 on the 12th of May 2026.
Important Dates
* 20th of February 2026: paper submission deadline
* 30th of March 2026: camera-ready deadline (strict)
* 12th of May 2026: workshop date
Introduction
Access to text and speech data is essential for research, yet personal and
sensitive information often prevents open sharing. Techniques such as
pseudonymization and anonymization offer potential solutions, but their
effectiveness, limitations, and impact on data utility require deeper
investigation. Balancing privacy protection with meaningful scientific use
remains a key challenge.
At the same time, legal and ethical requirements increasingly shape how
language resources can be created, processed, and distributed. Regulatory
frameworks, such as the GDPR, the Data Act, and the Artificial Intelligence
Act, affect access, reuse, and documentation duties for both text and speech
data, creating a complex environment that demands interdisciplinary insight.
The workshop brings these two perspectives together by addressing both the
technical and practical aspects of de-identification as well as the legal and
ethical obligations governing data handling. Topics include anonymization and
pseudonymization methods, compliance in practical workflows, provenance and
rights tracking, and emerging approaches to legal metadata. The goal is to
foster responsible, legally sound, and technically robust innovation in human
language technologies.
Topics of Interest
We invite contributions from all disciplines involved in the creation,
processing, governance, and de-identification of text and speech data.
Submissions may address theoretical, empirical, methodological, legal, or
technical questions, including cross-disciplinary work. We particularly
encourage research on less-represented languages and on data from
under-represented communities.
1. Legal Aspects of Language Data (LEGAL2026)
* Regulatory frameworks and global governance
* Intellectual property, data protection, and LLM governance
* Ethics, fairness, trust, and transparency
* Compliance in practice
* Ethics, fairness, and trust
* Operationalizing compliance
* Emerging and grey areas
* Interdisciplinary and cross-border coordination
2. Pseudonymization, Anonymization, and De-identification: Theoretical,
Methodological, and Technical Aspects (CALD-pseudo 2026)
* Detection and classification of personal information (PI)
* Replacement and transformation of PI
* Utility and bias after de-identification
* Approaches to evaluation and adversarial testing
* Dataset creation for de-identification research
* Low-resource scenarios
* Speech-specific challenges
* Cross-disciplinary applications and challenges
We invite submissions from fields where de-identification of data plays an
important role, including but not limited to Computational Linguistics, Applied
Linguistics, Corpus Linguistics, Digital Humanities, Social Sciences, Political
Sciences, and Medical Science, from the perspectives of researchers, public
organizations, and industry.
Submission Guidelines
Authors are invited to submit original and unpublished research papers in the
following categories:
* Long papers (up to 8 pages) for substantial contributions
* Short papers (up to 4 pages) for:
  * Small, focused contributions or ongoing or preliminary work
  * Extended abstracts for non-technical submissions only, such as conceptual,
    theoretical, legal, ethical, policy-oriented, or position papers. Extended
    abstract submissions are expected to be developed into regular papers by the
    camera-ready submission deadline.
Full papers will be published as workshop proceedings together with the LREC
main conference proceedings. They should follow the LREC stylesheet, which is
available on the Author's kit page of the conference website
(https://lrec2026.info/authors-kit/). Unlike the main conference, we allow
appendices of up to 10 pages already in the review phase. However, reviewers
are not required to read the appendices and must be able to assess the paper
from the main body alone (as if there were no appendices).
Submission deadline: 20th of February 2026
Submission link: https://softconf.com/lrec2026/LEGAL2026/
When submitting a paper from the START page, authors will be asked to provide
essential information about resources (in a broad sense, i.e. also
technologies, standards, evaluation kits, etc.) that have been used for the
work described in the paper or are a new result of their research.
Moreover, ELRA encourages all LREC authors to share the described LRs (data,
tools, services, etc.) to enable their reuse and the replicability of
experiments (including evaluation experiments).
Keynote Talks
We are delighted to announce the workshop will host keynote talks from two
speakers:
* Paweł Kamocki, Leibniz-Institut für Deutsche Sprache, Germany
* Ivan Habernal, Ruhr University Bochum, Germany
Workshop Organizers
LEGAL 2026:
* Ingo Siegert, Otto-von-Guericke Universität Magdeburg, Germany
* Paweł Kamocki, Leibniz-Institut für Deutsche Sprache, Germany
* Kossay Talmoudi, ELDA, France
* Khalid Choukri, ELDA, France
CALD-pseudo 2026:
* Maria Irena Szawerna, University of Gothenburg, Sweden
* Simon Dobnik, University of Gothenburg, Sweden
* Therese Lindström Tiedemann, University of Helsinki, Finland
* Pierre Lison, Norwegian Computing Center & University of Oslo, Norway
* Ildikó Pilán, Norwegian Computing Center, Norway
* Ricardo Muñoz Sánchez, University of Gothenburg, Sweden
* Lisa Södergård, University of Helsinki, Finland
* Elena Volodina, University of Gothenburg, Sweden
* Xuan-Son Vu, Lund University & DeepTensor AB, Sweden
Program Committee
A list of program committee members is available on the workshop webpage.
Contact
For inquiries, please contact [email protected] for questions about
LEGAL2026 or [email protected] for questions about CALD-pseudo 2026.
Best regards,
Maria Irena Szawerna
____________________
PhD student
Språkbanken Text (https://spraakbanken.gu.se/)
Institutionen för svenska, flerspråkighet och språkteknologi (https://www.gu.se/svenska-spraket)
UNIVERSITY OF GOTHENBURG (https://www.gu.se/)
https://spraakbanken.gu.se/om/personal/maria-szawerna