OKBr folks,

Many people here at OKBr have worked hard with bits: we usually call this typical kind of work "data scraping <https://en.wikipedia.org/wiki/Data_scraping>"... And as far as I understand, another term that complements the description of these efforts is now gaining currency: "*data cleansing* <https://en.wikipedia.org/wiki/Data_cleansing>".

I am attaching (forwarded e-mail) a call for submissions from the well-known ACM, which will probably help cement the use of the term *"web data cleansing"* -- a quick search shows that there are already companies specializing in it (!), probably to compete with OKFN :-) The call also includes opportunities for the Semantic Web folks... Overall, it all seems closely related to what OKFN and OKBr do!

- - - -

I changed the subject from *"Fwd: CfP: ACM Journal of Data and Information Quality (JDIQ): Special Issue on Web Data Quality"*, since ACM is not exactly OpenAccess <https://en.wikipedia.org/wiki/OpenAccess>, and some might otherwise brand me a poster boy ;-) In any case, our scientific advisor here at OKBr has already noted that ACM, as a *publisher*, "currently has a more-or-less-open-access option".

---------- Forwarded message ----------
From: Christian Bizer <[email protected]>
Date: 2015-07-02 4:58 GMT-03:00
Subject: CfP: ACM Journal of Data and Information Quality (JDIQ): Special Issue on Web Data Quality
To: [email protected], [email protected], [email protected], [email protected], [email protected]

Hi all,

we are happy to announce that the ACM Journal of Data and Information Quality (JDIQ) will feature a special issue on Web Data Quality.

The goal of the special issue is to present innovative research in the areas of Web Data Quality Assessment and Web Data Cleansing.

The submission deadline for the special issue is November 1st, 2015.
Please find the detailed call for papers below and at
http://jdiq.acm.org/announcements.cfm#special-issue-of-acm-jdiq-on-web-data-quality

Best,

Luna Dong, Ihab Ilyas, Maria-Esther Vidal, and Christian Bizer

---------------------
Call for Papers: ACM Journal of Data and Information Quality (JDIQ)
Special Issue on Web Data Quality
---------------------

Guest editors:

* Christian Bizer, University of Mannheim, Germany, [email protected]
* Luna Dong, Google, USA, [email protected]
* Ihab Ilyas, University of Waterloo, Canada, [email protected]
* Maria-Esther Vidal, Universidad Simon Bolivar, Venezuela, [email protected]

Introduction:

The volume and variety of data that is available on the web has risen sharply. In addition to traditional data sources and formats such as CSV files, HTML tables and deep web query interfaces, new techniques such as Microdata, RDFa, Microformats and Linked Data have found wide adoption. In parallel, techniques for extracting structured data from web text and semi-structured web content have matured resulting in the creation of large-scale knowledge bases such as NELL, YAGO, DBpedia, and the Knowledge Vault.

Independent of the specific data source or format or information extraction methodology, data quality challenges persist in the context of the web. Applications are confronted with heterogeneous data from a large number of independent data sources while metadata is sparse and of mixed quality. In order to utilize the data, applications must first deal with this widely varying quality of the available data and metadata.

Topics:

The goal of this special issue of JDIQ is to present innovative research in the areas of Web Data Quality Assessment and Web Data Cleansing. Specific topics within the scope of the call include, but are not limited to, the following:

WEB DATA QUALITY ASSESSMENT:

* Metrics and methods for assessing the quality of web data, including Linked Data, Microdata, RDFa, Microformats and tabular data.
* Methods for uncovering distorted and biased data / data SPAM detection.
* Methods for quality-based web data source selection.
* Methods for copy detection.
* Methods for assessing the quality of instance- and schema-level links in Linked Data.
* Ontologies and controlled vocabularies for describing the quality of web data sources and metadata.
* Best practices for metadata provision.
* Cost and benefits of web data quality assessment and benchmarks.

WEB DATA CLEANSING:

* Methods for cleansing Web data, Linked Data, Microdata, RDFa, Microformats and tabular data.
* Conflict resolution using semantic knowledge and truth discovery.
* Human-in-the-loop and crowdsourcing for data cleansing.
* Data quality for automated knowledge base construction.
* Empirical evaluation of scalability and performance of data cleansing methods and benchmarks.

APPLICATIONS AND USE CASES IN THE LIFE SCIENCES, HEALTHCARE, MEDIA, SOCIAL MEDIA, GOVERNMENT AND SENSOR DATA.

Important dates:

Initial submission: November 1, 2015
First review: January 15, 2016
Revised manuscripts: February 15, 2016
Second review: March 30, 2016
Publication: May 2016

Submission guidelines:

http://jdiq.acm.org/authors.cfm

--
Prof. Dr. Christian Bizer
Data and Web Science Group
University of Mannheim, Germany
[email protected]
http://dws.informatik.uni-mannheim.de/bizer
_______________________________________________
okfn-br mailing list
[email protected]
https://lists.okfn.org/mailman/listinfo/okfn-br
Unsubscribe: https://lists.okfn.org/mailman/options/okfn-br
