The workshop program, along with links to the full papers, is now available: http://glicom.upf.edu/OIAF4HLT/Program.html
I'm looking forward to seeing many of you there. I'll be staying at DCU (College Park).

-- Jens

On Tue, Jul 1, 2014 at 6:52 PM, Jens Grivolla <j+...@grivolla.net> wrote:

> The list of accepted papers is now available:
> http://glicom.upf.edu/OIAF4HLT/Papers.html
>
> For anybody interested in attending the workshop and COLING, please
> remember that the early registration deadline is tomorrow, July 2nd.
>
> Looking forward to seeing many of you there...
>
> -- Jens
>
>
> On Wed, Mar 26, 2014 at 2:34 PM, Jens Grivolla <j+...@grivolla.net> wrote:
>
>> Workshop on Open Infrastructures and Analysis Frameworks for HLT
>> ================================================================
>>
>> http://glicom.upf.edu/OIAF4HLT/
>>
>> At the 25th International Conference on Computational Linguistics (COLING 2014)
>> Helix Conference Centre at Dublin City University (DCU)
>> 23-29 August 2014
>>
>> Description
>> -----------
>>
>> Recent advances in digital storage and networking, coupled with the
>> extension of human language technologies (HLT) into ever broader areas and
>> the persistence of difficulties in software portability, have led to an
>> increased focus on development and deployment of web-based infrastructures
>> that allow users to access tools and other resources and combine them to
>> create novel solutions that can be efficiently composed, tuned, evaluated,
>> disseminated and consumed. This in turn engenders collaborative development
>> and deployment among individuals and teams across the globe. It also
>> increases the need for robust, widely available evaluation methods and
>> tools, means to achieve interoperability of software and data from diverse
>> sources, means to handle licensing for limited access resources distributed
>> over the web, and, perhaps crucially, the need to develop strategies for
>> multi-site collaborative work.
>>
>> For many decades, NLP has suffered from low software engineering
>> standards causing a limited degree of re-usability of code and
>> interoperability of different modules within larger NLP systems. While this
>> did not really hamper success in limited task areas (such as implementing a
>> parser), it caused serious problems for building complex integrated
>> software systems, e.g., for information extraction or machine translation.
>> This lack of integration has led to duplicated software development,
>> work-arounds for programs written in different (versions of) programming
>> languages, and ad-hoc tweaking of interfaces between modules developed at
>> different sites.
>>
>> In recent years, two main frameworks, UIMA and GATE, have emerged that
>> aim to allow the easy integration of varied tools through common type
>> systems and standardized communication methods for components analysing
>> unstructured textual information, such as natural language. Both frameworks
>> offer a solid processing infrastructure that allows developers to
>> concentrate on the implementation of the actual analytics components. An
>> increasing number of members of the NLP community have adopted one of these
>> frameworks as a platform for facilitating the creation of reusable NLP
>> components that can be assembled to address different NLP tasks depending
>> on their order, combination and configuration. Analysis frameworks also
>> reduce the problem of reproducibility of NLP results by formalising
>> solution composition and making language processing tools shareable.
>>
>> Very recently, several efforts have been devoted to the development of
>> web service platforms for NLP. These platforms exploit the growing number
>> of web-based tools and services available for tasks related to HLT,
>> including corpus annotation, configuration and execution of NLP pipelines,
>> and evaluation of results and automatic parameter tuning.
>> These platforms can also integrate modules and pipelines from existing
>> frameworks such as UIMA and GATE, in order to achieve interoperability
>> with a wide variety of modules from different sources.
>>
>> Many of the issues and challenges surrounding these developments have
>> been addressed individually in particular projects and workshops, but there
>> are ramifications that cut across all of them. We therefore feel that this
>> is the moment to bring together participants representing the range of
>> interests that comprise the comprehensive picture for community-driven,
>> distributed, collaborative, web-based development and use for language
>> processing software and resources. This includes those engaged in
>> development of infrastructures for HLT as well as those who will use these
>> services and infrastructures, especially for multi-site collaborative work.
>>
>>
>> ### Workshop Objectives
>>
>> The overall goal of this workshop is to provide a forum for discussion of
>> the requirements for an envisaged open “global laboratory” for HLT research
>> and development and establish the basis of a community effort to develop
>> and support it. To this end, the workshop will include both presentations
>> addressing the issues and challenges of developing, deploying, and using
>> the global laboratory for distributed and collaborative efforts and
>> discussion that will identify next steps for moving forward, fostering
>> community-wide awareness, and establishing and encouraging communication
>> among the various players.
>>
>> It aims at bringing together members of the NLP community, specifically
>> users, developers, or providers of components and tools for these
>> frameworks, in order to explore and discuss the opportunities and
>> challenges in using such platforms for modern, well-engineered NLP
>> applications.
>>
>> The challenge of creating reusable and interoperable components is of
>> particular interest and is affected by legal issues, such as potentially
>> incompatible licenses of components and tools, as well as by the technical
>> aspects of packaging and distribution of components. Also, tools are
>> important, for example to assemble complex processing pipelines, to manage
>> the bodies of data that are to be analysed and to visualize, explore, and
>> further deploy the analysis results. Further challenges are involved in
>> embedding framework-based analysis within applications or using it in
>> distributed computing scenarios, such as deployment of and access to
>> required resources. Finally, the preservation of analysis results, their
>> provenance and reproducibility are of particular interest to the scientific
>> user community.
>>
>> ### Topics
>>
>> Workshop topics include, but are not limited to:
>>
>> - processing of very large data collections: scale-out, parallelization,
>>   and performance optimization
>> - advanced applications driven by an NLP framework
>> - sophisticated tools to build and manage complex processing pipelines
>> - analysis of results: exploration, evaluation, visualization, and
>>   statistical analysis
>> - experience reports combining components from different sources, as well
>>   as solutions to interoperability issues
>> - experience reports combining different frameworks (e.g.
>>   GATE/UIMA/WebLicht/etc.)
>> - UIMA components with a special focus on genericity and type-system
>>   independence
>> - repositories of ready-to-use components for UIMA and/or GATE
>> - distribution of components: documentation, licensing and packaging
>> - developing for UIMA or GATE: simplified APIs, debugging, unit testing,
>>   and limitations of the frameworks
>> - combining annotation type systems in processing frameworks (GATE, UIMA,
>>   etc.) with standardization efforts, such as those undertaken in the
>>   ISO TC37/SC4 or TEI contexts
>> - use of NLP frameworks in real-world "industry" settings
>> - reports on current projects and frameworks, their challenges and
>>   proposed or implemented solutions, including efforts to address
>>   interoperability
>> - issues and challenges of multi-site collaborative projects, including
>>   reports of implemented or proposed strategies
>> - pipeline management, including authentication, strategies for passing
>>   resources through disparate tools and across hosting nodes, and licensing
>> - development and use of evaluation environments that facilitate
>>   assessment of HLT component performance, iterative application
>>   development, and replication of results
>> - community awareness and implementation of open infrastructures,
>>   including how to engage the community, establish confidence in the
>>   process, and promote use
>>
>> Dates
>> -----
>> Paper Submission Deadline: 2nd May 2014
>> Author Notification Deadline: 6th June 2014
>> Camera-Ready Paper Deadline: 27th June 2014
>> Workshop: 23rd August 2014
>>
>> Organisers
>> ----------
>> Nancy Ide
>> Department of Computer Science, Vassar College
>>
>> James Pustejovsky
>> Department of Computer Science, Brandeis University
>>
>> Eric Nyberg
>> Language Technologies Institute, School of Computer Science, Carnegie Mellon University
>>
>> Christopher Cieri
>> Linguistic Data Consortium, University of Pennsylvania
>>
>> Jonathan Wright
>> Linguistic Data Consortium, University of Pennsylvania
>>
>> Jens Grivolla
>> GLiCom, Universitat Pompeu Fabra
>>
>> Kalina Bontcheva
>> Department of Computer Science, University of Sheffield