LATA 2015: 1st call for papers
9th INTERNATIONAL CONFERENCE ON LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS

LATA 2015

Nice, France
March 2-6, 2015

Organized by:

CNRS, I3S, UMR 7271
Nice Sophia Antipolis University

Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University

http://grammars.grlmc.com/lata2015/

AIMS:

LATA is a conference series on theoretical computer science and its applications. Following the tradition of the diverse PhD training events in the field developed at Rovira i Virgili University in Tarragona since 2002, LATA 2015 will reserve significant room for young scholars at the beginning of their career. It will aim at attracting contributions from classical theory fields as well as application areas.

VENUE:

LATA 2015 will take place in Nice, the second largest French city on the Mediterranean coast. The venue will be the University Castle at Parc Valrose.

SCOPE:

Topics of either theoretical or applied interest include, but are not limited to:

algebraic language theory
algorithms for semi-structured data mining
algorithms on automata and words
automata and logic
automata for system analysis and programme verification
automata networks
automata, concurrency and Petri nets
automatic structures
cellular automata
codes
combinatorics on words
computational complexity
data and image compression
descriptional complexity
digital libraries and document engineering
foundations of finite state technology
foundations of XML
fuzzy and rough languages
grammars (Chomsky hierarchy, contextual, unification, categorial, etc.)
grammatical inference and algorithmic learning
graphs and graph transformation
language varieties and semigroups
language-based cryptography
parallel and regulated rewriting
parsing
patterns
power series
string and combinatorial issues in bioinformatics
string processing algorithms
symbolic dynamics
term rewriting
transducers
trees, tree languages and tree automata
unconventional models of computation
weighted automata

STRUCTURE:

LATA 2015 will consist of:

invited talks
invited tutorials
peer-reviewed contributions

INVITED SPEAKERS:

to be announced

PROGRAMME COMMITTEE:

Andrew Adamatzky (West of England, Bristol, UK)
Andris Ambainis (Latvia, Riga, LV)
Franz Baader (Dresden Tech, DE)
Rajesh Bhatt (Massachusetts, Amherst, US)
José-Manuel Colom (Zaragoza, ES)
Bruno Courcelle (Bordeaux, FR)
Erzsébet Csuhaj-Varjú (Eötvös Loránd, Budapest, HU)
Aldo de Luca (Naples Federico II, IT)
Susanna Donatelli (Turin, IT)
Paola Flocchini (Ottawa, CA)
Enrico Formenti (Nice, FR)
Tero Harju (Turku, FI)
Monika Heiner (Brandenburg Tech, Cottbus, DE)
Yiguang Hong (Chinese Academy, Beijing, CN)
Kazuo Iwama (Kyoto, JP)
Sanjay Jain (National Singapore, SG)
Maciej Koutny (Newcastle, UK)
Antonín Kučera (Masaryk, Brno, CZ)
Thierry Lecroq (Rouen, FR)
Salvador Lucas (Valencia Tech, ES)
Veli Mäkinen (Helsinki, FI)
Carlos Martín-Vide (Rovira i Virgili, Tarragona, ES, chair)
Filippo Mignosi (L'Aquila, IT)
Victor Mitrana (Madrid Tech, ES)
Ilan Newman (Haifa, IL)
Joachim Niehren (INRIA, Lille, FR)
Enno Ohlebusch (Ulm, DE)
Arlindo Oliveira (Lisbon, PT)
Joël Ouaknine (Oxford, UK)
Wojciech Penczek (Polish Academy, Warsaw, PL)
Dominique Perrin (ESIEE, Paris, FR)
Alberto Policriti (Udine, IT)
Sanguthevar Rajasekaran (Connecticut, Storrs, US)
Jörg Rothe (Düsseldorf, DE)
Frank Ruskey (Victoria, CA)
Helmut Seidl (Munich Tech, DE)
Ayumi Shinohara (Tohoku, Sendai, JP)
Bernhard Steffen (Dortmund, DE)
Frank Stephan (National Singapore, SG)
Paul Tarau (North Texas, Denton, US)
Andrzej Tarlecki (Warsaw, PL)
Jacobo Torán (Ulm, DE)
Frits Vaandrager (Nijmegen, NL)
Jaco van de Pol (Twente, Enschede, NL)
Pierre Wolper (Liège, BE)
Zhilin Wu (Chinese Academy, Beijing, CN)
Slawomir Zadrozny (Polish Academy, Warsaw, PL)
Hans Zantema (Eindhoven Tech, NL)

ORGANIZING COMMITTEE:

Sébastien Autran (Nice)
Adrian Horia Dediu (Tarragona)
Enrico Formenti (Nice, co-chair)
Sandrine Julia (Nice)
Carlos Martín-Vide (Tarragona, co-chair)
Christophe Papazian (Nice)
Julien Provillard (Nice)
Pierre-Alain Scribot (Nice)
Bianca Truthe (Giessen)
Florentina Lilica Voicu (Tarragona)

SUBMISSIONS:

Authors are invited to submit non-anonymized papers in English presenting original and unpublished research. Papers should not exceed 12 single-spaced pages (including any appendices, references, etc.) and should be prepared according to the standard format for Springer's LNCS series (see http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0).

Submissions have to be uploaded to:

https://www.easychair.org/conferences/?conf=lata2015

PUBLICATIONS:

A volume of proceedings published by Springer in the LNCS series will be available by the time of the conference.
Re: Just what *does* robots.txt mean for a LOD site?
Thanks Hugh for the subject change and the reasonable summary.

@Luca, per my previous emails, I think that a robots.txt blacklist should affect a broad range of Linked Data agents, so much so that I would no longer consider the affected URIs dereferenceable, and thus I would no longer call the affected data Linked Data. I don't feel that "harsh" is applicable ... but I guess there is room for discussion. :)

The difference in opinion remains as to what extent Linked Data agents need to pay attention to the robots.txt file. As many others have suggested, I buy into the idea of any agent not relying document-wise on user input being subject to robots.txt.

I should add that in your case, Hugh, you can avoid problems while considering more fine-grained controls in your robots.txt file. For example, you can specifically ban the Google/Yahoo!/Yandex/Bing agents, etc., from parts of your site using robots.txt. Likewise, if you are concerned about the use of resources, you can throttle agents using "Crawl-delay" (a non-standard extension, but one that should be respected by the "big agents"). You can set a crawl delay with respect to the costs you foresee per request and the number of agents you see competing for resources.

Note also that even the big spiders like Google, Yahoo!, etc., are unlikely to actually crawl very deep into your dataset unless you have a lot of incoming links. Essentially, your site as you describe it sounds like part of the "Deep Web".

Best,
Aidan

On 26/07/2014 07:16, Hugh Glaser wrote:
> Hi.
>
> I'm pretty sure this discussion suggests that we (the LD community) should
> try to come to some consensus of policy on exactly what it means if
> an agent finds a robots.txt on a Linked Data site.
>
> So I have changed the subject line - sorry Chris, it should have been
> changed earlier.
>
> Not an easy thing to come to, I suspect, but it seems to have become
> significant.
> Is there a more official forum for this sort of thing?
>
> On 26 Jul 2014, at 00:55, Luca Matteis wrote:
>
>> On Sat, Jul 26, 2014 at 1:34 AM, Hugh Glaser wrote:
>>> That sort of sums up what I want.
>>
>> Indeed. So I agree that robots.txt should probably not establish
>> whether something is a linked dataset or not. To me your data is still
>> linked data even though robots.txt is blocking access of specific
>> types of agents, such as crawlers.
>>
>> Aidan,
>>
>>> *) a Linked Dataset behind a robots.txt blacklist is not a Linked Dataset.
>>
>> Isn't that a bit harsh? That would be the case if the only type of
>> agent is a crawler. But as Hugh mentioned, linked datasets can be
>> useful simply by treating URIs as dereferenceable identifiers without
>> following links.
>
> In Aidan's view (I hope I am right here), it is perfectly sensible.
> If you start from the premise that robots.txt is intended to prohibit
> access by anything other than a browser with a human at it, then only
> humans could fetch the RDF documents.
> Which means that the RDF document is completely useless as a
> machine-interpretable semantics for the resource, since it would need a
> human to do some cut and paste or something to get it into a processor.
>
> It isn't really a question of harsh - it is perfectly logical from that
> view of robots.txt (which isn't our view, because we think that robots.txt
> is about "specific types of agents", as you say).
>
> Cheers
> Hugh
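
[Editor's note: a concrete sketch of the fine-grained controls Aidan describes above. The paths, agent names, and delay value are made up for illustration; Crawl-delay is a non-standard extension and not every crawler honours it.]

```text
# Hypothetical robots.txt: ban two named crawlers from the data area,
# and ask all other robots to wait 10 seconds between requests.
User-agent: Googlebot
Disallow: /data/

User-agent: Slurp
Disallow: /data/

User-agent: *
Crawl-delay: 10
```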
Re: Just what *does* robots.txt mean for a LOD site?
Hi Hugh,

I think an interpretation that would make sense would be similar to the policy of several websites' APIs, e.g. LinkedIn: "go ahead and get my data ... as long as it's in reaction to an input your user directly generates". E.g. a user directly wants to know about "hugh glaser", and that generates a call to the LinkedIn API: fine. Scraping to get "all the users that ..."? No way!

So a Chrome plugin that, for example, automatically follows some linked data in response to a user action would be OK; a spider just following links to get a dump would not.

This above said, I am afraid this whole thing is unlikely to have any actual impact if there are no applications that would care/have any use for doing either of the two above.

Gio

On Sat, Jul 26, 2014 at 1:16 PM, Hugh Glaser wrote:
> Hi.
>
> I'm pretty sure this discussion suggests that we (the LD community) should
> try to come to some consensus of policy on exactly what it means if an
> agent finds a robots.txt on a Linked Data site.
>
> So I have changed the subject line - sorry Chris, it should have been
> changed earlier.
>
> Not an easy thing to come to, I suspect, but it seems to have become
> significant.
> Is there a more official forum for this sort of thing?
>
> On 26 Jul 2014, at 00:55, Luca Matteis wrote:
>
> > On Sat, Jul 26, 2014 at 1:34 AM, Hugh Glaser wrote:
> >> That sort of sums up what I want.
> >
> > Indeed. So I agree that robots.txt should probably not establish
> > whether something is a linked dataset or not. To me your data is still
> > linked data even though robots.txt is blocking access of specific
> > types of agents, such as crawlers.
> >
> > Aidan,
> >
> >> *) a Linked Dataset behind a robots.txt blacklist is not a Linked Dataset.
> >
> > Isn't that a bit harsh? That would be the case if the only type of
> > agent is a crawler. But as Hugh mentioned, linked datasets can be
> > useful simply by treating URIs as dereferenceable identifiers without
> > following links.
> In Aidan's view (I hope I am right here), it is perfectly sensible.
> If you start from the premise that robots.txt is intended to prohibit
> access by anything other than a browser with a human at it, then only
> humans could fetch the RDF documents.
> Which means that the RDF document is completely useless as a
> machine-interpretable semantics for the resource, since it would need a
> human to do some cut and paste or something to get it into a processor.
>
> It isn't really a question of harsh - it is perfectly logical from that
> view of robots.txt (which isn't our view, because we think that robots.txt
> is about "specific types of agents", as you say).
>
> Cheers
> Hugh
>
> --
> Hugh Glaser
> 20 Portchester Rise
> Eastleigh
> SO50 4QS
> Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
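
[Editor's note: for agents that do choose to honour robots.txt, the check is cheap. A minimal sketch using Python's standard library; the robots.txt content, agent name, and URIs are hypothetical.]

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt for a Linked Data site: it bans one named
# crawler from the RDF documents and throttles everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /resource/

User-agent: *
Crawl-delay: 10
Disallow:
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The banned crawler may not dereference URIs under /resource/ ...
print(rp.can_fetch("Googlebot", "http://example.org/resource/hugh-glaser"))  # False

# ... but any other agent still may, provided it waits the advertised
# crawl delay between requests.
print(rp.can_fetch("ld-agent", "http://example.org/resource/hugh-glaser"))   # True
print(rp.crawl_delay("ld-agent"))                                            # 10
```

Whether a user-driven agent (in Gio's sense above) should perform this check at all is exactly the point under discussion in this thread.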
Just what *does* robots.txt mean for a LOD site?
Hi.

I'm pretty sure this discussion suggests that we (the LD community) should try to come to some consensus of policy on exactly what it means if an agent finds a robots.txt on a Linked Data site.

So I have changed the subject line - sorry Chris, it should have been changed earlier.

Not an easy thing to come to, I suspect, but it seems to have become significant.
Is there a more official forum for this sort of thing?

On 26 Jul 2014, at 00:55, Luca Matteis wrote:

> On Sat, Jul 26, 2014 at 1:34 AM, Hugh Glaser wrote:
>> That sort of sums up what I want.
>
> Indeed. So I agree that robots.txt should probably not establish
> whether something is a linked dataset or not. To me your data is still
> linked data even though robots.txt is blocking access of specific
> types of agents, such as crawlers.
>
> Aidan,
>
>> *) a Linked Dataset behind a robots.txt blacklist is not a Linked Dataset.
>
> Isn't that a bit harsh? That would be the case if the only type of
> agent is a crawler. But as Hugh mentioned, linked datasets can be
> useful simply by treating URIs as dereferenceable identifiers without
> following links.

In Aidan's view (I hope I am right here), it is perfectly sensible. If you start from the premise that robots.txt is intended to prohibit access by anything other than a browser with a human at it, then only humans could fetch the RDF documents. Which means that the RDF document is completely useless as a machine-interpretable semantics for the resource, since it would need a human to do some cut and paste or something to get it into a processor.

It isn't really a question of harsh - it is perfectly logical from that view of robots.txt (which isn't our view, because we think that robots.txt is about "specific types of agents", as you say).

Cheers
Hugh

--
Hugh Glaser
20 Portchester Rise
Eastleigh
SO50 4QS
Mobile: +44 75 9533 4155, Home: +44 23 8061 5652