Re: PUBLINK Linked Data Consultancy
On Thu, 2010-10-07 at 01:38 +0200, Sören Auer wrote: On 07.10.2010 1:13, Georgi Kobilarov wrote: So, now the EU also takes that burden off the small linked data consultancies and businesses. Not at all! PUBLINK is not aimed at organizations which already precisely know what they want and are willing to pay for it. It is more aimed at people in organizations who want to persuade their decision makers or decision makers who need more information or a showcase in order to get ultimately involved. Insofar PUBLINK rather clears the way for commercial linked data service providers. But is not working with any breadth of such providers. I share Georgi's reservations, seems like an odd direction for EU framework projects to take. Dave
Re: PUBLINK Linked Data Consultancy
On 07.10.2010 9:57, Dave Reynolds wrote: Insofar PUBLINK rather clears the way for commercial linked data service providers. But is not working with any breadth of such providers. I share Georgi's reservations, seems like an odd direction for EU framework projects to take. It's not really a fundamental change of direction: our main focus is research, but we also want to evaluate our results on real data and give something back to the citizens, which is why we aim to get in touch with data owners of high public interest and help them a little to move in the right (i.e. LOD) direction ;-) If commercial linked data service providers beyond the LOD2/LATC consortia want to get involved in PUBLINK, we are more than happy about that. Let me know if you have suggestions how this could be implemented best. Sören PS: Please also keep in mind that PUBLINK is very limited (max. 3-5 data owning organizations) and ca. 10 man days of support for each.
Re: PUBLINK Linked Data Consultancy
Clearly this is an exciting thing to be doing, but I couldn't let Sören's comments go :-) On 07/10/2010 08:57, Dave Reynolds dave.e.reyno...@gmail.com wrote: On Thu, 2010-10-07 at 01:38 +0200, Sören Auer wrote: On 07.10.2010 1:13, Georgi Kobilarov wrote: So, now the EU also takes that burden off the small linked data consultancies and businesses. Not at all! PUBLINK is not aimed at organizations which already precisely know what they want and are willing to pay for it. Er, if you know of an organisation that knows precisely what they want in Linked Data, please tell. It is more aimed at people in organizations who want to persuade their decision makers or decision makers who need more information or a showcase in order to get ultimately involved. That quite neatly describes every organisation on any new technology, and certainly every one I have spoken to about Linked Data. Insofar PUBLINK rather clears the way for commercial linked data service providers. But is not working with any breadth of such providers. I share Georgi's reservations, seems like an odd direction for EU framework projects to take. Not unusual in direction, of course, but usually there is more of a financial or externally reviewed contribution from the user organisation. I too was slightly surprised at the announcement, and thought "that's unusual". Seems like the EU is simply funding some companies to do what they have to do for their main business. I think the question is whether this is pre-competitive: maybe, but only just. There are quite a few companies for whom this is exactly what they do (including project partners). (I may be out of date about Framework needing to be pre-competitive?) Of course it makes perfect sense from the projects' point of view, which is clearly trying to generate new knowledge/technologies as required, and is a very interesting way of presenting what they want.
Looking for partners to work with to hone your processes and technologies in Linked Data, and grow the community (both of which we all want), you want to tell the possible customers that this is a well-polished field, not that they are being invited to engage in pre-competitive R&D. So a great initiative for the community, but it does look strange as presented. But it is only max 5 across Europe. Best Hugh Dave
RE: PUBLINK Linked Data Consultancy
Sören Auer wrote: PS: Please also keep in mind that PUBLINK is very limited (max. 3-5 data owning organizations) and ca. 10 man days of support for each. I think those numbers are the really important bits. I have seen EU projects where there were plans to perform really huge field studies. I would consider this a problem in this case (not only for existing startups, but also for the project consortium :)). But 3-5 organizations sounds fair to me and will probably not lead to much conflict with existing companies. Whether 10 man days will be sufficient is a different question... :) Just my 2 Euro Cents. Michael -- Dipl.-Inform. Michael Schneider, Research Scientist, Information Process Engineering (IPE), FZI Forschungszentrum Informatik an der Universität Karlsruhe. Email: michael.schnei...@fzi.de WWW: http://www.fzi.de/michael.schneider
Re: PUBLINK Linked Data Consultancy
On 07/10/2010 11:58, Hugh Glaser h...@ecs.soton.ac.uk wrote: Clearly this is an exciting thing to be doing, but I couldn't let Sören's comments go :-) On 07/10/2010 08:57, Dave Reynolds dave.e.reyno...@gmail.com wrote: On Thu, 2010-10-07 at 01:38 +0200, Sören Auer wrote: On 07.10.2010 1:13, Georgi Kobilarov wrote: So, now the EU also takes that burden off the small linked data consultancies and businesses. Not at all! PUBLINK is not aimed at organizations which already precisely know what they want and are willing to pay for it. Er, if you know of an organisation that knows precisely what they want in Linked Data, please tell. [silence] :-) It is more aimed at people in organizations who want to persuade their decision makers or decision makers who need more information or a showcase in order to get ultimately involved. That quite neatly describes every organisation on any new technology, and certainly every one I have spoken to about Linked Data. I agree, but think there are arguments before you even get to linked data: - why should we make our data available at all, in any format? - if we do make it available, can we still control use (separate api, api key, rate limiting etc)? Once those are out of the way there's no greater problem making rdf than there is any other representation. The only barrier is the ontologification of the (hopefully semi-sane) data model you already have. But there is help available for that. If this project is about helping organisations to data model or write ontologies or write and deploy actual code, then I think it is stepping on commercial toes. Insofar PUBLINK rather clears the way for commercial linked data service providers. By doing what? Which bits does publink do and which bits are left to the commercial sector? From the lines above it aims to help people in organizations who want to persuade their decision makers, or persuade decision makers in general with demos. Personally I think if that's the intention it's good.
I know where to find help with data modelling, hosting, data consolidation, existing ontologies, content negotiation etc etc. But I don't know where to go for help translating developer understanding to business understanding. Seeing companies who provide api services (api keys, rate limits etc) operate, I can see they understand how to translate the usual "businesses didn't used to publish prices, you know" stuff into language that business types understand. I don't know where to look for that kind of advice in linked data that doesn't speak at a technical / academic level. Basically it feels like we're missing some marketing. If publink can fill that gap and leave help with implementation to the commercial sector, I think that would be good. Unless that help already exists in the commercial sector and I've just missed it. But is not working with any breadth of such providers. I share Georgi's reservations, seems like an odd direction for EU framework projects to take. Not unusual in direction, of course, but usually there is more of a financial or externally reviewed contribution from the user organisation. I too was slightly surprised at the announcement, and thought "that's unusual". Seems like the EU is simply funding some companies to do what they have to do for their main business. I think the question is whether this is pre-competitive: maybe, but only just. There are quite a few companies for whom this is exactly what they do (including project partners). If there's any that specialise in translating for business decision makers, a list would be cool :-) (I may be out of date about Framework needing to be pre-competitive?) Of course it makes perfect sense from the projects' point of view, which is clearly trying to generate new knowledge/technologies as required, and is a very interesting way of presenting what they want. But if it's about that, is it really about influencing and, dare I say, marketing?
Looking for partners to work with to hone your processes and technologies in Linked Data, and grow the community (both of which we all want), you want to tell the possible customers that this is a well-polished field, not that they are being invited to engage in pre-competitive R&D. So a great initiative for the community, but it does look strange as presented. But it is only max 5 across Europe. And there is that :-) Best Hugh Dave http://www.bbc.co.uk/
Re: PUBLINK Linked Data Consultancy
On Thu, Oct 7, 2010 at 1:00 PM, Michael Schneider schn...@fzi.de wrote: Sören Auer wrote: PS: Please also keep in mind that PUBLINK is very limited (max. 3-5 data owning organizations) and ca. 10 man days of support for each. I think those numbers are the really important bits. I have seen EU projects where there were plans to perform really huge field studies. I would consider this a problem in this case (not only for existing startups, but also for the project consortium :)). But 3-5 organizations sounds fair to me and will probably not lead to much conflict with existing companies. Whether 10 man days will be sufficient is a different question... :) While I welcome more free assistance to linked data adoption, I think this would be most effective if it were targeted towards organisations that do not have existing funds to pay for training and consultancy. At Talis we have encountered several in that situation, and while we help where we can, we do have to earn an income. EU funded help would be perfect for these organisations. Targeting organisations that would otherwise buy from a commercial company just undermines a nascent market. Ian
Call for Chapters: Linking Government Data
Hi all,

Please find below a Call for Chapters for a new contributed book to be entitled Linking_Government_Data. Please distribute this information as widely as possible to help us collect useful success stories, techniques and benefits to using Linked Data in governments. Thanks in advance.

Regards, Dave

--

David Wood announces a Call for Chapters for a new book to be entitled Linking Government Data. First proposal submissions are due November 30, 2010 to da...@3roundstones.com. The book is intended to be published in print, ebooks format and on the Web, but a publisher has not yet been chosen. More than one publisher is interested.

CHAPTER PROPOSALS INVITED FROM RESEARCHERS AND PRACTITIONERS IN LINKED DATA, DATA MANAGEMENT AND WEB INFORMATION SYSTEMS

1st Proposal Submission Deadline: November 30, 2010
Full Chapter Submission Deadline: March 1, 2011

Linking Government Data
A book edited by David Wood, Talis, USA

I. Introduction

Linking Government Data is the application of Semantic Web architecture principles to real-world information management issues faced by government agencies. The term LGD is a play on Linking Open Data (LOD), a community project started by the World Wide Web Consortium’s Semantic Web Education and Outreach Interest Group aimed at exposing data sets to the Web in standard formats and actively relating them to one another with hyperlinks.

Data in general is growing at a much faster rate than traditional technologies allow. The World Wide Web is the only information system we know that scales to the degree that it does and is robust to both changes and failure of components. Most software does not work nearly as well as the Web does. Applying the Web’s architectural principles to government information distribution programs may be the only way to effectively address the current and future information glut.

Challenges remain, however, because the publication of data to the Web requires government agencies to give up the central control and planning traditionally applied by IT departments. A primary goal of this book is to highlight both costs and benefits to broader society of the publication of raw data to the Web by government agencies. How might the use of government Linked Data by the Fourth Estate of the public press change societies? How can agencies fulfill their missions with less cost? How must intra-agency culture change to allow public presentation of Linked Data?

This book follows the successful publication of Linking Enterprise Data by Springer Science+Business Media in October 2010.

II. Objective of the Book

This book aims to provide practical approaches to addressing common information management issues by the application of Semantic Web and Linked Data research to government environments and to report early experiences with the publication of Linked Data by government agencies. The approaches taken are based on international standards. The book is to be written and edited by leaders in Semantic Web and Linked Data research and standards development and early adopters of Semantic Web and Linked Data standards and techniques.

III. Target Audience

This book is meant for Semantic Web researchers and academicians, and CTOs, CIOs, enterprise architects, project managers and application developers in commercial, not-for-profit and government organizations concerned with scalability, flexibility and robustness of information management systems. Not-for-profit organizations specifically include the library and museum communities.

Recommended topics include, but are not limited to, the following:
– Social, technical and mission values of applying Web architecture to government content, such as the means by which deployment agility, resilience and reuse of data may be accomplished
– Relating to other eGov initiatives
– Building of social (human-centered) communities to curate distributed data
– Enterprise infrastructure for Linking Government Data
– Persistent Identifiers
– Linking the government cloud
– Applications of Linked Data to government transparency, organizational learning or curation of/access to distributed information
– Publishing large-scale Linked Data

Contributions from those working with government Linked Data projects of all sizes are sought. Many stories exist from the U.S. and U.K. government agencies, but contributions from Estonia, Germany, New Zealand, Norway, etc., are more than welcome.

IV. Publisher

The book is intended to be published in print, ebooks format and on the Web, but a publisher has not yet been chosen. More than one publisher is interested. This book is expected to be published in late 2011.

V. Proposals

Proposals for chapters should consist of a summary of intended material, approximately 1-2 pages in length. Please provide a working chapter title, authors' names and affiliations, relevant experience with Linked Data projects for a government entity (or
RE: PUBLINK Linked Data Consultancy
So, now the EU also takes that burden off the small linked data consultancies and businesses. Not at all! PUBLINK is not aimed at organizations which already precisely know what they want and are willing to pay for it. Oh yes, clients who precisely know what they want. Of course, how could I not think of those? It is more aimed at people in organizations who want to persuade their decision makers or decision makers who need more information or a showcase in order to get ultimately involved. If it requires an EU funded consortium of researchers to go in and persuade people, something is fundamentally wrong. Georgi
RE: PUBLINK Linked Data Consultancy
Hi Michael, Insofar PUBLINK rather clears the way for commercial linked data service providers. By doing what? Which bits does publink do and which bits are left to the commercial sector? From the lines above it aims to help people in organizations who want to persuade their decision makers or persuade decision makers in general with demos Personally I think if that's the intention it's good. I know where to find help with data modelling, hosting, data consolidation, existing ontologies, content negotiation etc etc. But I don't know where to go for help translating developer understanding to business understanding The intention is good, I agree, but centralizing all of the work into just one consortium isn't. One research consortium as the new linked data monopolist, that's not the message to send out into the world. Plus, in my opinion there is little of a business model in publishing data in itself. There can only be a business model in having other people use data, for which publishing is one necessity of course. So the showcases for the business will be on the data consumption side, and these demos should be developed by people who know how to build demos and showcases. Persuading a few more data publishers won't change the landscape, so I'd much rather see that money go into e.g. developer/design competitions like the Sunlight Foundation is doing in the US. Cheers, Georgi
Re: PUBLINK Linked Data Consultancy
Sören, before you get too depressed with all this negativity... I'd just like to say that I for one think that this is a *great* idea and a very good use of EU project resources. Getting out there and helping to kick start more institutions to LoD is definitely the way to go with your project. Good Luck! and please keep me posted on results cheers John On 7 Oct 2010, at 16:45, Georgi Kobilarov wrote: Hi Michael, Insofar PUBLINK rather clears the way for commercial linked data service providers. By doing what? Which bits does publink do and which bits are left to the commercial sector? From the lines above it aims to help people in organizations who want to persuade their decision makers or persuade decision makers in general with demos Personally I think if that's the intention it's good. I know where to find help with data modelling, hosting, data consolidation, existing ontologies, content negotiation etc etc. But I don't know where to go for help translating developer understanding to business understanding The intention is good, I agree, but centralizing all of the work into just one consortium isn't. One research consortium as the new linked data monopolist, that's not the message to send out into the world. Plus, in my opinion there is little of a business model in publishing data in itself. There can only be a business model in having other people use data, for which publishing is one necessity of course. So the showcases for the business will be on the data consumption side, and these demos should be developed by people who know how to build demos and showcases. Persuading a few more data publishers won't change the landscape, so I'd much rather see that money go into e.g. developer/design competitions like the Sunlight Foundation is doing in the US.
Cheers, Georgi -- Deputy Director, Knowledge Media Institute, The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK phone: 0044 1908 653800, fax: 0044 1908 653169 email: j.b.domin...@open.ac.uk web: kmi.open.ac.uk/people/domingue/ President, STI International, Amerlingstrasse 19/35, Austria - 1060 Vienna phone: 0043 1 23 64 002 - 16, fax: 0043 1 23 64 002-99 email: john.domin...@sti2.org web: www.sti2.org -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302).
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
Hi Martin, We have discussed this off-list before, but maybe others would like to chime in... I don't think it is sad; because using invisible div / span elements nicely decouple the organization of the visual content from the embedded data. Martin, you never fail to hash-mark your #GoodRelations tweets with #SEO. Decoupling triples and content raises an interesting SEO problem: state A in the visible content, state B in the invisible triples. Now which information do we trust? It's the white text on a white background search engine fooling of the 21st century. I'm not yet sure if it's a real problem, but could imagine that tweaking price tags might be tempting to some. Opinions? Thanks, Tom Disclaimer: I work for Google, but I have no insider information at all how/if we deal with this. -- Thomas Steiner, Research Scientist, Google Inc. http://blog.tomayac.com, http://twitter.com/tomayac
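The mismatch Tom describes (state A in the visible content, state B in the invisible triples) can be checked mechanically. What follows is only a minimal sketch, not anything a search engine is known to do: it assumes prices are marked up with the GoodRelations property `gr:hasCurrencyValue` carried in a `content` attribute, that hidden markup uses an inline `display:none` style, and that visible prices are written with a dollar sign. The class name `PriceAudit`, the function `find_mismatches` and the sample page are hypothetical and only for illustration.

```python
import re
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
PRICE_RE = re.compile(r"\$\s*(\d+(?:\.\d+)?)")


class PriceAudit(HTMLParser):
    """Collect prices stated in RDFa attributes and prices in visible text."""

    def __init__(self):
        super().__init__()
        self.rdfa_prices = []     # values of gr:hasCurrencyValue content attributes
        self.visible_prices = []  # dollar amounts found in non-hidden text
        self._hidden = []         # one flag per open element: is it hidden?

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # Assumption: the price literal sits in a content attribute.
        if a.get("property") == "gr:hasCurrencyValue" and a.get("content"):
            self.rdfa_prices.append(float(a["content"]))
        if tag in VOID_TAGS:
            return
        style = (a.get("style") or "").replace(" ", "").lower()
        parent_hidden = bool(self._hidden) and self._hidden[-1]
        self._hidden.append(parent_hidden or "display:none" in style)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._hidden:
            self._hidden.pop()

    def handle_data(self, data):
        if self._hidden and self._hidden[-1]:
            return  # text inside a hidden element is not shown to the reader
        self.visible_prices.extend(float(m) for m in PRICE_RE.findall(data))


def find_mismatches(html):
    """Return RDFa prices that never appear in the visible text."""
    audit = PriceAudit()
    audit.feed(html)
    audit.close()
    return sorted(p for p in audit.rdfa_prices if p not in audit.visible_prices)


# Hypothetical page: the reader sees $24.99, the hidden triple says 19.99.
SAMPLE_PAGE = """
<div>
  <span class="price">$24.99</span>
  <div style="display:none" xmlns:gr="http://purl.org/goodrelations/v1#">
    <span property="gr:hasCurrencyValue" content="19.99"></span>
  </div>
</div>
"""

print(find_mismatches(SAMPLE_PAGE))
```

A real crawler would need a proper RDFa parser and CSS evaluation; the point here is only that the two states can be compared once both are extracted.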
Reminder: Call for Use Cases: Library Linked Data
[apologies for cross-posting]

W3C Library Linked Data Incubator Group - http://www.w3.org/2005/Incubator/lld/

Call for Use Cases: Library Linked Data

Are you currently using linked data technology [1] for library-related data, or considering doing it in the near future? If so, please tell us more by filling in the questionnaire below and sending it back to us or to public-...@w3.org, preferably before October 15th, 2010.

The information you provide will be influential in guiding the activities the Library Linked Data Incubator Group will undertake to help increase global interoperability of library data on the Web. The information you provide will be curated and published on the group wikispace at [3].

We understand that your time is precious, so please don't feel you have to answer every question. Some sections of the templates are clearly marked as optional. However, the more information you can provide, the easier it will be for the Incubator Group to understand your case. And, of course, please do not hesitate to contact us if you have any trouble answering our questions. Editorial guidance on specific points is provided at [2], and examples are available at [3].

We are particularly interested in use cases describing the use of library linked data for end-user oriented applications. However, we're not ruling anything out at this stage, and the Incubator Group will carefully consider all submissions we receive.

On behalf of the Incubator Group, thanks in advance for your time,

Emmanuelle Bermes (Emmanuelle.Bermes_bnf.fr), Alexander Haffner (A.Haffner_d-nb.de), Antoine Isaac (aisaac_few.vu.nl) and Jodi Schneider (jodi.schneider_deri.org)

[1] http://www.w3.org/DesignIssues/LinkedData.html
[2] http://www.w3.org/2005/Incubator/lld/wiki/UCCuration
[3] http://www.w3.org/2005/Incubator/lld/wiki/UseCases

=== Name ===
A short name by which we can refer to the use case in discussions.

=== Owner ===
The contact person for this use case.

=== Background and Current Practice ===
Where this use case takes place in a specific domain, and so requires some prior information to understand, this section is used to describe that domain. As far as possible, please put explanation of the domain in here, to keep the scenario as short as possible. If this scenario is best illustrated by showing how applying technology could replace current existing practice, then this section can be used to describe the current practice. Often, the key to why a use case is important also lies in what problem would occur if it was not achieved, or what problem means it is hard to achieve.

=== Goal ===
Two short statements stating (1) what is achieved in the scenario without reference to linked data, and (2) how we use linked data technology to achieve this goal.

=== Target Audience ===
The main audience of your case. For example scholars, the general public, service providers, archivists, computer programs...

=== Use Case Scenario ===
The use case scenario itself, described as a story in which actors interact with systems. This section should focus on the user needs in this scenario. Do not mention technical aspects and/or the use of linked data.

=== Application of linked data for the given use case ===
This section describes how linked data technology could be used to support the use case above. Try to focus on linked data on an abstract level, without mentioning concrete applications and/or vocabularies. Hint: Nothing library domain specific.

=== Existing Work (optional) ===
This section is used to refer to existing technologies or approaches which achieve the use case (Hint: Specific approaches in the library domain). It may especially refer to running prototypes or applications.

=== Related Vocabularies (optional) ===
Here you can list and clarify the use of vocabularies (element sets and value vocabularies) which can be helpful and applied within this context.

=== Problems and Limitations (optional) ===
This section lists reasons why this scenario is or may be difficult to achieve, including pre-requisites which may not be met, technological obstacles etc. Please explicitly list here the technical challenges made apparent by this use case. This will aid in creating a roadmap to overcome those challenges.

=== Related Use Cases and Unanticipated Uses (optional) ===
The scenario above describes a particular case of using linked data. However, by allowing this scenario to take place, the likely solution allows for other use cases. This section captures unanticipated uses of the same system apparent in the use case scenario.

=== References (optional) ===
This section is used to refer to cited literature and quoted websites.
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On Wed, Oct 6, 2010 at 1:49 PM, Martin Hepp martin.h...@ebusiness-unibw.org wrote: It is too expensive to expect data owners to lift their existing data to academic expectations. You must empower them to preserve as much data semantics and data structure as they can provide ad hoc. Lifting and augmenting the data can be added later. Don't get the idea that academic expectations are better than commercial expectations, they're just different. The whole point of Ontology2 is to commercialize information extraction with a philosophy very much like what these folks are doing: http://rtw.ml.cmu.edu/papers/carlson-aaai10.pdf Now in some ways they've got something way more advanced than what I've got: however, they say that their ontology is populated with 242,453 new facts with an estimated precision of 74%. For me, I can't get away with an estimated precision of 74%, I'd look like a total fool publishing data that dirty on the web, unless I can find some way to conceal the dirt. Talking with people who are interested in semantic technology for e-commerce, I find a common desire is to not only reduce the cost of human labor but to also build systems that attain superhuman accuracy in describing and categorizing products (at least better accuracy than the people who are doing this job today.) [Note also that the rate of fact extraction these guys are doing isn't so hot either... You can get 10^7-10^8 facts out of dbpedia+freebase covering a similar domain] From a commercial viewpoint, imperfect data is an opportunity. If I didn't have other projects ahead of it in the queue, I'd seriously be thinking about building a shopping aggregator that cleans up GoodRelations and other data, reconciles product identities, categorizes products, creates good product descriptions, and makes something that improves on current affiliate marketing and comparison shopping systems. Note that the beauty of an ontology is in the eyes of a user.
One user might want to have a broad but vague ontology of products; they are happy to say that a digital camera is a :DigitalCamera. Other people might want to just cover the photography domain, but do it in great detail -- describing not only the differences between digital cameras manufactured today but also lenses, and even covering, in great detail, vintage cameras that you might find on eBay. You can't say that one of these ontologies is better than the other. The best thing is to have all of these ontologies available [populated with data!] and to pick and choose the ones that fit your needs.
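To put the precision figure quoted above in concrete terms, here is a back-of-the-envelope sketch. It only uses the two numbers from the message (242,453 facts, 74% estimated precision); the function name `expected_errors` is made up for illustration, and the result is an expectation under that estimate, not a measured count.

```python
def expected_errors(total_facts: int, precision: float) -> int:
    """Expected number of incorrect facts if `precision` of them are correct."""
    return round(total_facts * (1.0 - precision))


# Figures as quoted in the message above.
extracted_facts = 242_453
estimated_precision = 0.74

print(expected_errors(extracted_facts, estimated_precision))
```

Under these assumptions roughly one fact in four would be wrong, which is the dirt a commercial publisher would have to clean up or conceal.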
Re: Call for Chapters: Linking Government Data
With all due respect, as with any monograph this is a call to *contribute*; the benefit, if accepted, is being part of an important work. Recipients are free not to submit!
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On Wed, Oct 6, 2010 at 5:09 PM, Martin Hepp martin.h...@ebusiness-unibw.org wrote:

I've got mixed feelings about snippets vs fully embedded RDFa. For the most part I think systems that use snippets will be more maintainable, but I've seen cases where fully embedded RDFa fits very well into a system, and there may be cases where the size of the HTML can be reduced by using it -- and HTML size is a big deal in the real world where loading time matters and we're increasingly targeting mobile devices.

The RDFa issue that really bugs me is that a linked data URI can be read to signify a number of different things. Consider, for instance, http://dbpedia.org/resource/Rainbow_Bridge_(Tokyo)

(i) This is a string. It has a length. It uses a particular subset of available characters.
(ii) This is a URI. It has a scheme, it has a host, path, might have a # in it, query strings, all that; a number of assertions can be made about it as a URI.
(iii) This is a document. We can assert the content-type of this document (or at least one version we've negotiated), we can assert its charset, length in bytes, length in characters, particular subset of available characters used, number of triples asserted directly in the document, the number of triples we can infer by applying certain rules to this in connection with a certain knowledgebase, and on and on.
(iv) This is about a wikipedia article (some wikipedia articles don't map cleanly to a named entity).
(v) This is about a named entity.

The more I think about it, the more it bugs me, and it's all the worse when you're using RDFa and you've got HTML documents.

For instance, you could clearly see http://ookaboo.com/o/pictures/topic/28999/Beijing as a signifier for a city. Some people would make the assertion that dbpedia:Beijing owl:sameAs ookaboo:topic/28999/Beijing, and that's not entirely stupid. On the other hand, it's definitely true that ookaboo:topic/28999/Beijing is a sioc:ImageGallery. Put something true together with a practice that's common and you get the absurd result that dbpedia:Beijing is a sioc:ImageGallery.
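A minimal sketch of the inference described above, assuming triples are modelled as plain Python tuples and owl:sameAs is applied as naive substitution (a real OWL reasoner does considerably more). The function name `same_as_closure` and the tuple encoding are illustrative assumptions, not part of any library.

```python
SAME_AS = "owl:sameAs"

# The two statements from the example: one common practice, one true fact.
TRIPLES = {
    ("dbpedia:Beijing", SAME_AS, "ookaboo:topic/28999/Beijing"),
    ("ookaboo:topic/28999/Beijing", "rdf:type", "sioc:ImageGallery"),
}

ABSURD = ("dbpedia:Beijing", "rdf:type", "sioc:ImageGallery")


def same_as_closure(triples):
    """Apply owl:sameAs substitution repeatedly until nothing new appears."""
    inferred = set(triples)
    while True:
        # owl:sameAs is symmetric, so use every link in both directions.
        links = {(s, o) for s, p, o in inferred if p == SAME_AS}
        links |= {(o, s) for s, o in links}
        new = set()
        for a, b in links:
            for s, p, o in inferred:
                if p == SAME_AS:
                    continue
                # Replace a by b wherever it occurs as subject or object.
                new.add((b if s == a else s, p, b if o == a else o))
        if new <= inferred:
            return inferred
        inferred |= new


print(ABSURD in same_as_closure(TRIPLES))  # True
```

Swapping the identity link for a weaker one such as rdfs:seeAlso leaves dbpedia:Beijing untyped in this sketch, since only owl:sameAs triggers substitution.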
Re: Call for Chapters: Linking Government Data
On 10/7/10 11:42 AM, John Erickson wrote:

With all due respect, as with any monograph this is a call to *contribute*; the benefit, if accepted, is being part of an important work. Recipients are free not to submit!

John,

My question still stands: who benefits from the sale of the book?

No harm in investing a little more time on the expanse of the value chain graph. Time is money. Time is a fixed component that is eternally scarce. Time is the ultimate problem. From these problems come opportunities and opportunity costs.

People don't always have enough time to figure out the density of any given value graph or its superficial value chain. Finding out vital details *after* you've committed time and effort typically leads to bad will.

Let's be clear about this stuff. That's all I seek. Transparency hasn't killed anyone or made enemies of friends; that's not the case with opacity!

Kingsley
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On 10/7/10 11:14 AM, Paul Houle wrote:

On Wed, Oct 6, 2010 at 1:49 PM, Martin Hepp martin.h...@ebusiness-unibw.org wrote:

It is too expensive to expect data owners to lift their existing data to academic expectations. You must empower them to preserve as much data semantics and data structure as they can provide ad hoc. Lifting and augmenting the data can be added later.

Don't get the idea that academic expectations are better than commercial expectations, they're just different. The whole point of Ontology2 is to commercialize information extraction with a philosophy very much like what these folks are doing: http://rtw.ml.cmu.edu/papers/carlson-aaai10.pdf

Now in some ways they've got something way more advanced than what I've got: however, they say that their ontology is populated with 242,453 new facts with an estimated precision of 74%. For me, I can't get away with an estimated precision of 74%; I'd look like a total fool publishing data that dirty on the web, unless I can find some way to conceal the dirt.

Talking with people who are interested in semantic technology for e-commerce, I find a common desire is not only to reduce the cost of human labor but also to build systems that attain superhuman accuracy in describing and categorizing products (at least better accuracy than the people who are doing this job today).

[Note also that the rate of fact extraction these guys are doing isn't so hot either... You can get 10^7-10^8 facts out of dbpedia+freebase covering a similar domain]

From a commercial viewpoint, imperfect data is an opportunity.

Yes, one that could enable folks like you to create superhuman killer users courtesy of the distinguishing accuracy of your particular Linked Data Space :-) Your insignia (i.e., your data space URIs) is the key to controlling how your value works its way through the value chain (one that is inherently long-tailed).

If I didn't have other projects ahead of it in the queue, I'd seriously be thinking about building a shopping aggregator that cleans up GoodRelations and other data, reconciles product identities, categorizes products, creates good product descriptions, and makes something that improves on current affiliate marketing and comparison shopping systems.

Yes!! These are the opportunities that a Linked Open Commerce Data Space [1] opens up.

Note that the beauty of an ontology is in the eyes of a user. One user might want to have a broad but vague ontology of products; they are happy to say that a digital camera is a :DigitalCamera. Other people might want to just cover the photography domain, but do it in great detail -- describing not only the differences between digital cameras manufactured today but also lenses, and even covering, in great detail, vintage cameras that you might find on eBay. You can't say that one of these ontologies is better than the other. The best thing is to have all of these ontologies available [populated with data!] and to pick and choose the ones that fit your needs.

Amen!!

Links:
1. http://linkedopencommerce.com -- Linked Open Commerce Data Space

--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On 07/10/2010 18:27, Kingsley Idehen wrote:

For instance, you could clearly see http://ookaboo.com/o/pictures/topic/28999/Beijing as a signifier for a city. Some people would make the assertion that dbpedia:Beijing owl:sameAs ookaboo:topic/28999/Beijing, and that's not entirely stupid. On the other hand, it's definitely true that ookaboo:topic/28999/Beijing is a sioc:ImageGallery. Put something true together with a practice that's common and you get the absurd result that dbpedia:Beijing is a sioc:ImageGallery.

Hopefully my response clears this all up, at least a little :-)

We recently discussed similar issues; anyway, it's still quite obscure to me why this:

    dbpedia:Beijing owl:sameAs ookaboo:topic/28999/Beijing .
    ookaboo:topic/28999/Beijing rdf:type sioc:ImageGallery .
    => dbpedia:Beijing rdf:type sioc:ImageGallery .

is not a problem.

--
Regards,
Roberto Mirizzi
http://sisinflab.poliba.it/mirizzi
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
These things that bug you do so with good reason. I often call it semantic infidelity. For an in-depth discussion of a closely related issue see: Overloading OWL sameAs (http://ontologydesignpatterns.org/wiki/Community:Overloading_OWL_sameAs). A summary is given below.

Michael

*Issue:* owl:sameAs is being used in the linked data community in a way that is inconsistent with its semantics.

*Source:* Numerous; this issue has been discussed over and over on various lists. The summary so far is mainly based on a discussion that was originally about the proliferation of URIs and managing co-reference, and evolved into a discussion about owl:sameAs *per se*.
- W3C Semantic Web List (http://lists.w3.org/Archives/Public/semantic-web/): Managing Co-reference (Was: A Semantic Elephant?), http://lists.w3.org/Archives/Public/semantic-web/2008May/0126.html, May 2008
- W3C Semantic Web List (http://lists.w3.org/Archives/Public/semantic-web/): ISBNs, owl:sameAs, etc, http://lists.w3.org/Archives/Public/semantic-web/, December 2009

*Related Discussions:*
- [linking open data] URI aliases and owl:sameAs (was: Terminology Question), http://simile.mit.edu/mail/ReadMsg?listName=Linking Open Data&msgId=19328
- W3C public-lod: sameAs proliferation (was Visualizing LOD Linkage), http://www.mail-archive.com/public-lod@w3.org/msg00663.html, August 2008
- W3C public-lod: owl:sameAs links from OpenCyc to WordNet, http://lists.w3.org/Archives/Public/public-lod/2009Feb/0186.html, February 2009
- W3C semantic-web-lifesci: owl:sameAs and identity [was Re: blog: semantic dissonance in uniprot], http://lists.w3.org/Archives/Public/public-semweb-lifesci/2009Mar/0169.html, March 2009
- [tbc-users] counting and owl:sameAs, http://www.mail-archive.com/topbraid-composer-us...@googlegroups.com/msg00994.html, April 2009
- W3C public-lod: how do I report bad sameAs links? (dbpedia - Cyc), http://lists.w3.org/Archives/Public/public-lod/2009Jun/0443.html, June 2009
- W3C public-lod: sameas.org, http://lists.w3.org/Archives/Public/public-lod/2009Jun/0038.html, June 2009
- W3C public-lod: A sameas widget for Firefox, http://www.mail-archive.com/public-lod@w3.org/msg02554.html, June 2009
- W3C public-lod: owl:sameAs [recipe], http://lists.w3.org/Archives/Public/public-lod/2009Jul/0306.html, July 2009
- W3C public-lod: SKOS, owl:sameAs and DBpedia, http://lists.w3.org/Archives/Public/public-lod/2010Mar/0215.html, March 2010

*Related Modeling Issues:*
- Versioning and URIs, http://ontologydesignpatterns.org/wiki/Community:Versioning_and_URIs
- Proliferation of URIs, Managing Coreference, http://ontologydesignpatterns.org/wiki/Community:Proliferation_of_URIs%2C_Managing_Coreference

*Examples:*
- relating a foaf:Person instance to the person's home page.
- relating a geographical region with a political entity. For example, the physical area that a city occupies with the city itself.
- relating the DBpedia resource referring to a place with a GeoNames resource corresponding to that same place

*Conclusions:* There is a lot of confusion about how owl:sameAs should be used in the linked open data community. It is being used in ways that are semantically incorrect and can give incorrect inferences. A number of points and suggestions came up.

1. There is a frequent tendency to use sameAs to link resources that provide information about something to resources that represent the thing. E.g. relating a resource denoting a book to a resource that is the Amazon page for the book.
2. There is a tradeoff between formal accuracy on the one hand and pragmatic usefulness on the other hand. It often arises that treating things as the same has the desired behavior. Rather than being harmful, the vagueness can be an advantage.
3. It was proposed that a weaker similarity relationship be created to be used instead of sameAs when there is not true identity between the two resources. Some argued that there already are alternatives, e.g. skos:related and rdfs:seeAlso
4. Arguments were given pro and con, as to whether the new relationship should have a formal semantics. One proposal creates a mechanism that removes it from the logic entirely. See: Managing URI Synonymity to Enable Consistent Reference on the Semantic Web, http://eprints.ecs.soton.ac.uk/15614/1/camera-ready.pdf. If the formal semantics is important, should the similarity relation
   1. be a relation in the logical vocabulary of OWL, as sameAs is? -or-
   2. be just a relation in an ontology?
5. Having too many ways to specify similarity might be confusing and hinder uptake of the technology.
6. A suggestion was made to have owl:sameAs links made in separate files so that they can easily be excluded.
7. A suggestion was made that there be specific guidelines and practices between owners of data in how they reach agreement on what should be linked. See: Bernard Vatant suggested some good practice of mutual
SPARQL 1.1 query question
Hello All,

I have a question about SPARQL 1.1 queries. If I have the following triples:

    :LeagueA abc:hasMembers :Alice, :Bob, :Carol .
    :LeagueB abc:hasMembers :Dante, :Edward .

and I want the following table of results:

    ?league    ?membercount
    LeagueA    3
    LeagueB    2

Given the data and my desired results, will the following SPARQL 1.1 query work?

    SELECT ?league (COUNT(?member) AS ?membercount)
    WHERE {
      SELECT ?league ?member
      WHERE { ?league abc:hasMember ?member . }
      GROUP BY ?league
    }

Whether or not this query works, is there a way I can write this query without a subquery?

Thank you.

--
Michael Ransom
Ontologist
Revelytix, Inc.
Work: 410.584.0099
Cell: 410.591.6878
Personal Email: michael.evan.ran...@gmail.com
Work Email: mran...@revelytix.com
Skype: michael.evan.ransom
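One possible way to get the desired table without a subquery, offered as a sketch rather than a tested answer: group the basic pattern directly. As posted, the GROUP BY sits inside the subquery and the predicate is spelled abc:hasMember while the data uses abc:hasMembers, so the posted query seems unlikely to return the table as written. The prefix IRIs below are placeholders, and the small Python function only mimics what GROUP BY with COUNT computes over the sample triples; it does not run a SPARQL engine.

```python
from collections import Counter

# Candidate rewrite: no subquery, GROUP BY at the top level.
# The prefix IRIs are hypothetical; the predicate matches the sample data.
QUERY = """
PREFIX :    <http://example.org/>
PREFIX abc: <http://example.org/abc#>
SELECT ?league (COUNT(?member) AS ?membercount)
WHERE { ?league abc:hasMembers ?member . }
GROUP BY ?league
"""

# The sample data from the question, expanded to one triple per member.
TRIPLES = [
    (":LeagueA", "abc:hasMembers", ":Alice"),
    (":LeagueA", "abc:hasMembers", ":Bob"),
    (":LeagueA", "abc:hasMembers", ":Carol"),
    (":LeagueB", "abc:hasMembers", ":Dante"),
    (":LeagueB", "abc:hasMembers", ":Edward"),
]


def member_counts(triples, predicate="abc:hasMembers"):
    """Count objects per subject for one predicate, like GROUP BY + COUNT."""
    return Counter(s for s, p, _ in triples if p == predicate)


for league, count in sorted(member_counts(TRIPLES).items()):
    print(league, count)
```

Calling `member_counts(TRIPLES, "abc:hasMember")` returns an empty counter, which illustrates why the predicate spelling in the query has to match the data.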
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On 10/7/10 12:51 PM, Roberto Mirizzi wrote:

We recently discussed similar issues; anyway, it's still quite obscure to me why this:

    dbpedia:Beijing owl:sameAs ookaboo:topic/28999/Beijing .

Where did that come from? Did you make that claim or did something else?

    ookaboo:topic/28999/Beijing rdf:type sioc:ImageGallery .
    => dbpedia:Beijing rdf:type sioc:ImageGallery .

is not a problem.

Is a problem, that's wrong too :-) Maybe the *focal point* of an image (or what the image isAbout) that's part of a collection, for instance.

Kingsley
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On 07/10/2010 19:37, Kingsley Idehen wrote:

We recently discussed similar issues; anyway, it's still quite obscure to me why this:

    dbpedia:Beijing owl:sameAs ookaboo:topic/28999/Beijing .

Where did that come from? Did you make that claim or did something else?

We discussed it here: http://www.mail-archive.com/dbpedia-discuss...@lists.sourceforge.net/msg01945.html
Anyway, here I was just using Paul's example about Beijing. :-)

    ookaboo:topic/28999/Beijing rdf:type sioc:ImageGallery .
    => dbpedia:Beijing rdf:type sioc:ImageGallery .

is not a problem.

Is a problem, that's wrong too :-)

That's what I mean. :-)
Btw, as Michael reports here: http://ontologydesignpatterns.org/wiki/Community:Overloading_OWL_sameAs , the issue is old, well-known, and not uniquely solved.

Roberto
Re: Call for Chapters: Linking Government Data
Hi Kingsley,

I hope the following context answers your question since I'm familiar with the details ...

In the spirit of transparency, we are looking for ways to raise the profile of successful projects in order to increase the credibility of the SemWeb/LD community. A book like this doesn't generate money for the editor or authors per se; rather, it raises the profile of these projects and builds towards our common objectives in a credible manner.

Authors' individual names were listed in the table of contents with the chapter title, not their affiliation or organization. We hope the book will be very representative of all nations involved in Linking Data, including their motivations, approaches and lessons learned. The first book (LED) is very diverse IMO.

But back to the money since we are talking transparency. The editor will do most of the work upfront (call for papers, coordination, peer review, mark up, etc). The editor then finds a suitable publisher, enters into a contract, and negotiates the details on publication timeline, rights, fees, etc. In the case of the LED book, Dave stands to earn $10/hr for the hours he spent organizing the call for chapters, working with at least three peers to review/edit each chapter, putting the book into LaTeX, etc. It is a labor of love so to speak ...

I doubt Springer will make the NY Times best seller list with the Linking Enterprise Data book [1], but when books and conferences happen around a topic, it is perceived as having a market, which helps legitimize our efforts.

We promise to continue looking for innovative ways to make content like this available for linked data producers and consumers, as are more and more people each day around the world.

As John said, it is entirely up to you if you wish to contribute but *no one* is editing and/or writing a chapter for the money.

Cheers,

Bernadette Hyland
CEO, Talis, Inc.
www.talis.com
Tel. +1-540-898-6410

[1] http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-1-4419-7664-2
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On 10/7/10 1:50 PM, Roberto Mirizzi wrote:

Btw, as Michael reports here: http://ontologydesignpatterns.org/wiki/Community:Overloading_OWL_sameAs , the issue is old, well-known, and not uniquely solved.

In this case we have a broken triple coming from somewhere. Question is: where?

Kingsley
Re: Call for Chapters: Linking Government Data
On 10/7/10 1:53 PM, Bernadette Hyland wrote: Hi Kingsley, I hope the following context answers your question since I'm familiar with the details ... Bernadette, In the spirit of transparency, we are looking for ways to raise the profile of successful projects in order to increase credibility of the SemWeb/LD community. A book like this doesn't generate the editor or authors money per se, rather it raises the profile of these projects builds towards our common objectives in a credible manner. Author's individual names were listed in the table of contents with the chapter title, *not* their affiliation or organization. We hope the book will be very representative of all nations involved in Linking Data, including their motivations, approaches and lessons learned. The first book (LED) is very diverse IMO. But back to the money since we are talking transparency. The editor will do most of the work upfront (call for papers, coordination, peer review, mark up, etc). The editor then finds a suitable publisher, enter into a contract, negotiate the details on publication timeline, rights, fees, etc.In the case of the LED book, Dave stands to earn $10/hr for the hours he spent organizing the call for chapters, working with at least three peers to review/edit each chapter, putting the book into LaTeX, etc. It is a labor of love so to speak I doubt Springer will make the NY Times best seller list with the Linking Enterprise Data book[1], but when books and conferences happen around a topic, it is perceived as having a market which helps legitimize our efforts. We promise to continue looking for innovative ways to make content like this available for linked data producers consumers, as are more more people each day around the world. All good, re. clarity. But note some assumptions that nobody has control over: 1. 
Springer making the NY Times best seller list -- we are in exponential times, Linked Data is hot, and the InterWeb is redefining Media amongst other things, it could be a best seller 2. Labor of love -- still a case of dealing with that scarce resource we know as Time, all contributors should be clear about this aspect from the get-go 3. Attribution -- it's highly likely that most contributors to this book also possess WebIDs, so why not consider Attribution by WebID in addition to Literal Names? As John said, it is entirely up to you if you wish to contribute but *no one* is editing and/or writing a chapter for the money. John: was reacting (I believe) rather than responding to my comment. You've just responded to my comment :-) An opaque and inherently ambiguous project participation call-out has now morphed into a much clearer endeavor -- I hope -- with regards to all potential participants. Kingsley Cheers, Bernadette Hyland CEO, Talis, Inc. www.talis.com http://www.talis.com Tel. +1-540-898-6410 [1] http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-1-4419-7664-2 On Oct 7, 2010, at 12:21 PM, Kingsley Idehen wrote: On 10/7/10 11:42 AM, John Erickson wrote: With all due respect, as with any monograph this is a call to *contribute*; the benefits if accepted are being part of an important work. Recipients are free to not submit! John, My question still stands: Who benefits from the sale of the book? No harm in investing a little more time about the expanse of the value chain graph. Time is money. Time is a fixed component that is eternally scarce. Time is the ultimate problem. From these problems come opportunities and opportunity costs. People don't always have enough time to figure out the density of any given value graph or its superficial value chain. Finding out vital details *after* you've committed time and effort typically leads to bad-will. Let's be clear about this stuff. That's all I seek. 
Transparency hasn't killed anyone or made enemies of friends, not the case with opacity! Kingsley On Thu, Oct 7, 2010 at 11:23 AM, Kingsley Idehenkide...@openlinksw.com mailto:kide...@openlinksw.com wrote: On 10/7/10 10:02 AM, David Wood wrote: Hi all, Please find below a Call for Chapters for a new contributed book to be entitled Linking_Government_Data. Please distribute this information as widely as possible to help us collect useful success stories, techniques and benefits to using Linked Data in governments. Thanks in advance. Regards, Dave -- David Wood announces a Call for Chapters for a new book to be entitled Linking Government Data. First proposal submissions are due November 30, 2010 to da...@3roundstones.com mailto:da...@3roundstones.com. The book is intended to be published in print, ebooks format and on the Web, but a publisher has not yet been chosen. More than one publisher is interested. CHAPTER PROPOSALS INVITED FROM RESEARCHERS AND PRACTITIONERS IN LINKED DATA, DATA MANAGEMENT AND WEB INFORMATION SYSTEMS 1st Proposal
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
Hi Paul: On 07.10.2010, at 17:56, Paul Houle wrote: On Wed, Oct 6, 2010 at 5:09 PM, Martin Hepp martin.h...@ebusiness-unibw.org wrote: I've got mixed feelings about snippets vs fully embedded RDFa. For the most part I think systems that use snippets will be more maintainable, but I've seen cases where fully embedded RDFa fits very well into a system and there may be cases where the size of the HTML can be reduced by using it -- and HTML size is a big deal in the real world where loading time matters and we're increasingly targeting mobile devices. That is a common misconception - even very comprehensive RDFa in snippet style has less than 1% impact on the uncompressed size of typical HTML documents, see slide # 11 in http://www.slideshare.net/mhepp/goodrelations-semtech2010 The additional complexity of maintaining RDFa that is densely interwoven with the content for rendering is usually not worth any of the potential savings in page size, in particular since - the redundancy will be partly compensated by HTTP compression - the page loading time also has a fixed part for DNS look-up etc., so that a linear increase in page size will not linearly increase the loading time. The RDFa issue that really bugs me is that a linked data URI can be read to signify a number of different things. Consider, for instance, http://dbpedia.org/resource/Rainbow_Bridge_(Tokyo) What you are describing is not an RDFa-specific issue, afaik, because it is fairly easy to define separate URIs for documents and non-documents using the about attribute, e.g. with a hash. The only RDFa-specific clash I see is hash URI references that may mean different things from an HTML and from an RDFa perspective. 
Martin martin hepp e-business web science research group universitaet der bundeswehr muenchen e-mail: h...@ebusiness-unibw.org phone: +49-(0)89-6004-4217 fax: +49-(0)89-6004-4620 www: http://www.unibw.de/ebusiness/ (group) http://www.heppnetz.de/ (personal) skype: mfhepp twitter: mfhepp Check out GoodRelations for E-Commerce on the Web of Linked Data! = * Project Main Page: http://purl.org/goodrelations/ * Quickstart Guide for Developers: http://bit.ly/quickstart4gr * Vocabulary Reference: http://purl.org/goodrelations/v1 * Developer's Wiki: http://www.ebusiness-unibw.org/wiki/GoodRelations * Examples: http://bit.ly/cookbook4gr * Presentations: http://bit.ly/grtalks * Videos: http://bit.ly/grvideos
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
On 07.10.2010, at 17:14, Paul Houle wrote: On Wed, Oct 6, 2010 at 1:49 PM, Martin Hepp martin.h...@ebusiness-unibw.org wrote: From a commercial viewpoint, imperfect data is an opportunity. If I didn't have other projects ahead of it in the queue, I'd seriously be thinking about building a shopping aggregator that cleans up GoodRelations and other data, reconciles product identities, categorizes products, creates good product descriptions, and makes something that improves on current affiliate marketing and comparison shopping systems. http://linkedopencommerce.com Note that the beauty of an ontology is in the eyes of a user. Yes and no. While there is a degree of subjectivity when evaluating ontologies, there are also hard criteria, e.g. - does it provide meaningful distinctions, i.e. such that are useful to preserve (in order to save reclassification effort by the data consumer) and reasonably cheap to populate (by the data owners). - is it embedded in an economically feasible ecosystem with incentives for data owners to publish respective data. Keep in mind positive network externalities! One user might want to have a broad but vague ontology of products, they are happy to say that a digital camera is a :DigitalCamera. Other people might want to just cover the photography domain, but do it in great detail -- describing both the differences between digital cameras manufactured today but also lenses, and even covering, in great detail, vintage cameras that you might find on eBay. You can't say that one of these ontologies is better than the other. The best thing is to have all of these ontologies available [populated with data!] and to pick and choose the ones that fit your needs. 
If you ignore economics, you can have as many ontologies as you like, but ontologies are goods with strong positive network externalities (they gain in utility by the number of users and tools), so in practice, you may have many ontologies, but using a popular one out of a rather small set will often be the best choice. But of course you are right that using any ontology, even if proprietary, is better than using no ontology, since entity matching on the schema level is usually less effort than on the instance level. Martin martin hepp e-business web science research group universitaet der bundeswehr muenchen e-mail: h...@ebusiness-unibw.org phone: +49-(0)89-6004-4217 fax: +49-(0)89-6004-4620 www: http://www.unibw.de/ebusiness/ (group) http://www.heppnetz.de/ (personal) skype: mfhepp twitter: mfhepp Check out GoodRelations for E-Commerce on the Web of Linked Data! = * Project Main Page: http://purl.org/goodrelations/ * Quickstart Guide for Developers: http://bit.ly/quickstart4gr * Vocabulary Reference: http://purl.org/goodrelations/v1 * Developer's Wiki: http://www.ebusiness-unibw.org/wiki/GoodRelations * Examples: http://bit.ly/cookbook4gr * Presentations: http://bit.ly/grtalks * Videos: http://bit.ly/grvideos
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages
Hi Tom, taking our thread to the public is definitely good. My point is that, yes, it is tempting to tweak your invisible data to polish your ranking. Example: In HTML, you say the price is 100 USD, in RDFa you say it was 50 USD. However, it is - from a computational perspective - very, very easy for Google or anybody else to spot divergences between the visible and the invisible content and punish pages that use such semantic black-hat SEO. My main argument is that structured data also simplifies checking for black-hat SEO. A very simple algorithm would be to tag pages that don't contain the sequence of digits contained in an invisible gr:hasCurrencyValue property anywhere in the visible part of the page. True algorithms for such checks will be more complex, but I hope this gives a hint. Bottom line: I think the fraud issue is overrated, since Google, Bing, and Yahoo have the pretty strong instrument of delisting sites that use black-hat SEO. Martin On 07.10.2010, at 17:05, Thomas Steiner wrote: Hi Martin, We have discussed this off-list before, but maybe others would like to chime in... I don't think it is sad; because using invisible div / span elements nicely decouples the organization of the visual content from the embedded data. Martin, you never fail to hash-mark your #GoodRelations tweets with #SEO. Decoupling triples and content raises an interesting SEO problem: state A in the visible content, state B in the invisible triples. Now which information do we trust? It's the white text on a white background search engine fooling of the 21st century. I'm not yet sure if it's a real problem, but could imagine that tweaking price tags might be tempting to some. Opinions? Thanks, Tom Disclaimer: I work for Google, but I have no insider information at all how/if we deal with this. -- Thomas Steiner, Research Scientist, Google Inc. http://blog.tomayac.com, http://twitter.com/tomayac
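The "very simple algorithm" Martin describes above could look roughly like the following. This is only a sketch under assumptions: the function names are invented, the inputs are assumed to be the already-extracted price literal from the hidden markup and the already-extracted visible text, and numbers are compared by value rather than by raw digit sequence so that "100.00" matches a visible "100". Thousands separators and currency formats are not handled, and this says nothing about what any search engine actually does.

```python
import re
from decimal import Decimal

# Matches plain numbers such as "100" or "99.95"; thousands separators are not handled.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def visible_amounts(visible_text):
    """Collect every number that a human reader could see on the page."""
    return {Decimal(match) for match in NUMBER.findall(visible_text)}


def is_suspicious(markup_price, visible_text):
    """Flag a page whose hidden price never appears in the visible text."""
    return Decimal(markup_price) not in visible_amounts(visible_text)


page_text = "Special offer: only 100 USD"
print(is_suspicious("50", page_text))      # True: hidden price differs from what is shown
print(is_suspicious("100.00", page_text))  # False: same amount, different formatting
```

A flagged page is only a candidate for closer inspection; a real check would need to cope with taxes, price ranges and localized number formats.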
Best Way to Extend the Geo Vocabulary to include an error or extent radius in meters
Hi LOD'ers, There was some discussion about ways to record species observations using the geo vocabulary at a recent biodiversity informatics meeting. Some see the advantages of using the geo standard, but we really need to have a way to incorporate an error or extent in meters. What would be the best way to extend the current geo vocabulary so that it includes this radius measure but still works well with geo aware tools and services? The addition would be defined as the total extent including calculated error. This would be used to record that a species or thing was observed or collected within this geographically defined area. Something like the example below, but I suspect that this might not make it a real geo:Point? <geo:Point> <geo:lat>55.701</geo:lat> <geo:long>12.552</geo:long> <dwc:radius>10</dwc:radius> </geo:Point> This should work so that if I have 10,000 records recorded from within this same area, I can define it once and then refer to that area description RDF in the 10,000 species occurrence records with one simple URI to that area definition. One solution would be for the geo authors to add the radius to the geo vocabulary. Also there may be unrelated groups that might like the radius attribute for some other use. Another alternative would be to somehow extend the geo vocabulary within a separate vocabulary. It is not clear to me what is the best way to do this and still retain geo compatibility. I look forward to ideas and suggestions, - Pete Pete DeVries Department of Entomology University of Wisconsin - Madison 445 Russell Laboratories 1630 Linden Drive Madison, WI 53706 TaxonConcept Knowledge Base http://www.taxonconcept.org/ / GeoSpecies Knowledge Base http://lod.geospecies.org/ About the GeoSpecies Knowledge Base http://about.geospecies.org/
Re: Best Way to Extend the Geo Vocabulary to include an error or extent radius in meters
Hi Peter Something like the example below, but I suspect that this might not make it a real geo:Point? barely. The old maths teacher in me frowns at points having a radius :) <geo:Point> <geo:lat>55.701</geo:lat> <geo:long>12.552</geo:long> <dwc:radius>10</dwc:radius> </geo:Point> What about something as the following, since the radius is not really a property of the point ... <geo:Area> <geo:center> <geo:Point> <geo:lat>55.701</geo:lat> <geo:long>12.552</geo:long> </geo:Point> </geo:center> <dwc:radius>10</dwc:radius> </geo:Area> namespaces ad libitum of course Cheers Bernard -- Bernard Vatant Senior Consultant Vocabulary Data Engineering Tel: +33 (0) 971 488 459 Mail: bernard.vat...@mondeca.com Mondeca 3, cité Nollez 75018 Paris France Web: http://www.mondeca.com Blog: http://mondeca.wordpress.com
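To make the "area with a centre point and a radius" idea a bit more concrete, here is a small sketch of how a consumer might test whether an observation falls inside such an area. Everything here is an assumption for illustration: the dictionary layout, the key names and the use of the haversine great-circle formula with a mean Earth radius are not part of the geo vocabulary or of Darwin Core; only the coordinates and the 10 m radius come from the example in the thread.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (an approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 lat/long pairs."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# One area description, defined once and shared by many occurrence records.
area = {"lat": 55.701, "long": 12.552, "radius_m": 10}


def within_area(area, lat, lon):
    """True if the given position lies inside the area's radius around its centre."""
    return haversine_m(area["lat"], area["long"], lat, lon) <= area["radius_m"]


print(within_area(area, 55.701, 12.552))  # True: the centre itself
print(within_area(area, 55.702, 12.552))  # False: roughly 111 m to the north
```

Keeping the radius on the area rather than on the point means geo-aware tools can still read the centre as an ordinary point and ignore the extent if they do not understand it.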
GoodRelations Service Update 2010-09-16
Dear all: After thorough testing, we have just deployed a service update of the GoodRelations vocabulary at http://purl.org/goodrelations/v1 It is 99.99% backwards compatible with all existing data and applications. Please refresh your caches! The main changes are as follows: 1. Using gr:Offering becomes a lot easier for the simple case of just one product per offer. Accordingly, the use of gr:ActualProductOrServiceInstance or gr:ProductOrServicesSomeInstancesPlaceholder becomes obsolete for marking up data in 90% of the cases. 2. We defined typical properties for the product name, a description, the condition, weight, dimensions, and color directly in GoodRelations, so that using a second ontology for product features becomes unnecessary for just those standard properties. 3. For quantitative values, gr:QuantitativeValue can now be used as a fully-fledged value class, instead of just gr:QuantitativeValueFloat and gr:QuantitativeValueInteger. The latter remain valid. 4. We added a gr:addOn property that allows publishing optional extensions (additional services or components) that are available only in combination with the base offer. 5. There is now a gr:valueReference property for linking a value to one or more values that provide context for that value (e.g. temperature, revolutions per minute, etc.). The full list of changes is attached below. Best wishes Martin Hepp 2010-09-16: Service Update Summary • Added gr:condition property • Changed the range of gr:includes to gr:ProductOrService, which allows much more concise markup in the general case of linking an offer of a single product to model data. 
Also updated the inference rules for expanding the shortcut (if the object of the triple is a gr:ProductOrServiceModel, one must create both gr:TypeAndQuantityNode and gr:ProductOrServicesSomeInstancesPlaceholder instances) • Changed the domain of gr:serialNumber to the union of gr:Offering and gr:ActualProductOrServiceInstance • Changed the domain of gr:hasInventoryLevel to the union of gr:Offering and gr:ProductOrServicesSomeInstancesPlaceholder • Added a gr:category property for attaching product category information in a lightweight manner if no dedicated ontology exists • Added a gr:name property as a shortcut for dc:title and rdfs:label • Added/reactivated the gr:description property as a shortcut for rdfs:comment and dcterms:description. Also changed the domain to owl:Thing • Defined the product features gr:weight, gr:width, gr:height, gr:depth, and gr:color directly in GoodRelations • Added the range rdfs:Literal to gr:hasMinValue and gr:hasMaxValue so that they become fully usable for annotations (not just for queries, as originally). • Changed the cardinality recommendation for gr:hasMinValue and gr:hasMaxValue to 0..1 and updated their textual definition. • Added a gr:hasValue property, an rdfs:subPropertyOf of gr:hasMinValue and gr:hasMaxValue, which simplifies the markup for quantitative data without breaking existing content • Added the range rdfs:Literal to all text properties, i.e. 
gr:condition, gr:description, gr:legalName, and gr:category • Removed unused namespace declarations xmlns:swrl=http://www.w3.org/2003/11/swrl# and xmlns:swrlb=http://www.w3.org/2003/11/swrlb#; • Added UN/CEFACT unit recommendations to gr:weight, gr:width, gr:height, and gr:depth • Updated the intro section: Removed gr:ActualProductOrServiceInstance and gr:ProductOrServicesSomeInstancesPlaceholder from the list of core classes, since they will be less important for very popular cases • Updated the UML class diagram accordingly • Added rdfs:isDefinedBy to all new elements • Fixed the rdfs:comment of gr:QualitativeValue • Polished the text of the ontology meta-data • Fixed the text of gr:BusinessEntity to make clear it can also be used with gr:seeks and that stores are gr:LocationOfSalesAndServiceProvisioning • Polished the text of gr:BusinessFunction • Updated the text of gr:Offering to reflect the new gr:includes shortcut to model data • Polished the text for gr:ProductOrService 2010-07-27: Service Update V (not officially deployed, thus also mentioned in here) * Added gr:hasMPN property * Changed the text of gr:hasStockKeepingUnit slightly in order to differentiate from hasMPN * Added gr:valueReference property * Added gr:addOn property * Added gr:Offering to the range of gr:hasEligibleQuantity and updated the text for gr:hasEligibleQuantity accordingly.
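As an aside on the gr:hasValue entry in the change list above: because it is declared an rdfs:subPropertyOf of both gr:hasMinValue and gr:hasMaxValue, the standard RDFS subproperty rule lets a consumer derive the min and max statements from a single hasValue statement. A minimal sketch over plain tuples follows; the function name, the hard-coded property table and the ex:weight1 node are illustrative assumptions, not part of GoodRelations or of any RDF library.

```python
# Subproperty declarations taken from the change list; everything else is illustrative.
SUB_PROPERTY_OF = {
    "gr:hasValue": ("gr:hasMinValue", "gr:hasMaxValue"),
}


def apply_subproperty_rule(triples):
    """RDFS subproperty entailment: if p is a subproperty of q and (s p o) holds, add (s q o)."""
    inferred = set(triples)
    for s, p, o in triples:
        for parent in SUB_PROPERTY_OF.get(p, ()):
            inferred.add((s, parent, o))
    return inferred


# Hypothetical markup: a weight given as a single value of 2.5.
weight = [("ex:weight1", "gr:hasValue", "2.5")]
for triple in sorted(apply_subproperty_rule(weight)):
    print(triple)
```

This is why the shortcut does not break existing content: queries written against gr:hasMinValue and gr:hasMaxValue still match data that only states gr:hasValue, provided the consumer applies the subproperty rule.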
Re: Best Way to Extend the Geo Vocabulary to include an error or extent radius in meters
Thanks Bernard, I will try that! :-) - Pete On Thu, Oct 7, 2010 at 4:54 PM, Bernard Vatant bernard.vat...@mondeca.com wrote: Hi Peter Something like the example below, but I suspect that this might not make it a real geo:Point? barely. The old maths teacher in me frowns at points having a radius :) <geo:Point> <geo:lat>55.701</geo:lat> <geo:long>12.552</geo:long> <dwc:radius>10</dwc:radius> </geo:Point> What about something as the following, since the radius is not really a property of the point ... <geo:Area> <geo:center> <geo:Point> <geo:lat>55.701</geo:lat> <geo:long>12.552</geo:long> </geo:Point> </geo:center> <dwc:radius>10</dwc:radius> </geo:Area> namespaces ad libitum of course Cheers Bernard -- Bernard Vatant Senior Consultant Vocabulary Data Engineering Tel: +33 (0) 971 488 459 Mail: bernard.vat...@mondeca.com Mondeca 3, cité Nollez 75018 Paris France Web: http://www.mondeca.com Blog: http://mondeca.wordpress.com -- Pete DeVries Department of Entomology University of Wisconsin - Madison 445 Russell Laboratories 1630 Linden Drive Madison, WI 53706 TaxonConcept Knowledge Base http://www.taxonconcept.org/ / GeoSpecies Knowledge Base http://lod.geospecies.org/ About the GeoSpecies Knowledge Base http://about.geospecies.org/