Re: Organization ontology
On Wed, Jun 2, 2010 at 8:09 PM, Dave Reynolds dave.e.reyno...@googlemail.com wrote:
> On Wed, 2010-06-02 at 17:06 +1200, Stuart A. Yeates wrote:
>> On Tue, Jun 1, 2010 at 7:50 PM, Dave Reynolds dave.e.reyno...@googlemail.com wrote:
>>> We would like to announce the availability of an ontology for description of organizational structures including government organizations. This was motivated by the needs of the data.gov.uk project. After some checking we were unable to find an existing ontology that precisely met our needs and so developed this generic core, intended to be extensible to particular domains of use.
>>> [1] http://www.epimorphics.com/public/vocabulary/org.html
>> I think this is great, but I'm a little worried that a number of Western (and specifically Westminster) assumptions may have been built into it.
> Interesting. We tried to keep the ontology reasonably neutral, that's why, for example, there is no notion of a Government or Corporation. Could you say a little more about the specific Western Westminster assumptions that you feel are built into it?

(*) that structure is relatively static with sharp transitions between states.
(*) that an organisation has a single structure rather than a set of structures depending on the operations you are concerned with (finance, governance, authority, criminal justice, ...)
(*) that the structures are intended to be as they are, rather than being steps towards some kind of Platonic ideal

>> ... Modelling the crime organisations (the mafia, drug runners, Enron, identity crime syndicates) may also be helpful in exposing assumptions, particularly those in mapping the real-world to legal entities. Alternatively, this may help in defining the subset of organisations that you're trying to model.
> Control is a different issue from organizational structure. This ontology is not designed to support reasoning about authority and governance models. There are Enterprise Ontologies that explicitly model authority, accountability and empowerment flows and it would be possible to create a generic one which bolted alongside org but org is not such a beast :)

I suspect I may have misunderstood the subset of problems you're trying to solve. A statement such as the above in the ontology document might save others making the same mistake.

cheers
stuart
Re: Why should we publish ordered collections or indexes as RDF?
2010/6/3 Haijie.Peng haijie.p...@gmail.com:
> [Apologies for cross-posting] Why should we publish ordered collections or indexes as RDF? is it necessary?

On the Web, very little is 'necessary'. But some things can be useful. Indexes and summaries can help software prioritise, and allow larger files to be loaded only when needed.

It depends what you mean by 'ordered collections' and 'indexes'. But the reason for sitemap-style summaries is usually to help external sites monitor the content of the Web better. At http://www.sitemaps.org/ there is an explanation of the sitemaps format which several crawlers use. I believe the Google crawler will use it to help schedule activity on a site, and that, for example, it can help if you want your RDF/FOAF or XFN documents to be indexed by Google's Social Graph API - http://code.google.com/apis/socialgraph/

There is also a version of this format called Semantic Sitemaps, but http://sw.deri.org/2007/07/sitemapextension/ is offline right now.

In other cases, RSS feeds (also Atom) do the same thing, and provide a 'What's new' feed for a site, letting everyone know which documents are new or updated, so that they can be (re-)indexed.

For large collections of documents, it is useful sometimes to have smaller summary documents so that the bigger files can be fetched only when they are needed. Mobile apps that care about bandwidth are an example scenario there.

Regarding Linked Data, what we do there is to link descriptions together. Each partial description often links to other documents that are about the same real-world thing. This addresses some of the same needs as a top level index or catalogue, because you can retrieve different levels of detail from different sites. So my small FOAF file is in some ways a top level entry (index?) for me, and it might point to larger files (eg. twitter or flickr datasets) that are maintained separately. RDF aggregator sites like sindice.com can be used to link these together, even if the top level file does not contain links to every other file that mentions me. So in that scenario, it is not 100% necessary for the small file to be an index to the large files. The data can be linked together later if common identifiers are used in each data set.

Hope this helps. Can you say more about the specific situation you have in mind?

cheers,
Dan
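As a rough illustration of the sitemap-style summary mentioned above, the sketch below builds a minimal sitemap listing a small FOAF file and a larger dataset, each with a last-modified date, so that a crawler could decide what to re-fetch. The `example.org` URLs and dates are invented for illustration; only the sitemap namespace and the `urlset`/`url`/`loc`/`lastmod` element names come from the sitemaps.org format.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for (url, last_modified) pairs."""
    # Serialise the sitemap namespace as the default namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = loc
        ET.SubElement(url, "{%s}lastmod" % SITEMAP_NS).text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical documents: a small top-level FOAF file and a larger dataset.
entries = [
    ("http://example.org/people/alice/foaf.rdf", "2010-06-03"),
    ("http://example.org/data/alice-photos.rdf", "2010-05-20"),
]
sitemap_xml = build_sitemap(entries)
print(sitemap_xml)
```

A consumer would compare each `lastmod` value with the date of its previous crawl and fetch only the documents that changed.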
Re: Organization ontology
On Thu, Jun 3, 2010 at 8:47 AM, Stuart A. Yeates syea...@gmail.com wrote:
> On Wed, Jun 2, 2010 at 8:09 PM, Dave Reynolds dave.e.reyno...@googlemail.com wrote:
>> On Wed, 2010-06-02 at 17:06 +1200, Stuart A. Yeates wrote:
>>> On Tue, Jun 1, 2010 at 7:50 PM, Dave Reynolds dave.e.reyno...@googlemail.com wrote:
>>>> We would like to announce the availability of an ontology for description of organizational structures including government organizations. This was motivated by the needs of the data.gov.uk project. After some checking we were unable to find an existing ontology that precisely met our needs and so developed this generic core, intended to be extensible to particular domains of use.
>>>> [1] http://www.epimorphics.com/public/vocabulary/org.html
>>> I think this is great, but I'm a little worried that a number of Western (and specifically Westminster) assumptions may have been built into it.
>> Interesting. We tried to keep the ontology reasonably neutral, that's why, for example, there is no notion of a Government or Corporation. Could you say a little more about the specific Western Westminster assumptions that you feel are built into it?
> (*) that structure is relatively static with sharp transitions between states.

This simplification pretty much comes 'out of the box' with the use of RDF or other simple logics (SQL too). Nothing we do here deals in a very fluid manner with an ever-changing, subtle and complex world. But still SQL and increasingly RDF can be useful tools, and used carefully I don't think they're instruments of western cultural imperialism. I don't find anything particularly troublesome about the org: vocab on this front. If you really want to critique culturally-loaded ontologies, I'd go find one that declares class hierarchies with terms like 'Terrorist' without giving any operational definitions...

> (*) that an organisation has a single structure rather than a set of structures depending on the operations you are concerned with (finance, governance, authority, criminal justice, ...)

Couldn't the subOrganizationOf construct be used to allow these different aspects to be described and then grouped loosely together?

> (*) that the structures are intended to be as they are, rather than being steps towards some kind of Platonic ideal

I'm not getting that from the docs. For example,

    We felt that the best approach was to develop a small, generic, reusable core ontology for organizational information and then let developers extend and specialize it to particular domains.

...suggests a hope for incremental refinement / improvement, but also a hope that the basic pieces are likely to map onto multiple parties' situations at a higher level. Bit of both there, but no Plato.

>>> ... Modelling the crime organisations (the mafia, drug runners, Enron, identity crime syndicates) may also be helpful in exposing assumptions, particularly those in mapping the real-world to legal entities.

I agree these are interesting areas to attempt to describe, but dealing with situations where obfuscation, secrecy and complexity are core business is a tough stress-test of any model. Ontology-style modeling works best when there is a shared conceptualisation of what's going on; even many direct participants in these complex crime situations lack that. So I'd suggest for those situations taking a more evidence-based social networks approach; instead of saying here's their org chart, build things up from raw data of who emails who, who knows who, who met who, where and when (or who claimed that they did), etc. RDF is ok for that task too. Those techniques are also useful when understanding how more legitimate organizations really function, but (as mentioned w.r.t. accountability) it can largely be broken out as a separate descriptive problem.

>>> Alternatively, this may help in defining the subset of organisations that you're trying to model.

Yup

>> Control is a different issue from organizational structure. This ontology is not designed to support reasoning about authority and governance models. There are Enterprise Ontologies that explicitly model authority, accountability and empowerment flows and it would be possible to create a generic one which bolted alongside org but org is not such a beast :)
> I suspect I may have misunderstood the subset of problems you're trying to solve. A statement such as the above in the ontology document might save others making the same mistake.

Perhaps the scope is organizations in which there is some ideal that all participants can share a common explicit understanding of (the basics of) how things work - who does roughly what, and what the main aggregations of activity are. Companies, clubs, societies, public sector bodies etc. Sure there will be old-boy networks, secret handshakes and all kinds of undocumented channels, but those are understood as routing-around the main transparent shared picture of how the organization works (or should work).
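The evidence-based approach suggested above (building a picture up from raw observations of who emailed whom, rather than asserting an org chart) might be sketched roughly as follows. This is only an illustration under stated assumptions: the `http://example.org/` URIs, the sample observations and the `emailed` predicate are hypothetical names invented here and are not part of org: or FOAF. `foaf:knows` is a real FOAF property, but inferring it from two-way email contact is an assumption made for the sake of the example.

```python
from collections import Counter

EX = "http://example.org/"                      # hypothetical namespace
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"  # real FOAF property
EX_EMAILED = EX + "vocab/emailed"               # hypothetical predicate

# Raw observations: (sender, recipient, date). Invented sample data.
observations = [
    ("alice", "bob", "2010-05-01"),
    ("bob", "alice", "2010-05-02"),
    ("alice", "carol", "2010-05-03"),
    ("alice", "bob", "2010-05-09"),
]


def to_ntriples(obs):
    """Emit one N-Triples line per distinct (sender, recipient) pair,
    plus foaf:knows where mail was observed in both directions."""
    pairs = Counter((s, r) for s, r, _date in obs)
    lines = []
    for (s, r) in sorted(pairs):
        lines.append("<%speople/%s> <%s> <%speople/%s> ."
                     % (EX, s, EX_EMAILED, EX, r))
        if (r, s) in pairs:
            lines.append("<%speople/%s> <%s> <%speople/%s> ."
                         % (EX, s, FOAF_KNOWS, EX, r))
    return pairs, lines


pairs, triples = to_ntriples(observations)
print("\n".join(triples))
```

The counts kept in `pairs` could serve as edge weights if the strength of a tie matters more than its mere existence.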
Discogs Linked Data
Does anyone know the state of play wrt a linked dataset describing Discogs (the music/record site)? I know that Leigh Dodds did some work about a year ago [1] but it appears that the data incubator page for the dataset is not active. There is also a SPARQL endpoint to the data at [2] but no access to a dump of the triples.

Thinking about setting up a dataset describing Discogs, but would not do so if this has been done already and the dataset is regularly updated.

thanks

Matthew Rowe, MEng
PhD Student
OAK Group
Department of Computer Science
University of Sheffield
m.r...@dcs.shef.ac.uk

[1] http://discogs.dataincubator.org/
[2] http://api.talis.com/stores/discogs/services/sparql
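Where only a SPARQL endpoint is available and no dump, one common workaround is to page through the store with LIMIT and OFFSET. The sketch below only builds the request URLs and fetches nothing. It assumes the endpoint at [2] accepts the standard SPARQL Protocol `query` parameter over GET and tolerates this kind of paging; many endpoints cap result sizes or reject such queries, so treat it as an idea to test rather than a known-good recipe. The page size is an arbitrary choice.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint [2] from the message above; whether it still answers is not checked.
ENDPOINT = "http://api.talis.com/stores/discogs/services/sparql"


def page_urls(page_size, pages):
    """Build SPARQL Protocol GET URLs that walk the store page by page."""
    urls = []
    for i in range(pages):
        # ORDER BY keeps the pages stable between requests.
        query = (
            "SELECT ?s ?p ?o WHERE { ?s ?p ?o } "
            "ORDER BY ?s ?p ?o LIMIT %d OFFSET %d" % (page_size, i * page_size)
        )
        urls.append(ENDPOINT + "?" + urlencode({"query": query}))
    return urls


urls = page_urls(page_size=1000, pages=3)
for u in urls:
    print(u)
```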
Re: Organization ontology
On 10-06-03 09:01, Dan Brickley wrote:
> I don't find anything particularly troublesome about the org: vocab on this front. If you really want to critique culturally-loaded ontologies, I'd go find one that declares class hierarchies with terms like 'Terrorist' without giving any operational definitions...

I must admit when I looked at the org vocabulary I had a feeling that there were some assumptions buried in it but discarded a couple of draft emails trying to articulate it.

I think it stems from org:FormalOrganization being a thing that is legally recognized and org:OrganizationalUnit (btw, any particular reason for using the North American spelling here?) being an entity that is not recognised outside of the FormalOrg.

Organisations can become recognised in some circumstances despite never having solicited outside recognition from a state -- this might happen in a court proceeding after some collective wrongdoing. Conversely you might have something that can behave like a kind of organisation, e.g. a class in a class-action lawsuit without the internal structure present in most organisations.

Is a state an Organisation?

Organisational units can often be semi-autonomous (e.g. legally recognised) subsidiaries of a parent or holding company. What about quangos or crown-corporations (e.g. corporations owned by the state)? They have legal recognition but are really like subsidiaries or units.

Some types of legally recognised organisations don't have a distinct legal personality, e.g. a partnership or unincorporated association, so they cannot be said to have rights and responsibilities; rather the members have joint (or joint and several) rights and responsibilities. This may seem like splitting hairs but from a legal perspective it's an important distinction at least in some legal environments. The description provided in the vocabulary is really only true for corporations or limited companies.

I think the example, eg:contract1, is misleading since this is an inappropriate way to model a contract. A contract has two or more parties. A contract might include a duty to fill a role on the part of one party but it is not normally something that has to do with membership.

Membership usually has a particular meaning as applied to cooperatives and not-for-profits. They usually wring their hands extensively about what exactly membership means. This concept normally doesn't apply to other types of organisations and does not normally have much to do with the concept of a role. The president of ${big_corporation} cannot be said to have any kind of membership relationship to that corporation, for example.

I think there might be more, but I don't think it's a problem of embedding Westminster assumptions because I don't think the vocabulary fits very well even in the UK and commonwealth countries when you start looking at it closely.

Thoughts?

Cheers,
-w

--
William Waites william.wai...@okfn.org
Mob: +44 789 798 9965
Open Knowledge Foundation
Fax: +44 131 464 4948
Edinburgh, UK
Re: Organization ontology
Is any sample instance data available, whether it's using real or fake organizations? thanks, Bob
Re: Organization ontology
On Thu, 2010-06-03 at 09:29 -0400, Bob DuCharme wrote:
> Is any sample instance data available, whether it's using real or fake organizations?

Not yet, but there will be.

Dave
Re: Discogs Linked Data
On 6/3/10 7:07 AM, Matthew Rowe wrote:
> Does anyone know the state of play wrt a linked dataset describing Discogs (the music/record site)?

There have always been Virtuoso Sponger [1] Cartridges (Basic and Meta) for Discogs.

Examples:
1. http://linkeddata.uriburner.com/about/id/entity/http/www.discogs.com/artist/Stevie+Wonder -- Stevie Wonder
2. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger -- Virtuoso Sponger Middleware

Kingsley

> I know that Leigh Dodds did some work about a year ago [1] but it appears that the data incubator page for the dataset is not active. There is also a SPARQL endpoint to the data at [2] but no access to a dump of the triples.
> Thinking about setting up a dataset describing Discogs, but would not do so if this has been done already and the dataset is regularly updated.
> thanks
> Matthew Rowe, MEng
> PhD Student
> OAK Group
> Department of Computer Science
> University of Sheffield
> m.r...@dcs.shef.ac.uk
> [1] http://discogs.dataincubator.org/
> [2] http://api.talis.com/stores/discogs/services/sparql
Re: Organization ontology
On Thu, Jun 3, 2010 at 3:07 PM, William Waites william.wai...@okfn.org wrote:
> On 10-06-03 09:01, Dan Brickley wrote:
>> I don't find anything particularly troublesome about the org: vocab on this front. If you really want to critique culturally-loaded ontologies, I'd go find one that declares class hierarchies with terms like 'Terrorist' without giving any operational definitions...
> I must admit when I looked at the org vocabulary I had a feeling that there were some assumptions buried in it but discarded a couple of draft emails trying to articulate it.
> I think it stems from org:FormalOrganization being a thing that is legally recognized and org:OrganizationalUnit (btw, any particular reason for using the North American spelling here?)

Re spelling - fair question. I think there are good reasons. British spelling accepts both. FOAF, which was made largely in Bristol UK but with international participants, has used 'Z' spelling for nearly a decade, http://xmlns.com/foaf/spec/#term_Organization ... as far as I know without any complaints.

I'm really happy to see this detailed work happen and hope to nudge FOAF a little too, perhaps finding a common form of words to define the shared general Org class. It would be pretty unfortunate to have foaf:Organization and org:Organisation; much worse imho than the camel-case vs underscore differences that show up within and between vocabularies. Z seems the pragmatic choice.

I don't know much about English usage outside the UK and the northern Americas, but I find 'z' is generally accepted in the UK, whereas in the US, 's' is seen as a mistake. This seems supported by whoever wrote this bit of wikipedia, http://en.wikipedia.org/wiki/American_and_British_English_spelling_differences#-ise.2C_-ize_.28-isation.2C_-ization.29

    American spelling accepts only -ize endings in most cases, such as organize, realize, and recognize.[53] British usage accepts both -ize and -ise (organize/organise, realize/realise, recognize/recognise).[53] British English using -ize is known as Oxford spelling, and is used in publications of the Oxford University Press, most notably the Oxford English Dictionary, as well as other authoritative British sources.

> being an entity that is not recognised outside of the FormalOrg
> Organisations can become recognised in some circumstances despite never having solicited outside recognition from a state -- this might happen in a court proceeding after some collective wrongdoing. Conversely you might have something that can behave like a kind of organisation, e.g. a class in a class-action lawsuit without the internal structure present in most organisations.

Yes. In FOAF we have a class foaf:Project but it is not quite clear how best to characteri[sz]e it. In purely FOAF oriented scenarios, I believe it is hardly ever used (although, hmm, stats below seem to contradict that). However, the pretty successful DOAP project ('description of a project') has made extensive use of a subclass, doap:Project, in describing open source collaborative projects. These have something of the character of an organization, but are usually on the bazaar end of the cathedral/bazaar spectrum. Are some but not all projects also organizations? etc. discuss :)

See also
http://xmlns.com/foaf/spec/#term_Project
http://trac.usefulinc.com/doap

http://sindice.com/search?q=foaf:project+qt=term
Search results for terms “foaf:project ”, found about 13.0 thousand (sindice seems to require downcasing for some reason)

http://sindice.com/search?q=doap:project+qt=term
Search results for terms “doap:project ”, found about 8.41 thousand

(I haven't time to dig into those results, probably the queries could be tuned better to filter out some misleading matches)

> Is a state an Organisation?

It would be great to link if possible to FAO's Geopolitical ontology here, see http://en.wikipedia.org/wiki/Geopolitical_ontology ... this for example has a model for groupings that geo-political entities belong to (I'm handwaving a bit here on the detail). It also has a class Organization btw, as well as extensive mappings to different coding systems.

> Organisational units can often be semi-autonomous (e.g. legally recognised) subsidiaries of a parent or holding company. What about quangos or crown-corporations (e.g. corporations owned by the state). They have legal recognition but are really like subsidiaries or units.

As an aside, I would like to have a way of representing boards of directors, to update the old (theyrule-derived) FOAFCorp data and schema. Ancient page here: http://rdfweb.org/foafcorp/intro.html schema http://xmlns.com/foaf/corp/

> Some types of legally recognised organisations don't have a distinct legal personality, e.g. a partnership or unincorporated association so they cannot be said to have rights and responsibilities, rather the members have joint (or joint and several) rights and responsibilities. This may seem like splitting hairs but from a
Re: Organization ontology
On Thu, 2010-06-03 at 14:07 +0100, William Waites wrote:
> On 10-06-03 09:01, Dan Brickley wrote:
>> I don't find anything particularly troublesome about the org: vocab on this front. If you really want to critique culturally-loaded ontologies, I'd go find one that declares class hierarchies with terms like 'Terrorist' without giving any operational definitions...
> I must admit when I looked at the org vocabulary I had a feeling that there were some assumptions buried in it but discarded a couple of draft emails trying to articulate it.
> I think it stems from org:FormalOrganization being a thing that is legally recognized and org:OrganizationalUnit (btw, any particular reason for using the North American spelling here?) being an entity that is not recognised outside of the FormalOrg

org:Organization is useful directly; the two subClasses do not form a covering, that is, they do not exhaust the space. They are just useful distinctions in a broad variety of applications - as indicated by their presence in a number of the ontologies we surveyed [2].

On spelling, to quote from the public design notes [1]:

    Let's get this one out of the way - are we organized or organised? American English demands -ize but both are correct in British English; -ize is preferred by the OED (the Oxford spelling); -ise is preferred by Fowler, The Times and is 50% more common in the British National Corpus. If we want to strive for broad uptake then picking one which is acceptable for all versions of English is the obvious choice so we'll go for -ize. After all, being on the same side as the OED can't be all bad.

> Organisations can become recognised in some circumstances despite never having solicited outside recognition from a state -- this might happen in a court proceeding after some collective wrongdoing. Conversely you might have something that can behave like a kind of organisation, e.g. a class in a class-action lawsuit without the internal structure present in most organisations.

The ontology doesn't talk about having solicited recognition so I don't think that distinction is relevant here. It is up to you, in applying this simple core ontology, whether the distinction between general org:Organization and org:FormalOrganization is useful to your application. The nature of the formality is left fairly open but if it is too constraining then model at org:Organization level.

> Is a state an Organisation?

Yes, whether it is one that you would usefully model using this is a different question.

> Organisational units can often be semi-autonomous (e.g. legally recognised) subsidiaries of a parent or holding company. What about quangos or crown-corporations (e.g. corporations owned by the state). They have legal recognition but are really like subsidiaries or units.

Certainly, there is no requirement that FormalOrganizations can't have other FormalOrganizations as subOrganizations. The containment hierarchy is very open specifically to allow just that sort of structure.

> Some types of legally recognised organisations don't have a distinct legal personality, e.g. a partnership or unincorporated association so they cannot be said to have rights and responsibilities, rather the members have joint (or joint and several) rights and responsibilities. This may seem like splitting hairs but from a legal perspective its an important distinction at least in some legal environments. The description provided in the vocabulary is really only true for corporations or limited companies.

[Aside: I believe that in the UK Partnerships do have some legal recognition, just as Sole Traders do. Partners also have joint and several responsibilities but the Partnership itself is a recognized entity for some purposes.]

It would be great if you could suggest a better phrasing of the description of a FormalOrganization that would better encompass the range of entities you think should go there. Or are you advocating that the distinction between a generic organization and an externally recognized semi-autonomous organization is not a useful one?

> I think the example, eg:contract1 is misleading since this is an inappropriate way to model a contract. A contract has two or more parties. A contract might include a duty to fill a role on the part of one party but it is not normally something that has to do with membership

You are reading way too much into the choice of spelling of a URI! The example is simply to illustrate how the vocabulary should be used to bind a person to an organization in some form of role. I could have used a bNode there. There is nothing in there to model Contracts with a big-C - that would be a whole other ball game! I'll change the name to avoid such confusion.

> Membership usually has a particular meaning as applied to cooperatives and not-for-profits. They usually wring their hands extensively about what exactly membership means. This concept normally doesn't apply to other types of
Re: Organization ontology
Weren't these details of the discussion the sort of Mission Creep the org vocabulary was meant to avoid?

Certainly NGOs, including Commercial Interests, would like nothing better than to ride the trustworthiness coattails of a Geo-Political State. But the State is trustworthy precisely because it does not render services to groups, averages or price points, but rather to individuals. Current Industry Standards simply do not protect personal privacy adequately while Government Standards must do so.

The org vocabulary has no provision for redaction of what might be private personal information after the next election. But that is not necessary if one is only making the general distinction between Official Function and Functional Office. The problem arises with the introduction of Office Function.

Forgive me for arguing semantics :)

--- On Thu, 6/3/10, Dan Brickley dan...@danbri.org wrote:
>> On 10-06-03 09:01, Dan Brickley wrote:
>>> I don't find anything particularly troublesome about the org: vocab on this front. If you really want to critique culturally-loaded ontologies, I'd go find one that declares class hierarchies with terms like 'Terrorist' without giving any operational definitions...
>> I must admit when I looked at the org vocabulary I had a feeling that there were some assumptions buried in it but discarded a couple of draft emails trying to articulate it.
>> I think it stems from org:FormalOrganization being a thing that is legally recognized and org:OrganizationalUnit (btw, any particular reason for using the North American spelling here?)
> Re spelling - fair question. I think there are good reasons. British spelling accepts both. FOAF, which was made largely in Bristol UK but with international participants, has used 'Z' spelling for nearly a decade, http://xmlns.com/foaf/spec/#term_Organization ... as far as I know without any complaints.
Re: Organization ontology
Dave,

Does this mean that no sample data has been created yet, or that samples used in the course of development are not data that you are free to share?

thanks,
Bob

Dave Reynolds wrote:
> On Thu, 2010-06-03 at 09:29 -0400, Bob DuCharme wrote:
>> Is any sample instance data available, whether it's using real or fake organizations?
> Not yet, but there will be.
> Dave
Re: Discogs Linked Data
Hello,

> Does anyone know the state of play wrt a linked dataset describing Discogs (the music/record site)?

I've spent some time w/ Discogs stuff - it needs some work. The links to DBpedia are broken b/c of some capitalization errors, and the artist URIs and foaf:names are a bit borked b/c the underlying data has some unicode errors (two bytes v one byte unicode not handled properly)

> There have always been Virtuoso Sponger [1] Cartridges (Basic and Meta) for Discogs. Examples:
> 1. http://linkeddata.uriburner.com/about/id/entity/http/www.discogs.com/artist/Stevie+Wonder -- Stevie Wonder
> 2. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger -- Virtuoso Sponger Middleware

you might see if you can just use these. otherwise the ruby code is around somewhere on the talis platform site. no time to find it now - i've got a new born to look after :-)

-kurt j
Re: Discogs Linked Data
The main thing lacking now, I think, is links [1] to MusicBrainz, and I don't think that can be done without a dump. Apart from that, the mappings are also incomplete. The ruby code can be found at dataincubator [2].

1. http://blog.dbtune.org/post/2007/06/11/Linking-open-data%3A-interlinking-the-Jamendo-and-the-Musicbrainz-datasets
2. http://code.google.com/p/dataincubator/source/browse/trunk/#trunk/discogs

Cheers,
Mats

On Thu, Jun 3, 2010 at 10:45 PM, Kurt J kur...@gmail.com wrote:
> Hello,
>> Does anyone know the state of play wrt a linked dataset describing Discogs (the music/record site)?
> I've spent some time w/ Discogs stuff - it needs some work. The links to DBpedia are broken b/c of some capitalization errors, and the artist URIs and foaf:names are a bit borked b/c the underlying data has some unicode errors (two bytes v one byte unicode not handled properly)
>> There have always been Virtuoso Sponger [1] Cartridges (Basic and Meta) for Discogs. Examples:
>> 1. http://linkeddata.uriburner.com/about/id/entity/http/www.discogs.com/artist/Stevie+Wonder -- Stevie Wonder
>> 2. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger -- Virtuoso Sponger Middleware
> you might see if you can just use these. otherwise the ruby code is around somewhere on the talis platform site. no time to find it now - i've got a new born to look after :-)
> -kurt j
Re: Organization ontology
On Thu, 2010-06-03 at 12:41 -0400, Bob DuCharme wrote:
> Dave,
> Does this mean that no sample data has been created yet, or that samples used in the course of development are not data that you are free to share?

Given the rather ... short ... timescale we were working under, the sketchy examples used in the course of development are not in a fit state to publish as examples of how to do things. There are several strands of work going on applying and specializing the ontology to real data and that will, I hope, result in publishable examples soon.

Possibly, given that this work seems to have struck a chord with people, it might be worth generating a worked example sooner that isn't encumbered by the quality and completeness requirements that the real data has. Will think about that.

Cheers,
Dave
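For readers who want something concrete in the meantime, here is a minimal sketch of what such instance data could look like. It is not from the ontology authors. It uses only terms named in this thread (org:FormalOrganization, org:OrganizationalUnit, subOrganizationOf) and shows a FormalOrganization nested inside another FormalOrganization, which the thread says is allowed. Two things are assumptions: the namespace `http://www.w3.org/ns/org#` is the one used by the later W3C version of the vocabulary and may differ from the 2010 draft discussed here, and the `example.org` organizations are invented.

```python
ORG = "http://www.w3.org/ns/org#"   # assumed namespace (later W3C version)
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/org/"      # hypothetical instance namespace

# Invented sample data: a holding company, a legally recognised subsidiary,
# and an internal unit of that subsidiary.
triples = [
    (EX + "acme", RDF_TYPE, ORG + "FormalOrganization"),
    (EX + "acme-uk", RDF_TYPE, ORG + "FormalOrganization"),
    (EX + "acme-uk-finance", RDF_TYPE, ORG + "OrganizationalUnit"),
    (EX + "acme-uk", ORG + "subOrganizationOf", EX + "acme"),
    (EX + "acme-uk-finance", ORG + "subOrganizationOf", EX + "acme-uk"),
]


def to_ntriples(ts):
    """Serialise (subject, predicate, object) URI triples as N-Triples."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in ts)


def ancestors(org, ts):
    """Follow subOrganizationOf links upwards from `org`.
    Assumes at most one parent per organization; stops if a cycle appears."""
    parent = {s: o for s, p, o in ts if p == ORG + "subOrganizationOf"}
    chain = []
    while org in parent and parent[org] not in chain:
        org = parent[org]
        chain.append(org)
    return chain


print(to_ntriples(triples))
print(ancestors(EX + "acme-uk-finance", triples))
```

Modelling only at the general org:Organization level, as suggested earlier in the thread, would just mean swapping the class URIs in the first three triples.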
Re: Discogs Linked Data
> this is a data set i really want too
> somebody know a way around the unicode problem???

Maybe find stuff like these &#195;&#175; with a regexp and then replace them with the correct unicode chars. In Python something like this looped through each line of the files should work I think:

    import re

    teststr = 'Tcha&#195;&#175;kovsky'
    # exactly two adjacent decimal character references, e.g. &#195;&#175;
    regex = re.compile(r'(?<!&#\d{3};)&#(\d{3});&#(\d{3});(?!&#\d{3};)')

    def fix(match):
        # treat the two numbers as UTF-8 bytes and decode them together
        return bytes(int(n) for n in match.groups()).decode('utf8')

    print(regex.sub(fix, teststr))

output:

    Tchaïkovsky

I've posted a more complete version here http://pastebin.com/vuq72irC

Cheers,
Mats
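A slightly more general variant of the same idea is sketched below. It is not the pastebin version, which is not reproduced here. It assumes the broken sequences are runs of decimal character references of the form `&#195;&#175;` (the ampersands seem to have been lost in some archived copies of this message), that each number is really one byte of a UTF-8 sequence, and that anything which does not decode cleanly should be left untouched. The function name `repair` is made up for this example.

```python
import re

# One or more adjacent decimal character references, e.g. "&#195;&#175;"
ENTITY_RUN = re.compile(r"(?:&#\d{1,3};)+")


def repair(text):
    """Replace runs of &#NNN; that form valid UTF-8 with the decoded text."""
    def fix(match):
        numbers = [int(n) for n in re.findall(r"\d+", match.group())]
        try:
            return bytes(numbers).decode("utf-8")
        except ValueError:
            # Value above 255 or not valid UTF-8: leave the run as it was.
            # (UnicodeDecodeError is a subclass of ValueError.)
            return match.group()
    return ENTITY_RUN.sub(fix, text)


print(repair("Tcha&#195;&#175;kovsky"))  # Tchaïkovsky
```

Because a lone `&#233;` is not valid UTF-8 on its own, a correctly encoded Latin-1 style reference is left alone rather than mangled, which is why the decode is wrapped in a try/except instead of trusting every match.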