Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-12-07 Thread Andrea Perego

Dear Sebastian, dear Kay,

I'm not sure this has already been mentioned in the thread, but another 
service providing bulk download is GLEIF (Global Legal Entity Identifier 
Foundation):


https://www.gleif.org/en/lei-data/gleif-concatenated-file/lei-download

They now have around 400K records.

GLEIF also operates HTTP URIs for companies / organisations with a 
LEI (Legal Entity Identifier). Those URIs return just HTML, but other 
formats are available from their discovery page - the search result can 
be exported as .xls, .csv, .xml, and .json - see:


https://www.gleif.org/lei/search
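
When records from such an export are linked to other datasets, it can help to sanity-check each LEI before using it as a join key. The following is a minimal sketch, assuming the LEI structure defined in ISO 17442 (20 alphanumeric characters, the last two being ISO 7064 MOD 97-10 check digits). The function names are illustrative and not part of any GLEIF tooling.

```python
import re

# Shape of an LEI: 18 alphanumeric characters followed by 2 numeric check digits.
_LEI_PATTERN = re.compile(r"[A-Z0-9]{18}[0-9]{2}")


def _to_digits(text: str) -> int:
    """Map 0-9 to themselves and A-Z to 10-35, then read the result as one integer."""
    return int("".join(str(int(ch, 36)) for ch in text))


def lei_check_digits(prefix: str) -> str:
    """Compute the two check digits for an 18-character LEI prefix."""
    if not re.fullmatch(r"[A-Z0-9]{18}", prefix):
        raise ValueError("expected 18 upper-case alphanumeric characters")
    remainder = _to_digits(prefix + "00") % 97
    return f"{98 - remainder:02d}"


def is_valid_lei(lei: str) -> bool:
    """Return True if the string has LEI shape and a correct MOD 97-10 checksum."""
    return bool(_LEI_PATTERN.fullmatch(lei)) and _to_digits(lei) % 97 == 1
```

A passing check only says the identifier is well-formed; whether it is actually registered still has to be looked up in the GLEIF data.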

Cheers,

Andrea


On 03/11/2015 16:17, Sebastian Hellmann wrote:

[Apologies for cross-posting]

Dear all,
this message is part announcement of an open data initiative and part
call for feedback and support.

We are considering creating a free, open and interoperable
dataset on companies and organisations, which we are planning to
integrate into DBpedia+ and offer as a dump download. As we are in a very
early phase of the endeavour, we would like to know whether there is
existing work in this area.

We are looking for any available datasets which have information about
companies and other organizations in any language and any country.
Ideally, the datasets are:
1. downloadable as dump
2. openly licensed, e.g. CC-BY, following the http://opendefinition.org/
3. in an easily parseable format, e.g. RDF or CSV and not PDF

But hey! Send around anything you know, and we will look at it and see
whether we can make use of it. You can reach us either by replying to
this email or by sending feedback directly to me and Kay Müller
<kay.muel...@informatik.uni-leipzig.de>.
If you have any private/closed data, please contact us as well. We might
use it to cross-reference and validate public/open data.
Or just learn from it to build a good scheme.

We started a link collection here (and attached the current status at
the end of this email)
https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
Also we started to collect potential identifiers for linking here:
https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0

Regards and thank you for any support on this,
Sebastian and Kay

##

https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit

Open Company Data

Identifiers for companies/organisation

Table with identifiers:
https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0

URIs (Linked Data/Semantic Web)

  * DBpedia/Wikipedia/Wikidata URIs - http://dbpedia.org
  * LinkedGeoData - http://linkedgeodata.org/

Downloadable Datasets with Company info (confirmed)

  * VIAF - http://viaf.org/viaf/data/
  * DBpedia - http://downloads.dbpedia.org/current/core/
  * Wikidata - http://downloads.dbpedia.org/current/ext/wikidata/
  * LinkedGeoData - http://downloads.linkedgeodata.org/releases/
  * Company Data Index: http://index.okfn.org/dataset/companies/
      o e.g. UK company data: http://download.companieshouse.gov.uk/en_output.html

Portals with no bulk downloads

  * https://opencorporates.com/
  * http://registries.opencorporates.com/

Portals, we will still need to investigate

  * https://www.wlw.de/
  * https://www.crunchbase.com
  * http://data.crunchbase.com/v3/page/crunchbase-open-data-map-odm


Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-13 Thread Neubert, Joachim
Hi Sebastian, Kay,

If you are interested in historical company data, perhaps have a look at the 
company section of the 20th Century Press Archives 
(http://zbw.eu/beta/p20/company). It contains basic metadata about the 
companies, with links to scanned versions of annual reports and press articles 
about these companies. Currently, ca. 1400 companies are covered, mostly from 
Germany, the Netherlands, Czech, Austria and Poland, with material dating 
mostly from the first half of the 20th century.

The dataset is available as XHTML/RDFa. It is currently not explicitly 
licensed, but I think that the metadata (unlike the scanned images, where 
intellectual property is dispersed) could be openly licensed and published with 
links to the corresponding GND/VIAF URIs - if there is express interest in using 
it in a setting such as the one you describe.

Cheers, Joachim

More info: http://challenge.semanticweb.org/submissions/swc2010_submission_6.pdf


Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-06 Thread Rolf Kleef
Hi Sebastian, Kay,

If you haven't done it yet, I suggest getting in touch with Chris
Taggart of Open Corporates (cc'd). He has years of experience doing
this, and is also involved in cross-standards work on "organisational
identifiers", crucial in the development of, for instance, the Open
Contracting Data Standard and the International Aid Transparency Initiative:

http://www.open-contracting.org/
http://iatistandard.org/201/organisation-identifiers/

~~Rolf.


Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-06 Thread Gannon Dick
Hi all,

Organizational Identifiers are a bit dangerous for the little people to talk 
about :-)

1) First, some food for thought ... if FOAF identifies real people rigorously, 
one would think complexity less and convergence faster for many fewer 
organizations.  That would make no sense, unless (reads manual).
2) Second, an observation ... is the "Open World" assumption an HTML ordered 
list or an HTML unordered list ?  Who decides ? Hint: Moses had 10 
Commandments, but plainly meant an unordered list.  Even the most (hardened 
agnostic) developer should be able to admit that 10 Commandments in an 
unordered list and 10 Items in an ordered list is not a valid substitution 
pattern. Learning to love Turtle is not a resolution to this dilemma, BTW.
3) Strategy Markup Language (StratML) Collections resolve these issues by using 
a compound key:
a) Organization Name -> Acronym (caps of Proper Case)
b) Subdivision Name -> UUID, Acronym (caps of Proper Case)
c) (StratML (XML) File Name) ->   (Acronym from a) DOT (Acronym from b) DOT xml
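
The naming rule in a) to c) can be sketched as follows. This is one reading of "caps of Proper Case" (keep the upper-case letters of a name that is already written in Proper Case), not an official StratML implementation; the function names and the example names are illustrative assumptions. The UUID from b) is not part of the file name in c), so it is left out here.

```python
def acronym(name: str) -> str:
    """Keep only the capital letters of a name written in Proper Case."""
    return "".join(ch for ch in name if ch.isupper())


def stratml_file_name(organization: str, subdivision: str) -> str:
    """Join the two acronyms with dots and add the xml extension, as in rule c)."""
    return f"{acronym(organization)}.{acronym(subdivision)}.xml"


# Hypothetical example names, chosen only to illustrate the rule.
print(stratml_file_name("Thomson Reuters", "Data Innovation Lab"))
```

Note that two organisations with the same initials would collide under this rule, which is presumably why b) also carries a UUID.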

This can enable styling within the Core by CSS or XSLT while maintaining 
Collection integrity because an OUTER JOIN on Organization Name preserves a 
collection of right-directed graphs.  If this sounds like slavery to you, take 
a nap, "they" can't own your dreams ;-)

--Gannon


Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-05 Thread Jerven Bolleman
I just saw this, which might be interesting: http://api.opencorporates.com/
(via
http://www.theguardian.com/odine-partner-zone/2015/nov/04/winners-second-call-odine-call-open-data-incubator-programme-europe)


Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-05 Thread Kingsley Idehen
On 11/5/15 9:31 AM, brian.uli...@thomsonreuters.com wrote:
> I have uploaded my paper at the METHOD 2015 Workshop at this year's
> ISWC here:
>
> http://www.researchgate.net/publication/283500696_Constructing_Knowledge_Graphs_with_Trust#share
>
> It explains the rationale behind Thomson Reuters'  company open permid
> URIs and the advantages they have over rival identifier schemes like
> DUNS numbers, company websites, DBpedia URIs, etc.
>
> Brian Ulicny, PhD
> Director, Data Science
> Data Innovation Lab
> Thomson Reuters
> 22 Thomson Pl
> Boston, MA 02210

Brian,

When will there be any combination of the following, from this data space:

[1] SPARQL Endpoint
[2] Data Dump in one or more of the standard RDF document formats?

Working through an API is too constraining, hence the need for the
additional flexibility provided by the items above.


Kingsley

Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-05 Thread Sebastian Hellmann

Hi Chris,

> However, making sense of this data is very, very time consuming, not 
> to mention that writing and maintaining bots (we now have hundreds and 
> hundreds of them) to scrape jurisdictions that aren't open data (the 
> vast majority) takes significant resources, and we don't see any way 
> of sustaining this on a CC-BY licence.

a) Is the code for these bots available somewhere?
b) We hope to find a way to maintain it this time. DBpedia has received 
funding via http://smartdataweb.de/ and also http://aligned-project.eu/
We are also currently working on a charter for a non-profit 
association that is committed to keeping all data open under CC-BY (we are 
accepting donations and membership fees, among other things).


> I could also write a book about corporate identifiers, and the issues 
> with those on the list (but don't have time).

We are writing such a book* in parallel, do you want to help?
Sebastian

*= well, it's just a paper


On 05.11.2015 19:18, Chris Taggart wrote:

Rolf etc

Thanks for cc'ing me. We'd had contact from Sebastian and given him an 
API key. The main issues here are sustainability and domain knowledge. 
We'd love more people to be downloading the open datasets from the UK 
and others, and using them in all sorts of innovative ways, and the 
main reason we do the Open Company Data Index is to motivate company 
registers to open up their data (I was speaking at the Open Govt 
Partnership Summit in Mexico City last week on the same subject). 
However, making sense of this data is very, very time consuming, not 
to mention that writing and maintaining bots (we now have hundreds and 
hundreds of them) to scrape jurisdictions that aren't open data (the 
vast majority) takes significant resources, and we don't see any way 
of sustaining this on a CC-BY licence.


Finally, there are very few registers that are CC-BY licensed or less 
restrictive (for example Denmark places restrictions on use for 
marketing), even ignoring DPA issues (we are now spending a considerable 
amount on legal fees on this issue). I could also write a book about 
corporate identifiers, and the issues with those on the list (but don't 
have time).


So, we'd love to see more activity in the area, particularly in 
Germany – where the Handelsregister and Bundesanzeiger are very 
definitely not open data  ;-)


Chris


RE: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-04 Thread Brian.Ulicny
Hi Sebastian,

Thomson Reuters offers a bulk download and API for company identifiers at the 
level of legal entities here:

https://permid.org


The PermID Service lets you utilize the permanent ID of 3.5 million 
organizations, 240K equity instruments and 1.17 million equity quotes from the 
Thomson Reuters core entity data set. PermID Service provides access to Thomson 
Reuters permanent identifiers (permanent unique IDs formatted as Uniform 
Resource Identifiers<http://en.wikipedia.org/wiki/Uniform_resource_identifier>) 
along with the associated descriptive fields that Thomson Reuters exposes to 
the public.

(Accessible by appending ?format=turtle to the URI: e.g. 
https://permid.org/1-4295861160?format=turtle returns triples about the status 
and location of Thomson Reuters Corp.)  Unfortunately, there is not 
currently a SPARQL endpoint for this information.
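
As a rough illustration of the "?format=turtle" convention described above, here 
is a minimal Python sketch. The function names turtle_url and fetch_turtle are 
made up for this example, the Accept header is an assumption, and the service 
may require registration or an access token that is not shown here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request, urlopen


def turtle_url(permid_uri: str) -> str:
    """Return the given PermID URI with format=turtle in its query string."""
    parts = urlsplit(permid_uri)
    # Drop any existing "format" parameter so that it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "format"]
    query.append(("format", "turtle"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_turtle(permid_uri: str, timeout: float = 10.0) -> str:
    """Download the Turtle text for one entity (this performs a network call)."""
    # The Accept header is an assumption, not documented behaviour.
    request = Request(turtle_url(permid_uri), headers={"Accept": "text/turtle"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds the URL; call fetch_turtle() to actually retrieve triples.
    print(turtle_url("https://permid.org/1-4295861160"))
```

The returned text could then be handed to any Turtle parser for further 
processing.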

The descriptive fields enable the user to verify that a consumed permID 
represents the entity of interest.

The data is live; records are updated every 15 minutes.

In the future, PermID Service will support additional entities such as People, 
Fixed Income Instruments, Fixed Income Quote, and more.

The identifiers are also produced as output in the Open Calais Tagger, with an 
API and interface at the same site.  That is, if a string is identified in free 
text as denoting a company with an open permid, the open permid URI is returned 
as part of the RDF output.

Bulk Files Content and Format

The Open PermID database is licensed under the Creative Commons with 
Attribution license, version 4.0 (CC-BY).

An extended set of fields is also available under the Creative Commons 
Non-Commercial license (CC-NC 3.0).

Plain language summaries of these licenses are available on the Creative 
Commons website.

Supported Formats: Turtle and NTriples (See Appendix F for an example in ttl 
format.)
Coverage: The same records provided by the Entity Search API (see terms of 
use<https://permid.org/terms>).

Frequency: New bulk files will be published once a week.
Incremental Updates: Once the bulk files are consumed, the subsequent 
incremental updates can be consumed via our Atom Feed.

Let me know if you have any questions.

The files are available for download via the Open PermID<https://permid.org/> 
website.

Best regards,

Brian Ulicny, PhD
Director, Data Science
Data Innovation Lab
Thomson Reuters Corp.
22 Thomson Pl
Boston, MA 02210

From: Sebastian Hellmann [hellm...@informatik.uni-leipzig.de]
Sent: Tuesday, November 03, 2015 9:17 AM
To: public-lod
Subject: Are there any datasets about companies? ( DBpedia Open Data Initiative)

[Apologies for cross-posting]

Dear all,
this message is part announcement of an open data initiative and part call for 
feedback and support.

We are considering to work on creating a free, open and interoperable dataset 
on companies and organisations, which we are planing to integrate into DBpedia+ 
and offer as dump download. As we are in a very early phase of the endeavour, 
we would like to know whether there is existing work in this area.

We are looking for any available datasets which have information about 
companies and other organizations in any language and any country. Ideally, the 
datasets are:
1. downloadable as dump
2. openly licensed , e.g. CC-BY following the http://opendefinition.org/
3. in an easily parseable format, e.g. RDF or CSV and not PDF

But hey! Send around anything you know, and we will look at it and see whether 
we can make use of it. You can reach us either by replying  to this email or 
send feedback directly to me and Kay Müller 
<kay.muel...@informatik.uni-leipzig.de>.
If you have any private/closed data, please contact us as well. We might make 
use of it to cross-reference and validate public/open data with it. Or just 
learn from it to build a good scheme.

We started a link collection here (and attached the current status at the end 
of this email)
https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
Also we started to collect potential identifiers for linking here:
https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0

Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-03 Thread Sebastian Hellmann

[Apologies for cross-posting]

Dear all,
this message is part announcement of an open data initiative and part 
call for feedback and support.


We are considering creating a free, open and interoperable 
dataset on companies and organisations, which we are planning to 
integrate into DBpedia+ and offer as a dump download. As we are in a very 
early phase of the endeavour, we would like to know whether there is 
existing work in this area.


We are looking for any available datasets which have information about 
companies and other organizations in any language and any country. 
Ideally, the datasets are:

1. downloadable as dump
2. openly licensed, e.g. CC-BY, following the http://opendefinition.org/
3. in an easily parseable format, e.g. RDF or CSV and not PDF
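
The three criteria above could be checked mechanically when triaging candidate 
sources. The following is only a sketch under stated assumptions: the class 
CandidateDataset, the function unmet_criteria, and the two lookup sets are 
hypothetical names, and the license and format lists are illustrative, not 
complete.

```python
from dataclasses import dataclass, field

# Illustrative, incomplete lists (assumptions made for this sketch).
OPEN_LICENSES = {"CC0", "CC-BY", "CC-BY-SA", "ODBL", "PDDL"}
PARSEABLE_FORMATS = {"rdf", "turtle", "ntriples", "csv"}


@dataclass
class CandidateDataset:
    name: str
    has_bulk_download: bool
    license: str
    formats: set = field(default_factory=set)


def unmet_criteria(dataset: CandidateDataset) -> list:
    """Return the criteria (1-3 above) that the dataset does not satisfy."""
    problems = []
    if not dataset.has_bulk_download:
        problems.append("not downloadable as dump")
    if dataset.license.upper() not in OPEN_LICENSES:
        problems.append("not openly licensed")
    if not {f.lower() for f in dataset.formats} & PARSEABLE_FORMATS:
        problems.append("no easily parseable format")
    return problems


if __name__ == "__main__":
    example = CandidateDataset("Example registry", True, "CC-BY", {"CSV", "PDF"})
    print(unmet_criteria(example))
```

An empty result means the candidate meets all three criteria; anything else 
lists what would still need to be clarified with the data provider.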

But hey! Send around anything you know, and we will look at it and see 
whether we can make use of it. You can reach us either by replying to 
this email or by sending feedback directly to me and Kay Müller.
If you have any private/closed data, please contact us as well. We might 
make use of it to cross-reference and validate public/open data. 
Or just learn from it to build a good scheme.


We started a link collection here (and attached the current status at 
the end of this email)

https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
Also we started to collect potential identifiers for linking here:
https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0

Regards and thank you for any support on this,
Sebastian and Kay

##

https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit


Open Company Data

 * Identifiers for companies/organisation
 * URIs (Linked Data/Semantic Web)
 * Downloadable Datasets with Company info (confirmed)
 * Portals with no bulk downloads
 * Portals, we will still need to investigate


Identifiers for companies/organisation

Table with identifiers:
https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0


URIs (Linked Data/Semantic Web)

 * DBpedia/Wikipedia/Wikidata URIs - http://dbpedia.org
 * LinkedGeoData - http://linkedgeodata.org/


Downloadable Datasets with Company info (confirmed)

 * VIAF - http://viaf.org/viaf/data/
 * DBpedia - http://downloads.dbpedia.org/current/core/
 * Wikidata - http://downloads.dbpedia.org/current/ext/wikidata/
 * LinkedGeoData - http://downloads.linkedgeodata.org/releases/
 * Company Data Index: http://index.okfn.org/dataset/companies/
    o e.g. UK company data:
      http://download.companieshouse.gov.uk/en_output.html


Portals with no bulk downloads

 * https://opencorporates.com/
 * http://registries.opencorporates.com/


Portals, we will still need to investigate

 * https://www.wlw.de/
 * https://www.crunchbase.com
 * http://data.crunchbase.com/v3/page/crunchbase-open-data-map-odm
 * http://www.industrystock.de
 * http://www.ebr.org/
 * https://simfin.com/data/browse/companies
 * http://c-lei.org/
 * http://data.imf.org/
 * http://worldbank.270a.info/.html
 * http://datacatalog.worldbank.org/
 * http://www.europages.com/
 * http://www.sec.gov/data
 * http://faculty.philau.edu/russowl/industry.html
 * USA: http://www.corporationwiki.com/
 * India: http://www.companywiki.in/
 * Handelsregister: www.Handelsregister.de
 * Creditreform: http://www.creditsafetrial.com/de/?country=DE
 * Bürgel: https://www.buergel.de/en
 * Factiva:
   https://global.factiva.com/factivalogin/login.asp?productname=global


Interesting Links:

 * German:
   http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-1/
 * http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-2/

--
Sebastian Hellmann
AKSW/KILT research group
Institute for Applied Informatics (InfAI) at Leipzig University
DBpedia Association
Events:
* *Nov 20th, 2015* Extended Deadline for Quality Management of Semantic 
Web Assets (Data, Services and Systems) 

Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-03 Thread Giovanni Tummarello
Hi Sebastian, just for context

(I am collaborating with a leading market data provider): there are 17 M+
organizations in Italy alone (either alive or dead, but maybe still worth
keeping in a database).

Maybe, just maybe, it's worth talking to some of these organizations and
campaigning for the opening up of a super minimal dataset, e.g. just name,
registration city, status dead or alive.
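
To show how small such a dataset could be, here is a minimal Python sketch of
a record with just those three fields, serialised as CSV. All names in it
(MinimalCompanyRecord, Status, to_csv) and the sample values are hypothetical
and only serve as an illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass
from enum import Enum


class Status(str, Enum):
    ALIVE = "alive"
    DEAD = "dead"


@dataclass(frozen=True)
class MinimalCompanyRecord:
    name: str
    registration_city: str
    status: Status


def to_csv(records) -> str:
    """Serialise the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["name", "registration_city", "status"],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["status"] = record.status.value  # write "alive"/"dead", not the enum
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [MinimalCompanyRecord("Example S.r.l.", "Trento", Status.ALIVE)]
    print(to_csv(sample), end="")
```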

The rationale is that they could receive more hits from paying customers
who want all the "rest of the data".

But it will be quite difficult: one has to come up with a good pitch and have
a lot of patience. Consider that PermID seems to have one such super open
dataset, so maybe that's a starting point.

Self-catered "add your company" approaches are not going to work, in my
opinion.

Gio

On Tue, Nov 3, 2015 at 2:05 PM, Nandana Mihindukulasooriya <
nmihi...@fi.upm.es> wrote:

> Hi Sebastian,
>
> Open PermID and Open Calais [1,2] initiatives from Thomson Reuters with
> Linked Data + bulk download (CC-BY 4.0) might be of interest to your
> work. Brian Ulicny presented it at ISWC 2015 [3], and it has identifiers
> curated and maintained by Thomson Reuters for more than 3.5 million
> organizations.
>
> It also has several useful pieces of information about those organizations:
> http://tinyurl.com/permid-org-properties
> http://tinyurl.com/permid-triple-patterns
>
> Best Regards,
> Nandana
>
> [1] https://permid.org/faq
> [2] http://www.opencalais.com/about/
> [3] https://twitter.com/nandanamihindu/status/653232796874506240
>
> On Tue, Nov 3, 2015 at 4:17 PM, Sebastian Hellmann <
> hellm...@informatik.uni-leipzig.de> wrote:
>
>> [Apologies for cross-posting]
>>
>> Dear all,
>> this message is part announcement of an open data initiative and part
>> call for feedback and support.
>>
>> We are considering to work on creating a free, open and interoperable
>> dataset on companies and organisations, which we are planing to integrate
>> into DBpedia+ and offer as dump download. As we are in a very early phase
>> of the endeavour, we would like to know whether there is existing work in
>> this area.
>>
>> We are looking for any available datasets which have information about
>> companies and other organizations in any language and any country. Ideally,
>> the datasets are:
>> 1. downloadable as dump
>> 2. openly licensed , e.g. CC-BY following the
>> http://opendefinition.org/
>> 3. in an easily parseable format, e.g. RDF or CSV and not PDF
>>
>> But hey! Send around anything you know, and we will look at it and see
>> whether we can make use of it. You can reach us either by replying  to this
>> email or send feedback directly to me and Kay Müller
>> 
>> .
>> If you have any private/closed data, please contact us as well. We might
>> make use of it to cross-reference and validate public/open data with it. Or
>> just learn from it to build a good scheme.
>>
>> We started a link collection here (and attached the current status at the
>> end of this email)
>>
>> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
>> Also we started to collect potential identifiers for linking here:
>>
>> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
>>
>> Regards and thank you for any support on this,
>> Sebastian and Kay
>>
>> ##
>>
>>
>> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
>>
>> * Open Company Data Open Company Data
>> 
>> Identifiers for companies/organisation
>> 
>> URIs (Linked Data/Semantic Web)
>> 
>> Downloadable Datasets with Company info (confirmed)
>> 
>> Portals with no bulk downloads
>> 
>> Portals, we will still need to investigate
>> 
>> Identifiers for companies/organisation Table with identifiers:
>> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
>> 
>> URIs (Linked Data/Semantic Web) - DBpedia/Wikipedia/Wikidata URIs -
>> 

Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-03 Thread Alfredo Serafini
Hi, I love the idea! Thanks for sharing.

(I see references to OpenCorporates, but it uses an API, if I'm not wrong?
http://api.opencorporates.com/)

How can someone contribute, apart from suggesting new sources?
(Sadly, in Italy we still lack an open index of companies, and
what we could do is combine information from various sources, with many
problems in terms of licences / provenance / etc.)

I suggest putting the whole collection of sources directly on GitHub (or
similar), because it could help in managing the various contributions!

Alfredo


2015-11-03 16:17 GMT+01:00 Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de>:

> [Apologies for cross-posting]
>
> Dear all,
> this message is part announcement of an open data initiative and part call
> for feedback and support.
>
> We are considering to work on creating a free, open and interoperable
> dataset on companies and organisations, which we are planing to integrate
> into DBpedia+ and offer as dump download. As we are in a very early phase
> of the endeavour, we would like to know whether there is existing work in
> this area.
>
> We are looking for any available datasets which have information about
> companies and other organizations in any language and any country. Ideally,
> the datasets are:
> 1. downloadable as dump
> 2. openly licensed , e.g. CC-BY following the 
> http://opendefinition.org/
> 3. in an easily parseable format, e.g. RDF or CSV and not PDF
>
> But hey! Send around anything you know, and we will look at it and see
> whether we can make use of it. You can reach us either by replying  to this
> email or send feedback directly to me and Kay Müller
> 
> .
> If you have any private/closed data, please contact us as well. We might
> make use of it to cross-reference and validate public/open data with it. Or
> just learn from it to build a good scheme.
>
> We started a link collection here (and attached the current status at the
> end of this email)
>
> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
> Also we started to collect potential identifiers for linking here:
>
> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
>
> Regards and thank you for any support on this,
> Sebastian and Kay
>
> ##
>
>
> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
>
> * Open Company Data Open Company Data
> 
> Identifiers for companies/organisation
> 
> URIs (Linked Data/Semantic Web)
> 
> Downloadable Datasets with Company info (confirmed)
> 
> Portals with no bulk downloads
> 
> Portals, we will still need to investigate
> 
> Identifiers for companies/organisation Table with identifiers:
> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
> 
> URIs (Linked Data/Semantic Web) - DBpedia/Wikipedia/Wikidata URIs -
> http://dbpedia.org  - LinkedGeoData
> - http://linkedgeodata.org/
>  Downloadable Datasets with Company info
> (confirmed) - VIAF - http://viaf.org/viaf/data/
>  - DBpedia -
> http://downloads.dbpedia.org/current/core/
>  - Wikidata -
> http://downloads.dbpedia.org/current/ext/wikidata/
>  - LinkedGeoData -
> http://downloads.linkedgeodata.org/releases/
>  - Company Data Index:
> http://index.okfn.org/dataset/companies/
>  - e.g. UK company data:
> 

Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

2015-11-03 Thread Daniel Hladky
Sebastian,

Here are a few thoughts:

In Switzerland you can try to scrape http://www.zefix.ch/

Otherwise, the other company databases that I am aware of are under
commercial license.

Swiss commercial data providers are:
http://www.bisnode.de/product/firmendatenbank/
http://ch.kompass.com/

On a worldwide scale, you can buy data from Bloomberg and the like.

Besides the known OpenCorporates, DBpedia, etc. approaches, I am not aware of
another open DB.
In general, this is big business for certain company data providers.

Cheers, Daniel


On Tue, Nov 3, 2015 at 4:17 PM, Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> [Apologies for cross-posting]
>
> Dear all,
> this message is part announcement of an open data initiative and part call
> for feedback and support.
>
> We are considering to work on creating a free, open and interoperable
> dataset on companies and organisations, which we are planing to integrate
> into DBpedia+ and offer as dump download. As we are in a very early phase
> of the endeavour, we would like to know whether there is existing work in
> this area.
>
> We are looking for any available datasets which have information about
> companies and other organizations in any language and any country. Ideally,
> the datasets are:
> 1. downloadable as dump
> 2. openly licensed , e.g. CC-BY following the 
> http://opendefinition.org/
> 3. in an easily parseable format, e.g. RDF or CSV and not PDF
>
> But hey! Send around anything you know, and we will look at it and see
> whether we can make use of it. You can reach us either by replying  to this
> email or send feedback directly to me and Kay Müller
> 
> .
> If you have any private/closed data, please contact us as well. We might
> make use of it to cross-reference and validate public/open data with it. Or
> just learn from it to build a good scheme.
>
> We started a link collection here (and attached the current status at the
> end of this email)
>
> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
> Also we started to collect potential identifiers for linking here:
>
> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
>
> Regards and thank you for any support on this,
> Sebastian and Kay
>
> ##
>
>
> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
>
> * Open Company Data Open Company Data
> 
> Identifiers for companies/organisation
> 
> URIs (Linked Data/Semantic Web)
> 
> Downloadable Datasets with Company info (confirmed)
> 
> Portals with no bulk downloads
> 
> Portals, we will still need to investigate
> 
> Identifiers for companies/organisation Table with identifiers:
> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
> 
> URIs (Linked Data/Semantic Web) - DBpedia/Wikipedia/Wikidata URIs -
> http://dbpedia.org  - LinkedGeoData
> - http://linkedgeodata.org/
>  Downloadable Datasets with Company info
> (confirmed) - VIAF - http://viaf.org/viaf/data/
>  - DBpedia -
> http://downloads.dbpedia.org/current/core/
>  - Wikidata -
> http://downloads.dbpedia.org/current/ext/wikidata/
>  - LinkedGeoData -
> http://downloads.linkedgeodata.org/releases/
>  - Company Data Index:
> http://index.okfn.org/dataset/companies/
>  - e.g. UK company data:
> http://download.companieshouse.gov.uk/en_output.html
>