Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-24 Thread Leigh Dodds
Hi,

On 23 August 2011 15:17, Gannon Dick gannon_d...@yahoo.com wrote:
 Either "Linked Data ecosystem" or "linked data Ecosystem" is a dangerously
 flawed paradigm, IMHO.  You don't improve MeSH by flattening it, for
 example; it is what it is. Since CAS numbers are not a directed graph, an
 algorithmic transform to a URI (which *is* a directed graph) risks the
 creation of a new irreconcilable taxonomy.  For example, Nitrogen is ok to
 breathe and liquid Nitrogen is a not very practical way to chill wine.

A URI isn't a directed graph. You can use them to build one by making
statements though.
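
As a rough sketch of that distinction (Python, with made-up example.org
URIs and property names that are assumptions for illustration only): a URI
on its own is just a string, and a directed graph only appears once
statements connect URIs as labelled edges.

```python
from collections import defaultdict

# Hypothetical URIs, invented for this illustration only.
NITROGEN = "http://example.org/substance/nitrogen"
LIQUID_NITROGEN = "http://example.org/substance/liquid-nitrogen"
GAS = "http://example.org/state/gas"
LIQUID = "http://example.org/state/liquid"
STATE = "http://example.org/vocab/physicalState"
FORM_OF = "http://example.org/vocab/formOf"

# Each statement is a (subject, predicate, object) triple, i.e. one
# directed, labelled edge from the subject to the object.
statements = [
    (NITROGEN, STATE, GAS),
    (LIQUID_NITROGEN, STATE, LIQUID),
    (LIQUID_NITROGEN, FORM_OF, NITROGEN),
]


def build_graph(triples):
    """Return an adjacency map: subject -> list of (predicate, object)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(statements)
for predicate, obj in graph[LIQUID_NITROGEN]:
    print(LIQUID_NITROGEN, predicate, obj)
```

Here the two nitrogen URIs stay distinct nodes, and it is the statements,
not the URI strings, that say how they relate.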

Setting aside any copyright issues, the CAS identifiers are useful
Natural Keys [1]. As they're well deployed, using them to create URIs
[2] is sensible as it simplifies the process of linking between
datasets [3].

To answer Patrick's question: to help bridge between systems that
only use the original literal version, rather than the URIs, we
should ensure that the literal keys are included in the data [4].

These are well deployed patterns and, from my experience, make it
really simple and easy to bridge and link between different datasets
and systems.
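
A minimal sketch of how those four patterns fit together, in Python. The
two base URIs, the field names and the helper functions are all
hypothetical, chosen only for illustration; they are not taken from the
patterns book or from any real dataset.

```python
# Hypothetical namespaces for two independent datasets (assumptions).
EPA_BASE = "http://example.org/epa/substance/"
LAB_BASE = "http://example.com/lab/chemical/"


def patterned_uri(base, natural_key):
    """Patterned URI: a fixed base plus the natural key, unchanged."""
    return base + natural_key


def describe(base, natural_key, label):
    """Describe a resource, keeping the literal key next to the URI."""
    return {
        "uri": patterned_uri(base, natural_key),
        "key": natural_key,  # literal key, for systems that never see URIs
        "label": label,
    }


def link_on_shared_key(left, right):
    """Pair up URIs from two datasets whose literal keys are equal."""
    right_by_key = {item["key"]: item["uri"] for item in right}
    return [
        (item["uri"], right_by_key[item["key"]])
        for item in left
        if item["key"] in right_by_key
    ]


# 75-07-0 (acetaldehyde) and 7727-37-9 (nitrogen) serve as natural keys.
epa = [describe(EPA_BASE, "75-07-0", "ACETALDEHYDE")]
lab = [
    describe(LAB_BASE, "75-07-0", "Acetaldehyde"),
    describe(LAB_BASE, "7727-37-9", "Nitrogen"),
]

links = link_on_shared_key(epa, lab)
print(links)
```

Because both datasets project the same natural key into their URIs and
also carry it as a literal, the link can be found either by comparing
URI suffixes or, as here, by comparing the literal keys directly.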

Cheers,

L.

[1]. http://patterns.dataincubator.org/book/natural-keys.html
[2]. http://patterns.dataincubator.org/book/patterned-uris.html
[3]. http://patterns.dataincubator.org/book/shared-keys.html
[4]. http://patterns.dataincubator.org/book/literal-keys.html

-- 
Leigh Dodds
Programme Manager, Talis Platform
Mobile: 07850 928381
http://kasabi.com
http://talis.com

Talis Systems Ltd
43 Temple Row
Birmingham
B2 5LS



Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-24 Thread David Wood
On Aug 24, 2011, at 2:44, Leigh Dodds leigh.do...@talis.com wrote:

 Setting aside any copyright issues, the CAS identifiers are useful
 Natural Keys [1]. As they're well deployed, using them to create URIs
 [2] is sensible

Hi Leigh,

Right.  Unfortunately it is also illegal :/

Regards,
Dave





Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-24 Thread Leigh Dodds
Hi,

On 24 August 2011 15:40, David Wood da...@3roundstones.com wrote:
 On Aug 24, 2011, at 2:44, Leigh Dodds leigh.do...@talis.com wrote:

 Setting aside any copyright issues, the CAS identifiers are useful
 Natural Keys [1]. As they're well deployed, using them to create URIs
 [2] is sensible

 Hi Leigh,

 Right.  Unfortunately it is also illegal :/

Yes, I read the first part of the thread! I was merely pointing out
the useful patterns for projecting identifiers into URIs.

Cheers,

L.

-- 
Leigh Dodds
Programme Manager, Talis Platform
Mobile: 07850 928381
http://kasabi.com
http://talis.com

Talis Systems Ltd
43 Temple Row
Birmingham
B2 5LS



Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-24 Thread Damian Steer

On 24 Aug 2011, at 15:40, David Wood wrote:

 On Aug 24, 2011, at 2:44, Leigh Dodds leigh.do...@talis.com wrote:
 
 Setting aside any copyright issues, the CAS identifiers are useful
 Natural Keys [1]. As they're well deployed, using them to create URIs
 [2] is sensible
 
 Hi Leigh,
 
 Right.  Unfortunately it is also illegal :/

For people like me who haven't paid attention, and were taken aback by that:

i. A User or Organization may include, without a license and without paying a
fee, up to 10,000 CAS Registry Numbers or CASRNs in a catalog, web site, or
other product for which there is no charge. *The following attribution should
be referenced or appear with the use of each CASRN: CAS Registry Number is a
Registered Trademark of the American Chemical Society* [1]

So up to 10,000 is OK, but that means including up to 10,000 attributions.
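
Read literally, that clause is simple arithmetic. A small Python sketch,
assuming only the figure and the attribution wording quoted above (the
function name and record layout are made up for illustration, and this
says nothing about how CAS itself interprets the policy):

```python
# Wording and limit copied from the policy text quoted above.
ATTRIBUTION = ("CAS Registry Number is a Registered Trademark of the "
               "American Chemical Society")
FREE_LIMIT = 10_000


def with_attribution(casrns, limit=FREE_LIMIT):
    """Attach the attribution to each distinct CASRN, refusing to go
    past the stated limit."""
    unique = sorted(set(casrns))
    if len(unique) > limit:
        raise ValueError(
            f"{len(unique)} distinct CASRNs exceeds the limit of {limit}")
    return [{"casrn": c, "attribution": ATTRIBUTION} for c in unique]


records = with_attribution(["75-07-0", "7727-37-9", "75-07-0"])
print(len(records), "records, each carrying the attribution")
```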

Damian

[1] http://www.cas.org/legal/infopolicy.html#authorized


Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-24 Thread David Wood
On Aug 24, 2011, at 8:01, Damian Steer d.st...@bristol.ac.uk wrote:

 
 On 24 Aug 2011, at 15:40, David Wood wrote:
 
 On Aug 24, 2011, at 2:44, Leigh Dodds leigh.do...@talis.com wrote:
 
 Setting aside any copyright issues, the CAS identifiers are useful
 Natural Keys [1]. As they're well deployed, using them to create URIs
 [2] is sensible
 
 Hi Leigh,
 
 Right.  Unfortunately it is also illegal :/
 
 For people like me who haven't paid attention, and were taken aback by that:
 
 i. A User or Organization may include, without a license and without paying
 a fee, up to 10,000 CAS Registry Numbers or CASRNs in a catalog, web site, or
 other product for which there is no charge. *The following attribution should
 be referenced or appear with the use of each CASRN: CAS Registry Number is a
 Registered Trademark of the American Chemical Society* [1]
 
 So up to 10,000 is ok, but will include 10,000 attributions.

Thanks.  For what it is worth, the US EPA currently uses about 100,000.  

Regards,
Dave

 
 Damian
 
 [1] http://www.cas.org/legal/infopolicy.html#authorized



Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-23 Thread Patrick Durusau

David,

On 8/22/2011 9:55 PM, David Booth wrote:

On Mon, 2011-08-22 at 20:27 -0400, Patrick Durusau wrote:
[ . . . ]

The use of CAS identifiers supports searching across vast domains of
*existing* literature. Not all, but most of it for the last 60 or so
years.

That is non-trivial and should not be lightly discarded.

BTW, your objection is that non-licensed systems cannot use CAS
identifiers? Are these commercial systems that are charging their
customers? Why would you think such systems should be able to take
information created by others?


Using the information associated with an identifier is one thing; using
the identifier itself is another.  I'm sure the CAS numbers have added
non-trivial value that should not be ignored.  But their business model
needs to change.  It is ludicrous in this web era to prohibit the use of
the identifiers themselves.

If there is one principle we have learned from the web, it is the enormous
value and importance of freely usable universal identifiers.  URIs rule!
http://urisrule.org/

:)

Well, I won't take the bait on URIs, ;-), but will note that re-use of 
identifiers of a sort was addressed quite a few years ago.


See: Feist Publications, Inc., v. Rural Telephone Service Co., 499 
U.S. 340 (1991) or follow this link:


http://en.wikipedia.org/wiki/Feist_v._Rural

The circumstances with CAS numbers are slightly different because to get 
access to the full set of CAS numbers I suspect you have to sign a 
licensing agreement on re-use, which makes it a matter of *contract* law 
and not copyright.


Perhaps they should increase the limits beyond 10,000 identifiers but 
the only people who want the whole monty as it were are potential 
commercial competitors.


For example, take the people who publish the periodical Brain at $10,000 a 
year. Why should I want the complete set of identifiers to be freely 
available to help them?


Personally I think given the head start that the CAS maintainers have on 
the literature, etc., that different models for use of the identifiers 
might suit their purposes just as well. Universal identifiers change 
over time and my concern is with the least semantic friction and not as 
much with how we get there.


Hope you are having a great day!

Patrick






--
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)

Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau



Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-23 Thread John Erickson
This is an important discussion that (I believe) foreshadows how
canonical identifiers are managed moving forward.

Both CAS and DUNS numbers are good examples. Consider the challenge
of linking EPA data; it's easy to create a list of toxic chemicals
that are common across many EPA datasets. Based on those chemical
names, it's possible to further find (in most cases) references in
DBPedia and other sources, such as PubChem:

* ACETALDEHYDE
* http://dbpedia.org/page/Acetaldehyde
* http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=177
* etc...

Now, add to this a sensible agency-rooted URI design and a
DBPedia-like infrastructure and one has a very powerful hub that
strengthens the Linked Data ecosystem. It would arguably be stronger
if CAS identifiers were also (somehow) included, but even the bits of
linking shown above change the value proposition of traditional
proprietary naming schemes...
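
A hedged sketch of that kind of name-based linking, in Python. The URL
shapes are the ones shown in the list above; the capitalisation rule is
only a guess that happens to work for simple names, and the CID table is
a hypothetical hand-maintained lookup holding just the one entry given
above.

```python
def dbpedia_page(name):
    """Guess a DBpedia page URL from a chemical name (heuristic only)."""
    return ("http://dbpedia.org/page/"
            + name.strip().capitalize().replace(" ", "_"))


def pubchem_summary(cid):
    """Build a PubChem summary URL from a compound id."""
    return f"http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid={cid}"


# Hypothetical lookup table; only this entry comes from the list above.
PUBCHEM_CIDS = {"ACETALDEHYDE": 177}


def links_for(name):
    """Collect candidate links for one chemical name from an EPA list."""
    links = [dbpedia_page(name)]
    cid = PUBCHEM_CIDS.get(name.strip().upper())
    if cid is not None:
        links.append(pubchem_summary(cid))
    return links


print(links_for("ACETALDEHYDE"))
```

Candidate links produced this way would still need checking, since a
name match is weaker evidence than a shared identifier.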

John
PS: At TWC we are about to go live with a registry called Instance
Hub that will demonstrate the association of agency-based URI schemes
--- think EPA, HHS, DOE, USDA, etc --- with instance data over which
the agency has some authority or interest...More very soon!





-- 
John S. Erickson, Ph.D.
http://bitwacker.com
olyerick...@gmail.com
Twitter: @olyerickson
Skype: @olyerickson



Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-23 Thread Patrick Durusau

John

On 8/23/2011 9:05 AM, John Erickson wrote:

This is an important discussion that (I believe) foreshadows how
canonical identifiers are managed moving forward.

Both CAS and DUNS numbers are good examples. Consider the challenge
of linking EPA data; it's easy to create a list of toxic chemicals
that are common across many EPA datasets. Based on those chemical
names, it's possible to further find (in most cases) references in
DBPedia and other sources, such as PubChem:

* ACETALDEHYDE
* http://dbpedia.org/page/Acetaldehyde
* http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=177
* etc...

Now, add to this a sensible agency-rooted URI design and a
DBPedia-like infrastructure and one has a very powerful hub that
strengthens the Linked Data ecosystem. It would arguably be stronger
if CAS identifiers were also (somehow) included, but even the bits of
linking shown above change the value proposition of traditional
proprietary naming schemes...

Quite so and I did not mean to imply otherwise. Yes, gathering 
government agency URI identifiers for toxic chemicals is a value-add 
proposition.


I am curious if you find that different offices within agencies use the 
same URIs? Or did they have other identifiers in their records prior to 
the URIs?


That is, will the URIs map to the identifiers used in EPA datasets, for 
example?


Despite its obvious value, I don't agree that the project "change[s] the 
value proposition of traditional proprietary naming schemes..."


Mostly because it does not address the *prior* use of other identifiers 
in the published literature. However convenient it may be to pretend 
that we are starting off fresh, in fact we are not, in any information 
system.


The fact remains that even if we switched (miraculously) today to all 
new URI identifiers, we will be accessing literature using prior 
identifiers for a very long time. I suspect hundreds of years.


BTW, who bridges between the new URI schemes and the CAS identifiers? 
For searching traditional literature?
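
One possible shape for such a bridge, sketched in Python under stated
assumptions: the example.org namespace is hypothetical, and the check
uses the CAS check-digit rule (the last digit equals the sum of the other
digits, each multiplied by its position counted from the right, modulo
10). The sketch scans legacy text for CAS-shaped tokens and maps the
valid ones to URIs in the new scheme.

```python
import re

# Hypothetical namespace for the new URI scheme (an assumption).
BASE = "http://example.org/id/substance/"

# Shape of a CAS number: 2-7 digits, 2 digits, 1 check digit.
CAS_LIKE = re.compile(r"\b(\d{2,7})-(\d{2})-(\d)\b")


def check_digit_ok(first, second, check):
    """Verify the trailing check digit of a CAS-shaped token."""
    digits = first + second
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(check)


def bridge(text):
    """Map each valid CAS number found in text to a URI in the new scheme."""
    found = {}
    for match in CAS_LIKE.finditer(text):
        if check_digit_ok(*match.groups()):
            found[match.group(0)] = BASE + match.group(0)
    return found


sample = "Acetaldehyde (75-07-0) was compared with sample 12-34-5."
print(bridge(sample))
```

The literal identifier stays the lookup key, so literature that only ever
mentions the CAS number can still be found from the URI side.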



John
PS: At TWC we are about to go live with a registry called Instance
Hub that will demonstrate the association of agency-based URI schemes
--- think EPA, HHS, DOE, USDA, etc --- with instance data over which
the agency has some authority or interest...More very soon!

Looking forward to it!

Hope you are having a great day!

Patrick





Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-23 Thread Gannon Dick
Either "Linked Data ecosystem" or "linked data Ecosystem" is a dangerously 
flawed paradigm, IMHO.  You don't improve MeSH by flattening it, for example; 
it is what it is. Since CAS numbers are not a directed graph, an algorithmic 
transform to a URI (which *is* a directed graph) risks the creation of a 
new irreconcilable taxonomy.  For example, Nitrogen is ok to breathe and 
liquid Nitrogen is a not very practical way to chill wine.

Just my 2 cents.





Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-23 Thread Gannon Dick


--- On Tue, 8/23/11, Patrick Durusau patr...@durusau.net wrote:

The fact remains that even if we switched (miraculously) today to all
new URI identifiers, we will be accessing literature using prior
identifiers for a very long time. I suspect hundreds of years.

Somewhere around 1890, I think, the amount of published scientific literature 
exceeded the ability of a person to read it all in a lifetime.  Selectivity has 
been the rule for over 100 years.  So the answer is not "hundreds of years"; 
it's forever.

--Gannon



CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-22 Thread David Wood
Hi all,

On Aug 19, 2011, at 06:37, Patrick Durusau wrote:
 Case in point, CAS, http://www.cas.org/. Coming up on 62 million organic and 
 inorganic substances given unique identifiers. What is the incentive for any 
 of their users/customers to switch to Linked Data?

Well, for one thing, CAS (like DUNS) identifiers are proprietary.  They can't 
be reused for the purposes of identification in non-licensed systems.  That 
causes no end of trouble for researchers, government agencies and corporations 
who have bought into those proprietary identification schemes only to find out 
that they can't reuse the identifiers in new contexts.

An example is the US Environmental Protection Agency, which uses CAS numbers.  
They cannot reuse those identifiers when they publish open government data.  
They are not thrilled about that.  The EPA is now publishing their own 
identifiers.  How long will CAS last as a standard?  How many ids has the 
Encyclopedia of Life developed?  Or Wikipedia?

DUNS numbers, another widely used proprietary identification scheme, are very 
similar.  Orgpedia [1] and similar approaches have been, and are being, started 
just to break the deadlock of that scheme.

Face it:  People just hate being boxed in.  Sure, you can make a business model 
out of doing so, but don't expect anyone to love you for it.  The Web allows 
people to think about not boxing themselves in.  That is a direct threat to 
those older and less friendly business models, DUNS and CAS included.

Regards,
Dave

[1] http://dotank.nyls.edu/ORGPedia.html




Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-22 Thread Patrick Durusau

David,

On 8/22/2011 7:39 PM, David Wood wrote:

Hi all,

On Aug 19, 2011, at 06:37, Patrick Durusau wrote:
Case in point, CAS, http://www.cas.org/. Coming up on 62 million 
organic and inorganic substances given unique identifiers. What is 
the incentive for any of their users/customers to switch to Linked Data?


Well, for one thing, CAS (like DUNS) identifiers are proprietary. 
 They can't be reused for the purposes of identification in 
non-licensed systems.  That causes no end of trouble for researchers, 
government agencies and corporations who have bought into those 
proprietary identification schemes only to find out that they can't 
reuse the identifiers in new contexts.


Not quite correct. You can use up to 10,000 of the CAS identifiers 
before licensing restrictions kick in.


I think the EPA creating their own identifiers is the result of bad advice,
for the following reasons:

1) It simply dirties up the pond of identifiers for organic and 
inorganic substances with yet another identifier.


2) Users and other implementers will bear the added cost of supporting 
yet another set of identifiers.


3) The literature in the area will have yet another set of identifiers 
to either be discovered or mapped.


4) The expertise behind CAS numbers is well known and has a history of 
high quality work.


The use of CAS identifiers supports searching across vast domains of 
*existing* literature. Not all, but most of it for the last 60 or so years.


That is non-trivial and should not be lightly discarded.

BTW, your objection is that non-licensed systems cannot use CAS 
identifiers? Are these commercial systems that are charging their 
customers? Why would you think such systems should be able to take 
information created by others?


Hope you are having a great day!

Patrick







--
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)

Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau



Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-22 Thread David Booth
On Mon, 2011-08-22 at 20:27 -0400, Patrick Durusau wrote:
[ . . . ]
 The use of CAS identifiers supports searching across vast domains of
 *existing* literature. Not all, but most of it for the last 60 or so
 years. 
 
 That is non-trivial and should not be lightly discarded.
 
 BTW, your objection is that non-licensed systems cannot use CAS
 identifiers? Are these commercial systems that are charging their
 customers? Why would you think such systems should be able to take
 information created by others?


Using the information associated with an identifier is one thing; using
the identifier itself is another.  I'm sure the CAS numbers have added
non-trivial value that should not be ignored.  But their business model
needs to change.  It is ludicrous in this web era to prohibit the use of
the identifiers themselves.  
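
To make the distinction concrete, here is a minimal sketch of what "using the identifier itself" could look like: validating a CAS number and turning it into a URI while keeping the literal key alongside. The namespace, the function names and the record layout are assumptions for illustration only; nothing here touches the information CAS associates with the number.

```python
import re

# Hypothetical namespace, chosen only for this example.
BASE = "http://example.org/substance/cas/"

# CAS numbers have the shape NNNNNNN-NN-N: 2 to 7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas):
    """Return True if `cas` is well formed and its check digit matches."""
    match = CAS_PATTERN.match(cas)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight digits 1, 2, 3, ... from right to left; the sum mod 10 is the check digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def cas_to_uri(cas):
    """Build a patterned URI from a CAS number used as a natural key."""
    if not is_valid_cas(cas):
        raise ValueError("not a valid CAS number: " + cas)
    return BASE + cas


def record_for(cas):
    """Keep the literal key next to the URI so non-URI systems can still join on it."""
    return {"uri": cas_to_uri(cas), "casNumber": cas}
```

Calling record_for("7727-37-9") pairs the URI with the original literal, so a system that only knows the plain CAS number can still link to the data.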

If there is one principle we have learned from the web, it is the enormous
value and importance of freely usable universal identifiers.  URIs rule!
http://urisrule.org/ 

:)


-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.




Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-19 Thread Patrick Durusau

Kingsley,

One more attempt.

The press release I pointed to was an example that would have to be 
particularized to a CIO or CTO in terms of *their* expenses of 
integration, then showing *their* savings.


The difference in our positions, from my context, is that I am saying 
the benefit to enterprises has to be expressed in terms of *their* 
bottom line, over the next quarter, six months, year. I hear (your 
opinion likely differs) you saying there is a global benefit that 
enterprises should invest in with no specific ROI for their bottom line 
in any definite period.


Case in point, CAS, http://www.cas.org/. Coming up on 62 million organic 
and inorganic substances given unique identifiers. What is the incentive 
for any of their users/customers to switch to Linked Data?


As I said several posts ago, your success depends upon people investing 
in a technology for your benefit. (In all fairness you argue they 
benefit as well, but they are the best judges of the best use of their 
time and resources.)


Hope you are looking forward to a great weekend!

Patrick

On 8/18/2011 10:09 PM, Kingsley Idehen wrote:

On 8/18/11 5:27 PM, Patrick Durusau wrote:

Kingsley,

Citing your own bookmark file hardly qualifies as market numbers. 


My own bookmark? I gave you a URL to a bookmark collection. The 
collection contains links for a variety of research documents.


People promoting technologies make up all sorts of numbers about what 
use of X will save. Reminds me of the music or software theft numbers. 


Er. and you posted a link to a press release. What's your point?


They have no relationship to any reality that I share.


But you posted an Informatica press release to make some kind of 
point. Or am I completely misreading and misunderstanding the purpose 
of that URL too?




It's been enjoyable as usual but without some common basis for 
discussion we aren't going to get any closer to a common understanding.


Correct :-)

Kingsley


Hope you are having a great week!

Patrick



On 8/18/2011 3:24 PM, Kingsley Idehen wrote:

On 8/18/11 2:50 PM, Patrick Durusau wrote:

Kingsley,

On 8/18/2011 1:52 PM, Kingsley Idehen wrote:

On 8/18/11 1:40 PM, Patrick Durusau wrote:

Kingsley,

From below:

This critical value only materializes via appropriate context 
lenses. For decision makers it is always via opportunity 
costs.  If someone else is eating your lunch by disrupting your 
market you simply have to respond. Thus, on this side of the 
fence it's better to focus on eating lunch rather than warning 
about the possibility of doing so, or outlining how it could be 
done. Just do it! 


I appreciate the sentiment, Just do it! as my close friend Jack 
Park says it fairly often.


But Just do it! doesn't answer the question of cost/benefit.


I mean: just start eating the lunch i.e., make a solution that 
takes advantage of an opportunity en route to market disruption. 
Trouble with the Semantic Web is that people spend too much time 
arguing and postulating. Ironically, when TimBL worked on the 
early WWW, his mindset was: just do it! :-)



Still dodging the question I see. ;-)


Of course not.

You want market research numbers, see the related section at the end 
of this reply. I sorta assumed you would have found this 
serendipitously though? Ah! You don't quite believe in the utility 
of this Linked Data stuff etc..






It avoids it in favor of advocacy.


See my comments above. You are skewing my comments to match your 
desired outcome, methinks.



You reach that conclusion pretty frequently.


See my earlier comment.



I ask for hard numbers, you say that isn't your question and/or 
skewing your comments.


Yes. I didn't know this was about market research and numbers [1].





Example: Privacy controls and Facebook. How much would it cost to 
solve this problem?


I assume you know the costs of the above.
It won't cost north of a billion dollars to make a WebID based 
solution. In short, such a thing has existed for a long time, 
depending on your context lenses.




I assume everyone here is familiar with: 
http://www.w3.org/wiki/WebID ?


So we need to take the number of users who have a WebID and 
subtract that from the number of FaceBook users.


Yes?


No!

Take the number of people that are members of a service that's 
ambivalent to the self calibration of the vulnerabilities of its 
members aka. privacy.




The remaining number need a WebID or some substantial portion, yes?


Ultimately they need a WebID absolutely! And do you know why? It 
will enable members to begin the inevitable journey towards self 
calibration of their respective vulnerabilities.


I hope you understand that society is old and the likes of G+, FB 
are new and utterly immature. In society, one is innocent until 
proven guilty or not guilty. In the world of FB and G+ the 
fundamentals of society are currently being inverted. Anyone can 
ultimately say anything about you. Both parties are building cyber 
police states 

Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-19 Thread Kingsley Idehen

On 8/19/11 6:37 AM, Patrick Durusau wrote:

Kingsley,

One more attempt.

The press release I pointed to was an example that would have to be 
particularized to a CIO or CTO in terms of *their* expenses of 
integration, then showing *their* savings.


Yes, and I sent you a link to a collection of similar documents from 
which you could find similar research depending on problem type. On the 
first page you should have seen a link to a research document about the 
cost of email spam, for instance.


CEOs, CIOs, and CTOs are all dealing with costs of:

1. Spam
2. Password Management
3. Security
4. Data Integration.

There isn't a shortage of market research material re. the above and 
their costs across a plethora of domains.




The difference in our positions, from my context, is that I am 
saying the benefit to enterprises has to be expressed in terms of 
*their* bottom line, over the next quarter, six months, year.

For what it's worth I worked for many years as an accountant before I 
crossed over to the vendor realm during the early days of Open Systems 
-- when Unix was being introduced to enterprises. That's the reason why 
integration middleware and dbms technology has been my passion for 20+ 
years. I am a slightly different profile to what you assume in your 
comments re. cost-benefits analysis.


I hear (your opinion likely differs) you saying there is a global 
benefit that enterprises should invest in with no specific ROI for 
their bottom line in any definite period.


See comment above. I live problems first, then architect technology to 
solve them. When I tell you about the costs of data integration to 
enterprises I am basically telling you that I've lived the problem for 
many years. My understanding is quite deep. Sorry, but this isn't an 
area where I can pretend to be modest :-)




Case in point, CAS, http://www.cas.org/. Coming up on 62 million 
organic and inorganic substances given unique identifiers. What is the 
incentive for any of their users/customers to switch to Linked Data?


I think the issue is more about: what would identifiers provide to this 
organization with regards to the obvious need to virtualize its critical 
data sources such that:


1. data sources are represented as fine grained data objects
2. every data object is endowed with an identifier
3. identifiers become superkeys that provide conduits to highly navigable, 
data-object-based zeitgeists -- a single identifier should resolve to 
a graph pictorial representing all data associated with that specific 
identifier and any additional data that has been reconciled logically, 
e.g., by leveraging owl:sameAs and IFP (inverse functional property) logic.
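
As a rough illustration of point 3, the reconciliation step could be sketched as below. This is a toy in plain Python rather than a real RDF store or reasoner; the "ex:" identifiers and the choice of "ex:casNumber" as an inverse functional property are assumptions made up for the example.

```python
from collections import defaultdict

OWL_SAME_AS = "owl:sameAs"
# Assumption: a CAS number property is treated as inverse functional,
# i.e. two subjects sharing a value denote the same thing.
INVERSE_FUNCTIONAL = {"ex:casNumber"}


def reconcile(triples):
    """Group identifiers that owl:sameAs or a shared IFP value say are equivalent."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    first_subject = {}  # (property, value) -> first subject seen with it
    for s, p, o in triples:
        find(s)  # register every subject, even if it never merges
        if p == OWL_SAME_AS:
            union(s, o)
        elif p in INVERSE_FUNCTIONAL:
            if (p, o) in first_subject:
                union(s, first_subject[(p, o)])
            else:
                first_subject[(p, o)] = s

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


def describe(identifier, triples):
    """Return every statement about any identifier reconciled with `identifier`."""
    for group in reconcile(triples):
        if identifier in group:
            return sorted(t for t in triples if t[0] in group)
    return []
```

Asking describe() for any one identifier in a group returns the statements made about all of them, which is the "single identifier resolves to everything known" behaviour described in the list.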




As I said several posts ago, your success depends upon people investing 
in a technology for your benefit. (In all fairness you argue they 
benefit as well, but they are the best judges of the best use of their 
time and resources.)


Kingsley


Hope you are looking forward to a great weekend!

Patrick

On 8/18/2011 10:09 PM, Kingsley Idehen wrote:

On 8/18/11 5:27 PM, Patrick Durusau wrote:

Kingsley,

Citing your own bookmark file hardly qualifies as market numbers. 


My own bookmark? I gave you a URL to a bookmark collection. The 
collection contains links for a variety of research documents.


People promoting technologies make up all sorts of numbers about 
what use of X will save. Reminds me of the music or software theft 
numbers. 


Er. and you posted a link to a press release. What's your point?


They have no relationship to any reality that I share.


But you posted an Informatica press release to make some kind of 
point. Or am I completely misreading and misunderstanding the purpose 
of that URL too?




It's been enjoyable as usual but without some common basis for 
discussion we aren't going to get any closer to a common understanding.


Correct :-)

Kingsley


Hope you are having a great week!

Patrick



On 8/18/2011 3:24 PM, Kingsley Idehen wrote:

On 8/18/11 2:50 PM, Patrick Durusau wrote:

Kingsley,

On 8/18/2011 1:52 PM, Kingsley Idehen wrote:

On 8/18/11 1:40 PM, Patrick Durusau wrote:

Kingsley,

From below:

This critical value only materializes via appropriate context 
lenses. For decision makers it is always via opportunity 
costs.  If someone else is eating your lunch by disrupting your 
market you simply have to respond. Thus, on this side of the 
fence it's better to focus on eating lunch rather than warning 
about the possibility of doing so, or outlining how it could be 
done. Just do it! 


I appreciate the sentiment, Just do it! as my close friend 
Jack Park says it fairly often.


But Just do it! doesn't answer the question of cost/benefit.


I mean: just start eating the lunch i.e., make a solution that 
takes advantage of an opportunity en route to market disruption. 
Trouble with the Semantic Web is that people spend too much time 
arguing and postulating. Ironically, when TimBL worked on the 
early WWW, his mindset was: just do it! :-)



Still 

Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-19 Thread Patrick Durusau

Kingsley,

Correction: I have never accused you of being modest or of not being an 
accountant. ;-)


Nor have I said the costs you talk about in your accountant voice don't 
exist.


The problem is identifying the cost to a particular client, say of email 
spam, versus the cost of the solution for the same person.


For example, I picked a spam article at random that says a 100 person 
firm *could be losing* as much as $55,000 per year due to spam.


Think about that for a minute. That works out to $550 per person.

So, if your solution costs more than $550 per person, it isn't worth 
buying.
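
The arithmetic, as a small sketch. The function names are made up for the example, and it assumes (generously) that the solution removes the entire loss and that both figures are annual.

```python
def per_person_cost(total_annual_cost, headcount):
    """Spread an annual cost evenly across everyone in the firm."""
    return total_annual_cost / headcount


def worth_buying(solution_cost_per_person, total_annual_cost, headcount):
    """True only if the solution costs no more than the problem it removes.

    Assumes the solution eliminates the whole cost, which is the best case.
    """
    return solution_cost_per_person <= per_person_cost(total_annual_cost, headcount)
```

With the article's numbers, per_person_cost(55000, 100) is 550.0, so worth_buying(600, 55000, 100) is False.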


Besides, the $550 per person *isn't on the books.* Purchasing your 
solution is. As they say, spam is a hidden cost. Hidden costs are hard 
to quantify or get people to address.


Not to mention that your solution requires an investment before the 
software can exist for any benefit. That is an even harder sell.


Isn't investment to enable a return from another investment (software, 
later) something accountants can see?


Hope you are having a great day!

Patrick


PS: The random spam article: 
http://blogs.cisco.com/smallbusiness/the_big_cost_of_spam_viruses_for_small_business/



On 8/19/2011 9:57 AM, Kingsley Idehen wrote:

On 8/19/11 6:37 AM, Patrick Durusau wrote:

Kingsley,

One more attempt.

The press release I pointed to was an example that would have to be 
particularized to a CIO or CTO in terms of *their* expenses of 
integration, then showing *their* savings.


Yes, and I sent you a link to a collection of similar documents from 
which you could find similar research depending on problem type. On 
the first page you should have seen a link to a research document 
about the cost of email spam, for instance.


CEOs, CIOs, and CTOs are all dealing with costs of:

1. Spam
2. Password Management
3. Security
4. Data Integration.

There isn't a shortage of market research material re. the above and 
their costs across a plethora of domains.




The difference in our positions, from my context, is that I am 
saying the benefit to enterprises has to be expressed in terms of 
*their* bottom line, over the next quarter, six months, year.

For what it's worth I worked for many years as an accountant before I 
crossed over to the vendor realm during the early days of Open Systems 
-- when Unix was being introduced to enterprises. That's the reason 
why integration middleware and dbms technology has been my passion for 
20+ years. I am a slightly different profile to what you assume in 
your comments re. cost-benefits analysis.


I hear (your opinion likely differs) you saying there is a global 
benefit that enterprises should invest in with no specific ROI for 
their bottom line in any definite period.


See comment above. I live problems first, then architect technology to 
solve them. When I tell you about the costs of data integration to 
enterprises I am basically telling you that I've lived the problem for 
many years. My understanding is quite deep. Sorry, but this isn't an 
area where I can pretend to be modest :-)




Case in point, CAS, http://www.cas.org/. Coming up on 62 million 
organic and inorganic substances given unique identifiers. What is 
the incentive for any of their users/customers to switch to Linked Data?


I think the issue is more about: what would identifiers provide to 
this organization with regards to the obvious need to virtualize its 
critical data sources such that:


1. data sources are represented as fine grained data objects
2. every data object is endowed with an identifier
3. identifiers become superkeys that provide conduits to highly navigable, 
data-object-based zeitgeists -- a single identifier should resolve to 
a graph pictorial representing all data associated with that specific 
identifier and any additional data that has been reconciled logically, 
e.g., by leveraging owl:sameAs and IFP (inverse functional property) logic.




As I said several posts ago, your success depends upon people 
investing in a technology for your benefit. (In all fairness you 
argue they benefit as well, but they are the best judges of the best 
use of their time and resources.)


Kingsley


Hope you are looking forward to a great weekend!

Patrick

On 8/18/2011 10:09 PM, Kingsley Idehen wrote:

On 8/18/11 5:27 PM, Patrick Durusau wrote:

Kingsley,

Citing your own bookmark file hardly qualifies as market numbers. 


My own bookmark? I gave you a URL to a bookmark collection. The 
collection contains links for a variety of research documents.


People promoting technologies make up all sorts of numbers about 
what use of X will save. Reminds me of the music or software theft 
numbers. 


Er. and you posted a link to a press release. What's your point?


They have no relationship to any reality that I share.


But you posted an Informatica press release to make some kind of 
point. Or am I completely misreading and misunderstanding the 
purpose of that URL too?




It's been enjoyable as usual but without some 

Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-19 Thread Patrick Logan
As fascinating as this discussion is, maybe the two of you want to
work it out directly and then report back with a summary?

Speaking as just one subscriber's data point, of course, I'm...

-Patrick


On Fri, Aug 19, 2011 at 7:41 AM, Patrick Durusau patr...@durusau.net wrote:
 Kingsley,

 Correction: I have never accused you of being modest or of not being an
 accountant. ;-)

 Nor have I said the costs you talk about in your accountant voice don't
 exist.

 The problem is identifying the cost to a particular client, say of email
 spam, versus the cost of the solution for the same person.

 For example, I picked a spam article at random that says a 100 person firm
 *could be losing* as much as $55,000 per year due to spam.

 Think about that for a minute. That works out to $550 per person.

 So, if your solution costs more than $550 per person, it isn't worth buying.

 Besides, the $550 per person *isn't on the books.* Purchasing your solution
 is. As they say, spam is a hidden cost. Hidden costs are hard to quantify or
 get people to address.

 Not to mention that your solution requires an investment before the software
 can exist for any benefit. That is an even harder sell.

 Isn't investment to enable a return from another investment (software,
 later) something accountants can see?

 Hope you are having a great day!

 Patrick


 PS: The random spam article:
 http://blogs.cisco.com/smallbusiness/the_big_cost_of_spam_viruses_for_small_business/


 On 8/19/2011 9:57 AM, Kingsley Idehen wrote:

 On 8/19/11 6:37 AM, Patrick Durusau wrote:

 Kingsley,

 One more attempt.

 The press release I pointed to was an example that would have to be
 particularized to a CIO or CTO in terms of *their* expenses of integration,
 then showing *their* savings.

 Yes, and I sent you a link to a collection of similar documents from which
 you could find similar research depending on problem type. On the first page
 you should have seen a link to a research document about the cost of email
 spam, for instance.

 CEOs, CIOs, and CTOs are all dealing with costs of:

 1. Spam
 2. Password Management
 3. Security
 4. Data Integration.

 There isn't a shortage of market research material re. the above and their
 costs across a plethora of domains.


 The difference in our positions, from my context, is that I am saying
 the benefit to enterprises has to be expressed in terms of *their* bottom
 line, over the next quarter, six months, year.

 For what it's worth I worked for many years as an accountant before I
 crossed over to the vendor realm during the early days of Open Systems --
 when Unix was being introduced to enterprises. That's the reason why
 integration middleware and dbms technology has been my passion for 20+
 years. I am a slightly different profile to what you assume in your comments
 re. cost-benefits analysis.

 I hear (your opinion likely differs) you saying there is a global
 benefit that enterprises should invest in with no specific ROI for their
 bottom line in any definite period.

 See comment above. I live problems first, then architect technology to
 solve them. When I tell you about the costs of data integration to
 enterprises I am basically telling you that I've lived the problem for many
 years. My understanding is quite deep. Sorry, but this isn't an area where I
 can pretend to be modest :-)


 Case in point, CAS, http://www.cas.org/. Coming up on 62 million organic
 and inorganic substances given unique identifiers. What is the incentive for
 any of their users/customers to switch to Linked Data?

 I think the issue is more about: what would identifiers provide to this
 organization with regards to the obvious need to virtualize its critical
 data sources such that:

 1. data sources are represented as fine grained data objects
 2. every data object is endowed with an identifier
 3. identifiers become superkeys that provide conduits to highly navigable,
 data-object-based zeitgeists -- a single identifier should resolve to a graph
 pictorial representing all data associated with that specific identifier and
 any additional data that has been reconciled logically, e.g., by leveraging
 owl:sameAs and IFP (inverse functional property) logic.


 As I said several posts ago, your success depends upon people investing in
 a technology for your benefit. (In all fairness you argue they benefit as
 well, but they are the best judges of the best use of their time and
 resources.)

 Kingsley

 Hope you are looking forward to a great weekend!

 Patrick

 On 8/18/2011 10:09 PM, Kingsley Idehen wrote:

 On 8/18/11 5:27 PM, Patrick Durusau wrote:

 Kingsley,

 Citing your own bookmark file hardly qualifies as market numbers.

 My own bookmark? I gave you a URL to a bookmark collection. The
 collection contains links for a variety of research documents.

 People promoting technologies make up all sorts of numbers about what
 use of X will save. Reminds me of the music or software theft numbers.

 Er. and you posted a link to a press 

Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Patrick Durusau

Kingsley,

Your characterization of problems is spot on:

On 8/18/2011 9:01 AM, Kingsley Idehen wrote:

snip

Linked Data addresses many real world problems. The trouble is that 
problems are subjective. If you haven't experienced a problem it doesn't 
exist. If you don't understand a problem it doesn't exist. If you 
don't know a problem exists then again it doesn't exist in your context.




But you left out: The recognized problem must *cost more* than the 
cost of addressing it.


A favorable cost/benefit ratio has to be recognized by the people being 
called upon to make the investment in solutions.


That is, recognition of a favorable cost/benefit ratio by the W3C and 
company is insufficient.


Yes?

For the umpteenth time here are three real world problems addressed 
effectively by Linked Data courtesy of AWWW (Architecture of the World 
Wide Web):


1. Verifiable Identifiers -- as delivered via WebID (leveraging Trust 
Logic and FOAF)
2. Access Control Lists -- an application of WebID and Web Access 
Control Ontology
3. Heterogeneous Data Access and Integration -- basically taking use 
beyond the limits of ODBC, JDBC etc..


Let's apply the items above to some contemporary solutions that 
illuminate the costs of not addressing the above:


1. G+ -- the real name debacle is WebID 101 re. pseudonyms, 
synonyms, and anonymity
2. Facebook -- all the privacy shortcomings boil down to not 
understanding the power of InterWeb scale verifiable identifiers and 
access control lists
3. Twitter -- inability to turn Tweets into structured annotations 
that are basically nano-memes
4. Email, Comment, Pingback SPAM -- a result of not being able to 
verify identifiers
5. Precision Find -- going beyond the imprecision of Search Engines 
whereby subject attribute and properties are used to contextually 
discover relevant things (explicitly or serendipitously).


The problem isn't really a shortage of solutions, far from it.

For the sake of argument only, conceding these are viable solutions, the 
question is:


Do they provide more benefit than they cost?

If that can't be answered favorably, in hard currency (or some other 
continuum of value that appeals to particular investors), no one is 
going to make the investment.


Economics 101.

That isn't specific to SemWeb but any solution to a problem. The 
solution has to provide a favorable cost/benefit ratio or it won't be 
adopted. Or at least not widely.


Hope you are having a great day!

Patrick

--
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)

Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau




Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Kingsley Idehen

On 8/18/11 10:25 AM, Patrick Durusau wrote:

Kingsley,

Your characterization of problems is spot on:

On 8/18/2011 9:01 AM, Kingsley Idehen wrote:

snip

Linked Data addresses many real world problems. The trouble is that 
problems are subjective. If you haven't experienced a problem it doesn't 
exist. If you don't understand a problem it doesn't exist. If you 
don't know a problem exists then again it doesn't exist in your context.




But you left out: The recognized problem must *cost more* than the 
cost of addressing it.


Yes. Now in my case I assumed the above to be implicit when context is 
about a solution or solutions :-)


If a solution costs more than the problem, it is a problem^n matter. No 
good.




A favorable cost/benefit ratio has to be recognized by the people 
being called upon to make the investment in solutions.


Always! Investment evaluation 101 for any business oriented decision maker.



That is, recognition of a favorable cost/benefit ratio by the W3C and 
company is insufficient.


Yes?


Yes-ish. And here's why. Implementation cost is a tricky factor, one 
typically glossed over in marketing communications that more often than 
not blind side decision makers; especially those that are extremely 
technically challenged. Note, when I say technically challenged I am 
not referring to programming skills. I am referring to basic 
understanding of technology as it applies to a given domain e.g. the 
enterprise.


Back to the W3C and The Semantic Web Project. In this case, the big 
issue is that degree of unobtrusive delivery hasn't been a leading 
factor -- bar SPARQL where its deliberate SQL proximity is all about 
unobtrusive implementation and adoption. Ditto R2RML.


RDF is an example of a poorly orchestrated revolution at the syntax 
level that is implicitly obtrusive at adoption and implementation time. 
It is in this context I agree fully with you. There was a misconception 
that RDF would be adopted like HTML, just like that. As we can all see 
today, that never happened and will never happen via revolution.


What can happen, unobtrusively, is the use and appreciation of solutions 
that generate Linked Data (expressed using a variety of syntaxes and 
serialized in a variety of formats). That's why we've invested so much 
time in both Linked Data Middleware and DBMS technology for ingestion, 
indexing, querying, and serialization.


For the umpteenth time here are three real world problems addressed 
effectively by Linked Data courtesy of AWWW (Architecture of the 
World Wide Web):


1. Verifiable Identifiers -- as delivered via WebID (leveraging Trust 
Logic and FOAF)
2. Access Control Lists -- an application of WebID and Web Access 
Control Ontology
3. Heterogeneous Data Access and Integration -- basically taking use 
beyond the limits of ODBC, JDBC etc..


Let's apply the items above to some contemporary solutions that 
illuminate the costs of not addressing the above:


1. G+ -- the real name debacle is WebID 101 re. pseudonyms, 
synonyms, and anonymity
2. Facebook -- all the privacy shortcomings boil down to not 
understanding the power of InterWeb scale verifiable identifiers and 
access control lists
3. Twitter -- inability to turn Tweets into structured annotations 
that are basically nano-memes
4. Email, Comment, Pingback SPAM -- a result of not being able to 
verify identifiers
5. Precision Find -- going beyond the imprecision of Search Engines 
whereby subject attribute and properties are used to contextually 
discover relevant things (explicitly or serendipitously).


The problem isn't really a shortage of solutions, far from it.

For the sake of argument only, conceding these are viable solutions, 
the question is:


Do they provide more benefit than they cost?


Yes. They do, unequivocally.


If that can't be answered favorably, in hard currency (or some other 
continuum of value that appeals to particular investors), no one is 
going to make the investment.


Economics 101.


This critical value only materializes via appropriate context lenses. 
For decision makers it is always via opportunity costs.  If someone else 
is eating your lunch by disrupting your market you simply have to 
respond. Thus, on this side of the fence it's better to focus on eating 
lunch rather than warning about the possibility of doing so, or 
outlining how it could be done. Just do it!




That isn't specific to SemWeb but any solution to a problem.


Yes!

The solution has to provide a favorable cost/benefit ratio or it won't 
be adopted. Or at least not widely.


Hope you are having a great day!

Patrick




--

Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen










Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Patrick Durusau

Kingsley,

From below:

This critical value only materializes via appropriate context 
lenses. For decision makers it is always via opportunity costs.  If 
someone else is eating your lunch by disrupting your market you simply 
have to respond. Thus, on this side of the fence it's better to focus 
on eating lunch rather than warning about the possibility of doing so, 
or outlining how it could be done. Just do it! 


I appreciate the sentiment, Just do it! as my close friend Jack Park 
says it fairly often.


But Just do it! doesn't answer the question of cost/benefit.

It avoids it in favor of advocacy.

Example: Privacy controls and Facebook. How much would it cost to solve 
this problem? Then, what increase in revenue will result from solving it?


Or if Facebook's lunch is going to be eaten, say by G+, then why doesn't 
G+ solve the problem?


Are privacy controls a non-problem?

Your context lenses.

True, you can market a product/service that no one has ever seen before. 
Like pet rocks.


And they just did it!

With one important difference.

Their *doing it* did not depend upon the gratuitous efforts of thousands 
if not millions of others.


Isn't that an important distinction?

Hope you are having a great day!

Patrick


On 8/18/2011 10:54 AM, Kingsley Idehen wrote:

On 8/18/11 10:25 AM, Patrick Durusau wrote:

Kingsley,

Your characterization of problems is spot on:

On 8/18/2011 9:01 AM, Kingsley Idehen wrote:

snip

Linked Data addresses many real world problems. The trouble is that 
problems are subjective. If you haven't experienced a problem it 
doesn't exist. If you don't understand a problem it doesn't exist. 
If you don't know a problem exists then again it doesn't exist in 
your context.




But you left out: The recognized problem must *cost more* than the 
cost of addressing it.


Yes. Now in my case I assumed the above to be implicit when context is 
about a solution or solutions :-)


If a solution costs more than the problem, it is a problem^n matter. 
No good.




A favorable cost/benefit ratio has to be recognized by the people 
being called upon to make the investment in solutions.


Always! Investment evaluation 101 for any business oriented decision 
maker.




That is, recognition of a favorable cost/benefit ratio by the W3C and 
company is insufficient.


Yes?


Yes-ish. And here's why. Implementation cost is a tricky factor, one 
typically glossed over in marketing communications that more often 
than not blind side decision makers; especially those that are 
extremely technically challenged. Note, when I say technically 
challenged I am not referring to programming skills. I am referring 
to basic understanding of technology as it applies to a given domain 
e.g. the enterprise.


Back to the W3C and The Semantic Web Project. In this case, the big 
issue is that degree of unobtrusive delivery hasn't been a leading 
factor -- bar SPARQL where its deliberate SQL proximity is all about 
unobtrusive implementation and adoption. Ditto R2RML.


RDF is an example of a poorly orchestrated revolution at the syntax 
level that is implicitly obtrusive at adoption and implementation 
time. It is in this context I agree fully with you. There was a 
misconception that RDF would be adopted like HTML, just like that. As 
we can all see today, that never happened and will never happen via 
revolution.


What can happen, unobtrusively, is the use and appreciation of 
solutions that generate Linked Data (expressed using a variety of 
syntaxes and serialized in a variety of formats). That's why we've 
invested so much time in both Linked Data Middleware and DBMS 
technology for ingestion, indexing, querying, and serialization.


For the umpteenth time here are three real world problems addressed 
effectively by Linked Data courtesy of AWWW (Architecture of the 
World Wide Web):


1. Verifiable Identifiers -- as delivered via WebID (leveraging 
Trust Logic and FOAF)
2. Access Control Lists -- an application of WebID and Web Access 
Control Ontology
3. Heterogeneous Data Access and Integration -- basically taking us 
beyond the limits of ODBC, JDBC, etc.


Let's apply the items above to some contemporary solutions that 
illuminate the costs of not addressing the above:


1. G+ -- the real name debacle is WebID 101 re. pseudonyms, 
synonyms, and anonymity
2. Facebook -- all the privacy shortcomings boil down to not 
understanding the power of InterWeb scale verifiable identifiers and 
access control lists
3. Twitter -- inability to turn Tweets into structured annotations 
that are basically nano-memes
4. Email, Comment, Pingback SPAM -- a result of not being able to 
verify identifiers
5. Precision Find -- going beyond the imprecision of Search Engines 
whereby subject attribute and properties are used to contextually 
discover relevant things (explicitly or serendipitously).


The problem isn't really a shortage of solutions, far from it.

For the sake of argument only, conceding these are 

Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Kingsley Idehen

On 8/18/11 1:40 PM, Patrick Durusau wrote:

Kingsley,

From below:

This critical value only materializes via appropriate context 
lenses. For decision makers it is always via opportunity costs. If 
someone else is eating your lunch by disrupting your market you simply 
have to respond. Thus, on this side of the fence it's better to focus 
on eating lunch rather than warning about the possibility of doing 
so, or outlining how it could be done. Just do it! 


I appreciate the sentiment, Just do it! as my close friend Jack Park 
says it fairly often.


But Just do it! doesn't answer the question of cost/benefit.


I mean: just start eating the lunch i.e., make a solution that takes 
advantage of an opportunity en route to market disruption. Trouble with 
the Semantic Web is that people spend too much time arguing and 
postulating. Ironically, when TimBL worked on the early WWW, his mindset 
was: just do it! :-)




It avoids it in favor of advocacy.


See my comments above. You are skewing my comments to match your desired 
outcome, methinks.




Example: Privacy controls and Facebook. How much would it cost to 
solve this problem?


I assume you know the costs of the above.
It won't cost north of a billion dollars to make a WebID based solution. 
In short, such a thing has existed for a long time, depending on your 
context lenses.



Then, what increase in revenue will result from solving it?


FB -- less vulnerability and bleed.

Startups or Smartups: massive opportunity to make sales by solving a 
palpable problem.




Or if Facebook's lunch is going to be eaten, say by G+, then why 
doesn't G+ solve the problem?


G+ is trying to do just that, but in the wrong Web dimension. That's why 
neither G+ nor FB have been able to solve the identity reconciliation 
riddle.




Are privacy controls a non-problem?

Your context lenses.

True, you can market a product/service that no one has ever seen 
before. Like pet rocks.


And they just did it!

With one important difference.

Their *doing it* did not depend upon the gratuitous efforts of 
thousands if not millions of others.


Don't quite get your point. I am talking about a solution that starts 
off with identity reconciliation, passes through access control lists, 
and ultimately makes virtues of heterogeneous data virtualization 
clearer re. data integration pain alleviation.


In the above we have a market place north of 100 Billion Dollars.



Isn't that an important distinction?


Yes, and one that has never been lost on me :-)

Kingsley


Hope you are having a great day!

Patrick



Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Patrick Durusau

Kingsley,

Here are some hard numbers on integration of data benefits:

Future Integration Needs: Emerging Complex Data - 
http://www.informatica.com/news_events/press_releases/Pages/08182011_aberdeen_b2b.aspx


*/Integration costs are rising/* -- As integration of external data 
rises, it continues to be a labor- and cost-intensive task, with 
organizations integrating external sources spending 25 percent of 
their total integration budget in this area. 


So I can ask a decision maker, what do you spend on integration now? 
Take 25% of that figure.


Compare to X cost for integration using my software Y.
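To make that comparison concrete, here is a rough sketch in Python. Only the 25 percent share comes from the press release quoted above; the budget and the candidate costs are made-up placeholders, and the function names are mine, not anyone's product API.

```python
# Illustrative only: every figure except the 25% share is a placeholder.
EXTERNAL_SHARE = 0.25  # share of the integration budget spent on external sources


def external_integration_spend(total_integration_budget: float) -> float:
    """Estimate what is currently spent on integrating external data."""
    return total_integration_budget * EXTERNAL_SHARE


def is_worth_buying(total_integration_budget: float, candidate_cost: float) -> bool:
    """A candidate (software Y at cost X) only wins if it costs less than current spend."""
    return candidate_cost < external_integration_spend(total_integration_budget)


budget = 2_000_000.0  # hypothetical annual integration budget
print(external_integration_spend(budget))
print(is_worth_buying(budget, 350_000.0))
print(is_worth_buying(budget, 600_000.0))
```

The point of the sketch is only that the decision maker's own figure sets the ceiling: any offering priced above the current external-integration spend is a loss before benefits are even discussed.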

Or better yet, selling the integrated data as a service.

Data that isn't in demand to be integrated, isn't.

Technique neutral, could be SemWeb, could be third-world coding shops, 
could be Watson.


Timely, useful, integrated results are all that count.

Hope you are having a great day!

Patrick


Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Kingsley Idehen

On 8/18/11 2:03 PM, Patrick Durusau wrote:

Kingsley,

Here are some hard numbers on integration of data benefits:

Future Integration Needs: Emerging Complex Data - 
http://www.informatica.com/news_events/press_releases/Pages/08182011_aberdeen_b2b.aspx


*/Integration costs are rising/* -- As integration of external data 
rises, it continues to be a labor- and cost-intensive task, with 
organizations integrating external sources spending 25 percent of 
their total integration budget in this area. 


So I can ask a decision maker, what do you spend on integration now? 
Take 25% of that figure.


Compare to X cost for integration using my software Y.

Or better yet, selling the integrated data as a service.

Data that isn't in demand to be integrated, isn't.

Technique neutral, could be SemWeb, could be third-world coding shops, 
could be Watson.


Timely, useful, integrated results are all that count.


Technique wouldn't be SemWeb. It would be Data Virtualization that 
leverages Semantic Web Project outputs such as:


1. Linked Data -- data homogenization (virtualization) mechanism
2. OWL  -- facilitator of reasoning against the virtualized substrate.

To the target customer the experience would go something like this:

1. Install Data Virtualization product
2. Identify heterogeneous data sources and their access method -- these 
will typically be accessible via ODBC, JDBC (if RDBMS hosted), Web Services 
(SOAP based or via RESTful patterns, what used to be SOA), or URLs, 
especially if external data sources are in the mix

3. Bind to data sources
4. Virtualize
5. Show the new levels of agility 1-4 accord across all tools capable of 
consuming URLs.
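As a toy illustration of steps 2-4 (and emphatically not a description of how any particular product works), the idea can be sketched with the Python standard library alone. The `http://example.org/` namespace, the table and column names, and the helper functions are all hypothetical; an in-memory SQLite table stands in for an ODBC/JDBC source and a CSV string stands in for an external file.

```python
import csv
import io
import sqlite3

BASE = "http://example.org/"  # placeholder namespace, not a real service


def rows_to_triples(entity, key, rows):
    """Map dict-like rows to (subject, predicate, object) triples with patterned URIs."""
    triples = []
    for row in rows:
        subject = f"{BASE}{entity}/{row[key]}"
        for column, value in row.items():
            if column != key:
                triples.append((subject, f"{BASE}schema/{column}", str(value)))
    return triples


# Source 1: a relational table (stand-in for an RDBMS reached via ODBC/JDBC).
db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE customer (id TEXT, name TEXT)")
db.execute("INSERT INTO customer VALUES ('c1', 'Acme')")
sql_rows = [dict(r) for r in db.execute("SELECT id, name FROM customer")]

# Source 2: a CSV document (stand-in for an external data source).
csv_rows = list(csv.DictReader(io.StringIO("id,country\nc1,UK\n")))

# "Virtualize": both sources end up in one graph because they share the natural key.
graph = rows_to_triples("customer", "id", sql_rows) + rows_to_triples("customer", "id", csv_rows)


def describe(subject):
    """Collect every property known about one URI, whatever source it came from."""
    return {p: o for s, p, o in graph if s == subject}


print(describe(BASE + "customer/c1"))
```

The design choice being illustrated is that the shared key (`id`) becomes part of the URI, so facts from both sources attach to the same identifier without either source being changed.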


What would you call such a product? At OpenLink Software we call it 
OpenLink Virtuoso :-)


Kingsley


Hope you are having a great day!

Patrick


Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Patrick Durusau

Kingsley,

On 8/18/2011 1:52 PM, Kingsley Idehen wrote:

On 8/18/11 1:40 PM, Patrick Durusau wrote:

Kingsley,

From below:

This critical value only materializes via appropriate context 
lenses. For decision makers it is always via opportunity costs. If 
someone else is eating your lunch by disrupting your market you 
simply have to respond. Thus, on this side of the fence it's better 
to focus on eating lunch rather than warning about the possibility 
of doing so, or outlining how it could be done. Just do it! 


I appreciate the sentiment, Just do it! as my close friend Jack 
Park says it fairly often.


But Just do it! doesn't answer the question of cost/benefit.


I mean: just start eating the lunch i.e., make a solution that takes 
advantage of an opportunity en route to market disruption. Trouble 
with the Semantic Web is that people spend too much time arguing and 
postulating. Ironically, when TimBL worked on the early WWW, his 
mindset was: just do it! :-)



Still dodging the question I see. ;-)



It avoids it in favor of advocacy.


See my comments above. You are skewing my comments to match your 
desired outcome, methinks.



You reach that conclusion pretty frequently.

I ask for hard numbers, you say that isn't your question and/or skewing 
your comments.




Example: Privacy controls and Facebook. How much would it cost to 
solve this problem?


I assume you know the costs of the above.
It won't cost north of a billion dollars to make a WebID based 
solution. In short, such a thing has existed for a long time, 
depending on your context lenses.




I assume everyone here is familiar with: http://www.w3.org/wiki/WebID ?

So we need to take the number of users who have a WebID and subtract 
that from the number of FaceBook users.


Yes?

The remaining number need a WebID or some substantial portion, yes?

So who bears that cost? Each of those users? It cost each of them 
something to get a WebID. Yes?


What is their benefit from getting that WebID? Will it outweigh their 
cost in their eyes?
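Put as arithmetic, the questions above look something like the sketch below. Every number is a placeholder I made up for illustration; I have no real figures for WebID adoption, per-user cost, or per-user benefit, and that is rather the point.

```python
def webid_rollout(service_users, existing_webid_users, cost_per_user, benefit_per_user):
    """Return (users still needing a WebID, total cost, total benefit, worth it?)."""
    remaining = max(service_users - existing_webid_users, 0)
    total_cost = remaining * cost_per_user
    total_benefit = remaining * benefit_per_user
    return remaining, total_cost, total_benefit, total_benefit > total_cost


# Hypothetical inputs: 1000 service users, 50 of whom already hold a WebID,
# a cost of 2 units and a perceived benefit of 3 units per remaining user.
print(webid_rollout(1000, 50, 2, 3))

# Flip the per-user figures and the same rollout is a net loss.
print(webid_rollout(1000, 50, 3, 2))
```

Whoever proposes the solution has to supply the last two arguments, and has to say whose pocket the cost comes out of.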



Then, what increase in revenue will result from solving it?


FB -- less vulnerability and bleed.

Startups or Smartups: massive opportunity to make sales by solving a 
palpable problem.




Or if Facebook's lunch is going to be eaten, say by G+, then why 
doesn't G+ solve the problem?


G+ is trying to do just that, but in the wrong Web dimension. That's 
why neither G+ nor FB have been able to solve the identity 
reconciliation riddle.



Maybe you share your observations with G and FB. ;-)

Seriously, I don't think they are as dumb as everyone seems to think.

It may well be they have had this very discussion and decided it isn't 
cost effective to address.




Are privacy controls a non-problem?

Your context lenses.

True, you can market a product/service that no one has ever seen 
before. Like pet rocks.


And they just did it!

With one important difference.

Their *doing it* did not depend upon the gratuitous efforts of 
thousands if not millions of others.


Don't quite get your point. I am talking about a solution that starts 
off with identity reconciliation, passes through access control lists, 
and ultimately makes virtues of heterogeneous data virtualization 
clearer re. data integration pain alleviation.


In the above we have a market place north of 100 Billion Dollars.



Yes, but your solution: ...starts off with identity reconciliation...

Sure, start with the critical problem already solved and you really are 
at a ...market place north of 100 Billion Dollars..., but that is all 
in your imagination.


Having a system of assigned and reconciled WebIDs isn't a zero-cost 
solution for users or businesses. It is going to cost someone to assign 
and reconcile those WebIDs. Yes?


Since it is your solution, may I ask who is going to pay that cost?



Isn't that an important distinction?


Yes, and one that has never been lost on me :-)



Interested to hear your answer since that distinction has never been 
lost on you.


Patrick



Kingsley


Hope you are having a great day!

Patrick



Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Patrick Durusau

Kingsley,

On 8/18/2011 2:25 PM, Kingsley Idehen wrote:

On 8/18/11 2:03 PM, Patrick Durusau wrote:

Kingsley,

Here are some hard numbers on integration of data benefits:

Future Integration Needs: Emerging Complex Data - 
http://www.informatica.com/news_events/press_releases/Pages/08182011_aberdeen_b2b.aspx


*/Integration costs are rising/* -- As integration of external data 
rises, it continues to be a labor- and cost-intensive task, with 
organizations integrating external sources spending 25 percent of 
their total integration budget in this area. 


So I can ask a decision maker, what do you spend on integration now? 
Take 25% of that figure.


Compare to X cost for integration using my software Y.

Or better yet, selling the integrated data as a service.

Data that isn't in demand to be integrated, isn't.

Technique neutral, could be SemWeb, could be third-world coding 
shops, could be Watson.


Timely, useful, integrated results are all that count.


Technique wouldn't be SemWeb. It would be Data Virtualization that 
leverages Semantic Web Project outputs such as:


1. Linked Data -- data homogenization (virtualization) mechanism
2. OWL  -- facilitator of reasoning against the virtualized substrate.

To the target customer the experience would go something like this:

1. Install Data Virtualization product
2. Identify heterogeneous data sources and their access method -- 
these will typically be accessible via ODBC, JDBC (if RDBMS hosted), Web 
Services (SOAP based or via RESTful patterns, what used to be SOA), or 
URLs, especially if external data sources are in the mix

3. Bind to data sources
4. Virtualize
5. Show the new levels of agility 1-4 accord across all tools capable 
of consuming URLs.


What would you call such a product? At OpenLink Software we call it 
OpenLink Virtuoso :-)


I would call it *no sale* if OpenLink Virtuoso + services costs more 
than I am spending now.


Isn't that the pertinent question?

Patrick



Kingsley


Hope you are having a great day!

Patrick


Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Bob Ferris

Just an example from practice:

http://blog.seevl.net/2011/08/18/about-json-ld-and-content-negotiation/

near the end of this blog post:

... Then, we save costs. - that's it! ;)

Cheers,


Bo



Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW

2011-08-18 Thread Patrick Durusau

Kingsley,

Citing your own bookmark file hardly qualifies as market numbers. People 
promoting technologies make up all sorts of numbers about what use of X 
will save. Reminds me of the music or software theft numbers. They have 
no relationship to any reality that I share.


It's been enjoyable as usual but without some common basis for 
discussion we aren't going to get any closer to a common understanding.


Hope you are having a great week!

Patrick



On 8/18/2011 3:24 PM, Kingsley Idehen wrote:

On 8/18/11 2:50 PM, Patrick Durusau wrote:

Kingsley,

On 8/18/2011 1:52 PM, Kingsley Idehen wrote:

On 8/18/11 1:40 PM, Patrick Durusau wrote:

Kingsley,

From below:

This critical value only materializes via appropriate context 
lenses. For decision makers it is always via opportunity costs. 
If someone else is eating your lunch by disrupting your market you 
simply have to respond. Thus, on this side of the fence it's better 
to focus on eating lunch rather than warning about the possibility 
of doing so, or outlining how it could be done. Just do it! 


I appreciate the sentiment, Just do it! as my close friend Jack 
Park says it fairly often.


But Just do it! doesn't answer the question of cost/benefit.


I mean: just start eating the lunch i.e., make a solution that takes 
advantage of an opportunity en route to market disruption. Trouble 
with the Semantic Web is that people spend too much time arguing and 
postulating. Ironically, when TimBL worked on the early WWW, his 
mindset was: just do it! :-)



Still dodging the question I see. ;-)


Of course not.

You want market research numbers, see the related section at the end 
of this reply. I sorta assumed you would have found this 
serendipitously though? Ah! You don't quite believe in the utility of 
this Linked Data stuff etc..






It avoids it in favor of advocacy.


See my comments above. You are skewing my comments to match your 
desired outcome, methinks.



You reach that conclusion pretty frequently.


See my earlier comment.



I ask for hard numbers, you say that isn't your question and/or 
skewing your comments.


Yes. I didn't know this was about market research and numbers [1].





Example: Privacy controls and Facebook. How much would it cost to 
solve this problem?


I assume you know the costs of the above.
It won't cost north of a billion dollars to make a WebID based 
solution. In short, such a thing has existed for a long time, 
depending on your context lenses.




I assume everyone here is familiar with: http://www.w3.org/wiki/WebID ?

So we need to take the number of users who have a WebID and subtract 
that from the number of FaceBook users.


Yes?


No!

Take the number of people that are members of a service that's 
ambivalent to the self calibration of the vulnerabilities of its 
members, aka privacy.




The remaining number need a WebID or some substantial portion, yes?


Ultimately they need a WebID absolutely! And do you know why? It will 
enable members to begin the inevitable journey towards self calibration 
of their respective vulnerabilities.


I hope you understand that society is old and the likes of G+, FB are 
new and utterly immature. In society, one is innocent until proven 
guilty or not guilty. In the world of FB and G+ the fundamentals of 
society are currently being inverted. Anyone can ultimately say 
anything about you. Both parties are building cyber police states via 
their respective silos. Grr... don't get me going on this matter.


Every single netizen needs a verifiable identifier. That's the bottom 
line, and WebID (courtesy of Linked Data) and Trust Semantics nails 
the issue.




So who bears that cost? Each of those users? It cost each of them 
something to get a WebID. Yes?


Look, here is a real-world example. Just google up on Wireshark re. 
Facebook and Google. Until the Wireshark episodes both peddled lame 
excuses for not using HTTPS. Today both use HTTPS. Do you want to know 
why? Simple answer: opportunity cost of not doing so became palpable.




What is their benefit from getting that WebID? Will it outweigh their 
cost in their eyes?


See comment above.

We've already witnessed Craigslist horrors. But all of this is child's 
play if identity isn't fixed on the InterWeb. If you think I need to 
give you market numbers for that too, then I think we are simply 
talking past each other (a common occurrence).





Then, what increase in revenue will result from solving it?


FB -- less vulnerability and bleed.

Startups or Smartups: massive opportunity to make sales by solving a 
palpable problem.




Or if Facebook's lunch is going to be eaten, say by G+, then why 
doesn't G+ solve the problem?


G+ is trying to do just that, but in the wrong Web dimension. That's 
why neither G+ nor FB have been able to solve the identity 
reconciliation riddle.



Maybe you share your observations with G and FB. ;-)


Hmm. wondering how you've concluded either way :-)



Seriously, I don't