Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-02 Thread Jet Villegas
This. I don't want to lose Jonas' point in this long thread, but I also
haven't read anything here that warrants new native parser(s) yet. Let's
iterate in Gaia for now. I don't see how a C++ metadata parser is
advantageous at this point, and the RDF history lessons certainly don't
encourage that path.

--Jet

On Wed, Jul 1, 2015 at 11:11 PM, Jonas Sicking  wrote:

> I'd definitely like to keep the implementation of whatever formats we
> use in Gaia given that this is still an experimental feature and the
> use cases are likely to evolve as we get user feedback.
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-02 Thread Eric Rescorla
On Thu, Jul 2, 2015 at 11:47 AM, Gordon Brander 
wrote:

> This thread has been fun to follow. There are only 2 hard problems in Comp
> Sci and naming things is one of them ;).
>
> Just wanted to quickly chip in: during our lively discussion about naming,
> let’s not forget Postel’s Law.
>
> It’s smart to debate which format we should encourage for _publishing_.
> It’s wise to be liberal in what formats we _accept_


Hmm... I'm not sure Postel was really referring to this kind of case so
much as about specification compliance. In any case, I think there's an
argument to be made that supporting a lot of formats is not a good thing.

See also:
http://datatracker.ietf.org/doc/draft-thomson-postel-was-wrong/

-Ekr


>
> So we can encourage developers to use the solution we think is best,
> while simultaneously falling back to anything reasonable that’s
> there. og:x, twitter:y, Microformats... if it’s being actively used on the
> web we would be silly to turn up our nose at good data!
>
> ---
> Gordon Brander
> Sr Design Strategist
> Mozilla
>
> On July 2, 2015 at 10:59:15 , Benjamin Francis (bfran...@mozilla.com)
> wrote:
> > On 2 July 2015 at 03:37, Tantek Çelik wrote:
> >
> > > tl;dr: It's time. Let's land microformats parsing support in Gecko as
> > > a Q3 Platform deliverable that Gaia can use.
> > >
> >
> > Happy to hear this!
> >
> >
> > > I think there's rough consensus that a subset of OG, as described by
> > > Ted, satisfies this. Minimizing our exposure to OG (including Twitter
> > > Cards) is ideal for a number of reasons (backcompat/proprietary
> > > maintenance etc.).
> > >
> >
> > That's certainly a good start. It seems a shame to intentionally filter out
> > all the extra meta tags used by other Open Graph types like:
> >
> > - music.song
> > - music.album
> > - music.playlist
> > - music.radio_station
> > - video.movie
> > - video.episode
> > - video.tv_show
> > - article
> > - book
> > - profile
> > - business
> > - fitness.course
> > - game.achievement
> > - place
> > - product
> > - restaurant.menu
> >
> > I envisage allowing the community to contribute addons to add extra
> > experimental card packs for types we don't support out of the box from day
> > one. Filtering out this data would make it very difficult for them to do
> > that, for no good reason.
> >
> > I absolutely understand the argument about having to maintain backwards
> > compatibility with a format if we don't want to promote it going forward
> > though, which is why I agree we should be conservative when adding built-in
> > Open Graph types.
> >
> > > There appear to be multiple options for this, with the best (most
> > > open, aligned with our mission, already open source interoperably
> > > implemented, etc.) being microformats.
> > >
> >
> > That is your opinion. There may be things you don't like about JSON-LD for
> > example, but it is a W3C Recommendation created through a standards body
> > and has open source implementations in just as many languages as
> > Microformats. There may be other more subjective measures of "open" you're
> > talking about, but I think it would be better for us all to stick to
> > arguments about technical merit and adoption statistics when making
> > comparisons in this case, at the risk of falling into the Not Invented Here
> > trap.
> >
> >
> > > "fulfils" mostly in theory. Schema is 99% overdesigned and
> > > aspirational, most objects and properties not showing up anywhere even
> > > in search results (except generic testing tools perhaps).
> > >
> >
> > > A small handful of Schema objects and subset of properties are
> > > actually implemented by anyone in anything user-facing.
> > >
> >
> > As I mentioned, level of current usage is not the most important criterion
> > for Gaia's own requirements, but if we're talking about how proven these
> > schemas are, according to schema.org these are the number of domains which
> > use the schemas we're talking about:
> >
> > - Person - over 1,000,000 domains
> > - Event - 100,000 - 250,000 domains
> > - ImageObject - over 1,000,000 domains
> > - AudioObject - 10,000 - 50,000 domains
> > - VideoObject - 100,000 - 200,000 domains
> > - RadioChannel - fewer than 10 domains
> > - EmailMessage - 100 - 1000 domains
> > - Comment - 10,000 - 50,000 domains
> >
> > The only equivalent data I have for Microformats is for hCard (equivalent
> > to the Person schema) from a crawl at the end of last year [1], and it has
> > about the same usage:
> >
> > - hCard - 1,095,517 domains
> >
> > The data also shows that Microdata and RDFa are used on more pages per
> > domain than Microformats.
> >
> > I'd say that Microformats looks at best equally as unproven on that basis,
> > though I'm open to new data.
> >
> >
> > > Everything else is untested, and claiming "fulfils these use cases"
> > > puts far too much faith in a company known for abandoning their
> > > overdesigned efforts (APIs, vocabularies, syntaxes!) every few years.
> > > Google Base 

Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-02 Thread Gordon Brander
This thread has been fun to follow. There are only 2 hard problems in Comp Sci 
and naming things is one of them ;).

Just wanted to quickly chip in: during our lively discussion about naming, 
let’s not forget Postel’s Law.

It’s smart to debate which format we should encourage for _publishing_.
It’s wise to be liberal in what formats we _accept_.

So we can encourage developers to use the solution we think is best, while 
simultaneously falling back to anything reasonable that’s there. og:x, 
twitter:y, Microformats... if it’s being actively used on the web we would be 
silly to turn up our nose at good data!
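To make the "liberal in what we accept" idea concrete, here is a minimal sketch of a fallback lookup. It assumes the page's meta tags have already been collected into a plain dictionary; the preference order and the `pick_first` helper are illustrative assumptions, not anything Gecko or Gaia defines.

```python
# Hypothetical preference order: the format we would encourage first,
# then anything else reasonable that is already on the page.
TITLE_KEYS = ["og:title", "twitter:title", "title"]


def pick_first(meta, keys):
    """Return the first non-empty value found under any of the keys, else None."""
    for key in keys:
        value = meta.get(key, "").strip()
        if value:
            return value
    return None


# A page that only carries Twitter card markup still yields a usable title.
page_meta = {"twitter:title": "Example Page", "og:title": ""}
print(pick_first(page_meta, TITLE_KEYS))
```

The order of `TITLE_KEYS` is where the "which format do we encourage" debate would live; the lookup itself stays indifferent to it.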

---
Gordon Brander
Sr Design Strategist  
Mozilla

On July 2, 2015 at 10:59:15 , Benjamin Francis (bfran...@mozilla.com) wrote:
> On 2 July 2015 at 03:37, Tantek Çelik wrote:
>  
> > tl;dr: It's time. Let's land microformats parsing support in Gecko as
> > a Q3 Platform deliverable that Gaia can use.
> >
>  
> Happy to hear this!
>  
>  
> > I think there's rough consensus that a subset of OG, as described by
> > Ted, satisfies this. Minimizing our exposure to OG (including Twitter
> > Cards) is ideal for a number of reasons (backcompat/proprietary
> > maintenance etc.).
> >
>  
> That's certainly a good start. It seems a shame to intentionally filter out
> all the extra meta tags used by other Open Graph types like:
>  
> - music.song
> - music.album
> - music.playlist
> - music.radio_station
> - video.movie
> - video.episode
> - video.tv_show
> - article
> - book
> - profile
> - business
> - fitness.course
> - game.achievement
> - place
> - product
> - restaurant.menu
>  
> I envisage allowing the community to contribute addons to add extra
> experimental card packs for types we don't support out of the box from day
> one. Filtering out this data would make it very difficult for them to do
> that, for no good reason.
>  
> I absolutely understand the argument about having to maintain backwards
> compatibility with a format if we don't want to promote it going forward
> though, which is why I agree we should be conservative when adding built-in
> Open Graph types.
>  
> > There appear to be multiple options for this, with the best (most
> > open, aligned with our mission, already open source interoperably
> > implemented, etc.) being microformats.
> >
>  
> That is your opinion. There may be things you don't like about JSON-LD for
> example, but it is a W3C Recommendation created through a standards body
> and has open source implementations in just as many languages as
> Microformats. There may be other more subjective measures of "open" you're
> talking about, but I think it would be better for us all to stick to
> arguments about technical merit and adoption statistics when making
> comparisons in this case, at the risk of falling into the Not Invented Here
> trap.
>  
>  
> > "fulfils" mostly in theory. Schema is 99% overdesigned and
> > aspirational, most objects and properties not showing up anywhere even
> > in search results (except generic testing tools perhaps).
> >
>  
> > A small handful of Schema objects and subset of properties are
> > actually implemented by anyone in anything user-facing.
> >
>  
> As I mentioned, level of current usage is not the most important criterion
> for Gaia's own requirements, but if we're talking about how proven these
> schemas are, according to schema.org these are the number of domains which
> use the schemas we're talking about:
>  
> - Person - over 1,000,000 domains
> - Event - 100,000 - 250,000 domains
> - ImageObject - over 1,000,000 domains
> - AudioObject - 10,000 - 50,000 domains
> - VideoObject - 100,000 - 200,000 domains
> - RadioChannel - fewer than 10 domains
> - EmailMessage - 100 - 1000 domains
> - Comment - 10,000 - 50,000 domains
>  
> The only equivalent data I have for Microformats is for hCard (equivalent
> to the Person schema) from a crawl at the end of last year [1], and it has
> about the same usage:
>  
> - hCard - 1,095,517 domains
>  
> The data also shows that Microdata and RDFa are used on more pages per
> domain than Microformats.
>  
> I'd say that Microformats looks at best equally as unproven on that basis,
> though I'm open to new data.
>  
>  
> > Everything else is untested, and claiming "fulfils these use cases"
> > puts far too much faith in a company known for abandoning their
> > overdesigned efforts (APIs, vocabularies, syntaxes!) every few years.
> > Google Base / gData / etc. likely "fulfilled" these use cases too.
> >
>  
> Our Gecko and Gaia code is not going to stop working if Google decides to
> use something else. Content authors on the wider web might migrate to newer
> vocabularies (or even syntaxes) over time, but that's something we're going
> to have to monitor on an ongoing basis anyway.
>  
> > Existing interoperably implemented microformats support most of these:
> >
> > - Contact - http://microformats.org/wiki/h-card
> > - Event - http://microformats.org/wiki/h-event
> > - Photo - http://microformats.o

Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-02 Thread Benjamin Francis
On 2 July 2015 at 03:37, Tantek Çelik  wrote:

> tl;dr: It's time. Let's land microformats parsing support in Gecko as
> a Q3 Platform deliverable that Gaia can use.
>

Happy to hear this!


> I think there's rough consensus that a subset of OG, as described by
> Ted, satisfies this. Minimizing our exposure to OG (including Twitter
> Cards) is ideal for a number of reasons (backcompat/proprietary
> maintenance etc.).
>

That's certainly a good start. It seems a shame to intentionally filter out
all the extra meta tags used by other Open Graph types like:

   - music.song
   - music.album
   - music.playlist
   - music.radio_station
   - video.movie
   - video.episode
   - video.tv_show
   - article
   - book
   - profile
   - business
   - fitness.course
   - game.achievement
   - place
   - product
   - restaurant.menu

I envisage allowing the community to contribute addons to add extra
experimental card packs for types we don't support out of the box from day
one. Filtering out this data would make it very difficult for them to do
that, for no good reason.

I absolutely understand the argument about having to maintain backwards
compatibility with a format if we don't want to promote it going forward
though, which is why I agree we should be conservative when adding built-in
Open Graph types.

> There appear to be multiple options for this, with the best (most
> open, aligned with our mission, already open source interoperably
> implemented, etc.) being microformats.
>

That is your opinion. There may be things you don't like about JSON-LD for
example, but it is a W3C Recommendation created through a standards body
and has open source implementations in just as many languages as
Microformats. There may be other more subjective measures of "open" you're
talking about, but I think it would be better for us all to stick to
arguments about technical merit and adoption statistics when making
comparisons in this case, at the risk of falling into the Not Invented Here
trap.


> "fulfils" mostly in theory. Schema is 99% overdesigned and
> aspirational, most objects and properties not showing up anywhere even
> in search results (except generic testing tools perhaps).
>

> A small handful of Schema objects and subset of properties are
> actually implemented by anyone in anything user-facing.
>

As I mentioned, level of current usage is not the most important criterion
for Gaia's own requirements, but if we're talking about how proven these
schemas are, according to schema.org these are the number of domains which
use the schemas we're talking about:

   - Person - over 1,000,000 domains
   - Event - 100,000 - 250,000 domains
   - ImageObject - over 1,000,000 domains
   - AudioObject - 10,000 - 50,000 domains
   - VideoObject - 100,000 - 200,000 domains
   - RadioChannel - fewer than 10 domains
   - EmailMessage - 100 - 1000 domains
   - Comment - 10,000 - 50,000 domains

The only equivalent data I have for Microformats is for hCard (equivalent
to the Person schema) from a crawl at the end of last year [1], and it has
about the same usage:

   - hCard - 1,095,517 domains

The data also shows that Microdata and RDFa are used on more pages per
domain than Microformats.

I'd say that Microformats looks at best equally as unproven on that basis,
though I'm open to new data.


> Everything else is untested, and claiming "fulfils these use cases"
> puts far too much faith in a company known for abandoning their
> overdesigned efforts (APIs, vocabularies, syntaxes!) every few years.
> Google Base / gData / etc. likely "fulfilled" these use cases too.
>

Our Gecko and Gaia code is not going to stop working if Google decides to
use something else. Content authors on the wider web might migrate to newer
vocabularies (or even syntaxes) over time, but that's something we're going
to have to monitor on an ongoing basis anyway.

> Existing interoperably implemented microformats support most of these:
>
> - Contact - http://microformats.org/wiki/h-card
> - Event - http://microformats.org/wiki/h-event
> - Photo - http://microformats.org/wiki/h-entry with u-photo property
> - Song - no current vocabulary - classic hAudio vocabulary could be
> simplified for this
> - Video - http://microformats.org/wiki/h-entry with u-video property
> - Radio station - no current vocabulary - worth researching with
> schema RadioChannel as input
> - Email - http://microformats.org/wiki/h-entry with u-in-reply-to property
> - Message - http://microformats.org/wiki/h-entry
>

OK, so there are actually three Microformats that are useful to us here.
For photos, videos, emails and messages we have to re-use the same hEntry
Microformat and try to figure out from its properties which type of thing
it is. For song and radio station we'd need to invent something new.
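As a rough illustration of what "figure out from its properties" would mean in practice, here is a sketch that maps a parsed item to a card type. It assumes items arrive in the JSON shape produced by microformats2 parsers (a "type" list plus a "properties" object); the `classify_item` helper, the card names and the order of the checks are all assumptions made for this example.

```python
def classify_item(item):
    """Guess a card type from a parsed microformats2-style item."""
    types = item.get("type", [])
    props = item.get("properties", {})
    if "h-card" in types:
        return "contact"
    if "h-event" in types:
        return "event"
    if "h-entry" in types:
        # One microformat, four possible cards: the properties decide.
        if props.get("photo"):
            return "photo"
        if props.get("video"):
            return "video"
        if props.get("in-reply-to"):
            return "email"
        return "message"
    return None  # song and radio station have no vocabulary to match


item = {"type": ["h-entry"], "properties": {"photo": ["https://example.org/a.jpg"]}}
print(classify_item(item))
```

Note that the result depends on the order of the checks: an entry carrying both a photo and an in-reply-to property could be either a photo or an email, and a reply to a message looks the same as an email.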

This is not very attractive for Firefox OS where we'd like to have clearly
defined types of cards with different card templates. It also makes it
harder for the community to create new types of cards

Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-02 Thread Kelly Davis
On Thu, Jul 2, 2015 at 4:37 AM, Tantek Çelik  wrote:

>
> > Schema.org also provides existing schemas for actions associated with
> items
> > (https://schema.org/docs/actions.html),
>
> ...
>
> Currently the IndieWeb community is pursuing Web Actions (and has them
> working across sites)
>
> http://indiewebcamp.com/webactions


TL;DR

WebActions, as presented in [1], are not sufficiently well developed for
us to base an implementation upon. With lots of additional work, they
could one day form the basis of an implementation, but, as a target for
FirefoxOS 2.5, they are simply not there yet.


!(TL;DR)

I am uneasy about going in too much detail about each point as I feel that
doing so will be a waste of my time and yours. So, I'll try to keep it
short.

The WebActions referred to in [1] have many problems which need to be
addressed before they enter general usage:

- They are not well defined
- They are not well defined enough to compute over
- There is no well defined means of extension
- There is no active community
- There is no means to specify action parameters
- The vocabulary of current actions is not sufficient to do anything now
- ...

They are not well defined - The main  tag is used "to wrap
any third party/silo action buttons/links". What exactly is a "third party
action"? Are schema.org actions supported? If so, how does the schema.org
"target" attribute interact with the  "with" attribute? If
schema.org actions are not supported, what "third party actions" are?
Explicitly, how do the URL templates, say, of such unspecified "third party
actions" interact with the  "with" attribute? Basically,
this document[1] needs much work before it can be said to define an
.

They are not well defined enough to compute over - Assume a dialer web app
presents a  tag corresponding to a "dial" action. (This is a
simple use case that we have to be able to handle. This is not an edge
case.) Assume further that there was a well defined means of adding actions
so such an action could even exist. Does the  tag for the
dial action contain a URL for every possible number it can dial? (The
description of [1] never uses anything like URL templates. So, this seems
to imply that only non-template URLs are allowed.) Assuming that this was
a mistake in the description of [1] and URL templates are allowed, how does
one specify the type of a URL template argument? In other words, could I
pass "fldska" as a telephone number? This is never touched upon. Again,
it's obvious here that much work is needed before the document[1] can be
said to define an  that is able to be computed over.

There is no well defined means of extension - How do I add a new action?
This is not mentioned. (There is mention made of one possible new verb
"tip"[2], but no detail is given on its meaning. It's only stated "tip -
for Flattr, Gittip buttons and maybe other payment providers.") Mention is
made that "we can create a common verb registry like the rel registry", but
no registry is ever presented nor is the means to register in such a
registry if it even existed. Again, this needs lots of work which hasn't
been done.

There is no active community - The last entry in the History section of [1]
is from 2012. In contrast, the last entry in the schema.org github
repository[3] is from yesterday.

There is no means to specify action parameters - There is no mention of how
an action is parameterized. Again, a simple example: take a dialer web app
that exposes a dialing web action to dial an arbitrary phone number. How
is this dial action exposed with WebActions? There is no specification or
discussion of how this would occur. We can't possibly have a fixed URL for
all possible numbers; there has to be some type of URL template that one
can specify. There is no mention made of this. Again, this needs work.
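To show the kind of thing that is missing, here is a minimal sketch of a parameterized action with a typed URL template argument. Nothing in it comes from [1]: the descriptor layout, the "tel" parameter type, its pattern and the `expand_action` helper are all hypothetical, chosen only to illustrate the dial case.

```python
import re
from urllib.parse import quote

# Hypothetical parameter types; neither the names nor the patterns come from [1].
PARAM_PATTERNS = {
    "tel": re.compile(r"\+?[0-9]{3,15}"),
}

# Hypothetical descriptor for a "dial" action exposed by a dialer web app.
DIAL_ACTION = {
    "verb": "dial",
    "template": "tel:{number}",
    "params": {"number": "tel"},
}


def expand_action(action, **values):
    """Fill the action's URL template, rejecting values of the wrong type."""

    def substitute(match):
        name = match.group(1)
        value = values[name]
        pattern = PARAM_PATTERNS[action["params"][name]]
        if not pattern.fullmatch(value):
            raise ValueError(f"{value!r} is not a valid {name}")
        return quote(value, safe="+")

    return re.sub(r"\{(\w+)\}", substitute, action["template"])


print(expand_action(DIAL_ACTION, number="+15555550100"))
```

With a declared type, passing "fldska" as the number raises a ValueError instead of producing a URL, which is exactly the check that cannot be expressed without a parameter mechanism.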

The vocabulary of current actions is not sufficient to do anything now -
The only actions that exist now are post, reply, repost, and like (and
maybe tip), and there is no official means to add new actions. So, what if
we wanted a dial action? Currently this is impossible. With the current
limited vocabulary and no means to add new actions, WebActions is a
non-starter for my use cases and those of Taipei.



TL;DR:

WebActions, as presented in [1], are not sufficiently well developed for us
to base an implementation upon. With lots of additional work, they could
one day form the basis of an implementation, but, as a target for FirefoxOS
2.5, they are simply not there yet.


[1] http://indiewebcamp.com/webactions
[2] http://indiewebcamp.com/webactions-verbs-brainstorming
[3] https://github.com/schemaorg/schemaorg/commits/sdo-ganymede


-- 
Kelly Davis
Bringing a voice to Firefox OS


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-01 Thread Jonas Sicking
I'd definitely like to keep the implementation of whatever formats we
use in Gaia given that this is still an experimental feature and the
use cases are likely to evolve as we get user feedback.

It seems to me that our use case here, beyond OG, is only our
"internal" content, i.e. Gaia. So effectively we can choose
whatever format we want here, as it has no effect on web content.

Given that, I'd definitely optimize for simplicity and simply extend
OG. If we want those extensions to not leak to the rest of the web we
can simply make the system app only honor those tags in Gaia content.
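A rough sketch of what that filtering could look like in the system app. The "gaia:" prefix for the extension tags and the use of the "app" URL scheme to recognise Gaia content are assumptions made for this example, as is the `honored_tags` helper.

```python
from urllib.parse import urlsplit

EXTENSION_PREFIX = "gaia:"  # hypothetical prefix for our own OG extensions
TRUSTED_SCHEMES = {"app"}   # assumption: Gaia content is identified by scheme


def honored_tags(page_url, tags):
    """Drop extension tags unless the page is Gaia content."""
    trusted = urlsplit(page_url).scheme in TRUSTED_SCHEMES
    return {
        name: value
        for name, value in tags.items()
        if trusted or not name.startswith(EXTENSION_PREFIX)
    }


tags = {"og:title": "Inbox", "gaia:action": "dial"}
print(honored_tags("https://example.org/page", tags))
print(honored_tags("app://email.gaiamobile.org/index.html", tags))
```

Plain OG tags pass through for every page, so existing web content behaves the same either way.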

/ Jonas


On Mon, Jun 29, 2015 at 2:47 PM, Benjamin Francis  wrote:
> Thanks for the responses,
>
> Let me reiterate the Product requirements:
>
>    1. Support for a syntax and vocabulary already in wide use on the web to
>    allow the creation of cards for the largest possible volume of existing
>    pinnable content
>    2. Support for a syntax with a large enough and/or extensible vocabulary
>    to allow cards to be created for all the types of pinnable content and
>    associated actions we need in Gaia
>
> We need to deliver this by B2G 2.5 FL in September.
>
> *Existing Web Content*
> I think we're agreed that Open Graph gives us enough of a minimum viable
> product for the first requirement. However, it's not OK to just hard code
> particular og types into Gecko, we need to be able to experiment with cards
> for lots of different Open Graph types without having to modify Gecko every
> time (imagine system app addons with experimental card packs).
>
> Open Graph is just meta tags and we already have a mechanism for detecting
> specific meta tags in Gaia - the metachange event on the Browser API. As a
> minimum all we need to do to access Open Graph meta tags is to extend this
> event to include all meta tags with a "property" attribute, which is only
> used by Open Graph. We could go a step further and extend the event to all
> meta tags, which would also give us access to Twitter card markup for
> example, but that isn't essential. We do not need an RDFa parser for this,
> we can filter/clean up the data in the system app in Gaia where necessary
> (the system app is widely regarded to be part of the platform itself).
>
> *Gaia Content*
>
> Open Graph does not have a large enough vocabulary, or (as Kelly says) the
> ability to associate actions with content, needed for the second
> requirement. Schema.org has a large existing vocabulary which basically
> fulfils these use cases, though some parts are more tested than others,
> with examples given in Microdata, RDFa and JSON-LD syntaxes, eg:
>
>    - Contact - http://schema.org/Person
>    - Event - http://schema.org/Event
>    - Photo - http://schema.org/Photograph
>    - Song - http://schema.org/MusicRecording
>    - Video - http://schema.org/VideoObject
>    - Radio station - http://schema.org/RadioChannel
>    - Email - http://schema.org/EmailMessage
>    - Message - http://schema.org/Comment
>
> Schema.org also provides existing schemas for actions associated with items
> (https://schema.org/docs/actions.html), although examples are only given in
> JSON-LD syntax. Schema.org is just a vocabulary and Tantek tells me it's
> theoretically possible to express this vocabulary in Microformats syntax
> too - it's possible to create new vendor prefixed types, or suggest new
> standard types to be added to the Microformats wiki. This would be required
> because Microformats does not have a big enough existing vocabulary for
> Gaia's needs. Microdata, RDFa and JSON-LD use URL namespaces so are
> extensible by design with a non-centralised vocabulary (this is seen as a
> strength by some, as a weakness by others).
>
> The data we have [1][2][3][4] shows that Microdata, then RDFa (sometimes
> considered to include Open Graph), is used by the most pinnable content on
> the web, but the data does not include all modern Microformats. We also
> don't have any data for JSON-LD usage. However, existing usage is not the
> most important criteria for the second requirement, it's how well it fits
> the more complex use cases in Gaia (and how much work it is to implement).
>
> There is resistance to implementing a full Microdata or RDFa parser in
> Gecko due to its complexity. JSON-LD is more self-contained by design (for
> better or worse) and could be handed over to the Gaia system app directly
> via the Browser API without any parsing in Gecko. Microformats is possibly
> less Gecko work to implement than Microdata or RDFa, but more than JSON-LD.
>
> *Conclusions*
>
> My conclusion is that the least required work in Gecko for the highest
> return would be:
>
>    1. *Open Graph* (bug 1178484) - Extending the existing metachange
>    Browser API event to include all meta tags with a "property" attribute.
>    This would allow Gaia to add support for all of the Open Graph types,
>    fulfilling requirement 1.
>    2. *JSON-LD* (bug 1178491) - Adding a linkeddatachange eve

Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-01 Thread Nicholas Nethercote
On Wed, Jul 1, 2015 at 7:37 PM, Tantek Çelik  wrote:
>
> There *is* a pretty strong engineering consensus, in both this thread,
> and other threads *against* any use of JSON-LD, or anything Linked
> Data or otherwise rebranded RDF / Semantic Web, and for good reason.

Indeed, just a few days ago bsmedberg -- the sole RDF module peer --
said (in https://bugzilla.mozilla.org/show_bug.cgi?id=1176160#c5):
"I'm hoping to just rm -rf rdf/ one of these days anyway".

See also https://bugzilla.mozilla.org/show_bug.cgi?id=833098 ("Kick
RDF out of Firefox") and
https://bugzilla.mozilla.org/show_bug.cgi?id=420506 ("Remove RDF use
from Thunderbird").

Nick


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-07-01 Thread Tantek Çelik
Great discussion and feedback in this thread - plenty to act on.

Thanks Ted Clancy for kicking this off with an impassioned reality
check. And thanks in particular to Benjamin Francis for summarizing
product requirements and use-cases, and especially to both Ted and Ben
for taking the time last week in Whistler to discuss all of this in person
- I definitely came away with a better understanding of the data,
problem space, and perspectives for Gaia's use-cases. I've also
followed up with Gregor and jst to broaden and double-check my
understanding and possible paths forward.


tl;dr: It's time. Let's land microformats parsing support in Gecko as
a Q3 Platform deliverable that Gaia can use.


Specifically:

On Mon, Jun 29, 2015 at 2:47 PM, Benjamin Francis  wrote:
> Thanks for the responses,
>
> Let me reiterate the Product requirements:
>
>    1. Support for a syntax and vocabulary already in wide use on the web to
>    allow the creation of cards for the largest possible volume of existing
>    pinnable content

I think there's rough consensus that a subset of OG, as described by
Ted, satisfies this. Minimizing our exposure to OG (including Twitter
Cards) is ideal for a number of reasons (backcompat/proprietary
maintenance etc.).


>    2. Support for a syntax with a large enough and/or extensible vocabulary
>    to allow cards to be created for all the types of pinnable content and
>    associated actions we need in Gaia

There appear to be multiple options for this, with the best (most
open, aligned with our mission, already open source interoperably
implemented, etc.) being microformats.

On that in particular:


> *Gaia Content*
>
> Open Graph does not have a large enough vocabulary, or (as Kelly says) the
> ability to associate actions with content, needed for the second requirement

The "associate actions with content" use-case is an interesting one
that's worthy of more specific follow-up on Kelly's response. More on
that separately.


> Schema.org has a large existing vocabulary which basically
> fulfils these use cases, though some parts are more tested than others,

"fulfils" mostly in theory. Schema is 99% overdesigned and
aspirational, most objects and properties not showing up anywhere even
in search results (except generic testing tools perhaps).

A small handful of Schema objects and subset of properties are
actually implemented by anyone in anything user-facing.

Everything else is untested, and claiming "fulfils these use cases"
puts far too much faith in a company known for abandoning their
overdesigned efforts (APIs, vocabularies, syntaxes!) every few years.
Google Base / gData / etc. likely "fulfilled" these use cases too.


> with examples given in Microdata, RDFa and JSON-LD syntaxes, eg:
>
>    - Contact - http://schema.org/Person
>    - Event - http://schema.org/Event
>    - Photo - http://schema.org/Photograph
>    - Song - http://schema.org/MusicRecording
>    - Video - http://schema.org/VideoObject
>    - Radio station - http://schema.org/RadioChannel
>    - Email - http://schema.org/EmailMessage
>    - Message - http://schema.org/Comment

This explicit list of use-cases is very helpful.

Existing interoperably implemented microformats support most of these:

- Contact - http://microformats.org/wiki/h-card
- Event - http://microformats.org/wiki/h-event
- Photo - http://microformats.org/wiki/h-entry with u-photo property
- Song - no current vocabulary - classic hAudio vocabulary could be
simplified for this
- Video - http://microformats.org/wiki/h-entry with u-video property
- Radio station - no current vocabulary - worth researching with
schema RadioChannel as input
- Email - http://microformats.org/wiki/h-entry with u-in-reply-to property
- Message - http://microformats.org/wiki/h-entry

For Song and Radio Station in particular - I will take the action of
bringing these use-cases to the microformats community and see what
the community can come up with, and how quickly. Discussion will be on
#microformats on Freenode (archived, see microformats.org/wiki/irc) if
anyone wants to contribute or just lurk.


> Schema.org also provides existing schemas for actions associated with items
> (https://schema.org/docs/actions.html),

The "actions" space has been a difficult and challenging one.

Google's (abandoned) "web intents" was one such effort.

Currently the IndieWeb community is pursuing Web Actions (and has them
working across sites)

http://indiewebcamp.com/webactions

There's likely potential there to connect webactions to be part of the
format of the post/page to be parsed, consumed, re-used.

Again, this is something I'll take to the #microformats community and
we can see what people there come up with.


> although examples are only given in
> JSON-LD syntax. Schema.org is just a vocabulary and Tantek tells me it's
> theoretically possible to express this vocabulary in Microformats syntax
> too - it's possible to create new vendor prefixed types, or suggest new
> standard types t

Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-29 Thread Benjamin Francis
Thanks for the responses,

Let me reiterate the Product requirements:

   1. Support for a syntax and vocabulary already in wide use on the web to
   allow the creation of cards for the largest possible volume of existing
   pinnable content
   2. Support for a syntax with a large enough and/or extensible vocabulary
   to allow cards to be created for all the types of pinnable content and
   associated actions we need in Gaia

We need to deliver this by B2G 2.5 FL in September.

*Existing Web Content*
I think we're agreed that Open Graph gives us enough of a minimum viable
product for the first requirement. However, it's not OK to just hard-code
particular og types into Gecko; we need to be able to experiment with cards
for lots of different Open Graph types without having to modify Gecko every
time (imagine system app addons with experimental card packs).

Open Graph is just meta tags and we already have a mechanism for detecting
specific meta tags in Gaia - the metachange event on the Browser API. As a
minimum all we need to do to access Open Graph meta tags is to extend this
event to include all meta tags with a "property" attribute, which is only
used by Open Graph. We could go a step further and extend the event to all
meta tags, which would also give us access to Twitter card markup for
example, but that isn't essential. We do not need an RDFa parser for this,
we can filter/clean up the data in the system app in Gaia where necessary
(the system app is widely regarded to be part of the platform itself).
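
To make the "all meta tags with a property attribute" idea concrete, here is a
minimal sketch in Python. It is not the Gecko or Gaia implementation, and it
does not use the Browser API; the names PropertyMetaCollector and
collect_property_meta are made up for illustration. It only shows the kind of
filtering described above: keep every meta tag that carries a "property"
attribute and ignore the rest.

```python
from html.parser import HTMLParser


class PropertyMetaCollector(HTMLParser):
    """Collect <meta property="..." content="..."> pairs from an HTML document.

    Hypothetical helper for illustration; not part of Gecko or Gaia.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property")
        content = attributes.get("content")
        if name and content is not None:
            # Keep the first value seen; real pages sometimes repeat a tag.
            self.properties.setdefault(name, content)


def collect_property_meta(html):
    """Return a dict of property -> content for every matching meta tag."""
    collector = PropertyMetaCollector()
    collector.feed(html)
    return collector.properties
```

Meta tags that only have a "name" attribute (viewport, description and so on)
are skipped, which matches the minimum proposal above. Widening the filter to
all meta tags would be a small change to the check in handle_starttag.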

*Gaia Content*

Open Graph does not have a large enough vocabulary, or (as Kelly says) the
ability to associate actions with content, needed for the second
requirement. Schema.org has a large existing vocabulary which basically
fulfils these use cases, though some parts are more tested than others,
with examples given in Microdata, RDFa and JSON-LD syntaxes, eg:

   - Contact - http://schema.org/Person
   - Event - http://schema.org/Event
   - Photo - http://schema.org/Photograph
   - Song - http://schema.org/MusicRecording
   - Video - http://schema.org/VideoObject
   - Radio station - http://schema.org/RadioChannel
   - Email - http://schema.org/EmailMessage
   - Message - http://schema.org/Comment

Schema.org also provides existing schemas for actions associated with items
(https://schema.org/docs/actions.html), although examples are only given in
JSON-LD syntax. Schema.org is just a vocabulary and Tantek tells me it's
theoretically possible to express this vocabulary in Microformats syntax
too - it's possible to create new vendor prefixed types, or suggest new
standard types to be added to the Microformats wiki. This would be required
because Microformats does not have a big enough existing vocabulary for
Gaia's needs. Microdata, RDFa and JSON-LD use URL namespaces so are
extensible by design with a non-centralised vocabulary (this is seen as a
strength by some, as a weakness by others).

The data we have [1][2][3][4] shows that Microdata, then RDFa (sometimes
considered to include Open Graph), is used by the most pinnable content on
the web, but the data does not include all modern Microformats. We also
don't have any data for JSON-LD usage. However, existing usage is not the
most important criterion for the second requirement; it's how well it fits
the more complex use cases in Gaia (and how much work it is to implement).

There is resistance to implementing a full Microdata or RDFa parser in
Gecko due to its complexity. JSON-LD is more self-contained by design (for
better or worse) and could be handed over to the Gaia system app directly
via the Browser API without any parsing in Gecko. Microformats is possibly
less Gecko work to implement than Microdata or RDFa, but more than JSON-LD.

*Conclusions*

My conclusion is that the least required work in Gecko for the highest
return would be:

   1. *Open Graph* (bug 1178484) - Extending the existing metachange
   Browser API event to include all meta tags with a "property" attribute.
   This would allow Gaia to add support for all of the Open Graph types,
   fulfilling requirement 1.
   2. *JSON-LD* (bug 1178491) - Adding a linkeddatachange event to the
   Browser API which is dispatched by Gecko whenever it encounters a script
   tag with a type of "application/ld+json" (as per the W3C recommendation
   [5]), including the JSON content in the payload of the event. This would
   allow the Gaia system app to support existing schema.org schemas
   (including actions), with the least amount of work in Gecko, and already in
   a JSON format it can store directly in the Places database
   (DataStore/IndexedDB).
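
As a rough illustration of the second item, the sketch below pulls the
contents of script tags with a type of "application/ld+json" out of a page
and decodes them as plain JSON, with no RDF processing at all. It is written
in Python purely to show the idea; the names JsonLdCollector and
collect_json_ld are hypothetical, and the real event would be dispatched by
Gecko rather than produced by a function like this. Blocks that are not valid
JSON are skipped, on the assumption that a consumer would rather lose one
block than fail on the whole page.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect decoded JSON from <script type="application/ld+json"> blocks.

    Hypothetical helper for illustration; not the Browser API.
    """

    def __init__(self):
        super().__init__()
        self.documents = []
        self._buffer = None  # None means "not inside a JSON-LD script"

    def handle_starttag(self, tag, attrs):
        script_type = dict(attrs).get("type") or ""
        if tag == "script" and script_type.lower() == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            text = "".join(self._buffer)
            self._buffer = None
            try:
                self.documents.append(json.loads(text))
            except ValueError:
                # Malformed JSON: skip this block instead of failing the page.
                pass


def collect_json_ld(html):
    """Return a list of decoded JSON-LD payloads found in the page."""
    collector = JsonLdCollector()
    collector.feed(html)
    return collector.documents
```

The result is ordinary JSON data, so a consumer can read fields such as
"@type" or "name" directly without a full JSON-LD processor.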

Kan-Ru is the owner of the Browser API module in Gecko and has said he's
happy with this approach and is happy to review the code. Let's go ahead
with that now, unblocking the work on the Gaia side. (Note that I have no
intention of building a full RDF style parser in Gaia, we'll just extract
the data we need from the JSON, for the good reasons t

Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-29 Thread Marcos Caceres



On June 29, 2015 at 7:07:33 AM, Michael Henretty (mhenre...@mozilla.com) wrote:
> We will definitely start with the simple open graph stuff that Ted
> mentioned ("og:title", "og:type", "og:url", "og:image", "og:description")
> since they are so widely used. And yes, even these simple ones are
> problematic. For instance, when navigating between dailymotion videos they
> keep the current meta tags, and just update the html body content. In
> fact, single-page-apps in general are hard here. Also, on the mobile
> version of youtube they leave out og tags entirely, probably as a
> performance optimization. Turns out, many sites do this. So in 2.5 we will
> have to account for all of this and the solution might not be pretty.

ok, it's good to see you've already started to encounter the issues. 

> I think Microformats addresses the aforementioned problems.

They might, though they can also change from under you in fun ways, or be 
invalid/incorrect. 

> But if youtube,
> wikipedia, pinterest, twitter, facebook, tumblr, etc don't use them widely
> what is the point of supporting them in a moz-internal API? Let's be
> pragmatic and start with og. What's the next biggest win for us? Is the
> data clear? Ben seems to think JSON-LD [1], does anyone have data to the
> contrary?

I don't have data, just some graying hair and warnings from the distant past 
[1]. You've all seen already how controversial these formats are, and hopefully 
you understand why now (expecting validity/sanity from the web is a non-starter 
- it's the fallacy of the semantic web, and why we mockingly call it the 
"pedantic web" and recoil in horror and lash out with rage at the mere mention 
of it). 

So flip the problem a bit: what you actually want is just simple data that can 
be transformed into a card, right? Basically, we scrape some text values from an
HTML page and you just put them into a different HTML document: the card.

As long as you don't expect validity of that data (i.e., you don't expect a 
standards conforming JSON-LD, RDFa, microdata, microformat, whatever parser*) 
then that frees us to build some kind of HTML Scraper that is actually built 
for purpose (one that is fault tolerant, and basically doesn't give a crap what 
the RDFa or JSON-LD spec says, but is designed to aggressively find the data 
you need to build nice cards). This is also why I suggest you start with og: 
data, because it basically takes the same approach: it doesn't give a crap what 
the RDFa spec says (and neither do developers that add it to their pages, as 
I'm sure you've already seen), it just defines some things by using some HTML 
elements that kinda-sorta look like RDFa. However, it comes with a ton of
problems which you will have a great time trying to deal with as you build the 
pinned-sites feature. The same with Twitter's card format. 

At the end of the day, what Gecko should be passing back is a simple JS object 
that contains:

{
  og: {... name/value pairs...},
  twitter: {... name/value pairs...},
  other_because_we_can_add_new_things_as_needed_yay: {... name/value pairs...}
}

If we are not going to be doing any semantic inferencing on that data or 
actually doing the "linked data" part, then we don't need a JSON-LD 
representation of it. We just need a fairly simple structure from which FxOS 
can build different cards. That avoids talk of supporting controversial formats 
like JSON-LD and RDFa, while actually supporting web content: in the sense 
that, "we are just pulling this 'og' meta stuff from the page, we don't care 
what it is".  
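
The structure sketched above could be produced from a flat set of scraped
meta name/value pairs roughly as follows. This is only an illustration in
Python under stated assumptions: the function name group_by_prefix is
hypothetical, the input is assumed to be a plain dict of strings, and the
grouping rule (split on the first colon) is my reading of the object shape
given above, not something the thread specifies.

```python
def group_by_prefix(pairs):
    """Group flat "prefix:key" names into one dict per prefix.

    Hypothetical helper for illustration. Names without a "prefix:key"
    shape are ignored rather than treated as errors.
    """
    grouped = {}
    for name, value in pairs.items():
        prefix, separator, key = name.partition(":")
        if not separator or not prefix or not key:
            continue  # tolerate junk instead of failing
        grouped.setdefault(prefix, {})[key] = value
    return grouped
```

Because nothing is validated against any vocabulary, a new prefix simply
shows up as a new top-level key, which is the "add new things as needed"
property of the object above.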

My 2c,

[1] Warning from 2003, that the same things happened with RSS. They had to 
abandon XML:
http://www.xml.com/pub/a/2003/01/22/dive-into-xml.html   

"I know, I know, this is how HTML got to be "tag soup": browsers that never 
complained. Now the same thing is happening in the RSS world because the same 
social dynamics apply. End users who can't even spell "XML" certainly don't 
care about silly little formatting rules; they just want to follow their 
favorite sites in their news aggregator. When 10% of the world's RSS feeds are 
not well-formed -- including some high-profile feeds that thousands of people 
want to read -- the ability to parse ill-formed feeds becomes a competitive 
advantage. (And if you think the same thing won't happen when RDF and the 
Semantic Web go mainstream, you're deluding yourself. The same social dynamics 
apply. Boy, is that going to be messy.)"





Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-29 Thread Marcos Caceres
On Saturday, June 27, 2015, Benjamin Francis wrote:

> On 26 June 2015 at 19:25, Marcos Caceres wrote:
>
>> Could we see some examples of the cards you are generating already with
>> existing data from the Web (from your prototype)? The value is really in
>> seeing that users will get some real benefit, without expecting developers
>> to add additional metadata to their sites.
>>
>
> The prototype only supports Open Graph, you can see some example cards in
> this video Pinning the Web - Prototoype
> https://www.youtube.com/watch?v=FiLnRoRjD5k
>

These look fantastic! So why not start with just those? Or are all those
card types done and thoroughly tested on a good chunk of Web content? As I
mentioned before, I'd be worried about the amount of error recovery
code that will be needed just for those types of cards. (Sorry, I don't
know any of the background and if you've already dealt with this).


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-29 Thread Marcos Caceres



On June 27, 2015 at 10:02:47 AM, Anne van Kesteren (ann...@annevk.nl) wrote:
> >
> The data I have does not back this up, Microdata is shown to be growing 
> fast whereas Microformats usage has remained relatively stable. 
> Also, we didn't find Microformats usage on any of the example 
> high profile sites we used during prototyping, it seems to be 
> more commonly used on Wordpress blogs and Indie Web style web 
> sites.

Could we see some examples of the cards you are generating already with 
existing data from the Web (from your prototype)? The value is really in seeing 
that users will get some real benefit, without expecting developers to add 
additional metadata to their sites.


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-29 Thread kdavis
Let me start by saying I don't care which format we use. (Formats come, and 
formats go.) I do care, however, that my use case is supported.

My use case, speech enabling web apps and web pages for Firefox OS's voice 
assistant Vaani, requires that the chosen format support something akin to 
schema.org's actions[1] as well as the ability for anyone to add custom 
actions. This use case is also required by the Taipei team working on the 
Firefox OS TV.

Open Graph[2] does not support such actions. Thus, it is not sufficient for our 
use case. (Facebook extended Open Graph with actions[3]. However, the set of 
valid actions is completely under Facebook's control which makes their Open 
Graph extension a non-starter.)

Microdata[4], RDFa[5], and JSON-LD[6] do support actions. Hence, support for at 
least one of these is sufficient for our use case.

Microformats[7] currently does not support actions. Hence, it is not sufficient 
for our use case.

The Vaani team and the Taipei team working on the Firefox OS TV would love to 
base our work on that being done for pinning the web. (One of the 3 virtues of 
a programmer *is* laziness.) However, if neither Microdata, RDFa, nor JSON-LD 
is supported, we will, unfortunately, be forced to go our own way.

[1] http://schema.org/Action
[2] http://ogp.me/
[3] https://developers.facebook.com/docs/sharing/opengraph/using-actions
[4] http://www.w3.org/TR/microdata/
[5] http://www.w3.org/TR/xhtml-rdfa-primer/
[6] http://www.w3.org/TR/json-ld/
[7] http://microformats.org/wiki/Main_Page


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-28 Thread Michael Henretty
On Sat, Jun 27, 2015 at 5:51 AM, Marcos Caceres  wrote:

>
> These look fantastic! So why not start with just those? Or are all those
> card types done and thoroughly tested on a good chunk of Web content? As I
> mentioned before, I'd be worried about the amount of error recovery
> code that will be needed just for those types of cards. (Sorry, I don't
> know any of the background and if you've already dealt with this).
>


We will definitely start with the simple open graph stuff that Ted
mentioned ("og:title", "og:type", "og:url", "og:image", "og:description")
since they are so widely used. And yes, even these simple ones are
problematic. For instance, when navigating between dailymotion videos they
keep the current meta tags, and just update the html body content. In
fact, single-page-apps in general are hard here. Also, on the mobile
version of youtube they leave out og tags entirely, probably as a
performance optimization. Turns out, many sites do this. So in 2.5 we will
have to account for all of this and the solution might not be pretty.

I think Microformats addresses the aforementioned problems. But if youtube,
wikipedia, pinterest, twitter, facebook, tumblr, etc don't use them widely
what is the point of supporting them in a moz-internal API? Let's be
pragmatic and start with og. What's the next biggest win for us? Is the
data clear? Ben seems to think JSON-LD [1], does anyone have data to the
contrary?

[1] https://groups.google.com/d/msg/mozilla.dev.platform/5sUoRTPDnSE/24ckuPSydjQJ


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-27 Thread Benjamin Francis
On 26 June 2015 at 19:25, Marcos Caceres  wrote:

> Could we see some examples of the cards you are generating already with
> existing data from the Web (from your prototype)? The value is really in
> seeing that users will get some real benefit, without expecting developers
> to add additional metadata to their sites.
>

The prototype only supports Open Graph, you can see some example cards in
this video https://www.youtube.com/watch?v=FiLnRoRjD5k


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-26 Thread Benjamin Francis
On 26 June 2015 at 17:02, Anne van Kesteren  wrote:

> I would encourage you to go a little deeper...
> We need to judge standards on their merits


I did look deeper. I read most of the specifications and several papers
on their adoption. My personal conclusion was that not only does
Microformats appear to be used less widely than other competing formats,
but that from a technical point of view just adding "h-" prefixes to class
names seems like a massive hack.

Many of the arguments I've heard in favour of Microformats are that it's
the "grassroots" or "non-evil" solution.

It's equally true that not being a W3C recommendation doesn't automatically
make something better either.

But I'm not the person that will have to implement this, and the people who
are think we should use Microformats.

Ben


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-26 Thread Anne van Kesteren
On Fri, Jun 26, 2015 at 2:18 PM, Benjamin Francis  wrote:
> When I look at RDFa, Microdata and JSON-LD I see formal W3C
> recommendations, extensive vocabularies which (at least on the surface) are
> agreed on by all the big search engines, and I see a clean engineering
> solution (albeit fairly complex).

Based on this kind of reasoning we almost ended up with XForms. I
would encourage you to go a little deeper. Let's make it clear for all
of dev.platform, a W3C Recommendation means nothing. Pretty much
anyone can get one. We need to judge standards on their merits and not
jump on the next XForms/XML/WS-*/SVG bandwagon.


-- 
https://annevankesteren.nl/


Re: Linked Data must die. (was: Linked Data and a new Browser API event)

2015-06-26 Thread Benjamin Francis
On 26 June 2015 at 12:58, Ted Clancy  wrote:

> My apologies for the fact that this is such an essay, but I think this has
> become necessary.
>
> Firefox OS 2.5 will be unveiling a new feature called Pinning The Web, and
> there's been some discussion about whether we should leverage technologies
> like RDFa, Microdata, JSON-LD, Open Graph, and Microformats for this
> purpose.
>
> First, I'd like to give some background on these technologies.
>
> In 2001, Tim Berners-Lee said that the "Semantic Web" was the future of
> the web and was going to revolutionize our world. (
> http://www.scientificamerican.com/article/the-semantic-web/)
>
> The Semantic Web was a doomed idea, for reasons best articulated in an essay
> by Cory Doctorow entitled "Metacrap", also written in 2001. (
> http://www.well.com/~doctorow/metacrap.htm) After 14 years of the
> Semantic Web not revolutionizing our world, I think history suggests that
> Cory Doctorow was right.
>
> But because the Semantic Web was "the next big thing", millions of dollars
> were poured into it (mostly in the form of research grants and crappy
> specs, from what I can gather). In 2004, RDFa became the first big standard
> to emerge from this work. RDFa is a W3C Recommendation, and work is still
> proceeding on it.
>
> JSON-LD was started in 2008 as a JSON-based alternative to RDFa. As the
> author of JSON-LD, Manu Sporny, states:
>
> "RDF is a shitty data model. It doesn’t have native support for lists.
> LISTS for fuck’s sake! [...] to work with RDF you typically needed a quad
> store, a SPARQL engine, and some hefty libraries. Your standard web
> developer has no interest in that toolchain because it adds more complexity
> to the solution than is necessary." (
> http://manu.sporny.org/2014/json-ld-origins-2/)
>
> However, though it originally wanted to distance itself from RDFa, JSON-LD
> ended up being chosen as a serialization for RDFa:
>
> "Around mid-2012, the JSON-LD stuff was going pretty well and the newly
> chartered RDF Working Group was going to start work on RDF 1.1. One of the
> work items was a serialization of RDF for JSON. [...] The biggest problem
> being that many of the participants in the RDF Working Group at the time
> didn’t understand JSON." (ibid)
>
> (I just want everyone to note that in 2012, *THE AUTHORS OF RDFa DID NOT
> KNOW JSON*. This is in a spec that casually throws around propositional
> logic terms like "entails", and "subject-predicate-object triples".)
>
> JSON-LD is now a W3C recommendation, and has undergone added complexity to
> align it with RDFa. As Manu Sporny states, "Nobody was happy with the
> result" (ibid).
>
> Microdata is similar to RDFa, but without the benefit of being a W3C
> recommendation.
>
> Open Graph is a technology developed by Facebook. It's putatively a subset
> of RDFa. There is a small subset of Open Graph tags (og:title, og:type,
> og:url, and og:image) which are widely used for sharing content on social
> media like Facebook and Twitter.
>
> RDFa, Microdata, and JSON-LD can collectively be described as "Linked
> Data" technologies, so called because their intention is that semantic
> objects across different web pages would "link" to each other to create a
> "Semantic Web".
>
> Microformats was developed circa 2005 as a lightweight way of putting
> semantic information into web pages, but does not aim to be a "Linked Data"
> or "Semantic Web" technology. It does not have an official standards body
> behind it, instead being maintained by a community of volunteers. One of
> our Mozilla employees, Tantek Çelik, was instrumental in its development.
>
>
Thanks for the history lesson :) When I started to research this area I
learnt very quickly that there are a lot of strong feelings on all sides
about which format is the "best", and many formats claim to supersede each
other. The reality is that there's still no clear winner on the web. So
what I've tried to do is to take a data driven approach to look at which
syntaxes and vocabularies are getting the most traction according to
research papers based on the Common Crawl corpus, the Bing corpus and the
Yahoo corpus (all the data I've found so far).

There are two high level requirements for the Pin the Web features:
1) Getting the most possible user value out of the data that already exists
on the web today
2) Finding the best solution for the use cases we have in Gaia apps which
can be implemented in the time frame we have for the 2.5 release (Feature
Landing on 21st September)

Based on the data available and the level of effort of implementation my
most recent conclusions for those requirements were:

1) Open Graph
2) JSON-LD

However, there's also a case for bonus points for a solution that we as
Mozilla actually want to see used in the future!


> Okay, now I'd like to discuss whether or not we should use these
> technologies for Pinning The Web.
>
> Open Graph: I think we need to use the four tags "og:title", "og:type",
> "og:url" and "og:image"