Re: [whatwg] Accessing local files with JavaScript portably and securely

2017-04-09 Thread Brett Zamir
I support such an approach and have found the usual "use a server" 
response a bit disheartening. Besides the stated cases, I believe it 
should just be easy for new programmers, children, etc., to try out 
simple projects with nothing more than a browser and text editor (and 
the console is not enough).


Another use case is sharing utility web apps, such as one that performs 
text replacements offline, without privacy/security fears that the text 
one pastes can be read remotely (provided remote file access can 
optionally be prevented).


I also support an approach that grants privileges beyond just the 
directory where the file is hosted and its subdirectories, as that 
restriction is too confining: e.g., if one has used a package manager 
like npm to install into root/node_modules, then root/examples/index.html 
could not access it.


FWIW, Firefox previously had support for `enablePrivilege` which allowed 
local file access (and other privileged access upon consent of the user) 
but was removed: https://bugzilla.mozilla.org/show_bug.cgi?id=546848


I created an add-on, AsYouWish, to allow one to get this support back 
but later iterations of Firefox broke the code on which I was relying, 
and I was not able to work around the changes. It should be possible, 
however, to implement a good portion of its capabilities now in 
WebExtensions to allow reading local files (even, optionally, from a 
remote server) upon user permission, though this would not solve the 
problem of making file:// URLs just work as is.


For one subset of the local file use case (and one of less concern, 
security-wise), local data files, I also created an add-on, WebAppFind 
(though I only got to Windows support), which allowed the user to open 
local desktop files from one's desktop into a web app without needing 
drag-and-drop from the desktop to the app.


One only needed to double-click a desktop file (or use "Open with..."), 
having previously associated the file extension with a binary that would 
invoke Firefox with command-line arguments my add-on would pick up. The 
add-on would then read from a local "filetypes.json" file in the same 
directory (or, alternatively, consult custom web protocols that sites 
had previously registered to gain permission to handle certain local 
file types) to determine which website had permission to be given the 
content and, optionally, to write any modified data back to the user's 
local file (all via `window.postMessage`). (The add-on didn't support 
arbitrary access to the file system, which has its own use cases, such 
as a local file browser or a wiki that can link to one's local desktop 
files in a manner that allows opening them, but it at least allowed web 
apps to become first-class consumers of one's local data.)


But this add-on also broke with later iterations of Firefox (like so 
many other add-ons, unlike, in my view, the much better-stewarded, 
backward-compatible web), and I haven't had the chance or energy to 
update it for WebExtensions, but such an approach might work for you if 
implemented as a new add-on, pending any adoption by browsers.


Best wishes,

Brett

On 10/04/2017 3:08 AM, whatwg-requ...@lists.whatwg.org wrote:



Today's Topics:

1. Accessing local files with JavaScript portably and securely
   (David Kendal)
2. Re: Accessing local files with JavaScript portably and
   securely (Melvin Carvalho)
3. Re: Accessing local files with JavaScript portably and
   securely (Jonathan Zuckerman)
4. Re: Accessing local files with JavaScript portably and
   securely (Philipp Serafin)


--

Message: 1
Date: Sun, 9 Apr 2017 11:51:14 +0200
From: David Kendal 
To: WHAT Working Group 
Subject: [whatwg] Accessing local files with JavaScript portably and
securely
Message-ID: <6edc81f3-95b0-4229-a3c5-76dbe548f...@dpk.io>
Content-Type: text/plain; charset=utf-8

Moin,

Over the last few years there has been a gradual downgrading of support
in browsers for running pages from the file: protocol. Most browsers now
have restrictions on the ability of JavaScript in such pages to access
other files.

Both Firefox and Chrome seem to have removed this support from XHR, and
there appears to be no support at all for Fetching local files from
other local files. This is an understandable security restriction, but
there is no viable replacement at present.

This is a shame because there are many possible uses for local static
files 

[whatwg] Array as first argument to fetch()

2015-03-27 Thread Brett Zamir
Since fetch() is making life easier as is and in the spirit of promises, 
how about taking it a step further to simplify the frequent use case of 
needing to retrieve multiple resources and waiting for all to return?


If the first argument to fetch() could be an array, then fetch() could 
be made to work like Promise.all() and return an array of the results to 
then().


Thanks,
Brett


Re: [whatwg] Array as first argument to fetch()

2015-03-27 Thread Brett Zamir

On 3/27/2015 8:39 PM, Anne van Kesteren wrote:

On Fri, Mar 27, 2015 at 1:28 PM, Brett Zamir bret...@yahoo.com wrote:

Since fetch() is making life easier as is and in the spirit of promises, how
about taking it a step further to simplify the frequent use case of needing
to retrieve multiple resources and waiting for all to return?

If the first argument to fetch() could be an array, then fetch() could be
made to work like Promise.all() and return an array of the results to
then().

It seems easy enough to just write that yourself, e.g.

   Promise.all([image, script].map(url => fetch(url)))

works fine in Firefox Nightly.


Thanks, I realize it's doable that way, but I think it makes for less 
distracting code when the implementation details of the map call and 
such are avoided for something as basic as loading resources...
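The proposed convenience could be approximated today as a thin wrapper; a sketch (the name `fetchAll` and the injectable `fetchImpl` parameter are mine, added so the idea is easy to try without a network):

```javascript
// Sketch of the proposed array-accepting fetch(), as a wrapper.
// `fetchImpl` is injectable purely for illustration/testing;
// it defaults to the environment's fetch.
function fetchAll(inputs, init, fetchImpl = globalThis.fetch) {
  return Promise.all(inputs.map((input) => fetchImpl(input, init)));
}

// Usage: fetchAll(['logo.png', 'app.js']).then(([img, script]) => { ... });
```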


Best,
Brett


Re: [whatwg] Shared storage

2014-11-10 Thread Brett Zamir

(Apologies...resending due to my inadvertent use of generic whatwg subject...)

On 10/28/2014 12:06 PM,whatwg-requ...@lists.whatwg.org  wrote:


Date: Tue, 28 Oct 2014 17:33:02 + (UTC)
From: Ian Hickson i...@hixie.ch
Subject: Re: [whatwg] Shared storage
On Sat, 15 Feb 2014, Brett Zamir wrote:

The desktop PC thankfully evolved into allowing third-party software
which could create and edit files shareable by other third-party
software which would have the same rights to do the same. The importance
of this can hardly be overestimated.

Yet today, on the web, there appears to be no standard way to create
content in such an agnostic manner whereby users have full, built-in,
locally-controlled portability of their data.

Why can't you just do the same as used to be done? Download the resource
locally (save, using `<a href download>`), then upload it to the new site
(open, using `<input type=file>`)?


Yes, as mentioned by others, this can become a terrible user experience,
and this is not to speak of user agent reasons.

Besides the added challenges mentioned by Katelyn Gadd when live updates
occur on multiple instances of the file across different applications,
there is also the even more common use case of a particular app being
able to remember the files used previously and let the user access them
without needing to remember their exact location in a hierarchy (though
allowing a hierarchy is desirable to the user for flexibility in
organization).

Such recall of, and access to, previously used files without repeated
need for manual selection is commonly found in apps which use the likes
of a recent files drop-down or, perhaps even more commonly, by a set
of tabs which open with the last set of used files.

There is also the specific desirability for functionality to iterate
through file names (or other shared data) so that apps can provide their
own UI, perhaps filtered down by file type according to the types of
files consumable by the app, such as an IDE project viewer which still
allows the user to group files app-agnostically and where they wish
along with other file types.

The ability to store files of different types within the same
user-viewable and user-creatable folder is compelling because the user
has freedom to group like content together even if the file types
differ. A user might wish to store an email draft, a word processing
file, and a set of images all in the same folder to keep track of them,
even if a given web app might not utilize all of these types. While
there are apps which aggregate data from different sources and then let
the user tag them in such a manner as to mimic this functionality, this
is again application-specific and not necessarily portable or as
flexibly under user control as would be a shared and hierarchical file
storage area.

Best,
Brett





Re: [whatwg] Proposal: Inline pronounce element

2014-06-09 Thread Brett Zamir

On 6/10/2014 3:05 AM, whatwg-requ...@lists.whatwg.org wrote:

Message: 1
Date: Sun, 08 Jun 2014 15:41:32 -0400
From: timel...@gmail.com
To: whatwg@lists.whatwg.org
Subject: Re: [whatwg] Proposal: Inline pronounce element
Message-ID: 20140608194132.7602328.57406@gmail.com
Content-Type: text/plain; charset=iso-8859-1

Tab wrote:

This is already theoretically addressed by `<link rel=pronunciation>`,
linking to a well-defined pronunciation file format. Nobody
implements that, but nobody implements anything new either, of course.

Brett wrote:

I think it'd be a lot easier for sites, say along the lines of
Wikipedia, to support inline markup to allow users to get a word
referenced at the beginning of an article, for example, pronounced
accurately.

Wikipedia can easily use data:... if it needs to.
And wiktionary already has a solution...

A better challenge is explaining to a screen reader if read is rEd or
rehD in a page where you want to define and use both. I claim that this
can be addressed with id= on the link and a ref= (or similar) on the use.

But before User Agents should be asked to support this, I'd want to see
real sites showing an interest.

Screen Reader vendors seem ok with the current state - they sell the 
pronunciation tables...


My thought was that browsers could expose some interface for getting the 
word pronounced even if the user was not using a screen reader, and 
without a site needing to supply its own JavaScript to apply styling and 
buttons around such tags so that, when clicked, a 
`SpeechSynthesisUtterance` would be made.
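A rough sketch of what such site-supplied JavaScript currently has to do, using the real Web Speech API; the `speakWord` helper name is mine, and it degrades to a no-op where the API is absent:

```javascript
// Sketch: pronounce a word via the Web Speech API where available.
// Returns true if speech was requested, false if the API is missing.
function speakWord(word, lang) {
  if (typeof SpeechSynthesisUtterance === 'undefined' ||
      typeof speechSynthesis === 'undefined') {
    return false; // e.g., non-browser environments, or no speech support
  }
  const utterance = new SpeechSynthesisUtterance(word);
  if (lang) utterance.lang = lang; // e.g., 'en-US'
  speechSynthesis.speak(utterance);
  return true;
}

// A site might wire this to a click handler on each pronunciation element.
```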


Brett



Re: [whatwg] Proposal: Inline pronounce element (Tab Atkins Jr.)

2014-06-04 Thread Brett Zamir

On 6/5/2014 3:05 AM, whatwg-requ...@lists.whatwg.org wrote:
  
On Tue, Jun 3, 2014 at 3:26 AM, Daniel Morris

daniel+wha...@honestempire.com wrote:

Hello,

With existing assistive technology such as screen readers, and more
recently the pervasiveness of new technologies such as Siri and Google
Now to name two examples, I have been thinking about the
appropriateness and potential of having a way to represent the
pronunciation of words on a web page.

There is currently no other text-level semantic that I know of for
pronunciation, but we have elements for abbreviation and definition.

As an initial suggestion:

<pronounce ipa="ˈaɪpæd">iPad</pronounce>

(Where the `ipa` attribute is the pronunciation using the
International Phonetic Alphabet.)

What are your thoughts on this, or does something already exist that I
am not aware of?

This is already theoretically addressed by `<link rel=pronunciation>`,
linking to a well-defined pronunciation file format.  Nobody
implements that, but nobody implements anything new either, of course.

~TJ


I think it'd be a lot easier for sites, say along the lines of 
Wikipedia, to support inline markup to allow users to get a word 
referenced at the beginning of an article, for example, pronounced 
accurately.


Brett



[whatwg] Self-imposed script or networking restrictions (toward privacy and XSS-protection, and stimulating offline tool creation and increased but safer privilege escalation)

2014-04-12 Thread Brett Zamir

*Problem:*

I believe that if Ajax or other forms of dynamic scripting had been 
absent but were proposed today, people would probably say things like:


1. What, you want to allow sites to have the ability out of the 
box to track my mouse movements?
2. You want sites to be able to know what I'm typing in a text box 
before I choose to explicitly submit?
3. You want to allow executable code, including third party code, 
to update itself without a means for a developer to review the source?
4. You want my generic offline build tool to cause users concern 
about whether the data I've added locally will be sent back to the server?


While browsers will hopefully allow user-directed control of this 
regardless of standards, a la NoScript or the like, there is no means (to 
my knowledge), besides perhaps packaging the file as an add-on with 
limited privileges or a sandboxed site-specific browser ( 
https://en.wikipedia.org/wiki/Site-specific_browser ), by which a 
website author can cause its own particular web app loaded over the web 
to be issued in such a manner as to allay privacy or cross-site 
scripting concerns by having the browser inform the user that a given 
type of (safe) app has been loaded, one which is not able to:

1. execute any scripts at all
2. phone home at all
3. phone home at all after the initial static HTML payload
4. phone home at all after the initial static HTML payload except 
to a list of origins indicated to the user


The above ought to also allay security concerns in the case that a 
script uses eval capabilities on a post-load network-delivered string.


*Proposed Solution:*

I would like to propose that HTTP headers, an internal HTML tag 
attribute (like the offline apps manifest attribute), and/or changes to 
the AppCache manifest structure be created which will cause desired 
security principal restrictions to be enforced, in particular either:


1. preventing all scripting
2. allowing scripting but prohibiting all Ajax or other networking 
requests (dynamic insertion of iframes, image or link tags with remote 
URLs, dynamic form submissions, etc.)
3. allowing scripting but prohibiting all Ajax or other networking 
requests except to URL origins or pages indicated within a whitelist 
(where the browser UI would display any such whitelist to users and/or 
prevent requests to the whitelist until the user approved, ideally 
providing them first a view of the proposed payload, and if approved, 
with a preview of the response payload and a choice as to whether to 
accept, and with the option to remember any such outgoing or incoming 
approvals in the future). Perhaps the AppCache manifest could be 
extended to provide this directive of prohibited or conditional updating.
4. optionally excluding loading of other resources within even the 
initial declarative HTML payload.


Data: URIs and blobs could, however, be allowed dynamically where 
scripting was allowed (e.g., through `window.open()`, dynamic execution 
of an `<a download>` tag, or the like) as well as statically, without 
the need for a whitelist.
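The existing Content-Security-Policy header already approximates restrictions (1) through (3), though without the proposed user-facing approval UI or payload preview; a sketch (the API origin shown is illustrative):

```
# Restriction 1: prevent all scripting
Content-Security-Policy: script-src 'none'

# Restriction 3: allow scripting, but confine the page's own loads and
# connections to the page's origin plus one whitelisted remote origin
Content-Security-Policy: default-src 'self'; connect-src https://api.example.com
```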


*Benefits:*

1. The designation by authors of this status could give assurance to 
users that privacy between page loads will not be violated on sites of 
general interest. This would be especially useful in apps utilizing 
IndexedDB where the user might make a lot of local modifications but not 
wish to share these back with the origin server.


2. It could also be a boon to web developers who wish to share generic 
build tools (whether ones that are online, offlinable via AppCache, 
and/or which run from file://) in a manner which does not raise privacy 
concerns.


Currently, many web apps are distributed along with OS-dependent build 
scripts or scripts using non-client-side languages such as Node.js, 
Python, or Java, and whose interpretation requires the downloading of 
additional tools. If the developer could create a tool which would not 
raise privacy concerns but would instead inform users that the code they 
have loaded locally will not be able to update itself without their 
permission (at least without giving consent to a proposed payload back 
to the server), developers may prefer to write in client-side JavaScript 
for their own convenience (given the ubiquity of client-side 
JavaScript), and with the benefit of facilitating modifications on their 
tools whether by paid developers or open source contributors who may be 
most likely to know at least some JavaScript. Their code could provide 
build functionality by compiling locally designated files or other 
content into a file download, textarea, etc.


3. If implemented, the sandboxing capability could be leveraged to blur 
the line between add-ons, web apps, and mobile apps and be used by 
site-specific browser implementations. Web app users would gain the same 
peace of mind and more granular privilege control of mobile apps.


As a result of users knowing there 

[whatwg] Shared storage

2014-02-14 Thread Brett Zamir

*The opportunity and current obstacles*

The desktop PC thankfully evolved into allowing third-party software 
which could create and edit files shareable by other third-party 
software which would have the same rights to do the same. The importance 
of this can hardly be overestimated.


Yet today, on the web, there appears to be no standard way to create 
content in such an agnostic manner whereby users have full, built-in, 
locally-controlled portability of their data.


*Workarounds*

Sure, there are postMessage or CORS requests, which can be used to allow 
one site to be the arbiter of this data.


And one could conceivably create a shared data store built upon even 
postMessage alone, even one which can work fully offline through cache 
manifests and localStorage or IndexedDB (I have begun some work on this 
concept at https://gist.github.com/brettz9/8876920 ), but this can only 
work if:


1. A site or set of domains is trusted to host the shared content.
2. Instead of being built into the browser, it requires that the shared 
storage site be visited at least one time.


*Proposal*

1. Add support for sharedStorage (similar to globalStorage but requiring 
approval), SharedIndexedDB, and SharedFileWriter/SharedFileSystem which, 
when used, would cause the browser to prompt the user for approval 
whenever storing to or retrieving from such data stores (with an option 
to remember the choice for a particular site/domain), informing users of 
potential risks depending on how the data might be used, and potentially 
allowing them to view, on the spot, the specific data being stored.


Optional API methods could deter XSS by doing selective escaping, but 
the potential for abuse should not be used as an excuse for preventing 
arbitrary shared storage, since, again, it has worked well on the 
desktop despite risks there, and it works with postMessage despite that 
also having risks.


2. Add support for corresponding ReadonlyShared storage mechanisms, 
namespaced by the origin site of the data. A site, http://example.com, 
might add such shared storage under example.com, which 
http://some-other-site.example could retrieve but not alter or delete 
(unless perhaps a grave warning were given to users about the fact that 
this was not the same domain). This would have the benefit over 
postMessage that if the origin site goes down, third-party sites would 
still have access to the data.
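The read-open/write-restricted semantics proposed here can be modeled as a toy in-memory store (all names are illustrative; a real implementation would live in the browser, behind permission prompts):

```javascript
// Toy model of origin-namespaced shared storage: any caller may read,
// but writes go only into the calling origin's own namespace.
class ReadonlySharedStore {
  constructor() {
    this.namespaces = new Map(); // origin -> Map(key -> value)
  }
  set(callerOrigin, key, value) {
    // Writes are confined to the caller's namespace by construction.
    if (!this.namespaces.has(callerOrigin)) {
      this.namespaces.set(callerOrigin, new Map());
    }
    this.namespaces.get(callerOrigin).set(key, value);
  }
  get(ownerOrigin, key) {
    // Reads are open to any caller, keyed by the owning origin.
    const ns = this.namespaces.get(ownerOrigin);
    return ns ? ns.get(key) : undefined;
  }
}
```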


3. Encourage browsers to allow direct editing of this stored data in a 
human-readable manner (with files at least being ideally directly 
viewable from the OS desktop).


I proposed something similar earlier, and received a reply about doing 
this through shared workers, but as I understood it, I did not like that 
possibility because:


a. it would limit the neutrality of the storage, creating one site 
as an unchallengeable arbiter of the data
b. it would increase complexity for developers
c. it would presumably depend on the setting of CORS directives to 
distinguish it from same-domain shared workers.


While https://wiki.mozilla.org/WebAPI/DeviceStorageAPI appears to meet a 
subset of these needs, it does not meet all.


Thank you,
Brett



Re: [whatwg] NoDatabase databases

2013-08-18 Thread Brett Zamir

On 8/17/2013 5:16 AM, Brendan Long wrote:

On 05/01/2013 10:57 PM, Brett Zamir wrote:

I wanted to propose (if work has not already been done in this area)
creating an HTTP extension to allow querying for retrieval and
updating of portions of HTML (or XML) documents where the server is so
capable and enabled, obviating the need for a separate database (or
more accurately, bringing the database to the web server layer).

Can't you use JavaScript to do this already? Just put each part of the
page in separate HTML or XML files, then have JavaScript request the
parts it needs and insert them into the DOM as needed.


Yes, one can, but:

1. It won't allow users to have their browser (or privileged add-on 
code) make such universal, cross-domain partial-document-obtaining 
requests to any webpage they wish (at least to any webpage which is on a 
server where a drop-in server module or script aware of this standard 
protocol had been employed).


Imagine, for example, if all a government had to do to release their 
data online was to save a Word doc, Excel file, Access database, etc. as 
HTML and FTP it to a publicly-accessible directory on their server (and 
add a server module aware of the HTML Query API which intercepts such 
queries sent to files in their public directory to handle XPath/CSS 
Selector query processing and send back CORS headers with the modified 
response). Bam, there is now a genuine, queryable database on the Web 
which is available to the world for querying.


One could obtain subsets of such data stores without the document owner 
(in this case, the government) needing to go through hoops to ensure 
their documents/data are converted into JSON/XML/etc., have custom REST 
APIs provided, have a search interface created, etc. (though this 
protocol would let people store their data in a JSON database, etc. if 
they wished, but they could also just upload static HTML files).


Consumers of this data (whether web developers or users of the 
browser/add-on concept mentioned above) would have no need to do 
inefficient screen scraping which first had to grab entire documents to 
be able to extract useful data. There would be no need for 
server-side-only solutions (at least if one is coming from a privileged 
environment such as a browser/add-on, if the document owner enabled CORS 
on their server, or if the site is one's own).


2. Such JavaScript solutions as you mention are custom and require 
developers to learn different client-side (and server-side) libraries 
and learn different server APIs. With a standard HTML Query API, one 
would need know nothing more than the URL of the data store (and the 
structure of the contents one was seeking) to get away with bare 
XMLHttpRequest (or $.ajax) calls that do what one wants against the data 
store--no need to know what specific query strings to add to meet the 
requirements of a custom server-side API. (In some cases, it may 
admittedly be more convenient to have a succinct query syntax optimized 
for the specific document format, but it is nice to always have the 
generic query option.)
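To make this concrete, a bare call against such a standard might look as follows; the `query` parameter name and the selector-in-URL convention are my assumptions, not part of any defined protocol:

```javascript
// Hypothetical: build a request URL carrying a CSS-selector query
// for a server implementing the imagined HTML Query API.
function buildQueryURL(docURL, selector) {
  const url = new URL(docURL);
  url.searchParams.set('query', selector);
  return url.href;
}

// A client would then simply fetch it, e.g.:
// fetch(buildQueryURL('https://data.example.gov/budget.html', 'table tr'))
//   .then((res) => res.text());
```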


3. Custom JavaScript requires sites to include such code in every file 
and to write scripts. Of course, SOME data necessitates customized 
access control such as a website's user database (though even here, one 
could use http://en.wikipedia.org/wiki/Basic_access_authentication to 
avoid scripting).


But even with scripts determining access control, many sites could still 
benefit by being able to, say, create, upload, and manage a Word document 
saved as HTML with a table whose (WYSIWYG) columns were user and 
password and then, as per #2 above, use a single reusable server-side 
library implementing the standard to query this document. The site 
could, if it wished, later switch to importing its document into a 
database while still keeping the HTML Query API library calls. And if a 
server-side script wanted to, say, let authenticated administrators query 
or alter the user table, client-side JavaScript could be output to them 
by the server-side code, which conducted queries against the user table 
in the same familiar manner.


4. If markup would be added to HTML which coordinated intelligently with 
this query scheme, say for example to allow querying of documents with 
known paragraph numbering (there are more interesting and frequently 
needed use cases than this with tables and lists as I'm planning to 
explain in my response to Ian, but I'll use a simpler example in this 
response)...


a. The document creator could create:

<article paragraphRange="">
  <p>This is par. 1</p>
  <p>This is par. 2</p>
  ...
  <p>This is par. 500</p>
</article>

b. an intermediary server plugin would detect the paragraphRange 
attribute and then auto-strip out all of the inner paragraphs before 
delivering the document to the user (unless say other markup were 
present on article such as `showRange=1-20` in which case it would 
only strip out paragraphs 21-500, or if `paragraphsPerPage

[whatwg] NoDatabase databases

2013-05-01 Thread Brett Zamir

Hi,

I'm not sure where to post this idea, but as it does pertain to HTML I 
thought I would post it here.


I wanted to propose (if work has not already been done in this area) 
creating an HTTP extension to allow querying for retrieval and updating 
of portions of HTML (or XML) documents where the server is so capable 
and enabled, obviating the need for a separate database (or more 
accurately, bringing the database to the web server layer).


There are three main use cases I see for this:

1) Allowing one-off queries to be made by (privileged) user agents. This 
avoids the need for websites willing to share their data to create their 
own database APIs and overhead while allowing both the client and server 
the opportunity to avoid delivering content which is not of interest to 
the user. Possible query languages might include CSS selectors, XPath, 
XQuery, or JavaScript.


2) Allowing third-party websites the ability to make such queries of 
other sites as in #1 but requiring user permission. I seem to recall 
seeing some discussions apparently reviving the possibility for 
JavaScript APIs to make cross-domain requests with user permission 
regardless of the target site giving permission.


3) The ability for user agents to allow the user to provide intelligent 
defaults for navigating a subset of potentially large data documents, 
potentially with the assistance of website mark-up, but without the need 
for website scripting. This could reduce development time and costs, 
while ensuring that powerful capabilities were enabled for users by 
default on all websites (at least those that opted in by a simple 
server-side configuration option). It could also avoid unnecessary 
demands on the server and wait-time for the client (benefiting energy 
usage, access in developing countries, wait-times anywhere for large 
documents, etc.), while conversely facilitating exposure by sites of 
large data-sets for users wishing to download a large data set. 
Web-based IDEs, moreover, could similarly allow querying and editing of 
these documents without needing to load and display the full data set 
during editing. Some concrete examples include:


a) Allowing ordered or unordered lists or definition/dialogue lists 
or any hierarchical markup to be navigated upon user demand. The client 
and server might, for example, negotiate the number of list items from a 
list to be initially loaded and shown such that the entire list would 
not be displayed or loaded but instead would load say only the first and 
last 5 items in the list and give the user a chance to manually load the 
rest if they were interested in viewing all of that data. Hierarchical 
lists, moreover, could allow Ajax-like drill-down capabilities (or if 
the user so configured their user agent, to automatically expand to a 
certain depth), all without the author needing to provide any scripting, 
letting them focus on content. Even non-list markup, like paragraphs, 
could be drilled into, as well as providing ellipses when the child 
content was determined to be above a given memory size or if the element 
was conventionally used to provide larger amounts of data (e.g., a 
textarea). (Form submission would probably need to be disabled though 
until all child content was loaded, and again, in order to avoid usage 
against the site's intended design, such navigation might require opt-in.)


b) Tables would merit special treatment as a hierarchical type as 
one may typically wish to ensure that all cells in a given row were 
shown by default (though even here, ellipses could be added when the 
data size was determined to be large), with pagination being the 
well-used norm of table-based widgets. Having markup specified on column 
headers (if not full-blown schemas) to indicate data types would be 
useful in this regard (markup on the top level of a list might similarly 
be useful); if the user agent were, for example, made aware of the fact 
that a table column consisted exclusively of dates, it would provide a 
search option to allow the user to display records between a given date 
range (as well as better handling sorting).


Rows could, moreover, be auto-numbered by the agent with an option to 
choose a range of numbers (similarly ranges could be provided for other 
elements, like paragraph or list item numbering, etc.). The shift to the 
user agent might also encourage the ability to reorder or remove columns.


c) Such a protocol would be enhanced by the existence of modular 
markup, akin to XInclude or XLink actuate=onload, whereby the user 
agent/user could determine whether or not to resolve child documents or 
allow the user to browse in a lite mode, selectively expanding only the 
items of interest, and content creators could easily and directly 
manipulate content files on their server desktop.


d) Queries made with a PUT method request could allow selective 
updating of a given table row or cell, or range of rows, a set of 

Re: [whatwg] Styling details

2013-01-02 Thread Brett Zamir

On 1/3/2013 4:35 AM, Anne van Kesteren wrote:

On Wed, Jan 2, 2013 at 8:53 PM, Ian Hickson i...@hixie.ch wrote:

Like most widgets, I think the answer is Web Components.

As far as I can tell styling form controls is an unsolved problem and
Components does not seem to be tackling it. We always play the
Components card (and before that the XBL card), but the work thus far
does not allow for altering how input is displayed, or subclassing
textarea somehow.

After a decade of waiting for this, I think it might be time to start
calling this vaporware.


In my ideal world, with HTML deprived of XML or XML-like extensibility 
(no entities or namespaces), and even with what Web Components or the 
like could do, and with JavaScript already being able to encompass this 
functionality, there appears to me to be a far greater need for making a 
standard and familiar/friendly JSON/JavaScript way to represent HTML 
than an HTML way to represent JavaScript. Such a format would also seem 
a whole lot easier to implement than Web Components.


I'd go so far as to hope for a special content-type which could avoid 
the need for HTML and even CSS syntax altogether.


I have been working on a syntax which I call JML (for JSON/JavaScript 
Markup Language, but pronounced as Jamilih, meaning Beauty in Arabic, 
and also my daughter's name--also perhaps a nice counterpart to the 
masculinish JSON), requiring, relative to JsonML, an additional array 
around all children for the sake of easier modularity (e.g., so that 
inline functions could be called to return fragments of text and/or 
children and drop them in as an element's children), and also allowing an 
easy-to-learn HTML/XML-leveraging means of extensibility shown through 
examples below.



exports.main = // Use of this line would allow the template below to be 
preceded (above) by function definitions, but could perhaps also be 
dropped to allow a simpler JSON or JSON-like content-type to similarly 
be renderable as HTML without it


[ // Optional document meta-attributes could come here
['html', [
['head', [
['script', [
// A JSON format sans functions could be possible, but 
allowable for convenience in templating

{$script: function (onReady) {
require(['depend1'], function (depend1) {
onReady(function () {
document.body.appendChild(['p', ['World']]);
depend1('no additional script tags needed for 
modularity');


// Ready and easy conversion
var jml = ['b', ['Hello']], html = 
'<b>Hello</b>', dom = document.createElement(jml);

jml === html.toJML() && 
jml === dom.toJML() && 
html === jml.toHTML() && 
html === dom.toHTML() && 
dom === html.toDOM() && 
dom === jml.toDOM(); // true
});
});
}}
]],
['style', [
// Non-array objects would normally represent attributes, 
but when prefixed with the
//   reserved '$', other features become possible for HTML 
(or XML)

{$css: [
['p[greeting]', ['color', 'blue']]
]}
]]
]],
['body', [
'text',
['p', {'class':'greeting'}, ['Hello!']],
{'$&#x': '263A'},
{$comment: 'Finished file!'}
]]
]]
]

I think the above would be very easy for existing developers and 
designers to learn.


While a declarative syntax is indeed helpful, developers often seem to 
believe that a declarative syntax is only possible through HTML or CSS. 
Declarative syntax is known primarily for being about avoiding 
unnecessary reference to control flow, but with the likes of XSL being 
able to, say, prove loop counts, and with JavaScript array extras now 
allowing function-based iteration without loop counts, it seems 
meaningless to make distinctions at a language level between declarative 
and imperative paradigms. Developers also seem to use declarative to 
refer to a syntax that is not only more human-readable (unlike 
non-transparent DOM objects), but also easily parsable--unlike raw 
JavaScript (or HTML), but possible with a subset of JavaScript as JML 
above. It could be optionally extended (by 3rd parties if nothing else) 
to support DOM methods, XPath/CSS Selectors, XSL, etc. Developers could 
effortlessly move between static or dynamic representations of HTML 
(without having to say first convert static HTML into 
line-break-cumbersome, syntax-highlighter-unfriendly JavaScript strings).
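To make the "effortless conversion" claim concrete, here is a minimal sketch of a serializer from a JML-style array to an HTML string. This is purely illustrative (jmlToHtml, and its treatment of the optional attributes object, are my assumptions, not part of the proposal):

```javascript
// Minimal sketch: convert a JML-style array -- [tag, {attrs}?, [children]?]
// -- into an HTML string. Hypothetical illustration, not the proposed API.
function jmlToHtml(node) {
  if (typeof node === 'string') return node; // plain text node
  var tag = node[0];
  var rest = node.slice(1);
  var attrs = {};
  // A non-array second element is the attributes object.
  if (rest.length && !Array.isArray(rest[0])) attrs = rest.shift();
  var children = Array.isArray(rest[0]) ? rest[0] : [];
  var attrStr = Object.keys(attrs)
    .map(function (k) { return ' ' + k + '="' + attrs[k] + '"'; })
    .join('');
  return '<' + tag + attrStr + '>' +
    children.map(jmlToHtml).join('') + '</' + tag + '>';
}

console.log(jmlToHtml(['p', {'class': 'greeting'}, ['Hello!']]));
// -> <p class="greeting">Hello!</p>
```

A real implementation would also need escaping and the `$`-prefixed special forms, but even this sketch shows how little machinery the static-to-dynamic round trip requires.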


I would think something like the above could be fleshed out to fill this 
need and could foster code reuse with HTML and CSS and in the 
process get designers better familiar with JavaScript (perhaps creating 
document with, if the proposed content-type is 

[whatwg] isAnyProtocolHandlerRegistered

2012-09-29 Thread Brett Zamir

Hi,

A former proposal (at 
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-January/018101.html 
) mentioned protocol handler check which did not require a URL string as 
in the second argument to isProtocolHandlerRegistered().


I don't see any other discussion of this, but I am puzzled by why a 
mechanism presumably designed to allow sites NOT to have to commit 
their users to a specific location is set up to effectively fail in 
this goal: there is no practical way to fall back to a specific site or 
even to a custom message (e.g., pointing users to instructions on where 
and how to register a protocol available on other sites), because there 
is no programmatic way to detect whether ANY protocol has been 
registered (nor any content-creator-friendly markup solution, such as a 
link attribute to be tried before the href attribute).


Yes, one can attempt to register a protocol through one's own site, but 
this requires the user to be willing to register with the linking site 
(and to do so) and requires the linking site to be willing to provide 
its own handler (and to do so). If, for example, I am not a bookstore, 
why do I need to pollute the user's handlers with my own ISBN handler 
just to be able to ensure the user clicking on my recommended book links 
can see something useful? Why should users of my blog have to trust my 
site as a competent ISBN handler for all their ISBN numbers? (This also 
raises the question of why one cannot register a protocol by some regex 
or at least wildcard ability, but that will hopefully be addressed by 
the apparent merging of registerProtocolHandler with Web Intents.)


I seem to recall some security discussion with concerns that being able 
to check would leak information such as whether the user had registered 
protocols like whistleblower-ID123:, but:


1) It is already effectively possible in all modern browsers (not sure 
about IE9) to check whether any protocol is handled (at least whether 
it is loadable) by inserting a hidden iframe with an onload event 
(DOMContentLoaded is sadly not available for iframes in any browser I 
tested) and waiting for a timer to detect, after a likely sufficient 
load time, whether the protocol loaded something. This is obviously 
clumsy: besides requiring one to load twice the (hopefully idempotent) 
URL which one wishes the user to visit, it requires a guess at an 
adequate timeout to wait for a likely complete page load.


2) The spec could overcome this (seemingly obscure) security concern if 
deemed necessary by requiring a distinct prefix for either readable 
and/or especially non-readable protocol types (e.g., 
exposed+whistleblower:).
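For concreteness, the clumsy iframe/timer workaround described in (1) might be sketched roughly as follows. This is hypothetical code (the function name, callback shape, and timeout are my assumptions), and it only runs in a browser:

```javascript
// Rough sketch of the hidden-iframe detection hack described in (1).
// Hypothetical; the timeout is pure guesswork, which is exactly the
// clumsiness being complained about. Browser-only (needs a DOM).
function probeProtocol(url, timeoutMs, callback) {
  var iframe = document.createElement('iframe');
  iframe.style.display = 'none';
  var loaded = false;
  iframe.onload = function () { loaded = true; };
  iframe.src = url; // first of the two loads of the (hopefully idempotent) URL
  document.body.appendChild(iframe);
  setTimeout(function () {
    document.body.removeChild(iframe);
    callback(loaded); // true suggests *some* handler produced a document
  }, timeoutMs);
}
```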


Thanks,
Brett



Re: [whatwg] runat (or server) attribute

2012-05-13 Thread Brett Zamir

On 5/13/2012 5:23 PM, Benjamin Hawkes-Lewis wrote:

On Sun, May 13, 2012 at 1:47 AM, Brett Zamirbret...@yahoo.com  wrote:

With Server-Side JavaScript taking off, could we reserve runat (or maybe
an even simpler and more concise server boolean attribute) for a standard
and (via CommonJS) potentially portable way for server-side files to be
created (and discourage use of HTML-unfriendly and
syntax-highlighter-unaware processing instructions)?

"server-side files to be created" - what do you mean?


So, no matter the Server-side JavaScript engine, one could write code 
like this:


<p>You loaded this page on <script server>print(new Date());</script></p>

which would become:

<p>You loaded this page on Sun May 13 2012 22:33:28 GMT+0800 (China 
Standard Time)</p>



What would this attribute do?
It would simply be reserved for server-side consumption, whether such 
applications would replace the contents of the tag with HTML or perform 
some non-transparent actions and just remove the tag. It would not do 
anything on the client-side (though it wouldn't need to restrict the 
default behavior either since it really should be consumed by the time 
it reaches browsers). The purpose is simply to allow server-side 
applications (such as within templates) to be able to use one HTML 
syntax portably across other server-side HTML+JavaScript generating 
environments (and also not be concerned, as with data-* attributes, 
that there might end up being some other future use of the attribute by 
client-side HTML).
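As an illustration of the intended consumption model (a hypothetical sketch, not any engine's actual API), a trivial server-side pass might splice such blocks in like this; `renderServerScripts` and the `print` helper are my inventions for the example:

```javascript
// Hypothetical sketch: a naive server-side pass that evaluates
// <script server> blocks and replaces each with its printed output.
// A real engine would sandbox the code; `print` here just collects text.
function renderServerScripts(html) {
  return html.replace(
    /<script\s+server>([\s\S]*?)<\/script>/g,
    function (match, code) {
      var out = '';
      var print = function (s) { out += String(s); };
      new Function('print', code)(print);
      return out;
    }
  );
}

var page = '<p>You loaded this page on ' +
  '<script server>print(new Date());</script></p>';
console.log(renderServerScripts(page));
```

The point of the attribute is only the shared, highlighter-friendly syntax; each environment would substitute its own sandboxed evaluation for the naive `new Function` call above.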


Brett



[whatwg] runat (or server) attribute

2012-05-12 Thread Brett Zamir

Hi,

With Server-Side JavaScript taking off, could we reserve runat (or 
maybe an even simpler and more concise server boolean attribute) for a 
standard and (via CommonJS) potentially portable way for server-side 
files to be created (and discourage use of HTML-unfriendly and 
syntax-highlighter-unaware processing instructions)?


I know in the past Ian has showed a lack of interest in tending to HTML 
in other contexts (e.g., in rejecting a common include syntax), but 
could we at least lay this simple foundation for encouraging server-side 
portability (in a way that works well with existing syntax highlighters)?


Thanks,
Brett


Re: [whatwg] Why won't you let us make our own HTML5 browsers?

2012-01-31 Thread Brett Zamir

On 2/1/2012 8:36 AM, Ian Hickson wrote:

On Sat, 17 Dec 2011, Brett Zamir wrote:

What is the reason you won't let us make our own browsers-in-a-browser?

What is the use case for browser-in-a-browser?

If you have a browser... then you have a browser. Why would you want to
run another one inside your browser?


It would let anyone with web skills have full creative control over 
the browser UI, which they can in turn easily share with others 
regardless of the host browser those people are using, and without the 
hassle of building and packaging browser-specific add-ons or 
executables, or forcing users to manage these executables. The facility 
of allowing such a tool to be written in a standard cross-browser 
language would encourage experimentation and choice, and ultimately, I 
believe, better and more tailored user experience.


I think such a recommendation as this also goes hand-in-hand with my 
earlier recommendation for the easy creation of shared databases which 
presumably unlike Shared Workers can persist outside of and be 
independent of the original application, and would allow extensibility 
for these browser-in-browsers (which we might call "bibs")--as well as 
new life for Open Data.


One could store the browsing history, for example, in such a shared 
database, which could in turn be accessed by other alternative bibs. 
Especially with experimentation on shared database formats, this could 
build confidence among users that the histories, bookmarks, cookies, 
data or documents saved for fast or aggregated local querying, or other 
personal data they are building would remain accessible to them on their 
machine even if they later chose another bib.


A more limited application of this idea would be for the earlier idea I 
shared for iframes to allow independent navigation controls. A web app 
could propose itself as the view for such iframes.



I'm not talking about some module you have to build yourself in order to
distribute a browser as an executable. I'm talking about visiting a
(secure/signed?) page on the web and being asked permission to give it
any or all powers including the ability to visit and display other
non-cross-domain-enabled sites, with the long-term possibility of
browsers becoming a mostly bare shell for installing full-featured
browsers (utilizing the possibility for APIs for these browsers to
themselves accept, integrate, and offline-cache add-on code from other
websites, emulating their own add-on system).

How do you help users who have no idea what that means and grant a hostile
Web site claiming to be a browser access to everything?
How do you help users who are told "Download this zip file and click 
this exe file"? Or, how do you help users who visit a site allowing 
malicious add-ons? No one has yet explained the supposed difference to 
me in any satisfactory manner.

I am not interested in the argument that "It is just too dangerous."
Browsers already allow people to download executables with a couple
clicks, not to mention install privileged browser add-ons. Enough said.

Well, in all fairness, browsers and operating systems are going out of
their way to make this harder and harder. Some (e.g. iOS, ChromeOS) make
it essentially impossible now, others (e.g. Android) require you to
explicitly opt-in to an obscure developer mode feature before allowing it,
others (e.g. MacOS, Windows) keep track of where files were obtained from
and give dire warnings before running apps from the Web.
Is giving a dire warning only conceivable and potentially intelligible 
to users of web apps, yet not for this proposal?


Brett



[whatwg] Browser-as-Desktop: Widgets in the browser

2012-01-31 Thread Brett Zamir

Hi,

This idea is more of a browser feature request, but it impinges on 
language features which could conceivably be allowed in HTML to trigger 
the feature. The idea would work best in browsers which allowed 
toolbars/full-screen-mode to be configured on a per-tab basis, but does 
not require it.


The idea is to bring the desktop view to the browser--for web pages to 
contain instructions indicating they were to be treated as widgets 
(though browsers could allow any page to be so converted as well). Note 
that this idea is NOT for widgets to be installed or displayed via an 
artificially separate mechanism from browsers, but rather for different 
tabs loaded in the browser to share a common user-determined canvas 
(though the browser might let the user drop the need for maintaining 
independent tabs, instead allowing them to be collapsed and allowing 
right-click to give options to remove or replace a widget).


Widget pages would be delivered with a truly transparent background (as 
opposed to a background:transparent merely used for convenience in CSS 
but which tends to have an assumed empty background unless specified).


The underlying background could be:
1) the user's machine desktop
2) a browser-provided background
3) a simple choice by the user of a background color, image, WebGL 
object, or, their superset, another non-transparent website (which 
might, for example, provide an interface for one's FileSystem, etc.).


The main use case is to allow widgets to be distributed as regular web 
pages, useable at their direct URL without the need for independent 
installation mechanisms, while still working with persistent tabs as 
some browsers offer, HTML5-driven offline apps, and a desktop-like view.


If an earlier suggestion I made for a browser to allow iframes to be 
shown with a set of navigation controls were adopted, a widget might 
itself serve as a URL navigation bar, allowing it to be dragged around 
as a widget and load its own page.


Multiple widgets could be provided on a single page, allowing those 
items' z-index to be maintained in reference to one another (perhaps with 
WebGL objects also useable in reference to such a z-index if this is not 
already possible), while the browser could allow the user to control the 
position and depth of a widget set relative to other widget sets (where 
a set could be single or multiple widgets, depending on the number of 
top-level elements in the page).


Moreover, the background itself might allow full 360 degree rotation and 
placement in any direction or depth.


Fundamental applications, such as sticky notes, could be zoomed in and 
out, perhaps rotated, and even websites not designed to be used as 
widgets could be zoomed and moved around within the background.


Best wishes,
Brett



Re: [whatwg] Why won't you let us make our own HTML5 browsers?

2011-12-17 Thread Brett Zamir

On 12/17/2011 3:27 PM, Andreas Gal wrote:

We are working on an API to allow implementing a web browser as an HTML5 
application. It is going to take quite a while to get the API and security 
model right, but we are definitely interested in the topic.

https://bugzilla.mozilla.org/show_bug.cgi?id=693515
Cool, thanks for this---I really think this needs to work from within a 
real browser for now, in order to win converts who aren't yet ready to 
let go of the built-in goodness of the likes of Firefox while they 
experiment with or co-exist with alternatives they may find on the web.


While this may allow the browser to become more stripped down as far as 
built-in UI controls, I hope this may simultaneously encourage adding 
back the broader built-in functionality made available to the likes of 
Seamonkey (e.g., to allow handling client-side email from a webapp in a 
non-proprietary manner as well).


Also, please while it is early enough in the process, do not assume that 
a browser will only want to allow one privileged frame nor ignore the 
potentially powerful capability of individual otherwise non-privileged 
websites having iframes with their own independent navigation controls 
(https://bugzilla.mozilla.org/show_bug.cgi?id=618354 ).


Best wishes,
Brett



Best regards,

Andreas

On Dec 16, 2011, at 11:16 PM, Brett Zamir wrote:


What is the reason you won't let us make our own browsers-in-a-browser?

I'm not talking about some module you have to build yourself in order to distribute a 
browser as an executable. I'm talking about visiting a (secure/signed?) page on the web 
and being asked permission to give it any or all powers including the ability to visit 
and display other non-cross-domain-enabled sites, with the long-term possibility of 
browsers becoming a mostly bare shell for installing full-featured browsers (utilizing 
the possibility for APIs for these browsers to themselves accept, integrate, 
and offline-cache add-on code from other websites, emulating their own add-on system).

Of course there are security risks, but a standardized, cross-platform, 
re-envisioned and expanded equivalent of ActiveX, which can work well with 
Firewalls, does not add to the risks already inherent in the web.

I am not interested in the argument that "It is just too dangerous." Browsers 
already allow people to download executables with a couple clicks, not to mention install 
privileged browser add-ons. Enough said. There is absolutely no real difference between 
these and what I am proposing, except that executables offer the added inconvenience of 
being non-cross-platform and awkward for requiring a separate, non-readily-unifiable 
means of managing installations. Otherwise, please someone tell me what is the 
/insurmountable/ difference?

I am not really interested in a prolonged technical discussion or debate about the 
limitations of existing technologies. I am asking at a higher level why bright people 
can't help us move to a web like this. As per Ian's signature, "Things that are 
impossible just take longer," I see no logical reason why such a web can't be 
envisioned and built.

 From the resistance I have seen to the idea among otherwise bright people, I 
can only reach the conclusion that there must be some ulterior motives behind 
the resistance. The main browsers would not be able to corner the market as 
easily anymore if such a thing happened. Because as long as there are these 
oligopolic fiefdoms requiring a separate set of JavaScript API standards for 
run-of-the-mill web developers to be able to develop privileged applications 
easily---or for them to be unable to interact in a privileged fashion with 
other such applications, there is less competition and sadly, the world won't 
see competitive and collective innovations leading to better privileged 
browsers.  Rather we are stuck with a centralized model whereby, the main 
browsers remain the gate-keepers of innovation.

The dream of "Write once, run anywhere" is thankfully becoming more realized with HTML5, 
though there is still a need for an expanded dream, something along the lines of "Write once, 
run anywhere, access any functionality desired," and the current albeit highly skilled 
custodians of the web seem to sadly lack the vision at the moment to at least point us in that 
direction, let alone have plans to achieve it. I would really like to know why others seem not to 
have seen this problem or reacted to it...

Admittedly, such a concept could, if the existing browser add-on systems 
adequately expose such high privileges to their add-ons, be initially 
implemented itself as an add-on, allowing a cross-browser API initiated from 
websites to trigger the add-on to ask for the granting of website privileges, 
but in order to be well-designed, I would think that this effort should fall 
under the umbrella of a wider, representative, consultative, and capable 
effort, which is supported in principle by the browsers so

[whatwg] Why won't you let us make our own HTML5 browsers?

2011-12-16 Thread Brett Zamir

What is the reason you won't let us make our own browsers-in-a-browser?

I'm not talking about some module you have to build yourself in order to 
distribute a browser as an executable. I'm talking about visiting a 
(secure/signed?) page on the web and being asked permission to give it 
any or all powers including the ability to visit and display other 
non-cross-domain-enabled sites, with the long-term possibility of 
browsers becoming a mostly bare shell for installing full-featured 
browsers (utilizing the possibility for APIs for these browsers to 
themselves accept, integrate, and offline-cache add-on code from other 
websites, emulating their own add-on system).


Of course there are security risks, but a standardized, cross-platform, 
re-envisioned and expanded equivalent of ActiveX, which can work well 
with Firewalls, does not add to the risks already inherent in the web.


I am not interested in the argument that "It is just too dangerous."  
Browsers already allow people to download executables with a couple 
clicks, not to mention install privileged browser add-ons. Enough said. 
There is absolutely no real difference between these and what I am 
proposing, except that executables offer the added inconvenience of 
being non-cross-platform and awkward for requiring a separate, 
non-readily-unifiable means of managing installations. Otherwise, please 
someone tell me what is the /insurmountable/ difference?


I am not really interested in a prolonged technical discussion or debate 
about the limitations of existing technologies. I am asking at a higher 
level why bright people can't help us move to a web like this. As per 
Ian's signature, "Things that are impossible just take longer," I see no 
logical reason why such a web can't be envisioned and built.


From the resistance I have seen to the idea among otherwise bright 
people, I can only reach the conclusion that there must be some ulterior 
motives behind the resistance. The main browsers would not be able to 
corner the market as easily anymore if such a thing happened. Because as 
long as there are these oligopolic fiefdoms requiring a separate set of 
JavaScript API standards for run-of-the-mill web developers to be able 
to develop privileged applications easily---or for them to be unable to 
interact in a privileged fashion with other such applications, there is 
less competition and sadly, the world won't see competitive and 
collective innovations leading to better privileged browsers.  Rather we 
are stuck with a centralized model whereby, the main browsers remain the 
gate-keepers of innovation.


The dream of "Write once, run anywhere" is thankfully becoming more 
realized with HTML5, though there is still a need for an expanded dream, 
something along the lines of "Write once, run anywhere, access any 
functionality desired," and the current albeit highly skilled custodians 
of the web seem to sadly lack the vision at the moment to at least point 
us in that direction, let alone have plans to achieve it. I would really 
like to know why others seem not to have seen this problem or reacted to 
it...


Admittedly, such a concept could, if the existing browser add-on systems 
adequately expose such high privileges to their add-ons, be initially 
implemented itself as an add-on, allowing a cross-browser API initiated 
from websites to trigger the add-on to ask for the granting of website 
privileges, but in order to be well-designed, I would think that this 
effort should fall under the umbrella of a wider, representative, 
consultative, and capable effort, which is supported in principle by the 
browsers so that at the very least they will not end up curtailing 
privileges to their add-ons down the line on which the effort depends.


Best wishes,
Brett



Re: [whatwg] Default encoding to UTF-8?

2011-12-01 Thread Brett Zamir

On 12/1/2011 2:00 PM, L. David Baron wrote:

On Thursday 2011-12-01 14:37 +0900, Mark Callow wrote:

On 01/12/2011 11:29, L. David Baron wrote:

The default varies by localization (and within that potentially by
platform), and unfortunately that variation does matter.

In my experience this is what causes most of the breakage. It leads
people to create pages that do not specify the charset encoding. The
page works fine in the creator's locale but shows mojibake (garbage
characters) for anyone in a different locale.

If the default was ASCII everywhere then all authors would see mojibake,
unless it really was an ASCII-only page, which would force them to set
the charset encoding correctly.

Sure, if the default were consistent everywhere we'd be fine.  If we
have a choice in what that default is, UTF-8 is probably a good
choice unless there's some advantage to another one.  But nobody's
figured out how to get from here to there.
How about a "Compatibility Mode" for the older non-UTF-8 character set 
approach, specific to the page? I wholeheartedly agree that something 
should be done here, preventing yet more content from piling up in 
outdated ways without any consequences. (The same for email clients, I 
would hope.)
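For what it's worth, the per-locale breakage described above is already avoidable for any individual page whenever the author declares the encoding explicitly, e.g.:

```html
<!-- An explicit declaration sidesteps locale-dependent default encodings -->
<meta charset="utf-8">
```

The open question in this thread is only what to do about the vast body of pages that omit it.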


Brett



Re: [whatwg] Extensible microdata attributes

2011-06-21 Thread Brett Zamir

On 6/14/2011 2:32 AM, Tab Atkins Jr. wrote:

On Mon, Jun 13, 2011 at 2:29 AM, Brett Zamirbret...@yahoo.com  wrote:

Thanks, that's helpful. Still would be nice to have item-* though...

Well, your idea for custom item-* attributes is just a way to more
concisely embed triples of non-visible data.  You already have a
mechanism for embedding non-visible triples (meta  orlink), so the
new method needs some decent benefits to justify the duplication of
functionality.
HTML could have been created without attributes too--but if one is going 
to use it frequently enough, concision is a big selling point (as is 
non-redundant styleability).

Additionally, while we recognize that non-visible data is sometimes
necessary to embed, we'd like to discourage its use as much as
possible (in general, non-visible data rots much faster).  One way to
do that is to make the syntax slightly cumbersome or ugly - when you
really need it, you can use it, but your aesthetic sense will keep it
from being the first tool you reach for.  So, making it easier or
prettier to embed non-visible triples is actually something we'd like
to avoid if we can.


People who are going to go to the trouble of adding semantics which do 
nothing for visual rendering are probably going to have some idea of 
what they are doing. And if there is an adequately convenient method, 
they will have the chance to learn from experience about the right balance.


And is my idea really encouraging hidden meta-data?

Even in my own example of using water damage:

<span itemprop="damage" item-agent="water">
So blurry
</span>

...this is allowing some extensibility (by allowing an indefinite number 
of attributes), but conceptually it is not so different from:


<span itemprop="water-damage">
So blurry
</span>

...which no one is calling hidden.

My suggestion is actually /helping/ avoid hidden meta tags not directly 
associated with an element encapsulating visible text.
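To make the consuming side of the idea concrete, a processor supporting the proposed item-* attributes might collect them along these lines. This is a sketch only: item-* is the proposal under discussion, not an implemented feature, and `getItemAttributes` is my own illustrative name:

```javascript
// Sketch of how a consumer might gather the proposed item-* attributes
// from an element into a plain object. Hypothetical; item-* is only a
// proposal here. Expects a DOM Element (or anything with .attributes).
function getItemAttributes(el) {
  var out = {};
  for (var i = 0; i < el.attributes.length; i++) {
    var attr = el.attributes[i];
    if (attr.name.indexOf('item-') === 0) {
      // e.g. item-agent="water" -> { agent: "water" }
      out[attr.name.slice('item-'.length)] = attr.value;
    }
  }
  return out;
}
```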



Note, though, that Microdata or RDFa may not be quite appropriate for
this kind of thing.  You're not marking up data triples for later
extraction as independent data - you're doing in-band annotations of
the document itself.  As such, a different mechanism may be more
appropriate, such as your original design of using a custom markup
language in XML, or using custom attributes in HTML.  There's no
particular reason for these sorts of things to be readable by
arbitrary robots; it's sufficient to design for ones that know exactly
what they're reading and looking for.

With the likes of Google offering Microdata-aware searches, I think it makes
a whole lot of sense to allow rich documents such as TEI ones to enter as
regular document citizens of the web, whereby the limited resources of such
specialized semantic communities can leverage the general purpose and
better-supported services such as Google's Microdata tool, while also having
their documents editable within the likes of WYSIWYG HTML text editors, and
stored on sites such as discussion forums or wikis where only HTML may be
allowed and supported.

I think such a focus would also enable the TEI community to benefit from
reusing search-engine-recognized schemas where available, as well as helping
the web community build new schemas for the unique needs of encoding
academic texts.

I haven't yet looked into TEI's metadata scheme, but is the TEI
metadata actually something that needs to be known to search engines?
The one example you've presented in your emails, annotating that some
parts of a transcription were water-damaged (and thus presumably
possibly inaccurate?), isn't something useful for search engines, but
only for humans looking at the document as a whole.
It could be useful to a search engine. If I remembered that some text 
was water-damaged, I could specify that I only wanted to look for 
water-damaged text (with the TEI itemtype).


But I used the water damage example to show something very minute and 
concrete. I could have given examples about how one wished to search for 
more frequent use cases such as finding a particular component of a 
structured bibliography, or find all quotations attributed to a 
particular author.


Search engines could of course be employed not only for searching the 
whole web, but for searching a particular site.



If most of the other metadata is similar, then the only reason to use
Microdata is to potentially make it easier to read/embed data via
Microdata-aware WYSIWYG editors (are there any?).  Or, possibly, to
use Microdata-extraction tools.
My point about editors was that relative to TEI XML, TEI in HTML could 
be put into editors. Relative to other approaches like using data-*, it 
would not be a particular advantage, outside of the fact that data-* is 
meant only to be used by the specific site, not for republishing by 
others. For example, if a publisher of a TEI Bible encoded a ton of 
semantics, using data-* to do so would let the document be previewable 
in a text editor or shared 

Re: [whatwg] Extensible microdata attributes

2011-06-13 Thread Brett Zamir

On 6/13/2011 2:41 PM, Tab Atkins Jr. wrote:

On Sat, Jun 11, 2011 at 4:20 AM, Brett Zamirbret...@yahoo.com  wrote:

For example, to take a water-damaged text (e.g., for the TEI element
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-damage.html ) which
in TEI could be expressed as:


<damage agent="water" xmlns="http://www.tei-c.org/ns/1.0/">Some water
damaged words</damage>

might be represented currently in Microdata as:

<span itemprop="damage" itemscope=""
itemtype="http://www.tei-c.org/ns/1.0/">
<meta itemprop="agent" content="water"/>
Some water damaged words
</span>

This still wouldn't quite work.  Embedded Microdata has no
relationship with the surrounding DOM - the only meaning carried is
whatever is actually being denoted as Microdata.  So, in the above
example, you're indicating that there is some water damage, but not
what is damaged.

If you wanted to address this properly, you'd need to format it like this:

<span itemprop="damage" itemscope itemtype="http://www.tei-c.org/ns/1.0/">
   <meta itemprop="agent" content="water">
   <span itemprop="subject">Some water damaged words</span>
</span>

This way, when you extract the Microdata, you get an item that looks like:

{ "items": [
    { "properties": {
        "damage": [
          { "type": "...",
            "properties": {
              "agent": ["water"],
              "subject": ["Some water damaged words"]
            }
          }
        ]
      }
    }
  ]
}
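The nesting above can be sanity-checked with a hand-rolled sketch. To be clear, this is not the real HTML microdata extraction algorithm (the spec at the time also defined a DOM API, document.getItems(), for this); the node objects below are plain-object stand-ins for DOM elements, invented purely for illustration:

```javascript
// Simplified model (NOT the spec algorithm) of how nested itemscope
// elements serialize into the { items: [...] } shape. "Nodes" here are
// hand-rolled stand-ins for DOM elements.
function extractItem(node) {
  const properties = {};
  for (const child of node.children || []) {
    if (!child.itemprop) continue;
    const value = child.itemscope ? extractItem(child)      // nested item
                : child.content !== undefined ? child.content // <meta content>
                : child.text;                                 // text content
    (properties[child.itemprop] = properties[child.itemprop] || []).push(value);
  }
  return node.itemtype ? { type: node.itemtype, properties } : { properties };
}

const span = {
  itemprop: 'damage', itemscope: true,
  itemtype: 'http://www.tei-c.org/ns/1.0/',
  children: [
    { itemprop: 'agent', content: 'water' },
    { itemprop: 'subject', text: 'Some water damaged words' }
  ]
};
const result = { items: [ { properties: { damage: [extractItem(span)] } } ] };
console.log(JSON.stringify(result, null, 1));
```

Running it reproduces the same nesting: the damage item carries its own agent and subject properties, while the outer item only knows it has a "damage" property.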


Thanks, that's helpful. Still would be nice to have item-* though...

Note, though, that Microdata or RDFa may not be quite appropriate for
this kind of thing.  You're not marking up data triples for later
extraction as independent data - you're doing in-band annotations of
the document itself.  As such, a different mechanism may be more
appropriate, such as your original design of using a custom markup
language in XML, or using custom attributes in HTML.  There's no
particular reason for these sorts of things to be readable by
arbitrary robots; it's sufficient to design for ones that know exactly
what they're reading and looking for.


With the likes of Google offering Microdata-aware searches, I think it 
makes a whole lot of sense to allow rich documents such as TEI ones to 
enter as regular document citizens of the web, whereby the limited 
resources of such specialized semantic communities can leverage the 
general purpose and better-supported services such as Google's Microdata 
tool, while also having their documents editable within the likes of 
WYSIWYG HTML text editors, and stored on sites such as discussion forums 
or wikis where only HTML may be allowed and supported.


I think such a focus would also enable the TEI community to benefit from 
reusing search-engine-recognized schemas where available, as well as 
helping the web community build new schemas for the unique needs of 
encoding academic texts.


Brett



[whatwg] Extensible microdata attributes

2011-04-26 Thread Brett Zamir

 Hi,

I'm interested to see more rich semantics (such as are made possible in 
Text Encoding Initiative documents) become available in HTML (e.g., 
toward making markup more conveniently and richly queryable on sites 
like Wikisource), but this concern relates to data-centric markup as well.


Without stepping into a microformats/RDFa debate (especially since it 
appears microdata is the one that is being successfully formalized), I 
would like to recommend that extensible data-*-like attributes become 
available to be associated with microdata items, but which do not 
impinge on the intended private nature of those data-* attributes. 
Perhaps item-* could be reserved for this purpose?


This would prevent the need for such ugly hacks as:

<span id="UnitedNations" style="display:none;" itemprop="orgName" 
item-placeName="New York">United Nations</span>

...
<blockquote itemscope="itemscope" itemtype="http://www.tei-c.org/ns/1.0">
<span itemprop="who" style="display:none;">#United_Nations</span>
We the Peoples of the United Nations determined to save succeeding 
generations from the scourge of war, which twice in our lifetime has 
brought untold sorrow to mankind...

</blockquote>

For the latter portion, one could instead just do:

<blockquote itemscope="itemscope" itemtype="http://www.tei-c.org/ns/1.0" 
item-who="#United_Nations">
We the Peoples of the United Nations determined to save succeeding 
generations from the scourge of war, which twice in our lifetime has 
brought untold sorrow to mankind...

</blockquote>

Incidentally, as for the existing property names, I'd think changing the 
item properties to "scope", "itype", and "prop" would make for much 
cleaner, more appealing, and more readable code.


Thanks,
Brett



Re: [whatwg] Extensible microdata attributes

2011-04-26 Thread Brett Zamir

 On 4/26/2011 5:33 PM, Benjamin Hawkes-Lewis wrote:

On Tue, Apr 26, 2011 at 7:36 AM, Brett Zamirbret...@yahoo.com  wrote:

This would prevent the need for such ugly hacks as:

<span id="UnitedNations" style="display:none;" itemprop="orgName"
item-placeName="New York">United Nations</span>
...
<blockquote itemscope="itemscope" itemtype="http://www.tei-c.org/ns/1.0">
<span itemprop="who" style="display:none;">#United_Nations</span>
We the Peoples of the United Nations determined to save succeeding
generations from the scourge of war, which twice in our lifetime has brought
untold sorrow to mankind...
</blockquote>

For the latter portion, one could instead just do:

<blockquote itemscope="itemscope" itemtype="http://www.tei-c.org/ns/1.0"
item-who="#United_Nations">
We the Peoples of the United Nations determined to save succeeding
generations from the scourge of war, which twice in our lifetime has brought
untold sorrow to mankind...
/blockquote

I'm confused by your examples. What extractable statement are you trying to
markup with microdata here? Is it: the United Nations is in New York?


That was one part, but I was mostly focusing on the quotation indicating 
that it was by the United Nations (which is an organization in New 
York). It is using a special attribute (in this case "item-who") rather 
than defining a (hidden) property-value child element (with itemprop="who").


Brett



Re: [whatwg] Extensible microdata attributes

2011-04-26 Thread Brett Zamir

 On 4/26/2011 9:22 PM, Benjamin Hawkes-Lewis wrote:

On Tue, Apr 26, 2011 at 11:48 AM, Brett Zamirbret...@yahoo.com  wrote:

  On 4/26/2011 5:33 PM, Benjamin Hawkes-Lewis wrote:

On Tue, Apr 26, 2011 at 7:36 AM, Brett Zamirbret...@yahoo.comwrote:

This would prevent the need for such ugly hacks as:

<span id="UnitedNations" style="display:none;" itemprop="orgName"
item-placeName="New York">United Nations</span>
...
<blockquote itemscope="itemscope" itemtype="http://www.tei-c.org/ns/1.0">
<span itemprop="who" style="display:none;">#United_Nations</span>
We the Peoples of the United Nations determined to save succeeding
generations from the scourge of war, which twice in our lifetime has
brought untold sorrow to mankind...
</blockquote>

For the latter portion, one could instead just do:

<blockquote itemscope="itemscope" itemtype="http://www.tei-c.org/ns/1.0"
item-who="#United_Nations">
We the Peoples of the United Nations determined to save succeeding
generations from the scourge of war, which twice in our lifetime has
brought
untold sorrow to mankind...
/blockquote

I'm confused by your examples. What extractable statement are you trying
to
markup with microdata here? Is it: the United Nations is in New York?

That was one part, but I was mostly focusing on the quotation indicating
that it was by the United Nations (which is an organization in New York). It
is using a special attribute (in this case "item-who") rather than defining
a (hidden) property-value child element (with itemprop="who").

So the extractable data is: the United Nations is the source of the
quotation 'We the Peoples of the United Nations determined to save
succeeding generations from the scourge of war, which twice in our
lifetime has brought untold sorrow to mankind...'?

Microdata (like microformats) is supposed to encourage visible data rather
than hidden metadata. Both your markup examples seem to express
authorship of the quotation through hidden metadata.
That's kind of my purpose though. Sometimes, one does not wish to embed 
the text itself, but one still wishes the data encoded so it can be 
retrieved by other means. Why should extensible semantics be restricted 
to visible information?


Brett



Re: [whatwg] Extensible microdata attributes

2011-04-26 Thread Brett Zamir

 On 4/26/2011 9:55 PM, Benjamin Hawkes-Lewis wrote:

On Tue, Apr 26, 2011 at 2:32 PM, Brett Zamirbret...@yahoo.com  wrote:

That's kind of my purpose though. Sometimes, one does not wish to embed the
text itself, but one still wishes the data encoded so it can be retrieved by
other means. Why should extensible semantics be restricted to visible
information?

http://microformats.org/wiki/principles

http://tantek.com/log/2005/06.html#d03t2359

Thanks for the references. While this may be relevant for the likes of 
blogs and other documents whose requirements for semantic density are 
limited enough to allow such reshaping for practical effect, and whose 
content is reshapeable by the content creator (as opposed to 
republishing of already completed books), for more semantically dense 
content, such as the types of classical documents marked up by TEI, it 
is simply not possible to expose text for each bit of semantic 
information or to generate new text to meet that need. And of course, 
even with microformats/microdata as it is now, the semantic content 
itself is not necessarily exposed just because text is visible on the page.


The issue of discoverability is I think more related to how it will be 
consumed or may be consumed. And even if some pieces of information are 
less discoverable, it does not mean that they have no value. For such 
rich documents, a lot of attention is being paid to these texts since 
they are themselves considered important enough to be worth the time.


If the Declaration of Independence of the United States was marked up 
with hidden information about prior emendations, their likely reasons, 
etc., or about suspected authors of particular passages, or the United 
Nations Declaration of Human Rights were marked up to indicate which 
countries have expressed reservations (qualifications) about which 
rights, while a browsing application or query tool ought to be able 
(optionally) expose this hidden information, there is no automatic need 
for the markup to be polluted with extra (hidden) (and especially 
URI-based or other non-textual) tags when an attribute would suffice.


For things that are truly important, there may be a great deal of care 
put into building up many layers which are meant to be peeled away, and 
it is worth allowing some of that information (particularly the 
non-textual information, e.g., the conditions of authorship, publisher, 
etc.), especially which the original publication did not expose, to be 
still selectively revealed to queries and deeper browsing.


If a site like Wikisource (the online library sister project of 
Wikipedia's) would be able to offer such officially sanctioned semantic 
attributes, classic texts could become enhanced in this way over time, 
with the wiki exposing the hidden semantic information, which indeed may 
not be as important as the visible text; but with queries by interested 
users, any problems in encoding could be discovered just as well.


While I know most hip web authors and developers are minimalists, can't 
we all just get along? Can't those of us interested in such richness, 
and with a view to progressively enhancing documents into the far 
future, also be welcomed into the web?


Brett



Re: [whatwg] Blacklist for regsiterProtocolHandler()

2011-04-20 Thread Brett Zamir

 On 4/20/2011 2:11 AM, Lachlan Hunt wrote:

On 2011-04-19 19:33, Ian Hickson wrote:

On Tue, 12 Apr 2011, Lachlan Hunt wrote:


We are investigating registerProtocolHandler and have been discussing
the need for a blacklist of protocols to forbid.

[...]

We'd like to know if we've missed any important schemes that must be
blocked, and we think it might be useful if the spec listed most of
those, except for the vendor specific schemes, which should probably be
left up to each vendor to worry about.


I haven't updated the spec yet, but it strikes me that maybe what we
should do instead is have a whitelist of protocols we definitely want to
allow (e.g. mailto:), and define a common prefix for protocols that are
used with this feature, in a similar way to how with XHR we've added
Sec-* as a list of headers _not_ to support.


Other protocols we should probably also whitelist:

irc, sms, smsto, tel.

I'm also curious how we could handle ISBN URNs, like:

  urn:isbn:0-395-36341-1

That would be useful to have a web service that could look up the ISBN 
and direct users to information about the book, or to an online store.


As currently specified, services have to register a handler for "urn", 
even if they only handle ISBN URNs.  The other alternative would be to 
mint a new "web+isbn" scheme, which seems suboptimal.
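For illustration, a handler page registered for urn: could filter for ISBN URNs itself and decline anything else. The lookup-service URL and the %s template convention below are invented for this sketch; only the urn:isbn: syntax comes from the thread:

```javascript
// Hypothetical sketch: pick out ISBN URNs and turn them into a
// book-lookup URL. The service URL and "%s" template are invented.
function isbnUrnToLookupUrl(urn, template) {
  const m = /^urn:isbn:([\dXx-]+)$/.exec(urn);
  if (!m) return null;                  // not an ISBN URN: decline it
  const isbn = m[1].replace(/-/g, '');  // normalize: strip hyphens
  return template.replace('%s', encodeURIComponent(isbn));
}

console.log(isbnUrnToLookupUrl('urn:isbn:0-395-36341-1',
  'https://books.example.com/lookup?isbn=%s'));
// https://books.example.com/lookup?isbn=0395363411
```

The null return is the interesting part: it is exactly the "I don't handle this URN" signal that the current registerProtocolHandler() design gives a page no way to express.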


Maybe registerProtocolHandler() could take a function as an extra 
argument to let the application determine whether it wishes to handle 
the protocol event, internally using e.preventDefault(), 
e.stopPropagation(), or something similar to indicate that it has 
successfully handled the case, and pass the buck to let other protocol 
handlers be checked in order of user preference otherwise.
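A sketch of the dispatch semantics being suggested here. No such API exists; the event object, the preventDefault() claim signal, and the fall-through order are all invented per the proposal above:

```javascript
// Proposed (hypothetical) semantics: registered handlers are tried in
// user-preference order; a handler claims the navigation by calling
// e.preventDefault(), otherwise the next handler is tried.
function dispatchProtocol(url, handlers) {
  for (const handler of handlers) {
    let claimed = false;
    const event = { url, preventDefault() { claimed = true; } };
    handler(event);
    if (claimed) return handler.name || 'anonymous';
  }
  return null; // nobody claimed it; the browser would fall back
}

const handlers = [
  function isbnOnly(e) { if (e.url.startsWith('urn:isbn:')) e.preventDefault(); },
  function anyUrn(e)   { if (e.url.startsWith('urn:'))      e.preventDefault(); },
];
console.log(dispatchProtocol('urn:isbn:0-395-36341-1', handlers)); // isbnOnly
console.log(dispatchProtocol('urn:ietf:rfc:2648', handlers));      // anyUrn
```

This would let a site register for the broad "urn" scheme while only actually consuming the URNs it understands, passing the rest down the user's preference list.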



So e.g. we could whitelist any protocol starting with web+ and then
register that as a common extension point for people inventing protocols
for use with this feature, so that people writing OS-native apps would
know that if they used a protocol with that prefix it's something 
that any

web site could try to take over.


That seems reasonable.



Now that it seems there is momentum on resolving the URN and custom 
(pseudo-)namespacing issue (I think x- might be nice to continue the 
tradition, though web seems fine also if real namespaces will not be 
used), can we please put back on the table the ideas of:


1) adding to <a> an attribute "uris" (for trying alternatives first, 
with greater precedence than href)
2) adding to <a> an attribute "fallbackURIs" (for lesser precedence 
than href, e.g., so a browser might expose these URIs only when the 
link was right-clicked)
3) adding an event to listen for the user refusing or the browser not 
supporting a protocol (even if this can be done with try-catches).


...so that people can actually begin experimenting with 
registerProtocolHandler() rather than expecting content authors to make 
links which may lead to nowhere for some of their users?
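To make the proposed precedence concrete, here is a sketch. The attribute names (uris, fallbackURIs) come from the proposal above; the link-object shape, the example URIs, and the supported-scheme check are invented for illustration:

```javascript
// Proposed precedence (hypothetical attributes): candidates from "uris"
// are tried first, then href, then "fallbackURIs"; the first URI whose
// scheme the browser/user supports wins.
function pickLinkTarget(link, supportedSchemes) {
  const candidates = [...(link.uris || []), link.href, ...(link.fallbackURIs || [])];
  return candidates.find(uri =>
    uri && supportedSchemes.includes(uri.split(':', 1)[0])) || null;
}

const link = {
  uris: ['web+isbn:0395363411'],
  href: 'https://books.example.com/0395363411',
  fallbackURIs: ['ftp://mirror.example.com/0395363411'],
};
console.log(pickLinkTarget(link, ['https', 'ftp']));      // href wins
console.log(pickLinkTarget(link, ['web+isbn', 'https'])); // uris wins
```

The point of the ordering is that a content author can offer a protocol-handler link without stranding users whose browsers (or user choices) leave the scheme unhandled.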


Brett



Re: [whatwg] Cross-domain databases; was: File package protocol and manifest support?

2011-03-09 Thread Brett Zamir
With IndexedDB apparently gaining traction, I'd like to reiterate my 
proposal for cross-domain shared databases. Though I'll admit I'm not 
sure I understand all of the concerns raised earlier, I do want to offer 
my own rebuttals to these counter-points as I understand them (and 
subsequently highlight some additional advantages):


*Compared to using cross-domain shared workers*

While cross-domain shared workers might be more ideal for sites which 
wanted more granular control of how their shared data was accessed, and 
no doubt a positive idea for sites wishing to go to the trouble to set 
it up, my strong feeling is that making one's data shared should also be 
possible in as easy a manner as setting up the database itself. 
Requiring the developer to create their own API for others to access 
their database would no doubt A) prevent some sites which might be open 
to sharing their data from doing so, and B) Add barriers for those 
wishing to mash-up different sites' data, as it would require study of 
not only the site's unique data store, but also its means of allowing 
access to it.


If not from the start, ideally the API could eventually offer more 
fine-grained access, such as to request read-only privileges or the like 
from the user, but these features would ideally work for any site 
offering a shared database, with no need to learn a site-specific API.


*Compared to being able to copy a remote database*

Besides avoiding wasted space in duplicating another database, the 
alternative of being allowed to make a simple copy of a remote database, 
is inferior in that one would not be able to operate off of the same 
data, especially if changes continue to be made by the user within the 
application and for which they do not wish independent copies of data. 
If one site offers a note-taking application, there are strong benefits 
for portability as well as concurrent use of different interfaces if the 
same data can be manipulated by multiple applications (as in different 
word processors being able to access the same file). Moreover, there is 
also the assurance for an application that it is allowing its users to 
work with the latest data from their targeted site if the targeted site 
has its own means of ensuring automatic updates.



*Transactional concerns*

I would think transactional concerns could be addressed by building in 
transactions and/or good judgment by the sharing party about whether to 
share a database in the first place (if a third-party application could 
cause corruption). Surely there are no inherently insurmountable 
barriers here.

*Higher Privileges, Iterating Databases, and Search*

I would even hope, especially given the apparently similarly 
privilege-seeking FileSystem API,  for the ability to request permission 
from the user to access any local database, whether shared by the site 
author or not, if the idea is not shot down by those not willing to 
accept that good features may still come at the cost of security risk to 
some users who do not pay attention to dialogues (these users can also 
be duped to download unsafe executables, opt in for geolocation, etc., 
so I don't think the web should be stunted in its growth by 
over-coddling all of us just because some may abuse it). It would also 
be nice to be able to iterate through the available databases in order 
to allow generic database viewing programs to be built as web apps, as 
well as perhaps eventually see a Full Text Search API for searching 
through all of one's data (and files).


The WHATWG cannot anticipate and make its own API for every possible 
shared data use case (even if some general uses like calendar access 
indeed call for their own API) nor expect that everyone willing to share 
data will wish to go through numerous hoops to do so. I think the full 
power of Shared Data and Open Data remains to be seen if the user were 
able to download data once, while having the ability to query that data 
as they wish from any number of their own chosen applications, and if 
shared content providers could be free to lighten their server loads and 
alleviate security concerns for enabling unrestricted searches.


Thanks,
Brett

(The email below is repeated for convenience, but due to my loss of the 
original email, obtained from 
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020299.html or 
as a better organized thread: 
http://www.mail-archive.com/whatwg@lists.whatwg.org/msg15393.html )

---
On Wed, 20 May 2009, Brett Zamir wrote:

 I would like to suggest an incremental though I believe significant
 enhancement to Offline applications/SQLite.

 That is, the ability to share a complete database among offline
 applications according to the URL from which it was made available.
 [...]

On Tue, 19 May 2009, Drew Wilson wrote:

 One alternate approach to providing this support would be via shared,
 cross-domain workers (yes, workers are my

Re: [whatwg] Ongoing work on an editing commands (execCommand()) specification

2011-03-03 Thread Brett Zamir

On 3/4/2011 3:53 AM, Aryeh Gregor wrote:

On Wed, Mar 2, 2011 at 8:27 PM, Brett Zamirbret...@yahoo.com  wrote:

In any case, spans with inline styles are much less likely to conflict with
other styling

I'm not sure what you mean by this.  Functionally, in what way is
<span style="font-style: italic"> less likely to conflict with other
styling than <i>, for instance?  Possibly some authors do things like
i { color: green }, but they might do the same for span as well --
either seems equally unlikely to me.

Since span is meant to be a generic container with no inherent meaning 
or formatting of its own (except being understood as being of inline 
display), formatting a span without reference to class would be a pretty 
broad stroke to paint, whereas styling an i (or em) would be 
reasonably targeted. The only problem is that the advantages for styling 
specificity which formatting tags offer are the same reasons why they 
will more likely conflict when placed in another document. I think the 
nicest solution would be span class=italic 
style=font-style:italic;, but I don't know that the need to export 
contents for external style usage is one worth complicating the markup 
even further for everyone using it.



If one wishes to allow convenient export of the internally generated mark-up
for the sake of the user (e.g., to allow them to copy the markup for use on
their own site), it is nice for there to be choice at least between
non-formatting-committal (semantic) markup and non-semantically-committal
(formatted) mark-up

The commands we have are clearly presentational.  E.g., we have an
"italic" command, and don't have "emphasize" or "citation" or
"variable" commands.  If there were demand for it, semantic commands
could be added, but currently I'm assuming the use-case is only
presentational markup.  If someone would like to argue that there's
substantial author demand or implementer interest in adding semantic
commands, we could do that, but they'd have to be separate features,
and we'd still have to deal with the commands that are presentational
only.
Personally, I'm not sure why there needs to be the redundant API with 
insertHTML in the first place, if insertHTML can be standardized (maybe 
with a removeHTML to target specific inserted tags?), so I don't see a 
need for adding the semantic ones on top of that.



On Thu, Mar 3, 2011 at 12:51 AM, Roland Steiner
rolandstei...@google.com  wrote:


Paraphrasing Julie's point
from our original exchange: you want to be consistent to make it easy to
hand-edit the resultant HTML (by the user, say), or for server-side
post-processing. For these purposes pure markup is often easier. OTOH, in
general, styles are more powerful.

I get the hand-editing argument.  <b> is much nicer to hand-edit than
<span style="font-weight: bold">, and also fewer bytes.  But why would
anyone want <span style="font-weight: bold">?  Styles are more
powerful in general, but that's not a reason to prefer them in the
specific cases where they aren't more powerful.


In relation to <strong>, the reason is simply to avoid committing that 
the usage of bold is for "strong" text. What is the practical use of 
using <strong/>? Besides possible clarity in code, for richly semantic 
documents, akin to the detailed level of markup present in TEI documents 
used for marking up classical texts and the like, a browser user (or 
programmer) can shape XQueries (or jQueries)--directly or by a 
user-friendly wrapping GUI--which selectively extract that information.


If one wishes to find all passages in a work of Shakespeare where there 
was indication in the original that there should be strong emphasis, one 
can make a single elegant (and semantic) query which grabs that 
information, while silently ignoring the <span 
style="font-weight:bold;"> (or <b>) tags which were instead used to 
provide styling of, say, unmeaningful but bolded text in the imprint 
or wherever. (The reason to use <span> over <b>, on the other hand, is 
simply to avoid the tag being at least semantically unamenable to 
alteration by the designer, given its always being assumed to be tied to 
boldness by its use of <b>.)


Granted, a user or application might actually wish to search for text of 
a particular formatting rather than a particular meaning, but if they 
are not distinguished, only the former is possible.


Brett



Re: [whatwg] Ongoing work on an editing commands (execCommand()) specification

2011-03-02 Thread Brett Zamir

On 3/3/2011 3:18 AM, Aryeh Gregor wrote:

On Tue, Mar 1, 2011 at 5:11 PM, Ryosuke Niwarn...@webkit.org  wrote:

Styling a Range doesn't support styleWithCSS=false

I saw this feature in Mozilla's docs, but I don't really get it.  What
use-cases does it have?  Why do we need to support both ways of doing
things if they create the same visible effect?


Maybe the use of non-CSS mode was for backward-compatibility with 
earlier versions or for easier overriding of styling in the target 
document (e.g., b {color:red;}).


Using CSS might have been added by Mozilla since the predefined commands 
are formatting-specified (e.g., it is a "bold" command, not a "strong" 
one), so in the absence of a semantically-based API, it is technically 
more accurate for the resulting code to be non-committal about semantic 
meaning (there is always insertHTML if you need semantically-accurate 
(WYSIWYM) mark-up internally). (Granted, in Mozilla, the non-CSS version 
is also non-committal in producing <b> rather than <strong>, but perhaps 
it was seen at the time that such mark-up was out of favor and expected 
to be soon out the door?)


In any case, spans with inline styles are much less likely to conflict 
with other styling, but on the other hand, they do not provide granular 
control, unless perhaps classes were also to be added to these spans to 
indicate the formatting (<span class="bold">)--one case where it might 
be actually reasonable to use non-semantic class names--and allow CSS in 
the document to target these classes.


If one wishes to allow convenient export of the internally generated 
mark-up for the sake of the user (e.g., to allow them to copy the markup 
for use on their own site), it is nice for there to be choice at least 
between non-formatting-committal (semantic) markup and 
non-semantically-committal (formatted) mark-up, although I could 
understand if people wanted to force the more rare WYSIWYM editor to use 
insertHTML to handle such cases.


Brett



Re: [whatwg] Thoughts on recent WhatWG blog post

2011-02-07 Thread Brett Zamir

On 2/8/2011 1:33 AM, Adam van den Hoven wrote:

Hey guys,

I was reading the blog today and something I read (
http://blog.whatwg.org/whatwg-extensibility) prompted me to signup to the
list and get involved again. What follows is not exactly well thought out
but I'm hoping that it will spark something.

window.atob() and window.btoa() feel wrong, so does
window.crypto.getRandomUint8Array(length), not because they're not useful
but because there is no answer to 'what does converting binary data to a
base 64 string have to do with window?' beyond 'nothing but where else would
it go?'.

In reality all these belong to javascript, but modifying JS to include them
is not feasible. Also, what if I don't want to do crypto in my code, why
should I have all that lying around (not that its a hardship but why pollute
the global namespace)?

I think that we would all be better served if we were to introduce something
like CommonJS to the browser, or perhaps a modified version of the require()
method. This would allow us to move the crytpo and the atob/btoa into
specific modules which could be loaded using:

<script>
var my_crypto = window.require( 'html:crypto' ); // or however you wanted to
// identify the 'crypto library defined by html'
var my_util = window.require( 'html:util' ); // my_util.atob();
var $ = window.require(
'https://ajax.googleapis.com/ajax/libs/jquery/2.0.0/jquery.min.js' );
</script>
I really like this approach of using require() for namespacing. Besides 
being semantically more accurate, the approach looks to the future as 
well, in that it does not burden the user with needing to avoid names 
which could end up being co-opted by HTML in the future. (Also, just 
because the WHATWG may verify names are fairly unique or unique on the 
public web does not mean it is catching HTML uses off the web, for 
which there could be conflicts; abdicating responsibility for the 
effects of such apps just because they are not within the scope of the 
spec would, I think, be rather shoddy stewardship of a language relied 
on for many purposes.)


Also, whatever the caching behavior, I like the idea of having a simple 
way of importing remote scripts dynamically, without having to define 
some wrapper for XMLHttpRequest ugliness each time one wishes to use 
such a basic feature, such as to ensure one's application stays modular 
(particularly for reusable libraries where the library does not wish to 
burden the user with needing to specify multiple script tags).


The final and even main reason I like and had wanted to suggest this 
approach is because it is neatly compatible with another idea I think 
the web (and off-web) needs: a formal way to import proprietary objects 
(e.g., such as specific browser vendor(s) or browser add-ons might make 
available), with the potential for fallback behavior, with require() 
throwing distinct exceptions such as NotSupported (the default for 
unrecognized modules) or UserRefused (for privileged features a 
browser or add-on might wish to make available from the web but which a 
user specifically declined).


Maybe a second argument to require() could be specified to allow for an 
asynchronous callback (where one did not wish to keep checking for the 
setting of a particular property or force the user to wait).
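A rough sketch of the shape this proposal could take, under the assumption of a module registry plus the two exception types suggested above. Nothing like this was ever standardized: require_, the registry contents, and the "refused" set are all invented for illustration (Buffer stands in for a native atob so the sketch runs under Node):

```javascript
// Hypothetical sketch of the proposed require(): a module registry,
// distinct errors for unknown vs. user-declined modules, and an
// optional callback for asynchronous loading.
class NotSupported extends Error {}
class UserRefused  extends Error {}

const registry = {
  'html:util': { atob: s => Buffer.from(s, 'base64').toString('binary') }
};
const refused = new Set(['html:crypto']); // pretend the user declined this

function require_(name, callback) {
  const load = () => {
    if (refused.has(name)) throw new UserRefused(name);
    if (!(name in registry)) throw new NotSupported(name);
    return registry[name];
  };
  if (callback) { // async form: errors delivered to the callback
    setTimeout(() => {
      try { callback(null, load()); } catch (e) { callback(e); }
    }, 0);
    return;
  }
  return load(); // sync form: throws
}

const my_util = require_('html:util');
console.log(my_util.atob('aGk=')); // hi
```

Distinguishing NotSupported from UserRefused is the key point: a page can fall back gracefully on an unimplemented module, while treating a user's refusal of a privileged one as a deliberate answer.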


Brett



Re: [whatwg] New method for obtaining a CSS property

2011-01-28 Thread Brett Zamir

On 1/28/2011 3:15 PM, Boris Zbarsky wrote:

On 1/28/11 1:22 AM, Brett Zamir wrote:

My point is that a selector can be tied to a property through the
ruleset.


No, not really.  Something that _matches_ selectors can be tied to a 
property via seeing which selectors it matches and then considering 
the resulting declaration lists.


Since I'm speaking more or less about a literal match, this would be 
basically the same as you are saying. In any case, I think you get my point.

I recognize there may be more than one declaration even with
the same property being associated with the same selector, but I'm
suggesting to define some rules for selecting the most logical match.


So rules for matching selectors to selectors, right? 

Yes.
Defining these could really get pretty complex, unless you're 
suggesting that it just be a string compare of the serializations or 
something.

Yes, I am suggesting the latter.


You can do that right now using getComputedStyle, with a bit more
code, right?


Yes, or by iterating through document.stylesheets.


Um... why would you do that?



Here's the way I've been doing it for my own code; remember all I want 
is the text of the property value associated with an exact selector 
match. With this function, I don't need to worry about context--just get 
the match I want (treating the passed selectorText argument like a 
variable name of an object and treating the passed propertyName as a 
property of that object).


function getCSSPropertyValue (selectorText, propertyName) {
    function _getPropertyFromStyleSheet (ss, selectorText, propertyName) {
        var rules = ss.cssRules ? ss.cssRules : ss.rules;
        for (var j = 0, crl = rules.length; j < crl; j++) {
            var rule = rules[j];
            try {
                if (rule.type === CSSRule.STYLE_RULE &&
                        rule.selectorText === selectorText) {
                    return rule.style.getPropertyValue(propertyName);
                }
            }
            catch (err) { /* IE */
                if (rule.selectorText === selectorText) {
                    propertyName = propertyName.replace(/-([a-z])/g,
                        function (str, n1) {
                            return n1.toUpperCase();
                        });
                    return rule.style[propertyName];
                }
            }
        }
        return false;
    }
    for (var i = 0, value, dsl = document.styleSheets.length; i < dsl; i++) {
        var ss = document.styleSheets[i];
        value = _getPropertyFromStyleSheet(ss, selectorText, propertyName);
        if (value) {
            return value;
        }
    }
    return false;
}


But as Ashley pointed out, it is needlessly complex to create one's 
own pseudo document


Why would you need to create a pseudo document?


Since my purpose is only to get the property value for an exact selector 
match, I'm not interested in getting a different match if a particular 
element, say, matches "E > F.class" rather than just "F.class". A user in 
such a use case does not care about, and probably doesn't want to face, 
ambiguities raised by context.


for this purpose, and I think it should be a simple operation to be 
able to

do something as fundamental as following best practices.


Ideally, yes, but setting styles directly from script (as opposed to 
setting classes that are then styled by the stylesheet) is not exactly 
best practices, unless we're looking at different best practices lists.


Sometimes it is not possible to do this, which is the reason for this 
suggestion (even if CSS transitions could reduce the need for this 
somewhat):


var element = document.getElementById('start-transition'),
successColor = getCSSPropertyValue('.transition-success', 
'background-color'),
failureColor = getCSSPropertyValue('.transition-failure', 
'background-color');

indicateSuccessOrFail(element, successColor, failureColor);

function doFunkyTransition (element, beginColor, endColor) {
    // Based on the RGB values of beginColor and endColor, incrementally
    // set the color style property of the element to the intermediate color
    // in whatever manner one wishes; more advanced cases could
    // be pulsating between colors, etc.
    // We can't practically devise classes for each of the many
    // intermediate steps of our custom transition
}

function indicateSuccessOrFail (element, successColor, failureColor) {
var beginColor = element.style.backgroundColor;
var ajaxSuccessCallback = function () {
doFunkyTransition(element, beginColor, successColor);
};
var ajaxFailCallback = function () {
doFunkyTransition(element, beginColor, failureColor);
};
someAjaxRequestFunction(ajaxSuccessCallback, ajaxFailCallback);
}
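The intermediate colors that doFunkyTransition hand-waves over can be computed by linear interpolation between the two RGB endpoints. A minimal sketch (the helper name and the [r, g, b] array representation are my own, not part of the proposal):

```javascript
// Linearly interpolate between two [r, g, b] colors; t runs from 0 to 1.
function lerpColor(begin, end, t) {
    return begin.map(function (c, i) {
        return Math.round(c + (end[i] - c) * t);
    });
}

// Eleven steps from red to green, e.g. for a timer-driven transition
var steps = [];
for (var i = 0; i <= 10; i++) {
    steps.push(lerpColor([255, 0, 0], [0, 128, 0], i / 10));
}
console.log(steps[0]);  // [255, 0, 0]
console.log(steps[10]); // [0, 128, 0]
```

Each intermediate array could then be written back as an rgb() string on the element's style at each timer tick.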



Or, for canvas specifically. You draw an animated Hello and want the
designer to be able to choose the fill color. You want to be able to
query the stylesheet easily to get the styling info.


Or just set a class on your canvas and let styles apply

Re: [whatwg] New method for obtaining a CSS property

2011-01-28 Thread Brett Zamir

On 1/28/2011 3:33 PM, Ryosuke Niwa wrote:

On Wed, Jan 26, 2011 at 10:23 PM, Brett Zamirbret...@yahoo.com  wrote:


I'll give a more concrete example, but I did state the problem: separation
of concerns, and the data I want, getting a CSS property for a given
selector.

For example, we want the designer guy to be able to freely change the
colors in the stylesheet for indicating say a successful transition (green),
an error (red), or waiting (yellow) for an Ajax request. The JavaScript girl
does not want to have to change her code every time the designer has a new
whim about making the error color a darker or lighter red, and the designer
is afraid of getting bawled out for altering her code improperly. So the
JavaScript girl queries the .error class for the background-color
property to get whatever the current error color is and then indicates to an
animation script that darkred should be the final background color of the
button after the transition. The retrieval might look something like:

document.getCSSPropertyValue(".error", "background-color");  // 'darkred'

While the JavaScript would choose the intermediate RGB steps to get there
in the fashion desired by the developer.

Yes, there are CSS transitions, or maybe SVG, but this is for cases where
you want control tied to JavaScript.


It sounds like all you want to do is:

function getColor(className) {
    var dummy = document.createElement('div');
    dummy.className = className;
    document.body.appendChild(dummy);
    // Note: this reads the inline style only; getComputedStyle would be
    // needed to see values set by the stylesheet.
    var color = dummy.style.backgroundColor;
    document.body.removeChild(dummy);
    return color;
}
Yes, mostly that would work too, but again with some greater 
ambiguity if someone, say, styled "body > .myClass" as opposed to ".myClass".


Or, to piggy-back on what Boris suggested:

function getColor(className) {
  var dummy = document.createElement('div');
  dummy.className = className;
  // The element generally must be in the document for getComputedStyle
  // to return non-empty values:
  document.body.appendChild(dummy);
  var color = getComputedStyle(dummy, null).getPropertyValue('background-color');
  document.body.removeChild(dummy);
  return color;
}

While this is not a frequent use case, I'll admit, this need to ensure 
separation of concerns is, in my view, something fundamental enough to 
call for a built-in method, despite the fact that there are a number of 
ways this can be done.


Brett



Re: [whatwg] New method for obtaining a CSS property

2011-01-27 Thread Brett Zamir

On 1/27/2011 3:59 PM, Tab Atkins Jr. wrote:

On Wed, Jan 26, 2011 at 10:23 PM, Brett Zamirbret...@yahoo.com  wrote:

For example, we want the designer guy to be able to freely change the colors
in the stylesheet for indicating say a successful transition (green), an
error (red), or waiting (yellow) for an Ajax request. The JavaScript girl
does not want to have to change her code every time the designer has a new
whim about making the error color a darker or lighter red, and the designer
is afraid of getting bawled out for altering her code improperly. So the
JavaScript girl queries the .error class for the background-color
property to get whatever the current error color is and then indicates to an
animation script that darkred should be the final background color of the
button after the transition. The retrieval might look something like:

document.getCSSPropertyValue(".error", "background-color"); // 'darkred'

While the JavaScript would choose the intermediate RGB steps to get there in
the fashion desired by the developer.

Yes, there are CSS transitions, or maybe SVG, but this is for cases where
you want control tied to JavaScript.

Or, for canvas specifically. You draw an animated Hello and want the
designer to be able to choose the fill color. You want to be able to query
the stylesheet easily to get the styling info.

Given a selector, multiple declaration blocks could match, and within
a single declaration block, multiple copies of the property can match.
  You could get any number of values back from a call like that, and
there's no guarantee that the value you get is worth anything, because
a property from some other declaration block with a different selector
could be winning instead.
I was thinking of it grabbing the winning property for the whole 
document, i.e., the one which would be applicable without knowing more 
contextual information. So, if the selector specified were ".error", it 
wouldn't get "div.error", but it could be defined for example to get the 
last ".error" and property (or perhaps giving precedence to !important). 
Whatever the rules, it would be the team's responsibility to ensure it 
was unique enough to provide the right value (as it is within CSS itself 
via cascading).
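The precedence being described (exact literal selector match, last one wins, with !important optionally taking priority) could be expressed over a flattened rule list as below. The function name and the rule-object shape are my own invention for illustration, not part of any proposal:

```javascript
// rules: array of { selectorText, property, value, important } objects
// in document order. Returns the value of the last exact selector match,
// with !important declarations trumping non-important ones.
function resolveLiteralMatch(rules, selectorText, property) {
    var winner = null;
    rules.forEach(function (r) {
        if (r.selectorText !== selectorText || r.property !== property) {
            return; // not an exact literal match; ignore context entirely
        }
        // Replace the current winner unless it is !important and the
        // contender is not.
        if (!winner || r.important || !winner.important) {
            winner = r;
        }
    });
    return winner ? winner.value : null;
}

var sample = [
    { selectorText: '.error', property: 'background-color', value: 'red', important: false },
    { selectorText: 'div.error', property: 'background-color', value: 'pink', important: false },
    { selectorText: '.error', property: 'background-color', value: 'darkred', important: false }
];
console.log(resolveLiteralMatch(sample, '.error', 'background-color')); // 'darkred'
```

Note how 'div.error' is skipped entirely: the lookup is a string comparison, not selector matching.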

It seems that your problem is that you want a way to declare a certain
value to be reused in both CSS and javascript, so that you can change
it in either realm without having to care about it in the other.

This can be solved cleanly and easily with CSS Variables.  The CSS
Working Group has tried several times to agree on how such a thing
should work, so far without success, but Chrome is trying again now.
Check out my blog post on the subject:
http://www.xanthir.com/blog/b49w0
I was only proposing a read-only system, but CSS Variables as you 
describe them are indeed an excellent, attractively comprehensive, and clean 
solution (and also address canvas requirements more smoothly, since 
variables could be defined with values which did not even map to a real 
CSS property, such as shadow color, as long as variables were made 
available to JavaScript whether or not they were actually used in the CSS).


Still, also having the ability to easily get the value which wins out 
would have the one advantage of working without needing to change 
existing code to use variables.


thanks,
Brett



Re: [whatwg] New method for obtaining a CSS property

2011-01-27 Thread Brett Zamir

On 1/28/2011 2:19 AM, Boris Zbarsky wrote:

On 1/27/11 1:23 AM, Brett Zamir wrote:

I'll give a more concrete example, but I did state the problem:
separation of concerns, and the data I want, getting a CSS property for
a given selector.


selectors don't have properties.   Elements have properties, and 
declarations have properties.  selectors are a mechanism for tying 
rulesets to declarations.


My point is that a selector can be tied to a property through the 
ruleset. I recognize there may be more than one declaration even with 
the same property being associated with the same selector, but I'm 
suggesting to define some rules for selecting the most logical match. 
Even if it cannot match the same behavior for contextual style 
determination (which as Ashley pointed out follows its own cascading 
rules which do not apply here), interpreting it predictably (e.g., 
either the very first or very last exact match if there are other exact 
duplicate selector references) and as literally as possible (e.g., 
".error" would not match ".error.error" nor "div.error") should address 
the use cases I am mentioning.



For example, we want the designer guy to be able to freely change the
colors in the stylesheet for indicating say a successful transition
(green), an error (red), or waiting (yellow) for an Ajax request. The
JavaScript girl does not want to have to change her code every time the
designer has a new whim about making the error color a darker or lighter
red, and the designer is afraid of getting bawled out for altering her
code improperly. So the JavaScript girl queries the .error class for
the background-color property to get whatever the current error color
is and then indicates to an animation script that darkred should be
the final background color of the button after the transition. The
retrieval might look something like:

document.getCSSPropertyValue(".error", "background-color"); // 'darkred'


You can do that right now using getComputedStyle, with a bit more 
code, right?


Yes, or by iterating through document.stylesheets. But as Ashley pointed 
out, it is needlessly complex to create one's own pseudo document for 
this purpose, and I think it should be a simple operation to be able to 
do something as fundamental as following best practices. But the CSS 
Variables Tab suggested would work pretty well to meet this need as 
well, though it would require altering existing CSS to be parameterized.



Or, for canvas specifically. You draw an animated Hello and want the
designer to be able to choose the fill color. You want to be able to
query the stylesheet easily to get the styling info.


Or just set a class on your canvas and let styles apply to it as normal?


Maybe you are thinking of SVG here?

One can't do something like this with canvas:
<canvas style="fillStyle:rgb(200,0,0)"></canvas>

...and even if one could, it would not be targeted to the specific shapes 
needing styling.


thanks,
Brett



Re: [whatwg] New method for obtaining a CSS property

2011-01-27 Thread Brett Zamir

On 1/28/2011 2:22 AM, Boris Zbarsky wrote:

On 1/27/11 4:48 AM, Brett Zamir wrote:

I was thinking of it grabbing the winning property for the whole
document, i.e., the one which would be applicable without knowing more
contextual information. So, if the selector specified were .error, it
wouldn't get div.error


That's pretty difficult to define, actually.  Should it get 
.error.error?


As mentioned in my other response just now, no, I don't think it should. 
The idea is to be as literal as possible, following a predictable path 
(e.g., the very first or maybe very last reference). It couldn't and 
wouldn't need to follow CSS cascading behavior, since the purpose here 
is different. It would generally be expected that there would only be 
one match since the designer and developer would coordinate to find a 
unique match.


As with regular contextual CSS, one needs to be careful not to overwrite 
one's own rules, but this use case is treating selector-property 
associations as (read-only) variables rather than contextually resolved 
style rules.


Still, it could also be made to follow cascading behavior (e.g., 
!important getting higher precedence). As long as it was predictable, 
the point is for there to be a clear way to figure it out.



Whatever the rules, it would be the team's responsibility to ensure it
was unique enough to provide the right value (as it is within CSS itself
via cascading).


Why is just asking for computed style, and getting correct answers 
that include the results of the cascade, not the right solution here?
Sure, one could pick an arbitrary element, and hope it didn't have its 
own style attribute, or again, make a pseudo document, but this does not 
seem as satisfying as just querying the stylesheets, as one can do--but 
only by an extra utility function--in iterating through 
document.stylesheets. Best practice should not require hacks, in my 
opinion. One should be able to just tell people, if you want to grab a 
style property value defined in a stylesheet dynamically in JavaScript, 
use this simple method to do so.


thanks,
Brett



[whatwg] New method for obtaining a CSS property

2011-01-26 Thread Brett Zamir
While it can already be done dynamically by iterating through 
document.stylesheets, in order to allow full separation of presentation 
from content in a convenient way and foster best practices, I think it 
would be helpful to have a built-in method to obtain a CSS property for 
a given selector.


This is particularly needed by canvas which has no inherent way of 
obtaining styles set in a stylesheet, but would also be useful for HTML 
and SVG where setting a class dynamically may not always be sufficient, 
such as in a dynamically powered transition which wants to obtain 
starting and ending style properties.


(This issue might also highlight some gaps in CSS, with, for example, 
there being no exact equivalent for say shadow color as used in canvas, 
but even without such changes to CSS, the proposed method would still be 
useful.)


Brett



Re: [whatwg] New method for obtaining a CSS property

2011-01-26 Thread Brett Zamir

On 1/27/2011 11:21 AM, Tab Atkins Jr. wrote:

On Wed, Jan 26, 2011 at 6:47 PM, Brett Zamirbret...@yahoo.com  wrote:

While it can already be done dynamically by iterating through
document.stylesheets, in order to allow full separation of presentation from
content in a convenient way and foster best practices, I think it would be
helpful to have a built-in method to obtain a CSS property for a given
selector.

This is particularly needed by canvas which has no inherent way of obtaining
styles set in a stylesheet, but would also be useful for HTML and SVG where
setting a class dynamically may not always be sufficient, such as in a
dynamically powered transition which wants to obtain starting and ending
style properties.

(This issue might also highlight some gaps in CSS, with, for example, there
being no exact equivalent for say shadow color as used in canvas, but even
without such changes to CSS, the proposed method would still be useful.)

What are you actually trying to accomplish?  I can't tell from this
email what data you're trying to get, precisely, and nowhere do you
state what problem you're trying to solve.


I'll give a more concrete example, but I did state the problem: 
separation of concerns, and the data I want, getting a CSS property for 
a given selector.


For example, we want the designer guy to be able to freely change the 
colors in the stylesheet for indicating say a successful transition 
(green), an error (red), or waiting (yellow) for an Ajax request. The 
JavaScript girl does not want to have to change her code every time the 
designer has a new whim about making the error color a darker or lighter 
red, and the designer is afraid of getting bawled out for altering her 
code improperly. So the JavaScript girl queries the .error class for 
the background-color property to get whatever the current error color 
is and then indicates to an animation script that darkred should be 
the final background color of the button after the transition. The 
retrieval might look something like:


document.getCSSPropertyValue(".error", "background-color");  // 'darkred'

While the JavaScript would choose the intermediate RGB steps to get 
there in the fashion desired by the developer.


Yes, there are CSS transitions, or maybe SVG, but this is for cases 
where you want control tied to JavaScript.


Or, for canvas specifically. You draw an animated Hello and want the 
designer to be able to choose the fill color. You want to be able to 
query the stylesheet easily to get the styling info.


Brett



Re: [whatwg] Syntax highlighting language attribute for editable elements

2010-12-27 Thread Brett Zamir

On 6/13/2010 10:45 PM, Bjartur Thorlacius wrote:

While using @lang for this purpose sounds good in theory, it will simply
overload it with information it wasn't really designed for. Something
like @type="application/perl" or some such might work better. That also
has the benefit that we don't need to build a new list of names of
programming languages (and take care of languages with similar/same
names, such as Go vs Go!).

(Sorry for the very delayed reply)

I like the @type idea, and it can be extensible too via application/x-*.

I think <code> could benefit from this approach too, in order to keep 
them in harmony.


Brett



On 6/13/10, Ashley Sheridana...@ashleysheridan.co.uk  wrote:

On Sun, 2010-06-13 at 13:57 +0800, Brett Zamir wrote:


Has thought been given to allowing textarea, input and/or contenteditable
elements to use an attribute (maybe like <code> does with
class="language-XX") so that user agents might be able to display the
editable text with syntax highlighting automatically?

This should not adversely affect users who do not have such browser
support, nor does it put pressure on browsers to implement immediately
(add-ons might take care of such a role). But having a convention in
place (even if languages are not predefined) would ensure that the
burden of implementing such support could be shifted away from the
developer if they are not so inclined.

I'd prefer to see a dedicated attribute (also on <code>) since the
language type does convey general-interest semantic information, but I
think it would also ideally be consistent (i.e., the same attribute to
be used in <code> as in <textarea>, etc.).

Maybe @lang/@xml:lang could be used for this purpose if its definition
could somehow be widened to recognize computer languages.

It would be nice, however, to also have some means of indicating that
the web author is providing their own styling of the element in the
event they wish to use their own editor.

thank you,
Brett Zamir


I think maybe not a class, as the class attribute already has a purpose
and is probably already used in a <code class="php"> type of capacity
by some sites showing code excerpts. I'd suggest maybe extending
the lang attribute, but it's also conceivable that a code snippet might
be in Perl and written with French comments, and the lang attribute
wasn't meant for multiple values like the class attribute is. Perhaps
the best solution is to use another new attribute altogether?

It is a good idea though, I think, as it does add a lot of semantic
meaning to the content.

Thanks,
Ash
http://www.ashleysheridan.co.uk



Re: [whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-26 Thread Brett Zamir

On 11/26/2010 6:18 PM, Julian Reschke wrote:

On 26.11.2010 05:20, Brett Zamir wrote:

I'd like to propose reserving two protocols for use with
navigator.registerProtocolHandler: urn and xri (or possibly xriNN
where NN is a version number).

See http://en.wikipedia.org/wiki/Extensible_Resource_Identifier for info
on XRI (basically allows the equivalents of URN but with a user-defined
namespace and without needing ICANN/IANA approval). Although it was


You don't need ICANN/IANA approval.

You can use informal URN namespaces, use a URN scheme that allows just 
grabbing a name (such as URN:UUID) *or* write a small spec; for the 
latter, the approval is *IETF* consensus (write an Internet Draft, 
then ask the IESG for publication as RFC).


My apologies for the lack of clarity on the approval process. I see all 
the protocols listed with them, so I wasn't clear.


In any case, I still see the need for both types being reserved (and for 
their subnamespaces targeted by the protocol handler), in that 
namespacing is built into the XRI unlike for informal URNs which could 
potentially conflict.


thanks,
Brett



Re: [whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-26 Thread Brett Zamir

On 11/26/2010 7:13 PM, Julian Reschke wrote:

On 26.11.2010 11:54, Brett Zamir wrote:

...
My apologies for the lack of clarity on the approval process. I see all
the protocols listed with them, so I wasn't clear.

In any case, I still see the need for both types being reserved (and for
their subnamespaces targeted by the protocol handler), in that
namespacing is built into the XRI unlike for informal URNs which could
potentially conflict.
...


I'm still not sure what you mean by reserve and what that would mean 
for the spec and for implementations.


I just mean that authors should not use already-registered protocols 
except as intended, thinking that they can use whatever protocol name 
they like (e.g., the "Urn Manufacturers Company" using "urn" for its 
categorization scheme).
I do agree that the current definition doesn't work well for the urn 
URI scheme, as, as you observed, semantics depend on the first 
component (the URN namespace). Do you have an example for an URN 
namespace you actually want a protocol handler for?



ISBNs.
Finally, I'd recommend not to open the XRI can-of-worms (see 
http://en.wikipedia.org/wiki/Talk:Extensible_Resource_Identifier).


Ok, looks like I misappropriated this based on an incomplete 
understanding. Still, at least the part about having a namespaced naming 
protocol makes sense to me. For example, if Wikipedia offered its own 
article names up for referencing using its own namespace to define which 
scheme was being used, but in a generic way so they could be 
dereferenced to articles on Britannica, Citizendium, etc., sites 
wouldn't need to be showing preference to only one encyclopedia when 
they added links (or at least give the choice to their users by using 
the attributes I proposed be added to <a>, such as alternateURIs for 
the fallbacks after href, or defaultURIs for the priority ones before 
href).


Brett



Re: [whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-26 Thread Brett Zamir

On 11/26/2010 11:59 PM, Julian Reschke wrote:

On 26.11.2010 16:55, Brett Zamir wrote:

On 11/26/2010 7:13 PM, Julian Reschke wrote:

On 26.11.2010 11:54, Brett Zamir wrote:

...
My apologies for the lack of clarity on the approval process. I see 
all

the protocols listed with them, so I wasn't clear.

In any case, I still see the need for both types being reserved 
(and for

their subnamespaces targeted by the protocol handler), in that
namespacing is built into the XRI unlike for informal URNs which could
potentially conflict.
...


I'm still not sure what you mean by reserve and what that would mean
for the spec and for implementations.


I just mean that authors should not use already registered protocols
except as intended, thinking that they can use any which protocol name
they like (e.g., the Urn Manufacturers Company using urn for its
categorization scheme).

I do agree that the current definition doesn't work well for the urn
URI scheme, as, as you observed, semantics depend on the first
component (the URN namespace). Do you have an example for an URN
namespace you actually want a protocol handler for?


ISBNs.


Oh, that's a good point. In particular, if the URN WG at some day 
makes progress with respect to retrieval.


So, would it be possible to write a generic protocolHandler for URN 
which itself delegates to more specific ones?


If a site were interested in handling all of the cases, I would think 
so, but I doubt that would happen. I doubt the neighborhood bookstore 
site is going to try to deal with XMPP URNs or whatever else, even if 
the spec called for some (bandwidth-wasting) response by the server to 
indicate it was abdicating responsibility.


The only optimal way I can really see this is if there were say a fourth 
argument added to registerProtocolHandler which allowed (or in the case 
of URNs or what I'll call XRN for now, required) specifying a 
namespace (perhaps also allowing URN subnamespace specificity via 
colons, e.g., ietf:rfc).
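A namespace-scoped registration of the kind proposed might behave as sketched below. The extra namespace argument, the registry, and the dispatch logic are all purely illustrative (registerProtocolHandler today takes no such argument), and the handler URL is a made-up example:

```javascript
// Hypothetical namespace-aware registry: a handler claims a scheme plus
// a namespace prefix (e.g. 'urn' + 'isbn'), and dispatch picks the
// handler whose namespace matches the identifier being opened.
var handlers = [];

function registerNamespacedHandler(scheme, namespace, url) {
    handlers.push({ scheme: scheme, namespace: namespace, url: url });
}

function dispatch(uri) {
    var parts = uri.split(':'); // e.g. ['urn', 'isbn', '0451450523']
    for (var i = 0; i < handlers.length; i++) {
        var h = handlers[i];
        if (h.scheme === parts[0] && h.namespace === parts[1]) {
            return h.url.replace('%s', encodeURIComponent(uri));
        }
    }
    return null; // no handler claimed this namespace
}

registerNamespacedHandler('urn', 'isbn', 'https://books.example.com/lookup?q=%s');
console.log(dispatch('urn:isbn:0451450523'));
// 'https://books.example.com/lookup?q=urn%3Aisbn%3A0451450523'
console.log(dispatch('urn:ietf:rfc:2141')); // null
```

This is what lets the neighborhood bookstore claim only urn:isbn while leaving urn:ietf and other namespaces to handlers that actually understand them.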


Brett



[whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-25 Thread Brett Zamir
I'd like to propose reserving two protocols for use with 
navigator.registerProtocolHandler: urn and xri (or possibly xriNN 
where NN is a version number).


See http://en.wikipedia.org/wiki/Extensible_Resource_Identifier for info 
on XRI (basically allows the equivalents of URN but with a user-defined 
namespace and without needing ICANN/IANA approval). Although it was 
rejected earlier, I don't see that there is any other way for sites to 
create their own categorization or other behavior mechanisms in a way 
which is well-namespaced, does not rely on waiting for official 
approval, and has the benefits of working with the HTML5 specification 
as already designed.


URN is something which I think also deserves to be reserved, if not all 
IANA protocols.


As I see it, the only way for a site to innovate safely in avoiding 
conflicts for non-IANA protocols is to use XRI (assuming especially if 
it can be officially reserved).


And all of this would be enhanced, in my view, if my earlier proposal 
for defaultURIs and alternateURIs attributes on a/ could be accepted 
as well: 
http://www.mail-archive.com/whatwg@lists.whatwg.org/msg20066.html in 
that it makes it much more likely that people would actually use any of 
these protocols.


thank you,
Brett



Re: [whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-25 Thread Brett Zamir
My apologies, I realized that there might be a modification needed to 
the HTML5 design of registerProtocolHandler, in that URN and XRI are 
better designed to work, in many cases, with registration to a specific 
namespace. For example, one application might only wish to handle 
urn:earthquakes or xri:http://example.com/myProtocols/someProtocol which 
hopefully registerProtocolHandler could be expanded to allow such 
specification without interfering with other URN/XRI protocol handlers 
which attempted to handle a different namespace.


thanks,
Brett

On 11/26/2010 12:20 PM, Brett Zamir wrote:
I'd like to propose reserving two protocols for use with 
navigator.registerProtocolHandler: urn and xri (or possibly xriNN 
where NN is a version number).


See http://en.wikipedia.org/wiki/Extensible_Resource_Identifier for 
info on XRI (basically allows the equivalents of URN but with a 
user-defined namespace and without needing ICANN/IANA approval). 
Although it was rejected earlier, I don't see that there is any other 
way for sites to create their own categorization or other behavior 
mechanisms in a way which is well-namespaced, does not rely on waiting 
for official approval, and has the benefits of working with the HTML5 
specification as already designed.


URN is something which I think also deserves to be reserved, if not 
all IANA protocols.


As I see it, the only way for a site to innovate safely in avoiding 
conflicts for non-IANA protocols is to use XRI (assuming especially if 
it can be officially reserved).


And all of this would be enhanced, in my view, if my earlier proposal 
for defaultURIs and alternateURIs attributes on a/ could be accepted 
as well: 
http://www.mail-archive.com/whatwg@lists.whatwg.org/msg20066.html in 
that it makes it much more likely that people would actually use any 
of these protocols.


thank you,
Brett






Re: [whatwg] (deferred) script tags with document.write built in

2010-08-17 Thread Brett Zamir

 On 8/13/2010 2:00 AM, Adam Barth wrote:

On Thu, Aug 12, 2010 at 3:02 AM, Brett Zamirbret...@yahoo.com  wrote:

  On 8/12/2010 4:19 PM, Ian Hickson wrote:

On Sat, 24 Jul 2010, Brett Zamir wrote:

Might there be a way that <script> tags could add an attribute which
combined the meaning of both defer and document.write, whereby the
last statement was evaluated to a string, but ideally treated, as far as
the DOM, with the string being parsed and replacing the containing
script node.

For example:

<script write>
    '<span onmouseover="alert(\''+(new Date())+'\')">I\'ve got the date</span>'
</script>

If E4X were supported (since we otherwise lamentably have no PHP-style
HEREDOC syntax in JavaScript to minimize the few warts above), allowing
this to be used could be especially convenient:

<script write>
    <span onmouseover="alert(new Date())">I've got the date</span>
</script>

(Maybe even a new <write> tag could be made to do this exclusively and
more succinctly.)

I chose defer as the default behavior so as to be XHTML-friendly, to
allow convenient reference by default to other DOM elements without the
need for adding a listener, and the more appealing default behavior of
not blocking other content from appearing.

Since it doesn't seem that XQuery support will be making it into
browsers anytime soon, it would be nice to be able to emulate its clean
template-friendly declarative style, dropping the need to find and
append to elements, etc..

I don't really see what this proposal would solve. Can you elaborate on
what this would allow that can't be done today?

It would simplify and beautify the addition of dynamic content and encourage
separation of business logic from design logic (at least for content
displayed on initial load).

For example, using the proposed shorter form <write>, one might do this:

<script>
   // business logic here
   var localizedMsg = "I've got the date: ";
   var businessLogicDate = new Date();
</script>
<write>
"<span>" + localizedMsg.toUpperCase() + businessLogicDate + "</span>"
</write>

It would simplify things for those with a frequent need for template pages. The
template(s) expressed by <write> could succinctly express the design logic
without need for document.write() used everywhere. The semantically named
tag would also distinguish such templates from other scripts.

For XHTML, it would be especially useful in being able to offer
document.write functionality (since such a tag would be defined as deferring
execution until the rest of the page had loaded). No need for onload
handlers, no need for adding and referencing IDs in order to find the
element, and no need for DOM appending methods in order to provide dynamic
content.

I agree that a client-side templating system would be very useful.
However, we should design it with security in mind.  The design you
propose above is very XSS-prone because you're concatenating strings.
What you want is a templating system that operates after parsing (and
possibly after tree construction) but before rendering.
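
The concatenation risk can be made concrete with a small sketch;
`escapeHtml` here is an illustrative helper, not something proposed in
this thread:

```javascript
// Sketch of why concatenating untrusted strings into markup is XSS-prone,
// and how an escaping step mitigates it. escapeHtml is illustrative only.
function escapeHtml(s) {
  return String(s).replace(/[&<>"']/g, function (ch) {
    return { '&': '&amp;', '<': '&lt;', '>': '&gt;',
             '"': '&quot;', "'": '&#39;' }[ch];
  });
}

// Naive concatenation lets attacker-controlled text break out of the markup:
var userInput = '<img src=x onerror=alert(1)>';
var unsafe = '<span>' + userInput + '</span>';           // markup injection
var safe = '<span>' + escapeHtml(userInput) + '</span>'; // inert text
```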


If the concern is to accommodate people who use blacklists of tags 
(which they shouldn't), then instead of <write/>, I also mentioned 
<script write/>. The latter, as a derivative of script, would be prone 
to XSS, but it would most likely be caught by existing blacklists.


In order for a templating system to have enough robustness while not 
unnecessarily adding yet another syntax or making undue restrictions, I 
think regular JavaScript would work fine, just cutting out the 
unnecessary cruft of load events and element finding needed by XHTML and 
cutting out the need for document.write calls for both XHTML and HTML.


I think E4X would be far more elegant than strings and ideal (e.g., see 
https://developer.mozilla.org/en/E4X_for_templating ) and a logical 
choice, but I proposed the string concatenation to hopefully minimize 
the changes that would be necessary to add such support in browsers that 
don't support E4X.
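
For what it's worth, JavaScript later gained template literals (ES2015),
which supply much of the HEREDOC-style convenience E4X was being reached
for here, and a tag function can escape interpolated values, addressing
the XSS concern with plain concatenation. The `html` tag below is an
illustrative sketch, not a standard API:

```javascript
// Sketch: a tagged template literal that auto-escapes interpolations.
// The static string parts pass through untouched; only values are escaped.
function html(strings, ...values) {
  const esc = v => String(v).replace(/[&<>"]/g,
    ch => ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' }[ch]));
  // Interleave: strings[0] + esc(values[0]) + strings[1] + ...
  return strings.reduce((out, s, i) => out + esc(values[i - 1]) + s);
}

const localizedMsg = "I've got the date: ";
const fragment = html`<span>${localizedMsg.toUpperCase()}${new Date()}</span>`;
```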


Brett



Re: [whatwg] (deferred) script tags with document.write built in

2010-08-12 Thread Brett Zamir

 On 8/12/2010 4:19 PM, Ian Hickson wrote:

On Sat, 24 Jul 2010, Brett Zamir wrote:

Might there be a way that <script/> tags could add an attribute which
combined the meaning of both defer and document.write, whereby the
last statement was evaluated to a string, but ideally treated, as far as
the DOM goes, with the string being parsed and replacing the containing
script node?

For example:

<script write>
 '<span onmouseover="alert(\''+(new Date())+'\')">I\'ve got the
date</span>'
</script>

If E4X were supported (since we otherwise lamentably have no PHP-style
HEREDOC syntax in JavaScript to minimize the few warts above), allowing
this to be used could be especially convenient:

<script write>
 <span onmouseover="alert(new Date())">I've got the date</span>
</script>

(Maybe even a new <write/> tag could be made to do this exclusively and
more succinctly.)

I chose defer as the default behavior so as to be XHTML-friendly, to
allow convenient reference by default to other DOM elements without the
need for adding a listener, and the more appealing default behavior of
not blocking other content from appearing.

Since it doesn't seem that XQuery support will be making it into
browsers anytime soon, it would be nice to be able to emulate its clean
template-friendly declarative style, dropping the need to find and
append to elements, etc..

I don't really see what this proposal would solve. Can you elaborate on
what this would allow that can't be done today?


It would simplify and beautify the addition of dynamic content and 
encourage separation of business logic from design logic (at least for 
content displayed on initial load).


For example, using the proposed shorter form <write/>, one might do this:

<script>
   // business logic here
   var localizedMsg = "I've got the date: ";
   var businessLogicDate = new Date();
</script>
<write>
'<span>' + localizedMsg.toUpperCase() + businessLogicDate + '</span>'
</write>

It would simplify for those with a frequent need for template pages. The 
template(s) expressed by <write/> could succinctly express the design 
logic without need for document.write() used everywhere. The 
semantically named tag would also distinguish such templates from other 
scripts.


For XHTML, it would be especially useful in being able to offer 
document.write functionality (since such a tag would be defined as 
deferring execution until the rest of the page had loaded). No need for 
onload handlers, no need for adding and referencing IDs in order to find 
the element, and no need for DOM appending methods in order to provide 
dynamic content.


Brett



Re: [whatwg] Please consider adding a couple more datetime input types - type=year and type=month-day

2010-08-08 Thread Brett Zamir
 How about a pull-down for Wikipedia which lets you choose the year or 
period? Or a charting application which looks at trends in history.


While some uses may be more common than others, I personally favor going 
the extra kilometre to allow full expressiveness for whatever ranges are 
allowed.
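
As a sketch of what fuller expressiveness might require of a parser, here
is an illustrative year-entry function accepting BC/AD eras; the function
name and the astronomical-numbering convention are assumptions, not part
of any proposal in this thread:

```javascript
// Sketch: parsing a year field that accepts BC/AD (or BCE/CE) eras.
// Uses astronomical year numbering (1 BC -> 0, 100 BC -> -99) so that
// arithmetic across the era boundary works. Illustrative only.
function parseYear(text) {
  const m = /^\s*(\d+)\s*(BC|BCE|AD|CE)?\s*$/i.exec(text);
  if (!m) return null;                     // not a recognizable year
  const n = parseInt(m[1], 10);
  const era = (m[2] || 'AD').toUpperCase();
  return (era === 'BC' || era === 'BCE') ? 1 - n : n;
}
```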


Brett

On 8/9/2010 9:19 AM, Ben Schwarz wrote:
While creating an input that works for every use case you can think of 
sounds like a good idea, I'd like to question whether a user would 
ever /enter a date/ that would require the inclusion of BC/AD.


I'm certain that there is a requirement to mark up such text, but as 
for /entry/ I'm strongly of the opinion that you're overcooking this.


On Mon, Aug 9, 2010 at 11:11 AM, Kit Grose k...@iqmultimedia.com.au wrote:


The field being four digits long doesn't restrict its contents to
four digits only. I suppose you do raise an interesting concern;
should the year field also permit the entry of BC/AD? If so,
that might invalidate the ability to use a number field; you'd
need to use a validation pattern on a standard text field.

—Kit

On 09/08/2010, at 10:46 AM, Andy Mabbett wrote:


 On Mon, August 9, 2010 00:44, Kit Grose wrote:
 How is a year input any different from a four-digit <input
type=number> field?

 Years can be more or fewer than four digits. Julius Caesar was
born in 100
 BC, for instance, while Manius Acilius Glabrio was consul in 91 AD.

 --
 Andy Mabbett
 @pigsonthewing
 http://pigsonthewing.org.uk







Re: [whatwg] HTML resource packages

2010-08-03 Thread Brett Zamir

 This is and was a great idea. A few points/questions:

1) I think it would be nice to see explicit confirmation in the spec 
that this works with offline caching.


2) Could data files such as .txt, .json, or .xml files be used as part 
of such a package as well?


3) Can XMLHttpRequest be made to reference such files and get them from 
the cache, and if so, when referencing only a zip in the packages 
attribute, can XMLHttpRequest access files in the zip not spelled out by 
a tag like <link/>? I think this would be quite powerful and would avoid 
duplication, even if it adds functionality (like other HTML5 features) 
which would not be available to older browsers.


4) Could such a protocol also be made to accommodate profiles of 
packages, e.g., by a namespace being allowable somewhere for each package?


Thus, if a package is specified as say being under the XProc (XML 
Pipelining) namespace profile, the browser would know it could 
confidently look for a manifest file with a given name and act 
accordingly if the profile were eventually formalized through future 
specifications or implemented by general purpose scripting libraries or 
browser extensions, etc.


Another example would be if a file packaging format were referenced by a 
page, allowing, along with a set of files, a manifest format like METS 
to be specified and downloaded, describing a sitemap for a package of 
files (perhaps to be added immediately to the user's IndexedDB database, 
navigated Gopher-like, etc.) and then made navigable online or offline 
if the files were included in the zip, thus allowing a single HTTP 
request to download a whole site (e.g., if a site offered a collection 
of books).


And manifest files might be made to specify which files should be 
updated at a specific time independently of the package (e.g., checking 
periodically for an updated manifest file outside of a zip which could 
point to newer versions).


Note: the above is not asking browsers to implement any such additional 
complex functionality here and now; rather, it is just to allow for the 
possibility of automated discovery of package files having a particular 
structure (e.g., with specifically named manifest files to indicate how 
to interpret the package contents) by providing a programmatically 
accessible namespace for each package which could be unique per 
application and interpreted in particular ways, including by general 
purpose JavaScript libraries. This is not talking about adding 
namespaces to HTML itself, but rather for specifying package profiles.


Such extensibility would, as far as I can see it, allow for some very 
powerful declarative styles of programming in relation to handling of 
multiple files (whether resource files, data files, or complete pages), 
while piggybacking on the proposal's ability to minimize the HTTP 
requests needed to get them.


best wishes,
Brett


On 8/4/2010 8:31 AM, Justin Lebar wrote:

We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
and we wanted to get the WhatWG's feedback on the feature.

For the impatient, the spec is here:

 http://people.mozilla.org/~jlebar/respkg/

and the bug (complete with builds you can try and some preliminary
performance numbers) is here:

 https://bugzilla.mozilla.org/show_bug.cgi?id=529208


You can think of resource packages as image spriting 2.0.  A page
indicates in its <html> element that it uses one or more resource
packages (which are just zip files).  Then when that page requests a
resource (be it an image, a css file, a script, or whatever), the
browser first checks whether one of the packages contains the
requested resource.  If so, the browser uses the resource out of the
package instead of making a separate HTTP request for the resource.
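
The lookup rule just described could be sketched roughly as follows, with
package contents modeled as simple in-memory maps; all names here are
illustrative, not from the spec:

```javascript
// Sketch of the resource-package lookup rule: before making an HTTP
// request, check whether any declared package contains the resource.
// Packages are modeled as objects with a Map of path -> content.
function resolveResource(path, packages) {
  for (const pkg of packages) {
    if (pkg.files.has(path)) {
      return { from: pkg.name, content: pkg.files.get(path) };
    }
  }
  // Not in any package: fall back to a separate HTTP request.
  return { from: 'network', content: null };
}

const pkgs = [
  { name: 'site.zip',
    files: new Map([['logo.png', '<bytes>'], ['main.css', 'body{}']]) }
];
```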

There's more detail than that, of course.  Hopefully it's
(mostly) clear in the spec.

I envision two classes of users of resource packages.  I'll call the
first "resource-constrained" developers.  These developers care about
how fast their page is (who doesn't?), but can't spend weeks speeding
up their page.  For these developers, resource packages are an easy
way to make their pages faster without going through the pain of
spriting their images and packaging their js/css.

The other class of users are the "resource-unconstrained" developers;
think Google or Facebook.  These developers have already put a huge
amount of effort into making their pages fast, and a naive application
of resource packages is unlikely to make them any faster.  But these
developers may be able to use resource packages cleverly to gain
speedups.  In particular, nobody (to my knowledge) currently sprites
content images, such as the results of an image search.  A determined
set of developers should be able to construct resource packages for
image search results on the fly and save some HTTP requests.


So we can avoid rehashing here the common objections to resource
packages, here's a brief overview of the arguments I've 

Re: [whatwg] Simple Links

2010-07-27 Thread Brett Zamir

 On 7/28/2010 6:22 AM, Eduard Pascual wrote:

On Tue, Mar 30, 2010 at 11:44 PM, Christoph Päper
christoph.pae...@crissov.de  wrote:

If you think about various syntax variants of wiki systems they’ve got one thing in common that makes 
them preferable to direct HTML input: easy links! (Local ones at least, whatever that means.) The 
best known example is probably double square brackets as in Mediawiki, the engine that powers the 
Wikimediaverse. A link to another article on the same wiki is as simple as “[[Foo]]”, where HTML 
would have needed “<a href="Foo">Foo</a>”.

I wonder whether HTML could and should provide some sort of similar shortening, i.e. “<a href>Foo</a>” or even, 
just maybe, “<a>Foo</a>”. The UA would append the string content, properly encoded, to the base Web address as 
the hyperlink’s target, thus behave as if it had encountered “<a href="Foo">Foo</a>”.

I prefer the binary toggle role of the ‘href’ attribute, although it doesn’t work well in the XML serialisation, because it provides 
better compatibility with existing content and when I see or write “<a>Bar</a>” I rather think of the origin of that element 
name, ‘anchor’. So I expect it to be equivalent to “<a id>Bar</a>” and “<a name>Bar</a>” which would be shortcuts 
for “<a id="Bar">Bar</a>”.

PS: Square brackets aren’t that simple actually, because on many keyboard 
layouts they’re not easy to input and might not be found on keytops at all.
PPS: The serialisation difference is not that important, because XML, unlike 
HTML, isn’t intended to be written by hand anyway.

Can't this be handled with CSS' generated content? I'm not sure if
I'll be getting the syntax right, but I think something like this:

a[href]:empty { content: attr(href); }
would pull the href from every empty <a> that has such an attribute (so
it doesn't mess with anchor-only elements) and render it as the
content of the element. Note that href attributes are resolved
relative to what your <base>s define (this is slightly better than
just appending, since it makes '../whatever'-style URLs work the
right way), so you don't need to (rather, should not) use absolute
URLs for such links.

It seems that you are only concerned about avoiding duplication of
content for the href and the content of the element. Your proposal
puts the stuff on the content, while the CSS-based solution would put
it on the href; but both put it only once.
While it is a creative solution, something as basic as the content of an 
href should not depend on CSS... CSS content is supposed to be reserved 
for decorative content.


I for one like the abbreviated syntax; a lot of times one does wish to 
make the link visible. I imagine the web would be full of such links.


Abbreviating to <a>...</a> wouldn't work as an abbreviation for <a href> as 
the former is still used for anchors.


Brett



Re: [whatwg] [URL] Starting work on a URL spec

2010-07-23 Thread Brett Zamir

 On 7/24/2010 12:02 PM, Boris Zbarsky wrote:

On 7/23/10 11:59 PM, Silvia Pfeiffer wrote:

Is that URLs as values of attributes in HTML or is that URLs as pasted
into the address bar? I believe their processing differs...


It certainly does in Firefox (the latter have a lot more fixup done to 
them, and there are also differences in terms of how character 
encodings are handled).


I would be particularly interested in data on this last, across 
different browsers, operating systems, and locales...  There seem to 
be servers out there expecting their URIs in UTF-8 and others 
expecting them in ISO-8859-1, and it's not clear to me how to make 
things work with them all.


Seems to me that if they are not in UTF-8, they should be treated as 
bugs, even if that is not a de jure standard.


Brett



Re: [whatwg] Please disallow javascript: URLs in browser address bars

2010-07-22 Thread Brett Zamir

 On 7/23/2010 6:35 AM, Luke Hutchison wrote:

On Thu, Jul 22, 2010 at 5:39 PM, Boris Zbarsky bzbar...@mit.edu wrote:


  I can see the security benefits of disallowing all cross-origin application
of javascript: (if you don't know where it came from, don't apply it).

Yes, that is actually a really good way to put things -- javascript
typed into the URL bar is cross-origin.  (And dragging bookmarklets to
the address bar or bookmarks bar is also cross-origin, that's the
reason that a security check should be applied and/or user warning
given.)

Facebook already disallows the execution of arbitrary js code on a fan
page, of course, which is why these viruses require you to manually
copy/paste into the addressbar.


In whatever security mechanism is worked out, besides preserving the 
ability for people to be able to use the URL bar for potentially 
privileged bookmarklets if they wish (even if they must give permission 
after receiving a specific warning), I would actually like to see the 
privileges available to bookmarklets expanded, upon explicit warnings 
and user permission. For example, it would be of enormous use to be able 
to link someone to a specific site, while manipulating the view of that 
page, such as to mash over the data with tooltips, mash down some data 
from it to a smaller set, mash up the data with additional notes/sources 
(whether from other sites or text found on the source page), or mash 
under the data with semantic markup changes or highlighting of specific 
text.


I know this is absolutely dangerous, but if people can install 
extensions which can wipe out hard drives with two clicks and a 
restart (and thank God that such power exists in browsers like Firefox 
so people can make extensions which do access the file system for 
positive uses), there should be a way, such as with dead-serious 
warnings (and I'll concede disallowing https), that people can mash an 
existing source and still work in its scope (just as I think there 
should be the ability to run cross-domain Ajax after getting user 
permission). Greasemonkey is great, but it would be nice for there to be 
a standard, especially for uses such as referring people immediately to a 
specific subset of content on another page.


Brett



[whatwg] iframes with potential for independent navigation controls

2010-07-21 Thread Brett Zamir
 I would like to see attributes added to allow iframes to have 
independent navigation controls, or rather, to allow a parent document 
to have ongoing access to the navigation history of its iframes (say, to 
be informed of changes to their histories via an event) so that it could 
create such controls. I would think that the average user might suspect 
that their clicks would not necessarily be private if they were already 
in the context of another site, but if privacy would be of concern here 
(or security, though GET requests alone shouldn't be able to give 
access to sensitive data), maybe the user could be asked for permission, 
as with Geolocation.


This really has, I think, wonderful potential.

Especially but not exclusively for larger screens, one can envision, for 
example, a site which displays content in a table, with paragraphs being 
in one column, and commentary in another.


If the commentary column uses, say, a wiki (or a comment feed/discussion 
thread) to keep track of resources and cross-references, insights, 
errata, etc. pertaining to a given paragraph or verse (e.g., for books, 
but also potentially for blog articles, etc.--anywhere people may wish 
to dissect in context), it would be desirable for one to be able to, 
say, edit content in one fixed iframe pane, follow links in another one 
(including to other domains), etc., all while keeping the original 
context of the table of content and iframes on screen, and allowing the 
user to go backward and forward in any pane independently (or possibly 
type in a URL bar for each iframe). One can even imagine ongoing chat 
discussions taking place within such a mosaic.


best wishes,
Brett



Re: [whatwg] Allowing > in attribute values

2010-06-24 Thread Brett Zamir

On 6/24/2010 10:13 AM, Benjamin M. Schwartz wrote:

The HTML5 spec appears to allow ">" inside an attribute value.  For
example, the following page (note the body tag) passes the experimental
HTML5 validator at w3c.org:

<!DOCTYPE HTML><html><head><title></title></head>
<body class="3>2">
</body></html>

I think ">" should be disallowed inside attribute values.   It is
disallowed in XHTML [1].
I do not see any reference to this in the XHTML 1.0 specification (nor 
XHTML 1.1), and in XML, section 2.4, it states only that ">" must be 
escaped if part of the sequence "]]>" in content, which I guess means 
only element content. E4X also does not escape ">" in attribute values 
(only in element content).
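
For reference, here is a sketch of attribute-value serialization
consistent with XML's rules, under which `<`, `&`, and the delimiting
quote must be escaped while a literal `>` is permitted; the helper is
illustrative, not a standard API:

```javascript
// Sketch: serializing an attribute value per XML's AttValue rules.
// '<', '&', and the surrounding quote character must be escaped;
// a literal '>' is allowed to appear unescaped.
function serializeAttr(name, value) {
  const escaped = String(value)
    .replace(/&/g, '&amp;')   // must escape '&' first
    .replace(/</g, '&lt;')
    .replace(/"/g, '&quot;'); // we delimit with double quotes
  return name + '="' + escaped + '"';
}
```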


Brett


[whatwg] Syntax highlighting language attribute for editable elements

2010-06-12 Thread Brett Zamir
Has thought been given to allowing textarea, input and/or contenteditable 
elements to use an attribute (maybe like <code/> does with 
class="language-XX") so that user agents might be able to display the 
editable text with syntax highlighting automatically?


This should not adversely affect users who do not have such browser 
support, nor does it put pressure on browsers to implement immediately 
(add-ons might take care of such a role). But having a convention in 
place (even if languages are not predefined) would ensure that the 
burden of implementing such support could be shifted away from the 
developer if they are not so inclined.


I'd prefer to see a dedicated attribute (also on <code/>) since the 
language type does convey semantic information of general interest, but I 
think it would also ideally be consistent (i.e., the same attribute to 
be used in <code/> as in <textarea/>, etc.).


Maybe @lang/@xml:lang could be used for this purpose if its definition 
could somehow be widened to recognize computer languages.


It would be nice, however, to also have some means of indicating that 
the web author is providing their own styling of the element in the 
event they wish to use their own editor.


thank you,
Brett Zamir


Re: [whatwg] URN or protocol attribute

2010-06-03 Thread Brett Zamir

On 3/11/2010 10:44 AM, Brett Zamir wrote:

On 3/11/2010 10:31 AM, Ian Hickson wrote:

I would recommend following a pattern somewhat like the Web's initial
development: create a proof of concept, and convince people that it's 
what
they want. That's the best way to get a feature adopted. This is 
described

in more detail in the FAQ:


http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_a_specification.3F 



Ok, fair enough. I think I'll try that as a browser extension snip


Just as a follow-up, I have now made a Firefox extension which supports 
two attributes on <a/>: "uris" and "alternateURIs", whereby the former 
takes precedence over href, and the latter are accessible only by 
right-clicking links (though potentially discoverable by custom styling 
of such links (automatable by the extension)).


My thought is that sites which have the following goals may be 
particularly interested:


1) Those wishing to maintain objectivity and refrain from endorsing 
specific sites, e.g., governments, news institutions, scholars, or sites 
like Wikipedia. Even for a site's internal links, use of alternateURIs 
could offer convenience (e.g., Wikipedia would no doubt wish to continue 
to use href to refer to its own ISBN page by default, but could use the 
alternateURIs attribute to allow right-clicks on the link to activate 
the URN link which in turn activates their chosen default handler, e.g., 
Amazon, Google Books, etc.). The same could be done for music, etc.
2) giving full user choice as to how to view the data (especially useful 
for information of common and particular interest to the site viewers, 
e.g., links to the Bible in a religious forum)
3) those wishing to try out new protocols of whatever type (not only 
URNs), such as chatting protocols, whether installed via web or browser 
extension, as the proposed markup gives them a convenient fall-back to 
href, so they don't have to have dead links for those whose browsers 
do not support the protocol.
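
The fall-back behavior in point 3 might be sketched like this; the
supported-protocol set would in practice come from the browser or
extension, and the helper itself is hypothetical:

```javascript
// Sketch: pick the first URI whose protocol the client supports, else
// fall back to the plain href so links are never dead.
function pickUri(uris, href, supported) {
  for (const uri of uris) {
    const scheme = uri.slice(0, uri.indexOf(':'));
    if (supported.has(scheme)) return uri;
  }
  return href;  // convenient fall-back for unsupported protocols
}

const choice = pickUri(
  ['urn:isbn:0451450523', 'myproto:thing'],       // "uris" attribute
  'http://example.com/isbn/0451450523',           // href fall-back
  new Set(['http', 'https'])                      // what this client handles
);
```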


https://addons.mozilla.org/en-US/firefox/addon/162154/

Brett



Re: [whatwg] URN or protocol attribute

2010-06-03 Thread Brett Zamir

On 6/4/2010 12:59 PM, Brett Zamir wrote:

On 3/11/2010 10:44 AM, Brett Zamir wrote:

On 3/11/2010 10:31 AM, Ian Hickson wrote:

I would recommend following a pattern somewhat like the Web's initial
development: create a proof of concept, and convince people that 
it's what
they want. That's the best way to get a feature adopted. This is 
described

in more detail in the FAQ:


http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_a_specification.3F 



Ok, fair enough. I think I'll try that as a browser extension snip


Just as a follow-up, I have now made a Firefox extension which 
supports two attributes on <a/>: "uris" and "alternateURIs", whereby 
the former takes precedence over href, and the latter are accessible 
only by right-clicking links (though potentially discoverable by 
custom styling of such links (automatable by the extension)).


My thought is that sites which have the following goals may be 
particularly interested:


1) Those wishing to maintain objectivity and refrain from endorsing 
specific sites, e.g., governments, news institutions, scholars, or 
sites like Wikipedia. Even for a site's internal links, use of 
alternateURIs could offer convenience (e.g., Wikipedia would no 
doubt wish to continue to use href to refer to its own ISBN page by 
default, but could use the alternateURIs attribute to allow 
right-clicks on the link to activate the URN link which in turn 
activates their chosen default handler, e.g., Amazon, Google Books, 
etc.). The same could be done for music, etc.
2) giving full user choice as to how to view the data (especially 
useful for information of common and particular interest to the site 
viewers, e.g., links to the Bible in a religious forum)
3) those wishing to try out new protocols of whatever type (not only 
URNs), such as chatting protocols, whether installed via web or 
browser extension, as the proposed markup gives them a convenient 
fall-back to href, so they don't have to have dead links for those 
whose browsers do not support the protocol.


https://addons.mozilla.org/en-US/firefox/addon/162154/


Just to elaborate a little bit further, one possible future addition 
which could further enhance this experience would be to design a 
protocol (and corresponding markup-detection mechanism), say "created:" 
or "wiki:", which would first check via a HEAD request whether the page 
were created or not, and then style the link accordingly and possibly 
alter the URL to lead directly to the editing page. Alternatively, it could 
make HEAD requests to try out a sequence of URLs, e.g., checking whether 
Citizendium had an article, and if not, creating a link to the Wikipedia 
article if present. While this could be done potentially via the server 
(e.g., this extension for MediaWiki: 
http://www.mediawiki.org/wiki/Extension:BADI_Pages_Created_Links ), I 
believe allowing client-side markup to do it would facilitate use of 
this potential more widely, allowing wikis or open forums to link with 
one another in a way which prevents wasted visits by their users.


Although URNs could also be used (as supported already potentially in 
the above extension) to try to find encyclopedic articles (e.g., 
urn:name:pizza), or better yet, through a new protocol which could 
suggest intended uses of the information (e.g., 
find:urn:name:pizza?category=order, find:urn:name:pizza?category=define, 
etc.), thereby avoiding hard-coding information, the "created:" 
suggestion above could give authors more control than they have now if 
they did want to suggest a particular pathway.


Brett



[whatwg] Avoiding new globals in HTML5 ECMAScript

2010-05-09 Thread Brett Zamir

Hello,

Although it seems a lot of attention has been given to ensuring 
backward-compatibility in HTML5, and while a kind of namespacing has 
been considered in use of data- attributes (over expando properties), it 
appears to my limited observations that global (window) properties are 
being added without sufficient regard for backward compatibility (and in 
any case limiting future variable naming by authors).


While I can understand the convenience of properties like 
window.localStorage or window.JSON, it seems to me that new global 
properties and methods (at least future ones!) should be added within 
some other reserved object container besides window.


While I can appreciate that some would argue that the convenience of 
globals to authors outweighs potential conflict concerns (and I know I'm 
not offering this suggestion very early in the process), it seems to me 
that HTML5's client-side ECMAScript should model good practices in 
limiting itself as far as new globals, perhaps similar to how XML 
reserved identifiers beginning with "xml", doing the same by allowing 
one "W3C" global or maybe "HTML{N}" globals or the like ("HTML" alone 
would no doubt be way too likely to conflict), allowing authors the 
assurance that they can name their properties freely within a given set 
of constraints without fear of being overridden later.
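
A rough sketch of the kind of single reserved container being suggested;
the name `HTML5` is purely illustrative, not anything specified:

```javascript
// Sketch: grouping new platform APIs under one reserved global instead of
// adding many top-level names. 'HTML5' is a hypothetical container name.
var HTML5 = {
  storage: { /* a localStorage-like API would hang here */ },
  json: JSON  // existing objects could be aliased into the container
};

// Author code would then be free to use short top-level names without
// fear of collision with future platform additions:
var localStorage = 'my own variable, unclobbered';
```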


thank you,
Brett



Re: [whatwg] Avoiding new globals in HTML5 ECMAScript

2010-05-09 Thread Brett Zamir
My apologies, it was brought to my attention that JSON was specified in 
ECMAScript 5, but the principle still applies (for ECMAScript as well I 
would say).


thank you,
Brett

On 5/10/2010 1:08 PM, Brett Zamir wrote:

Hello,

Although it seems a lot of attention has been given to ensuring 
backward-compatibility in HTML5, and while a kind of namespacing has 
been considered in use of data- attributes (over expando properties), 
it appears to my limited observations that global (window) properties 
are being added without sufficient regard for backward compatibility 
(and in any case limiting future variable naming by authors).


While I can understand the convenience of properties like 
window.localStorage or window.JSON, it seems to me that new global 
properties and methods (at least future ones!) should be added within 
some other reserved object container besides window.


While I can appreciate that some would argue that the convenience of 
globals to authors outweighs potential conflict concerns (and I know 
I'm not offering this suggestion very early in the process), it seems 
to me that HTML5's client-side ECMAScript should model good practices 
in limiting itself as far as new globals, perhaps similar to how XML 
reserved identifiers beginning with "xml", doing the same by 
allowing one "W3C" global or maybe "HTML{N}" globals or the like 
("HTML" alone would no doubt be way too likely to conflict), allowing 
authors the assurance that they can name their properties freely 
within a given set of constraints without fear of being overridden 
later.


thank you,
Brett






Re: [whatwg] Parsing processing instructions in HTML syntax: 10.2.4.44 Bogus comment state

2010-03-17 Thread Brett Zamir

On 3/2/2010 6:54 PM, Ian Hickson wrote:

On Tue, 2 Mar 2010, Elliotte Rusty Harold wrote:
   

The handling of processing instructions in the XHTML syntax seems
reasonably well-defined; but it feels a little off in the HTML syntax.
 

There's no such thing as processing instructions in text/html.

There was such a thing in HTML4, because of its SGML heritage, though it
was explicitly deprecated.

   


Doesn't seem deprecated per 
http://www.w3.org/TR/html401/appendix/notes.html#h-B.3.6



Briefly it seems that <? causes the parser to go into Bogus comment
state, which is fair enough. (I wouldn't really recommend that anyone
use processing instructions in HTML syntax anyway.) However the parser
comes out of that state at the first >. Because processing instructions
can contain > and terminate only at the two-character sequence ?>, this
could cause PI processing to terminate early and leave a lot more error
handling and a confused parser state in the text yet to come.
 

In HTML4, PIs ended at the first >, not at ?>. <?target data> is the
syntax of PIs when the SGML options used by HTML4 are applied.

In any case, the parser in HTML5 is based on what browsers do, which is
also to terminate at the first >. It's unlikely that we can change that,
given backwards-compatibility needs.

There's a simple workaround: don't use PIs in text/html, since they don't
exist in HTML5 at all, and don't send XML as text/html, since XML and HTML
have different syntaxes and aren't compatible.

   


In http://dev.w3.org/html5/html4-differences/ , it says:

HTML5 defines an HTML syntax that is compatible with HTML4 and XHTML1 
documents published on the Web, but is not compatible with the more 
esoteric SGML features of HTML4, such as processing instructions 
http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.3.6 
and shorthand markup 
http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.3.7.


This seems to me to suggest that backward compatibility can be broken as 
far as processing instructions (i.e., requiring "?>" and not merely ">" to 
close a processing instruction). If not, then it doesn't seem clear from 
the specification that processing instructions are indeed not allowed, 
because the parsing model does allow them; and with processing 
instructions being platform-specific by definition and not apparently 
explicitly prohibited by HTML5 (unless that is what you are trying to 
say here), HTML5 syntax does seem to be compatible with them. But if you 
are trying to prohibit them for any use whatsoever yet still technically 
allow them to be ignored for compatibility, it seems this would 
contradict the statement at http://dev.w3.org/html5/html4-differences/ 
that there is no longer a need for marking features "deprecated". Or 
is the difference that these are forbidden from doing anything but will 
be allowed (and ignored) indefinitely into the future in future versions 
of HTML?


Btw, I've added a talk section at the wiki page 
http://wiki.whatwg.org/wiki/Talk:HTML_vs._XHTML#Harmony to suggest 
covering XHTML-HTML compatibility guidelines specifically, so would 
appreciate a reply there, so I know whether we can begin edits in this 
vein on the page.


thanks,
Brett



Re: [whatwg] Lifting cross-origin XMLHttpRequest restrictions?

2010-03-13 Thread Brett Zamir

On 3/12/2010 3:41 PM, Anne van Kesteren wrote:
On Fri, 12 Mar 2010 08:35:48 +0100, Brett Zamir bret...@yahoo.com 
wrote:
My apologies if this has been covered before, or if my asking this is 
a bit dense, but I don't understand why there are restrictions on 
obtaining data via XMLHttpRequest from other domains, if the request 
could be sandboxed to avoid passing along sensitive user data like 
cookies (or if the user could be asked for permission, as when 
installing browser extensions that offer similar privileges).


Did you see

  http://dev.w3.org/2006/webapi/XMLHttpRequest-2/
  http://dev.w3.org/2006/waf/access-control/

?


I have now, thanks. :)  Though I regrettably don't have a lot of time 
now to study it as deeply as I'd like (nor Michal Zalewski's reference 
to UMP), and I can't speak to the technical challenges of browsers (and 
their plug-ins) implementing the type of sandboxing that would be 
necessary for this if they don't already, I was just hoping I could 
articulate interest in finding a way to overcome if possible, and 
question whether the security challenges could be worked around at least 
in a subset of cases.


While I can appreciate such goals as trying to prevent 
dictionary-based, distributed, brute-force attacks that try to get login 
accounts to 3rd-party servers mentioned in the CORS spec and 
preventing spam or opening accounts on behalf of users and the like, I 
would think that at least GET/HEAD/OPTIONS requests should not be quite 
as important an issue.


As far as the issue Michal brought up about the client's IP being sent, 
I might think this problem could be mitigated by a client header being 
added to indicate the domain of origin behind the request. It's hard to 
lay the blame on the client for a DoS if it is known which server was 
initiating. (Maybe this raises some privacy issues, as the system would 
make known who was visiting the initiating site, but I'd think A) this 
info could be forged anyways, and B) any site could publish its visitors 
anyways.) I'll admit this might make things more interesting legally 
though, e.g., whether the client shared some or all responsibility, for 
DoS or copyright violations, especially if interface interaction 
controlled the number of requests. But as far as the burden on the user, if 
the user is annoyed that their browser is being slowed as a result of 
requests made on their behalf (though I'm not sure how much work it 
would save given that the server still has to maintain a connection), 
they can close the tab/window, or maybe the browser could offer to 
selectively disable such requests or request permission.


I would think that the ability for clients to help a server crawl the 
internet might even potentially be a feature rather than a bug, allowing 
a different kind of proxy opportunity for server hosts which are in 
countries with blocked access. Besides this kind of reverse proxy (to 
alter the phrase), I wouldn't think it would be that compelling for 
sites to outsource their crawling (except maybe as a very insecure and 
unpredictably accessible backup or caching service!), since they'd have 
to retrieve the information anyways, but again I can't see what harm 
there would really be in it, except that addressing DoS plans would need 
to address an additional header.


I apologize for not being able to research this more carefully, but I 
was just hoping to see if there might be some way to allow at least a 
safer subset of requests like GET and HEAD by default. Akin to the 
rationales behind my proposal for browser support of client-side XQuery, 
including as a content type (at 
http://brett-zamir.me/webmets/index.php?title=DrumbeatDescription ), it 
seems to me that users could really benefit from such capacity in 
client-side JavaScript, not only for the sake of greater developer 
options, but also for encouraging greater experimentation of mash-ups, 
as the mash-up server is not taxed with having to obtain the data 
sources (nor tempted to store stale copies of the source data nor 
perhaps be as concerned with the need to obtain republishing permissions).


Servers are already free to obtain and mix in content from other 
sites, so why can't client-side HTML JavaScript be similarly empowered?


Because you would also have access to e.g. IP-authenticated servers.



As suggested above, could a header be required on compliant browsers to 
send a header along with their request indicating the originating 
server's domain?
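For what it's worth, the header being asked about here is essentially what the CORS draft defines as the Origin request header, with the server opting in by echoing Access-Control-Allow-Origin. A minimal server-side sketch; the allow-list contents and function name are made up for illustration:

```python
# Minimal sketch of CORS-style origin checking on the server side.
# The allow-list and the helper name are illustrative, not from any spec.
ALLOWED_ORIGINS = {"https://example.com", "https://mashup.example.org"}

def cors_headers(request_headers):
    """Return CORS response headers for a cross-origin GET/HEAD request."""
    origin = request_headers.get("Origin")
    if origin in ALLOWED_ORIGINS:
        # Echo the specific origin (rather than "*") and add Vary so
        # caches don't serve the grant to other origins.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}  # no CORS headers: the browser blocks cross-origin reads

print(cors_headers({"Origin": "https://example.com"}))
print(cors_headers({"Origin": "https://evil.example"}))
```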


best wishes,
Brett



[whatwg] Lifting cross-origin XMLHttpRequest restrictions?

2010-03-11 Thread Brett Zamir

Hi,

My apologies if this has been covered before, or if my asking this is a 
bit dense, but I don't understand why there are restrictions on 
obtaining data via XMLHttpRequest from other domains, if the request 
could be sandboxed to avoid passing along sensitive user data like 
cookies (or if the user could be asked for permission, as when 
installing browser extensions that offer similar privileges).


Servers are already free to obtain and mix in content from other sites, 
so why can't client-side HTML JavaScript be similarly empowered?


If the concern is simply to give servers more control and avoid Denial 
of Service effects, why not at least make the blocking opt in (like 
robots.txt)? There are a great many uses for being able to mash up data 
from other sites, including from the client, and it seems to me to be 
unnecessarily restrictive to require explicit permissions. Despite my 
suggesting opt-in blocking as an alternative, I wouldn't even think 
there should be this option at all, since servers are technically free 
to grab such content unhindered, and everyone I believe should have the 
freedom and convenience to be able to design and enjoy applications 
which just work--mixing from other pages without extra effort, unless 
they are legally prohibited from doing so.


If the concern is copyright infringement, the same concern holds true 
for servers which can already obtain such content unrestricted, and I do 
not believe overly cautious preemptive policing is a valid pretext for 
constraining technology and its opportunities for sites and users.


thanks,
Brett


Re: [whatwg] URN or protocol attribute

2010-03-10 Thread Brett Zamir

On 3/11/2010 9:19 AM, Ian Hickson wrote:

On Mon, 8 Feb 2010, Brett Zamir wrote:
   

Internet Explorer has an attribute on anchor elements for URNs:
http://msdn.microsoft.com/en-us/library/ms534710%28VS.85%29.aspx

This has not caught on in other browsers, though I believe it could be a very
powerful feature once the feature was supported with a UI that handled URNs
(as with adding support for custom protocols).

Imagine, for example, a link like:

<a href="http://www.amazon.com/...(shortened)"
urn="isbn:9210020251">United Nations charter</a>

The default behavior would simply follow the link, but if a user agent
supported the @urn attribute, and if the browser (or browser add-ons)
had registered support for that URN namespace identifier (here isbn),
it could, for example, open a dialog to ask which handler to use (or
whether to always use it), it could ask or otherwise allow in
preferences an HTML page (with wildcards) where the attribute's content
could be passed, or it could give the option whenever the user
right-clicked to choose which handler they wanted to use for a given
link.
 

Does this match IE's behaviour with the urn= attribute?

   
No, to my knowledge, there is no special behavior in IE with regard to 
how it is used. I was really more listing it as a precedent, since I was 
more in favor of a more general purpose defaultProtocol attribute 
which could not only allow URNs but other protocols to be tried before 
defaulting to a regular href link.



Historically, browsers that have wanted to offer dedicated services for
specific features, e.g. the iPhone handling map views using a dedicated
Maps application, have done so by simply overriding parts of the URL
space, e.g. in that case detecting when a page is on the Google Maps site
and parsing the URL locally instead of sending it to the remote site.

   
The problem with this is that it is not an approach which can likely be 
taken by browser extensions nor be offered to websites which wish to 
register themselves as handlers. And URNs by definition are not specific 
to any site.

Is there really a need for a more dedicated mechanism? It's not clear that
there is much pent-up demand for this.
   


There wasn't a lot of pent up demand for the web itself either (why 
would people or companies want to link to other people's sites?); if 
people aren't able to use a feature or know of the concept, they might 
not think of asking for it. I think that, as with my earlier suggestion 
on shared databases or storage, people are just not accustomed 
to thinking that the web can be used in a way which collaborates between 
sites (more than mere links), since the first idea that pops into 
people's minds is how they can put their own site up. That doesn't mean 
they wouldn't like to work with other sites or offer a feature that 
would have normal fallback behavior in browsers that didn't support it.


If people can see a need for registering protocol handlers, the 
defaultProtocol attribute is I think the best way to make it work.  
Why would someone want to experiment with using a protocol (including 
urn:), say ISBN:, if the interface will only say to their users, "This 
browser does not recognize that protocol/namespace ID"? The 
defaultProtocol attribute would give a chance for the protocol/URN NID 
to be checked for support, but if not working could default to visiting 
the href target. Wouldn't that be a useful feature? Few will 
experiment with <a href="urn:isbn:...">...</a> as it is just a dead link 
for browsers that don't support the protocol, but I'm sure many sites 
would be willing to allow <a defaultProtocol="urn:isbn:..." 
href="http://books...">...</a> as it doesn't hurt to add one extra 
attribute, even if say browsers are slow at supporting the attribute.
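The fallback behavior proposed for @defaultProtocol can be sketched in a few lines. This is a hypothetical model of the proposal, not an implemented browser API; the handler registry and function names are made up:

```python
# Hypothetical sketch of the proposed @defaultProtocol fallback:
# dispatch to a registered handler for the scheme if one exists,
# otherwise follow the ordinary @href.
handlers = {"urn": lambda url: f"handled by URN handler: {url}"}

def follow_link(href, default_protocol=None):
    if default_protocol:
        scheme = default_protocol.split(":", 1)[0].lower()
        handler = handlers.get(scheme)
        if handler:
            return handler(default_protocol)
    return f"navigate to {href}"

# A browser with a urn: handler uses it; one without simply degrades
# to an ordinary http: navigation.
print(follow_link("http://books.example/9210020251",
                  default_protocol="urn:isbn:9210020251"))
print(follow_link("http://books.example/9210020251",
                  default_protocol="mailto:someone@example.com"))
```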


The web is not only about companies that want to make money and shuffle 
people in the direction they want. There are also sites (including 
companies without a stake in certain content) who want to offer more 
choice to their users (e.g., Wikipedia, governments, individuals, etc.). 
And no doubt any company wouldn't mind being able to register themselves 
in a way where they could offer themselves to users visiting those more 
neutral sites (e.g., Amazon registering itself for links leading to ISBN 
links at other sites). It simply offers more choice to users...


Brett



Re: [whatwg] URN or protocol attribute

2010-03-10 Thread Brett Zamir

On 3/11/2010 10:31 AM, Ian Hickson wrote:

On Thu, 11 Mar 2010, Brett Zamir wrote:
   

Is there really a need for a more dedicated mechanism? It's not clear
that there is much pent-up demand for this.
   

There wasn't a lot of pent up demand for the web itself either (why
would people or companies want to link to other people's sites?); if
people aren't able to use a feature or know of the concept, they might
not think of asking for it.
 

That's true, but we only have so many resources, so we have to prioritise.
Things that have pent-up demand are typically more important. :-)

I would recommend following a pattern somewhat like the Web's initial
development: create a proof of concept, and convince people that it's what
they want. That's the best way to get a feature adopted. This is described
in more detail in the FAQ:


http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_a_specification.3F
   


Ok, fair enough. I think I'll try that as a browser extension, as well 
as possibly the other idea of a shared database API which I think has 
the potential to become an even more powerful feature (e.g., a local 
calendar to which any site could offer to add events and be viewed on 
the web or in a browser extension)... But there still may be some things 
(like an official auxiliary world language!) which can only show their 
real benefits (and potential demand) when implemented across the board...


Brett



Re: [whatwg] Parsing processing instructions in HTML syntax: 10.2.4.44 Bogus comment state

2010-03-04 Thread Brett Zamir

On 3/3/2010 7:06 PM, Philip Taylor wrote:

On Wed, Mar 3, 2010 at 10:55 AM, Brett Zamir bret...@yahoo.com wrote:
   

On 3/2/2010 6:54 PM, Ian Hickson wrote:
 

On Tue, 2 Mar 2010, Elliotte Rusty Harold wrote:

   

Briefly it seems that <? causes the parser to go into Bogus comment
state, which is fair enough. (I wouldn't really recommend that anyone
use processing instructions in HTML syntax anyway.) However the parser
comes out of that state at the first >. Because processing instructions
can contain > and terminate only at the two-character sequence ?>, this
could cause PI processing to terminate early and leave a lot more error
handling and a confused parser state in the text yet to come.

 

In HTML4, PIs ended at the first >, not at ?>. <?target data> is the
syntax of PIs when the SGML options used by HTML4 are applied.

In any case, the parser in HTML5 is based on what browsers do, which is
also to terminate at the first >. It's unlikely that we can change that,
given backwards-compatibility needs.

   

Are there really a lot of folks out there depending on old HTML4-style
processing instructions not being broken?
 

Yes, e.g. a load of pages like
http://www.forex.com.cn/html/2008-01/821561.htm (to pick one example
at random) say:

   <?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />

and don't have the string ?> anywhere.
   


Ok, fair enough.  But while it is great that HTML5 seeks to be 
transitional and backwards compatible, HTML5 (thankfully) already breaks 
compatibility for the sake of XML compatibility (e.g., localName or 
getElementsByTagNameNS). It seems to me that there should still be a 
role of eventually transitioning into something more full-featured in a 
fundamental, language-neutral way (e.g., supporting a fuller subset of 
XML's features such as external entities and yes, XML-style processing 
instructions); extensible, including the ability to include XML from 
other namespaces which may also encourage or rely on using their own XML 
processing instructions, for those who wish to experiment or supplement 
the HTML standard behavior; and more harmonious and compatible with a 
simpler syntax (i.e., XML's)--even if the more complex syntax is more 
prominent and continues to be supported indefinitely.


Brett



Re: [whatwg] URN or protocol attribute

2010-02-10 Thread Brett Zamir

On 2/10/2010 3:55 PM, Martin Atkins wrote:


Brett Zamir wrote:

Hi,

Internet Explorer has an attribute on anchor elements for URNs: 
http://msdn.microsoft.com/en-us/library/ms534710%28VS.85%29.aspx


This has not caught on in other browsers, though I believe it could 
be a very powerful feature once the feature was supported with a UI 
that handled URNs (as with adding support for custom protocols).


Imagine, for example, a link like:

<a href="http://www.amazon.com/...(shortened)" 
urn="isbn:9210020251">United Nations charter</a>



[snip details]

I like what this proposal achieves, but I'm not sure it's the right 
solution.


Here's an attempt at stating what problem you're trying to solve 
without any specific implementation: (please correct me if I 
misunderstood)


 * Provide a means to link to or operate on a particular artifact 
without necessarily requiring that the artifact be retrieved from a 
specific location.


 * Provide graceful fallback to user-agents which do not have any 
specialized handling for a particular artifact.



Yes, exactly.
This is different to simply linking to a different URL scheme (for 
example, linking a mailto: URL in order to begin an email message 
without knowing the user's preferred email provider) because it 
provides a fallback behavior for situations where there is no handler 
available for a particular artifact.


Yes. But note also that it would be possible to have the @protocol 
attribute be used, for example, for already frequent protocols like 
mailto:, and the @href be http:. Or, to use a URN, the protocol could 
be urn:isbn:... and the @href could be http, etc.


Since 'href' also links to a protocol, it might be more clear for the 
proposed attribute to be called something like @defaultProtocol.



== Example Use-cases ==

 * View a particular book, movie or other such product without 
favoring a particular vendor.


 * View a map of the location for particular address or directions to 
that address without favoring a particular maps provider.


 * View a Spanish translation of some web document without favoring a 
particular translation provider.


 * Share a link/photo/etc with friends without favoring a particular 
publishing platform. (i.e. generalizing the Tweet This, Share on 
Facebook, class of links)




Yes. This would of course depend on the protocols existing. For example, 
XMPP as an open protocol, might work for your last examples if those 
services were actually using XMPP. And your other examples would also be 
excellent use cases.



== Prior Art ==

=== Android OS Intents ===

The Android OS has a mechanism called Intents[1] which allow one 
application to describe an operation it needs have performed without 
nominating a particular other application to perform it.


Intents are described in detail here:
http://developer.android.com/guide/topics/intents/intents-filters.html

An intent that does not identify a particular application consists of 
the following properties:


 * Action: a verb describing what needs to be done. For example, 
view, edit, choose, share, call.
 * Object: the URI of a particular thing that the action is to be done 
to. This is not specified for actions that apply only to a class of 
object, such as choose.
 * Object Type: the MIME type of the Object, or if no particular 
Object is selected a concrete MIME type or wildcard MIME type (e.g. 
image/*) describing the class of object that the action relates to.


A process called Intent Resolution is used to translate an abstract 
intent like the above into an explicit intent which nominates a 
particular handler.
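A toy model of that resolution step, with made-up application names and filters (this is not the actual Android API, just a sketch of the matching logic):

```python
# Toy sketch of Android-style intent resolution: each app declares
# (action, mime-pattern) filters, and resolution returns every app
# whose filter matches the abstract intent. Names are invented.
import fnmatch

FILTERS = {
    "EmailClient": [("share", "image/*"), ("share", "text/plain")],
    "MmsApp": [("share", "image/*")],
    "MapsApp": [("view", "geo/*")],
}

def resolve_intent(action, mime_type):
    """Return all registered apps able to handle (action, mime_type)."""
    return sorted(
        app for app, filters in FILTERS.items()
        if any(act == action and fnmatch.fnmatch(mime_type, pattern)
               for act, pattern in filters)
    )

# Sharing a PNG matches two handlers, so a chooser UI would appear.
print(resolve_intent("share", "image/png"))   # ['EmailClient', 'MmsApp']
```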


Often when applications use intents a UI is displayed which allows a 
user to choose one of several available applications that can perform 
the action. For example, the built-in photo gallery application 
provides a Share command on a photo. By default, this can be handled 
by application such as the email client and the MMS application, but 
other applications can declare their support for intents of this type 
thus allowing plug-in functionality such as sharing a photo on Facebook.




That's an interesting consideration.

I think some behaviors should be necessarily limited with links (as they 
are in HTTP disallowing a link to make a POST or PUT request or upload a 
form (without JavaScript at least))--so that, e.g., spam links don't 
cause users to accidentally do things they didn't want to do. So 
side-effects should probably not occur (like "share" at least), unless 
it was merely, as in your use cases with Twitter/Facebook, to lead to a 
UI control confirming that you wanted to share.


Unlike URNs, a regular protocol could already handle the use cases you 
mention, and perhaps the Intents mechanism could itself be made into a 
protocol, e.g.:


android:intents;action=CALL;data=tel:123-555-1212

Being that experimentation here is fairly early on, and being that there 
may be too many types of fundamental actions/data/etc. to agree

[whatwg] URN or protocol attribute

2010-02-08 Thread Brett Zamir

Hi,

Internet Explorer has an attribute on anchor elements for URNs: 
http://msdn.microsoft.com/en-us/library/ms534710%28VS.85%29.aspx


This has not caught on in other browsers, though I believe it could be a 
very powerful feature once the feature was supported with a UI that 
handled URNs (as with adding support for custom protocols).


Imagine, for example, a link like:

<a href="http://www.amazon.com/...(shortened)" 
urn="isbn:9210020251">United Nations charter</a>


The default behavior would simply follow the link, but if a user agent 
supported the @urn attribute, and if the browser (or browser add-ons) 
had registered support for that URN namespace identifier (here isbn), 
it could, for example, open a dialog to ask which handler to use (or 
whether to always use it), it could ask or otherwise allow in 
preferences an HTML page (with wildcards) where the attribute's content 
could be passed, or it could give the option whenever the user 
right-clicked to choose which handler they wanted to use for a given link.


Likewise, a link like:

<a href="http://en.wikipedia.org/wiki/United_Nations_Charter" 
urn="wikipedia:United_Nations_Charter">United Nations charter</a>


...could alternatively be opened in the corresponding Encyclopedia 
Britannica (or Amazon, etc.) page (assuming wikipedia could be 
accepted as a URN namespace, thus unburdening the IANA from coming to 
consensus on the web community's many and ever-expanding names). Users 
would be free to follow the content in the data viewer they wished to 
use, while sites would be free to avoid committing too strongly to any 
particular handler implementation (i.e., the website they specify is 
only a default).
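Dispatching on the namespace identifier only requires splitting the URN into its RFC 2141 parts, urn:&lt;NID&gt;:&lt;NSS&gt;; a small sketch (the function name is illustrative, and NID matching is case-insensitive per the RFC):

```python
# Sketch of splitting a URN into its namespace identifier (NID) and
# namespace-specific string (NSS), per RFC 2141's urn:<NID>:<NSS> form.
def parse_urn(urn):
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn":
        raise ValueError("not a URN: " + urn)
    scheme, nid, nss = parts
    return nid.lower(), nss  # NIDs compare case-insensitively

nid, nss = parse_urn("urn:isbn:9210020251")
print(nid, nss)   # isbn 9210020251

# A UA could then look the NID up among user-registered handlers and
# fall back to the link's ordinary href when none is found.
```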


Although software has been made to handle URNs within @href (see, for 
example, https://addons.mozilla.org/en-US/firefox/addon/1940 ), it is 
really a lot to ask for website creators to support a protocol without 
being certain of support (or at least a JavaScript fall-back which could 
test for support). With this proposal, there is no real down-side to 
content creators, users, etc. (nor, given the simplicity of this 
proposal, even really, it would seem to me, to specification editors!), 
to adding an extra attribute (which need have no precise behavior 
associated with it).


Actually, maybe even a protocol attribute would be in order to give a 
similar alternative between a default website and a generic protocol 
handler (and maybe the urn: protocol could be subsumed into this even 
more general attribute).


So, my suggestion is to add to <a> a @urn or, even better, a @protocol 
attribute (since the latter is more comprehensive and also already has a 
formal API for JavaScript registration of handlers). If some browsers 
are not keen on supporting it, maybe there could be a simple test to 
check for support, since it is not strictly necessary, but could 
unobtrusively offer more choice to users.


best wishes,
Brett



Re: [whatwg] External document subset support

2009-06-18 Thread Brett Zamir

Ian Hickson wrote:

On Mon, 18 May 2009, Brett Zamir wrote:
   

Section 10.1, Writing XHTML documents observes: According to the XML
specification, XML processors are not guaranteed to process the external
DTD subset referenced in the DOCTYPE.

While this is true, since no doubt the majority of web browsers are
already able to process external stylesheets or scripts, might the very
useful feature of external entity files, be employed by XHTML 5 as a
stricter subset of XML (similar to how XML Namespaces re-annexed the
colon character) in order to allow this useful feature to work for XHTML
(to have access to HTML entities or other useful entities for one, as
well as enable a poor man's localization, etc.)?
 


While there are arguments on both sides of whether this is a good idea or
not, I think the more important concern in this case is whether we can
extend XML in this way. I think in practice we should leave this up to the
XML specs and their successors. I don't think it would be appropriate for
us to profile the XML spec in this way.

   


While it is not my purpose to extend the debate on external DTDs, I 
wanted to bring up the following points (brought to light after a recent 
re-review of the spec) because they raise a few serious issues which I 
believe current browsers are failing at; if the browsers do not 
address these issues, claims of real XHTML 5 support 
(as with XHTML 1.* and plain XML support) would be unworkable. While I agree that 
any changes to XML itself should be up to the XML specs, from what I can 
now tell, it looks like a closer adherence to the existing spec would 
solve most of the existing problems. I wanted to share the following 
points which I think could resolve most of the issues, if the browsers 
would make the required changes.


I was pleasantly surprised to find that the spec seems to recommend 
solutions which I believe avoid the more serious issue of single point 
of failure problems.


(The other complaints with DTD's, such as avoiding cross-domain DTDs for 
the sake of security or avoidance of DOS attacks might be an optional 
issue if that may, in combination with adhering to existing 
recommendations, satisfy concerns, though I personally do not think such 
a risk is similar to inclusion of cross-domain scripts.)


So what follows is what I have gleaned from these various statements as 
applied to current browsers. I can provide specific citations, but I did 
not wish to expand this post unnecessarily (though I list references at 
the end).


The major issues which I think ought to be resolved by certain browsers, 
as they do not seem to be in accord with the XML spec and as a result, 
create interoperability problems:


1) Firefox and Webkit, should not give a single point of failure for a 
missing entity as they do now, (unless they switch to a validating 
parser which finds no declaration in the external file and the user is 
in validation mode), since such failures in a document with an external 
DTD are NOT well-formedness errors unless the document deliberately 
declares standalone=yes.
2) Explorer, which no longer seems to require in IE8 that the document 
be completely described by the DTD as I believe it had earlier (though 
it will report errors if the document violates rules which are 
specified), should, per the spec, really only report validation errors 
upon user option (ideally, I would say, off by default, and activatable 
on a case-by-case as well as preference-based basis). This will possibly 
speed things up if the option could be disabled as well as let their 
browser work with documents which violate validation. But this issue is 
not as serious as #1, since #1 prevents even valid documents from being 
interoperably viewed on the web.


If these issues are addressed by those aiming for compliance, the only 
disadvantages which will remain (and which are inherent in XML by 
allowing the co-existence of validating and non-validating parsers) are 
those issues described in http://www.w3.org/TR/REC-xml/#safe-behavior 
and http://www.w3.org/TR/REC-xml/#proc-types , namely that:


1) some (entity-related) /well-formedness/ errors (e.g., if an entity is 
not defined but is used) will go hidden to a non-validating parser as 
these will not need to load an entity replacement (which is not a big 
problem, since a document author should presumably have checked (with an 
application which does external entity substitution) that their entities 
integrate properly with the text--it is not as important, however, that 
they check for /validation/ errors, since as mentioned above, these need 
only be reported optionally).
2) The application may possibly not be notified by its processor of, 
e.g., entity replacement values, if it is a non-validating processor 
(though non-validating processors can also make such replacements). But 
since these are, as mentioned above, not to produce well-formedness 
errors, there is no single point

Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-06-10 Thread Brett Zamir

- Original Message 

From: Ian Hickson i...@hixie.ch
To: Brett Zamir bret...@yahoo.com
Cc: wha...@whatwg.org
Sent: Wednesday, June 10, 2009 11:48:09 AM
Subject: Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?
 
On Mon, 18 May 2009, Brett Zamir wrote:
 
 Has any thought been given to standardizing on at least a part of DOM 
 Level 3 Load and Save in HTML5?
 
 DOM3 Load and Save is already standardised as far as I can tell. I don't 
 see why HTML5 would have to say anything about it.

The hope was that there would be some added impetus to have browsers settle on 
a standard way of doing this, since to my knowledge, it looks to me like only 
Opera has implemented DOM Level 3 LS (Mozilla for one hasn't seemed keen on 
implementing it), and I'm afraid this otherwise very important functionality 
will remain unimplemented or unstandardized across browsers. DOMParser() and 
XMLSerializer() may be available in more than just Mozilla, but are not 
standardized, and innerHTML, along with the other issues Boris mentioned in the 
DOMParser / XMLSerializer thread (e.g., being able to parse by content type 
like XML), just doesn't sound appropriate for handling a plain XML document 
when it's called innerHTML.

thanks,
Brett



Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-06-10 Thread Brett Zamir

From: Anne van Kesteren ann...@opera.com
To: Michael A. Puls II shadow2...@gmail.com; Brett Zamir 
bret...@yahoo.com; Ian Hickson i...@hixie.ch
Cc: wha...@whatwg.org
Sent: Thursday, June 11, 2009 12:31:10 AM
Subject: Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

On Wed, 10 Jun 2009 17:13:28 +0200, Michael A. Puls II shadow2...@gmail.com 
wrote:
 It seems that everyone wants DOM3 LS to die and to have everyone use JS 
 to make their own wrapper around XHR + DOMParser + XMLSerializer etc. to 
 do what DOM3 LS does.

Yeah, no need for two high-level network APIs.

That'd be fine by me if at least DOMParser + XMLSerializer was being officially 
standardized on...
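For comparison, the parse/serialize pair under discussion corresponds to what Python's xml.dom.minidom already offers outside the browser; an illustrative roundtrip (this is an analogue, not the browser DOMParser/XMLSerializer API itself):

```python
# Analogue of the DOMParser / XMLSerializer roundtrip being discussed,
# using Python's xml.dom.minidom. The sample markup is invented.
from xml.dom import minidom

# Parse a string into a DOM tree, as DOMParser.parseFromString would.
doc = minidom.parseString('<root><item id="1">hi</item></root>')
print(doc.documentElement.tagName)   # root

# Serialize the tree back to markup, as XMLSerializer would.
print(doc.documentElement.toxml())
```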

Brett


Re: [whatwg] Nested optgroups

2009-06-03 Thread Brett Zamir

Ian Hickson wrote:

On Mon, 13 Apr 2009, Markus Ernst wrote:
   

I found a message in the list archives from July 2004, where Ian
announced to put nested optgroups back into the spec:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2004-July/001200.html

Anyway in the current spec, the optgroup element is not allowed inside
another optgroup element:
http://www.whatwg.org/specs/web-apps/current-work/#the-optgroup-element

Has this been removed again since 2004? I did not find more on this in
the list archives.
 


Yeah, this was removed because we couldn't find a good way to get browsers
to support it without breaking backwards compatibility with legacy content
(which relies on the non-nesting parser behaviour).
   
Would there be a way to allow a new element to trigger this behavior 
(maybe deprecating optgroup as well if an attribute on the new element 
could indicate compactness)? Along the lines of expanding HTML more 
toward regular applications, I would think this could help quite nicely 
for building menu bars or the frequently used navigation bars 
recommended by accessibility guidelines without JavaScript (CSS-only 
ones are rare)...


Brett


[whatwg] document.contentType

2009-06-02 Thread Brett Zamir

Hello,

Regardless of any decision on whether my recommendation for 
document.contentType to be standardized and made settable on a document 
created by createDocument() (rather than needing to call the 
less-than-intuitive doc.open() fix for HTML), I'd still like to 
recommend standardizing on Mozilla's document.contentType ( 
https://developer.mozilla.org/en/DOM/document.contentType ) for at least 
being able to get the property.


It can be very useful for JavaScript code or a generic library to offer 
support for XHTML or HTML, such as conditionally calling 
getElementsByTagNameNS() (feature detection will not help since a 
document may still be in HTML even if the browser supports XHTML).
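A minimal sketch of the kind of check being proposed, assuming Mozilla's non-standard document.contentType property; the document object is a parameter so the logic itself is plain JavaScript, and in a page one would pass the global document.

```javascript
// Sketch: decide whether namespace-aware DOM calls (getElementsByTagNameNS)
// are appropriate, based on document.contentType (Mozilla's property, not a
// standard at the time of writing).
function isXmlDocument(doc) {
  var type = doc.contentType || 'text/html'; // fall back if unsupported
  return type === 'application/xhtml+xml' ||
         type === 'application/xml' ||
         type === 'text/xml' ||
         /\+xml$/.test(type); // e.g. image/svg+xml
}
```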


thanks,
Brett


Re: [whatwg] External document subset support

2009-05-24 Thread Brett Zamir

Henri Sivonen wrote:

On May 18, 2009, at 11:50, Brett Zamir wrote:


Henri Sivonen wrote:

On May 18, 2009, at 09:36, Brett Zamir wrote:
Also, as far as heavy server loads for frequent DTDs, entities could 
be deliberately not defined at a resolvable URL.


There are existing XML doctypes out there with resolvable URIs, so 
you'd need a blacklist to bootstrap such a solution.


As you suggest on your site, "If, for legacy reasons, you must process 
some well-known DTDs, please make your entity resolver retrieve those 
DTDs from a local catalog." I would think the big browsers would be 
fully capable of doing this (as XML allows for by distinguishing public 
and system identifiers), and for any DTD which exploded in popularity before 
obtaining a public identifier, I would imagine a blacklist could work.
The same problems of denial-of-service could exist with stylesheet 
requests, script requests, etc.


No, styles and scripts are commonly site-specific, so there isn't a 
Web-wide single point of failure whose URI gets copied around as 
boilerplate.


Well, again, as mentioned below, they can be of wider use, but I see 
your point that the effects on other sites would indeed most likely be 
stronger if the source site went down. I think that's a risk they 
should be free to take (just as people are free to share or rely on 
external scripts), but if there's enough feeling against that, the issue 
could be addressed by requiring browsers to only access the same domain.
Even some sites, like Yahoo, have encouraged referring to their 
frequently accessed external files to take advantage of caching.


At least the serving infrastructure for those URIs has been designed 
for high load unlike the server for many existing DTD URIs out there. 
Again, I say either let them take the risk if they actually make a 
likely-popular DTD available, allow a blacklist, or, if need really 
be, limit to the same domain.
Furthermore, JS libraries have obvious functionality in existing 
browsers, so it's unlikely that authors would reference JS libraries 
as part of boilerplate without actually intending to take the perf hit 
of loading the library.


Presumably most XML users will be including doctypes that have a 
public identifier. Use of lesser-known XML dialects will probably 
presume some knowledge of what is happening, and even then, the official 
provider of the dialect will probably know not to provide their DTD 
directly as a referenceable DTD.
The spec could even insist on same-domain, though I don't see any 
need for that.


Without same-origin (as in not even performing a CORS GET), you'd need 
to blacklist at least w3.org due to existing references out there. 
Sounds fine, though I am assuming w3.org references already have a 
PUBLIC identifier for their DTDs.

(Note that for security, same-origin/CORS is must-have anyway.)

A must-have if you don't trust the origin, yes. But plenty of sites 
include scripts from other sites for ads or analysis. It would not be 
such a big loss in the case of DTDs to restrict to same domain, however.
I also disagree with throwing our hands up in the air about character 
entities (or thinking that the (English-based) HTML ones are 
sufficient).


That's a text input method issue that needs to be solved on the 
authoring side for text input of all kind--not just text input for 
writing XML in a text editor.


So, what's wrong with doing it in XML? If you're saying that text 
editors need to better support Unicode, then sure, but that's not a 
complete solution, given the cumbersomeness of finding obscure 
characters, etc. which can more simply be defined once in a DTD and 
forgotten. It's a nice feature for a text format which can be created 
across a variety of editors.
Moreover, the browser with the largest market share offers such 
support already, and those who depend on it may already view other 
browsers not supporting the standard as broken.


IE doesn't support XHTML or SVG which are the popular XML formats one 
might want to load into a browsing context.


Again, if there is an offline use, there is a browsing use. Just because 
not everyone is rushing to use XML in this way, does not mean that a lot 
of people would not like to share especially their document-centric XML 
in such a fashion (and even data-centric XML).


Yes, a Firefox/Opera/Safari user who tries XHTML in IE will find it 
broken, while a user of Firefox, etc. visiting an XML file dependent 
on an external DTD will find it broken. Firefox/Opera/Safari should be 
free to offer this positive feature to their users, even if IE doesn't 
come on board (to their eventual detriment I would think), while I would 
hope Firefox et al would implement this one feature on top of their 
already existing support for showing XML as a tree. As I said, IE is 
offering functionality which other browser users will think is broken in 
their browser--I think that is due to these browsers not having gone far 
enough, rather

Re: [whatwg] Removing the need for separate feeds

2009-05-22 Thread Brett Zamir
I also wonder if feeds being accessible in HTML might give rise, as with 
stylesheets and scripts contained in the head (convenient as those can 
be too), to excessive bandwidth, as agents repeatedly request updates to 
a whole HTML page containing a lot of other data.


(If we had external entities working though, that might be different for 
XHTML at least, as the file could be included easily as well as reside 
in its own independently discoverable location (via <link/>)...)


Brett


Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-05-20 Thread Brett Zamir

Jonas Sicking wrote:

On Wed, May 20, 2009 at 12:49 AM, Maciej Stachowiak m...@apple.com  wrote:
   

On May 19, 2009, at 11:49 PM, Jonas Sicking wrote:

 

On Mon, May 18, 2009 at 5:45 AM, Brett Zamir bret...@yahoo.com  wrote:
   

Has any thought been given to standardizing on at least a part of DOM
Level
3 Load and Save in HTML5?
 

The Load and Save APIs in DOM 3 are much too complicated IMHO so I'd
like to see something simpler standardized.

We've had parsing and serializing APIs in Firefox for ages. Would be
very excited to see someone put in effort to get their API cleaned up
and standardized.

https://developer.mozilla.org/En/DOMParser
https://developer.mozilla.org/En/XMLSerializer
   

WebKit actually implements most of these. I think it would make sense to
publish these as a WebApps spec. But these classes don't do any loading or
saving, just parsing and serializing.
 


Doesn't XMLHttpRequest do all the load/save that is needed? I don't
know how one could standardize save beyond that without relying on
something like WebDAV.

   

Document.load would be the simplified
load/save method that it would make sense to standardize IMO, since Firefox
has it and it is needed for Web compatibility. I am concerned though that
Document.load() allows for synchronous network loads.
 


I'm certainly no fan of Document.load() and wish it would go away. In
fact we have decided not to add any additional features such as CORS
or progress event support in order to discourage its use and move
people to XHR instead.

/ Jonas


   


Does Ajax at least have theoretical support for loading entities from an 
external DTD, as I presume a DOM method must? (Would be nice to hear 
from some others on that separate External document subset support 
thread too, especially to know whether all the browsers are just not 
inclined to ever implement this (despite all of the off-web documents 
that use them...))


thanks,
Brett


Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-05-20 Thread Brett Zamir

Brett Zamir wrote:

Jonas Sicking wrote:

On Wed, May 20, 2009 at 12:49 AM, Maciej Stachowiak m...@apple.com  wrote:
   

On May 19, 2009, at 11:49 PM, Jonas Sicking wrote:

 

On Mon, May 18, 2009 at 5:45 AM, Brett Zamir bret...@yahoo.com  wrote:
   

Has any thought been given to standardizing on at least a part of DOM
Level
3 Load and Save in HTML5?
 

The Load and Save APIs in DOM 3 are much too complicated IMHO so I'd
like to see something simpler standardized.

We've had parsing and serializing APIs in Firefox for ages. Would be
very excited to see someone put in effort to get their API cleaned up
and standardized.

https://developer.mozilla.org/En/DOMParser
https://developer.mozilla.org/En/XMLSerializer
   

WebKit actually implements most of these. I think it would make sense to
publish these as a WebApps spec. But these classes don't do any loading or
saving, just parsing and serializing.
 


Doesn't XMLHttpRequest do all the load/save that is needed? I don't
know how one could standardize save beyond that without relying on
something like WebDAV.

   

Document.load would be the simplified
load/save method that it would make sense to standardize IMO, since Firefox
has it and it is needed for Web compatibility. I am concerned though that
Document.load() allows for synchronous network loads.
 


I'm certainly no fan of Document.load() and wish it would go away. In
fact we have decided not to add any additional features such as CORS
or progress event support in order to discourage its use and move
people to XHR instead.

/ Jonas


   


Does Ajax at least have theoretical support for loading entities from 
an external DTD, as I presume a DOM method must? (Would be nice to 
hear from some others on that separate External document subset 
support thread too, especially to know whether all the browsers are 
just not inclined to ever implement this (despite all of the off-web 
documents that use them...))


Sorry, guess I can answer the first part of my question: yes, it is 
theoretically possible... Per 
http://www.w3.org/TR/XMLHttpRequest/#responsexml : "Return the XML 
response entity body", i.e., "Parse the response entity body into a 
document tree following the rules from the XML specifications."


But I'm still holding out hope that the ability to at least parse an 
external DTD for entities alone would be considered...


Brett


Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-05-20 Thread Brett Zamir

Brett Zamir wrote:

Brett Zamir wrote:

Jonas Sicking wrote:

On Wed, May 20, 2009 at 12:49 AM, Maciej Stachowiak m...@apple.com  wrote:
   

On May 19, 2009, at 11:49 PM, Jonas Sicking wrote:

 

On Mon, May 18, 2009 at 5:45 AM, Brett Zamir bret...@yahoo.com  wrote:
   

Has any thought been given to standardizing on at least a part of DOM
Level
3 Load and Save in HTML5?
 

The Load and Save APIs in DOM 3 are much too complicated IMHO so I'd
like to see something simpler standardized.

We've had parsing and serializing APIs in Firefox for ages. Would be
very excited to see someone put in effort to get their API cleaned up
and standardized.

https://developer.mozilla.org/En/DOMParser
https://developer.mozilla.org/En/XMLSerializer
   

WebKit actually implements most of these. I think it would make sense to
publish these as a WebApps spec. But these classes don't do any loading or
saving, just parsing and serializing.
 


Doesn't XMLHttpRequest do all the load/save that is needed? I don't
know how one could standardize save beyond that without relying on
something like WebDAV.

   

Document.load would be the simplified
load/save method that it would make sense to standardize IMO, since Firefox
has it and it is needed for Web compatibility. I am concerned though that
Document.load() allows for synchronous network loads.
 


I'm certainly no fan of Document.load() and wish it would go away. In
fact we have decided not to add any additional features such as CORS
or progress event support in order to discourage its use and move
people to XHR instead.

/ Jonas


   


Does Ajax at least have theoretical support for loading entities from 
an external DTD, as I presume a DOM method must? (Would be nice to 
hear from some others on that separate External document subset 
support thread too, especially to know whether all the browsers are 
just not inclined to ever implement this (despite all of the off-web 
documents that use them...))


Sorry, guess I can answer the first part of my question: yes, it is 
theoretically possible... Per 
http://www.w3.org/TR/XMLHttpRequest/#responsexml : "Return the XML 
response entity body", i.e., "Parse the response entity body into a 
document tree following the rules from the XML specifications."


But I'm still holding out hope that the ability to at least parse an 
external DTD for entities alone would be considered...


Argh... Didn't read far enough... "resources referenced will not be 
loaded"... So now I'm having to hold out hope on both counts...


Brett


Re: [whatwg] Reserving id attribute values?

2009-05-19 Thread Brett Zamir

Anne van Kesteren wrote:
On Tue, 19 May 2009 03:20:49 +0200, Brett Zamir bret...@yahoo.com 
wrote:
In order to comply with XML ID requirements in XML, and facilitate 
future transitions to XML, can HTML 5 explicitly encourage id 
attribute values to follow this pattern (e.g., disallowing numbers 
for the starting character)?


Those are only validity requirements and only when DTDs are involved 
(and possibly XSD) so it doesn't really matter.


While it may not be that common, people may want at a later date to 
apply some files for validation...


And don't forget the vanity/business factor in having a fully validating 
site :)


Brett


Re: [whatwg] Reserving id attribute values?

2009-05-19 Thread Brett Zamir

Anne van Kesteren wrote:
On Tue, 19 May 2009 13:46:43 +0200, Brett Zamir bret...@yahoo.com 
wrote:
While it may not be that common, people may want at a later date to 
apply some files for validation...


And don't forget the vanity/business factor in having a fully 
validating site :)


You can have validation outside the DTD/XSD realm.

Sorry I guess I was forgetting that HTML can be validated just as well. 
However, whichever way one validates, RelaxNG or whatever, XHTML 5 (and 
HTML 5?) will still define it as an ID type, no?


This concern was brought to mind when looking at 
http://dev.w3.org/html5/html4-differences/#absent-attributes which 
simply suggests using "id" instead of "name" (some might just think they 
can simply switch the attribute and not the value).


Brett


[whatwg] Cross-domain databases; was: File package protocol and manifest support?

2009-05-19 Thread Brett Zamir
I would like to suggest an incremental though I believe significant 
enhancement to Offline applications/SQLite.


That is, the ability to share a complete database among offline 
applications according to the URL from which it was made available. It 
could be designated by the origin site as a read-only database, or also 
potentially with shared write access, shareable with specific domains or 
all domains, and perhaps also with a mechanism to indicate the license 
of its contents.  Perhaps the manifest file could include such 
information. Actually, there might even be a shared space for databases 
directly downloaded by the user (or by an application) which would allow 
all applications access, no doubt requiring UA permission.
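Purely as a sketch of what such a designation might look like, here is a hypothetical extension to the offline-application manifest format; none of these directives exist in any spec, and the names are invented for illustration only:

```
CACHE MANIFEST
# HYPOTHETICAL directives (no such syntax exists in the spec); they only
# illustrate how a manifest might declare a shared, read-only database
# available to other origins, with an indicated license:
# SHARED-DB: http://data.example.org/corpus.db mode=read-only origins=*
# SHARED-DB-LICENSE: CC-BY-3.0
index.html
app.js
```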


Ideally the origin site could also have control over providing an update 
to the database (not necessarily through manually performing UPDATE 
commands, but potentially by simply providing a new database at the 
previous location which was checked periodically for a new modification 
date). I don't know whether it would ideal to tie this in to the caching 
API (perhaps deleting the database reference could also cause it to be 
re-downloaded and also force the database to be re-created). Perhaps the 
cache API could also be optionally shared with other domains as well, 
allowing them to ensure their application was working with the latest data.


I believe custom protocols will also play into this well, as there could 
be a number of uses for operating on the same data set while linking to 
it in a way which is application-independent.


(Thanks to Kristof Zelechovski for helping me distill the essence of the 
idea a bit more succinctly and more in line with HTML 5's progress to date.)


Brett


Re: [whatwg] External document subset support

2009-05-18 Thread Brett Zamir

Henri Sivonen wrote:

On May 18, 2009, at 09:36, Brett Zamir wrote:

Section 10.1, Writing XHTML documents observes: According to the 
XML specification, XML processors are not guaranteed to process the 
external DTD subset referenced in the DOCTYPE.


While this is true, since no doubt the majority of web browsers are 
already able to process external stylesheets or scripts, might the 
very useful feature of external entity files, be employed by XHTML 5 
as a stricter subset of XML (similar to how XML Namespaces re-annexed 
the colon character) in order to allow this useful feature to work 
for XHTML (to have access to HTML entities or other useful entities 
for one, as well as enable a poor man's localization, etc.)?


See http://hsivonen.iki.fi/no-dtd/ , which explains why DTDs don't work for 
the Web in the general case.


While that is a thoughtful and helpful article, your arguments there 
mostly relate to validation from a central spec. Also, as far as heavy 
server loads for frequent DTDs, entities could be deliberately not 
defined at a resolvable URL. The same problems of denial-of-service 
could exist with stylesheet requests, script requests, etc. Even some 
sites, like Yahoo, have encouraged referring to their frequently 
accessed external files to take advantage of caching. The spec could 
even insist on same-domain, though I don't see any need for that. If I 
give my website out to Slashdot, I shouldn't be surprised when I get 
slashdotted, and if I do, that's my fault, not the web's fault. A DTD 
doesn't need to reference a central location, nor would it be likely 
that major browsers would fail to use the PUBLIC identifier to avoid 
checking for the SYSTEM file.
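For illustration, the kind of external subset being argued for might look like the following; the file and entity names are only examples:

```xml
<!-- entities.dtd: defined once, cacheable, referenced by many documents -->
<!ENTITY copy   "&#169;">
<!ENTITY hellip "&#8230;">

<!-- document.xml, pulling the entities in via its external subset: -->
<!DOCTYPE article SYSTEM "entities.dtd">
<article>&copy; 2009, Example Press &hellip;</article>
```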


I also disagree with throwing our hands up in the air about character 
entities (or thinking that the (English-based) HTML ones are 
sufficient). As I said, just because the original spec defined it as 
optional, does not mean we must perpetually remain stuck in the past, 
especially in the case of XML-on-the-web which is not going to break a 
whole lot of browsing uses at all if external DTDs are suddenly made 
possible. Moreover, the browser with the largest market share offers 
such support already, and those who depend on it may already view other 
browsers not supporting the standard as broken.
Loading same-origin DTDs for the purpose of localization is a 
semi-defensible case, but it's a lot of complexity for a use case that 
is way on the wrong side of 80/20 on the Web scale. 
How so? And besides localization, there are many other uses such as 
providing a convenient tool for editors to avoid finding a copyright 
symbol, etc. Not everyone uses an IDE which makes these available or 
knows how to use it. I'm assisting such a project which has this issue. 
And I really don't buy the web/non-web dichotomy which some people make. 
If there's an offline use, there's an online use, pure and simple. And a 
client-side-only use as well--to be able to read my own documents, I'd 
like to do so in a browser--many others besides me like to live in 
their browsers.


Even if it is a niche group which uses TEI, Docbook, etc. or who wants 
to be able to build say a browser extension which can take advantage of 
their rich semantics, this is still a use for citizens of the web. If 
people can push forward with backwards-incompatible technologies like 
the video element, 3d-animation, or whatever, it seems not much to ask 
to support the humble external entity file... :)
Besides, if the use case for DTDs is localization within an origin, 
the server can perform the XML parse and reserialize into DTDless XML. 
(That's how I've implemented this pattern in the past without 
client-side support.)


That is assuming people are aware of scripting and have access to such 
resources. Wasn't it one of the aims of the likes of XSL, XQuery, and 
XForms to use a syntax which doesn't require knowledge of an unrelated 
scripting language (and those are pretty complex examples unlike entities)?


(Btw, you and I discussed this before, though I didn't get a response 
from you to my last post: 
https://bugzilla.mozilla.org/show_bug.cgi?id=22942#c109 ; I don't mean 
to go off-topic but you might wish to consider or respond to some of its 
points as well...)


best wishes,
Brett


Re: [whatwg] External document subset support

2009-05-18 Thread Brett Zamir

Kristof Zelechovski wrote:

AFAIK, WebKit is not going to validate XML, they say it makes page load too
slow.
Yes, I can see validation would be a problem, and see little use for 
that except local file testing. But I'm just talking about using the DTD 
to access entities, not to do validation. While this does involve 
another HTTP request (as do external stylesheeets, scripts, etc.), 
browsers could, as they do with such files, cache the files.

Besides, entities introduce a security risk because they can contain
incomplete syntax fragments and can open a path to XML injection into,
say, <![DANGER[<span title="&malicious-entity;">sweet kittens</span>]]>.
So XML processors often refuse to load cross-domain DTDs or entities.

   
Then, cross-domain entities could be restricted... I'm just thinking one 
should be able to at least have /some/ way to use them, even if you have 
to save the file in the same domain.

There are several XHTML entities that are indispensable for authors, namely
those that disambiguate characters that are invisible or indistinguishable
from others in a monospaced typeface.  These include spacing, dashes, quotes
and maybe text direction (deprecated).  Converting them to their
corresponding characters deteriorates the editing experience in an ordinary
text editor.  As far as codes for letters are concerned, text in a non-Latin
script would consist mainly of entities, which would make it extremely hard
to read, so this approach is not practical.  An editor limited to the ASCII
character set would be better off using a transliteration scheme and a
converter.
   
Yes, if the whole document were written in that fashion, and it was not 
a localization. But for those who are already using programs which 
support non-Latin scripts, such documents may still take advantage of 
entities (or even have the entities themselves be in a non-Latin 
script). For example, my Chinese XHTML ( 
http://bahai-library.com/zamir/chineseXHTML.xml and 
http://bahai-library.com/zamir/chineseXHTML.xsl ) (can view in Safari, 
Firefox, or Opera), which allows tags, attributes, and CSS in Chinese 
characters roughly equivalent to the XHTML tag names, as long as the 
stylesheet is attached, could also use Chinese entities in the XML 
document if such external doctypes were supported (the example uses one 
in the internal subset). This raises another use for entities--a simple 
introduction to preparing XHTML documents in all regards, regardless of 
one's native language. (And I used entities in the XSL file too, thereby 
highlighting another example of entities--the ability to automatically 
and transparently share /all/ of my code, including localization code 
for anyone who wished to make their own localization or borrow mine for 
other uses.)

However, as some of the entities are indispensable, a DOCTYPE is required.
The browsers may support built-in entities but XML processors used to
process XHTML documents need not.  Providing a set of the entities needed
in-line is easy;
If you mean providing them in each document, then that, while easy, is 
already supported, but is a large use of bandwidth, and not to mention 
quite a pain to have to copy into each document and maintain...
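For concreteness, the per-document internal-subset approach being described looks like this, repeated in every file that needs the entities (entity names are only examples):

```xml
<?xml version="1.0"?>
<!DOCTYPE html [
  <!-- must be copied into, and maintained in, each document -->
  <!ENTITY copy  "&#169;">
  <!ENTITY mdash "&#8212;">
]>
<html xmlns="http://www.w3.org/1999/xhtml"><p>&copy; &mdash;</p></html>
```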

however, the problem is that some validating processors
like MSXML require that the DTD, if provided, should fully describe the
document; providing entities only is not supported by default and the
processor refuses to load the document.  That means a DOCTYPE for XHTML is
necessary and should be provided by WHATWG (or by an independent party).
This DTD should be external in order to use parameter entities and, of
course, to make the document smaller.  It cannot, of course, define all
nuances of XHTML, but an upper approximation would be sufficient.

The problem, of course, is maintenance, since XHTML is in flux.  XHTML is
currently described formally by a RELAX NG grammar and maintaining a
separate DTD would double the work to do so it would be best to be able to
generate the DTD automatically.  However, the converter I was advised to use
was unable to produce a DTD from the grammar because it is too complex for
the DTD formalism (of course).
   

Hmm... Good point. Still, it is surmountable...


Best regards,
Chris

Aside: Note that you cannot use DocBook with MSIE directly; a bug in the
default XSLT processor causes an error in initialization code.  This kills
all transformations, whatever your document is.  (I do not know about TEI.)
   
I've used XSL successfully before in IE, but haven't used it for some 
time... Right now my Chinese XHTML, which really did work for me in 
Explorer before when I output in GB2312, is not working, though 
that may be due to the fonts on my system now.


Brett


[whatwg] File package protocol and manifest support?

2009-05-18 Thread Brett Zamir
While this may be too far in the game to bring up, I'd very much be 
interested (and think others would be too) to have a standard means of 
representing not only individual files, but also groups of files on the web.


One application of this would be for a web user to be able to do the 
following (taking advantage of both offline applications and related 
somewhat to custom protocols):


1) Click a link in a particular protocol containing a list of files or 
leading to a manifest file which contains a list of files. Very 
importantly, the files would NOT need to be from the same site.
2) If the files have not been downloaded already, the browser accesses 
the files (possibly first decompressing them) to store for offline use.
3) If the files were XML/XHTML, take advantage of any attached XSL, 
XQuery, or CSS in reassembling them.
4) If the files were SQL, reassemble them in a table-agnostic 
manner--e.g., allow the user to choose which columns to view and in 
which order and how many records at a time (including allowing a 
single-record flashcard-like view), also allowing for automated 
generation of certain columns using JavaScript.
5) If the files included templates, use these for the display and 
populate for the user to view.
6) Bring the user to a particular view of the pages, starting for 
example, at a particular paragraph indicated by the link or manifest 
file, highlight the document or a portion of the targeted page with a 
certain font and color, etc.
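To make the idea concrete, a hypothetical manifest along these lines might look as follows; the element names are invented, loosely inspired by METS fileSec/structMap, and nothing here is proposed syntax:

```xml
<!-- A purely hypothetical package manifest; files may come from
     different sites, and the view element points into the content. -->
<package>
  <files>
    <file src="http://example.org/chapters/ch1.xml" type="application/xml"/>
    <file src="http://other.example.net/styles/tei.xsl" type="application/xslt+xml"/>
  </files>
  <view start="ch1.xml#p12" highlight="yes"/>
</package>
```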


It seems limiting that while we can reference individual sites' data at 
best targeting an existing anchor or predefined customizability, we do 
not have any built-in way to bookmark and share views of that data over 
the web.


In considering building a Firefox extension to try this as a proof of 
concept, METS (http://www.loc.gov/standards/mets/ ) seems to have many 
aspects which could be useful as a base in such a standard, including 
the useful potential of enabling links to be described for files in which 
they may not exist as hyperlinks (i.e., XLink linkbases).


Besides this offline packages use, such a language might work just as 
well to build a standard for hierarchical sitemaps, linkbases, or Gopher 
2.0 (and not being limited to its usual web view, equivalent of icon 
view on the desktop, but conceivably allowing column browser or tree 
views for hierarchical data ranging from interlinked genealogies to 
directories along the lines of http://www.dmoz.org/ or 
http://dir.yahoo.com ), including for representing files on one's own 
local system yet leading to other sites. The same manifest files might 
be browseable directly (e.g., Gopher-mode), being targeted to 
continguously lead to other such manifest file views until reaching a 
document (the Gopher-view could optionally remain in sight as the end 
document loaded), or, as mentioned above, as a cached and integrated 
offline application (especially where compressed files and SQL were 
involved).


Brett


[whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-05-18 Thread Brett Zamir

One more thought...

While it is great that innerHTML is being officially standardized, I'm 
afraid it would be rather hackish to have to use it for parsing and 
serializing dynamically created content which wasn't destined to make it 
immediately into the document, if at all.


Has any thought been given to standardizing on at least a part of DOM 
Level 3 Load and Save in HTML5?


The API, if simply applied to serialization, would look like this :

var ser = DOMImplementationLS.createLSSerializer();
var str = ser.writeToString(document);

and like this for parsing to the DOM:

var lsParser = DOMImplementationLS.createLSParser(1, null); // 1 
for synchronous; null for no schema type

var lsInput = DOMImplementationLS.createLSInput();
lsInput.stringData = '<myXml/>';
var doc = lsParser.parse(lsInput);

If a revision to the DOM3 module is not in order (which, e.g., 
simplifies the parsing from a string for simple cases) and the above is 
considered too cumbersome, maybe some other cross-browser standard could 
be agreed upon?


I think using DOM3 would facilitate readily adding additional aspects of 
the module in the future (as ECMAScript seems to be positively albeit 
slowly expanding to ever new uses) and offer familiarity for those 
working in other contexts with DOM Level 3, while ECMAScript users can 
still wrap these in their own simpler functions. However, I can also see 
the desire for something simpler (as I say, maybe an addendum to the LS 
module). But I do hope something might be considered, since I find this 
to be a quite frequent need and do not like relying on feature-checking 
for non-standard methods in the various browsers as well as being 
unclear on how to future-proof my code to work with standards-compliant 
browsers...


thanks,
Brett


[whatwg] Reserving id attribute values?

2009-05-18 Thread Brett Zamir
In order to comply with XML ID requirements in XML, and facilitate 
future transitions to XML, can HTML 5 explicitly encourage id attribute 
values to follow this pattern (e.g., disallowing numbers for the 
starting character)?
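A sketch of the pattern in question; this is a simplified, ASCII-only approximation of the XML Name production, which the real spec extends to many non-ASCII letters:

```javascript
// Simplified (ASCII-only) approximation of the XML Name production that
// XML ID values must satisfy; note the first character may not be a digit.
var XML_ID_LIKE = /^[A-Za-z_:][A-Za-z0-9_:.-]*$/;

XML_ID_LIKE.test('section-1'); // true
XML_ID_LIKE.test('1section');  // false: leading digit is invalid in XML
```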


Also, there is this minor errata: 
http://www.whatwg.org/specs/web-apps/current-work/#refsCSS21 is broken 
(in section 3.2)


Brett


Re: [whatwg] External document subset support

2009-05-18 Thread Brett Zamir

Hello,

I don't want to go too far off topic here, but I'll respond to the points, as I think they illustrate one of the uses of entities (localization), which would apply to some degree in XHTML (at least for entities) as well as in XML.
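To make the localization use concrete, here is a toy sketch of the substitution that entity references provide (expandEntities and its table are purely illustrative; a real XML parser would instead expand entities declared in the document's subset):

```javascript
// Toy illustration of entity-based localization: replace &name; references
// with values from a hypothetical localized entity table. Unknown entities
// are left untouched.
function expandEntities(text, entities) {
  return text.replace(/&([A-Za-z_][A-Za-z0-9._\-]*);/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(entities, name) ? entities[name] : match;
  });
}
```

With a table like { greeting: '你好' }, the markup '<p>&greeting;</p>' becomes '<p>你好</p>', so the same template can be served with a different entity set per locale.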


Kristof Zelechovski wrote:


Using entities in XSL to share code was my mistake once too; it is 
similar to using data members not wrapped in properties in data types. 
 XSL itself provides a better structured approach for code reuse.


Unless you're talking about variables, I guess I'd need elaboration, but 
I don't want to go too far off track on list here...


Being able to use localized programming language constructs is at the 
same time trivial (replace this with that),


I think that depends on how familiar the script and language are to you (cognates help many non-English Europeans, whereas the same does not apply elsewhere). Take, for example, some of my wife's family's younger cousins, who are not particularly educated yet use computers as many Chinese do: they found it much easier to get a grasp of the Chinese XHTML than the English one, even though they had had some previous English instruction. I think actual research would need to be done on this, since it is quite possible that only programmer types make it past the barrier to entry; and having done so, they may be even more inclined to dismiss the benefits for those less skilled (i.e., "I did it, so others should"), or they may want to get away from their linguistic distinctiveness, or have perhaps irrational fears that this would leave their people satisfied with lower standards, etc. (just as many oppose bilingual education even where it may help transition students to the mainstream language).


expensive (you have to translate the documentation)

Not sure what you mean by the cost of translating the documentation. Cost for whom? If your code is intended for that audience--e.g., Chinese code at a Chinese website--who needs to translate anything? On the contrary, they avoid the need to translate...


and not that useful (you freeze the language and cut the programmers 
off from the recent developments in the language).


I don't think it would be that hard to keep the translating template up to date. But I'm definitely not talking about relying on this anyway. There are big advantages to having a common language, such as the ability to learn from others' code from around the world. But just as I replied to someone on another list who said this was "not semantic": it is very much semantic to those for whom it is their native language--perhaps even more in the spirit of pure XML (though Babelizing semantics even further, no doubt, if people actually started using this on a large scale, as search engines would have to be aware of either the post-transformation result or the localized XML, etc.).


Languages tend to use English keywords regardless of the culture of 
their designer because:


1. no matter how deep you go, there is always a place where you have 
to switch to English in order to refer to some precedent technology,


Yes, like in my use of <?xml-stylesheet?> (though no doubt browsers could fairly trivially be programmed to recognize localized processing instructions as well). Anyway, again, I'm in favor of a common language, and I would even hope that countries around the world could democratically agree on an official standard (possibly including English, which, if its use is as widespread and popular as its proponents believe, should have little problem obtaining a democratic majority) so that children everywhere will begin earlier to have access to such a common language. Nevertheless, if you're a beginner, having to deal with one line of English is a lot easier than having to deal with a whole syntax in English when that's not your native language. The fact that a number of open source projects I've encountered have not only comments but even variable names in the programmer's original language is evidence that there is some desire for convenient localization. And if you have tools that translate it before serving the code, it is still available anyway.


2. the English words/roots used in the language design often have a 
slightly different meaning from the English source,


Maybe, but it is much easier to learn a few exceptions, which are probably at least related in meaning, than to have to learn something completely foreign. Would you like to learn an Arabic-script XHTML, even if there were already a one-to-one mapping from your keyboard? Of course you could, but you have to admit it would take some time, especially if you were not already inclined toward coding or markup. It's not only a vocabulary issue here but a script issue too--moreover, using that script may force you to switch between keyboard layouts each time you want to make a document.


3. they