[IndexedDB] Full text indexing
Hi all.

Disclaimer: last time I posted to this mailing list someone correctly pointed out that I'd not read the spec properly. Apologies if I've done the same again. I'm very enthusiastic about the whole offline web app thing, and as this is a public forum I thought I may as well fire off a couple of questions. They might be daft, so I apologise in advance if so. Onwards.

At present I can't see any reference to full-text indexing via the IndexedDB API. Is this something that is specifically out of scope, or is it included in the way indexes work but not explicitly stated?

I can think of a couple of scenarios in which this would be useful, all of which fall under the heading of offline search. A few examples below:

-- Reference material
If users can take parts of a website offline, at some point someone will want to search that data. If I build an offline application which takes a stack of reference material offline, I'd also like to build a database containing text from those pages. I can then use full-text search to retrieve URLs of the offline pages and direct the user to them.

-- Emails
Let's say a user takes their mailbox offline. Now they want to search it for a particular phrase or subject. What feature of IndexedDB would we expect developers to leverage to implement this?

While full-text search would be possible with a regular index of single keywords, this approach isn't as elegant as full-text indexing:

* Searching for multiple keywords would probably be a second/third query + join, which would be slow.
* Initially populating the database with individual keywords would require the user to download a lot of data, whereas populating a full-text index with a sentence would be more efficient (in some (most?) scenarios).
* A full-text index could expose more advanced functionality such as searching for quoted terms, and other conditional operators (see the Gears implementation of full-text search [http://code.google.com/apis/gears/api_database.html - bottom of the page]).

Unless this has been considered already, might I suggest extending KeyRange to include a Match property? Or perhaps introducing a level of abstraction to KeyRanges along the lines of:

    IRange (internal)
      - bool IncludeInResult( itemInIndex );

    KeyRange : inherits IRange
      // properties as per spec

    TextRange : inherits IRange
      - DOMString Match

I'm sure you can think of something more appropriate, but that explains what I'd like to accomplish.

Is this something that can already be achieved via the IndexedDB spec? If not, could it be included without too much effort?

Appreciate all your hard work.

Nathan
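For readers following the proposal, the range abstraction can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `KeyRangeSketch`, `TextRangeSketch`, `scanIndex` and the sample data are hypothetical names invented here, the "index" is an in-memory array, and nothing below is part of the IndexedDB spec. Matching is a naive "every term appears" test, not real full-text search.

```javascript
// Hypothetical sketch of the proposed IRange idea; not an IndexedDB API.
// Every range answers one question: should this index key be in the result?

class KeyRangeSketch {
  constructor(lower, upper) {
    this.lower = lower;
    this.upper = upper;
  }
  // Inclusive bounds check, standing in for the spec's KeyRange.
  includeInResult(key) {
    return key >= this.lower && key <= this.upper;
  }
}

class TextRangeSketch {
  constructor(match) {
    // Split the Match string into lower-cased search terms.
    this.terms = match.toLowerCase().split(/\s+/).filter(Boolean);
  }
  // Naive matching: every term must occur somewhere in the key text.
  includeInResult(key) {
    const text = String(key).toLowerCase();
    return this.terms.every((term) => text.includes(term));
  }
}

// Walk an in-memory "index" and keep the values whose key the range accepts.
function scanIndex(index, range) {
  return index
    .filter((entry) => range.includeInResult(entry.key))
    .map((entry) => entry.value);
}

// Made-up sample data: key is the indexed text, value is the stored URL.
const index = [
  { key: "Meeting notes for the offline project", value: "/mail/1" },
  { key: "Lunch on Friday", value: "/mail/2" },
  { key: "Offline search design notes", value: "/mail/3" },
];

const hits = scanIndex(index, new TextRangeSketch("offline notes"));
console.log(hits);
```

The point of the shared `includeInResult` method is that a cursor could iterate an index with either kind of range without knowing which one it was given.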
ISSUE-118 (dispatchEvent links): Consider allowing dispatchEvent for generic event duplication for links [DOM3 Events]
http://www.w3.org/2008/webapps/track/issues/118

Raised by: Doug Schepers
On product: DOM3 Events

Simon Pieters wrote in http://lists.w3.org/Archives/Public/www-dom/2010AprJun/0041.html :

[[
Is it defined what should happen in the following case?

    <div onclick="document.links[0].dispatchEvent(event)">click me</div>
    <a href="http://example.org/">test</a>

It seems Firefox and Opera throw an exception, while WebKit allows the event to be dispatched.

I think it seems like a neat thing to be able to do, for making table rows or canvas clickable. (However the event shouldn't be a 'trusted' event in that case, of course.) To make it work today you'd have to create a new event and copy over all properties, which is annoying.
]]
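The workaround mentioned at the end of the issue (create a new event and copy the properties over) might look roughly like the sketch below. It assumes the standard `Event` and `EventTarget` constructors are available; `redispatchCopy`, `row` and `link` are hypothetical names, and the two plain `EventTarget` objects merely stand in for the clicked element and the link.

```javascript
// Sketch: dispatch a copy of an in-flight event on another target.
// redispatchCopy is a hypothetical helper, not a DOM API.
function redispatchCopy(original, target) {
  // Only the generic init fields are copied here. A real MouseEvent would
  // also need clientX, clientY, button, modifier keys and so on, which is
  // the "annoying" part described in the issue.
  const copy = new Event(original.type, {
    bubbles: original.bubbles,
    cancelable: original.cancelable,
  });
  // The copy is created by script, so it is not a trusted event.
  return target.dispatchEvent(copy);
}

const row = new EventTarget();  // stands in for the clicked <div>
const link = new EventTarget(); // stands in for document.links[0]
const seen = [];

link.addEventListener("click", (e) => seen.push(e.type));
row.addEventListener("click", (e) => redispatchCopy(e, link));

row.dispatchEvent(new Event("click", { bubbles: true, cancelable: true }));
console.log(seen);
```

Because the copy is a separate object, it sidesteps the question of whether an event that is still being dispatched may be dispatched again.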
Re: [CORS] What constitutes a network error?
On 20.07.2010, at 14:37, Jonas Sicking wrote:

> However I haven't been able to find a clear definition of what counts as a network error. Does this include successful HTTP requests that return 4xx or 5xx status codes? Or just errors in the lower level of the stack, such as aborted TCP connections?

FWIW, I've always assumed the latter. Blocking 4xx and 5xx responses would mean having a rather unexpected difference between same-origin and cross-origin XMLHttpRequest (the former lets JS code see such responses).

- WBR, Alexey Proskuryakov
Re: [CORS] What constitutes a network error?
On Wed, Jul 21, 2010 at 1:14 PM, Alexey Proskuryakov a...@webkit.org wrote:

> On 20.07.2010, at 14:37, Jonas Sicking wrote:
>
>> However I haven't been able to find a clear definition of what counts as a network error. Does this include successful HTTP requests that return 4xx or 5xx status codes? Or just errors in the lower level of the stack, such as aborted TCP connections?
>
> FWIW, I've always assumed the latter. Blocking 4xx and 5xx responses would mean having a rather unexpected difference between same-origin and cross-origin XMLHttpRequest (the former lets JS code see such responses).

I'm fairly certain that when we discussed this at the F2F in Redmond, we talked about 4xx responses always resulting in failed requests, and that this solved some security issues. However I could be misremembering, or we could have changed our minds later.

Definitely would like to hear others speak up.

/ Jonas
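To make the two readings in this thread concrete, here is a small sketch that classifies a response status either way. `classifyResponse` and its `treatHttpErrorsAsFailures` switch are hypothetical, invented for illustration; neither reading is claimed to be what the CORS draft says. The one assumption taken from XMLHttpRequest itself is that a request that fails at the network level reports a status of 0.

```javascript
// Sketch only: how a script might see a cross-origin response under the
// two interpretations discussed above. Not taken from the CORS draft.
function classifyResponse(status, treatHttpErrorsAsFailures) {
  // A status of 0 is what XMLHttpRequest reports when no response arrived,
  // for example after an aborted TCP connection.
  if (status === 0) {
    return "network error";
  }
  if (status >= 400 && status <= 599) {
    // Reading 1 (Alexey): the response is delivered, script sees the status.
    // Reading 2 (Jonas's recollection): the request is treated as failed.
    return treatHttpErrorsAsFailures ? "network error" : "http error";
  }
  return "success";
}

for (const status of [0, 200, 404, 503]) {
  console.log(
    status,
    classifyResponse(status, false),
    classifyResponse(status, true)
  );
}
```

Under the first reading, same-origin and cross-origin requests behave alike for 4xx and 5xx responses; under the second, script cannot tell a 404 from a dropped connection.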
Re: ISSUE-118 (dispatchEvent links): Consider allowing dispatchEvent for generic event duplication for links [DOM3 Events]
On Wed, Jul 21, 2010 at 10:11 AM, Web Applications Working Group Issue Tracker sysbot+trac...@w3.org wrote:

> ISSUE-118 (dispatchEvent links): Consider allowing dispatchEvent for generic event duplication for links [DOM3 Events]
>
> http://www.w3.org/2008/webapps/track/issues/118
>
> Raised by: Doug Schepers
> On product: DOM3 Events
>
> Simon Pieters wrote in http://lists.w3.org/Archives/Public/www-dom/2010AprJun/0041.html :
>
> [[
> Is it defined what should happen in the following case?
>
>     <div onclick="document.links[0].dispatchEvent(event)">click me</div>
>     <a href="http://example.org/">test</a>
>
> It seems Firefox and Opera throw an exception, while WebKit allows the event to be dispatched.
>
> I think it seems like a neat thing to be able to do, for making table rows or canvas clickable. (However the event shouldn't be a 'trusted' event in that case, of course.) To make it work today you'd have to create a new event and copy over all properties, which is annoying.
> ]]

Even if we make this dispatch the event, it wouldn't cause the link to be followed: since the event isn't dispatched by the UA, there's no default action.

There is, in any case, a simpler solution to the above:

    <div onclick="document.links[0].click()">click me</div>
    <a href="http://example.org/">test</a>

--
Ian Hickson
Re: Lifetime of Blob URL
Tying the 'lifetime' of a string URL to a blob which is not even needed at the point of use seems to create a mechanism that doesn't generally work:

    function getImageUrl() {
      var a_blob = ... load a blob in some way, perhaps via XHR
      return a_blob.url;
    }
    ...
    // sometime during initialization
    var imageUrl = getImageUrl();
    ...
    // sometime later
    anImage.src = imageUrl;

This may work all the time on the developer's computer and fail all the time (or sometimes) in the field. It may be very frustrating.

Tying the lifetime explicitly to the Window (the URL dies when the window closes or revoke() is called) does not fix all the issues, but it makes the mechanism less likely to shoot the user in the foot by making it more explicit.

Dmitry

On Tue, Jul 13, 2010 at 7:37 AM, David Levin le...@google.com wrote:

> On Tue, Jul 13, 2010 at 6:50 AM, Adrian Bateman adria...@microsoft.com wrote:
>
> On Monday, July 12, 2010 2:31 PM, Darin Fisher wrote:
>
> On Mon, Jul 12, 2010 at 9:59 AM, David Levin le...@google.com wrote:
>
> On Mon, Jul 12, 2010 at 9:54 AM, Adrian Bateman adria...@microsoft.com wrote:
>
> I read point #5 to be only about surviving the start of a navigation. As a web developer, how can I tell when a load has started for an img? Isn't this similarly indeterminate.
>
> As soon as img.src is set. The spec could mention that the resource pointed to by the blob URL should be loaded successfully as long as the blob URL is valid at the time when the resource is starting to load. Should apply to xhr (after send is called), img, and navigation.
>
> Right, it seems reasonable to say that ownership of the resource referenced by a Blob can be shared by a XHR, Image, or navigation once it is told to start loading the resource.
>
> -Darin
>
> It sounds like you are saying the following is guaranteed to work:
>
>     img.src = blob.url;
>     window.revokeBlobUrl(blob);
>     return;
>
> If that is the case then the user agent is already making the guarantees I was talking about and so I still think having the lifetime mapped to the blob not the document is better. This means that in the general case I don't have to worry about lifetime management.
>
> Mapping lifetime to the blob exposes when the blob gets garbage collected, which is a very indeterminate point in time (and is very browser version dependent -- it will set you up for compatibility issues when you update your JavaScript engine -- and there are also the cross-browser issues of course).
>
> Specifically, a blob could go out of scope (to use your earlier phrase) and then one could do img.src = blobUrl (the URL that was exposed from the blob but not using the blob object). This will work sometimes but not others (depending on whether garbage collection collected the blob).
>
> This is much more indeterminate than the current spec, which maps the blob.url lifetime to the lifetime of the document where the blob was created.
>
> When thinking about blob.url lifetime, there are several problems to solve:
>
> 1. An AJAX-style web application may never navigate the document, and this means that every blob for which a URL is created must be kept around in some form for the lifetime of the application.
> 2. A blob passed between documents would have its blob.url stop working as soon as the original document got closed.
> 3. Having a model that gives the URL a determinate lifetime and doesn't expose the web developer to indeterminate behavior issues like the ones we have discussed above.
>
> The current spec has issues #1 and #2. Binding the lifetime of blob.url to the blob has issue #3.
>
> dave
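The explicit-lifetime model argued for in this thread (a URL stays valid until it is revoked or its window closes, and a load that has already started keeps its data) can be modelled in a few lines. This is a toy simulation under stated assumptions: `BlobUrlRegistry` and all of its methods are hypothetical names, the "blob" is just a string, and none of this is the File API draft's actual interface.

```javascript
// Toy model of explicit Blob URL lifetime; not a browser API.
class BlobUrlRegistry {
  constructor() {
    this.entries = new Map();
    this.nextId = 1;
  }
  // Registering a blob returns a URL string that keeps the data reachable,
  // regardless of whether the caller still holds the blob object.
  createUrl(blob) {
    const url = `blob:sketch/${this.nextId++}`;
    this.entries.set(url, blob);
    return url;
  }
  // Explicitly end the URL's lifetime (the revoke() case).
  revokeUrl(url) {
    return this.entries.delete(url);
  }
  // What an img, XHR or navigation would do at the moment loading starts:
  // the URL must be valid now; afterwards the loader owns a reference.
  startLoad(url) {
    if (!this.entries.has(url)) {
      throw new Error(`dead blob URL: ${url}`);
    }
    return this.entries.get(url);
  }
  // Closing the window drops every URL that was not revoked earlier.
  closeWindow() {
    this.entries.clear();
  }
}

const registry = new BlobUrlRegistry();
const imageUrl = registry.createUrl("image bytes"); // blob object not kept
const loaded = registry.startLoad(imageUrl);        // load starts, data shared
registry.revokeUrl(imageUrl);                       // safe: load already began
console.log(loaded);
```

In this model nothing depends on garbage collection: a URL either is in the registry or is not, which is the determinism that issue #3 in the quoted list asks for.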
[Bug 9766] We should expose the subprotocol for the case of the client not specifying one but the server specifying one
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9766

Ian 'Hixie' Hickson i...@hixie.ch changed:

    What       |Removed    |Added
    -----------+-----------+-----------
    Status     |NEW        |RESOLVED
    Resolution |           |FIXED

--
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.
[Bug 9973] If the entry's name is sec-websocket-protocol 0 please don't put normative requirements in parenthesis
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9973

Ian 'Hixie' Hickson i...@hixie.ch changed:

    What       |Removed    |Added
    -----------+-----------+--------------
    Status     |NEW        |RESOLVED
    CC         |           |i...@hixie.ch
    Resolution |           |NEEDSINFO

--- Comment #1 from Ian 'Hixie' Hickson i...@hixie.ch 2010-07-22 05:25:27 ---
Not sure what this is referring to.
[Bug 9989] Is the number of replacement characters supposed to be well-defined? If not this should be explicitly noted. If it is then more detail is required.
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9989

Ian 'Hixie' Hickson i...@hixie.ch changed:

    What       |Removed    |Added
    -----------+-----------+--------------
    Status     |NEW        |RESOLVED
    CC         |           |i...@hixie.ch
    Resolution |           |NEEDSINFO

--- Comment #1 from Ian 'Hixie' Hickson i...@hixie.ch 2010-07-22 05:27:48 ---
I don't understand what isn't well-defined.
[Bug 10129] The end of this WebSocket section links to EventSource with fail the connection. It should link to the WebSocket fail the connection instead.
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10129

Ian 'Hixie' Hickson i...@hixie.ch changed:

    What       |Removed    |Added
    -----------+-----------+--------------
    Status     |NEW        |RESOLVED
    CC         |           |i...@hixie.ch
    Resolution |           |FIXED
[Bug 10213] The definition of absolute url makes https:foo not an absolute url, since its behavior depends on whether the base is https: or not. Is that desired? In particular, using this defini
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10213

Ian 'Hixie' Hickson i...@hixie.ch changed:

    What       |Removed    |Added
    -----------+-----------+--------------
    Status     |NEW        |RESOLVED
    CC         |           |i...@hixie.ch
    Resolution |           |NEEDSINFO

--- Comment #2 from Ian 'Hixie' Hickson i...@hixie.ch 2010-07-22 05:39:43 ---
ws:foo isn't absolute, therefore per spec it's treated as non-absolute. Am I missing something? Are browsers not implementing the spec here?
[Bug 10213] The definition of absolute url makes https:foo not an absolute url, since its behavior depends on whether the base is https: or not. Is that desired? In particular, using this defini
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10213

Boris Zbarsky bzbar...@mit.edu changed:

    What       |Removed    |Added
    -----------+-----------+-----------------
    Status     |RESOLVED   |REOPENED
    CC         |           |bzbar...@mit.edu
    Resolution |NEEDSINFO  |

--- Comment #3 from Boris Zbarsky bzbar...@mit.edu 2010-07-22 05:50:07 ---
> ws:foo isn't absolute,

How is a browser supposed to know that? Trying to create a URI from that string without a base URI successfully creates one, for example...

> Are browsers not implementing the spec here?

Nope. Neither Gecko nor WebKit throws on such a URL, for example. In Gecko's case, that is because the concept of absolute URL the spec uses (one which resolves to different things depending on the base) matches nothing that Necko exposes, and because by the definition normally used in Gecko (it's an absolute URL if you can parse it as a URL even if there is no base) this URL is absolute.

See also https://bugzilla.mozilla.org/show_bug.cgi?id=580234 which is what prompted me to read this section to start with.
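The base-dependent behaviour at the heart of this bug can be observed with a URL parser. The sketch below assumes a runtime that provides the WHATWG `URL` constructor, which postdates this discussion and is not the spec text being debated here; `resolve` is a hypothetical helper. It uses the `https:foo` string from the bug title.

```javascript
// Sketch: how "https:foo" resolves with no base, a same-scheme base, and a
// different-scheme base. Assumes the WHATWG URL constructor is available.
function resolve(input, base) {
  // With no base, the string must parse on its own or URL throws.
  const url = base === undefined ? new URL(input) : new URL(input, base);
  return url.href;
}

// No base: parsing succeeds, and "foo" is taken as the host.
console.log(resolve("https:foo"));
// → https://foo/

// Base with the same scheme: "foo" is treated as a relative path.
console.log(resolve("https:foo", "https://example.org/dir/"));
// → https://example.org/dir/foo

// Base with a different scheme: "foo" is taken as the host again.
console.log(resolve("https:foo", "http://example.org/dir/"));
// → https://foo/
```

So the same string parses without any base (which is the Gecko notion of "absolute" described above) and yet resolves differently depending on the base's scheme (which is why the spec's definition does not count it as absolute).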