On Sun, 15 Aug 2010 11:17:43 -0400, Mike Wilcox <m...@mikewilcox.net>
wrote:
Michael, good try, but I've been down that road; it's pretty hard to do.
You left in the script text,
Yeah, forgot about that. I'm grabbing text nodes from anything.
spaces were missing, and there were no line breaks.
Yes, I did that on purpose because I thought that's what you wanted
judging by "textContent returns everything, including tabs, white space.."
But, either way, it is indeed more complicated than my example.
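For the script-text part, one possible fix is to walk the tree by hand
and skip the subtrees you don't want. This is only a rough sketch, not a
tested replacement for getInnerText: collectVisibleText and fakeDiv are
made-up names, and the plain object is a stand-in for a real element so
the snippet runs without a browser. It only reads nodeType, nodeName,
childNodes and nodeValue, so it should behave the same on real DOM nodes.

```javascript
// Node-type constants from the DOM spec, repeated here so the sketch
// also runs outside a browser.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper: gathers text-node values under `node`, skipping
// anything inside <script> or <style>.
function collectVisibleText(node, parts) {
  parts = parts || [];
  if (node.nodeType === TEXT_NODE) {
    parts.push(node.nodeValue);
  } else if (node.nodeType === ELEMENT_NODE) {
    var name = String(node.nodeName).toUpperCase();
    if (name !== 'SCRIPT' && name !== 'STYLE') {
      var kids = node.childNodes || [];
      for (var i = 0; i < kids.length; i++) {
        collectVisibleText(kids[i], parts);
      }
    }
  }
  return parts;
}

// Plain-object stand-in (an assumption, for illustration) for:
// <div>Hello <script>var x;</script><b>world</b></div>
var fakeDiv = {
  nodeType: 1, nodeName: 'DIV', childNodes: [
    { nodeType: 3, nodeValue: 'Hello ' },
    { nodeType: 1, nodeName: 'SCRIPT',
      childNodes: [{ nodeType: 3, nodeValue: 'var x;' }] },
    { nodeType: 1, nodeName: 'B',
      childNodes: [{ nodeType: 3, nodeValue: 'world' }] }
  ]
};

var visibleText = collectVisibleText(fakeDiv).join('');
```

Spaces and line breaks between block-level elements are still not
handled here; that is the part that stays hard.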
On Aug 15, 2010, at 7:41 AM, Michael A. Puls II wrote:
On Sat, 14 Aug 2010 20:03:30 -0400, Mike Wilcox <m...@mikewilcox.net>
wrote:
Wow, I was just thinking of proposing this myself a few days ago.
In addition to Adam's comments, there is no standard, stable way of
*getting* the text from a series of nodes. textContent returns
everything, including tabs, white space, and even script content.
Well, you can do stuff like this:
------
(function() {
    function trim(s) {
        return s.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
    }
    function setInnerText(v) {
        this.textContent = v;
    }
    function getInnerText() {
        var iter = this.ownerDocument.createNodeIterator(this,
            NodeFilter.SHOW_TEXT, null, null);
        var ret = "";
        var first = true;
        for (var node; (node = iter.nextNode()); ) {
            var fixed = trim(node.nodeValue.replace(/\r|\n|\t/g, ""));
            if (fixed.length > 0) {
                if (!first) {
                    ret += " ";
                }
                ret += fixed;
                first = false;
            }
        }
        return ret;
    }
    HTMLElement.prototype.__defineGetter__('myInnerText', getInnerText);
    HTMLElement.prototype.__defineSetter__('myInnerText', setInnerText);
})();
------
and adjust how you handle spaces and build the string etc. as you see
fit. Then, it's just alert(el.myInnerText).
NodeIterator is standard. __defineGetter__/__defineSetter__ are a
de facto standard (and you have Object.defineProperty as the standard
way for browsers that support it). How newlines, tabs and spaces are
stripped or normalized just isn't standardized in this case. But that
might be different depending on the application.
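For reference, the Object.defineProperty form of the same accessor pair
could look roughly like this. FakeElement is a made-up stand-in for
HTMLElement so the snippet runs without a DOM, and the getter here only
trims the ends; in a browser you would define the property on
HTMLElement.prototype and plug in whatever getter you settled on.

```javascript
// Hypothetical stand-in for HTMLElement (an assumption, so the sketch
// runs without a DOM). It only carries a textContent property.
function FakeElement(text) {
  this.textContent = text;
}

// One call defines both the getter and the setter.
Object.defineProperty(FakeElement.prototype, 'myInnerText', {
  get: function () {
    // Simplified getter: trim leading and trailing whitespace only.
    return this.textContent.replace(/^\s+|\s+$/g, '');
  },
  set: function (v) {
    this.textContent = v;
  },
  enumerable: true,
  configurable: true
});

var el = new FakeElement('  Hello world \n');
var before = el.myInnerText;   // goes through the getter
el.myInnerText = 'replaced';   // goes through the setter
var after = el.textContent;
```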
Or, just run a regex on textContent.
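Something along these lines, as a sketch. normalizeText is a made-up
name, and collapsing every run of whitespace to a single space is just
one possible rule. Note that this does nothing about script content,
since textContent still includes it.

```javascript
// Hypothetical helper: collapse each run of whitespace (spaces, tabs,
// CR, LF) to one space, then drop a leading or trailing space.
function normalizeText(s) {
  return s.replace(/\s+/g, ' ').replace(/^ | $/g, '');
}

// In a browser this would be normalizeText(el.textContent).
var sample = normalizeText('\n\t  one \r\n two\t\tthree \n');
```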
--
Michael