Re: [Tutor] XML: Expletive Deleted (OT)
> [ The only SOA/XML book that addresses this side of XML usage
> is the excellent "SOA - A Field Guide" by Peter Erls. Erls also
> suggests some mitigating strategies to get round it.]

Oops, don't rely on memory... That is Thomas Erl, not Peter Erls. And of course there may be other SOAP/XML books that deal with these issues, but Erl's book is the only one I've read that does so!

Alan G.

___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] XML: Expletive Deleted (OT)
Just picked this up after being out for most of the week...

"Carroll, Barry" <[EMAIL PROTECTED]> wrote in message

> One reason for choosing a human-readable format is the desire to
> visually confirm the correctness of the stored data and format.

That's a very dangerous assumption. How do you detect unprintable characters, tabs instead of spaces, trailing spaces on a line, etc.? While text representations are helpful, you should never rely on the human eye to validate a data file.

> This can be invaluable when troubleshooting a bug involving stored data.
> If there is a tool between the user and the data, one must then rely
> upon the correctness of the tool to determine the correctness of the
> data.

Or the correctness of the eye. I know which one I prefer - a tested tool. The human eye is not a data parser, but it flatters to deceive by being nearly good enough.

> In a case like this, nothing beats the evidence of one's eyes, IMHO.

Almost anything beats the human eye IME :-) Actually, if you must use eyes, do so on a hex dump of the file; that is usually reliable enough if you can read hex...

> In their book, "The Pragmatic Programmer: From Journeyman to Master"
> (Addison Wesley Professional), Andrew Hunt and David Thomas give
> another reason for storing data in human readable form:
>
> > The problem with most binary formats is that the context necessary
> > to understand the data is separate from the data itself. You are
> > artificially divorcing the data from its meaning. The data may
> > as well be encrypted; it is absolutely meaningless without the
> > application logic to parse it. With plain text, however, you can
> > achieve a self-describing data stream that is independent of the
> > application that created it.

But at a very high risk. I do not dislike text files, BTW, and am not suggesting that text should not be used, but its parsing is best left to machines; the eye is only a rough and unreliable guide.
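
A minimal sketch of what "leave it to a tested tool" could look like for the defects named above (tabs, trailing spaces, unprintable characters). The function name `find_invisible_problems` and the sample data are invented for illustration, and the check assumes plain ASCII text, so legitimate non-ASCII characters would also be flagged.

```python
import string

# Characters the eye copes with reasonably well. Tab, carriage return,
# vertical tab and form feed are deliberately excluded (ASCII-only assumption).
VISIBLE = set(string.printable) - set("\t\r\x0b\x0c")


def find_invisible_problems(text):
    """Return (line_number, description) pairs for hard-to-see defects."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if "\t" in line:
            problems.append((number, "tab character"))
        if line != line.rstrip(" \t"):
            problems.append((number, "trailing whitespace"))
        odd = sorted({ch for ch in line if ch not in VISIBLE and ch != "\t"})
        if odd:
            shown = ", ".join(repr(ch) for ch in odd)
            problems.append((number, "unprintable: " + shown))
    return problems


# Hypothetical data file contents: line 2 ends in a space, line 3 uses a
# tab as separator, line 4 carries a NUL byte.
sample = "name,value\nalpha,1 \nbeta\t2\ngamma,3\x00\n"
for number, description in find_invisible_problems(sample):
    print(number, description)
```

None of the three defects in `sample` would be visible in a typical editor, which is the argument for checking with code rather than by eye.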
And if your data volumes are high, go with binary: you'll need tools to parse a lot of data anyway, so you might as well save the space!

The Hunt/Thomas book is excellent BTW - I recommend it highly. Even though I disagree with several of their suggestions(*), I agree with far more.

(*) They recommend sticking with one text editor, whereas I use about 5 or 6 on a regular basis depending on the job I'm doing and the platform I'm working on: Emacs on X Windows for new files but vim for quick fix-ups, vim on Windows for most things, ed or ex for text-based email or over a phone line.

> This is an example of the resource balancing act that computer people
> have been faced with since the beginning. The most scarce/expensive
> resource dictates the program's/system's design. In Alan's example high
> speed bandwidth is the limiting resource. A data transmission method
> that fails to minimize use of that resource is therefore a bad
> solution.

Unfortunately the software industry is full of people who by and large don't understand networks, so they just ignore them. At least that's my experience! SOA using SOAP/XML is probably the most inefficient and unreliable set of data networking technologies you could possibly come up with. But the focus is on cutting developer cost because the people inventing it are developers!

In almost every sizeable project the cost of development will be significantly less than the cost of deployment - in most of my projects it usually works out something like:

development - 15%
deployment - 30%
support - 15%
training - 25%
documentation - 5%
management overhead - 10%

Saving 25% of development costs reduces total cost by around 4%, but if that puts deployment costs up by 10% (3% of the total) the net gain is only 1%! And in XML's case it often puts deployment costs up by 100% - a net loss of 26%!!

Now those figures come from a typical project that I work on, which probably has a total budget of between $10 million and $100 million.
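
The cost arithmetic above can be checked with a few lines of Python. This is only a sketch: the percentage shares are the ones quoted in the message, while the function name `net_change` and its sign convention are invented here. It prints the unrounded figures so they can be compared with the rounded ones in the text.

```python
# Cost shares of a typical project, in percent of the total budget
# (figures taken from the message above).
SHARES = {
    "development": 15.0,
    "deployment": 30.0,
    "support": 15.0,
    "training": 25.0,
    "documentation": 5.0,
    "management overhead": 10.0,
}


def net_change(dev_saving, deployment_increase):
    """Net change in total cost, in percent of the original budget.

    Both arguments are fractions (0.25 means 25%). A negative result
    is a net saving; a positive result is a net loss.
    """
    saved = SHARES["development"] * dev_saving
    added = SHARES["deployment"] * deployment_increase
    return added - saved


# The shares should account for the whole budget.
assert sum(SHARES.values()) == 100.0

print("25% dev saving only:           ", net_change(0.25, 0.00))
print("... with deployment up by 10%: ", net_change(0.25, 0.10))
print("... with deployment up by 100%:", net_change(0.25, 1.00))
```

Because deployment is twice the size of development in this breakdown, any percentage increase in deployment cost outweighs the same percentage saving in development.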
If your budget is smaller, say less than $1 million, then the balance may well change. But over 50% of the IT industry works on projects with >$1m budgets, according to both Datamation and Infoweek.

[ The only SOA/XML book that addresses this side of XML usage is the excellent "SOA - A Field Guide" by Peter Erls. Erls also suggests some mitigating strategies to get round it.]

> So here's my off-topic question: Ajax is being touted as the 'best-known
> method' (BKM) for making dynamic browser-based applications, and XML is
> the BKM for transferring data in Ajax land. If XML is a bad idea for
> network data-transfer, what medium should be used instead?

The example I gave of having to upgrade the site's network was actually an early adopter of XML/Ajax architecture!

There are lots of other data formats around - some are even self-describing (CSV and TLV are cases). Others simply hold the definition in an accessible library so you only have to transport it once - e.g. IDL and ASN.1 - or optionally compile it into your code for maxim
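
To make the TLV (tag-length-value) idea mentioned above concrete, here is a small sketch in Python. The layout (1-byte tag, 2-byte big-endian length) and the tag numbers are assumptions chosen for illustration; real TLV protocols each define their own field sizes.

```python
import struct


def tlv_encode(fields):
    """Pack (tag, value) pairs as: 1-byte tag, 2-byte big-endian length, value."""
    out = bytearray()
    for tag, value in fields:
        out += struct.pack(">BH", tag, len(value)) + value
    return bytes(out)


def tlv_decode(payload):
    """Yield (tag, value) pairs from a buffer produced by tlv_encode.

    No validation of truncated or corrupt buffers is attempted here.
    """
    offset = 0
    while offset < len(payload):
        tag, length = struct.unpack_from(">BH", payload, offset)
        offset += 3  # size of the tag + length header
        yield tag, payload[offset:offset + length]
        offset += length


# Hypothetical tag numbers, invented for this illustration.
TAG_NAME, TAG_PRICE = 1, 2
message = tlv_encode([(TAG_NAME, b"widget"), (TAG_PRICE, b"9.99")])
print(len(message), list(tlv_decode(message)))
```

Each field costs three bytes of framing regardless of its name, whereas an XML element pays for its tag name twice.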
Re: [Tutor] XML: Expletive Deleted (OT)
Carroll, Barry wrote:
> So here's my off-topic question: Ajax is being touted as the 'best-known
> method' (BKM) for making dynamic browser-based applications, and XML is
> the BKM for transferring data in Ajax land. If XML is a bad idea for
> network data-transfer, what medium should be used instead?

JSON is a popular alternative to XML for Ajax applications. It is much lighter-weight than XML and easier to parse in JavaScript.

http://json.org/

Kent
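
As a rough illustration of the "lighter-weight" point, the sketch below serialises the same record both ways using Python's standard `json` and `xml.etree.ElementTree` modules. The record and the element names are invented for this example, and the XML layout (one child element per field) is just one of several possible encodings.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented for this illustration.
record = {"id": 42, "name": "widget", "price": 9.99}

# JSON: one call each way, and the types survive the round trip.
as_json = json.dumps(record)
assert json.loads(as_json) == record

# XML: one child element per field; every value becomes text.
root = ET.Element("record")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

print(len(as_json), as_json)
print(len(as_xml), as_xml)
```

Besides the size difference, the XML reader gets every value back as a string and has to know which fields to convert to numbers.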
Re: [Tutor] XML: Expletive Deleted (OT)
Alan, Ralph, et al:

This is a little off-topic, I guess, being not directly related to Python. Oh, well. Here are a couple of personal opinions and a question about XML.

> -Original Message-
> Date: Sun, 11 Jun 2006 08:55:17 +0100
> From: "Alan Gauld" <[EMAIL PROTECTED]>
> Subject: Re: [Tutor] Expletive Deleted
> To: "Ralph H. Stoos Jr." <[EMAIL PROTECTED]>,
> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; format=flowed; charset="iso-8859-1";
> reply-type=original
>
> > I think XML is a tool that allows non-programmers to look at
> > structured data and have it in a human readable form that gives us
> > a chance of understanding that structure.
>
> That's not a great reason to choose a file format IMHO.
> Tools can be written to display data in a readable format.
> For example SQL can be used to view the data in a database.
> File formats should be designed to store data, compactly
> and with easy access.

One reason for choosing a human-readable format is the desire to visually confirm the correctness of the stored data and format. This can be invaluable when troubleshooting a bug involving stored data. If there is a tool between the user and the data, one must then rely upon the correctness of the tool to determine the correctness of the data. In a case like this, nothing beats the evidence of one's eyes, IMHO.

In their book, "The Pragmatic Programmer: From Journeyman to Master" (Addison Wesley Professional), Andrew Hunt and David Thomas give another reason for storing data in human readable form:

    The problem with most binary formats is that the context necessary to understand the data is separate from the data itself. You are artificially divorcing the data from its meaning. The data may as well be encrypted; it is absolutely meaningless without the application logic to parse it. With plain text, however, you can achieve a self-describing data stream that is independent of the application that created it.
    Tip 20: Keep Knowledge in Plain Text

> > The other strength that I can see is this: Once data is in this
> > format, and a tool has been written to parse it, data can be added
> > to the structure (more elements) and the original tool will not be
> > broken by this. Whatever it is parsed for is found and the extra is
> > ignored.
>
> But this is a very real plus point for XML.
> And this IMHO is the biggest single reason for using it, if you have
> data where the very structure itself is changing yet the same file
> has to be readable by old and new clients then XML is a good choice.

No argument there.

> > Without a doubt, the overhead XML adds over say, something as simple
> > as CSV is considerable, and XML would appear to be rather more hard
> > to work with in things like Python and PERL.
>
> Considerable is an understatement, it's literally up to 10 or 20 times
> more space and that means bandwidth and CPU resource to
> process it.
>
> Using XML as a storage medium - a file - is not too bad, you suck
> it up, process it and forget the file. MY big gripe is that people
> are increasingly trying to use XML as the payload in comms systems,
> sending XML messages around. This is crazy! The extra cost of the
> network and hardware needed to process that kind of architecture
> is usually far higher than the minimal savings it gives in developer
> time.
> [As an example I recently had to uplift the bandwidth of the
> intranet pipe in one of our buildings from 4Mb to a full ATM pipe
> of 34Mb just to accommodate a system 'upgrade' that now used XML.
> That raised the network operations cost of that one building
> from $10k per year to over $100k! - The software upgrade by
> contrast was only a one-off cost of $10K]

This is an example of the resource balancing act that computer people have been faced with since the beginning. The most scarce/expensive resource dictates the program's/system's design.
In Alan's example high speed bandwidth is the limiting resource. A data transmission method that fails to minimize use of that resource is therefore a bad solution.

Python itself is a result of this balancing act. Interpreted languages like Basic were invented to overcome the disadvantages of writing programs in machine-readable, human-unfriendly formats. Compiled languages like C were invented to overcome the slow execution speed of interpreted programs. As processor speeds increased and execution times dropped, interpreted languages like Python once again became viable for large scale programs.

> > So, I think XML has its place but I will not fault anyone for
> > trying to make it easier to get code to work.
>
> Absolutely agree with that. Just be careful how you use it and
> think of the real cost impact you may be having if it's your choice.
> Your customers will thank you.

So here's my off-topic question: Ajax is being touted as the 'best-known method' (BKM) for
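
The forward-compatibility point quoted earlier in this message (a parser that looks only for the elements it knows keeps working when new elements are added) can be sketched as follows. The documents, element names and the function `read_prices` are all invented for illustration; `xml.etree.ElementTree` is the standard-library module in current Python versions.

```python
import xml.etree.ElementTree as ET


def read_prices(xml_text):
    """An 'old' client: it knows only about <name> and <price>."""
    root = ET.fromstring(xml_text)
    return {
        item.findtext("name"): float(item.findtext("price"))
        for item in root.findall("item")
    }


# Hypothetical documents, invented for this illustration.
OLD_FORMAT = """
<catalog>
  <item><name>widget</name><price>9.99</price></item>
</catalog>
"""

# A newer producer adds <currency> and <stock>; the old client never
# asks for them, so they are silently ignored.
NEW_FORMAT = """
<catalog>
  <item>
    <name>widget</name><price>9.99</price>
    <currency>USD</currency><stock>12</stock>
  </item>
</catalog>
"""

print(read_prices(OLD_FORMAT))
print(read_prices(NEW_FORMAT))
```

A positional format such as CSV gives the same guarantee only if new columns are always appended and readers address columns by header name rather than by index.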
Re: [Tutor] XML: Expletive Deleted
Kent, Danny, Lawrence, et al.

Thanks! I was kind of cringing as I sent this plaint/rant, but it seems I'm not the only one who has had trouble grokking DOM. I spanked the problem temporarily with regex, but can now actually fix it properly. Appreciate all the help!

On 6/10/06, Kent Johnson <[EMAIL PROTECTED]> wrote:
> In my opinion the standard DOM models are the most awkward way to deal
> with XML. If you are trying to get data from HTML on a web page, look at
> BeautifulSoup. For general XML processing, look at ElementTree. They are
> both simpler than DOM.
>
> http://www.crummy.com/software/BeautifulSoup/
> http://effbot.org/zone/element.htm
>
> Kent
Re: [Tutor] XML: Expletive Deleted
In my opinion the standard DOM models are the most awkward way to deal with XML. If you are trying to get data from HTML on a web page, look at BeautifulSoup. For general XML processing, look at ElementTree. They are both simpler than DOM.

http://www.crummy.com/software/BeautifulSoup/
http://effbot.org/zone/element.htm

Kent
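
For comparison with the DOM attempts elsewhere in this thread, here is a hedged sketch of the same kind of lookup with ElementTree. Only the `ItemID` tag name comes from the thread; the surrounding document is invented, and the import path assumes a Python version that ships ElementTree as `xml.etree.ElementTree` rather than as a separate download.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body, invented for illustration; only the
# ItemID tag name comes from the thread.
data = """
<Response>
  <Item><ItemID>1001</ItemID></Item>
  <Item><ItemID>1002</ItemID></Item>
</Response>
"""

root = ET.fromstring(data)

# iter() walks the whole tree, so the nesting depth of ItemID does not
# matter; .text is the element's own text content.
item_ids = [element.text for element in root.iter("ItemID")]
print(item_ids)
```

The text of an element is an ordinary attribute here, so there is no separate text node to find.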
Re: [Tutor] XML: Expletive Deleted
> >>> for item in itemIDs:
> >>>     print item
>
> yields
>
> Okay, no problem. Now all I have to do is figure out which
> particular.string.of.words.interconnected.by.periods to
> pass to extract the values.
>
> >>> for item in itemIDs:
> >>>     print item.nodeValue
>
> Seems logical:
>
> None
> None
> None
> None
> None

try dir(item) to see what attributes the item has, and try the ones that sound right. e.g.:

>>> from xml.dom.minidom import parse, parseString
>>> resp = parseString("<bottom>foo</bottom>")
>>> bottom = resp.getElementsByTagName("bottom")
>>> bottom
[<DOM Element: bottom at 0x...>]
>>> dir(bottom[0])
['ATTRIBUTE_NODE', ...long list snipped..., 'writexml']
>>> bottom[0].hasChildNodes()
True
>>> bottom[0].childNodes
[<DOM Text node "foo">]
>>> dir(bottom[0].childNodes[0])
['ATTRIBUTE_NODE', ...long list snipped..., 'writexml']
>>> bottom[0].childNodes[0].data
u'foo'

so you see, with "<tag>value</tag>", there's an invisible text node. it's one of the quirks of xml, i guess. then the attribute you're looking for is "data", not "nodeValue".

in summary: instead of item.nodeValue, item.childNodes[0].data.

--lawrence
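
The advice above (read `.data` from the child text node rather than `.nodeValue` from the element) can be wrapped in a small helper. This is a sketch only: `get_text` and the sample document are invented here, and the helper additionally copes with an empty element, where `childNodes[0]` would raise an IndexError.

```python
from xml.dom.minidom import parseString


def get_text(element):
    """Concatenate the text nodes directly under a minidom element."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


# Hypothetical response body, invented for illustration. The last
# ItemID is empty on purpose.
document = parseString(
    "<Response><ItemID>1001</ItemID><ItemID>1002</ItemID><ItemID/></Response>"
)
items = document.getElementsByTagName("ItemID")

# nodeValue is defined to be None for element nodes.
print([item.nodeValue for item in items])

values = [get_text(item) for item in items]
print(values)

# Release the tree only after the values have been copied out.
document.unlink()
```

Copying the strings out before calling `unlink()` means nothing later depends on the state of the released tree.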
Re: [Tutor] XML: Expletive Deleted
> >>> from xml.dom.minidom import parse, parseString
>
> >>> data = response.read()
> >>> connection.close()
> >>> response = parseString(data)
> >>> itemIDs = response.getElementsByTagName("ItemID")
> >>> response.unlink()
               ^

Hi Doug,

What's going on here? Why unlink()?

> Okay, no problem. Now all I have to do is figure out which
> particular.string.of.words.interconnected.by.periods to pass to extract
> the values.
>
> >>> for item in itemIDs:
> >>>     print item.nodeValue

You may want to look at the minidom example here:

http://www.python.org/doc/lib/dom-example.html

Does this help?