Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-16 Thread Alan Gauld
> [ The only SOA/XML book that addresses this side of XML usage
> is the excellent "SOA - A Field Guide" by Peter Erls. Erls also
> suggests some mitigating strategies to get round it.]

Oops, don't rely on memory...

That is Thomas Erl not Peter Erls.

And of course there may be other SOAP/XML books that deal with these
issues, but Erl's book is the only one I've read that does so!

Alan G.

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-15 Thread Alan Gauld
Just picked this up after being out for most of the week...

"Carroll, Barry" <[EMAIL PROTECTED]> wrote in message

> One reason for choosing a human-readable format is the desire to
> visually confirm the correctness of the stored data and format.

That's a very dangerous assumption. How do you detect unprintable
characters, tabs instead of spaces, trailing spaces on a line, etc.?
While text representations are helpful, you should never rely on the
human eye to validate a data file.
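
As a rough illustration of letting a tool do the checking, here is a
minimal sketch that reports the kinds of problems listed above. The
function name and the sample data are made up for the example; it is
not a full validator for any particular file format.

```python
import unicodedata

def find_invisible_problems(text):
    """Return (line_number, description) pairs for things the eye tends to miss."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if "\t" in line:
            problems.append((number, "tab character"))
        if line != line.rstrip(" \t"):
            problems.append((number, "trailing whitespace"))
        for ch in line:
            # Category "Cc" covers control characters; tabs are reported above.
            if ch != "\t" and unicodedata.category(ch) == "Cc":
                problems.append((number, "control character %r" % ch))
    return problems

# Hypothetical sample: a tab on line 2, trailing spaces on line 3, a NUL on line 4.
sample = "name,qty\nwidget,\t3\ngadget,4  \nsprocket\x00,5\n"
for number, description in find_invisible_problems(sample):
    print(number, description)
```

None of the three problems in the sample would be visible in most
editors or terminals.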

> can be invaluable when troubleshooting a bug involving stored data. If
> there is a tool between the user and the data, one must then rely upon
> the correctness of the tool to determine the correctness of the data.

Or the correctness of the eye. I know which one I prefer - a tested
tool. The human eye is not a data parser, but it flatters to deceive
by being nearly good enough.

> In a case like this, nothing beats the evidence of one's eyes, IMHO.

Almost anything beats the human eye IME :-)
Actually if you must use eyes do so on a hex dump of the file, that
is usually reliable enough if you can read hex...
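
If no hex dump utility is to hand, something along these lines would
do for small files. This is only a sketch; the function name and the
column layout are my own choices, loosely modelled on the usual
offset / hex / ASCII arrangement.

```python
def hex_dump(data, width=16):
    """Format bytes as offset, hex pairs, and a printable-ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % byte for byte in chunk)
        # Anything outside printable ASCII is shown as a dot.
        text_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append("%08x  %-*s  %s" % (offset, width * 3 - 1, hex_part, text_part))
    return "\n".join(lines)

# A tab, a trailing space and a CR/LF pair all show up as distinct byte values.
print(hex_dump(b"id\t42 \r\n"))
```

To dump a real file, read it in binary mode and pass the bytes in, e.g.
hex_dump(open(path, "rb").read()) with your own path.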

> In their book, "The Pragmatic Programmer: From Journeyman to Master"
> (Addison Wesley Professional), Andrew Hunt and David Thomas give another
> reason for storing data in human readable form:
>
>    The problem with most binary formats is that the context necessary
>    to understand the data is separate from the data itself. You are
>    artificially divorcing the data from its meaning. The data may
>    as well be encrypted; it is absolutely meaningless without the
>    application logic to parse it. With plain text, however, you can
>    achieve a self-describing data stream that is independent of the
>    application that created it.

But at a very high risk. I do not dislike text files BTW and am not
suggesting that text should not be used but its parsing is best left
to machines, the eye is only a rough and unreliable guide.

And if your data volumes are high go with binary, you'll need tools
to parse a lot of data anyway, you might as well save the space!

The Hunt/Thomas book is excellent BTW - I recommend it highly.
Even though I disagree with several of their suggestions(*) I agree
with far more.

(*)They recommend sticking with one text editor whereas I use
about 5 or 6 on a regular basis depending on the job I'm doing and
the platform I'm working on. Emacs on X Windows for new files
but vim for quick fix ups, vim on Windows for most things,
ed or ex for text based email or over a phone line.

> This is an example of the resource balancing act that computer people
> have been faced with since the beginning.  The most scarce/expensive
> resource dictates the program's/system's design.  In Alan's example high
> speed bandwidth is the limiting resource.  A data transmission method
> that fails to minimize use of that resource is therefore a bad solution.

Unfortunately the software industry is full of people who by and large
don't understand networks so they just ignore them. At least that's
my experience! SOA using SOAP/XML is probably the most inefficient
and unreliable set of data networking technologies you could possibly
come up with. But the focus is on cutting developer cost because the
people inventing it are developers! In almost every sizeable project
the cost of development will be significantly less than the cost of
deployment - in most of my projects it usually works out something
like:

development - 15%
deployment - 30%
support - 15%
training - 25%
documentation - 5%
management overhead - 10%

Saving 25% of development costs reduces total cost by around 4%, but
if that puts deployment costs up by 10% the net gain is only 1%!
And in XML's case it often puts deployment costs up by 100%
- a net loss of around 26%!!
Now those figures come from a typical project that I work on which
probably has a total budget of between $10 million and $100 million.
If your budget is smaller, say less than $1 million, then the balance
may well change. But over 50% of the IT industry works on projects
with >$1m budgets according to both Datamation and Infoweek.
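
For anyone who wants to plug in their own numbers, the arithmetic can
be sketched as below. The cost shares are the ones listed above; the
function name is made up, and a positive result means total cost went
up. With these shares the three cases come to a 3.75-point saving, a
0.75-point saving and a 26.25-point increase before rounding.

```python
# Shares of total project cost, in percent, as listed in the message above.
shares = {
    "development": 15, "deployment": 30, "support": 15,
    "training": 25, "documentation": 5, "management overhead": 10,
}

def total_change(development_saving, deployment_increase):
    """Net change in total cost, in percentage points of the original total.

    Both arguments are fractions (0.25 means 25%). Positive means costlier.
    """
    saved = shares["development"] * development_saving
    added = shares["deployment"] * deployment_increase
    return added - saved

print(total_change(0.25, 0.0))   # development saving alone
print(total_change(0.25, 0.10))  # deployment up by 10%
print(total_change(0.25, 1.00))  # deployment doubled
```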

[ The only SOA/XML book that addresses this side of XML usage
is the excellent "SOA - A Field Guide" by Peter Erls. Erls also
suggests some mitigating strategies to get round it.]

> So here's my off-topic question: Ajax is being touted as the 'best-known
> method' (BKM) for making dynamic browser-based applications, and XML is
> the BKM for transferring data in Ajax land.  If XML is a bad idea for
> network data-transfer, what medium should be used instead?

The example I gave of having to upgrade the site's network was actually
an early adopter of XML/Ajax architecture! There are lots of other data
formats around - some are even self describing (CSV and TLV are cases).
Others simply hold the definition in an accessible library so you only
have to transport it once - eg IDL and ASN.1 - or optionally compile 
it
into your code for maxim

Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-12 Thread Kent Johnson
Carroll, Barry wrote:
> So here's my off-topic question: Ajax is being touted as the 'best-known
> method' (BKM) for making dynamic browser-based applications, and XML is
> the BKM for transferring data in Ajax land.  If XML is a bad idea for
> network data-transfer, what medium should be used instead?

JSON is a popular alternative to XML for Ajax applications. It is much 
lighter-weight than XML and easier to parse in JavaScript.
http://json.org/
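
To get a feel for the size difference, here is a small sketch that
serializes the same made-up records both ways and compares byte
counts. The records and the XML element names are invented for the
example, and the exact ratio will depend heavily on your data.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical data, just for the comparison.
records = [
    {"id": 1, "name": "widget", "price": 9.99},
    {"id": 2, "name": "gadget", "price": 19.5},
]

def to_json(rows):
    # Compact separators drop the optional whitespace.
    return json.dumps(rows, separators=(",", ":")).encode("utf-8")

def to_xml(rows):
    # One <record> per row, one child element per field.
    root = ET.Element("records")
    for row in rows:
        rec = ET.SubElement(root, "record")
        for key, value in row.items():
            ET.SubElement(rec, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")

print(len(to_json(records)), "bytes as JSON")
print(len(to_xml(records)), "bytes as XML")
```

Most of the extra XML bytes come from repeating every field name in
both an opening and a closing tag.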

Kent



Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-12 Thread Carroll, Barry
Alan, Ralph, et al:

This is a little off-topic, I guess, being not directly related to
Python.  Oh, well.  Here are a couple of personal opinions and a
question about XML.

> -----Original Message-----
> Date: Sun, 11 Jun 2006 08:55:17 +0100
> From: "Alan Gauld" <[EMAIL PROTECTED]>
> Subject: Re: [Tutor] Expletive Deleted
> To: "Ralph H. Stoos Jr." <[EMAIL PROTECTED]>,

> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; format=flowed; charset="iso-8859-1";
>   reply-type=original
> 
> > I think XML is a tool that allows non-programmers to look at
> > structured data and have it in a human-readable form that gives us
> > a chance of understanding that structure.
>
> That's not a great reason to choose a file format IMHO.
> Tools can be written to display data in a readable format.
> For example SQL can be used to view the data in a database.
> File formats should be designed to store data, compactly
> and with easy access.

One reason for choosing a human-readable format is the desire to
visually confirm the correctness of the stored data and format.  This
can be invaluable when troubleshooting a bug involving stored data.  If
there is a tool between the user and the data, one must then rely upon
the correctness of the tool to determine the correctness of the data.
In a case like this, nothing beats the evidence of one's eyes, IMHO.  

In their book, "The Pragmatic Programmer: From Journeyman to Master"
(Addison Wesley Professional), Andrew Hunt and David Thomas give another
reason for storing data in human readable form:

The problem with most binary formats is that the context necessary 
to understand the data is separate from the data itself. You are 
artificially divorcing the data from its meaning. The data may 
as well be encrypted; it is absolutely meaningless without the 
application logic to parse it. With plain text, however, you can 
achieve a self-describing data stream that is independent of the 
application that created it.

Tip 20

Keep Knowledge in Plain Text

> > The other strength that I can see is this:  Once data is in this
> > format, and a tool has been written to parse it,  data can be added
> > to the structure (more elements) and the original tool will not be
> > broken by this.  Whatever it is parsed for is found and the extra is
> > ignored.
> 
> But this is a very real plus point for XML.
> And this IMHO is the biggest single reason for using it, if you have
> data where the very structure itself is changing yet the same file
> has to be readable by old and new clients then XML is a good choice.

No argument there.  
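
A minimal sketch of that behaviour, using ElementTree from the Python
standard library: the "old" reader below asks only for the elements it
knows about, so a newer document with extra elements parses to the same
result. The order document and its element names are invented for the
example.

```python
import xml.etree.ElementTree as ET

def read_order(xml_text):
    """An 'old' client: it only knows about <id> and <quantity>."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.findtext("id"),
        "quantity": int(root.findtext("quantity")),
    }

old_format = "<order><id>A17</id><quantity>3</quantity></order>"
# A later version of the format adds <priority> and <note>.
new_format = (
    "<order><id>A17</id><priority>high</priority>"
    "<quantity>3</quantity><note>gift wrap</note></order>"
)

print(read_order(old_format))
print(read_order(new_format))
```

This only holds when the reader looks elements up by name; a reader
that depends on element position would still break.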

> > Without a doubt, the overhead XML adds over say, something as simple
> > as CSV is considerable, and XML would appear to be rather more hard
> > to work with in things like Python and PERL.
> 
> Considerable is an understatement, it's literally up to 10 or 20 times
> more space and that means bandwidth and CPU resource to
> process it.
>
> Using XML as a storage medium - a file - is not too bad, you suck
> it up, process it and forget the file. My big gripe is that people
> are increasingly trying to use XML as the payload in comms systems,
> sending XML messages around. This is crazy! The extra cost of the
> network and hardware needed to process that kind of architecture
> is usually far higher than the minimal savings it gives in developer
> time.
> [As an example I recently had to uplift the bandwidth of the
> intranet pipe in one of our buildings from 4Mb to a full ATM pipe
> of 34Mb just to accommodate a system 'upgrade' that now used XML.
> That raised the network operations cost of that one building
> from $10k per year to over $100k! - The software upgrade by
> contrast was only a one-off cost of $10K]

This is an example of the resource balancing act that computer people
have been faced with since the beginning.  The most scarce/expensive
resource dictates the program's/system's design.  In Alan's example high
speed bandwidth is the limiting resource.  A data transmission method
that fails to minimize use of that resource is therefore a bad solution.


Python itself is a result of this balancing act.  Interpreted languages
like Basic were invented to overcome the disadvantages of writing
programs in machine-readable, human-unfriendly formats.  Compiled
languages like C were invented to overcome the slow execution speed of
interpreted programs.  As processor speeds increased and execution times
dropped, interpreted languages like Python once again became viable for
large scale programs.

> > So, I think XML has its place but I will not fault anyone for
> > trying to make it easier to get code to work.
> 
> Absolutely agree with that. Just be careful how you use it and
> think of the real cost impact you may be having if it's your choice.
> Your customers will thank you.

So here's my off-topic question: Ajax is being touted as the 'best-known
method' (BKM) for 

Re: [Tutor] XML: Expletive Deleted

2006-06-12 Thread doug shawhan
Kent, Danny, Lawrence, et al.

Thanks! 

I was kind of cringing as I sent this plaint/rant, but it seems I'm not
the only one who has had trouble grokking DOM. I spanked the problem
temporarily with regex, but can now actually fix it properly. 

Appreciate all the help!

On 6/10/06, Kent Johnson <[EMAIL PROTECTED]> wrote:
> In my opinion the standard DOM models are the most awkward way to deal
> with XML. If you are trying to get data from HTML on a web page, look at
> BeautifulSoup. For general XML processing, look at ElementTree. They are
> both simpler than DOM.
> http://www.crummy.com/software/BeautifulSoup/
> http://effbot.org/zone/element.htm
>
> Kent



Re: [Tutor] XML: Expletive Deleted

2006-06-10 Thread Kent Johnson
In my opinion the standard DOM models are the most awkward way to deal 
with XML. If you are trying to get data from HTML on a web page, look at 
BeautifulSoup. For general XML processing, look at ElementTree. They are 
both simpler than DOM.
http://www.crummy.com/software/BeautifulSoup/
http://effbot.org/zone/element.htm
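
For comparison with the minidom code elsewhere in this thread, here is
a rough ElementTree sketch. It assumes the ElementTree API as shipped
in the Python standard library (xml.etree.ElementTree); the response
document is made up, and only the ItemID tag name comes from the
thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical response document; only the ItemID tag name comes from the thread.
data = """<Response>
  <Item><ItemID>1001</ItemID></Item>
  <Item><ItemID>1002</ItemID></Item>
</Response>"""

root = ET.fromstring(data)
# iter() walks the whole tree, much like getElementsByTagName in minidom,
# and each element carries its text directly in the .text attribute.
item_ids = [element.text for element in root.iter("ItemID")]
print(item_ids)
```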

Kent



Re: [Tutor] XML: Expletive Deleted

2006-06-09 Thread lawrence wang
>  >> for item in itemIDs:
>  >> print item
>
>  yields
>
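>  <DOM Element: ItemID at 0x...>
>  <DOM Element: ItemID at 0x...>
>  <DOM Element: ItemID at 0x...>
>  <DOM Element: ItemID at 0x...>
>  <DOM Element: ItemID at 0x...>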
>
>  Okay, no problem. Now all I have to do is figure out which
> particlular.string.of.words.interconnected.by.periods to
> pass to extract the values.
>
> >> for item in itemIDs:
> >> print item.nodeValue
>
> Seems logical:
>
> None
> None
> None
> None
> None

try dir(item) to see what attributes the item has, and try the ones
that sound right. e.g.:

>>> from xml.dom.minidom import parse, parseString
>>> resp = parseString("<bottom>foo</bottom>")
>>> bottom = resp.getElementsByTagName("bottom")
>>> bottom
[<DOM Element: bottom at 0x...>]
>>> dir(bottom[0])
['ATTRIBUTE_NODE', ...long list snipped..., 'writexml']
>>> bottom[0].hasChildNodes()
True
>>> bottom[0].childNodes
[<DOM Text node "foo">]
>>> dir(bottom[0].childNodes[0])
['ATTRIBUTE_NODE', ...long list snipped..., 'writexml']
>>> bottom[0].childNodes[0].data
u'foo'

so you see, with "value", there's an invisible text node.
it's one of the quirks of xml, i guess. then the attribute you're
looking for is "data", not "nodeValue".

in summary: instead of item.nodeValue, item.childNodes[0].data.
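
a slightly more defensive version of the same idea, as a sketch: the
helper below (the name is mine, not part of minidom) joins every text
child instead of assuming there is exactly one, so an empty element
gives "" rather than an IndexError. the sample document is made up.

```python
from xml.dom.minidom import parseString

def element_text(element):
    """Join the data of all text-node children of a minidom element."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )

# Hypothetical document; only the ItemID tag name comes from the thread.
doc = parseString("<Items><ItemID>1001</ItemID><ItemID>1002</ItemID><ItemID/></Items>")
for item in doc.getElementsByTagName("ItemID"):
    # nodeValue is None for element nodes; the text lives in a child node.
    print(repr(item.nodeValue), repr(element_text(item)))
```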

--lawrence


Re: [Tutor] XML: Expletive Deleted

2006-06-09 Thread Danny Yoo
>>> from xml.dom.minidom import parse, parseString
>
>>> data = response.read()
>>> connection.close()
>>> response = parseString(data)
>>> itemIDs = response.getElementsByTagName("ItemID")
>>> response.unlink()
 ^


Hi Doug,

What's going on here?  Why unlink()?
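
One possible reason to be wary of it, sketched below with a made-up
document: unlink() breaks the internal references in the tree, so
element nodes fetched beforehand lose their children once it has been
called. This is an illustration of that behaviour in the standard
library's minidom, not a diagnosis of the original script.

```python
from xml.dom.minidom import parseString

# Hypothetical document; only the ItemID tag name comes from the thread.
doc = parseString("<Items><ItemID>1001</ItemID></Items>")
item = doc.getElementsByTagName("ItemID")[0]
before = len(item.childNodes)   # the text node is still attached here

doc.unlink()                    # breaks internal references throughout the tree
after = len(item.childNodes)    # the element no longer has any children

print(before, after)
```

If the nodes are needed later, it seems safer to pull the values out
first and call unlink() last, or to leave it out altogether.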



> Okay, no problem. Now all I have to do is figure out which 
> particlular.string.of.words.interconnected.by.periods to pass to extract 
> the values.
>
>>> for item in itemIDs:
>>> print item.nodeValue


You may want to look at the minidom example here:

 http://www.python.org/doc/lib/dom-example.html

Does this help?