Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-16 Thread Alan Gauld
 [ The only SOA/XML book that addresses this side of XML usage
 is the excellent SOA - A Field Guide by Peter Erls. Erls also
 suggests some mitigating strategies to get round it.]

Oops, don't rely on memory...

That is Thomas Erl not Peter Erls.

And of course there may be other SOAP/XML books that deal with these
issues, but Erl's book is the only one I've read that does so!

Alan G.

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-15 Thread Alan Gauld
Just picked this up after being out for most of the week...

Carroll, Barry [EMAIL PROTECTED] wrote in message

 One reason for choosing a human-readable format is the desire to
 visually confirm the correctness of the stored data and format.

That's a very dangerous assumption. How do you detect unprintable
characters, tabs instead of spaces, trailing spaces on a line, etc.?
While text representations are helpful, you should never rely on the
human eye to validate a data file.

 This can be invaluable when troubleshooting a bug involving stored data.
 If there is a tool between the user and the data, one must then rely upon
 the correctness of the tool to determine the correctness of the data.

Or the correctness of the eye. I know which one I prefer - a tested tool.
The human eye is not a data parser, but it flatters to deceive by being
nearly good enough.

 In a case like this, nothing beats the evidence of one's eyes, IMHO.

Almost anything beats the human eye IME :-)
Actually if you must use eyes do so on a hex dump of the file, that
is usually reliable enough if you can read hex...
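
As a rough illustration of letting a tool do that checking, here is a minimal Python sketch. The function names (find_invisible_problems, hex_dump) and the sample data are made up for this example; it only flags tabs, trailing whitespace and unprintable characters, and prints a simple hex dump for anyone who wants to look at the bytes directly.

```python
def find_invisible_problems(text):
    """Return (line_number, description) pairs for things the eye tends to miss."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if "\t" in line:
            problems.append((number, "tab character"))
        if line != line.rstrip(" \t"):
            problems.append((number, "trailing whitespace"))
        for ch in line:
            # Tabs are reported above, so only flag other unprintable characters.
            if ch != "\t" and not ch.isprintable():
                problems.append((number, "unprintable character %r" % ch))
    return problems


def hex_dump(data, width=16):
    """Format bytes as offset, hex values and printable ASCII, `width` bytes per row."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % byte for byte in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append("%08x  %-*s  %s" % (offset, width * 3 - 1, hex_part, text_part))
    return "\n".join(rows)


# Hypothetical sample: looks fine on screen, but hides three problems.
sample = "name,age\nalice,30 \nbob\t25\ncarol,\x0041\n"

for number, description in find_invisible_problems(sample):
    print("line %d: %s" % (number, description))
print(hex_dump(sample.encode("ascii")))
```

The checker is deliberately simple; a real data file would normally be validated by the parser that is going to consume it.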

 In their book, The Pragmatic Programmer: From Journeyman to Master
 (Addison Wesley Professional), Andrew Hunt and David Thomas give another
 reason for storing data in human readable form:

The problem with most binary formats is that the context necessary
to understand the data is separate from the data itself. You are
artificially divorcing the data from its meaning. The data may
as well be encrypted; it is absolutely meaningless without the
application logic to parse it. With plain text, however, you can
achieve a self-describing data stream that is independent of the
application that created it.

But at a very high risk. I do not dislike text files BTW and am not
suggesting that text should not be used but its parsing is best left
to machines, the eye is only a rough and unreliable guide.

And if your data volumes are high go with binary, you'll need tools
to parse a lot of data anyway, you might as well save the space!

The Hunt/Thomas book is excellent BTW - I recommend it highly.
Even though I disagree with several of their suggestions(*) I agree
with far more.

(*)They recommend sticking with one text editor whereas I use
about 5 or 6 on a regular basis depending on the job I'm doing and
the platform I'm working on. Emacs on X Windows for new files
but vim for quick fix ups, vim on Windows for most things,
ed or ex for text based email or over a phone line.

 This is an example of the resource balancing act that computer people
 have been faced with since the beginning.  The most scarce/expensive
 resource dictates the program's/system's design.  In Alan's example high
 speed bandwidth is the limiting resource.  A data transmission method
 that fails to minimize use of that resource is therefore a bad solution.

Unfortunately the software industry is full of people who by and large
don't understand networks so they just ignore them. At least that's
my experience! SOA using SOAP/XML is probably the most inefficient
and unreliable set of data networking technologies you could possibly
come up with. But the focus is on cutting developer cost because the
people inventing it are developers! In almost every sizeable project
the cost of development will be significantly less than the cost of
deployment - in most of my projects it usually works out something like:

development - 15%
deployment - 30%
support - 15%
training - 25%
documentation - 5%
management overhead - 10%

Saving 25% of development costs reduces total cost by around 4%, but
if that puts deployment costs up by 10% the net gain is only 1%!
And in XML's case it often puts deployment costs up by 100%
- a net loss of 26%!!
Now those figures come from a typical project that I work on, which
probably has a total budget of between $10 million and $100 million.
If your budget is smaller, say less than $1 million, then the balance
may well change. But over 50% of the IT industry works on projects
with <$1m budgets according to both Datamation and Infoweek.
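
The arithmetic behind those figures can be reworked with a few lines of Python. This is only a sketch using the percentage shares listed above; the function name net_change and the dictionary layout are my own, and it reports exact values rather than the rounded ones quoted in the text.

```python
# Cost shares from the breakdown above, as percentages of total project cost.
shares = {
    "development": 15,
    "deployment": 30,
    "support": 15,
    "training": 25,
    "documentation": 5,
    "management overhead": 10,
}


def net_change(shares, dev_saving, deploy_increase):
    """Net change in total cost, in percentage points of the original budget.

    dev_saving and deploy_increase are fractions (0.25 means 25%).
    A negative result means the project got cheaper overall.
    """
    saved = shares["development"] * dev_saving
    added = shares["deployment"] * deploy_increase
    return added - saved


assert sum(shares.values()) == 100

for deploy_increase in (0.0, 0.10, 1.00):
    change = net_change(shares, dev_saving=0.25, deploy_increase=deploy_increase)
    print("deployment +%3d%%: net change %+.2f points" % (deploy_increase * 100, change))
```

Changing the shares dictionary shows how quickly the balance shifts for projects where development is a larger slice of the budget.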

[ The only SOA/XML book that addresses this side of XML usage
is the excellent SOA - A Field Guide by Peter Erls. Erls also
suggests some mitigating strategies to get round it.]

 So here's my off-topic question: Ajax is being touted as the 'best-known
 method' (BKM) for making dynamic browser-based applications, and XML is
 the BKM for transferring data in Ajax land.  If XML is a bad idea for
 network data-transfer, what medium should be used instead?

The example I gave of having to upgrade the site's network was actually
an early adopter of XML/Ajax architecture! There are lots of other data
formats around - some are even self describing (CSV and TLV are cases).
Others simply hold the definition in an accessible library so you only
have to transport it once - eg IDL and ASN.1 - or optionally compile it
into your code for maximum efficiency. ASN.1 is typically around 50

Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-12 Thread Carroll, Barry
Alan, Ralph, et al:

This is a little off-topic, I guess, being not directly related to
Python.  Oh, well.  Here are a couple of personal opinions and a
question about XML.

 -Original Message-
 Date: Sun, 11 Jun 2006 08:55:17 +0100
 From: Alan Gauld [EMAIL PROTECTED]
 Subject: Re: [Tutor] Expletive Deleted
 To: Ralph H. Stoos Jr. [EMAIL PROTECTED],
Tutor@python.org
 
  I think XML is a tool that allows non-programmers to look at structured
  data and have it in a human readable form that gives us a chance of
  understanding that structure.
 
 That's not a great reason to choose a file format IMHO.
 Tools can be written to display data in a readable format.
 For example SQL can be used to view the data in a database.
 File formats should be designed to store data, compactly
 and with easy access.

One reason for choosing a human-readable format is the desire to
visually confirm the correctness of the stored data and format.  This
can be invaluable when troubleshooting a bug involving stored data.  If
there is a tool between the user and the data, one must then rely upon
the correctness of the tool to determine the correctness of the data.
In a case like this, nothing beats the evidence of one's eyes, IMHO.  

In their book, The Pragmatic Programmer: From Journeyman to Master
(Addison Wesley Professional), Andrew Hunt and David Thomas give another
reason for storing data in human readable form:

The problem with most binary formats is that the context necessary 
to understand the data is separate from the data itself. You are 
artificially divorcing the data from its meaning. The data may 
as well be encrypted; it is absolutely meaningless without the 
application logic to parse it. With plain text, however, you can 
achieve a self-describing data stream that is independent of the 
application that created it.

Tip 20

Keep Knowledge in Plain Text

  The other strength that I can see is this:  Once data is in this format,
  and a tool has been written to parse it,  data can be added to the
  structure (more elements) and the original tool will not be broken by
  this.  Whatever it is parsed for is found and the extra is ignored.
 
 But this is a very real plus point for XML.
 And this IMHO is the biggest single reason for using it, if you have
 data where the very structure itself is changing yet the same file
 has to be readable by old and new clients then XML is a good choice.

No argument there.  
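
A small sketch of that point, using Python's standard xml.etree.ElementTree module: a reader that looks up only the elements it knows by name keeps working when newer documents carry extra elements. The order documents and the read_order function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical "version 1" document that the old client was written against.
OLD_DOCUMENT = """<order>
  <id>42</id>
  <total>19.99</total>
</order>"""

# Hypothetical later version with extra elements added to the structure.
NEW_DOCUMENT = """<order>
  <id>42</id>
  <currency>GBP</currency>
  <total>19.99</total>
  <shipping><method>post</method></shipping>
</order>"""


def read_order(xml_text):
    """An 'old client': it finds the elements it needs and ignores the rest."""
    root = ET.fromstring(xml_text)
    return {
        "id": int(root.findtext("id")),
        "total": float(root.findtext("total")),
    }


print(read_order(OLD_DOCUMENT))
print(read_order(NEW_DOCUMENT))
```

This only holds while new elements are added and existing ones keep their names and positions in the tree; a reader that depends on element order or on a strict schema would not be as tolerant.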

  Without a doubt, the overhead XML adds over say, something as simple as
  CSV is considerable, and XML would appear to be rather more hard to work
  with in things like Python and PERL.
 
 Considerable is an understatement, it's literally up to 10 or 20 times
 more space and that means bandwidth and CPU resource to
 process it.
 
 Using XML as a storage medium - a file - is not too bad, you suck
 it up, process it and forget the file. My big gripe is that people are
 increasingly trying to use XML as the payload in comms systems,
 sending XML messages around. This is crazy! The extra cost of the
 network and hardware needed to process that kind of architecture
 is usually far higher than the minimal savings it gives in developer
 time.
 [As an example I recently had to uplift the bandwidth of the
 intranet pipe in one of our buildings from 4Mb to a full ATM pipe
 of 34Mb just to accommodate a system 'upgrade' that now used XML.
 That raised the network operations cost of that one building
 from $10k per year to over $100k! - The software upgrade by
 contrast was only a one-off cost of $10K]

This is an example of the resource balancing act that computer people
have been faced with since the beginning.  The most scarce/expensive
resource dictates the program's/system's design.  In Alan's example high
speed bandwidth is the limiting resource.  A data transmission method
that fails to minimize use of that resource is therefore a bad solution.
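
To get a feel for the size difference on your own data, the same records can be serialised both ways and measured. The sketch below uses only the standard csv and xml.etree.ElementTree modules; the customer records, field names and element names are invented, and the ratio it prints depends heavily on tag names and field lengths, so it should not be read as confirming any particular multiple.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records; real data will give a different ratio.
fields = ["id", "name", "balance"]
records = [
    {"id": str(n), "name": "customer%d" % n, "balance": "%d.00" % (n * 10)}
    for n in range(1, 101)
]


def to_csv(rows):
    """Serialise rows as CSV with a header line, returned as bytes."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def to_xml(rows):
    """Serialise rows as one element per record and one child per field."""
    root = ET.Element("customers")
    for row in rows:
        item = ET.SubElement(root, "customer")
        for field in fields:
            ET.SubElement(item, field).text = row[field]
    return ET.tostring(root, encoding="utf-8")


csv_bytes = to_csv(records)
xml_bytes = to_xml(records)
print("CSV: %d bytes" % len(csv_bytes))
print("XML: %d bytes" % len(xml_bytes))
print("XML/CSV ratio: %.1f" % (len(xml_bytes) / len(csv_bytes)))
```

Namespaces, attributes, indentation and SOAP envelopes all add to the XML side, which this toy example leaves out.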


Python itself is a result of this balancing act.  Interpreted languages
like Basic were invented to overcome the disadvantages of writing
programs in machine-readable, human-unfriendly formats.  Compiled
languages like C were invented to overcome the slow execution speed of
interpreted programs.  As processor speeds increased and execution times
dropped, interpreted languages like Python once again became viable for
large scale programs.

  So, I think XML has its place but I will not fault anyone for trying to
  make it easier to get code to work.
 
 Absolutely agree with that. Just be careful how you use it and
 think of the real cost impact you may be having if it's your choice.
 Your customers will thank you.

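So here's my off-topic question: Ajax is being touted as the 'best-known
method' (BKM) for making dynamic browser-based applications, and XML is
the BKM for transferring data in Ajax land.  If XML is a bad idea for
network data-transfer, what medium should be used instead?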

Re: [Tutor] XML: Expletive Deleted (OT)

2006-06-12 Thread Kent Johnson
Carroll, Barry wrote:
 So here's my off-topic question: Ajax is being touted as the 'best-known
 method' (BKM) for making dynamic browser-based applications, and XML is
 the BKM for transferring data in Ajax land.  If XML is a bad idea for
 network data-transfer, what medium should be used instead?

JSON is a popular alternative to XML for Ajax applications. It is much 
lighter-weight than XML and easier to parse in JavaScript.
http://json.org/

Kent
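
For anyone wanting to try JSON from the Python side, here is a minimal sketch using the json module that ships with current Python versions (it was not in the standard library when this thread was written). The record contents are invented; the example encodes a dictionary to a compact JSON string, as a server might send to a browser, and decodes it again.

```python
import json

# Hypothetical record a server might return to an Ajax request.
record = {"id": 42, "name": "Barry", "tags": ["ajax", "python"], "active": True}

# separators=(",", ":") drops the optional whitespace to keep the payload small.
payload = json.dumps(record, separators=(",", ":"))
print(payload)
print("%d bytes" % len(payload.encode("utf-8")))

# Decoding gives back plain dicts, lists, strings, numbers and booleans.
decoded = json.loads(payload)
print(decoded["tags"])
```

On the browser side the same string maps directly onto JavaScript objects and arrays, which is a large part of why it is easier to handle there than XML.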
