Sounds great except for one thing: I think it would be better to support
columns as both attributes and elements - attributes do just fine in most
cases, and are more compact & human readable.
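
For what it's worth, accepting both forms should cost very little on the SAX side. Below is a rough sketch, not Russ's code (which I haven't seen). The layout is an assumption: a root element, one child element per row named after its table, and columns given either as attributes or as child elements of the row. The class name RowReader and the sample element names are made up for illustration, and the table name is ignored for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical reader: collects each row's columns from attributes and child elements. */
public class RowReader extends DefaultHandler {
    public final List<Map<String, String>> rows = new ArrayList<>();
    private Map<String, String> row;   // row being built, null outside a row element
    private StringBuilder text;        // body text of the current column element
    private int depth = 0;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        depth++;
        if (depth == 2) {
            // Assumed layout: the root is depth 1, every row element is depth 2.
            row = new LinkedHashMap<>();
            for (int i = 0; i < atts.getLength(); i++) {
                row.put(atts.getQName(i), atts.getValue(i));   // attribute form
            }
        } else if (depth == 3) {
            text = new StringBuilder();                        // element form
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // CDATA content arrives here too, already unwrapped by the parser.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (depth == 3) {
            row.put(qName, text.toString());
            text = null;
        } else if (depth == 2) {
            rows.add(row);
            row = null;
        }
        depth--;
    }

    /** Parses an in-memory document; a real task would stream from a file instead. */
    public static List<Map<String, String>> parse(String xml) throws Exception {
        RowReader handler = new RowReader();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.rows;
    }
}
```

Short values can then stay as attributes, while long or markup-heavy values move into child elements where CDATA is allowed. A real loader would hand each row to JDBC as soon as the row element closes rather than collecting rows in a list.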

--
fedor.

----
Stult's Report:
        Our problems are mostly behind us.  What we have to do now is
fight the solutions.


> -----Original Message-----
> From: Russ Trotter [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 31, 2002 12:11 AM
> To: Turbine Developers List
> Subject: XML dump/load
> 
> 
> Hello Torquies,
> 
>   I had been trying to get Torque's XML dump/load tasks to work.  In the
> process, I discovered several deficiencies:
> 
> 1. XML output stores column values as attributes. Long data columns
> always have to be escaped with XML entities (i.e. CDATA is not allowed
> for attribute values).  It's convenient when editing the XML data by
> hand to have data between elements (as body data) and the option to
> use CDATA.
> 2. Existing code does not escape illegal attribute chars, e.g. '<' or
> embedded double quotes.
> 3. The "project-datasql" task loads the entire XML data file in memory
> as a big honkin' DOM object then passes it off to a bunch of Velocity
> scripts for generating the SQL.  For any moderately sized data, this
> DOM load takes too much time and creates a "VM nightmare on earth" for
> my poor desktop.
> 4. Most importantly and most difficult was that since the
> "project-datasql" task generated actual SQL that must in turn be loaded
> with some kind of DB-specific command line util (e.g. mysql) all
> string-ish datatypes must be escaped properly in whatever DB-specific
> syntax is required (e.g. using a \ to escape embedded quotes or
> doubling them or whatever).
> 
>   So, I rolled my own Ant tasks that use JDBC->XML (DatabaseMetaData
> with generic SELECT queries) for the dump and SAX XML->JDBC _directly_
> (no intermediate SQL file, generic INSERTs).  This solution is nice and
> simple, flexible and not so coupled with Torque's "quirks" errr,
> functionality.
> 
> 1.  All columns become first-class elements (along with tables of
> course).
> 2.  Escaping for all column values is handled.  If a column value's
> length exceeds a certain threshold and it contains chars that must be
> escaped, it will simply wrap the value in a CDATA section.
> 3.  Uses the event-driven SAX API, which does not require the whole XML
> file to get loaded into some data structure.... much faster on my poor
> little desktop.
> 4.  Loads data directly through JDBC so that the driver handles
> DB-specific escaping issues.  Strings are just set as column values on
> the insert statement with JDBC PreparedStatements.
> 
>   My offering is this: I will gladly donate this code to
> Apache/Jakarta/Torque as a replacement for the
> "project-datadump/project-datasql" tasks if it seems sensible to the
> community.  Lemme know what y'all think.
> 
> russ
> 
> 

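To make points 2 and 4 of Russ's second list concrete, here is a minimal sketch of how the dump side might choose between entity escaping and CDATA, and how the load side might bind values through a PreparedStatement. None of this is taken from Russ's tasks: the class name DumpLoadSketch, the method names, and the threshold of 40 characters are all assumptions made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

/** Hypothetical helpers for the dump (XML writing) and load (JDBC insert) sides. */
public class DumpLoadSketch {
    /** Assumed cut-off; the original message only says "a certain threshold". */
    public static final int CDATA_THRESHOLD = 40;

    /** Returns the text to write as an element body for one column value. */
    public static String elementBody(String value) {
        boolean needsEscaping = value.indexOf('<') >= 0
                || value.indexOf('&') >= 0
                || value.indexOf('>') >= 0;
        if (!needsEscaping) {
            return value;
        }
        if (value.length() > CDATA_THRESHOLD) {
            // "]]>" may not appear inside a CDATA section, so split it across two.
            return "<![CDATA[" + value.replace("]]>", "]]]]><![CDATA[>") + "]]>";
        }
        // '&' must be replaced first so the entities added below stay intact.
        return value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds a generic INSERT with one '?' placeholder per column. */
    public static String insertSql(String table, List<String> columns) {
        // Table and column names are concatenated, so they must come from
        // trusted metadata (e.g. DatabaseMetaData), never from user input.
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", columns)
                + ") VALUES (" + placeholders + ")";
    }

    /** Inserts one row; the driver handles any DB-specific string escaping. */
    public static void insertRow(Connection conn, String table,
                                 List<String> columns, List<String> values)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(insertSql(table, columns))) {
            for (int i = 0; i < values.size(); i++) {
                ps.setString(i + 1, values.get(i));   // JDBC parameters are 1-based
            }
            ps.executeUpdate();
        }
    }
}
```

Because the values travel as bound parameters, embedded quotes and backslashes never pass through SQL text, which is what removes the per-database escaping problem from point 4 of the first list. A real task would probably prepare the statement once per table and reuse it for every row instead of preparing it per insert.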
--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
