Hello Torquies,

I've been trying to get Torque's XML dump/load tasks to work. In the process, I discovered several deficiencies:
1. XML output stores column values as attributes. Long data columns always have to be escaped with XML entities (i.e. CDATA is not allowed for attribute values). When editing the XML data by hand, it's convenient to have data between elements (as body data) and the option to use CDATA.
2. Existing code does not escape illegal attribute chars, e.g. '<' or embedded double quotes.
3. The "project-datasql" task loads the entire XML data file into memory as a big honkin' DOM object, then passes it off to a bunch of Velocity scripts for generating the SQL. For any moderately sized data set, this DOM load takes too much time and creates a "VM nightmare on earth" for my poor desktop.
4. Most important and most difficult: since the "project-datasql" task generates actual SQL that must in turn be loaded with some kind of DB-specific command line util (e.g. mysql), all string-ish datatypes must be escaped properly in whatever DB-specific syntax is required (e.g. using a \ to escape embedded quotes, or doubling them, or whatever).

So, I rolled my own Ant tasks that use JDBC->XML (DatabaseMetaData with generic SELECT queries) for the dump and SAX XML->JDBC _directly_ (no intermediate SQL file, generic INSERTs) for the load. This solution is nice and simple, flexible and not so coupled with Torque's "quirks" errr, functionality.

1. All columns become first-class elements (along with tables, of course).
2. Escaping for all column values is handled. If a column value's length exceeds a certain threshold and it contains chars that must be escaped, the value is simply wrapped in a CDATA section.
3. Uses the event-driven SAX API, which does not require the whole XML file to be loaded into some data structure... much faster on my poor little desktop.
4. Loads data directly through JDBC so that the driver handles DB-specific escaping issues. Strings are just set as column values on the insert statement with JDBC PreparedStatements.
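To make the four points above a bit more concrete, here is a minimal sketch of the idea in Java. It is not the actual task code: the class name `XmlDataSketch`, the `RowSink` interface, the threshold parameter and the assumed XML layout (root element, then one element per table row, then one child element per column) are all made up for illustration. It also binds every column with `setString`, where a real task would presumably pick the setter from the column type reported by DatabaseMetaData.

```java
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; names and XML layout are assumptions, not Torque's.
final class XmlDataSketch {

    // Dump side: short values get entity escapes, long ones get CDATA.
    static String encodeValue(String value, int cdataThreshold) {
        boolean needsEscape = value.indexOf('<') >= 0
                || value.indexOf('&') >= 0 || value.indexOf('>') >= 0;
        if (!needsEscape) return value;
        if (value.length() > cdataThreshold) {
            // "]]>" may not appear inside CDATA, so split it across two sections.
            return "<![CDATA[" + value.replace("]]>", "]]]]><![CDATA[>") + "]]>";
        }
        return value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Load side: a generic INSERT with one '?' placeholder per column.
    // Table/column names come from the XML; real code should validate them
    // (e.g. against DatabaseMetaData) before putting them into SQL.
    static String insertSql(String table, List<String> columns) {
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table)
                .append(" (").append(String.join(", ", columns)).append(") VALUES (");
        for (int i = 0; i < columns.size(); i++) sql.append(i == 0 ? "?" : ", ?");
        return sql.append(")").toString();
    }

    // Receives one completed row at a time, so nothing is held in memory.
    interface RowSink {
        void row(String table, Map<String, String> row) throws Exception;
    }

    // Assumed layout: depth 1 = root, depth 2 = table row, depth 3 = column.
    static final class RowHandler extends DefaultHandler {
        private final RowSink sink;
        private final StringBuilder text = new StringBuilder();
        private int depth = 0;
        private String table;
        private Map<String, String> row;

        RowHandler(RowSink sink) { this.sink = sink; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            depth++;
            if (depth == 2) { table = qName; row = new LinkedHashMap<>(); }
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per element
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (depth == 3) {
                row.put(qName, text.toString());
            } else if (depth == 2) {
                try { sink.row(table, row); } catch (Exception e) { throw new SAXException(e); }
            }
            depth--;
        }
    }

    // Sink that lets the JDBC driver deal with quoting: values are bound, not pasted.
    static RowSink jdbcSink(Connection conn) {
        return (table, row) -> {
            List<String> cols = new ArrayList<>(row.keySet());
            try (PreparedStatement ps = conn.prepareStatement(insertSql(table, cols))) {
                for (int i = 0; i < cols.size(); i++) ps.setString(i + 1, row.get(cols.get(i)));
                ps.executeUpdate();
            }
        };
    }

    static void load(String xml, RowSink sink) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new RowHandler(sink));
    }
}
```

With a real connection the load would be something like `XmlDataSketch.load(xml, XmlDataSketch.jdbcSink(conn))`. Keeping the sink separate from the SAX handler is just a convenience for the sketch, so the parsing half can be tried without a database.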
My offer is this: I will gladly donate this code to the Apache/Jakarta/Torque project as a replacement for the "project-datadump"/"project-datasql" tasks if the community thinks it's sensible. Lemme know what y'all think.

russ
