On 12/29/11 5:27 PM, 鲍少明 wrote:
I think this could be an improvement.
Can we extend the existing SYSCS_UTIL.SYSCS_IMPORT_TABLE system procedure
to support bulk-loading multiple CSV files of the same format in one call?
I think it would not be difficult to use something like SequenceInputStream to load those files. I hope this could improve performance when loading files that have been split into pieces. I tried using multiple threads, but I found that the procedure locks the whole table. What should I do if using the SYSCS_UTIL.SYSCS_IMPORT_TABLE procedure
with Derby in in-memory mode is still not fast enough?
Best Regards,
-Clark

Hi Clark,

I probably don't understand the problem you are addressing, so this response may talk past your issue. If you can give us more detail about your problem, we may be able to be more helpful.

Many limitations of bulk import can be bypassed by using table functions. You will enjoy the bulk-import optimizations if you insert into an empty table as follows:

insert into myTable select * from table( myTableFunction( ... ) ) s

Table functions are described here: http://db.apache.org/derby/docs/10.8/devguide/devguide-single.html#cdevspecialtabfuncs

If your csv files are produced by dumping data from an original data source, then you may be able to eliminate the csv files altogether by writing a table function which directly siphons data out of the original data source. You may want to take a look at ForeignTableVTI, a table function attached to https://issues.apache.org/jira/browse/DERBY-4962. ForeignTableVTI is useful for doing bulk migration of data out of other relational databases.

In addition, it should be possible to wrap a table function around a SequenceInputStream.
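
As a rough illustration of the SequenceInputStream idea, here is a minimal sketch in plain Java. It only shows the stream-chaining part: several CSV inputs are joined into one stream and read row by row. The class name CsvConcat and its two methods are hypothetical names made up for this example, and the comma splitting is deliberately naive (no quoted fields). A real table function would still need to expose these rows through a java.sql.ResultSet, which is not shown here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Derby: chains several CSV inputs
// into one stream and reads the rows in order.
public class CsvConcat {

    // Joins the given streams into a single stream. A newline is inserted
    // between consecutive parts so that the last row of one file cannot
    // merge with the first row of the next when a file lacks a trailing
    // newline.
    public static InputStream concat(List<InputStream> parts) {
        List<InputStream> chained = new ArrayList<>();
        for (InputStream part : parts) {
            if (!chained.isEmpty()) {
                chained.add(new ByteArrayInputStream(
                        "\n".getBytes(StandardCharsets.UTF_8)));
            }
            chained.add(part);
        }
        return new SequenceInputStream(Collections.enumeration(chained));
    }

    // Reads every non-blank line and splits it on commas.
    // Assumption: fields contain no quoted or embedded commas.
    public static List<String[]> readRows(InputStream in) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip the separators added by concat()
                }
                rows.add(line.split(",", -1));
            }
        }
        return rows;
    }
}
```

You would pass one FileInputStream per split file to concat() and hand the combined stream to whatever row source backs your table function, so the whole load can run as a single insert.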

Hope this helps,
-Rick
