Re: load multiple same format csv files in one SYSCS_UTIL.SYSCS_IMPORT_TABLE call

Rick Hillegas Tue, 03 Jan 2012 06:34:33 -0800

On 12/29/11 5:27 PM, 鲍少明 wrote:

I think this could be an improvement.
Can we extend the existing SYSCS_UTIL.SYSCS_IMPORT_TABLE system procedure
to support bulk-loading multiple same-format csv files in one call?
I think it would not be difficult to use something like SequenceInputStream to load those files. I hope this can be a performance enhancement for loading files which have been split into pieces. I was trying to use multiple threads, but I find that the procedure locks the whole table. What should I do if using the SYSCS_UTIL.SYSCS_IMPORT_TABLE procedure with Derby in in-memory mode is still not fast enough?
Best Regards
-Clark

Hi Clark,

I probably don't understand the problem you are addressing, so this response may talk past your issue. If you can give us more detail about your problem, we may be able to be more helpful.

Many limitations of bulk import can be bypassed by using table functions. You will enjoy the bulk-import optimizations if you insert into an empty table as follows:


insert into myTable select * from table( myTableFunction( ... ) ) s

Table functions are described here: http://db.apache.org/derby/docs/10.8/devguide/devguide-single.html#cdevspecialtabfuncs

If your csv files are produced by dumping data from an original data source, then you may be able to eliminate the csv files altogether by writing a table function which directly siphons data out of the original data source. You may want to take a look at ForeignTableVTI, a table function attached to https://issues.apache.org/jira/browse/DERBY-4962. ForeignTableVTI is useful for doing bulk migration of data out of other relational databases.

In addition, it should be possible to wrap a table function around a SequenceInputStream.
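
As a rough sketch of the SequenceInputStream idea, the class below chains several csv streams into one and reads the rows back. It only shows the stream-concatenation half; the table function that would hand these rows to Derby as a ResultSet is not shown. The class name CsvConcat and its methods are made up for illustration. The sketch assumes UTF-8 files that all share one format, have no header row, and contain no blank lines that matter, since a newline is inserted after each file and empty lines are skipped.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Derby.
class CsvConcat {

    // Chain the given streams into one. A newline is appended after each
    // part so that a file lacking a trailing newline does not run into
    // the first row of the next file.
    public static InputStream concat(List<InputStream> parts) {
        List<InputStream> pieces = new ArrayList<>();
        for (InputStream part : parts) {
            pieces.add(part);
            pieces.add(new ByteArrayInputStream(
                    "\n".getBytes(StandardCharsets.UTF_8)));
        }
        return new SequenceInputStream(Collections.enumeration(pieces));
    }

    // Read every non-empty line from the combined stream. A real table
    // function would parse each line into columns instead of collecting
    // the raw text.
    public static List<String> readRows(InputStream in) {
        List<String> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    rows.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

With real files, each element of the list passed to concat could be a FileInputStream opened on one of the csv pieces. SequenceInputStream reads them one after another and closes each part once it is exhausted.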


Hope this helps,
-Rick
