On 12/29/11 5:27 PM, 鲍少明 wrote:
I think this could be an improvement.
Can we extend the existing SYSCS_UTIL.SYSCS_IMPORT_TABLE system procedure
to support bulk-loading multiple CSV files of the same format in one call?
I think it will not be difficult to use something like
SequenceInputStream to load those files.
I hope this would improve performance when loading files
that have been split into pieces.
I tried using multiple threads, but I found that the procedure
locks the whole table.
What should I do if using the SYSCS_UTIL.SYSCS_IMPORT_TABLE procedure
with Derby in in-memory mode is still not fast enough?
Best Regards
-Clark
Hi Clark,
I probably don't understand the problem you are addressing so this
response may talk past your issue. If you can give us more detail about
your problem we may be able to be more helpful.
Many limitations of bulk import can be bypassed by using table
functions. You will enjoy the bulk-import optimizations if you insert
into an empty table as follows:
insert into myTable select * from table( myTableFunction( ... ) ) s
Table functions are described here:
http://db.apache.org/derby/docs/10.8/devguide/devguide-single.html#cdevspecialtabfuncs
If your CSV files are produced by dumping data from an original data
source, then you may be able to eliminate the csv files altogether by
writing a table function which directly siphons data out of the original
data source. You may want to take a look at ForeignTableVTI, a table
function attached to https://issues.apache.org/jira/browse/DERBY-4962.
ForeignTableVTI is useful for doing bulk migration of data out of other
relational databases.
In addition, it should be possible to wrap a table function around a
SequenceInputStream.
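As a rough sketch of the SequenceInputStream part only: the helper below chains several CSV files into a single reader, opening each file lazily as the previous one is exhausted. The class name MultiCsvReader and the method openAll are hypothetical names chosen for illustration, and UTF-8 is an assumed encoding. The table function itself (a public static method returning a java.sql.ResultSet that parses these lines into columns) is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: presents several CSV files as one continuous stream.
final class MultiCsvReader {

    private MultiCsvReader() {}

    // Assumes every file ends with a line terminator; otherwise the last row
    // of one file would run into the first row of the next.
    static BufferedReader openAll(List<Path> csvFiles) {
        final Iterator<Path> files = csvFiles.iterator();

        // Open each file only when SequenceInputStream asks for it, so that
        // only one file handle is held open at a time.
        Enumeration<InputStream> streams = new Enumeration<InputStream>() {
            @Override
            public boolean hasMoreElements() {
                return files.hasNext();
            }

            @Override
            public InputStream nextElement() {
                try {
                    return Files.newInputStream(files.next());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };

        // SequenceInputStream closes each underlying stream when it reaches
        // its end and then moves on to the next one.
        InputStream all = new SequenceInputStream(streams);
        return new BufferedReader(new InputStreamReader(all, StandardCharsets.UTF_8));
    }
}
```

A table function could then call MultiCsvReader.openAll(...) once and hand back rows by reading lines from the returned reader, so that a single insert ... select statement loads all of the pieces.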
Hope this helps,
-Rick