Re: Load Data Question

Alexey Serbin Wed, 12 Jul 2017 19:05:47 -0700

If you use the Kudu API and set the flush mode for a session to anything but AUTO_FLUSH_SYNC, those inserts will be accumulated into batches at the client side and sent to the corresponding tablet servers in chunks. Consider using the AUTO_FLUSH_BACKGROUND mode while working with the KuduSession API (using MANUAL_FLUSH would require you to flush those batches manually before the size of the accumulated data reaches the max allowed size, which is configurable).
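
A minimal sketch of that approach with the Java client follows. The master address, table name, file name, and column names are hypothetical placeholders, and the comma split assumes no quoted fields; adapt all of them to your schema.

```java
import org.apache.kudu.client.Insert;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.PartialRow;
import org.apache.kudu.client.RowError;
import org.apache.kudu.client.SessionConfiguration;

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class CsvToKudu {
    public static void main(String[] args) throws IOException {
        // Hypothetical master address; replace with your own.
        KuduClient client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
        try {
            // Hypothetical table with columns id (INT64) and name (STRING).
            KuduTable table = client.openTable("my_table");
            KuduSession session = client.newSession();
            // Let the client batch the inserts and send them in the background.
            session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);

            try (BufferedReader reader = Files.newBufferedReader(Paths.get("data.csv"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Naive split: assumes no quoted or escaped commas.
                    String[] fields = line.split(",", -1);
                    Insert insert = table.newInsert();
                    PartialRow row = insert.getRow();
                    row.addLong("id", Long.parseLong(fields[0]));
                    row.addString("name", fields[1]);
                    session.apply(insert);
                }
            }

            // Push out whatever is still buffered, then check for row errors,
            // which background flushing collects instead of throwing.
            session.flush();
            if (session.countPendingErrors() > 0) {
                for (RowError error : session.getPendingErrors().getRowErrors()) {
                    System.err.println(error);
                }
            }
            session.close();
        } finally {
            client.close();
        }
    }
}
```

With AUTO_FLUSH_BACKGROUND, failed rows do not surface as exceptions from apply(), so checking the pending errors after the final flush matters.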

Also, if the lines in your file(s) contain data for independent rows (i.e. you are not expecting to perform upserts for some lines), you could split those lines into ranges (e.g., 0 -- 99999, 100000 -- 199999, etc.) and run multiple Kudu sessions (one per line range in the file) in parallel.
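
The range splitting can be sketched in plain Java as below. The class and method names (LineRangeLoader, splitRanges, loadInParallel) are made up for illustration and are not part of Kudu. The rangeLoader callback stands in for code that would create its own KuduSession and insert the lines in [start, end).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class LineRangeLoader {
    /** Splits [0, totalLines) into consecutive {start, end} ranges (end exclusive). */
    public static List<long[]> splitRanges(long totalLines, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < totalLines; start += chunkSize) {
            ranges.add(new long[] {start, Math.min(start + chunkSize, totalLines)});
        }
        return ranges;
    }

    /**
     * Runs rangeLoader once per range on a fixed-size thread pool and waits
     * for all of them. In a real loader, each call would use its own session.
     */
    public static void loadInParallel(List<long[]> ranges, int threads,
                                      Consumer<long[]> rangeLoader) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (long[] range : ranges) {
                futures.add(pool.submit(() -> rangeLoader.accept(range)));
            }
            for (Future<?> future : futures) {
                future.get(); // rethrows any failure from a worker
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

One session per worker is used here because a KuduSession is not meant to be shared across threads, while a single KuduClient can be shared by all of them.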


Hope this helps.


Best regards,

Alexey



On 7/10/17 7:54 PM, sky wrote:

Hi,
     If I load data from a CSV file, is the only option to traverse the file
and insert the rows one by one through the API?






At 2017-07-10 22:40:05, "Jean-Daniel Cryans" <jdcry...@gmail.com> wrote:

(sending to user@ and putting dev@ in bcc)

Hi,

Kudu by itself doesn't really have file loading capabilities; you'd have to
write your own code that reads a file and then uses either the Java or C++
API to insert the data.

Hope this helps,

J-D

On Mon, Jul 10, 2017 at 1:55 AM, sky <x_h...@163.com> wrote:

Hi all,
     How can Kudu load data from a file? I know that Kudu can insert data
from Impala, but is there any other way, one that does not go through Impala
and is executed by Kudu alone?
     Thanks.
