OK. Thanks! I changed to manual flush mode and it increased to ~15K / sec. :)
Is there any other tuning I can do to further improve this? Also, how much
would SSD help in this case (only upsert)?

Thanks again,
Chao

On Mon, Oct 30, 2017 at 11:42 PM, Todd Lipcon <t...@cloudera.com> wrote:

> If you want to manage batching yourself you can use the manual flush mode.
> Easiest would be the auto flush background mode.
>
> Todd
>
> On Oct 30, 2017 11:10 PM, "Chao Sun" <sunc...@uber.com> wrote:
>
>> Hi Todd,
>>
>> Thanks for the reply! I used a single Kafka consumer to pull the data.
>> For Kudu, I was doing something very simple that basically just follows
>> the example here
>> <https://github.com/cloudera/kudu-examples/blob/master/java/java-sample/src/main/java/org/kududb/examples/sample/Sample.java>
>> .
>> Specifically:
>>
>> loop {
>>   Insert insert = kuduTable.newInsert();
>>   PartialRow row = insert.getRow();
>>   // fill the columns
>>   kuduSession.apply(insert);
>> }
>>
>> I didn't specify the flushing mode, so will it pick up AUTO_FLUSH_SYNC
>> as the default? Should I use MANUAL_FLUSH?
>>
>> Thanks,
>> Chao
>>
>> On Mon, Oct 30, 2017 at 10:39 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>
>>> Hey Chao,
>>>
>>> Nice to hear you are checking out Kudu.
>>>
>>> What are you using to consume from Kafka and write to Kudu? Is it
>>> possible that it is Java code and you are using the SYNC flush mode? That
>>> would result in a separate round trip for each record and thus very low
>>> throughput.
>>>
>>> Todd
>>>
>>> On Oct 30, 2017 10:23 PM, "Chao Sun" <sunc...@uber.com> wrote:
>>>
>>> Hi,
>>>
>>> We are evaluating Kudu (version kudu 1.3.0-cdh5.11.1, revision
>>> af02f3ea6d9a1807dcac0ec75bfbca79a01a5cab) on an 8-node cluster.
>>> The data are coming from Kafka at a rate of around 30K / sec, and hash
>>> partitioned into 128 buckets. However, with default settings, Kudu can only
>>> consume the topics at a rate of around 1.5K / second. This is a direct
>>> ingest with no transformation on the data.
>>>
>>> Could this be because I was using the default configurations? Also, we are
>>> using Kudu on HDD - could that also be related?
>>>
>>> Any help would be appreciated. Thanks.
>>>
>>> Best,
>>> Chao
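
The point made in the thread, that the SYNC flush mode costs one round trip per record while manual flushing lets the caller batch, can be illustrated with a small self-contained model. This is only a sketch and does not use the Kudu client: `FakeSession` and `FlushModeSketch` are hypothetical names invented here, and the batch size of 1000 is an arbitrary assumption. In the real Java client the corresponding calls would be `KuduSession.setFlushMode(SessionConfiguration.FlushMode.MANUAL_FLUSH)`, `KuduSession.apply(...)` and `KuduSession.flush()`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a remote write session; NOT the Kudu API. */
final class FakeSession {
    private final List<String> buffer = new ArrayList<>();
    private int roundTrips = 0;
    private int rowsWritten = 0;

    /** Buffers one row locally; no simulated network call happens here. */
    public void apply(String row) {
        buffer.add(row);
    }

    /** Sends everything buffered so far in a single simulated round trip. */
    public void flush() {
        if (buffer.isEmpty()) {
            return; // nothing to send, so no round trip is counted
        }
        roundTrips++;
        rowsWritten += buffer.size();
        buffer.clear();
    }

    public int roundTrips() { return roundTrips; }
    public int rowsWritten() { return rowsWritten; }
}

final class FlushModeSketch {
    /** Flushes after every record, analogous to AUTO_FLUSH_SYNC. */
    static FakeSession writeSync(List<String> rows) {
        FakeSession session = new FakeSession();
        for (String row : rows) {
            session.apply(row);
            session.flush();
        }
        return session;
    }

    /** Flushes every batchSize records, analogous to MANUAL_FLUSH. */
    static FakeSession writeBatched(List<String> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        FakeSession session = new FakeSession();
        int pending = 0;
        for (String row : rows) {
            session.apply(row);
            if (++pending == batchSize) {
                session.flush();
                pending = 0;
            }
        }
        session.flush(); // send the final partial batch, if any
        return session;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            rows.add("row-" + i);
        }
        // prints "sync round trips: 10000"
        System.out.println("sync round trips: " + writeSync(rows).roundTrips());
        // prints "batched round trips: 10"
        System.out.println("batched round trips: " + writeBatched(rows, 1000).roundTrips());
    }
}
```

The model only counts round trips; it says nothing about disk or tablet-server behaviour, so it should not be read as a throughput prediction. With the real client, the chosen batch size would also have to fit within the session's mutation buffer (see `KuduSession.setMutationBufferSpace`).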