Min Du created KUDU-2255:
----------------------------

             Summary: Customized configuration on KuduClient via KuduContext
                 Key: KUDU-2255
                 URL: https://issues.apache.org/jira/browse/KUDU-2255
             Project: Kudu
          Issue Type: Bug
          Components: client, spark
    Affects Versions: 1.4.0
         Environment: Kudu1.4.0; Spark2; Cloudera 12.5
            Reporter: Min Du


It seems that a KuduClient can be created with customized configuration (e.g. 
timeout, etc.), whereas no configuration can be passed when using the 
KuduContext class. KuduContext creates a KuduClient using its default 
configuration settings. 

This is my scenario: I have a Spark application that connects to Kudu and 
performs inserts / updates / deletes on a Kudu table. I use the KuduContext 
class to create a kuduContext instance and its (a)syncClient. The 
upsertRows() method in the KuduContext class is the one I call most often to 
insert / update data in a Kudu table.

Based on my basic understanding, upsertRows() calls writePartitionRows(), 
which creates a new Kudu session from the Kudu syncClient. Since I cannot set 
a customized value for the RPC timeout, it always uses the default one (30000 
milliseconds). The default value works for some queries, but might not work 
for long-running queries that need to insert a large result set into a Kudu 
table.
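
To illustrate the difference, here is a minimal sketch of setting timeouts 
when building a KuduClient directly, which is the kind of control that does 
not appear to be reachable through KuduContext. The master address and the 
timeout value are hypothetical placeholders, and the object and method names 
(CustomTimeoutSketch, buildClient) exist only for this example; it assumes the 
kudu-client library is on the classpath.

```scala
import org.apache.kudu.client.{KuduClient, KuduSession}

object CustomTimeoutSketch {
  // Hypothetical values, for illustration only.
  val masterAddresses = "kudu-master:7051"
  val timeoutMs = 120000L

  // Build a synchronous client with non-default timeouts.
  def buildClient(): KuduClient =
    new KuduClient.KuduClientBuilder(masterAddresses)
      .defaultOperationTimeoutMs(timeoutMs)      // user operations (reads/writes)
      .defaultAdminOperationTimeoutMs(timeoutMs) // administrative operations
      .build()

  def main(args: Array[String]): Unit = {
    val client = buildClient()
    try {
      // A session created from this client can also carry its own timeout.
      val session: KuduSession = client.newSession()
      session.setTimeoutMillis(timeoutMs)
      // Insert / upsert / delete operations would be applied here.
      session.close()
    } finally {
      client.close()
    }
  }
}
```

With KuduContext, the client and the sessions are created internally, so 
there is no obvious place to pass values like these.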

Is there a way to set customized configuration values when using the 
KuduContext class?

According to the Kudu configuration guide 
(https://kudu.apache.org/docs/configuration_reference.html), I would expect 
that these configuration parameters can also be set in the Java API.

Could you please let me know if I have missed anything? If this is not the 
right place to ask, could you please guide me on how to raise this issue?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
