[
https://issues.apache.org/jira/browse/KUDU-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Henke updated KUDU-2485:
------------------------------
Component/s: documentation
> Enhance Kudu-Spark docs
> -----------------------
>
> Key: KUDU-2485
> URL: https://issues.apache.org/jira/browse/KUDU-2485
> Project: Kudu
> Issue Type: Improvement
> Components: documentation, spark
> Affects Versions: 1.7.1
> Reporter: William Berkeley
> Priority: Major
>
> Users often get confused about the right way to use the Kudu-Spark
> integration. The most common dangerous mistake is creating multiple Kudu
> clients, sometimes even one per task. It's pretty easy to overwhelm the
> master this way, e.g., with a 2-second batch window and a client per task
> in a Spark Streaming job. We should take our current minimal Spark docs and
> provide better examples and bigger, louder, redder warnings about making
> extra Kudu clients. Users should be directed to use the KuduContext
> exclusively. When a client is needed, the client instance inside the
> KuduContext should be used.
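
For illustration, the pattern the description recommends might look roughly like the sketch below: one KuduContext is created on the driver, and when a raw client is needed it is taken from that context rather than built separately. The master addresses, the table name, and the object and helper names are hypothetical placeholders, and the sketch assumes the kudu-spark artifact is on the classpath.

```scala
import org.apache.kudu.spark.kudu.KuduContext
import org.apache.spark.sql.{DataFrame, SparkSession}

object SharedKuduContextSketch {
  // Hypothetical values; replace with the real master addresses and table.
  val kuduMasters = "kudu-master-1:7051,kudu-master-2:7051"
  val tableName = "example_table"

  // Hypothetical helper: writes one batch through the shared context.
  // No new Kudu client is constructed here.
  def writeBatch(kuduContext: KuduContext, batch: DataFrame): Unit = {
    // When a client is needed, reuse the instance held by the KuduContext.
    val client = kuduContext.syncClient
    if (client.tableExists(tableName)) {
      kuduContext.upsertRows(batch, tableName)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kudu-context-sketch").getOrCreate()

    // Create the KuduContext once, on the driver, and pass it around.
    val kuduContext = new KuduContext(kuduMasters, spark.sparkContext)

    // Read the existing table into a DataFrame via the Kudu data source.
    val df = spark.read
      .options(Map("kudu.master" -> kuduMasters, "kudu.table" -> tableName))
      .format("org.apache.kudu.spark.kudu")
      .load()

    // Every batch goes through the same context instead of a per-task client.
    writeBatch(kuduContext, df)

    spark.stop()
  }
}
```

In a streaming job, the same kuduContext would be reused for each micro-batch rather than constructing a client inside the per-batch or per-task code.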