Brian created SPARK-21968:
-----------------------------

             Summary: Improved KernelDensity support
                 Key: SPARK-21968
                 URL: https://issues.apache.org/jira/browse/SPARK-21968
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 2.2.0
            Reporter: Brian


Related to SPARK-7753.  The KernelDensity API still does not provide a way to 
specify a kernel, as requested in SPARK-7753, and it requires clients to 
calculate an optimal bandwidth themselves.

Specifying a kernel could look something like:
def setKernel(kernel: Double => Double): KernelDensity.this.type

The API could also ship a few built-in kernels that users can pass here, so 
they don't need to implement each kernel themselves. Some commonly used 
kernels are listed here:
https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use
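
As a rough illustration of what such built-in kernels might look like, here is 
a minimal sketch in plain Scala. The names Kernels and LocalKde are 
hypothetical and are not part of Spark; the kernels are written as 
Double => Double functions of the scaled distance u = (x - x_i) / h, and the 
local estimator only shows how a pluggable kernel would be applied.

```scala
// Hypothetical helpers for illustration only; none of this exists in Spark MLlib.
object Kernels {
  // Gaussian kernel: the standard normal density.
  val gaussian: Double => Double =
    u => math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.Pi)

  // Epanechnikov kernel: 3/4 * (1 - u^2) on [-1, 1], zero elsewhere.
  val epanechnikov: Double => Double =
    u => if (math.abs(u) <= 1.0) 0.75 * (1.0 - u * u) else 0.0

  // Uniform (rectangular) kernel: 1/2 on [-1, 1], zero elsewhere.
  val uniform: Double => Double =
    u => if (math.abs(u) <= 1.0) 0.5 else 0.0

  // Triangular kernel: 1 - |u| on [-1, 1], zero elsewhere.
  val triangular: Double => Double =
    u => if (math.abs(u) <= 1.0) 1.0 - math.abs(u) else 0.0
}

object LocalKde {
  // Non-distributed estimate: f(x) = 1 / (n * h) * sum_i K((x - x_i) / h).
  def estimate(sample: Seq[Double], bandwidth: Double, kernel: Double => Double)
              (x: Double): Double = {
    require(sample.nonEmpty, "sample must not be empty")
    require(bandwidth > 0.0, "bandwidth must be positive")
    sample.map(xi => kernel((x - xi) / bandwidth)).sum / (sample.size * bandwidth)
  }
}
```

With something along these lines, a call such as 
setKernel(Kernels.epanechnikov) would be enough to switch kernels, assuming 
the proposed setter were added.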

Functions could also be provided to compute a good bandwidth so that users 
do not need to calculate it themselves, e.g. the "rule of thumb" and/or 
"solve the equation" bandwidth described here:
https://en.wikipedia.org/wiki/Kernel_density_estimation#Bandwidth_selection
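
For the "rule of thumb" case, a minimal sketch could look like the following. 
It assumes the simple normal-reference form h = 1.06 * sd * n^(-1/5) with the 
sample standard deviation, and it works on a local Seq[Double] rather than an 
RDD. The object and method names are hypothetical, not an existing Spark API.

```scala
// Hypothetical helper for illustration only; not part of Spark MLlib.
object Bandwidth {
  // Normal-reference rule of thumb: h = 1.06 * sd * n^(-1/5),
  // where sd is the sample standard deviation (n - 1 in the denominator).
  def ruleOfThumb(sample: Seq[Double]): Double = {
    require(sample.size >= 2, "need at least two observations")
    val n = sample.size.toDouble
    val mean = sample.sum / n
    val variance = sample.map(x => (x - mean) * (x - mean)).sum / (n - 1.0)
    1.06 * math.sqrt(variance) * math.pow(n, -0.2)
  }
}
```

The returned value could then be passed to the existing setBandwidth setter. 
For a distributed sample, the same formula would presumably be computed from 
the count and sample standard deviation of the RDD instead of a local Seq.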




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
