-----Original Message-----
From: Tim Hunter [mailto:timhun...@databricks.com]
Sent: Friday, February 17, 2017 1:49 PM
To: bradc
Cc: dev@spark.apache.org
Subject: Re: Design document - MLlib's statistical package for DataFrames
Hi Brad,
this task is focusing on moving the existing algorithms, so that we
are not held up by parity issues.
Do you have some paper suggestions for cardinality? I do not think
there is a feature request on JIRA either.
Tim
On Thu, Feb 16, 2017 at 2:21 PM, bradc wrote:
>
Hi,
While it is also missing in spark.mllib, I'd suggest adding cardinality as
part of the Simple descriptive statistics for both spark.ml and spark.mllib?
This is useful even for data in double precision FP to understand the
"uniqueness" of the feature data.
Cheers,
Brad
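
To illustrate the cardinality statistic suggested above, here is a minimal sketch in plain Python. It is not Spark code and not part of any proposal in this thread; the function name `cardinality` and the sample column are hypothetical, and nulls are assumed to be excluded from the count. It reports the number of distinct values in a feature column together with the fraction of non-null values that are distinct, which is one way to read the "uniqueness" of double-precision data.

```python
from typing import Iterable, Optional, Tuple


def cardinality(values: Iterable[Optional[float]]) -> Tuple[int, float]:
    """Return (distinct count, distinct / non-null ratio) for one column.

    Hypothetical helper for illustration only; None entries are skipped.
    """
    non_null = [v for v in values if v is not None]
    distinct = len(set(non_null))
    # Guard against an empty (or all-null) column.
    ratio = distinct / len(non_null) if non_null else 0.0
    return distinct, ratio


# Example feature column with repeated values and one missing entry.
column = [1.0, 2.5, 2.5, 3.0, None, 1.0]
print(cardinality(column))
```

An exact count like this keeps every distinct value in memory, so on large distributed data an approximate sketch (for example HyperLogLog) would likely be the more practical choice.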
Hello all,
I have been looking at some of the missing items for complete feature
parity between spark.ml and spark.mllib. Here is a proposal for
porting mllib.stats, the descriptive statistics package: