[jira] [Created] (SPARK-21622) Support Offset in SparkR

2017-08-03 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-21622: --- Summary: Support Offset in SparkR Key: SPARK-21622 URL: https://issues.apache.org/jira/browse/SPARK-21622 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-21310) Add offset to PySpark GLM

2017-07-04 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-21310: --- Summary: Add offset to PySpark GLM Key: SPARK-21310 URL: https://issues.apache.org/jira/browse/SPARK-21310 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-21275) Update GLM test to use supportedFamilyNames

2017-06-30 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-21275: --- Summary: Update GLM test to use supportedFamilyNames Key: SPARK-21275 URL: https://issues.apache.org/jira/browse/SPARK-21275 Project: Spark Issue Type:

[jira] [Created] (SPARK-20917) SparkR supports string encoding consistent with R

2017-05-29 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20917: --- Summary: SparkR supports string encoding consistent with R Key: SPARK-20917 URL: https://issues.apache.org/jira/browse/SPARK-20917 Project: Spark Issue Type:

[jira] [Created] (SPARK-20899) PySpark supports stringIndexerOrderType in RFormula

2017-05-26 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20899: --- Summary: PySpark supports stringIndexerOrderType in RFormula Key: SPARK-20899 URL: https://issues.apache.org/jira/browse/SPARK-20899 Project: Spark Issue

[jira] [Created] (SPARK-20892) Add SQL trunc function to SparkR

2017-05-25 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20892: --- Summary: Add SQL trunc function to SparkR Key: SPARK-20892 URL: https://issues.apache.org/jira/browse/SPARK-20892 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-20889) SparkR grouped documentation for Column methods

2017-05-25 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20889: --- Summary: SparkR grouped documentation for Column methods Key: SPARK-20889 URL: https://issues.apache.org/jira/browse/SPARK-20889 Project: Spark Issue Type:

[jira] [Created] (SPARK-20736) PySpark StringIndexer supports StringOrderType

2017-05-14 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20736: --- Summary: PySpark StringIndexer supports StringOrderType Key: SPARK-20736 URL: https://issues.apache.org/jira/browse/SPARK-20736 Project: Spark Issue Type:

[jira] [Updated] (SPARK-20619) StringIndexer supports multiple ways of label ordering

2017-05-06 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-20619: Description: StringIndexer maps labels to numbers according to the descending order of label

[jira] [Created] (SPARK-20619) StringIndexer supports multiple ways of label ordering

2017-05-06 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20619: --- Summary: StringIndexer supports multiple ways of label ordering Key: SPARK-20619 URL: https://issues.apache.org/jira/browse/SPARK-20619 Project: Spark Issue

[jira] [Updated] (SPARK-20604) Allow Imputer to handle all numeric types

2017-05-04 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-20604: Description: Imputer currently requires input column to be Double or Float, but the logic should

[jira] [Created] (SPARK-20604) Allow Imputer to handle all numeric types

2017-05-04 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20604: --- Summary: Allow Imputer to handle all numeric types Key: SPARK-20604 URL: https://issues.apache.org/jira/browse/SPARK-20604 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-20574) Allow Bucketizer to handle non-Double column

2017-05-03 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20574: --- Summary: Allow Bucketizer to handle non-Double column Key: SPARK-20574 URL: https://issues.apache.org/jira/browse/SPARK-20574 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-20258) SparkR logistic regression example did not converge in programming guide

2017-04-07 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-20258: --- Summary: SparkR logistic regression example did not converge in programming guide Key: SPARK-20258 URL: https://issues.apache.org/jira/browse/SPARK-20258 Project:

[jira] [Commented] (SPARK-20026) Document R GLM Tweedie family support in programming guide and code example

2017-04-04 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955342#comment-15955342 ] Wayne Zhang commented on SPARK-20026: - [~felixcheung] Yes, I will work on this. Thanks. > Document

[jira] [Created] (SPARK-19819) Use concrete data in SparkR DataFrame examples

2017-03-04 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19819: --- Summary: Use concrete data in SparkR DataFrame examples Key: SPARK-19819 URL: https://issues.apache.org/jira/browse/SPARK-19819 Project: Spark Issue Type:

[jira] [Created] (SPARK-19818) SparkR union should check for name consistency of input data frames

2017-03-03 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19818: --- Summary: SparkR union should check for name consistency of input data frames Key: SPARK-19818 URL: https://issues.apache.org/jira/browse/SPARK-19818 Project: Spark

[jira] [Closed] (SPARK-19773) SparkDataFrame should not allow duplicate names

2017-03-01 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang closed SPARK-19773. --- Resolution: Not A Problem > SparkDataFrame should not allow duplicate names >

[jira] [Created] (SPARK-19773) SparkDataFrame should not allow duplicate names

2017-02-28 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19773: --- Summary: SparkDataFrame should not allow duplicate names Key: SPARK-19773 URL: https://issues.apache.org/jira/browse/SPARK-19773 Project: Spark Issue Type:

[jira] [Created] (SPARK-19682) Issue warning (or error) when subset method "[[" takes vector index

2017-02-21 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19682: --- Summary: Issue warning (or error) when subset method "[[" takes vector index Key: SPARK-19682 URL: https://issues.apache.org/jira/browse/SPARK-19682 Project: Spark

[jira] [Closed] (SPARK-19473) Several DataFrame Methods still fail with dot in column names

2017-02-10 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang closed SPARK-19473. --- Resolution: Not A Problem > Several DataFrame Methods still fail with dot in column names >

[jira] [Updated] (SPARK-19473) Several DataFrame Methods still fail with dot in column names

2017-02-05 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-19473: Summary: Several DataFrame Methods still fail with dot in column names (was: Several DataFrame

[jira] [Created] (SPARK-19473) Several DataFrame Method still fail with dot in column names

2017-02-05 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19473: --- Summary: Several DataFrame Method still fail with dot in column names Key: SPARK-19473 URL: https://issues.apache.org/jira/browse/SPARK-19473 Project: Spark

[jira] [Created] (SPARK-19452) Fix bug in the name assignment method in SparkR

2017-02-03 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19452: --- Summary: Fix bug in the name assignment method in SparkR Key: SPARK-19452 URL: https://issues.apache.org/jira/browse/SPARK-19452 Project: Spark Issue Type:

[jira] (SPARK-19400) GLM fails for intercept only model

2017-01-30 Thread Wayne Zhang (JIRA)
Wayne Zhang created an issue

[jira] (SPARK-19400) GLM fails for intercept only model

2017-01-30 Thread Wayne Zhang (JIRA)
Wayne Zhang updated an issue

[jira] (SPARK-19395) Convert coefficients in summary to matrix

2017-01-29 Thread Wayne Zhang (JIRA)
Wayne Zhang created an issue

[jira] [Created] (SPARK-19391) Tweedie GLM API in SparkR

2017-01-28 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19391: --- Summary: Tweedie GLM API in SparkR Key: SPARK-19391 URL: https://issues.apache.org/jira/browse/SPARK-19391 Project: Spark Issue Type: Improvement

[jira] [Reopened] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2017-01-24 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang reopened SPARK-18710: - > Add offset to GeneralizedLinearRegression models >

[jira] [Commented] (SPARK-14659) OneHotEncoder support drop first category alphabetically in the encoded vector

2017-01-18 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828618#comment-15828618 ] Wayne Zhang commented on SPARK-14659: - [~yanboliang] [~josephkb] Has anyone been working on this

[jira] [Updated] (SPARK-19270) Add summary table to GLM summary

2017-01-18 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-19270: Shepherd: Yanbo Liang > Add summary table to GLM summary > > >

[jira] [Created] (SPARK-19270) Add summary table to GLM summary

2017-01-17 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-19270: --- Summary: Add summary table to GLM summary Key: SPARK-19270 URL: https://issues.apache.org/jira/browse/SPARK-19270 Project: Spark Issue Type: Improvement

[jira] [Reopened] (SPARK-18929) Add Tweedie distribution in GLM

2017-01-10 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang reopened SPARK-18929: - > Add Tweedie distribution in GLM > --- > > Key:

[jira] [Closed] (SPARK-18929) Add Tweedie distribution in GLM

2017-01-06 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang closed SPARK-18929. --- Resolution: Unresolved > Add Tweedie distribution in GLM > --- > >

[jira] [Closed] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2017-01-06 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang closed SPARK-18710. --- Resolution: Unresolved > Add offset to GeneralizedLinearRegression models >

[jira] [Commented] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-26 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15779760#comment-15779760 ] Wayne Zhang commented on SPARK-18710: - Thanks for the comment, Yanbo. In IRLS, the fit method expects

[jira] [Commented] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-20 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765733#comment-15765733 ] Wayne Zhang commented on SPARK-18710: - [~yanboliang] Thanks for the suggestion. I think the issue is

[jira] [Commented] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-19 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15762752#comment-15762752 ] Wayne Zhang commented on SPARK-18710: - [~yanboliang] It seems that I would need to change the case

[jira] [Created] (SPARK-18929) Add Tweedie distribution in GLM

2016-12-19 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-18929: --- Summary: Add Tweedie distribution in GLM Key: SPARK-18929 URL: https://issues.apache.org/jira/browse/SPARK-18929 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-11 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-18710: Shepherd: Yanbo Liang (was: Sean Owen) Remaining Estimate: 10h (was: 336h)

[jira] [Updated] (SPARK-18715) Fix wrong AIC calculation in Binomial GLM

2016-12-04 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-18715: Description: The AIC calculation in Binomial GLM seems to be wrong when there are weights. The

[jira] [Updated] (SPARK-18715) Fix wrong AIC calculation in Binomial GLM

2016-12-04 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-18715: Summary: Fix wrong AIC calculation in Binomial GLM (was: Correct AIC calculation in Binomial GLM)

[jira] [Created] (SPARK-18715) Correct AIC calculation in Binomial GLM

2016-12-04 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-18715: --- Summary: Correct AIC calculation in Binomial GLM Key: SPARK-18715 URL: https://issues.apache.org/jira/browse/SPARK-18715 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-18710) Add offset to GeneralizedLinearRegression models

2016-12-04 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-18710: --- Summary: Add offset to GeneralizedLinearRegression models Key: SPARK-18710 URL: https://issues.apache.org/jira/browse/SPARK-18710 Project: Spark Issue Type:

[jira] [Updated] (SPARK-18701) Poisson GLM fails due to wrong initialization

2016-12-04 Thread Wayne Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wayne Zhang updated SPARK-18701: Shepherd: Sean Owen (was: sean corkum) Issue Type: Bug (was: New Feature) > Poisson GLM

[jira] [Created] (SPARK-18701) Poisson GLM fails due to wrong initialization

2016-12-03 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-18701: --- Summary: Poisson GLM fails due to wrong initialization Key: SPARK-18701 URL: https://issues.apache.org/jira/browse/SPARK-18701 Project: Spark Issue Type: New

[jira] [Created] (SPARK-18166) GeneralizedLinearRegression Wrong Value Range for Poisson Distribution

2016-10-28 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-18166: --- Summary: GeneralizedLinearRegression Wrong Value Range for Poisson Distribution Key: SPARK-18166 URL: https://issues.apache.org/jira/browse/SPARK-18166 Project: