[ https://issues.apache.org/jira/browse/SPARK-28295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nils Skotara updated SPARK-28295:
---------------------------------
    Description:
Using pyspark.ml.regression, when I fit a GeneralizedLinearRegression like this:

glr = GeneralizedLinearRegression(family="gaussian", link="identity", regParam=0.3, maxIter=10)
model = glr.fit(someData)

It seems there is no way to match the features with their coefficients or standard errors. I am using an ugly workaround like this right now:

field = model.summary._call_java('getClass').getDeclaredField("coefficientsWithStatistics")
object2 = model._call_java('summary')
field.setAccessible(True)
value = field.get(object2)
coef_value = {}
for i in range(0, len(value)):
    row = value[i].toString()
    values = row.split(',')
    coef_value[values[0].replace('(', '').replace(')', '')] = float(values[1])

Am I missing something?
If not, I'd like to request a method similar to model.coefficients that returns the feature names in the right order, like model.features or something like that.

  was:
In from pyspark.ml.regression when I fit a GeneralizedLinearRegression like this:

glr = GeneralizedLinearRegression(family="gaussian", link="identity", regParam=0.3, maxIter=10)
model = glr.fit(someData)

It seems like there is no way to get the matching of the features and their coefficients or standard errors. I am using an ugly work around like this right now:

field = model.summary._call_java('getClass').getDeclaredField("coefficientsWithStatistics")
object2 = model._call_java('summary')
field.setAccessible(True)
value = field.get(object2)
coef_value = {}
for i in range(0, len(value)):
    row = value[i].toString()
    values = row.split(',')
    coef_value[values[0].replace('(', '').replace(')', '')] = float(values[1])

Am I missing something?
If not, I'd like to request a method similar to model.coefficients with which one can just get the feature names in the right order, like model.features or something like that.
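
One possible alternative to the reflection workaround, offered only as a sketch: when the features column was built by a stage that records ML attribute metadata (VectorAssembler does this), the feature names can usually be read from the column's schema metadata and zipped with model.coefficients. The helper names below are hypothetical and not part of the Spark API, the column name "features" is an assumption, and the approach does not apply if the metadata is absent.

```python
def feature_names_from_metadata(metadata):
    """Return feature names ordered by vector index.

    `metadata` is the dict stored on the features column, e.g.
    someData.schema["features"].metadata (assumes the column is named
    "features" and was produced by a stage such as VectorAssembler).
    """
    attrs = metadata.get("ml_attr", {}).get("attrs", {})
    # Attributes are grouped by type ("numeric", "binary", "nominal");
    # each entry carries the vector index it refers to.
    indexed = [
        (attr["idx"], attr.get("name", str(attr["idx"])))
        for group in attrs.values()
        for attr in group
    ]
    return [name for _, name in sorted(indexed)]


def coefficients_by_feature(metadata, coefficients):
    """Pair each feature name with its coefficient (intercept excluded)."""
    names = feature_names_from_metadata(metadata)
    values = [float(c) for c in coefficients]
    if len(names) != len(values):
        raise ValueError(
            "metadata describes %d features but %d coefficients were given"
            % (len(names), len(values))
        )
    return dict(zip(names, values))


# With a fitted model this would be called roughly as:
#   metadata = someData.schema["features"].metadata
#   coef_value = coefficients_by_feature(metadata, model.coefficients.toArray())
```

The same ordered list of names could be paired with model.summary.coefficientStandardErrors, keeping in mind that when an intercept is fitted its standard error is appended as the last element.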
> Is there a way of getting feature names from pyspark.ml.regression GeneralizedLinearRegression?
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28295
>                 URL: https://issues.apache.org/jira/browse/SPARK-28295
>             Project: Spark
>          Issue Type: Request
>          Components: Build
>    Affects Versions: 2.3.1
>            Reporter: Nils Skotara
>            Priority: Minor
>              Labels: features
>             Fix For: 2.3.1

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)