[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844575#comment-16844575 ]

Hyukjin Kwon commented on SPARK-18277:
--------------------------------------

Yea, please go ahead (but please add the affected version too). There was a discussion about bulk-resolving.

> na.fill() and friends should work on struct fields
> --------------------------------------------------
>
>                 Key: SPARK-18277
>                 URL: https://issues.apache.org/jira/browse/SPARK-18277
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Nicholas Chammas
>            Priority: Minor
>              Labels: bulk-closed
>
> It appears that you cannot use {{fill()}} and friends to quickly modify struct fields.
> For example:
> {code}
> >>> from pyspark.sql import Row
> >>> df = spark.createDataFrame([Row(a=Row(b='yeah yeah'), c='alright'),
> ...                             Row(a=Row(b=None), c=None)])
> >>> df.printSchema()
> root
>  |-- a: struct (nullable = true)
>  |    |-- b: string (nullable = true)
>  |-- c: string (nullable = true)
> >>> df.show()
> +-----------+-------+
> |          a|      c|
> +-----------+-------+
> |[yeah yeah]|alright|
> |     [null]|   null|
> +-----------+-------+
> >>> df.na.fill('').show()
> +-----------+-------+
> |          a|      c|
> +-----------+-------+
> |[yeah yeah]|alright|
> |     [null]|       |
> +-----------+-------+
> {code}
> {{c}} got filled in, but {{a.b}} didn't.
> I don't know if it's "appropriate", but it would be nice if {{fill()}} and friends worked automatically on struct fields.
> As things are today, there doesn't appear to be a way to fill in null values inside structs. If you try {{when()}}, you realize that you cannot do {{when(col('a.b') is None, '')}} because {{Column}} doesn't implement the appropriate protocol for {{is}}. And if you try {{when(col('a.b') == None, '')}} it doesn't catch the null values.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
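A note on the {{is}} and {{==}} attempts mentioned in the description. Python's {{is}} operator tests object identity and cannot be overloaded by any class, whereas {{==}} can be overloaded to build an expression object instead of returning a bool. The sketch below illustrates this with a toy class; {{ToyColumn}} and {{is_null()}} are hypothetical names made up for illustration, not PySpark's {{Column}} or its API.

```python
class ToyColumn:
    """Hypothetical stand-in for a column expression; not PySpark's Column."""

    def __init__(self, expr):
        self.expr = expr

    def __eq__(self, other):
        # Overloaded '==' builds a new expression instead of returning a bool.
        return ToyColumn(f"({self.expr} = {other!r})")

    def is_null(self):
        # An explicit method is needed because 'is' cannot be overloaded.
        return ToyColumn(f"({self.expr} IS NULL)")


col = ToyColumn("a.b")

# 'is' compares object identity right away, so the result is a plain bool.
identity_check = col is None

# '==' goes through __eq__ and yields an expression object.
equality_expr = col == None  # noqa: E711

# The explicit null test also yields an expression object.
null_check = col.is_null()

print(identity_check)      # False
print(equality_expr.expr)  # (a.b = None)
print(null_check.expr)     # (a.b IS NULL)
```

Under SQL semantics, an equality comparison against NULL evaluates to NULL rather than true, which is consistent with the report that the {{==}} form does not catch the null values.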
[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844530#comment-16844530 ]

Nicholas Chammas commented on SPARK-18277:
------------------------------------------

[~hyukjin.kwon] - If I still think this issue is relevant, should I just reopen it?
[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637713#comment-15637713 ]

Nicholas Chammas commented on SPARK-18277:
------------------------------------------

{quote}
If you try {{when()}}, you realize that you cannot do {{when(col('a.b') is None, '')}} because {{Column}} doesn't implement the appropriate protocol for {{is}}.
{quote}

Ah my bad, in this case the appropriate thing to do is {{when(col('a.b').isNull(), '')}}. So there is a workaround available today via {{when()}} and {{isNull()}}.
[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637654#comment-15637654 ]

Nicholas Chammas commented on SPARK-18277:
------------------------------------------

Thanks for the pointer. I'll follow the discussion there.
[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637622#comment-15637622 ]

Michael Armbrust commented on SPARK-18277:
------------------------------------------

We've been talking about better support for nested data for 2.2, [SPARK-16483].
[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637566#comment-15637566 ]

Nicholas Chammas commented on SPARK-18277:
------------------------------------------

[~marmbrus] / [~yhuai]: Is there a workaround for this available today? Also, do you think {{fill()}} should fit this use case down the line?