[jira] [Resolved] (SPARK-31123) Drop does not work after join with aliases
[ https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente resolved SPARK-31123.
---------------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

> Drop does not work after join with aliases
> ------------------------------------------
>
>                 Key: SPARK-31123
>                 URL: https://issues.apache.org/jira/browse/SPARK-31123
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.2
>            Reporter: Mikel San Vicente
>            Priority: Major
>             Fix For: 3.0.0
>
> Hi,
> I am seeing really strange behaviour in the drop method after a join with
> aliases. It doesn't seem to find the column when I refer to it using
> dataframe("columnName") syntax, but it does work with other combinators like
> select.
> {code:java}
> case class Record(a: String, dup: String)
> case class Record2(b: String, dup: String)
> val df = Seq(Record("a", "dup")).toDF
> val df2 = Seq(Record2("a", "dup")).toDF
> val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
> val dupCol = df("dup")
> joined.drop(dupCol) // Does not drop anything
> joined.drop(func.col("a.dup")) // It drops the column
> joined.select(dupCol) // It selects the column
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
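A minimal sketch of the report above as a standalone program, for trying the behaviour against a given Spark version. It assumes spark-sql is on the classpath and that a local master is acceptable. The object name DropAfterAliasedJoin, the helper run and the app name are hypothetical, and functions.col stands in for the func.col alias used in the report. The comments repeat what the reporter observed on 2.4.2; they are not output verified here.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Case classes sit at the top level so Spark can derive encoders for them.
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)

// Hypothetical wrapper object; only the DataFrame calls come from the report.
object DropAfterAliasedJoin {

  // Returns the column names left after each of the two drop variants:
  // (drop via the original DataFrame reference, drop via the alias).
  def run(spark: SparkSession): (Seq[String], Seq[String]) = {
    import spark.implicits._

    val df = Seq(Record("a", "dup")).toDF()
    val df2 = Seq(Record2("a", "dup")).toDF()
    val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))

    // Reported on 2.4.2 as dropping nothing; the issue is marked fixed in 3.0.0.
    val viaDataFrameRef = joined.drop(df("dup")).columns.toSeq

    // Reported as dropping the left-hand `dup` column.
    val viaAlias = joined.drop(col("a.dup")).columns.toSeq

    (viaDataFrameRef, viaAlias)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a single-threaded local session is enough for this check.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-31123-sketch")
      .getOrCreate()
    try {
      val (viaDataFrameRef, viaAlias) = run(spark)
      println(s"drop(df(dup))    -> ${viaDataFrameRef.mkString(", ")}")
      println(s"drop(col(a.dup)) -> ${viaAlias.mkString(", ")}")
    } finally {
      spark.stop()
    }
  }
}
```

Comparing the two printed column lists on the version in use shows whether the DataFrame-reference form of drop is still a no-op there; referring to the column through its alias is the form the report says works.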
[jira] [Updated] (SPARK-31123) Drop does not work after join with aliases
[ https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente updated SPARK-31123:
--------------------------------------
    Description: 
Hi,
I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select.
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val df2 = Seq(Record2("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It drops the column
joined.select(dupCol) // It selects the column
{code}

  was:
Hi,
I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select.
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It drops the column
joined.select(dupCol) // It selects the column
{code}
[jira] [Commented] (SPARK-31123) Drop does not work after join with aliases
[ https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060046#comment-17060046 ]

Mikel San Vicente commented on SPARK-31123:
-------------------------------------------

Hi L.C., did you perform the join using aliases? Otherwise the issue won't happen.
[jira] [Updated] (SPARK-31123) Drop does not work after join with aliases
[ https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente updated SPARK-31123:
--------------------------------------
    Priority: Major  (was: Minor)
[jira] [Updated] (SPARK-31123) Drop does not work after join with aliases
[ https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente updated SPARK-31123:
--------------------------------------
    Description: 
Hi,
I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select.
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It drops the column
joined.select(dupCol) // It selects the column
{code}

  was:
Hi,
I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select.
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It works!
joined.select(dupCol) // It works!
{code}
[jira] [Created] (SPARK-31123) Drop does not work after join with aliases
Mikel San Vicente created SPARK-31123:
-----------------------------------------

             Summary: Drop does not work after join with aliases
                 Key: SPARK-31123
                 URL: https://issues.apache.org/jira/browse/SPARK-31123
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.2
            Reporter: Mikel San Vicente

Hi,
I am seeing really strange behaviour in the drop method after a join with
aliases. It doesn't seem to find the column when I refer to it using
dataframe("columnName") syntax, but it does work with other combinators like
select.
{code:java}
case class Record(a: String, dup: String)
case class Record2(b: String, dup: String)
val df = Seq(Record("a", "dup")).toDF
val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
val dupCol = df("dup")
joined.drop(dupCol) // Does not drop anything
joined.drop(func.col("a.dup")) // It works!
joined.select(dupCol) // It works!
{code}
[jira] [Commented] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
[ https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248999#comment-16248999 ]

Mikel San Vicente commented on SPARK-22442:
-------------------------------------------

Are you planning to release a patch for version 2.2.x with this bug fix?

> Schema generated by Product Encoder doesn't match case class field name when
> using non-standard characters
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-22442
>                 URL: https://issues.apache.org/jira/browse/SPARK-22442
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.2, 2.1.2, 2.2.0
>            Reporter: Mikel San Vicente
>            Assignee: Liang-Chi Hsieh
>             Fix For: 2.3.0
>
> The product encoder encodes field names incorrectly when they contain
> certain non-standard characters.
> For example, for:
> {quote}
> case class MyType(`field.1`: String, `field 2`: String)
> {quote}
> we get the following schema:
> {quote}
> root
>  |-- field$u002E1: string (nullable = true)
>  |-- field$u00202: string (nullable = true)
> {quote}
> As a consequence of this issue, a DataFrame with the correct schema can't be
> converted to a Dataset using .as[MyType].

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
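The $u002E and $u0020 fragments in the schema above have the shape Scala uses for characters that are not valid JVM identifier parts ('.' is U+002E, space is U+0020). A minimal sketch with the standard library's scala.reflect.NameTransformer, independent of Spark, shows the round trip between the name as written and its encoded form. The object and helper names are hypothetical, and whether Spark's encoder reads field names through this exact path is an assumption, not something the thread states.

```scala
import scala.reflect.NameTransformer

// Hypothetical helper object, independent of Spark.
object FieldNameEncoding {

  // Field names from the report, as written between backticks in the case class.
  val sourceNames: Seq[String] = Seq("field.1", "field 2")

  // JVM-safe form: characters that are not valid identifier parts become $uXXXX.
  def encoded(name: String): String = NameTransformer.encode(name)

  // Reverses the encoding and recovers the name as written in the source.
  def decoded(name: String): String = NameTransformer.decode(name)

  def main(args: Array[String]): Unit =
    sourceNames.foreach { name =>
      val enc = encoded(name)
      // Prints the source name, its encoded form, and the decoded form again.
      println(s"$name -> $enc -> ${decoded(enc)}")
    }
}
```

A schema built from the encoded names cannot line up with data whose columns carry the names as written, which is consistent with the .as[MyType] failure described in the report.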
[jira] [Commented] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
[ https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239821#comment-16239821 ]

Mikel San Vicente commented on SPARK-22442:
-------------------------------------------

Yes, that will work, but it won't work with the correct schema that is inferred when you read directly from JSON:

spark.read.json(path).as[MyType]

This won't work because the inferred schema will be [field.1: string, field 2: string].
[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
[ https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente updated SPARK-22442:
--------------------------------------
    Description: 
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
{quote}
case class MyType(`field.1`: String, `field 2`: String)
{quote}
we get the following schema:
{quote}
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
{quote}
As a consequence of this issue, a DataFrame with the correct schema can't be
converted to a Dataset using .as[MyType].

  was:
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
{quote}
case class MyType(`field.1`: String, `field 2`: String)
{quote}
we get the following schema:
{quote}
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
{quote}
[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
[ https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente updated SPARK-22442:
--------------------------------------
    Description: 
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
{quote}
case class MyType(`field.1`: String, `field 2`: String)
{quote}
we get the following schema:
{quote}
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
{quote}

  was:
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
{{
case class MyType(`field.1`: String, `field 2`: String)
}}
we get the following schema:
{{
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
}}
[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
[ https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente updated SPARK-22442:
--------------------------------------
    Description: 
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
{{case class MyType(`field.1`: String, `field 2`: String)
}}
we get the following schema:
{{root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)}}

  was:
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
```
case class MyType(`field.1`: String, `field 2`: String)
```
we get the following schema:
```
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
```
[jira] [Updated] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
[ https://issues.apache.org/jira/browse/SPARK-22442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikel San Vicente updated SPARK-22442:
--------------------------------------
    Description: 
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
{{
case class MyType(`field.1`: String, `field 2`: String)
}}
we get the following schema:
{{
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
}}

  was:
The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
{{case class MyType(`field.1`: String, `field 2`: String)
}}
we get the following schema:
{{root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)}}
[jira] [Created] (SPARK-22442) Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
Mikel San Vicente created SPARK-22442:
-----------------------------------------

             Summary: Schema generated by Product Encoder doesn't match case class field name when using non-standard characters
                 Key: SPARK-22442
                 URL: https://issues.apache.org/jira/browse/SPARK-22442
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0, 2.1.2, 2.0.2
            Reporter: Mikel San Vicente
            Priority: Normal

The product encoder encodes field names incorrectly when they contain
certain non-standard characters.
For example, for:
```
case class MyType(`field.1`: String, `field 2`: String)
```
we get the following schema:
```
root
 |-- field$u002E1: string (nullable = true)
 |-- field$u00202: string (nullable = true)
```