Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
dongjoon-hyun commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1960606439 Here is a follow-up. - #45226 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
dongjoon-hyun commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1960605169 NVM. Let me make a quick followup. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
dongjoon-hyun commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1960604937 ``` [info] SparkThrowableSuite: [info] - No duplicate error classes (20 milliseconds) [info] - Error classes are correctly formatted (35 milliseconds) [info] - SQLSTATE is mandatory (2 milliseconds) [info] - Error category and error state / SQLSTATE invariants (23 milliseconds) [info] - Message invariants (7 milliseconds) [info] - Message format invariants (8 milliseconds) [info] - Error classes match with document *** FAILED *** (55 milliseconds) [info] "...ects an instance of [ExpressionEncoder] but got ` "...ects an instance of [`ExpressionEncoder`] but got `
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1959935174 @mihailom-db Congratulations with your first contribution to Apache Spark! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk closed pull request #45095: [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 URL: https://github.com/apache/spark/pull/45095 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1959930750 +1, LGTM. Merging to master. Thank you, @mihailom-db. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1499100913 ## common/utils/src/main/resources/error/error-states.json: ## @@ -2933,6 +2933,12 @@ "standard": "Y", "usedBy": ["SQL/Foundation", "PostgreSQL", "Redshift", "Oracle", "SQL Server"] }, +"42001": { +"description": "Invalid encoder error", +"origin": "SQL/Foundation", +"standard": "Y", +"usedBy": ["SQL/Foundation"] Review Comment: I took a look at different DBMS sqlstates and to my knowledge it seems that this one is not used by any. Changed doc accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1498755220 ## common/utils/src/main/resources/error/error-states.json: ## @@ -2933,6 +2933,12 @@ "standard": "Y", "usedBy": ["SQL/Foundation", "PostgreSQL", "Redshift", "Oracle", "SQL Server"] }, +"42001": { +"description": "Invalid encoder error", +"origin": "SQL/Foundation", +"standard": "Y", +"usedBy": ["SQL/Foundation"] Review Comment: Could you check is this SQL state used by other DBMS? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1498260755 ## common/utils/src/main/resources/error/error-states.json: ## @@ -2933,6 +2933,12 @@ "standard": "Y", "usedBy": ["SQL/Foundation", "PostgreSQL", "Redshift", "Oracle", "SQL Server"] }, +"42001": { +"description": "Invalid encoder error", +"origin": "SQL/Foundation", +"standard": "Y", +"usedBy": ["SQL/Foundation"] +}, Review Comment: @MaxGekk I am not sure how I should approach to naming new sqlstate, do you have some more info on what these fields are about? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1497370381 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { Review Comment: I think we are dealing with invalid encoder not some unsupported one. Could you rename it to: `INVALID_EXPRESSION_ENCODER` ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to an expression encoder. (For more information consult: '/sql-error-conditions.html')" +], +"sqlState" : "42000" Review Comment: Let's assign some unused id like `42001`. ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to an expression encoder. (For more information consult: '/sql-error-conditions.html')" Review Comment: ```suggestion "Found an invalid expression encoder. Expects an instance of `ExpressionEncoder` but got . For more information consult: '/api/java/index.html?org/apache/spark/sql/Encoder.html'" ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1497250977 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to an expression encoder. (For more information on Encoders, consult: https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/Encoder.html)" Review Comment: Generally, I understood that these are generated files that are being referenced, but I referenced the file with all error state support and then I introduced a link to Encoder doc there. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1497110617 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to an expression encoder. (For more information on Encoders, consult: https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/Encoder.html)" Review Comment: Could you follow other items in the file for doc refs, see https://github.com/apache/spark/blob/0d938de112dfe142f40d4c6f86cfa1e6e32210ec/common/utils/src/main/resources/error/error-classes.json#L1523 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1497104483 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to expression encoder." +], +"sqlState" : "0A000" Review Comment: I have changed the category to `42` as I believe the error is possible to happen only if a user defines implicit `Encoder` for which an implicit `Encoder` does not already exist. I experimented with trying to use types `Int` or `String`, but realised that we already have implicit values defined and in those cases `.toDS()` call was marked as unresolved and the code did not even compile. `KryoData` seems to not have an implicit Encoder defined, so the added test is possible to happen. So I am not sure if I should change error name and/or message accordingly? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1497105825 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to expression encoder." Review Comment: Done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1495711840 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to expression encoder." Review Comment: nit: ```suggestion "Found unsupported encoder. Try switching to an expression encoder." ``` ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to expression encoder." +], +"sqlState" : "0A000" Review Comment: The category `0A` belongs to: ``` "0A": "feature not supported", ``` see `error-categories.json`. Any reasons to choose this one? How about `46` or `42`? ## common/utils/src/main/resources/error/error-classes.json: ## @@ -3748,6 +3748,12 @@ }, "sqlState" : "0A000" }, + "UNSUPPORTED_ENCODER" : { +"message" : [ + "Found unsupported encoder. Try switching to expression encoder." Review Comment: I think we should help users and point out how to switch to an expression encoder. At least, let's point out the doc: https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/Encoder.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1953772610 Requested the access for the OSS JIRA account, should be able to comment soon. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1953758521 @mihailom-db BTW, do you have an account at OSS JIRA? If so, could you leave a comment in https://issues.apache.org/jira/browse/SPARK-43259 - telling that you are working on this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on PR #45095: URL: https://github.com/apache/spark/pull/45095#issuecomment-1953754162 > In the PR, I propose to assign the proper name UNSUPPORTED_ERROR ... @mihailom-db Did you really assign this name? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1495412865 ## sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala: ## @@ -1151,6 +1153,21 @@ class QueryExecutionErrorsSuite ) ) } + + test("SPARK-43259: Uses unsupported custom encoder") { +class CustomEncoder extends Encoder[Int] { + override def schema: StructType = StructType(Array.empty[StructField]) + override def clsTag: ClassTag[Int] = ClassTag.Int +} +val e = intercept[SparkRuntimeException] { + encoderFor(new CustomEncoder) Review Comment: I have introduced this change. Just out of curiosity, should we change the error format to pass the type of encoder that was used and then include it in the message? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1494449340 ## sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala: ## @@ -1151,6 +1153,21 @@ class QueryExecutionErrorsSuite ) ) } + + test("SPARK-43259: Uses unsupported custom encoder") { +class CustomEncoder extends Encoder[Int] { + override def schema: StructType = StructType(Array.empty[StructField]) + override def clsTag: ClassTag[Int] = ClassTag.Int +} +val e = intercept[SparkRuntimeException] { + encoderFor(new CustomEncoder) Review Comment: I think this is an user facing error, see how to reproduce it using public API: ```scala implicit val kryoEncoder = new Encoder[KryoData] { override def schema: StructType = StructType(Array.empty[StructField]) override def clsTag: ClassTag[KryoData] = ??? } val ds = Seq(KryoData(1), KryoData(2)).toDS() ds.groupByKey(p => p).count().collect() ``` it fails with: ```java Only expression encoders are supported for now. org.apache.spark.SparkRuntimeException: Only expression encoders are supported for now. at org.apache.spark.sql.errors.QueryExecutionErrors$.unsupportedEncoderError(QueryExecutionErrors.scala:463) at org.apache.spark.sql.catalyst.encoders.package$.encoderFor(package.scala:34) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
mihailom-db commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1494306782 ## sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala: ## @@ -1151,6 +1153,21 @@ class QueryExecutionErrorsSuite ) ) } + + test("SPARK-43259: Uses unsupported custom encoder") { +class CustomEncoder extends Encoder[Int] { + override def schema: StructType = StructType(Array.empty[StructField]) + override def clsTag: ClassTag[Int] = ClassTag.Int +} +val e = intercept[SparkRuntimeException] { + encoderFor(new CustomEncoder) Review Comment: I explored the problem and have changed the PR description accordingly, could you take a look at it again, if it makes sense to you as well? @MaxGekk -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-43259][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2024 [spark]
MaxGekk commented on code in PR #45095: URL: https://github.com/apache/spark/pull/45095#discussion_r1489876902 ## sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala: ## @@ -1151,6 +1153,21 @@ class QueryExecutionErrorsSuite ) ) } + + test("SPARK-43259: Uses unsupported custom encoder") { +class CustomEncoder extends Encoder[Int] { + override def schema: StructType = StructType(Array.empty[StructField]) + override def clsTag: ClassTag[Int] = ClassTag.Int +} +val e = intercept[SparkRuntimeException] { + encoderFor(new CustomEncoder) Review Comment: Can you try to reproduce the issue using public Spark SQL API (sql or Java/Scala/Python api)? If it is not possible, we should convert the error to an internal one. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org