[GitHub] [spark] goutam-git commented on a diff in pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding

2022-07-28 Thread GitBox


goutam-git commented on code in PR #37065:
URL: https://github.com/apache/spark/pull/37065#discussion_r932196681


##
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala:
##
@@ -421,7 +421,7 @@ private[columnar] case object DictionaryEncoding extends 
CompressionScheme {
 
 override def compress(from: ByteBuffer, to: ByteBuffer): ByteBuffer = {
   if (overflow) {

Review Comment:
   @MaxGekk  should I remove the assert and use the new method name with error 
class instead of useDictionaryEncodingWhenDictionaryOverflowError()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] goutam-git commented on a diff in pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding

2022-07-25 Thread GitBox


goutam-git commented on code in PR #37065:
URL: https://github.com/apache/spark/pull/37065#discussion_r928926262


##
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala:
##
@@ -421,7 +421,7 @@ private[columnar] case object DictionaryEncoding extends 
CompressionScheme {
 
 override def compress(from: ByteBuffer, to: ByteBuffer): ByteBuffer = {
   if (overflow) {

Review Comment:
   @MaxGekk
   
   on  replacing if(overflow) with asser ,  pyspark test for 
CrossValidatorTests is failing for this assertion.
   

   
   File "/__w/spark/spark/python/pyspark/ml/tests/test_tuning.py", line 141, in 
test_copy
   
   cvModel = cv.fit(dataset)
   

   
   py4j.protocol.Py4JJavaError: An error occurred while calling o48.evaluate.
   
   : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 
(TID 12) (localhost executor driver): java.lang.AssertionError: assertion failed
   
  at scala.Predef$.assert(Predef.scala:208)
   
  at 
org.apache.spark.sql.execution.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:425)
   

   Please advice.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] goutam-git commented on a diff in pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding

2022-07-17 Thread GitBox


goutam-git commented on code in PR #37065:
URL: https://github.com/apache/spark/pull/37065#discussion_r922831090


##
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala:
##
@@ -421,7 +421,7 @@ private[columnar] case object DictionaryEncoding extends 
CompressionScheme {
 
 override def compress(from: ByteBuffer, to: ByteBuffer): ByteBuffer = {
   if (overflow) {

Review Comment:
   @MaxGekk
   
   I used the DictionaryEncodingSuite and found that in case of big enough 
Dictionary the users face java.lang.IllegalArgumentException  from 
ByteBuffer.allocate in build method in CompressibleColumnBuilder as a result of 
which they never reach encoder.compress  which throws  
useDictionaryEncodingWhenDictionaryOverflowError
   Please advice next steps



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] goutam-git commented on a diff in pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding

2022-07-08 Thread GitBox


goutam-git commented on code in PR #37065:
URL: https://github.com/apache/spark/pull/37065#discussion_r917225777


##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala:
##
@@ -878,9 +878,10 @@ private[sql] object QueryExecutionErrors extends 
QueryErrorsBase {
 new Exception(s"Unsupported type: ${dataType.catalogString}")
   }
 
-  def useDictionaryEncodingWhenDictionaryOverflowError(): Throwable = {
-new IllegalStateException(

Review Comment:
   @HyukjinKwon  This appears to be an internal exception not exposed to users



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] goutam-git commented on a diff in pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding

2022-07-08 Thread GitBox


goutam-git commented on code in PR #37065:
URL: https://github.com/apache/spark/pull/37065#discussion_r915497502


##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala:
##
@@ -878,9 +878,10 @@ private[sql] object QueryExecutionErrors extends 
QueryErrorsBase {
 new Exception(s"Unsupported type: ${dataType.catalogString}")
   }
 
-  def useDictionaryEncodingWhenDictionaryOverflowError(): Throwable = {
-new IllegalStateException(

Review Comment:
   This appears to be an internal exception not exposed to users



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] goutam-git commented on a diff in pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding

2022-07-07 Thread GitBox


goutam-git commented on code in PR #37065:
URL: https://github.com/apache/spark/pull/37065#discussion_r915497502


##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala:
##
@@ -878,9 +878,10 @@ private[sql] object QueryExecutionErrors extends 
QueryErrorsBase {
 new Exception(s"Unsupported type: ${dataType.catalogString}")
   }
 
-  def useDictionaryEncodingWhenDictionaryOverflowError(): Throwable = {
-new IllegalStateException(

Review Comment:
   This appears to be an internal exception not exposed to users



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org