[ 
https://issues.apache.org/jira/browse/SPARK-28732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909409#comment-16909409
 ] 

Liang-Chi Hsieh commented on SPARK-28732:
-----------------------------------------

Since the return type of {{count}} is LongType, I think it is reasonable that 
it cannot fit into an Int column. The problem here may be that the error 
message is not friendly.

Normally, when we map a Dataset to a specified type and the types are 
incompatible, an exception like this should be thrown:
{code}
You can either add an explicit cast to the input data or choose a higher 
precision type of the field in the target object;
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveUpCast$$fail(Analyzer.scala:2801)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$$anonfun$apply$32$$anonfun$applyOrElse$143.applyOrElse(Analyzer.scala:2821)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$$anonfun$apply$32$$anonfun$applyOrElse$143.applyOrElse(Analyzer.scala:2812)
{code}
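
To illustrate the type mismatch outside Spark, here is a minimal plain-Java 
sketch, not Spark code. It mirrors the generated code quoted in the issue, 
where a long is passed to an Integer constructor, and shows one way to narrow 
the value explicitly. The class name LongToIntegerSketch and the helper 
narrowCount are hypothetical names chosen for this example, and using 
Math.toIntExact is my assumption about a reasonable narrowing strategy, not 
what Spark does internally.

```java
public class LongToIntegerSketch {

    // Hypothetical helper (not part of Spark): narrow a long count to an
    // Integer, failing loudly instead of silently truncating.
    static Integer narrowCount(long count) {
        // Math.toIntExact throws ArithmeticException if the value does not fit in an int.
        return Integer.valueOf(Math.toIntExact(count));
    }

    public static void main(String[] args) {
        // Stands in for the long read by i.getLong(5) in the generated code.
        long value13 = 42L;

        // "new java.lang.Integer(value13)" would not compile: Integer only has
        // constructors taking int or String, and a long is never narrowed implicitly.
        Integer countN = narrowCount(value13);
        System.out.println(countN);

        try {
            // A count larger than Integer.MAX_VALUE cannot be stored in an int field.
            narrowCount(Integer.MAX_VALUE + 1L);
        } catch (ArithmeticException e) {
            System.out.println("count does not fit into an int");
        }
    }
}
```

The second call shows why mapping a LongType column to an int field is 
rejected by default: the narrowing can lose data, so it has to be requested 
explicitly, as the exception message above suggests.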

> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - failed to 
> compile: org.codehaus.commons.compiler.CompileException: File 
> 'generated.java' when storing the result of a count aggregation in an integer
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28732
>                 URL: https://issues.apache.org/jira/browse/SPARK-28732
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0, 2.2.0, 2.3.0, 2.4.0
>            Reporter: Alix Métivier
>            Priority: Major
>
> I am using the agg function on a Dataset to count the number of rows per 
> group of columns. I would like to store the result of this count in an 
> integer, but it fails with this output:
> {code}
> [ERROR]: org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - 
> failed to compile: org.codehaus.commons.compiler.CompileException: File 
> 'generated.java', Line 89, Column 53: No applicable constructor/method found 
> for actual parameters "long"; candidates are: "java.lang.Integer(int)", 
> "java.lang.Integer(java.lang.String)"
> {code}
> Here are line 89 and a few surrounding lines of the generated code:
> {code}
> /* 085 */ long value13 = i.getLong(5);
> /* 086 */ argValue4 = value13;
> /* 087 */
> /* 088 */
> /* 089 */ final java.lang.Integer value12 = false ? null : new java.lang.Integer(argValue4);
> {code}
>  
> As per the Integer documentation, there is no constructor that takes a long, 
> which is why the generated code fails to compile.
> Here is my code:
> {code}
> org.apache.spark.sql.Dataset<row2Struct> ds_row2 = 
> ds_conntAggregateRow_1_Out_1
>  .groupBy(org.apache.spark.sql.functions.col("n_name").as("n_nameN"),
>  org.apache.spark.sql.functions.col("o_year").as("o_yearN"))
>  .agg(org.apache.spark.sql.functions.count("n_name").as("countN"))
>  .as(org.apache.spark.sql.Encoders.bean(row2Struct.class));
> {code}
> The row2Struct class is composed of n_nameN: String, o_yearN: String, countN: Int.
> If countN is a Long, the code above does not fail.
> If it is an Int, it works in 1.6 and 2.0 but fails on versions 2.1+.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
