[ 
https://issues.apache.org/jira/browse/SPARK-40074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Gargava updated SPARK-40074:
---------------------------------
    Affects Version/s: 3.2.2

> Error while creating dataset in Java spark-3.x using Encoders bean with Dense 
> Vector. (Issue arises when updating spark from 2.4 to 3.x)
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-40074
>                 URL: https://issues.apache.org/jira/browse/SPARK-40074
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API, ML, SQL
>    Affects Versions: 3.1.2, 3.2.2
>         Environment: Scala 2.12
> Spark 3.x
>            Reporter: Anuj Gargava
>            Priority: Major
>
> Encountered a compatibility issue while upgrading Spark from 2.4 to 3.x (Scala 
> was also upgraded from 2.11 to 2.12). 
> The Java code below worked with Spark 2.4, but after migrating to 3.x it 
> fails with the error shown below. I have done my own research but couldn't 
> find a solution or any related information.
>  
>  
> {code:java|title=Code.java|borderStyle=solid}
> public void test() {
>     final SparkSession spark = SparkSession.builder()
>             .appName("Test")
>             .getOrCreate();
>     DenseClass denseFactor1 = new DenseClass(new DenseVector(new double[]{0.13, 0.24}));
>     DenseClass denseFactor2 = new DenseClass(new DenseVector(new double[]{0.24, 0.32}));
>     final List<DenseClass> inputsNew = Arrays.asList(denseFactor1, denseFactor2);
>     final Dataset<DenseClass> denseVectorDf = spark.createDataset(inputsNew,
>             Encoders.bean(DenseClass.class));
>     denseVectorDf.printSchema();
> }
>
> public static class DenseClass implements Serializable {
>     private org.apache.spark.ml.linalg.DenseVector denseVector;
> }
> {code}
> The error occurs while creating the dataset *denseVectorDf*.
> Error:
>  
> {noformat}
> org.apache.spark.sql.AnalysisException: Cannot up cast `denseVector` from struct<> to struct<type:tinyint,size:int,indices:array<int>,values:array<double>>.
> The type path of the target object is:
>  - field (class: "org.apache.spark.ml.linalg.DenseVector", name: "denseVector")
> You can either add an explicit cast to the input data or choose a higher precision type of the field in the target object
> {noformat}
> I tried using a _double_ field instead of the dense vector and it works fine, 
> but it fails when the dense vector is used with Encoders.bean.
>  
> StackOverflow link for the issue: 
> [https://stackoverflow.com/questions/73313660/error-while-creating-dataset-in-java-spark-3-x-using-encoders-bean-with-dense-ve]
>  
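For readers trying to reproduce the report: the DenseClass shown above declares only a private field, with no constructor or accessors, so the snippet does not compile as posted. The reporter's real class presumably has them. The sketch below only illustrates the JavaBean convention (no-arg constructor, getter, setter) that bean-based encoders such as Encoders.bean rely on. It uses java.beans.Introspector from the JDK and a double[] field as a stand-in for DenseVector so that it runs without Spark. The class and method names are hypothetical. It does not attempt to explain or work around the up-cast error itself.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class BeanShapeSketch {

    // Hypothetical stand-in for the class as posted: a private field and nothing else.
    public static class FieldOnly implements Serializable {
        private double[] values;
    }

    // Hypothetical bean-style variant: no-arg constructor plus getter and setter.
    // double[] stands in for DenseVector so the sketch needs no Spark dependency.
    public static class WithAccessors implements Serializable {
        private double[] values;

        public WithAccessors() { }

        public WithAccessors(double[] values) { this.values = values; }

        public double[] getValues() { return values; }

        public void setValues(double[] values) { this.values = values; }
    }

    // Returns the names of properties that have both a getter and a setter,
    // as reported by standard JavaBean introspection.
    public static List<String> readWriteProperties(Class<?> type) {
        try {
            BeanInfo info = Introspector.getBeanInfo(type);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("FieldOnly: " + readWriteProperties(FieldOnly.class));
        System.out.println("WithAccessors: " + readWriteProperties(WithAccessors.class));
    }
}
```

Running main lists no read/write properties for FieldOnly and a single property, values, for WithAccessors. Whether adding accessors to the real DenseClass changes the behaviour reported here is not something this sketch can show.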



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
