Re: May I report a specific comparison of executions leading to issues on Spark JIRA?

Marc Le Bihan Fri, 02 Oct 2020 13:07:47 -0700

A few tests that pass on Spark 2.4.6 and 2.4.7 are failing on 3.0.1.

Some fail with this message:
*java.lang.ClassNotFoundException: com/fasterxml/jackson/module/scala/ScalaObjectMapper*


Coming from:
        at org.apache.spark.sql.catalyst.util.RebaseDateTime.lastSwitchJulianDay(RebaseDateTime.scala)
        at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.rebaseDays(VectorizedColumnReader.java:182)
        at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.decodeDictionaryIds(VectorizedColumnReader.java:336)
        at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.readBatch(VectorizedColumnReader.java:239)
        at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:273)

or
        at org.apache.spark.sql.catalyst.util.DateTimeUtils$.toJavaDate(DateTimeUtils.scala:130)
        at org.apache.spark.sql.catalyst.util.DateTimeUtils.toJavaDate(DateTimeUtils.scala)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
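
To narrow down the first failure, a small classpath probe could be run from the failing test module. This is only a sketch: the class `ClasspathProbe` and its method `locate` are hypothetical names, not part of Spark or Jackson. It uses only the JDK to report which jar (if any) provides the class named in the exception, written with dots instead of slashes. That an outdated Jackson jar on the test classpath is involved is an assumption to verify, not an established cause.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Spark or Jackson): reports where a class
// is loaded from, or that it cannot be found on the current classpath.
public class ClasspathProbe {

    /** Returns a one-line report for the given fully qualified class name. */
    public static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers.
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            String origin = (src == null || src.getLocation() == null)
                    ? "bootstrap/JDK"
                    : src.getLocation().toString();
            return "found " + className + " in " + origin;
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing " + className + " (" + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        // Class named in the ClassNotFoundException above.
        System.out.println(locate("com.fasterxml.jackson.module.scala.ScalaObjectMapper"));
        // A core Jackson class, to see which jackson-databind jar is picked up.
        System.out.println(locate("com.fasterxml.jackson.databind.ObjectMapper"));
    }
}
```

Running it once with the 2.4.7 dependencies and once with the 3.0.1 dependencies, then comparing the reported jar locations, would show whether the two runs resolve different Jackson jars.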


The other ones fail with this one:
org.apache.spark.sql.AnalysisException: *Can't extract value from lambdavariable(MapObject, StringType, true, 376)*: need struct type but got string;

Could these be related to a dataset having this schema?

/**
 * Return the schema of the Dataset.
 * @return Schema.
 */
public StructType schemaEntreprise() {
   StructType schema = new StructType()
      .add("siren", StringType, false)
      .add("statutDiffusionUniteLegale", StringType, true)
      .add("unitePurgeeUniteLegale", StringType, true )
      .add("dateCreationEntreprise", StringType, true)
      .add("sigle", StringType, true)
      
      .add("sexe", StringType, true)
      .add("prenom1", StringType, true)
      .add("prenom2", StringType, true)
      .add("prenom3", StringType, true)
      .add("prenom4", StringType, true)
      
      .add("prenomUsuel", StringType, true)
      .add("pseudonyme", StringType, true)
      .add("rna", StringType, true)
      .add("trancheEffectifsUniteLegale", StringType, true)
      .add("anneeEffectifsUniteLegale", StringType, true)
      
      .add("dateDernierTraitement", StringType, true)
      .add("nombrePeriodesUniteLegale", StringType, true)
      .add("categorieEntreprise", StringType, true)
      .add("anneeCategorieEntreprise", StringType, true)
      .add("dateDebutHistorisation", StringType, true)

      .add("etatAdministratifUniteLegale", StringType, true)
      .add("nomNaissance", StringType, true)
      .add("nomUsage", StringType, true)
      .add("denominationEntreprise", StringType, true)
      .add("denominationUsuelle1", StringType, true)

      .add("denominationUsuelle2", StringType, true)
      .add("denominationUsuelle3", StringType, true)
      .add("categorieJuridique", StringType, true)
      .add("activitePrincipale", StringType, true)
      .add("nomenclatureActivitePrincipale", StringType, true)

      .add("nicSiege", StringType, true)
      .add("economieSocialeSolidaireUniteLegale", StringType, true)
      .add("caractereEmployeurUniteLegale", StringType, true)
      
      // Fields created by withColumn
      .add("purgee", BooleanType, true)
      .add("anneeValiditeEffectifSalarie", IntegerType, true)
      .add("active", BooleanType, true)
      .add("nombrePeriodes", IntegerType, true)
      .add("anneeCategorie", IntegerType, true)

      .add("economieSocialeSolidaire", BooleanType, true)
      .add("caractereEmployeur", BooleanType, true);
   
   // Add to the enterprises Dataset the link to the establishments.
   MapType mapEtablissements = new MapType(StringType, this.datasetEtablissement.schemaEtablissement(), true);
   StructField etablissements = new StructField("etablissements", mapEtablissements, true, Metadata.empty());

   // StructType is immutable: add() returns a new instance, so the result must be reassigned.
   schema = schema.add(etablissements);
   schema = schema.add("libelleCategorieJuridique", StringType, true);
   schema = schema.add("partition", StringType, true);
   
   return schema;
}

Are they worth mentioning in an issue (or adding to the description of an
existing issue)?
Do you need me to pursue further analysis, and if so, how?



