A few tests that pass on 2.4.6 and 2.4.7 are failing on 3.0.1.
Some fail with this message: *java.lang.ClassNotFoundException:
com/fasterxml/jackson/module/scala/ScalaObjectMapper*
coming from:
    at org.apache.spark.sql.catalyst.util.RebaseDateTime.lastSwitchJulianDay(RebaseDateTime.scala)
    at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.rebaseDays(VectorizedColumnReader.java:182)
    at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.decodeDictionaryIds(VectorizedColumnReader.java:336)
    at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.readBatch(VectorizedColumnReader.java:239)
    at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:273)
or
    at org.apache.spark.sql.catalyst.util.DateTimeUtils$.toJavaDate(DateTimeUtils.scala:130)
    at org.apache.spark.sql.catalyst.util.DateTimeUtils.toJavaDate(DateTimeUtils.scala)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
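In case it helps narrow down the ClassNotFoundException, here is a minimal sketch of a check that could be run with the same classpath as the failing tests. It uses only the JDK. The names ClasspathCheck and locate are hypothetical, chosen for illustration. Note that Class.forName expects the dotted form of the class name, not the slash form shown in the message.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Spark or Jackson): reports where a class
// is loaded from, or that it is not visible on the current classpath.
final class ClasspathCheck {

    static String locate(String className) {
        try {
            // Load without initializing the class; we only want its origin.
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself have no code source.
            return src == null ? "bootstrap (JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the exception message, in dotted form.
        String name = "com.fasterxml.jackson.module.scala.ScalaObjectMapper";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If this prints NOT FOUND under 3.0.1 but a jar location under 2.4.x, that would point towards a missing or mismatched Jackson Scala module on the test classpath, though I have not confirmed that this is the cause.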
The other ones fail with this one:
org.apache.spark.sql.AnalysisException: *Can't extract value from
lambdavariable(MapObject, StringType, true, 376)*: need struct type but got
string;
Could these be related to a dataset having this schema?
/**
 * Return the schema of the Dataset.
 * @return Schema.
 */
public StructType schemaEntreprise() {
    StructType schema = new StructType()
        .add("siren", StringType, false)
        .add("statutDiffusionUniteLegale", StringType, true)
        .add("unitePurgeeUniteLegale", StringType, true)
        .add("dateCreationEntreprise", StringType, true)
        .add("sigle", StringType, true)
        .add("sexe", StringType, true)
        .add("prenom1", StringType, true)
        .add("prenom2", StringType, true)
        .add("prenom3", StringType, true)
        .add("prenom4", StringType, true)
        .add("prenomUsuel", StringType, true)
        .add("pseudonyme", StringType, true)
        .add("rna", StringType, true)
        .add("trancheEffectifsUniteLegale", StringType, true)
        .add("anneeEffectifsUniteLegale", StringType, true)
        .add("dateDernierTraitement", StringType, true)
        .add("nombrePeriodesUniteLegale", StringType, true)
        .add("categorieEntreprise", StringType, true)
        .add("anneeCategorieEntreprise", StringType, true)
        .add("dateDebutHistorisation", StringType, true)
        .add("etatAdministratifUniteLegale", StringType, true)
        .add("nomNaissance", StringType, true)
        .add("nomUsage", StringType, true)
        .add("denominationEntreprise", StringType, true)
        .add("denominationUsuelle1", StringType, true)
        .add("denominationUsuelle2", StringType, true)
        .add("denominationUsuelle3", StringType, true)
        .add("categorieJuridique", StringType, true)
        .add("activitePrincipale", StringType, true)
        .add("nomenclatureActivitePrincipale", StringType, true)
        .add("nicSiege", StringType, true)
        .add("economieSocialeSolidaireUniteLegale", StringType, true)
        .add("caractereEmployeurUniteLegale", StringType, true)
        // Fields created by withColumn
        .add("purgee", BooleanType, true)
        .add("anneeValiditeEffectifSalarie", IntegerType, true)
        .add("active", BooleanType, true)
        .add("nombrePeriodes", IntegerType, true)
        .add("anneeCategorie", IntegerType, true)
        .add("economieSocialeSolidaire", BooleanType, true)
        .add("caractereEmployeur", BooleanType, true);
    // Add the link to the establishments to the enterprises Dataset.
    MapType mapEtablissements = new MapType(StringType,
        this.datasetEtablissement.schemaEtablissement(), true);
    StructField etablissements = new StructField("etablissements",
        mapEtablissements, true, Metadata.empty());
    // StructType is immutable: add() returns a new instance, so reassign it.
    schema = schema.add(etablissements);
    schema = schema.add("libelleCategorieJuridique", StringType, true);
    schema = schema.add("partition", StringType, true);
    return schema;
}
Are these worth reporting in an issue (or adding to the description of an
existing issue)?
Do you need me to do some further analysis, and if so, how?