[ https://issues.apache.org/jira/browse/SPARK-18856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reynold Xin resolved SPARK-18856.
---------------------------------
       Resolution: Fixed
         Assignee: Wenchen Fan
    Fix Version/s: 2.1.0

> Newly created catalog table assumed to have 0 rows and 0 bytes
> --------------------------------------------------------------
>
>                 Key: SPARK-18856
>                 URL: https://issues.apache.org/jira/browse/SPARK-18856
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Wenchen Fan
>            Priority: Blocker
>             Fix For: 2.1.0
>
>
> {code}
> scala> spark.range(100).selectExpr("id % 10 p", "id").write.partitionBy("p").format("json").saveAsTable("testjson")
> scala> spark.table("testjson").queryExecution.optimizedPlan.statistics
> res6: org.apache.spark.sql.catalyst.plans.logical.Statistics = Statistics(sizeInBytes=0, isBroadcastable=false)
> {code}
> It shouldn't be 0. The issue is that in DataSource.scala, we do:
> {code}
>     val fileCatalog = if (sparkSession.sqlContext.conf.manageFilesourcePartitions &&
>         catalogTable.isDefined && catalogTable.get.tracksPartitionsInCatalog) {
>       new CatalogFileIndex(
>         sparkSession,
>         catalogTable.get,
>         catalogTable.get.stats.map(_.sizeInBytes.toLong).getOrElse(0L))
>     } else {
>       new InMemoryFileIndex(sparkSession, globbedPaths, options, Some(partitionSchema))
>     }
> {code}
> We shouldn't use 0L as the fallback.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
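
To illustrate why 0L is a risky fallback, here is a minimal standalone sketch, not the actual patch. It has no Spark dependency, and every name in it (StatsFallback, TableStats, DefaultSizeInBytes, BroadcastThreshold, looksBroadcastable) is a hypothetical stand-in. It assumes the planner treats a relation as a broadcast candidate when its estimated size is at or below a threshold, and it assumes a conservative default of Long.MaxValue when no statistics exist, which mirrors the documented defaults of spark.sql.defaultSizeInBytes and spark.sql.autoBroadcastJoinThreshold.

```scala
// Standalone sketch: all names below are hypothetical, not Spark's classes.
object StatsFallback {
  // Stand-in for the size statistic stored in the catalog for a table.
  final case class TableStats(sizeInBytes: BigInt)

  // Assumption: a conservative "unknown size" default, mirroring the
  // Long.MaxValue default of spark.sql.defaultSizeInBytes.
  val DefaultSizeInBytes: Long = Long.MaxValue

  // Assumption: 10 MB, mirroring the default of
  // spark.sql.autoBroadcastJoinThreshold.
  val BroadcastThreshold: Long = 10L * 1024 * 1024

  // Pattern from the report: missing stats are treated as an empty table.
  def sizeWithZeroFallback(stats: Option[TableStats]): Long =
    stats.map(_.sizeInBytes.toLong).getOrElse(0L)

  // Alternative: missing stats are treated as "size unknown, assume large".
  def sizeWithDefaultFallback(stats: Option[TableStats]): Long =
    stats.map(_.sizeInBytes.toLong).getOrElse(DefaultSizeInBytes)

  // Simplified size check standing in for the planner's broadcast decision.
  def looksBroadcastable(sizeInBytes: Long): Boolean =
    sizeInBytes <= BroadcastThreshold
}
```

Under these assumptions, a newly created table with no recorded statistics passes the size check with the zero fallback and fails it with the conservative default, while a table that does have statistics reports the same size either way.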