[ https://issues.apache.org/jira/browse/SPARK-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai updated SPARK-10287:
-----------------------------
    Description: 
I have a partitioned json table with around 2000 partitions.
{code}
val df = sqlContext.read.format("json").load("aPartitionedJsonData")
val columnStr = df.schema.map(_.name).mkString(",")
println(s"columns: $columnStr")
val hash = df
  .selectExpr(s"hash($columnStr) as hashValue")
  .groupBy()
  .sum("hashValue")
  .head()
  .getLong(0)
{code}

  was:
{code}
val df = sqlContext.read.format("json").load("aPartitionedJsonData")
val columnStr = df.schema.map(_.name).mkString(",")
println(s"columns: $columnStr")
val hash = df
  .selectExpr(s"hash($columnStr) as hashValue")
  .groupBy()
  .sum("hashValue")
  .head()
  .getLong(0)
{code}


> After processing a query using JSON data, Spark SQL continuously refreshes metadata of the table
> -------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10287
>                 URL: https://issues.apache.org/jira/browse/SPARK-10287
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Yin Huai
>            Priority: Blocker
>
> I have a partitioned json table with around 2000 partitions.
> {code}
> val df = sqlContext.read.format("json").load("aPartitionedJsonData")
> val columnStr = df.schema.map(_.name).mkString(",")
> println(s"columns: $columnStr")
> val hash = df
>   .selectExpr(s"hash($columnStr) as hashValue")
>   .groupBy()
>   .sum("hashValue")
>   .head()
>   .getLong(0)
> {code}
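For anyone trying to reproduce this: the snippet above assumes "aPartitionedJsonData" already exists on disk. Below is a minimal, hypothetical sketch of generating such a table with ~2000 partitions; the column names ("value", "part"), row counts, and local master setting are assumptions, not taken from the report.

{code}
// Hypothetical generator for a partitioned JSON table like the one above.
// Column names, row counts, and local[*] master are assumptions.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(
  new SparkConf().setAppName("gen-partitioned-json").setMaster("local[*]"))
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// 2000 distinct values of "part" -> ~2000 partition directories on disk.
val data = sc.parallelize(0 until 20000)
  .map(i => (i, i % 2000))
  .toDF("value", "part")

// partitionBy("part") lays the data out as part=0/, part=1/, ..., one
// directory per distinct value, which is what load() later discovers
// during partition discovery.
data.write.format("json").partitionBy("part").save("aPartitionedJsonData")

sc.stop()
{code}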