From branch-2.0, Spark 2.0.0 preview:

I found it interesting that no matter how you configure


spark.sql.warehouse.dir


it always falls back to the default path, /user/hive/warehouse.
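For context, this is the kind of setting I was trying (the key name is from the 2.0 preview docs; the path value here is just illustrative):

```
# spark-defaults.conf
spark.sql.warehouse.dir  /data/spark-warehouse
```

Regardless of this setting, the path above never took effect for me.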


In the code, I noticed that at line 45 of

./sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala


object SimpleAnalyzer extends Analyzer(
    new SessionCatalog(
      new InMemoryCatalog,
      EmptyFunctionRegistry,
      new SimpleCatalystConf(caseSensitiveAnalysis = true)),
    new SimpleCatalystConf(caseSensitiveAnalysis = true))


the analyzer is always initialized with a SimpleCatalystConf, which applies the 
hardcoded default value defined at line 58 of

./sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystConf.scala


case class SimpleCatalystConf(
    caseSensitiveAnalysis: Boolean,
    orderByOrdinal: Boolean = true,
    groupByOrdinal: Boolean = true,
    optimizerMaxIterations: Int = 100,
    optimizerInSetConversionThreshold: Int = 10,
    maxCaseBranchesForCodegen: Int = 20,
    runSQLonFile: Boolean = true,
    warehousePath: String = "/user/hive/warehouse")
  extends CatalystConf
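To illustrate why the configured value never shows up: since SimpleAnalyzer constructs its SimpleCatalystConf passing only caseSensitiveAnalysis, every other parameter, warehousePath included, is taken from the case class defaults. A minimal sketch in plain Scala (no Spark needed; SimpleConf is a stand-in I made up to mirror SimpleCatalystConf):

```scala
// Stand-in for SimpleCatalystConf: the default is baked in at the
// declaration site, not read from any runtime configuration.
case class SimpleConf(
    caseSensitiveAnalysis: Boolean,
    warehousePath: String = "/user/hive/warehouse")

// SimpleAnalyzer-style construction: only caseSensitiveAnalysis is passed,
// so the hardcoded warehousePath default always wins.
val conf = SimpleConf(caseSensitiveAnalysis = true)
println(conf.warehousePath)  // /user/hive/warehouse

// The only way to get a different path is to pass it explicitly,
// which SimpleAnalyzer never does.
val patched = conf.copy(warehousePath = "/tmp/spark-warehouse")
println(patched.warehousePath)  // /tmp/spark-warehouse
```

So setting spark.sql.warehouse.dir cannot reach this code path at all.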


I couldn't find any other way to get around this.


It looks like this was fixed by SPARK-15387 ("SessionCatalog in SimpleAnalyzer 
does not need to make database directory"):


https://github.com/apache/spark/commit/9c817d027713859cac483b4baaaf8b53c040ad93


Just want to confirm this was the root cause and the PR that fixed it. Thanks.



