[GitHub] [spark] luluorta commented on a change in pull request #30289: [SPARK-33141][SQL] Capture SQL configs when creating permanent views

2020-11-25 Thread GitBox


luluorta commented on a change in pull request #30289:
URL: https://github.com/apache/spark/pull/30289#discussion_r530134316



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
##
@@ -361,11 +379,38 @@ object ViewHelper {
     }
   }
 
+  /**
+   * Convert the view query SQL configs in `properties`.
+   */
+  private def generateQuerySQLConfigs(conf: SQLConf): Map[String, String] = {
+    val modifiedConfs = conf.getAllConfs.filter { case (k, _) =>
+      conf.isModifiable(k) && !isConfigBlacklisted(k)
+    }
+    val props = new mutable.HashMap[String, String]
+    if (modifiedConfs.nonEmpty) {
+      val confJson = compact(render(JsonProtocol.mapToJson(modifiedConfs)))
+      props.put(VIEW_QUERY_SQL_CONFIGS, confJson)

Review comment:
   Thanks for pointing this out. Splitting a large value string into small 
chunks seems a Hive-specific solution, so I changed it to store one config per 
table property entry, each with a `view.sqlConfig.` prefix.
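The per-entry storage scheme described above can be sketched in a few lines. This is a self-contained illustration, not Spark's actual code: `ViewConfigProps`, `toProps`, and `fromProps` are hypothetical names, and only the `view.sqlConfig.` prefix comes from the comment above.

```scala
// Sketch of storing each captured SQL config as its own table property
// entry, which avoids Hive's limit on the size of a single property value.
// Names below are illustrative, not the actual Spark implementation.
object ViewConfigProps {
  // Prefix mentioned in the review comment above.
  val SqlConfigPrefix = "view.sqlConfig."

  // One property entry per captured config.
  def toProps(confs: Map[String, String]): Map[String, String] =
    confs.map { case (k, v) => (SqlConfigPrefix + k) -> v }

  // Recover the captured configs, ignoring unrelated properties.
  def fromProps(props: Map[String, String]): Map[String, String] =
    props.collect {
      case (k, v) if k.startsWith(SqlConfigPrefix) =>
        k.stripPrefix(SqlConfigPrefix) -> v
    }
}
```

A round trip through table properties then recovers exactly the captured configs, regardless of how many there are or how long each value is.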





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] luluorta commented on a change in pull request #30289: [SPARK-33141][SQL] Capture SQL configs when creating permanent views

2020-11-24 Thread GitBox


luluorta commented on a change in pull request #30289:
URL: https://github.com/apache/spark/pull/30289#discussion_r530133325



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
##
@@ -361,11 +364,34 @@ object ViewHelper {
     }
   }
 
+  /**
+   * Convert the view query SQL configs in `properties`.
+   */
+  private def generateQuerySQLConfigs(conf: SQLConf): Map[String, String] = {
+    val modifiedConfs = conf.getAllConfs.filter { case (k, _) =>
+      conf.isModifiable(k) && k != SQLConf.MAX_NESTED_VIEW_DEPTH.key

Review comment:
   IMHO, we should keep the set of captured SQL configs as small as possible. 
Since it's hard to precisely capture the configs that only affect the parser 
and analyzer, an alternative is to filter out the configs that definitely 
can NOT affect parsing/analyzing.
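The deny-list idea can be sketched as follows. The prefixes mirror the ones proposed in this PR, but the helper name and the literal config key (standing in for `SQLConf.MAX_NESTED_VIEW_DEPTH.key`) are illustrative assumptions:

```scala
// Keys and prefixes that definitely cannot affect parsing/analyzing,
// so their configs need not be captured with the view. The literal
// first entry stands in for SQLConf.MAX_NESTED_VIEW_DEPTH.key.
val configPrefixDenyList = Seq(
  "spark.sql.view.maxNestedViewDepth",
  "spark.sql.optimizer.",
  "spark.sql.codegen.",
  "spark.sql.execution.",
  "spark.sql.shuffle.",
  "spark.sql.adaptive.")

// Capture a config unless its key matches a denied key or prefix.
def shouldCapture(key: String): Boolean =
  !configPrefixDenyList.exists(prefix => key.startsWith(prefix))
```

Prefix matching keeps the filter cheap and makes it easy to exclude whole families of configs (optimizer, codegen, execution) at once.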











[GitHub] [spark] luluorta commented on a change in pull request #30289: [SPARK-33141][SQL] Capture SQL configs when creating permanent views

2020-11-24 Thread GitBox


luluorta commented on a change in pull request #30289:
URL: https://github.com/apache/spark/pull/30289#discussion_r530133390



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
##
@@ -414,6 +433,8 @@ object CatalogTable {
   val VIEW_QUERY_OUTPUT_PREFIX = VIEW_PREFIX + "query.out."
   val VIEW_QUERY_OUTPUT_NUM_COLUMNS = VIEW_QUERY_OUTPUT_PREFIX + "numCols"
   val VIEW_QUERY_OUTPUT_COLUMN_NAME_PREFIX = VIEW_QUERY_OUTPUT_PREFIX + "col."
+
+  val VIEW_QUERY_SQL_CONFIGS = VIEW_PREFIX + "query.sqlConfigs"

Review comment:
   done

##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -1448,6 +1448,15 @@ object SQLConf {
         "must be positive.")
       .createWithDefault(100)
 
+  val APPLY_VIEW_SQL_CONFIGS =
+    buildConf("spark.sql.legacy.view.applySQLConfigs")

Review comment:
   done

##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
##
@@ -334,6 +337,21 @@ case class ShowViewsCommand(
 
 object ViewHelper {
 
+  private val configPrefixBlacklist = Seq(

Review comment:
   done

##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
##
@@ -334,6 +337,21 @@ case class ShowViewsCommand(
 
 object ViewHelper {
 
+  private val configPrefixBlacklist = Seq(
+    SQLConf.MAX_NESTED_VIEW_DEPTH.key,
+    "spark.sql.optimizer.",
+    "spark.sql.codegen.",
+    "spark.sql.execution.",
+    "spark.sql.shuffle.",
+    "spark.sql.adaptive.")
+
+  private def isConfigBlacklisted(key: String): Boolean = {

Review comment:
   done

##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
##
@@ -380,16 +425,19 @@ object ViewHelper {
     // for createViewCommand queryOutput may be different from fieldNames
     val queryOutput = analyzedPlan.schema.fieldNames
 
+    val conf = SQLConf.get

Review comment:
   done

##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
##
@@ -361,11 +379,38 @@ object ViewHelper {
     }
   }
 
+  /**
+   * Convert the view query SQL configs in `properties`.
+   */
+  private def generateQuerySQLConfigs(conf: SQLConf): Map[String, String] = {

Review comment:
   done

##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala
##
@@ -754,4 +754,77 @@ abstract class SQLViewSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-33141: view should be parsed and analyzed with configs set when creating") {
+    withTable("t33141") {

Review comment:
   done











[GitHub] [spark] luluorta commented on a change in pull request #30289: [SPARK-33141][SQL] Capture SQL configs when creating permanent views

2020-11-24 Thread GitBox


luluorta commented on a change in pull request #30289:
URL: https://github.com/apache/spark/pull/30289#discussion_r530130202



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala
##
@@ -748,4 +748,30 @@ abstract class SQLViewSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-33141 view should be parsed and analyzed with configs set when creating") {

Review comment:
   done








[GitHub] [spark] luluorta commented on a change in pull request #30289: [SPARK-33141][SQL] Capture SQL configs when creating permanent views

2020-11-24 Thread GitBox


luluorta commented on a change in pull request #30289:
URL: https://github.com/apache/spark/pull/30289#discussion_r530130052



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
##
@@ -771,22 +771,24 @@ class AnalysisSuite extends AnalysisTest with Matchers {
     // RuleExecutor only throw exception or log warning when the rule is supposed to run
     // more than once.
     val maxIterations = 2
-    val conf = new SQLConf().copy(SQLConf.ANALYZER_MAX_ITERATIONS -> maxIterations)
-    val testAnalyzer = new Analyzer(
-      new SessionCatalog(new InMemoryCatalog, FunctionRegistry.builtin, conf), conf)
-
-    val plan = testRelation2.select(
-      $"a" / Literal(2) as "div1",
-      $"a" / $"b" as "div2",
-      $"a" / $"c" as "div3",
-      $"a" / $"d" as "div4",
-      $"e" / $"e" as "div5")
-
-    val message = intercept[TreeNodeException[LogicalPlan]] {
-      testAnalyzer.execute(plan)
-    }.getMessage
-    assert(message.startsWith(s"Max iterations ($maxIterations) reached for batch Resolution, " +
-      s"please set '${SQLConf.ANALYZER_MAX_ITERATIONS.key}' to a larger value."))
+    withSQLConf(SQLConf.ANALYZER_MAX_ITERATIONS.key -> maxIterations.toString) {
+      val conf = new SQLConf().copy(SQLConf.ANALYZER_MAX_ITERATIONS -> maxIterations)

Review comment:
   ditto
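The `withSQLConf` pattern used in the diff above temporarily overrides configs for the duration of a block and restores the previous values afterwards. Below is a simplified, Spark-free sketch of that pattern; `ConfHelper` and its mutable map are stand-ins (Spark's real helper lives in its test utilities and also handles non-settable keys):

```scala
import scala.collection.mutable

// Stand-in for a mutable session conf (illustrative only).
object ConfHelper {
  val confs = mutable.Map[String, String]()

  // Run `body` with the given overrides, then restore the old values,
  // even if `body` throws.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    // Remember the previous value (or absence) of each overridden key.
    val saved = pairs.map { case (k, _) => k -> confs.get(k) }
    pairs.foreach { case (k, v) => confs(k) = v }
    try body finally {
      saved.foreach {
        case (k, Some(v)) => confs(k) = v
        case (k, None)    => confs.remove(k)
      }
    }
  }
}
```

The try/finally restore is what makes the helper safe to nest and to use around assertions that may fail.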








[GitHub] [spark] luluorta commented on a change in pull request #30289: [SPARK-33141][SQL] Capture SQL configs when creating permanent views

2020-11-24 Thread GitBox


luluorta commented on a change in pull request #30289:
URL: https://github.com/apache/spark/pull/30289#discussion_r530129632



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -132,9 +132,11 @@ object AnalysisContext {
  */
 class Analyzer(
     override val catalogManager: CatalogManager,
-    conf: SQLConf)
+    deprecatedConf: SQLConf)

Review comment:
   This change is related to another sub-task of 
[SPARK-33138](https://issues.apache.org/jira/browse/SPARK-33138) which makes 
internal classes of SparkSession always using active SQLConf. I opened a 
seperated PR (https://github.com/apache/spark/pull/30299) and removed this 
change here.
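The "always use the active SQLConf" behavior can be sketched with a thread-local dynamic variable. Everything below is an assumed, simplified stand-in (`FakeConf`, `FakeAnalyzer`); Spark's real `SQLConf.get` is considerably more involved:

```scala
import scala.util.DynamicVariable

class FakeConf(val maxIterations: Int)

object FakeConf {
  // Thread-local "active" conf with a default fallback.
  private val active = new DynamicVariable[FakeConf](new FakeConf(100))

  // Read the conf active in the current scope.
  def get: FakeConf = active.value

  // Run `body` with `c` as the active conf, restoring the old one after.
  def withConf[T](c: FakeConf)(body: => T): T = active.withValue(c)(body)
}

// Instead of capturing a conf at construction time, the analyzer reads
// the active conf on every access, so later conf changes are visible.
class FakeAnalyzer {
  def maxIterations: Int = FakeConf.get.maxIterations
}
```

This is why the constructor parameter can become `deprecatedConf`: components that read `SQLConf.get` at call time no longer depend on the conf captured when they were built.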




