I'm hesitant to merge that PR, as it uses a brand-new configuration path
that is different from the way the rest of Spark SQL / Spark is configured.
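For context, the existing configuration path is string key/value settings
read through SQLConf on the SQLContext. A minimal sketch of that path
follows; spark.sql.shuffle.partitions is a real setting, while the
analyzer key shown in the comment is purely hypothetical:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(
      new SparkConf().setAppName("sqlconf-example").setMaster("local"))
    val sqlContext = new SQLContext(sc)

    // Existing path: settings are plain string key/value pairs on the
    // context, read through SQLConf.
    sqlContext.setConf("spark.sql.shuffle.partitions", "200")

    // A fix following the same path might be driven by something like the
    // line below; the key name is hypothetical, not an actual Spark setting:
    // sqlContext.setConf("spark.sql.analyzer.maxIterations", "500")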
I'm suspicious that hitting max iterations is emblematic of some other
issue, as resolution typically happens bottom-up, in a single pass. Can you
provide more details about your schema / query?

On Tue, Dec 9, 2014 at 11:43 PM, Nitin Goyal <nitin2go...@gmail.com> wrote:

> I see that somebody had already raised a PR for this, but it hasn't been
> merged:
>
> https://issues.apache.org/jira/browse/SPARK-4339
>
> Can we merge this in the next 1.2 RC?
>
> Thanks
> -Nitin
>
> On Wed, Dec 10, 2014 at 11:50 AM, Nitin Goyal <nitin2go...@gmail.com>
> wrote:
>
>> Hi Michael,
>>
>> I think I have found the exact problem in my case. I see that we have
>> the following in Analyzer.scala:
>>
>>   // TODO: pass this in as a parameter.
>>   val fixedPoint = FixedPoint(100)
>>
>> and
>>
>>   Batch("Resolution", fixedPoint,
>>     ResolveReferences ::
>>     ResolveRelations ::
>>     ResolveSortReferences ::
>>     NewRelationInstances ::
>>     ImplicitGenerate ::
>>     StarExpansion ::
>>     ResolveFunctions ::
>>     GlobalAggregates ::
>>     UnresolvedHavingClauseAttributes ::
>>     TrimGroupingAliases ::
>>     typeCoercionRules ++
>>     extendedRules : _*),
>>
>> Perhaps in my case it reaches the 100 iterations, breaks out of the
>> while loop in RuleExecutor.scala, and thus doesn't "resolve" all the
>> attributes.
>>
>> Exception in my logs:
>>
>> 14/12/10 04:45:28 INFO HiveContext$$anon$4: Max iterations (100) reached
>> for batch Resolution
>>
>> 14/12/10 04:45:28 ERROR [Sql]: Servlet.service() for servlet [Sql] in
>> context with path [] threw exception [Servlet execution threw an
>> exception] with root cause
>>
>> org.apache.spark.sql.catalyst.errors.package$TreeNodeException:
>> Unresolved attributes: 'T1.SP AS SP#6566, 'T1.DOWN_BYTESHTTPSUBCR AS
>> DOWN_BYTESHTTPSUBCR#6567, tree:
>>
>> 'Project ['T1.SP AS SP#6566, 'T1.DOWN_BYTESHTTPSUBCR AS
>> DOWN_BYTESHTTPSUBCR#6567]
>>
>> ...
>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:80)
>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:78)
>> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
>> at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:78)
>> at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:76)
>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
>> at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
>> at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
>> at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
>> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
>> at scala.collection.immutable.List.foreach(List.scala:318)
>> at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
>> at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
>> at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
>> at org.apache.spark.sql.CacheManager$$anonfun$cacheQuery$1.apply(CacheManager.scala:86)
>> at org.apache.spark.sql.CacheManager$class.writeLock(CacheManager.scala:67)
>> at org.apache.spark.sql.CacheManager$class.cacheQuery(CacheManager.scala:85)
>> at org.apache.spark.sql.SQLContext.cacheQuery(SQLContext.scala:50)
>> at org.apache.spark.sql.SchemaRDD.cache(SchemaRDD.scala:490)
>>
>> I think the solution here is to make the FixedPoint constructor argument
>> configurable/parameterized (it is also marked as a TODO). Do we have a
>> plan to do this in the 1.2 release? Or I can take this up as a task
>> myself if you want (since this is very crucial for our release).
>>
>> Thanks
>>
>> -Nitin
>>
>> On Wed, Dec 10, 2014 at 1:06 AM, Michael Armbrust
>> <mich...@databricks.com> wrote:
>>
>>>> val newSchemaRDD = sqlContext.applySchema(existingSchemaRDD,
>>>>   existingSchemaRDD.schema)
>>>
>>> This line is throwing away the logical information about
>>> existingSchemaRDD, and thus Spark SQL can't know how to push
>>> projections or predicates down past this operator.
>>>
>>> Can you describe in more detail the problems you see if you don't do
>>> this reapplication of the schema?
>>>
>>
>> --
>> Regards
>> Nitin Goyal
>
> --
> Regards
> Nitin Goyal
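For readers digging into this later: a minimal sketch of the
parameterization Nitin proposes, assuming the bound is threaded through a
constructor argument. ConfigurableAnalyzer and maxIterations are
illustrative names, not the merged fix, and the empty rule list only keeps
the sketch self-contained; the real Analyzer would keep its "Resolution"
batch exactly as quoted above.

    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.catalyst.rules.RuleExecutor

    // Sketch: replace the hard-coded FixedPoint(100) with a value supplied
    // by the caller, which is where a configuration mechanism would plug in.
    class ConfigurableAnalyzer(maxIterations: Int = 100)
      extends RuleExecutor[LogicalPlan] {

      // was: val fixedPoint = FixedPoint(100)
      val fixedPoint = FixedPoint(maxIterations)

      // The real rule list ("ResolveReferences :: ..." above) would go here.
      val batches = Batch("Resolution", fixedPoint) :: Nil
    }

The open question in this thread is only where the caller gets the value
from, i.e. whether it should flow through the existing SQLConf-style
settings or the new path proposed in the PR.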