We've identified the cause of the change in behavior. It is related to the SQL 
conf key "spark.sql.hive.caseSensitiveInferenceMode". This key and its related 
functionality were absent from our previous build. The default setting in the 
current build was causing Spark to attempt to scan all table files during query 
analysis. Changing this setting to NEVER_INFER disabled this operation and 
resolved the issue we had.
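
For anyone who wants to try the same workaround, here is a minimal sketch of two ways to apply the setting: as a `--conf` argument to spark-submit, or on the session builder. It assumes PySpark is installed and Hive support is available. The helper names, the app name, and the script name `my_job.py` are made up for illustration; only the conf key and the NEVER_INFER value come from what we found above.

```python
# Sketch only: helper names and "my_job.py" are hypothetical.
CONF_KEY = "spark.sql.hive.caseSensitiveInferenceMode"

# The value that disabled the file scan for us during query analysis.
NEVER_INFER_CONF = {CONF_KEY: "NEVER_INFER"}


def spark_submit_conf_args(conf):
    """Turn a dict of settings into spark-submit '--conf key=value' arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


def build_session(conf, app_name="partition-pruning-check"):
    """Create a Hive-enabled SparkSession with the given settings applied."""
    # Imported lazily so the helper above works without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).enableHiveSupport()
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # Print the equivalent spark-submit command line.
    command = ["spark-submit"] + spark_submit_conf_args(NEVER_INFER_CONF) + ["my_job.py"]
    print(" ".join(command))
```

Either route should work; which is more convenient depends on whether you control the job's launch command or its code.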

Michael


> On Apr 20, 2017, at 3:42 PM, Michael Allman <mich...@videoamp.com> wrote:
> 
> I want to caution that in testing a build from this morning's branch-2.1 we 
> found that Hive partition pruning was not working. We found that Spark SQL 
> was fetching all Hive table partitions for a very simple query whereas in a 
> build from several weeks ago it was fetching only the required partitions. I 
> cannot currently think of a reason for the regression outside of some 
> difference between branch-2.1 from our previous build and branch-2.1 from 
> this morning.
> 
> That's all I know right now. We are actively investigating to find the root 
> cause of this problem, and specifically whether this is a problem in the 
> Spark codebase or not. I will report back when I have an answer to that 
> question.
> 
> Michael
> 
> 
>> On Apr 18, 2017, at 11:59 AM, Michael Armbrust <mich...@databricks.com 
>> <mailto:mich...@databricks.com>> wrote:
>> 
>> Please vote on releasing the following candidate as Apache Spark version 
>> 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and passes 
>> if a majority of at least 3 +1 PMC votes are cast.
>> 
>> [ ] +1 Release this package as Apache Spark 2.1.1
>> [ ] -1 Do not release this package because ...
>> 
>> 
>> To learn more about Apache Spark, please see http://spark.apache.org/ 
>> <http://spark.apache.org/>
>> 
>> The tag to be voted on is v2.1.1-rc3 
>> <https://github.com/apache/spark/tree/v2.1.1-rc3> 
>> (2ed19cff2f6ab79a718526e5d16633412d8c4dd4)
>> 
>> List of JIRA tickets resolved can be found with this filter 
>> <https://issues.apache.org/jira/browse/SPARK-20134?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.1>.
>> 
>> The release files, including signatures, digests, etc. can be found at:
>> http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/ 
>> <http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/>
>> 
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc 
>> <https://people.apache.org/keys/committer/pwendell.asc>
>> 
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1230/ 
>> <https://repository.apache.org/content/repositories/orgapachespark-1230/>
>> 
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/ 
>> <http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/>
>> 
>> 
>> FAQ
>> 
>> How can I help test this release?
>> 
>> If you are a Spark user, you can help us test this release by taking an 
>> existing Spark workload and running on this release candidate, then 
>> reporting any regressions.
>> 
>> What should happen to JIRA tickets still targeting 2.1.1?
>> 
>> Committers should look at those and triage. Extremely important bug fixes, 
>> documentation, and API tweaks that impact compatibility should be worked on 
>> immediately. Everything else please retarget to 2.1.2 or 2.2.0.
>> 
>> But my bug isn't fixed!??!
>> 
>> In order to make timely releases, we will typically not hold the release 
>> unless the bug in question is a regression from 2.1.0.
>> 
>> What happened to RC1?
>> 
>> There were issues with the release packaging and, as a result, RC1 was skipped.
> 
