We've identified the cause of the change in behavior. It is related to the SQL
conf key "spark.sql.hive.caseSensitiveInferenceMode". This key and its related
functionality were absent from our previous build. The default setting in the
current build was causing Spark to attempt to scan all table
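For readers hitting the same regression, the setting can be pinned explicitly. This is a hedged sketch, not taken from the thread: the property name is as quoted above, and to the best of my knowledge the accepted values in that line of releases are INFER_AND_SAVE, INFER_ONLY, and NEVER_INFER — verify against the docs for your exact build.

```
# spark-defaults.conf fragment (sketch; confirm the value names against
# your build's documentation before relying on this):
spark.sql.hive.caseSensitiveInferenceMode NEVER_INFER
```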
I want to caution that in testing a build from this morning's branch-2.1 we
found that Hive partition pruning was not working. We found that Spark SQL was
fetching all Hive table partitions for a very simple query whereas in a build
from several weeks ago it was fetching only the required
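To illustrate the behavior being reported, here is a toy plain-Python model — not Spark internals — of the difference between a pruned and an unpruned metastore lookup: with pruning, only the partitions matching the predicate are requested; without it, every partition is listed first and filtered afterwards.

```python
# Toy model (plain Python, not Spark) of partition pruning. The partition
# names and listing function are illustrative stand-ins.
PARTITIONS = ["ds=2017-04-18", "ds=2017-04-19", "ds=2017-04-20"]

def list_partitions(filter_spec=None):
    """Stand-in for a metastore listing call."""
    if filter_spec is None:
        return list(PARTITIONS)            # unpruned: fetch everything
    return [p for p in PARTITIONS if p == filter_spec]

# Pruned query: the predicate reaches the metastore, one partition returns.
pruned = list_partitions("ds=2017-04-20")

# Unpruned query: all partitions are fetched, then filtered client-side.
unpruned = [p for p in list_partitions() if p == "ds=2017-04-20"]
```

Both paths yield the same answer; the cost difference is in how many partitions the metastore call returns, which is what the report above is about.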
Doesn't common sub-expression elimination address this issue as well?
On Thu, Apr 20, 2017 at 6:40 AM Herman van Hövell tot Westerflier <
hvanhov...@databricks.com> wrote:
> Hi Michael,
>
> This sounds like a good idea. Can you open a JIRA to track this?
>
> My initial feedback on your proposal
Steve,
I think you're a good person to ask about this. Is the below any cause for
concern? Or did I perhaps test this incorrectly?
Nick
On Tue, Apr 18, 2017 at 11:50 PM Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:
> I had trouble starting up a shell with the AWS package loaded
>
Hi Michael,
This sounds like a good idea. Can you open a JIRA to track this?
My initial feedback on your proposal would be that you might want to
express the NO_COLLAPSE hint at the expression level rather than at the plan
level.
HTH
On Thu, Apr 20, 2017 at 3:31 PM, Michael Styles
Hello,
I am in the process of putting together a PR that introduces a new hint
called NO_COLLAPSE. This hint is essentially identical to Oracle's NO_MERGE
hint.
Let me first give an example of why I am proposing this.
df1 = spark.createDataFrame([(1, "abc")], ["id", "user_agent"])
df2 =
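Since the original example is truncated above, here is a hedged plain-Python sketch — not Spark code — of the kind of problem a NO_COLLAPSE/NO_MERGE-style hint addresses: when two stacked projections are collapsed into one, an expensive expression (such as a UDF) referenced by several output columns can end up inlined and evaluated once per reference instead of once per row. The function names here are illustrative.

```python
# Plain-Python sketch (not Spark) of duplicated work after projection
# collapse. `expensive` stands in for a costly expression or UDF.
calls = {"n": 0}

def expensive(x):
    calls["n"] += 1
    return x * 2

def two_step(x):
    # Projections kept separate: the expensive value is computed once,
    # and both output columns reuse it.
    y = expensive(x)
    return (y + 1, y + 2)

def collapsed(x):
    # After collapsing, the reference is substituted inline, so the
    # expensive expression runs once per output column.
    return (expensive(x) + 1, expensive(x) + 2)
```

Counting calls for one input shows `two_step` invoking `expensive` once where `collapsed` invokes it twice, which is the duplication a hint preventing collapse would avoid.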
+1 (non-binding)
On Wed, Apr 19, 2017 at 9:23 PM Dong Joon Hyun
wrote:
> +1
>
> I tested RC3 on CentOS 7.3.1611/OpenJDK 1.8.0_121/R 3.3.3
> with `-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver
> -Psparkr`
>
> At the end of R test, I saw `Had CRAN check