spark git commit: [SPARK-8184][SQL] Add additional function description for weekofyear

2017-05-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master c9749068e -> 1c7db00c7 [SPARK-8184][SQL] Add additional function description for weekofyear ## What changes were proposed in this pull request? Add additional function description for weekofyear. ## How was this patch tested? manual

[GitHub] spark issue #18132: [SPARK-8184][SQL] Add additional function description fo...

2017-05-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18132 Thanks - merging in master/branch-2.2. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request #18086: [SPARK-20854][SQL] Extend hint syntax to support ...

2017-05-25 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/18086#discussion_r118473083 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -533,13 +533,16 @@ class AstBuilder(conf: SQLConf) extends

[GitHub] spark issue #18042: [SPARK-20817][core] Fix to return "Unknown processor" on...

2017-05-25 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18042 Does this really matter? I'd rather not complicate the actual code for it to display properly in some niche hardware that very few people use.

[GitHub] spark issue #18086: [SPARK-20854][SQL] Extend hint syntax to support express...

2017-05-25 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18086 cc @gatorsmile @cloud-fan @hvanhovell

[GitHub] spark issue #18016: [SPARK-20786][SQL]Improve ceil and floor handle the valu...

2017-05-25 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18016 Hm, guys, please don't use the end-to-end tests to test expression behavior. Use unit tests, which automatically test code gen, interpreted mode, and different data types.

[GitHub] spark pull request #18087: [SPARK-20867][SQL] Move hints from Statistics int...

2017-05-24 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/18087#discussion_r118353924 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -195,9 +195,9 @@ case class Intersect

[GitHub] spark issue #18087: [SPARK-20867][SQL] Move hints from Statistics into HintI...

2017-05-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18087 cc @hvanhovell, @bogdanrdc

[GitHub] spark pull request #18087: [SPARK-20867][SQL] Move hints from Statistics int...

2017-05-24 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/18087 [SPARK-20867][SQL] Move hints from Statistics into HintInfo class ## What changes were proposed in this pull request? This is a follow-up to SPARK-20857 to move the broadcast hint from Statistics

[GitHub] spark issue #18082: [SPARK-20665][SQL][FOLLOW-UP]Move test case to SQLQueryT...

2017-05-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18082 Hm, I'm not sure it is a good idea to run so many "unit test" style tests for expressions in the end-to-end suites. It takes a lot more time than just running unit tests.

spark git commit: [SPARK-20857][SQL] Generic resolved hint node

2017-05-23 Thread rxin
ore generic and would allow us to introduce other hint types in the future without introducing new hint nodes. ## How was this patch tested? Updated test cases. Author: Reynold Xin <r...@databricks.com> Closes #18072 from rxin/SPARK-20857. (cherry picked fr

spark git commit: [SPARK-20857][SQL] Generic resolved hint node

2017-05-23 Thread rxin
ric and would allow us to introduce other hint types in the future without introducing new hint nodes. ## How was this patch tested? Updated test cases. Author: Reynold Xin <r...@databricks.com> Closes #18072 from rxin/SPARK-20857. Project: http://git-wip-us.apache.org/repos/asf/spark/re

[GitHub] spark issue #18072: [SPARK-20857][SQL] Generic resolved hint node

2017-05-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18072 Merging in master / branch-2.2 ...

[GitHub] spark pull request #18072: [SPARK-20857][SQL] Generic resolved hint node

2017-05-23 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/18072 [SPARK-20857][SQL] Generic resolved hint node ## What changes were proposed in this pull request? This patch renames BroadcastHint to ResolvedHint so it is more generic and would allow us

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18064 That works too, if we can attach metrics to these commands.

[GitHub] spark issue #18070: [SPARK-20713][Spark Core] Convert CommitDenied to TaskKi...

2017-05-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18070 cc @ericl

[GitHub] spark issue #17999: [SPARK-20751][SQL] Add built-in SQL Function - COT

2017-05-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17999 Hmm, seems like we should be following how we test tan, cos, etc. in MathExpressionsSuite?

[GitHub] spark pull request #18023: [SPARK-12139] [SQL] REGEX Column Specification

2017-05-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/18023#discussion_r117540055 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2624,4 +2624,92 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #18023: [SPARK-12139] [SQL] REGEX Column Specification

2017-05-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/18023#discussion_r117539904 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -795,6 +795,12 @@ object SQLConf { .intConf

[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2017-05-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16478 I don't know how important it is. It seems like it's primarily used by MLlib and very few other things ...

[GitHub] spark pull request #17997: [SPARK-20763][SQL]The function of `month` and `da...

2017-05-16 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17997#discussion_r116878495 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -601,22 +601,32 @@ object DateTimeUtils

[GitHub] spark issue #15821: [SPARK-13534][PySpark] Using Apache Arrow to increase pe...

2017-05-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15821 @BryanCutler even though the json is long, it is still so much clearer than reading a pile of code that generates json ...

[GitHub] spark issue #17941: [SPARK-20684][R] Expose createGlobalTempView and dropGlo...

2017-05-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17941 @felixcheung what's your concern with this one? seems like just for api parity sake we should add this?

[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-05-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17711 I feel both are pretty complicated. Can we just do something similar to CombineUnion: ``` /** * Combines all adjacent [[Union]] operators into a single [[Union]]. */ object
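
The quoted snippet is cut off in the archive. As a rough illustration of the idea it names (collapsing adjacent union operators into a single one), here is a minimal Python sketch. `Relation`, `Union`, and `combine_unions` are hypothetical names made up for this example; they are not Spark's Catalyst classes, and the actual optimizer rule lives in Spark's Scala code.

```python
from dataclasses import dataclass
from typing import List, Union as TUnion


@dataclass
class Relation:
    """Hypothetical leaf node standing in for any non-union plan."""
    name: str


@dataclass
class Union:
    """Hypothetical n-ary union node."""
    children: List["Plan"]


Plan = TUnion[Relation, Union]


def combine_unions(plan: Plan) -> Plan:
    """Flatten directly nested Union nodes into one Union, keeping child order."""
    if not isinstance(plan, Union):
        return plan
    flattened: List[Plan] = []
    # Explicit stack instead of recursion, so deep nesting cannot overflow.
    stack = list(reversed(plan.children))
    while stack:
        child = stack.pop()
        if isinstance(child, Union):
            stack.extend(reversed(child.children))
        else:
            flattened.append(child)
    return Union(flattened)


a, b, c, d, e = (Relation(n) for n in "abcde")
nested = Union([Union([a, b]), c, Union([Union([d]), e])])
flat = combine_unions(nested)
```

The same flattening shape would apply to any associative n-ary operator, which is presumably why it is suggested here for chained string concatenation.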

[GitHub] spark pull request #17942: [SPARK-20702][Core]TaskContextImpl.markTaskComple...

2017-05-11 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17942#discussion_r116143097 --- Diff: core/src/main/scala/org/apache/spark/util/taskListeners.scala --- @@ -55,14 +55,16 @@ class TaskCompletionListenerException( extends

[GitHub] spark issue #17923: [SPARK-20591][WEB UI] Succeeded tasks num not equal in a...

2017-05-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17923 sry too long ago

[GitHub] spark issue #17931: [SPARK-12837][CORE][FOLLOWUP] getting name should not fa...

2017-05-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17931 What's the issue with SQL metrics?

spark git commit: Revert "[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps"

2017-05-09 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1b85bcd92 -> ac1ab6b9d Revert "[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps" This reverts commit 22691556e5f0dfbac81b8cc9ca0a67c70c1711ca. See JIRA ticket for more information. Project:

[GitHub] spark issue #16781: [SPARK-12297][SQL] Hive compatibility for Parquet Timest...

2017-05-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16781 Did we conduct any performance tests on this patch?

[GitHub] spark pull request #17915: [SPARK-20674][SQL] Support registering UserDefine...

2017-05-09 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17915 [SPARK-20674][SQL] Support registering UserDefinedFunction as named UDF ## What changes were proposed in this pull request? For some reason we don't have an API to register UserDefinedFunction

spark git commit: [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch

2017-05-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master b31648c08 -> 5d75b14bf [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch ## What changes were proposed in this pull request? Due to a likely typo, the logDebug msg printing the diff of query plans

spark git commit: [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch

2017-05-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 f59c74a94 -> 1d9b7a74a [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch ## What changes were proposed in this pull request? Due to a likely typo, the logDebug msg printing the diff of query

spark git commit: [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch

2017-05-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 704b249b6 -> a1112c615 [SPARK-20616] RuleExecutor logDebug of batch results should show diff to start of batch ## What changes were proposed in this pull request? Due to a likely typo, the logDebug msg printing the diff of query

[GitHub] spark issue #17875: [SPARK-20616] RuleExecutor logDebug of batch results sho...

2017-05-05 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17875 Merging in master/branch-2.2.

[GitHub] spark issue #17875: [SPARK-20616] RuleExecutor logDebug of batch results sho...

2017-05-05 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17875 Jenkins, test this please.

[GitHub] spark issue #17851: [SPARK-20585][SPARKR] R generic hint support

2017-05-05 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17851 @felixcheung was this merged only in master but not branch-2.2?

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17770 @srinathshankar also thinks it's weird to add a barrier node. I suggest @hvanhovell and @srinathshankar duke it out.

[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...

2017-05-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17723 I'm saying avoid exposing Hadoop APIs. Wrap them in something if possible.

[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...

2017-05-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17723 I didn't read through the super long debate here, but I have a strong preference to not expose Hadoop APIs directly. I'm seeing more and more deployments out there that do not use Hadoop (e.g. connect

spark git commit: [SPARK-20584][PYSPARK][SQL] Python generic hint support

2017-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 13eb37c86 -> 02bbe7311 [SPARK-20584][PYSPARK][SQL] Python generic hint support ## What changes were proposed in this pull request? Adds `hint` method to PySpark `DataFrame`. ## How was this patch tested? Unit tests, doctests. Author:

spark git commit: [SPARK-20584][PYSPARK][SQL] Python generic hint support

2017-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 a3a5fcfef -> d8bd213f1 [SPARK-20584][PYSPARK][SQL] Python generic hint support ## What changes were proposed in this pull request? Adds `hint` method to PySpark `DataFrame`. ## How was this patch tested? Unit tests, doctests.

[GitHub] spark issue #17850: [SPARK-20584][PYSPARK][SQL] Python generic hint support

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17850 Merging in master/2.2.

[GitHub] spark issue #17850: [SPARK-20584][PYSPARK][SQL] Python generic hint support

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17850 LGTM pending Jenkins.

[GitHub] spark pull request #17850: [SPARK-20584][PYSPARK][SQL] Python generic hint s...

2017-05-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17850#discussion_r114677412 --- Diff: python/pyspark/sql/dataframe.py --- @@ -380,6 +380,35 @@ def withWatermark(self, eventTime, delayThreshold): jdf = self

spark git commit: [MINOR][SQL] Fix the test title from =!= to <=>, remove a duplicated test and add a test for =!=

2017-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6b9e49d12 -> 13eb37c86 [MINOR][SQL] Fix the test title from =!= to <=>, remove a duplicated test and add a test for =!= ## What changes were proposed in this pull request? This PR proposes three things as below: - This test looks not

spark git commit: [MINOR][SQL] Fix the test title from =!= to <=>, remove a duplicated test and add a test for =!=

2017-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 36d807906 -> 2629e7c7a [MINOR][SQL] Fix the test title from =!= to <=>, remove a duplicated test and add a test for =!= ## What changes were proposed in this pull request? This PR proposes three things as below: - This test looks

[GitHub] spark issue #17842: [MINOR][SQL] Fix the test title from =!= to <=>, remove ...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17842 Merging in master/branch-2.2.

[GitHub] spark issue #17839: [SPARK-20576][SQL] Support generic hint function in Data...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17839 BTW I filed follow-up tickets for Python/R at https://issues.apache.org/jira/browse/SPARK-20576

spark git commit: [SPARK-20576][SQL] Support generic hint function in Dataset/DataFrame

2017-05-03 Thread rxin
s well as SQL. As an example, after this patch, the following will apply a broadcast hint on a DataFrame using the new hint function: ``` df1.join(df2.hint("broadcast")) ``` ## How was this patch tested? Added a test case in DataFrameJoinSuite. Author: Reynold Xin <r...@databricks.com

[GitHub] spark issue #17839: [SPARK-20576][SQL] Support generic hint function in Data...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17839 Merging in master/branch-2.2.

[GitHub] spark issue #17839: [SPARK-20576][SQL] Support generic hint function in Data...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17839 @felixcheung do you worry about conflicts?

[GitHub] spark issue #17678: [SPARK-20381][SQL] Add SQL metrics of numOutputRows for ...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17678 cc @gatorsmile can you review this?

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17770 Let's see what other people say before going too far... cc @cloud-fan / @hvanhovell / @marmbrus / @gatorsmile see my proposal: https://github.com/apache/spark/pull/17770#issuecomment-298833348

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17770 What self join case are you talking about? The one that we manually rewrite half of the plan? That one would be a special case anyway, wouldn't it?

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17770 I'm actually wondering if we should just introduce a variant of transform that takes a stop condition, e.g. ``` def transform(stopCondition: BaseType => Boolean)(rule: PartialFunct
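
The proposed signature is truncated in the archive, so its exact shape is unknown. Purely as an illustration of the idea (a tree transform that does not descend into subtrees matching a stop condition), here is a minimal Python sketch. `Node` and `transform_down` are hypothetical names invented for this example and are not Catalyst's `TreeNode` API; the `resolved` flag is likewise only an assumed example of a stop condition.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """Hypothetical plan node used only for this sketch."""
    name: str
    children: List["Node"] = field(default_factory=list)
    resolved: bool = False


def transform_down(
    node: Node,
    stop_condition: Callable[[Node], bool],
    rule: Callable[[Node], Optional[Node]],
) -> Node:
    """Apply `rule` top-down, leaving any subtree where `stop_condition` holds untouched."""
    if stop_condition(node):
        return node
    # The rule returns None when it does not apply to this node.
    rewritten = rule(node) or node
    new_children = [
        transform_down(child, stop_condition, rule) for child in rewritten.children
    ]
    return Node(rewritten.name, new_children, rewritten.resolved)


def uppercase(node: Node) -> Node:
    """Toy rule: upper-case the node name."""
    return Node(node.name.upper(), node.children, node.resolved)


plan = Node("project", [Node("filter", [Node("scan")], resolved=True), Node("scan")])
result = transform_down(plan, lambda n: n.resolved, uppercase)
```

In this toy plan the already-resolved `filter` subtree is returned as-is, while the root and the unresolved `scan` are rewritten.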

[GitHub] spark issue #17839: [SPARK-20576][SQL] Support generic hint function in Data...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17839 Actually somebody should add the Python / R wrapper. cc @felixcheung

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17770 why don't we always add this to the dataset's logicalPlan? we can change that in one place.

[GitHub] spark pull request #17770: [SPARK-20392][SQL] Set barrier to prevent re-ente...

2017-05-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17770#discussion_r114478015 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1134,7 +1138,7 @@ class Dataset[T] private[sql

[GitHub] spark pull request #17839: [SPARK-20576][SQL] Support generic hint function ...

2017-05-03 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17839 [SPARK-20576][SQL] Support generic hint function in Dataset/DataFrame ## What changes were proposed in this pull request? We allow users to specify hints (currently only "broadcast" is

[GitHub] spark issue #17806: [SPARK-20487][SQL] Display `serde` for `HiveTableScan` n...

2017-04-28 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17806 @gatorsmile i will let you merge ...

[GitHub] spark issue #17806: [SPARK-20487][SQL] Display `serde` for `HiveTableScan` n...

2017-04-28 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17806 Maybe get rid of the Some? If it is not defined, we probably just shouldn't show anything.

[GitHub] spark issue #17780: [SPARK-20487][SQL] `HiveTableScan` node is quite verbose...

2017-04-28 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17780 Can we at least include the serde?

spark git commit: [SPARK-20474] Fixing OnHeapColumnVector reallocation

2017-04-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 6709bcf6e -> e278876ba [SPARK-20474] Fixing OnHeapColumnVector reallocation ## What changes were proposed in this pull request? OnHeapColumnVector reallocation copies to the new storage data up to 'elementsAppended'. This variable is

spark git commit: [SPARK-20474] Fixing OnHeapColumnVector reallocation

2017-04-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 99c6cf9ef -> a277ae80a [SPARK-20474] Fixing OnHeapColumnVector reallocation ## What changes were proposed in this pull request? OnHeapColumnVector reallocation copies to the new storage data up to 'elementsAppended'. This variable is only

[GitHub] spark issue #17773: [SPARK-20474] Fixing OnHeapColumnVector reallocation

2017-04-26 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17773 Merging in master/branch-2.2.

spark git commit: [SPARK-20473] Enabling missing types in ColumnVector.Array

2017-04-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 b65858bb3 -> 6709bcf6e [SPARK-20473] Enabling missing types in ColumnVector.Array ## What changes were proposed in this pull request? ColumnVector implementations originally did not support some Catalyst types (float, short, and

spark git commit: [SPARK-20473] Enabling missing types in ColumnVector.Array

2017-04-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 66dd5b83f -> 99c6cf9ef [SPARK-20473] Enabling missing types in ColumnVector.Array ## What changes were proposed in this pull request? ColumnVector implementations originally did not support some Catalyst types (float, short, and boolean).

[GitHub] spark issue #17772: [SPARK-20473] Enabling missing types in ColumnVector.Arr...

2017-04-26 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17772 Merging in master / branch-2.2.

[GitHub] spark issue #17770: [SPARK-20392][SQL][WIP] Set barrier to prevent re-enteri...

2017-04-26 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17770 Can we fix the description? It is really confusing since it uses the word exchange. Also can we just skip a plan if it is resolved in transform?

[GitHub] spark issue #17727: [SQL][MINOR] Remove misleading comment (and tags do bett...

2017-04-25 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17727 Hm I don't think the comment makes sense ...

spark git commit: [SPARK-20453] Bump master branch version to 2.3.0-SNAPSHOT

2017-04-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5280d93e6 -> f44c8a843 [SPARK-20453] Bump master branch version to 2.3.0-SNAPSHOT This patch bumps the master branch version to `2.3.0-SNAPSHOT`. Author: Josh Rosen Closes #17753 from JoshRosen/SPARK-20453.

[GitHub] spark issue #17753: [SPARK-20453] Bump master branch version to 2.3.0-SNAPSH...

2017-04-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17753 Merging in master.

[GitHub] spark issue #14731: [SPARK-17159] [streaming]: optimise check for new files ...

2017-04-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14731 Steve, I think the main point is you should also respect the time of reviewers. The way most of your pull requests manifest has been suboptimal: they often start with a very early WIP (which

[GitHub] spark issue #17648: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...

2017-04-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17648 sgtm

[GitHub] spark issue #17736: [SPARK-20399][SQL] Can't use same regex pattern between ...

2017-04-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17736 cc @hvanhovell for review ...

[GitHub] spark issue #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17712 Why use a map? That's super unstructured and easy to break ...

[GitHub] spark issue #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17712 cc @gatorsmile This is related to the deterministic thing you want to do?

[GitHub] spark issue #17717: [SPARK-20430][SQL] Initialise RangeExec parameters in a ...

2017-04-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17717 LGTM pending Jenkins.

[GitHub] spark pull request #17717: [SPARK-20430][SQL] Initialise RangeExec parameter...

2017-04-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17717#discussion_r112803232 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -1732,4 +1732,10 @@ class DataFrameSuite extends QueryTest

[GitHub] spark pull request #17717: [SPARK-20430][SQL] Initialise RangeExec parameter...

2017-04-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17717#discussion_r112803234 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -1732,4 +1732,10 @@ class DataFrameSuite extends QueryTest

[GitHub] spark pull request #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17712#discussion_r112803097 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -45,14 +45,33 @@ import

[GitHub] spark pull request #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17712#discussion_r112800640 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -45,14 +45,33 @@ import

[GitHub] spark issue #17648: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...

2017-04-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17648 I was saying rather than implementing them, just rewrite them into an aggregate on the conditions and compare them against the value.
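
One possible reading of the rewrite suggested above, sketched in plain Python rather than in Spark's planner: EVERY and ANY are expressed as a count aggregate over the condition, and the count is then compared against a fixed value. The function names `every` and `any_match` are hypothetical, SQL NULL handling is ignored, and an empty group is treated as "every" being true, which may differ from what the pull request settles on.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def every(rows: Iterable[T], cond: Callable[[T], bool]) -> bool:
    """EVERY(cond) rewritten as: count the rows where cond fails, compare with 0."""
    # Aggregate on the condition: number of rows that violate it.
    failures = sum(1 for row in rows if not cond(row))
    # Compare the aggregate against the value 0.
    return failures == 0


def any_match(rows: Iterable[T], cond: Callable[[T], bool]) -> bool:
    """ANY / SOME(cond) rewritten as: count the rows where cond holds, compare with 0."""
    # Aggregate on the condition: number of rows that satisfy it.
    matches = sum(1 for row in rows if cond(row))
    # Compare the aggregate against the value 0.
    return matches > 0


if __name__ == "__main__":
    values = [3, 5, 8]
    print(every(values, lambda v: v > 0))      # all rows are positive
    print(every(values, lambda v: v % 2 == 1))  # 8 is even, so this fails
    print(any_match(values, lambda v: v > 7))   # 8 satisfies the condition
```

The point of the sketch is only that no dedicated aggregate is needed: an existing aggregate such as a count (or a min/max over the boolean condition) plus a comparison gives the same answer for non-null inputs.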

[GitHub] spark pull request #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17712#discussion_r112754224 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -47,12 +47,20 @@ case class UserDefinedFunction protected

spark git commit: [SPARK-20420][SQL] Add events to the external catalog

2017-04-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 48d760d02 -> e2b3d2367 [SPARK-20420][SQL] Add events to the external catalog ## What changes were proposed in this pull request? It is often useful to be able to track changes to the `ExternalCatalog`. This PR makes the `ExternalCatalog`

spark git commit: [SPARK-20420][SQL] Add events to the external catalog

2017-04-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.2 6cd2f16b1 -> cddb4b7db [SPARK-20420][SQL] Add events to the external catalog ## What changes were proposed in this pull request? It is often useful to be able to track changes to the `ExternalCatalog`. This PR makes the

[GitHub] spark issue #17710: [SPARK-20420][SQL] Add events to the external catalog

2017-04-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17710 Merging in master/branch-2.2.

[GitHub] spark pull request #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17712#discussion_r112622098 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -47,12 +47,20 @@ case class UserDefinedFunction protected

[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17711 can you add a test case in sql query file tests?

[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17711#discussion_r112590613 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1483,4 +1483,12 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark issue #17705: [SPARK-20410][SQL] Make sparkConf a def in SharedSQLCont...

2017-04-20 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17705 LGTM

[GitHub] spark pull request #17699: [SPARK-20405][SQL] Dataset.withNewExecutionId sho...

2017-04-20 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17699 [SPARK-20405][SQL] Dataset.withNewExecutionId should be private ## What changes were proposed in this pull request? Dataset.withNewExecutionId is only used in Dataset itself and should be private

[GitHub] spark pull request #17698: [SPARK-20403][SQL][Documentation]Modify the instr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17698#discussion_r112383091 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -1036,3 +1036,8 @@ case class UpCast(child: Expression

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112382152 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ArrowConvertersSuite.scala --- @@ -0,0 +1,568 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112381608 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ArrowConvertersSuite.scala --- @@ -0,0 +1,568 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112376143 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112376037 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112375921 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation

[GitHub] spark pull request #15821: [SPARK-13534][PySpark] Using Apache Arrow to incr...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15821#discussion_r112375496 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/ArrowConverters.scala --- @@ -0,0 +1,432 @@ +/* +* Licensed to the Apache Software Foundation
