[GitHub] spark issue #17465: [SPARK-20136][SQL] Add num files and metadata operation ...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17465 cc @ericl, @bogdanrdc, @adrian-ionescu, @cloud-fan --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request #17465: [SPARK-20136][SQL] Add num files and metadata ope...

2017-03-29 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17465 [SPARK-20136][SQL] Add num files and metadata operation timing to scan operator metrics ## What changes were proposed in this pull request? This patch adds explicit metadata operation timing

spark git commit: [SPARK-20134][SQL] SQLMetrics.postDriverMetricUpdates to simplify driver side metric updates

2017-03-29 Thread rxin
ide. This patch introduces a new SQLMetrics.postDriverMetricUpdates function to do that, and adds documentation to make it more obvious. ## How was this patch tested? Updated a test case to use this method. Author: Reynold Xin <r...@databricks.com> Closes #17464 from rxin/SPARK-20134.

[GitHub] spark issue #17464: [SPARK-20134][SQL] SQLMetrics.postDriverMetricUpdates to...

2017-03-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17464 Merging in master/branch-2.1.

spark git commit: [SPARK-20134][SQL] SQLMetrics.postDriverMetricUpdates to simplify driver side metric updates

2017-03-29 Thread rxin
ver side. This patch introduces a new SQLMetrics.postDriverMetricUpdates function to do that, and adds documentation to make it more obvious. ## How was this patch tested? Updated a test case to use this method. Author: Reynold Xin <r...@databricks.com> Closes #17464 from rxin/SPARK-20134.

[GitHub] spark pull request #17464: [SPARK-20134][SQL] SQLMetrics.postDriverMetricUpd...

2017-03-29 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17464#discussion_r108600240 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLListenerSuite.scala --- @@ -477,9 +477,11 @@ private case class MyPlan(sc

[GitHub] spark pull request #17464: [SPARK-20134][SQL] SQLMetrics.postDriverMetricUpd...

2017-03-28 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17464 [SPARK-20134][SQL] SQLMetrics.postDriverMetricUpdates to simplify driver side metric updates ## What changes were proposed in this pull request? It is not super intuitive how to update SQLMetric

[GitHub] spark issue #17424: [SPARK-20089] [SQL] [TEST] Added DESC FUNCTION and DESC ...

2017-03-25 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17424 Hm - so this would require us to update the test suite every time there is an update to the docs?

spark git commit: [SPARK-20070][SQL] Fix 2.10 build

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master f88f56b83 -> 0a6c50711 [SPARK-20070][SQL] Fix 2.10 build ## What changes were proposed in this pull request? Commit https://github.com/apache/spark/commit/91fa80fe8a2480d64c430bd10f97b3d44c007bcc broke the build for scala 2.10. The

[GitHub] spark issue #17420: [SPARK-20070][SQL] Fix 2.10 build

2017-03-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17420 Merging in master.

spark git commit: [DOCS] Clarify round mode for format_number & round functions

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master e011004be -> f88f56b83 [DOCS] Clarify round mode for format_number & round functions ## What changes were proposed in this pull request? Updated the description for the `format_number` description to indicate that it uses `HALF_EVEN`

[GitHub] spark issue #17399: [DOCS] Clarify round mode for format_number & round func...

2017-03-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17399 Thanks - merging in master.

spark git commit: [SPARK-19846][SQL] Add a flag to disable constraint propagation

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master b5c5bd98e -> e011004be [SPARK-19846][SQL] Add a flag to disable constraint propagation ## What changes were proposed in this pull request? Constraint propagation can be computation expensive and block the driver execution for long time.

[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

2017-03-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17186 Merging in master.

spark git commit: Disable generate codegen since it fails my workload.

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 91fa80fe8 -> b5c5bd98e Disable generate codegen since it fails my workload. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b5c5bd98 Tree:

[GitHub] spark issue #17399: Update functions.scala

2017-03-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17399 @roxannemoslehi can you fix the title? We can then merge this.

[GitHub] spark issue #17399: Update functions.scala

2017-03-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17399 Yea we definitely need a better title. Thanks for contributing though.

spark git commit: Typo fixup in comment

2017-03-23 Thread rxin
Repository: spark Updated Branches: refs/heads/master b70c03a42 -> b0ae6a38a Typo fixup in comment ## What changes were proposed in this pull request? Fixup typo in comment. ## How was this patch tested? Don't need. Author: Ye Yin Closes #17396 from hustcat/fix. Project:

[GitHub] spark issue #17397: [SPARK-20070][SQL] Redact DataSourceScanExec treeString

2017-03-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17397 LGTM

[GitHub] spark issue #17396: Typo fixup in comment

2017-03-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17396 Merging in master. Thanks.

[GitHub] spark issue #17312: [SPARK-19973] Display num of executors for the stage.

2017-03-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17312 That would be pretty confusing wouldn't it? The table has 3 entries and the title says only 2.

[GitHub] spark issue #17312: [SPARK-19973] Display num of executors for the stage.

2017-03-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17312 Your screenshot had 3 executors. Why does it say 2?

[GitHub] spark issue #17359: [SPARK-20028][SQL] Add aggreagate expression nGrams

2017-03-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17359 Why do we want this? Seems extremely low usage on this function in the wild.

spark git commit: clarify array_contains function description

2017-03-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master a8877bdbb -> a04dcde8c clarify array_contains function description ## What changes were proposed in this pull request? The description in the comment for array_contains is vague/incomplete (i.e., doesn't mention that it returns `null` if

[GitHub] spark issue #17380: clarify array_contains function description

2017-03-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17380 Thanks - merging in master/branch-2.1.

spark git commit: clarify array_contains function description

2017-03-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 5c18b6c31 -> 9dfdd2adf clarify array_contains function description ## What changes were proposed in this pull request? The description in the comment for array_contains is vague/incomplete (i.e., doesn't mention that it returns

[GitHub] spark issue #17343: [SPARK-20014] Optimize mergeSpillsWithFileStream method

2017-03-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17343 Can you add some documentation inline so in the future we'd know why specific implementations were chosen?

[GitHub] spark issue #17312: [SPARK-19973] Display num of executors for the stage.

2017-03-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17312 Can you put a screenshot here? Might actually be useful to have.

[GitHub] spark issue #17318: [SPARK-19896][SQL] Throw an exception if case classes ha...

2017-03-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17318 Can you put the after exception in the pr description as well?

spark git commit: [SQL][MINOR] Fix scaladoc for UDFRegistration

2017-03-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 5fb70831b -> 780f6060c [SQL][MINOR] Fix scaladoc for UDFRegistration ## What changes were proposed in this pull request? Fix scaladoc for UDFRegistration ## How was this patch tested? local build Author: Jacek Laskowski

[GitHub] spark issue #17337: [SQL][MINOR] Fix scaladoc for UDFRegistration

2017-03-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17337 Merging in master/branch-2.1.

spark git commit: [SQL][MINOR] Fix scaladoc for UDFRegistration

2017-03-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3783539d7 -> 6326d406b [SQL][MINOR] Fix scaladoc for UDFRegistration ## What changes were proposed in this pull request? Fix scaladoc for UDFRegistration ## How was this patch tested? local build Author: Jacek Laskowski

[GitHub] spark pull request #17330: [SPARK-19993][SQL][WIP] Caching logical plans con...

2017-03-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17330#discussion_r106758290 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala --- @@ -61,6 +63,36 @@ abstract class SubqueryExpression

spark git commit: [SPARK-18847][GRAPHX] PageRank gives incorrect results for graphs with sinks

2017-03-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 376d78216 -> bfdeea5c6 [SPARK-18847][GRAPHX] PageRank gives incorrect results for graphs with sinks ## What changes were proposed in this pull request? Graphs with sinks (vertices with no outgoing edges) don't have the expected rank sum

[GitHub] spark issue #16483: [SPARK-18847][GraphX] PageRank gives incorrect results f...

2017-03-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16483 Merging in master. Thanks!

[GitHub] spark pull request #17322: [SPARK-19987][SQL] Pass all filters into FileInde...

2017-03-16 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17322 [SPARK-19987][SQL] Pass all filters into FileIndex ## What changes were proposed in this pull request? This is a tiny teeny refactoring to pass data filters also to the FileIndex, so FileIndex

[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17191 I personally have run into this issue and was surprised that we didn't support it ... it's pretty verbose to retype everything. If Postgres and MySQL both support it, I think we should do

[GitHub] spark issue #17303: [SPARK-19112][CORE] add codec for ZStandard

2017-03-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17303 Yes it'd be nice to have some benchmark on this.

spark git commit: [MINOR][CORE] Fix a info message of `prunePartitions`

2017-03-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 97cc5e5a5 -> 54a3697f1 [MINOR][CORE] Fix a info message of `prunePartitions` ## What changes were proposed in this pull request? `PrunedInMemoryFileIndex.prunePartitions` shows `pruned NaN% partitions` for the following case. ```scala

[GitHub] spark issue #17273: [MINOR][CORE] Fix a info message of `prunePartitions` in...

2017-03-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17273 Merging in master.

spark git commit: [SPARK-19960][CORE] Move `SparkHadoopWriter` to `internal/io/`

2017-03-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 02c274eab -> 97cc5e5a5 [SPARK-19960][CORE] Move `SparkHadoopWriter` to `internal/io/` ## What changes were proposed in this pull request? This PR introduces the following changes: 1. Move `SparkHadoopWriter` to `core/internal/io/`, so

[GitHub] spark issue #17304: [SPARK-19960][CORE] Move `SparkHadoopWriter` to `interna...

2017-03-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17304 Merging in master.

[GitHub] spark pull request #17301: [SPARK-19944][SQL] Move SQLConf from sql/core to ...

2017-03-15 Thread rxin
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/17301

[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17166 hm it might be useful to have details, but it'd also be useful to have this in the overview page without having to drill down. iiuc, the pr already has the information in task list page, doesn't

[GitHub] spark pull request #17301: [SPARK-19944][SQL] Move SQLConf from sql/core to ...

2017-03-15 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17301 [SPARK-19944][SQL] Move SQLConf from sql/core to sql/catalyst (branch-2.1) ## What changes were proposed in this pull request? This patch moves SQLConf from sql/core to sql/catalyst. To minimize

[GitHub] spark issue #17273: [MINOR][CORE] No need to call `prunePartitions` in case ...

2017-03-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17273 I'd fix the log msg instead.

[GitHub] spark pull request #17292: DebugFilesystem.assertNoOpenStreams should report...

2017-03-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17292#discussion_r106093910 --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala --- @@ -537,6 +539,21 @@ class SparkContextSuite extends SparkFunSuite

[GitHub] spark issue #17264: [SPARK-19923][SQL] Remove unnecessary type conversions p...

2017-03-14 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17264 In the future can we put the perf result in PR descriptions?

[GitHub] spark pull request #17285: [SPARK-19944][SQL] Move SQLConf from sql/core to ...

2017-03-14 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17285#discussion_r105976759 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SimpleCatalystConf.scala --- @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17285: [SPARK-19944][SQL] Move SQLConf from sql/core to ...

2017-03-13 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17285 [SPARK-19944][SQL] Move SQLConf from sql/core to sql/catalyst ## What changes were proposed in this pull request? This patch moves SQLConf from sql/core to sql/catalyst. To minimize the changes

[GitHub] spark issue #16541: [SPARK-19088][SQL] Optimize sequence type deserializatio...

2017-03-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16541 I didn't look into the details here, but very often scanning data twice doesn't necessarily slow things down, especially in the case of sequential scan.

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-03-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r105506911 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -17,43 +17,70 @@ package org.apache.spark.sql.internal

[GitHub] spark pull request #17241: [SPARK-19877][SQL] Restrict the nested level of a...

2017-03-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17241#discussion_r105453191 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -595,6 +594,11 @@ class Analyzer( case view

[GitHub] spark issue #17241: [SPARK-19877][SQL] Restrict the nested level of a view

2017-03-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17241 SGTM

[GitHub] spark issue #17244: [SPARK-19889][SQL] Make TaskContext callbacks thread saf...

2017-03-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17244 LGTM

[GitHub] spark issue #17220: [SPARK-19862] 'tungsten-sort' should be deleted in Spark...

2017-03-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17220 I don't think you understand this. This value is here so if at some point some user picked tungsten-sort, we won't break it. In recent versions of Spark the default sort manager accomplishes the thing

[GitHub] spark issue #17220: [SPARK-19862] 'tungsten-sort' should be deleted in Spark...

2017-03-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17220 If anything, we should just update the file to add a line of comment to make sure people don't delete this in the future.

[GitHub] spark issue #17220: [SPARK-19862] 'tungsten-sort' should be deleted in Spark...

2017-03-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17220 Is this change even correct? This is here for backward compatibility.

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r104983300 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,11 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r104983221 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -563,7 +563,7 @@ class Dataset[T] private[sql]( * @param eventTime the name

[GitHub] spark issue #17205: [SPARK-19843] [SQL] [Followup] Classdoc for `IntWrapper`...

2017-03-08 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17205 Merging in master.

spark git commit: [SPARK-19843][SQL][FOLLOWUP] Classdoc for `IntWrapper` and `LongWrapper`

2017-03-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9a6ac7226 -> e420fd459 [SPARK-19843][SQL][FOLLOWUP] Classdoc for `IntWrapper` and `LongWrapper` ## What changes were proposed in this pull request? This is as per suggestion by rxin at : https://github.com/apache/spark/pull/17

[GitHub] spark issue #17205: [SPARK-19843] [SQL] [Followup] Classdoc for `IntWrapper`...

2017-03-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17205 LGTM too

[GitHub] spark pull request #17184: [SPARK-19843] [SQL] UTF8String => (int / long) co...

2017-03-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17184#discussion_r104845661 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -897,41 +898,52 @@ public long toLong() { break

[GitHub] spark issue #17184: [SPARK-19843] [SQL] UTF8String => (int / long) conversio...

2017-03-07 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17184 I believe IBM J9 actually improved this specific case (their JIT handles tons of exceptions better). Oh well -- if only JIT is perfect.

[GitHub] spark pull request #17184: [SPARK-19843] [SQL] UTF8String => (int / long) co...

2017-03-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17184#discussion_r104841789 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -897,41 +898,52 @@ public long toLong() { break

[GitHub] spark pull request #17184: [SPARK-19843] [SQL] UTF8String => (int / long) co...

2017-03-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17184#discussion_r104841761 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -897,41 +898,52 @@ public long toLong() { break

[GitHub] spark pull request #17184: [SPARK-19843] [SQL] UTF8String => (int / long) co...

2017-03-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17184#discussion_r104841735 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -850,26 +850,27 @@ public UTF8String translate(Map<Charac

[GitHub] spark pull request #17196: [SPARK-19855][SQL] Create an internal FilePartiti...

2017-03-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17196#discussion_r104804384 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FilePartitionStrategy.scala --- @@ -0,0 +1,156 @@ +/* + * Licensed

[GitHub] spark pull request #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved...

2017-03-07 Thread rxin
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/16958

[GitHub] spark pull request #17196: [SPARK-19855][SQL] Create an internal FilePartiti...

2017-03-07 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17196#discussion_r104798525 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FilePartitionStrategy.scala --- @@ -0,0 +1,156 @@ +/* + * Licensed

[GitHub] spark pull request #17196: [SPARK-19855][SQL] Create an internal FilePartiti...

2017-03-07 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17196 [SPARK-19855][SQL] Create an internal FilePartitionStrategy interface ## What changes were proposed in this pull request? The way we currently do file partitioning strategy is hard coded

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104595706 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104593920 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala --- @@ -40,7 +40,8 @@ private[spark] object

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104593825 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -732,6 +732,13 @@ class DAGScheduler

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104593790 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -158,7 +158,8 @@ private[spark] class Executor( threadPool.execute(tr

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104593710 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...

2017-03-06 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r104593724 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends Logging

[GitHub] spark issue #15928: [SPARK-18478][SQL] Support codegen'd Hive UDFs

2017-03-02 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15928 What do you mean? The improvement was small?

[GitHub] spark issue #17114: [SPARK-19758][SQL] Resolving timezone aware expressions ...

2017-02-28 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17114 Put the test case in a sql file?

[GitHub] spark pull request #17099: [SPARK-19766][SQL] Constant alias columns in INNE...

2017-02-28 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17099#discussion_r103501851 --- Diff: sql/core/src/test/resources/sql-tests/inputs/inner-join.sql --- @@ -0,0 +1,25 @@ +CREATE TEMPORARY VIEW t1 AS SELECT * FROM VALUES (1

spark git commit: [SPARK-17495][SQL] Add more tests for hive hash

2017-02-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master a920a4369 -> 3e40f6c3d [SPARK-17495][SQL] Add more tests for hive hash ## What changes were proposed in this pull request? This PR adds tests hive-hash by comparing the outputs generated against Hive 1.2.1. Following datatypes are

[GitHub] spark issue #17049: [SPARK-17495] [SQL] Add more tests for hive hash

2017-02-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17049 Merging in master.

[GitHub] spark pull request #17053: [SPARK-18939][SQL] Timezone support in partition ...

2017-02-23 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17053#discussion_r102889140 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala --- @@ -251,7 +251,8 @@ abstract class ExternalCatalog

[GitHub] spark issue #17049: [SPARK-17495] [SQL] Add more tests for hive hash

2017-02-23 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17049 Looks good except that comment.

[GitHub] spark pull request #17049: [SPARK-17495] [SQL] Add more tests for hive hash

2017-02-23 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17049#discussion_r102881054 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala --- @@ -71,6 +75,242 @@ class HashExpressionsSuite

[GitHub] spark issue #17002: [SPARK-19669][SQL] Open up visibility for sharedState, s...

2017-02-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17002 Yea @gatorsmile be careful in the future and check the commit hash.

[GitHub] spark pull request #17002: [SPARK-19669][SQL] Open up visibility for sharedS...

2017-02-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/17002#discussion_r102070142 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala --- @@ -95,16 +95,26 @@ class SparkSession private( /** * State

[GitHub] spark pull request #17002: [SPARK-19669][SQL] Open up visibility for sharedS...

2017-02-20 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/17002 [SPARK-19669][SQL] Open up visibility for sharedState, sessionState, and a few other functions ## What changes were proposed in this pull request? To ease debugging, most of Spark SQL internals

[GitHub] spark issue #16977: [SPARK-19651][CORE] ParallelCollectionRDD.collect should...

2017-02-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16977 Are tests flaky right now? Otherwise it seems like this has introduced a legitimate issue with the test timing out. Three times in a row.

spark git commit: [SPARK-19447] Make Range operator generate "recordsRead" metric

2017-02-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 729ce3703 -> b486ffc86 [SPARK-19447] Make Range operator generate "recordsRead" metric ## What changes were proposed in this pull request? The Range was modified to produce "recordsRead" metric instead of "generated rows". The tests were

[GitHub] spark issue #16960: [SPARK-19447] Make Range operator generate "recordsRead"...

2017-02-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16960 Merging in master.

[GitHub] spark issue #16960: [SPARK-19447] Make Range operator generate "recordsRead"...

2017-02-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16960 cc @hvanhovell if you have a min to review this ...

[GitHub] spark pull request #16960: [SPARK-19447] Make Range operator generate "recor...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16960#discussion_r101575264 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -309,4 +314,84 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #16960: [SPARK-19447] Make Range operator generate "recor...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16960#discussion_r101575199 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -309,4 +314,84 @@ class SQLMetricsSuite extends

[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16958 So nice when I got two LGTMs and then Jenkins disagreed.

[GitHub] spark issue #16826: [WIP][SPARK-19540][SQL] Add ability to clone SparkSessio...

2017-02-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16826 What's WIP about this?

[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-02-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16611 For SQL, rather than "array", can we follow Python, e.g.

```
CREATE TEMPORARY TABLE tableA USING csv OPTIONS (nullValue ['NA', 'null'], ...)
```

[GitHub] spark pull request #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support f...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16611#discussion_r101553890 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -97,6 +99,15 @@ class DataFrameReader private[sql](sparkSession

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16534 Change looks good to me but I didn't look super carefully. @holdenk can you take a look at this?
