[GitHub] [spark] vanzin commented on issue #26924: [SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError

2019-12-23 Thread GitBox
vanzin commented on issue #26924: [SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError URL: https://github.com/apache/spark/pull/26924#issuecomment-568560238 I don't think there's a way to write a proper test here without changing a bunch

[GitHub] [spark] vanzin commented on issue #26924: [SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError

2019-12-23 Thread GitBox
vanzin commented on issue #26924: [SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError URL: https://github.com/apache/spark/pull/26924#issuecomment-568560270 retest this please

[GitHub] [spark] rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360978971 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

[GitHub] [spark] vanzin commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly

2019-12-23 Thread GitBox
vanzin commented on issue #26845: [SPARK-21869][SS] Revise Kafka producer pool to implement 'expire' correctly URL: https://github.com/apache/spark/pull/26845#issuecomment-568557634 retest this please This is an automated

[GitHub] [spark] gatorsmile commented on a change in pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing timestamps and dates from CSV

2019-12-23 Thread GitBox
gatorsmile commented on a change in pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing timestamps and dates from CSV URL: https://github.com/apache/spark/pull/23150#discussion_r360978303 ## File path: docs/sql-migration-guide-upgrade.md ## @@ -33,6

[GitHub] [spark] rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360977274 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

[GitHub] [spark] brkyvz commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
brkyvz commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360976390 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

[GitHub] [spark] brkyvz commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
brkyvz commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360976159 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

[GitHub] [spark] hensg removed a comment on issue #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S…

2019-12-23 Thread GitBox
hensg removed a comment on issue #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S… URL: https://github.com/apache/spark/pull/26989#issuecomment-568548971 We still using it on our cluster and would be nice to have this fix

[GitHub] [spark] brkyvz commented on issue #26974: [SPARK-30324][SQL] Simplify JSON field access through dot notation

2019-12-23 Thread GitBox
brkyvz commented on issue #26974: [SPARK-30324][SQL] Simplify JSON field access through dot notation URL: https://github.com/apache/spark/pull/26974#issuecomment-568550864 retest this please

[GitHub] [spark] rdblue commented on a change in pull request #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#discussion_r360970183 ## File path: sql/core/src/main/scala/org/apache/spark/sql/connector/read/V1Scan.scala ## @@

[GitHub] [spark] hensg commented on issue #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S…

2019-12-23 Thread GitBox
hensg commented on issue #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S… URL: https://github.com/apache/spark/pull/26989#issuecomment-568548971 We still using it on our cluster and would be nice to have this fix

[GitHub] [spark] rdblue commented on a change in pull request #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26231: [SPARK-29572][SQL] add v1 read fallback API in DS v2 URL: https://github.com/apache/spark/pull/26231#discussion_r360969214 ## File path:

[GitHub] [spark] rdblue commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#issuecomment-568546914 One more thing: can we keep the discussion about the `TableProvider` API changes in one place? Right now we've

[GitHub] [spark] rdblue commented on issue #26868: [SPARK-29665][SQL] refine the TableProvider interface

2019-12-23 Thread GitBox
rdblue commented on issue #26868: [SPARK-29665][SQL] refine the TableProvider interface URL: https://github.com/apache/spark/pull/26868#issuecomment-568546625 @cloud-fan, thanks for bringing up a flag to not store schema and partitioning in the catalog. I don't recall that discussion, but

[GitHub] [spark] gatorsmile commented on a change in pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing timestamps and dates from CSV

2019-12-23 Thread GitBox
gatorsmile commented on a change in pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing timestamps and dates from CSV URL: https://github.com/apache/spark/pull/23150#discussion_r360966895 ## File path: docs/sql-migration-guide-upgrade.md ## @@ -33,6

[GitHub] [spark] vanzin closed pull request #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S…

2019-12-23 Thread GitBox
vanzin closed pull request #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S… URL: https://github.com/apache/spark/pull/26989

[GitHub] [spark] vanzin commented on issue #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S…

2019-12-23 Thread GitBox
vanzin commented on issue #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S… URL: https://github.com/apache/spark/pull/26989#issuecomment-568546307 Sorry, but 1.6 has been EOL forever. We don't even have a way of testing it anymore.

[GitHub] [spark] Eric5553 commented on a change in pull request #26977: [SPARK-30326][SQL] Raise exception if analyzer exceed max iterations

2019-12-23 Thread GitBox
Eric5553 commented on a change in pull request #26977: [SPARK-30326][SQL] Raise exception if analyzer exceed max iterations URL: https://github.com/apache/spark/pull/26977#discussion_r360963254 ## File path:

[GitHub] [spark] Eric5553 commented on issue #26977: [SPARK-30326][SQL] Raise exception if analyzer exceed max iterations

2019-12-23 Thread GitBox
Eric5553 commented on issue #26977: [SPARK-30326][SQL] Raise exception if analyzer exceed max iterations URL: https://github.com/apache/spark/pull/26977#issuecomment-568542023 @cloud-fan Good catch! IMO, literally the description `The max number of iterations the analyzer runs.`

[GitHub] [spark] hensg opened a new pull request #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S…

2019-12-23 Thread GitBox
hensg opened a new pull request #26989: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S… URL: https://github.com/apache/spark/pull/26989 …treamManager ## What changes were proposed in this pull request? This is mostly a clean backport of #23521 to

[GitHub] [spark] hensg closed pull request #26988: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S…

2019-12-23 Thread GitBox
hensg closed pull request #26988: [SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S… URL: https://github.com/apache/spark/pull/26988

[GitHub] [spark] hensg opened a new pull request #26988: [WIP][SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S…

2019-12-23 Thread GitBox
hensg opened a new pull request #26988: [WIP][SPARK-26604][CORE][BACKPORT-1.6] Clean up channel registration for S… URL: https://github.com/apache/spark/pull/26988 …treamManager ## What changes were proposed in this pull request? This is mostly a clean backport of

[GitHub] [spark] rdblue commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#issuecomment-568536534 Retest this please.

[GitHub] [spark] rdblue commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#issuecomment-568536011 I noted a few things, but overall +1.

[GitHub] [spark] rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360955664 ## File path:

[GitHub] [spark] rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360953624 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

[GitHub] [spark] brkyvz opened a new pull request #26987: [SPARK-30334][SQL] Introduce as_json for marking a column as JSON data

2019-12-23 Thread GitBox
brkyvz opened a new pull request #26987: [SPARK-30334][SQL] Introduce as_json for marking a column as JSON data URL: https://github.com/apache/spark/pull/26987 ### What changes were proposed in this pull request? Semi-structured data is used widely in the data industry for reporting

[GitHub] [spark] rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360952930 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

[GitHub] [spark] rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360952506 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala

[GitHub] [spark] rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider

2019-12-23 Thread GitBox
rdblue commented on a change in pull request #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider URL: https://github.com/apache/spark/pull/26913#discussion_r360951885 ## File path:

[GitHub] [spark] viirya commented on a change in pull request #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2019-12-23 Thread GitBox
viirya commented on a change in pull request #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project URL: https://github.com/apache/spark/pull/26978#discussion_r360944222 ## File path:

[GitHub] [spark] huaxingao commented on issue #26972: [SPARK-30321][Ml] Log weightSum in Algo that has weights support

2019-12-23 Thread GitBox
huaxingao commented on issue #26972: [SPARK-30321][Ml] Log weightSum in Algo that has weights support URL: https://github.com/apache/spark/pull/26972#issuecomment-568520956 cc @zhengruifeng

[GitHub] [spark] srowen commented on issue #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-12-23 Thread GitBox
srowen commented on issue #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#issuecomment-568515137 Ack, wait, this didn't pass tests after the last commit. Let me monitor

[GitHub] [spark] srowen commented on issue #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-12-23 Thread GitBox
srowen commented on issue #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#issuecomment-568514832 Merged to master

[GitHub] [spark] srowen closed pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-12-23 Thread GitBox
srowen closed pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124

[GitHub] [spark] srowen commented on a change in pull request #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-23 Thread GitBox
srowen commented on a change in pull request #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors URL: https://github.com/apache/spark/pull/26982#discussion_r360933151 ## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ## @@

[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

2019-12-23 Thread GitBox
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by URL: https://github.com/apache/spark/pull/26946#discussion_r360924930 ## File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala

[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

2019-12-23 Thread GitBox
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by URL: https://github.com/apache/spark/pull/26946#discussion_r360924460 ## File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala

[GitHub] [spark] PavithraRamachandran commented on a change in pull request #26970: [SPARK-28825][SQL][DOC] Documentation for Explain Command

2019-12-23 Thread GitBox
PavithraRamachandran commented on a change in pull request #26970: [SPARK-28825][SQL][DOC] Documentation for Explain Command URL: https://github.com/apache/spark/pull/26970#discussion_r360912660 ## File path: docs/sql-ref-syntax-qry-explain.md ## @@ -19,4 +19,157 @@

[GitHub] [spark] srowen commented on issue #26979: [SPARK-28144][SPARK-29294][SS][FOLLOWUP] Use SystemTime defined in Kafka Time interface

2019-12-23 Thread GitBox
srowen commented on issue #26979: [SPARK-28144][SPARK-29294][SS][FOLLOWUP] Use SystemTime defined in Kafka Time interface URL: https://github.com/apache/spark/pull/26979#issuecomment-568494522 @shaneknapp any chance of kicking them? I'm seeing all the PR builders not respond

[GitHub] [spark] srowen commented on a change in pull request #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting

2019-12-23 Thread GitBox
srowen commented on a change in pull request #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting URL: https://github.com/apache/spark/pull/26735#discussion_r360904808 ## File path: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ##

[GitHub] [spark] srowen commented on a change in pull request #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting

2019-12-23 Thread GitBox
srowen commented on a change in pull request #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting URL: https://github.com/apache/spark/pull/26735#discussion_r360904486 ## File path: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ##

[GitHub] [spark] srowen commented on a change in pull request #26948: [SPARK-30120][ML] LSH approxNearestNeighbors should use BoundedPriorityQueue when numNearestNeighbors is small

2019-12-23 Thread GitBox
srowen commented on a change in pull request #26948: [SPARK-30120][ML] LSH approxNearestNeighbors should use BoundedPriorityQueue when numNearestNeighbors is small URL: https://github.com/apache/spark/pull/26948#discussion_r360901804 ## File path:

[GitHub] [spark] seayoun commented on issue #26975: [SPARK-26975][CORE] Stage retry and executor crash cause app hung up forever

2019-12-23 Thread GitBox
seayoun commented on issue #26975: [SPARK-26975][CORE] Stage retry and executor crash cause app hung up forever URL: https://github.com/apache/spark/pull/26975#issuecomment-568472226 @Ngone51 Please look at this patch, thanks!

[GitHub] [spark] seayoun commented on a change in pull request #26975: [SPARK-26975][CORE] Stage retry and executor crash cause app hung up forever

2019-12-23 Thread GitBox
seayoun commented on a change in pull request #26975: [SPARK-26975][CORE] Stage retry and executor crash cause app hung up forever URL: https://github.com/apache/spark/pull/26975#discussion_r360880185 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360878939 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ## @@ -1880,6 +1885,17

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360878656 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala ## @@

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360878634 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala ## @@

[GitHub] [spark] zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-23 Thread GitBox
zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors URL: https://github.com/apache/spark/pull/26982#issuecomment-568462174 retest this please

[GitHub] [spark] zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators

2019-12-23 Thread GitBox
zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators URL: https://github.com/apache/spark/pull/26550#discussion_r360870985 ## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala

[GitHub] [spark] zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators

2019-12-23 Thread GitBox
zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators URL: https://github.com/apache/spark/pull/26550#discussion_r360870324 ## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala

[GitHub] [spark] zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators

2019-12-23 Thread GitBox
zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators URL: https://github.com/apache/spark/pull/26550#discussion_r360868131 ## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala

[GitHub] [spark] wangshuo128 commented on issue #26924: [SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError

2019-12-23 Thread GitBox
wangshuo128 commented on issue #26924: [SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError URL: https://github.com/apache/spark/pull/26924#issuecomment-568458654 @vanzin @squito Thanks a lot for helping with this. I refined the

[GitHub] [spark] sandeep-katta commented on issue #26986: [SPARK-30333][CORE] Upgrade jackson-databind to 2.6.7.3

2019-12-23 Thread GitBox
sandeep-katta commented on issue #26986: [SPARK-30333][CORE] Upgrade jackson-databind to 2.6.7.3 URL: https://github.com/apache/spark/pull/26986#issuecomment-568458319 @srowen @dongjoon-hyun

[GitHub] [spark] sandeep-katta opened a new pull request #26986: [SPARK-30333][CORE] Upgrade jackson-databind to 2.6.7.3

2019-12-23 Thread GitBox
sandeep-katta opened a new pull request #26986: [SPARK-30333][CORE] Upgrade jackson-databind to 2.6.7.3 URL: https://github.com/apache/spark/pull/26986 ### What changes were proposed in this pull request? Upgrade jackson-databind to 2.6.7.3 to following CVE CVE-2018-14718 -

[GitHub] [spark] zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators

2019-12-23 Thread GitBox
zhengruifeng commented on a change in pull request #26550: [WIP][SPARK-29334][ML][MLLIB] Support for basic vector operators URL: https://github.com/apache/spark/pull/26550#discussion_r360865064 ## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala

[GitHub] [spark] Ngone51 commented on issue #26981: [SPARK-26389][SS][FOLLOW-UP]Format config name to follow the other boolean conf naming convention

2019-12-23 Thread GitBox
Ngone51 commented on issue #26981: [SPARK-26389][SS][FOLLOW-UP]Format config name to follow the other boolean conf naming convention URL: https://github.com/apache/spark/pull/26981#issuecomment-568456288 It seems that Jenkins has crashed for the whole day...

[GitHub] [spark] zhengruifeng commented on a change in pull request #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-23 Thread GitBox
zhengruifeng commented on a change in pull request #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors URL: https://github.com/apache/spark/pull/26982#discussion_r360864156 ## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ##

[GitHub] [spark] zhengruifeng commented on a change in pull request #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting

2019-12-23 Thread GitBox
zhengruifeng commented on a change in pull request #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting URL: https://github.com/apache/spark/pull/26735#discussion_r360862611 ## File path: mllib/src/main/scala/org/apache/spark/ml/util/MetadataUtils.scala ##

[GitHub] [spark] zhengruifeng edited a comment on issue #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting

2019-12-23 Thread GitBox
zhengruifeng edited a comment on issue #26735: [SPARK-30102][ML][PYSPARK] GMM supports instance weighting URL: https://github.com/apache/spark/pull/26735#issuecomment-568451629 @huaxingao I found that the difference comes from the method to compute var `sumWeights`. In master, it keeps

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360862002 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ## @@ -309,20

[GitHub] [spark] zhengruifeng commented on issue #26735: [SPARK-30102][WIP][ML][PYSPARK] GMM supports instance weighting

2019-12-23 Thread GitBox
zhengruifeng commented on issue #26735: [SPARK-30102][WIP][ML][PYSPARK] GMM supports instance weighting URL: https://github.com/apache/spark/pull/26735#issuecomment-568451629 @huaxingao I found that the difference come from the method to compute var `sumWeights`. In master, it keep the

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360858546 ## File path:

[GitHub] [spark] cloud-fan commented on issue #26981: [SPARK-26389][SS][FOLLOW-UP]Format config name to follow the other boolean conf naming convention

2019-12-23 Thread GitBox
cloud-fan commented on issue #26981: [SPARK-26389][SS][FOLLOW-UP]Format config name to follow the other boolean conf naming convention URL: https://github.com/apache/spark/pull/26981#issuecomment-568450590 retest this please

[GitHub] [spark] cloud-fan commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360857961 ## File path:

[GitHub] [spark] cloud-fan commented on a change in pull request #26983: [SPARK-30331][SQL] Set isFinalPlan to true before posting the final AdaptiveSparkPlan event

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26983: [SPARK-30331][SQL] Set isFinalPlan to true before posting the final AdaptiveSparkPlan event URL: https://github.com/apache/spark/pull/26983#discussion_r360857625 ## File path:

[GitHub] [spark] cloud-fan commented on a change in pull request #26983: [SPARK-30331][SQL] Set isFinalPlan to true before posting the final AdaptiveSparkPlan event

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26983: [SPARK-30331][SQL] Set isFinalPlan to true before posting the final AdaptiveSparkPlan event URL: https://github.com/apache/spark/pull/26983#discussion_r360857528 ## File path:

[GitHub] [spark] zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-23 Thread GitBox
zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors URL: https://github.com/apache/spark/pull/26982#issuecomment-568449762 friendly ping @srowen

[GitHub] [spark] MaxGekk commented on issue #26973: [SPARK-30323][SQL] Support filters pushdown in CSV datasource

2019-12-23 Thread GitBox
MaxGekk commented on issue #26973: [SPARK-30323][SQL] Support filters pushdown in CSV datasource URL: https://github.com/apache/spark/pull/26973#issuecomment-568447761 jenkins, retest this, please

[GitHub] [spark] Ngone51 commented on a change in pull request #26938: [SPARK-30297][CORE] Fix executor lost in net cause app hung upp

2019-12-23 Thread GitBox
Ngone51 commented on a change in pull request #26938: [SPARK-30297][CORE] Fix executor lost in net cause app hung upp URL: https://github.com/apache/spark/pull/26938#discussion_r360854474 ## File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ## @@

[GitHub] [spark] MaxGekk commented on issue #23541: [SPARK-26618][SQL] Make typed Timestamp/Date literals consistent to casting

2019-12-23 Thread GitBox
MaxGekk commented on issue #23541: [SPARK-26618][SQL] Make typed Timestamp/Date literals consistent to casting URL: https://github.com/apache/spark/pull/23541#issuecomment-568447202 Here is the PR https://github.com/apache/spark/pull/26985 which updates the SQL migration guide.

[GitHub] [spark] MaxGekk opened a new pull request #26985: [SPARK-26618][SQL][FOLLOWUP] Update the SQL migration guide regarding typed `TIMESTAMP` and `DATE` literals

2019-12-23 Thread GitBox
MaxGekk opened a new pull request #26985: [SPARK-26618][SQL][FOLLOWUP] Update the SQL migration guide regarding typed `TIMESTAMP` and `DATE` literals URL: https://github.com/apache/spark/pull/26985 ### What changes were proposed in this pull request? In the PR, I propose to update

[GitHub] [spark] WinkerDu commented on issue #26971: [SPARK-30320][SQL] Fix insert overwrite to DataSource table with dynamic partition error

2019-12-23 Thread GitBox
WinkerDu commented on issue #26971: [SPARK-30320][SQL] Fix insert overwrite to DataSource table with dynamic partition error URL: https://github.com/apache/spark/pull/26971#issuecomment-568442052 cc @cloud-fan

[GitHub] [spark] zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-23 Thread GitBox
zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors URL: https://github.com/apache/spark/pull/26982#issuecomment-568436623 retest this please

[GitHub] [spark] seayoun commented on issue #26938: [SPARK-30297][CORE] Fix executor lost in net cause app hung upp

2019-12-23 Thread GitBox
seayoun commented on issue #26938: [SPARK-30297][CORE] Fix executor lost in net cause app hung upp URL: https://github.com/apache/spark/pull/26938#issuecomment-568431278 A better way to fix here https://github.com/apache/spark/pull/26980

[GitHub] [spark] chummyhe89 opened a new pull request #26984: [SPARK-27641][CORE] Fix MetricsSystem to remove metrics of the specif…

2019-12-23 Thread GitBox
chummyhe89 opened a new pull request #26984: [SPARK-27641][CORE] Fix MetricsSystem to remove metrics of the specif… URL: https://github.com/apache/spark/pull/26984 ## What changes were proposed in this pull request? * check whether the source is registered before remove it. *

[GitHub] [spark] manuzhang commented on a change in pull request #26434: [SPARK-29544] [SQL] optimize skewed partition based on data size

2019-12-23 Thread GitBox
manuzhang commented on a change in pull request #26434: [SPARK-29544] [SQL] optimize skewed partition based on data size URL: https://github.com/apache/spark/pull/26434#discussion_r360836983 ## File path:

[GitHub] [spark] chummyhe89 closed pull request #24556: [SPARK-27641][CORE] Fix MetricsSystem to remove unregistered source correctly

2019-12-23 Thread GitBox
chummyhe89 closed pull request #24556: [SPARK-27641][CORE] Fix MetricsSystem to remove unregistered source correctly URL: https://github.com/apache/spark/pull/24556 This is an automated message from the Apache Git Service.

[GitHub] [spark] chummyhe89 commented on issue #24556: [SPARK-27641][CORE] Fix MetricsSystem to remove unregistered source correctly

2019-12-23 Thread GitBox
chummyhe89 commented on issue #24556: [SPARK-27641][CORE] Fix MetricsSystem to remove unregistered source correctly URL: https://github.com/apache/spark/pull/24556#issuecomment-568430293 too old to merge,so close it This is

[GitHub] [spark] seayoun commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
seayoun commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360834151 ## File path:

[GitHub] [spark] manuzhang commented on a change in pull request #26968: [SPARK-29906][SQL][FOLLOWUP] Update the final plan in UI for AQE

2019-12-23 Thread GitBox
manuzhang commented on a change in pull request #26968: [SPARK-29906][SQL][FOLLOWUP] Update the final plan in UI for AQE URL: https://github.com/apache/spark/pull/26968#discussion_r360823766 ## File path:

[GitHub] [spark] manuzhang opened a new pull request #26983: [SPARK-30331][SQL] Set isFinalPlan to true before posting the final AdaptiveSparkPlan event

2019-12-23 Thread GitBox
manuzhang opened a new pull request #26983: [SPARK-30331][SQL] Set isFinalPlan to true before posting the final AdaptiveSparkPlan event URL: https://github.com/apache/spark/pull/26983 ### What changes were proposed in this pull request? Set `isFinalPlan=true` before posting the

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360820286 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ## @@ -309,20

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360820167 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/LocalTableScanExec.scala ## @@

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360819691 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/LocalTableScanExec.scala ## @@

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360819689 ## File path:

[GitHub] [spark] Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
Ngone51 commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360819557 ## File path:

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360819117 ## File path:

[GitHub] [spark] zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors

2019-12-23 Thread GitBox
zhengruifeng commented on issue #26982: [SPARK-30329][ML] add iterator/foreach methods for Vectors URL: https://github.com/apache/spark/pull/26982#issuecomment-568413966 Performance test for `SparseVector.iterator`: ```scala import org.apache.spark.ml.linalg._ import

[GitHub] [spark] cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26809: [SPARK-30185][SQL] Implement Dataset.tail API URL: https://github.com/apache/spark/pull/26809#discussion_r360818860 ## File path:

[GitHub] [spark] cloud-fan commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360817459 ## File path:

[GitHub] [spark] cloud-fan commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
cloud-fan commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360817535 ## File path:

[GitHub] [spark] seayoun commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend

2019-12-23 Thread GitBox
seayoun commented on a change in pull request #26980: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend URL: https://github.com/apache/spark/pull/26980#discussion_r360815277 ## File path:

[GitHub] [spark] MaxGekk commented on issue #26973: [SPARK-30323][SQL] Support filters pushdown in CSV datasource

2019-12-23 Thread GitBox
MaxGekk commented on issue #26973: [SPARK-30323][SQL] Support filters pushdown in CSV datasource URL: https://github.com/apache/spark/pull/26973#issuecomment-568409809 @HyukjinKwon I have updated the PR description. Please, tell me if something is unclear.

[GitHub] [spark] manuzhang commented on a change in pull request #26968: [SPARK-29906][SQL][FOLLOWUP] Update the final plan in UI for AQE

2019-12-23 Thread GitBox
manuzhang commented on a change in pull request #26968: [SPARK-29906][SQL][FOLLOWUP] Update the final plan in UI for AQE URL: https://github.com/apache/spark/pull/26968#discussion_r360810996 ## File path:
