[jira] [Resolved] (SPARK-26607) Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite

2019-01-11 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-26607.
---
   Resolution: Fixed
 Assignee: Dongjoon Hyun
Fix Version/s: 3.0.0
   2.4.1
   2.3.3

This is resolved via https://github.com/apache/spark/pull/23526

> Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite
> 
>
> Key: SPARK-26607
> URL: https://issues.apache.org/jira/browse/SPARK-26607
> Project: Spark
>  Issue Type: Task
>  Components: SQL, Tests
>Affects Versions: 2.3.3, 2.4.1, 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
> Fix For: 2.3.3, 2.4.1, 3.0.0
>
>
> The vote on the final release of `branch-2.2` passed and the branch is now 
> EOL. This issue removes Spark 2.2.x from the testing coverage.






[jira] [Updated] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-11 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26608:
--
Description: 
This issue aims to remove the following Jenkins jobs for `branch-2.2` because 
of EOL.
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/]

As of today, the branch is healthy.

!Screen Shot 2019-01-11 at 8.47.27 PM.png!

  was:
This issue aims to remove the following Jenkins jobs for `branch-2.2` because 
of EOL.
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/]


> Remove Jenkins jobs for `branch-2.2`
> 
>
> Key: SPARK-26608
> URL: https://issues.apache.org/jira/browse/SPARK-26608
> Project: Spark
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 2.2.3
>Reporter: Dongjoon Hyun
>Priority: Major
> Attachments: Screen Shot 2019-01-11 at 8.47.27 PM.png
>
>
> This issue aims to remove the following Jenkins jobs for `branch-2.2` because 
> of EOL.
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/]
> As of today, the branch is healthy.
> !Screen Shot 2019-01-11 at 8.47.27 PM.png!






[jira] [Updated] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-11 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26608:
--
Attachment: Screen Shot 2019-01-11 at 8.47.27 PM.png

> Remove Jenkins jobs for `branch-2.2`
> 
>
> Key: SPARK-26608
> URL: https://issues.apache.org/jira/browse/SPARK-26608
> Project: Spark
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 2.2.3
>Reporter: Dongjoon Hyun
>Priority: Major
> Attachments: Screen Shot 2019-01-11 at 8.47.27 PM.png
>
>
> This issue aims to remove the following Jenkins jobs for `branch-2.2` because 
> of EOL.
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/]






[jira] [Commented] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-11 Thread Dongjoon Hyun (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740994#comment-16740994
 ] 

Dongjoon Hyun commented on SPARK-26608:
---

Hi, [~shaneknapp]. Could you take a look when you have some time next week?

> Remove Jenkins jobs for `branch-2.2`
> 
>
> Key: SPARK-26608
> URL: https://issues.apache.org/jira/browse/SPARK-26608
> Project: Spark
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 2.2.3
>Reporter: Dongjoon Hyun
>Priority: Major
>
> This issue aims to remove the following Jenkins jobs for `branch-2.2` because 
> of EOL.
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/]






[jira] [Created] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-11 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-26608:
-

 Summary: Remove Jenkins jobs for `branch-2.2`
 Key: SPARK-26608
 URL: https://issues.apache.org/jira/browse/SPARK-26608
 Project: Spark
  Issue Type: Task
  Components: Tests
Affects Versions: 2.2.3
Reporter: Dongjoon Hyun


This issue aims to remove the following Jenkins jobs for `branch-2.2`.
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/






[jira] [Updated] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-11 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26608:
--
Description: 
This issue aims to remove the following Jenkins jobs for `branch-2.2` because 
of EOL.
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/]
 - 
[https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/]

  was:
This issue aims to remove the following Jenkins jobs for `branch-2.2`.
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/
- 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/


> Remove Jenkins jobs for `branch-2.2`
> 
>
> Key: SPARK-26608
> URL: https://issues.apache.org/jira/browse/SPARK-26608
> Project: Spark
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 2.2.3
>Reporter: Dongjoon Hyun
>Priority: Major
>
> This issue aims to remove the following Jenkins jobs for `branch-2.2` because 
> of EOL.
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-maven-hadoop-2.7/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.6/]
>  - 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.2-test-sbt-hadoop-2.7/]






[jira] [Updated] (SPARK-26607) Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite

2019-01-11 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26607:
--
Issue Type: Task  (was: Bug)

> Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite
> 
>
> Key: SPARK-26607
> URL: https://issues.apache.org/jira/browse/SPARK-26607
> Project: Spark
>  Issue Type: Task
>  Components: SQL, Tests
>Affects Versions: 2.3.3, 2.4.1, 3.0.0
>Reporter: Dongjoon Hyun
>Priority: Minor
>
> The vote on the final release of `branch-2.2` passed and the branch is now 
> EOL. This issue removes Spark 2.2.x from the testing coverage.






[jira] [Updated] (SPARK-26564) Fix wrong assertions and error messages for parameter checking

2019-01-11 Thread Kengo Seki (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kengo Seki updated SPARK-26564:
---
Description: 
I mistakenly set spark.executor.heartbeatInterval to the same value as 
spark.network.timeout and got the following error:

{code}
java.lang.IllegalArgumentException: requirement failed: The value of 
spark.network.timeout=120s must be no less than the value of 
spark.executor.heartbeatInterval=120s.
{code}

But the message reads as though the two values may be equal. "Greater than" is 
more precise than "no less than".



In addition, the following assertions are inconsistent with their messages.

{code:title=mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala}
 91   require(maxIter >= 0, s"maxIter must be a positive integer: $maxIter")
{code}

{code:title=sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala}
416   require(capacity < 512000000, "Cannot broadcast more than 512 millions rows")
{code}

  was:
I mistakenly set spark.executor.heartbeatInterval to the same value as 
spark.network.timeout and got the following error:

{code}
java.lang.IllegalArgumentException: requirement failed: The value of 
spark.network.timeout=120s must be no less than the value of 
spark.executor.heartbeatInterval=120s.
{code}

But the message reads as though the two values may be equal. "Greater than" is 
more precise than "no less than".



In addition, the following assertions are inconsistent with their messages and 
the messages are right.

{code:title=mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala}
 91   require(maxIter >= 0, s"maxIter must be a positive integer: $maxIter")
{code}

{code:title=sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala}
416   require(capacity < 512000000, "Cannot broadcast more than 512 millions rows")
{code}
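
For reference, a sketch of what corrected checks could look like, with 
illustrative values (not the actual config plumbing):

{code:title=Corrected assertions (illustrative sketch)}
val networkTimeoutS = 120L     // illustrative values, not read from a real conf
val heartbeatIntervalS = 10L
// Strict inequality, matching a message that says "greater than":
require(networkTimeoutS > heartbeatIntervalS,
  s"The value of spark.network.timeout=${networkTimeoutS}s must be greater than " +
    s"the value of spark.executor.heartbeatInterval=${heartbeatIntervalS}s.")

// The message says "positive", so the check should exclude zero:
val maxIter = 100
require(maxIter > 0, s"maxIter must be a positive integer: $maxIter")
{code}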


> Fix wrong assertions and error messages for parameter checking
> --
>
> Key: SPARK-26564
> URL: https://issues.apache.org/jira/browse/SPARK-26564
> Project: Spark
>  Issue Type: Bug
>  Components: MLlib, Spark Core, SQL
>Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.4.0
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Minor
>  Labels: starter
>
> I mistakenly set spark.executor.heartbeatInterval to the same value as 
> spark.network.timeout and got the following error:
> {code}
> java.lang.IllegalArgumentException: requirement failed: The value of 
> spark.network.timeout=120s must be no less than the value of 
> spark.executor.heartbeatInterval=120s.
> {code}
> But the message reads as though the two values may be equal. "Greater than" 
> is more precise than "no less than".
> 
> In addition, the following assertions are inconsistent with their messages.
> {code:title=mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala}
>  91   require(maxIter >= 0, s"maxIter must be a positive integer: $maxIter")
> {code}
> {code:title=sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala}
> 416   require(capacity < 512000000, "Cannot broadcast more than 512 millions rows")
> {code}






[jira] [Assigned] (SPARK-26607) Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26607:


Assignee: (was: Apache Spark)

> Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite
> 
>
> Key: SPARK-26607
> URL: https://issues.apache.org/jira/browse/SPARK-26607
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 2.3.3, 2.4.1, 3.0.0
>Reporter: Dongjoon Hyun
>Priority: Minor
>
> The vote on the final release of `branch-2.2` passed and the branch is now 
> EOL. This issue removes Spark 2.2.x from the testing coverage.






[jira] [Updated] (SPARK-26607) Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite

2019-01-11 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-26607:
--
Component/s: Tests

> Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite
> 
>
> Key: SPARK-26607
> URL: https://issues.apache.org/jira/browse/SPARK-26607
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 2.3.3, 2.4.1, 3.0.0
>Reporter: Dongjoon Hyun
>Priority: Minor
>
> The vote on the final release of `branch-2.2` passed and the branch is now 
> EOL. This issue removes Spark 2.2.x from the testing coverage.






[jira] [Assigned] (SPARK-26607) Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26607:


Assignee: Apache Spark

> Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite
> 
>
> Key: SPARK-26607
> URL: https://issues.apache.org/jira/browse/SPARK-26607
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 2.3.3, 2.4.1, 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Apache Spark
>Priority: Minor
>
> The vote on the final release of `branch-2.2` passed and the branch is now 
> EOL. This issue removes Spark 2.2.x from the testing coverage.






[jira] [Created] (SPARK-26607) Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite

2019-01-11 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-26607:
-

 Summary: Remove Spark 2.2.x testing from 
HiveExternalCatalogVersionsSuite
 Key: SPARK-26607
 URL: https://issues.apache.org/jira/browse/SPARK-26607
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.3, 2.4.1, 3.0.0
Reporter: Dongjoon Hyun


The vote on the final release of `branch-2.2` passed and the branch is now 
EOL. This issue removes Spark 2.2.x from the testing coverage.
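
For context, a sketch of the kind of change this implies (the actual version 
list kept in HiveExternalCatalogVersionsSuite may differ; versions here are 
examples):

{code:title=Illustrative sketch}
// Before: Spark 2.2.x still among the versions tested for catalog upgrades
// val versions = Seq("2.2.2", "2.3.2", "2.4.0")
// After: branch-2.2 is EOL, so it is dropped from the coverage
val versions = Seq("2.3.2", "2.4.0")
{code}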






[jira] [Resolved] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-11 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-25692.
---
   Resolution: Fixed
 Assignee: Dongjoon Hyun
Fix Version/s: 3.0.0

This is resolved via https://github.com/apache/spark/pull/23522.

> Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks
> --
>
> Key: SPARK-25692
> URL: https://issues.apache.org/jira/browse/SPARK-25692
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Shixiong Zhu
>Assignee: Dongjoon Hyun
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: Screen Shot 2018-10-22 at 4.12.41 PM.png, Screen Shot 
> 2018-11-01 at 10.17.16 AM.png
>
>
> Looks like the whole test suite is pretty flaky. See: 
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.6/5490/testReport/junit/org.apache.spark.network/ChunkFetchIntegrationSuite/history/
> This may be a regression in 3.0, as this didn't happen in the 2.4 branch.






[jira] [Assigned] (SPARK-26595) Allow delegation token renewal without a keytab

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26595:


Assignee: (was: Apache Spark)

> Allow delegation token renewal without a keytab
> ---
>
> Key: SPARK-26595
> URL: https://issues.apache.org/jira/browse/SPARK-26595
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Marcelo Vanzin
>Priority: Major
>
> Currently the delegation token renewal feature requires the user to provide 
> Spark with a keytab.
> It would be nice for this to also be supported when the user doesn't have a 
> keytab, as long as the user keeps a valid Kerberos login. Spark has access to 
> the user's credential cache in that case, and can keep tokens updated much 
> like in the keytab case.
> It's not as automatic as with keytabs, but can help in some environments.






[jira] [Assigned] (SPARK-26595) Allow delegation token renewal without a keytab

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26595:


Assignee: Apache Spark

> Allow delegation token renewal without a keytab
> ---
>
> Key: SPARK-26595
> URL: https://issues.apache.org/jira/browse/SPARK-26595
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Marcelo Vanzin
>Assignee: Apache Spark
>Priority: Major
>
> Currently the delegation token renewal feature requires the user to provide 
> Spark with a keytab.
> It would be nice for this to also be supported when the user doesn't have a 
> keytab, as long as the user keeps a valid Kerberos login. Spark has access to 
> the user's credential cache in that case, and can keep tokens updated much 
> like in the keytab case.
> It's not as automatic as with keytabs, but can help in some environments.






[jira] [Updated] (SPARK-26586) Streaming queries should have isolated SparkSessions and confs

2019-01-11 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-26586:
-
Affects Version/s: 2.2.1
   2.2.2
   2.3.1
   2.3.2

> Streaming queries should have isolated SparkSessions and confs
> --
>
> Key: SPARK-26586
> URL: https://issues.apache.org/jira/browse/SPARK-26586
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Structured Streaming
>Affects Versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2, 2.4.0
>Reporter: Mukul Murthy
>Assignee: Mukul Murthy
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> When a stream is started, the stream's config is supposed to be frozen and 
> all batches run with the config at start time. However, due to a race 
> condition in creating streams, updating a conf value in the active spark 
> session immediately after starting a stream can lead to the stream getting 
> that updated value.
>  
> The problem is that when StreamingQueryManager creates a MicrobatchExecution 
> (or ContinuousExecution), it passes in the shared spark session, and the 
> spark session isn't cloned until StreamExecution.start() is called. 
> DataStreamWriter.start() should not return until the SparkSession is cloned.
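
To make the race concrete, a minimal sketch of the scenario described above 
(assuming `spark` is the active SparkSession; illustrative only, not a test 
from the patch):

{code:title=Illustrative sketch}
val df = spark.readStream.format("rate").load()
val query = df.writeStream.format("console").start() // session not cloned yet
// A conf change immediately after start() can still leak into the stream:
spark.conf.set("spark.sql.shuffle.partitions", "1")
{code}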






[jira] [Updated] (SPARK-26586) Streaming queries should have isolated SparkSessions and confs

2019-01-11 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-26586:
-
Affects Version/s: 2.2.0

> Streaming queries should have isolated SparkSessions and confs
> --
>
> Key: SPARK-26586
> URL: https://issues.apache.org/jira/browse/SPARK-26586
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Structured Streaming
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Mukul Murthy
>Assignee: Mukul Murthy
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> When a stream is started, the stream's config is supposed to be frozen and 
> all batches run with the config at start time. However, due to a race 
> condition in creating streams, updating a conf value in the active spark 
> session immediately after starting a stream can lead to the stream getting 
> that updated value.
>  
> The problem is that when StreamingQueryManager creates a MicrobatchExecution 
> (or ContinuousExecution), it passes in the shared spark session, and the 
> spark session isn't cloned until StreamExecution.start() is called. 
> DataStreamWriter.start() should not return until the SparkSession is cloned.






[jira] [Updated] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Attachment: (was: Screen Shot 2019-01-11 at 1.12.33 PM.png)

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`
>  
>  
> Is there any other way I can access the Java params without using 
> extraJavaOptions?






[jira] [Updated] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Attachment: (was: Screen Shot 2019-01-09 at 4.31.01 PM.png)

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`
>  
>  
> Is there any other way I can access the Java params without using 
> extraJavaOptions?






[jira] [Comment Edited] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740683#comment-16740683
 ] 

Ravindra edited comment on SPARK-26606 at 1/11/19 8:09 PM:
---

I can see the launch command, which has the expected values, but the individual 
parts like "-Dapp.env=prod", "-Dapp.country=US", and "-Dapp.banner=WMT" are 
missing. I suspect this is why my source code throws an exception saying the 
JVM params are null.

 


was (Author: rrb441):
So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the individual parts like  "-Dapp.env=prod"  "-Dapp.country=US" 
"-Dapp.banner=WMT" are missing. I am suspecting this could be the reason why my 
source code is throwing an exception that the jvm params are null. 

 

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`
>  
>  
> Is there any other way I can access the Java params without using 
> extraJavaOptions?






[jira] [Comment Edited] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740683#comment-16740683
 ] 

Ravindra edited comment on SPARK-26606 at 1/11/19 8:08 PM:
---

So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the individual parts like  "-Dapp.env=prod"  "-Dapp.country=US" 
"-Dapp.banner=WMT" are missing. I am suspecting this could be the reason why my 
source code is throwing an exception that the jvm params are null. 

 


was (Author: rrb441):
So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the individual parts like  "-Dapp.env=prod"  "-Dapp.country=US" 
"-Dapp.banner=WMT" are missing. I am suspecting this could be the reason why my 
source code is throwing an exception that the jvm params are null.   !Screen 
Shot 2019-01-11 at 1.12.33 PM.png!

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`
>  
>  
> Is there any other way I can access the Java params without using 
> extraJavaOptions?






[jira] [Updated] (SPARK-26586) Streaming queries should have isolated SparkSessions and confs

2019-01-11 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-26586:
-
Target Version/s: 3.0.0  (was: 2.5.0, 3.0.0)

> Streaming queries should have isolated SparkSessions and confs
> --
>
> Key: SPARK-26586
> URL: https://issues.apache.org/jira/browse/SPARK-26586
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Structured Streaming
>Affects Versions: 2.3.0, 2.4.0
>Reporter: Mukul Murthy
>Assignee: Mukul Murthy
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> When a stream is started, the stream's config is supposed to be frozen and 
> all batches run with the config at start time. However, due to a race 
> condition in creating streams, updating a conf value in the active spark 
> session immediately after starting a stream can lead to the stream getting 
> that updated value.
>  
> The problem is that when StreamingQueryManager creates a MicrobatchExecution 
> (or ContinuousExecution), it passes in the shared spark session, and the 
> spark session isn't cloned until StreamExecution.start() is called. 
> DataStreamWriter.start() should not return until the SparkSession is cloned.






[jira] [Updated] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Description: 
driver.extraJavaOptions and executor.extraJavaOptions are not being picked up. 
Even though I can see the parameters being passed fine in the spark launch 
command, they are not picked up for some unknown reason. My source code throws 
an error stating the Java params are empty.

 

This is my spark submit command: 

    output=`spark-submit \
 --class com.demo.myApp.App \
 --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`

 

Is there any other way I can access the Java params without using 
extraJavaOptions?

  was:
driver.extraJavaOptions and executor.extraJavaOptions are not being picked up. 
Even though I can see the parameters being passed fine in the spark launch 
command, they are not picked up for some unknown reason. My source code throws 
an error stating the Java params are empty.

 

This is my spark submit command: 

    output=`spark-submit \
 --class com.demo.myApp.App \
 --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`


> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`
>  
>  
> Is there any other way I can access the Java params without using 
> extraJavaOptions?






[jira] [Resolved] (SPARK-26586) Streaming queries should have isolated SparkSessions and confs

2019-01-11 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu resolved SPARK-26586.
--
   Resolution: Fixed
 Assignee: Mukul Murthy
Fix Version/s: 3.0.0
   2.4.1

> Streaming queries should have isolated SparkSessions and confs
> --
>
> Key: SPARK-26586
> URL: https://issues.apache.org/jira/browse/SPARK-26586
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Structured Streaming
>Affects Versions: 2.3.0, 2.4.0
>Reporter: Mukul Murthy
>Assignee: Mukul Murthy
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> When a stream is started, the stream's config is supposed to be frozen and 
> all batches run with the config at start time. However, due to a race 
> condition in creating streams, updating a conf value in the active spark 
> session immediately after starting a stream can lead to the stream getting 
> that updated value.
>  
> The problem is that when StreamingQueryManager creates a MicrobatchExecution 
> (or ContinuousExecution), it passes in the shared spark session, and the 
> spark session isn't cloned until StreamExecution.start() is called. 
> DataStreamWriter.start() should not return until the SparkSession is cloned.






[jira] [Commented] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Marcelo Vanzin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740719#comment-16740719
 ] 

Marcelo Vanzin commented on SPARK-26606:


So you're using standalone cluster mode. It might be an issue with that 
combination; that's not really my area.

bq. Is there any other way I can access the Java params without using 
extraJavaOptions?

There are a thousand different ways: use parameters passed to your class 
instead of system properties, config files, etc.
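
To illustrate one such alternative, a minimal sketch (the class and argument 
names are hypothetical, not from the original report): arguments placed after 
the application JAR on the spark-submit command line are handed to the main 
class, so the values can travel as plain program arguments instead of JVM 
system properties.

{code:title=Illustrative sketch}
object App {
  def main(args: Array[String]): Unit = {
    // e.g. spark-submit --class com.demo.myApp.App myapp.jar dev US ABC
    val Array(env, country, banner) = args
    println(s"env=$env, country=$country, banner=$banner")
  }
}
{code}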

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`
>  
>  
> Is there any other way I can access the Java params without using 
> extraJavaOptions?






[jira] [Resolved] (SPARK-26585) [K8S] Add additional integration tests for K8s Scheduler Backend

2019-01-11 Thread Nagaram Prasad Addepally (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nagaram Prasad Addepally resolved SPARK-26585.
--
Resolution: Won't Fix

> [K8S] Add additional integration tests for K8s Scheduler Backend 
> -
>
> Key: SPARK-26585
> URL: https://issues.apache.org/jira/browse/SPARK-26585
> Project: Spark
>  Issue Type: Test
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Nagaram Prasad Addepally
>Priority: Major
>
> I have reviewed the Kubernetes integration tests and found that the following 
> cases are missing for testing scheduler backend functionality: 
>  * Run application with driver and executor image specified independently
>  * Request Pods with custom CPU and Limits
>  * Request Pods with custom Memory and memory overhead factor
>  * Request Pods with custom Memory and memory overhead
>  * Pods are relaunched on failures (as per 
> spark.kubernetes.executor.lostCheck.maxAttempts)
> Logging this Jira to add these tests.
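
For illustration, the kinds of settings these cases would exercise (conf names 
assumed from the Spark-on-Kubernetes docs, not taken from the test code):

{code:title=Illustrative sketch}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.kubernetes.driver.container.image", "driver:tag")    // per-role images
  .set("spark.kubernetes.executor.container.image", "executor:tag")
  .set("spark.kubernetes.executor.request.cores", "0.5")           // custom CPU request
  .set("spark.kubernetes.executor.limit.cores", "1")               // and limit
  .set("spark.executor.memoryOverhead", "1g")                      // explicit overhead
  .set("spark.kubernetes.memoryOverheadFactor", "0.2")             // or a factor
{code}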






[jira] [Commented] (SPARK-26585) [K8S] Add additional integration tests for K8s Scheduler Backend

2019-01-11 Thread Nagaram Prasad Addepally (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740718#comment-16740718
 ] 

Nagaram Prasad Addepally commented on SPARK-26585:
--

Closing this Jira as per the comments in the PR. We do not want to add 
integration tests; instead we need to add unit tests to cover these cases.

> [K8S] Add additional integration tests for K8s Scheduler Backend 
> -
>
> Key: SPARK-26585
> URL: https://issues.apache.org/jira/browse/SPARK-26585
> Project: Spark
>  Issue Type: Test
>  Components: Kubernetes
>Affects Versions: 3.0.0
>Reporter: Nagaram Prasad Addepally
>Priority: Major
>
> I have reviewed the Kubernetes integration tests and found that the following 
> cases are missing for testing scheduler backend functionality: 
>  * Run application with driver and executor image specified independently
>  * Request Pods with custom CPU and Limits
>  * Request Pods with custom Memory and memory overhead factor
>  * Request Pods with custom Memory and memory overhead
>  * Pods are relaunched on failures (as per 
> spark.kubernetes.executor.lostCheck.maxAttempts)
> Logging this Jira to add these tests.






[jira] [Resolved] (SPARK-26551) Selecting one complex field and having is null predicate on another complex field can cause error

2019-01-11 Thread DB Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DB Tsai resolved SPARK-26551.
-
   Resolution: Fixed
Fix Version/s: 2.4.1
   3.0.0

Issue resolved by pull request 23474
[https://github.com/apache/spark/pull/23474]

> Selecting one complex field and having is null predicate on another complex 
> field can cause error
> -
>
> Key: SPARK-26551
> URL: https://issues.apache.org/jira/browse/SPARK-26551
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0, 3.0.0
>Reporter: Liang-Chi Hsieh
>Assignee: Liang-Chi Hsieh
>Priority: Major
> Fix For: 3.0.0, 2.4.1
>
>
> The query below can cause error when doing schema pruning:
> {code:java}
> val query = sql("select * from contacts")
>   .where("name.middle is not null")
>   .select(
> "id",
> "name.first",
> "name.middle",
> "name.last"
>   )
>   .where("last = 'Jones'")
>   .select(count("id"))
> {code}






[jira] [Assigned] (SPARK-26605) New executors failing with expired tokens in client mode

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26605:


Assignee: (was: Apache Spark)

> New executors failing with expired tokens in client mode
> 
>
> Key: SPARK-26605
> URL: https://issues.apache.org/jira/browse/SPARK-26605
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Affects Versions: 2.4.0
>Reporter: Marcelo Vanzin
>Priority: Major
>
> We ran into an issue with new executors being started with expired tokens in 
> client mode; cluster mode is fine. Master branch is also not affected.
> This means that executors that start after 7 days would fail in this scenario.
> Patch coming up.






[jira] [Assigned] (SPARK-26605) New executors failing with expired tokens in client mode

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26605:


Assignee: Apache Spark

> New executors failing with expired tokens in client mode
> 
>
> Key: SPARK-26605
> URL: https://issues.apache.org/jira/browse/SPARK-26605
> Project: Spark
>  Issue Type: New Feature
>  Components: YARN
>Affects Versions: 2.4.0
>Reporter: Marcelo Vanzin
>Assignee: Apache Spark
>Priority: Major
>
> We ran into an issue with new executors being started with expired tokens in 
> client mode; cluster mode is fine. Master branch is also not affected.
> This means that executors that start after 7 days would fail in this scenario.
> Patch coming up.






[jira] [Commented] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740694#comment-16740694
 ] 

Ravindra commented on SPARK-26606:
--

10.36.67.188:7077 is what I am using. The rest of the master links over there 
are just standbys.

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`






[jira] [Commented] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Marcelo Vanzin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740692#comment-16740692
 ] 

Marcelo Vanzin commented on SPARK-26606:


What master are you using?

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked 
> up. Even though I can see the parameters being passed fine in the spark 
> launch command, they are not picked up for some unknown reason. My source 
> code throws an error stating the Java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`






[jira] [Comment Edited] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740683#comment-16740683
 ] 

Ravindra edited comment on SPARK-26606 at 1/11/19 7:21 PM:
---

So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the individual parts like  "-Dapp.env=prod"  "-Dapp.country=US" 
"-Dapp.banner=WMT" are missing. I am suspecting this could be the reason why my 
source code is throwing an exception that the jvm params are null.   !Screen 
Shot 2019-01-11 at 1.12.33 PM.png!


was (Author: rrb441):
So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the individual parts like  "-Dapp.env=prod"  "-Dapp.country=US" 
"-Dapp.banner=WMT" are missing. I am suspecting this could be the reason why my 
source code throwing an exception that the jvm params are null.   !Screen Shot 
2019-01-11 at 1.12.33 PM.png!

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked up. 
> Even though I can see the parameters being passed in the spark launch command, 
> they do not seem to be picked up, for some unknown reason. My source code throws 
> an error stating that the java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Summary: parameters passed in extraJavaOptions are not being picked up   
(was: parameters passed in extraJavaOptions are not being picked up)

> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Description: 
driver.extraJavaOptions and executor.extraJavaOptions are not being picked up. 
Even though I can see the parameters being passed in the spark launch command, 
they do not seem to be picked up, for some unknown reason. My source code throws 
an error stating that the java params are empty.

 

This is my spark submit command: 

    output=`spark-submit \
 --class com.demo.myApp.App \
 --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`

  was:
driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
submit command. Even though I see the parameters are being passed fine in the 
spark launch command I do not see these parameters are being picked up for some 
unknown reason. 

 

This is my spark submit command: 

    output=`spark-submit \
 --class com.demo.myApp.App \
 --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`


> parameters passed in extraJavaOptions are not being picked up 
> --
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions are not being picked up. 
> Even though I can see the parameters being passed in the spark launch command, 
> they do not seem to be picked up, for some unknown reason. My source code throws 
> an error stating that the java params are empty.
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) parameters passed in extraJavaOptions are not being picked up

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Summary: parameters passed in extraJavaOptions are not being picked up  
(was: extraJavaOptions does not work )

> parameters passed in extraJavaOptions are not being picked up
> -
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Description: 
driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
submit command. Even though I see the parameters are being passed fine in the 
spark launch command I do not see these parameters are being picked up for some 
unknown reason. 

 

This is my spark submit command: 

    output=`spark-submit \
 --class com.demo.myApp.App \
 --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
-Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`

  was:
driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
submit command. Even though I see the parameters are being passed fine in the 
spark launch command I do not see these parameters are being picked up for some 
unknown reason. 

 

This is my spark submit command: 

    output=`spark-submit \
 --class com.demo.myApp.App \
 --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`


> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740683#comment-16740683
 ] 

Ravindra edited comment on SPARK-26606 at 1/11/19 7:17 PM:
---

So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the individual parts like  "-Dapp.env=prod"  "-Dapp.country=US" 
"-Dapp.banner=WMT" are missing. I am suspecting this could be the reason why my 
source code throwing an exception that the jvm params are null.   !Screen Shot 
2019-01-11 at 1.12.33 PM.png!


was (Author: rrb441):
So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the parts "-Dapp.env=prod"  "-Dapp.country=US" "-Dapp.banner=WMT" are 
missing. I am suspecting this could be the reason why my source code throwing 
an exception that the jvm params are null.  !Screen Shot 2019-01-11 at 1.12.33 
PM.png!

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740683#comment-16740683
 ] 

Ravindra commented on SPARK-26606:
--

So, if you look at the below image, you can see the spark launch command for my 
spark job. So, you can see the launch command which has the expected values. 
But the parts "-Dapp.env=prod"  "-Dapp.country=US" "-Dapp.banner=WMT" are 
missing. I am suspecting this could be the reason why my source code throwing 
an exception that the jvm params are null.  !Screen Shot 2019-01-11 at 1.12.33 
PM.png!

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Attachment: Screen Shot 2019-01-11 at 1.12.33 PM.png

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png, Screen Shot 
> 2019-01-11 at 1.12.33 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Marcelo Vanzin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740675#comment-16740675
 ] 

Marcelo Vanzin commented on SPARK-26606:


Can you define what "does not work" means?

Because it works for me based on my expectations.

{noformat}
$ ./bin/spark-shell --conf "spark.driver.extraJavaOptions=-Dprop1=$LOGNAME"
...
scala> sys.props("prop1")
res0: String = vanzin
{noformat}
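
In cluster mode the driver JVM is launched on a remote node, so the check has to 
happen inside the application rather than in the submitting shell. A minimal 
sketch, assuming a hypothetical main class com.example.PrintProp that only prints 
System.getProperty("app.env"), with placeholder master URL and jar path:

{noformat}
$ ./bin/spark-submit \
    --master spark://master:7077 \
    --deploy-mode cluster \
    --conf "spark.driver.extraJavaOptions=-Dapp.env=dev" \
    --class com.example.PrintProp \
    /path/to/app.jar
# the driver's stdout in the cluster UI should then show the value "dev"
{noformat}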


> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740672#comment-16740672
 ] 

Ravindra edited comment on SPARK-26606 at 1/11/19 7:10 PM:
---

I have tried removing the single quotes too, and even hardcoded the reference 
variables; it still does not work. You can also see in the attached screenshot that 
it picks up the reference variable values.


was (Author: rrb441):
I have tried removing the single quotes too and even hardcoded the reference 
variables. It does not work yet

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US 
> -Dapp.banner=ABC -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740672#comment-16740672
 ] 

Ravindra commented on SPARK-26606:
--

I have tried removing the single quotes too and even hardcoded the reference 
variables. It does not work yet
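
One way to rule out shell quoting is to build the option string first and echo it 
before submitting (a sketch with placeholder values):

{noformat}
$ ENV=dev COUNTRY=US BANNER=ABC
$ CONF="spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY -Dapp.banner=$BANNER"
$ echo "$CONF"
spark.driver.extraJavaOptions=-Dapp.env=dev -Dapp.country=US -Dapp.banner=ABC
{noformat}

If the echoed string already looks right, the same "$CONF" can be passed to 
spark-submit via --conf, which removes the quoting question from the picture.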

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV 
> -Dapp.country=$COUNTRY -Dapp.banner=$BANNER 
> -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Description: 
driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
submit command. Even though I see the parameters are being passed fine in the 
spark launch command I do not see these parameters are being picked up for some 
unknown reason. 

 

This is my spark submit command: 

    output=`spark-submit \
 --class com.demo.myApp.App \
 --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`

  was:
driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
submit command. Even though I see the parameters are being passed fine in the 
spark launch command I do not see these parameters are being picked up for some 
unknown reason. 

 

This is my spark submit command: 

    output=`spark-submit \
--class com.demo.myApp.App \
--conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`


> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
>  --class com.demo.myApp.App \
>  --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV 
> -Dapp.country=$COUNTRY -Dapp.banner=$BANNER 
> -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=[file:/dev/./urandom|file:///dev/urandom]' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Marcelo Vanzin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740668#comment-16740668
 ] 

Marcelo Vanzin commented on SPARK-26606:


Define not working?

For example, I see you have both single quotes and variable references. That won't 
work the way you're probably expecting it to.
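
A minimal shell sketch of the quoting behavior (plain bash, nothing Spark-specific 
assumed):

{noformat}
$ ENV=dev
$ echo 'env is $ENV'    # single quotes: the shell does not expand the variable
env is $ENV
$ echo "env is $ENV"    # double quotes: the shell expands it
env is dev
{noformat}

With single-quoted --conf strings, the driver and executor JVMs receive the 
literal text $ENV rather than its value, so double quotes (or hardcoded values) 
are needed when variable expansion is intended.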

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
> --class com.demo.myApp.App \
> --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Labels: expert  (was: )

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: expert
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
> --class com.demo.myApp.App \
> --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Labels: java spark  (was: expert)

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
>  Labels: java, spark
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
> --class com.demo.myApp.App \
> --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra updated SPARK-26606:
-
Attachment: Screen Shot 2019-01-09 at 4.31.01 PM.png

> extraJavaOptions does not work 
> ---
>
> Key: SPARK-26606
> URL: https://issues.apache.org/jira/browse/SPARK-26606
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.1
>Reporter: Ravindra
>Priority: Major
> Attachments: Screen Shot 2019-01-09 at 4.31.01 PM.png
>
>
> driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
> submit command. Even though I see the parameters are being passed fine in the 
> spark launch command I do not see these parameters are being picked up for 
> some unknown reason. 
>  
> This is my spark submit command: 
>     output=`spark-submit \
> --class com.demo.myApp.App \
> --conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
> -Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
> -Djava.security.egd=file:/dev/./urandom' \
>  --executor-memory "$EXECUTOR_MEMORY" \
>  --executor-cores "$EXECUTOR_CORES" \
>  --total-executor-cores "$TOTAL_CORES" \
>  --driver-memory "$DRIVER_MEMORY" \
>  --deploy-mode cluster \
>  /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26606) extraJavaOptions does not work

2019-01-11 Thread Ravindra (JIRA)
Ravindra created SPARK-26606:


 Summary: extraJavaOptions does not work 
 Key: SPARK-26606
 URL: https://issues.apache.org/jira/browse/SPARK-26606
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 2.3.1
Reporter: Ravindra


driver.extraJavaOptions and executor.extraJavaOptions does not work in spark 
submit command. Even though I see the parameters are being passed fine in the 
spark launch command I do not see these parameters are being picked up for some 
unknown reason. 

 

This is my spark submit command: 

    output=`spark-submit \
--class com.demo.myApp.App \
--conf 'spark.executor.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --conf 'spark.driver.extraJavaOptions=-Dapp.env=$ENV -Dapp.country=$COUNTRY 
-Dapp.banner=$BANNER -Doracle.net.tns_admin=/work/artifacts/oracle/current 
-Djava.security.egd=file:/dev/./urandom' \
 --executor-memory "$EXECUTOR_MEMORY" \
 --executor-cores "$EXECUTOR_CORES" \
 --total-executor-cores "$TOTAL_CORES" \
 --driver-memory "$DRIVER_MEMORY" \
 --deploy-mode cluster \
 /home/spark/asm//current/myapp-*.jar 2>&1 &`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26605) New executors failing with expired tokens in client mode

2019-01-11 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-26605:
--

 Summary: New executors failing with expired tokens in client mode
 Key: SPARK-26605
 URL: https://issues.apache.org/jira/browse/SPARK-26605
 Project: Spark
  Issue Type: New Feature
  Components: YARN
Affects Versions: 2.4.0
Reporter: Marcelo Vanzin


We ran into an issue with new executors being started with expired tokens in 
client mode; cluster mode is fine. Master branch is also not affected.

This means that executors that start after 7 days would fail in this scenario.

Patch coming up.
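
The 7 days correspond to the typical maximum lifetime of HDFS delegation tokens; 
long-running applications normally work around it by letting Spark re-obtain 
tokens from a keytab. A sketch of such a submission (principal and paths are 
placeholders):

{noformat}
$ spark-submit \
    --master yarn \
    --deploy-mode client \
    --principal user@EXAMPLE.COM \
    --keytab /etc/security/keytabs/user.keytab \
    --class com.example.LongRunningApp \
    /path/to/app.jar
{noformat}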



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-26482) Use ConfigEntry for hardcoded configs for ui categories.

2019-01-11 Thread Marcelo Vanzin (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin resolved SPARK-26482.

   Resolution: Fixed
Fix Version/s: 3.0.0

Issue resolved by pull request 23423
[https://github.com/apache/spark/pull/23423]

> Use ConfigEntry for hardcoded configs for ui categories.
> 
>
> Key: SPARK-26482
> URL: https://issues.apache.org/jira/browse/SPARK-26482
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Takuya Ueshin
>Assignee: Jungtaek Lim
>Priority: Major
> Fix For: 3.0.0
>
>
> Make the following hardcoded configs use ConfigEntry.
> {code}
> spark.ui
> spark.ssl
> spark.authenticate
> spark.master.rest
> spark.master.ui
> spark.metrics
> spark.admin
> spark.modify.acl
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26482) Use ConfigEntry for hardcoded configs for ui categories.

2019-01-11 Thread Marcelo Vanzin (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin reassigned SPARK-26482:
--

Assignee: Jungtaek Lim

> Use ConfigEntry for hardcoded configs for ui categories.
> 
>
> Key: SPARK-26482
> URL: https://issues.apache.org/jira/browse/SPARK-26482
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Takuya Ueshin
>Assignee: Jungtaek Lim
>Priority: Major
>
> Make the following hardcoded configs use ConfigEntry.
> {code}
> spark.ui
> spark.ssl
> spark.authenticate
> spark.master.rest
> spark.master.ui
> spark.metrics
> spark.admin
> spark.modify.acl
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26591) illegal hardware instruction

2019-01-11 Thread Bryan Cutler (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740571#comment-16740571
 ] 

Bryan Cutler commented on SPARK-26591:
--

Could you share some details of your pyarrow installation: the version, whether 
you pip installed it, and whether you are using a virtual env? If possible, I 
would create a clean virtual environment and try the installation again; it 
sounds like something went bad.
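
A minimal way to do that check (package versions left unpinned on purpose):

{noformat}
$ python3 -m venv /tmp/pyarrow-test
$ . /tmp/pyarrow-test/bin/activate
$ pip install --upgrade pip
$ pip install pyspark pyarrow
$ python -c "import pyarrow; print(pyarrow.__version__)"
{noformat}

If the bare import already dies with an illegal instruction, that points at the 
pyarrow binary wheel (or the CPU it was built for) rather than at Spark.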

> illegal hardware instruction
> 
>
> Key: SPARK-26591
> URL: https://issues.apache.org/jira/browse/SPARK-26591
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.0
> Environment: Python 3.6.7
> Pyspark 2.4.0
> OS:
> {noformat}
> Linux 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 
> x86_64 x86_64 GNU/Linux{noformat}
> CPU:
>  
> {code:java}
> Dual core AMD Athlon II P360 (-MCP-) cache: 1024 KB
> clock speeds: max: 2300 MHz 1: 1700 MHz 2: 1700 MHz
> {code}
>  
>  
>Reporter: Elchin
>Priority: Critical
>
> When I try to use pandas_udf from examples in 
> [documentation|https://spark.apache.org/docs/2.4.0/api/python/pyspark.sql.html#pyspark.sql.functions.pandas_udf]:
> {code:java}
> from pyspark.sql.functions import pandas_udf, PandasUDFType
> from pyspark.sql.types import IntegerType, StringType
> slen = pandas_udf(lambda s: s.str.len(), IntegerType())  # it crashes here{code}
> I get the error:
> {code:java}
> [1]    17969 illegal hardware instruction (core dumped)  python3{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-26576) Broadcast hint not applied to partitioned table

2019-01-11 Thread Xiao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-26576.
-
   Resolution: Fixed
 Assignee: John Zhuge
Fix Version/s: 3.0.0
   2.4.1
   2.3.3

> Broadcast hint not applied to partitioned table
> ---
>
> Key: SPARK-26576
> URL: https://issues.apache.org/jira/browse/SPARK-26576
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.2, 2.3.2, 2.4.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Major
> Fix For: 2.3.3, 2.4.1, 3.0.0
>
>
> The broadcast hint is not applied to a partitioned Parquet table. Below, 
> "SortMergeJoin" is chosen incorrectly and "ResolvedHint(broadcast)" is removed 
> from the Optimized Plan.
> {noformat}
> scala> spark.sql("CREATE TABLE jzhuge.parquet_with_part (val STRING) 
> PARTITIONED BY (dateint INT) STORED AS parquet")
> scala> spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
> scala> Seq(spark.table("jzhuge.parquet_with_part")).map(df => 
> df.join(broadcast(df), "dateint").explain(true))
> == Parsed Logical Plan ==
> 'Join UsingJoin(Inner,List(dateint))
> :- SubqueryAlias `jzhuge`.`parquet_with_part`
> :  +- Relation[val#28,dateint#29] parquet
> +- ResolvedHint (broadcast)
>+- SubqueryAlias `jzhuge`.`parquet_with_part`
>   +- Relation[val#32,dateint#33] parquet
> == Analyzed Logical Plan ==
> dateint: int, val: string, val: string
> Project [dateint#29, val#28, val#32]
> +- Join Inner, (dateint#29 = dateint#33)
>:- SubqueryAlias `jzhuge`.`parquet_with_part`
>:  +- Relation[val#28,dateint#29] parquet
>+- ResolvedHint (broadcast)
>   +- SubqueryAlias `jzhuge`.`parquet_with_part`
>  +- Relation[val#32,dateint#33] parquet
> == Optimized Logical Plan ==
> Project [dateint#29, val#28, val#32]
> +- Join Inner, (dateint#29 = dateint#33)
>:- Project [val#28, dateint#29]
>:  +- Filter isnotnull(dateint#29)
>: +- Relation[val#28,dateint#29] parquet
>+- Project [val#32, dateint#33]
>   +- Filter isnotnull(dateint#33)
>  +- Relation[val#32,dateint#33] parquet
> == Physical Plan ==
> *(5) Project [dateint#29, val#28, val#32]
> +- *(5) SortMergeJoin [dateint#29], [dateint#33], Inner
>:- *(2) Sort [dateint#29 ASC NULLS FIRST], false, 0
>:  +- Exchange(coordinator id: 55629191) hashpartitioning(dateint#29, 
> 500), coordinator[target post-shuffle partition size: 67108864]
>: +- *(1) FileScan parquet jzhuge.parquet_with_part[val#28,dateint#29] 
> Batched: true, Format: Parquet, Location: PrunedInMemoryFileIndex[], 
> PartitionCount: 0, PartitionFilters: [isnotnull(dateint#29)], PushedFilters: 
> [], ReadSchema: struct
>+- *(4) Sort [dateint#33 ASC NULLS FIRST], false, 0
>   +- ReusedExchange [val#32, dateint#33], Exchange(coordinator id: 
> 55629191) hashpartitioning(dateint#29, 500), coordinator[target post-shuffle 
> partition size: 67108864]
> {noformat}
> The broadcast hint is applied to a Parquet table without partitions. Below, 
> "BroadcastHashJoin" is chosen as expected.
> {noformat}
> scala> spark.sql("CREATE TABLE jzhuge.parquet_no_part (val STRING, dateint 
> INT) STORED AS parquet")
> scala> spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
> scala> Seq(spark.table("jzhuge.parquet_no_part")).map(df => 
> df.join(broadcast(df), "dateint").explain(true))
> == Parsed Logical Plan ==
> 'Join UsingJoin(Inner,List(dateint))
> :- SubqueryAlias `jzhuge`.`parquet_no_part`
> :  +- Relation[val#44,dateint#45] parquet
> +- ResolvedHint (broadcast)
>+- SubqueryAlias `jzhuge`.`parquet_no_part`
>   +- Relation[val#50,dateint#51] parquet
> == Analyzed Logical Plan ==
> dateint: int, val: string, val: string
> Project [dateint#45, val#44, val#50]
> +- Join Inner, (dateint#45 = dateint#51)
>:- SubqueryAlias `jzhuge`.`parquet_no_part`
>:  +- Relation[val#44,dateint#45] parquet
>+- ResolvedHint (broadcast)
>   +- SubqueryAlias `jzhuge`.`parquet_no_part`
>  +- Relation[val#50,dateint#51] parquet
> == Optimized Logical Plan ==
> Project [dateint#45, val#44, val#50]
> +- Join Inner, (dateint#45 = dateint#51)
>:- Filter isnotnull(dateint#45)
>:  +- Relation[val#44,dateint#45] parquet
>+- ResolvedHint (broadcast)
>   +- Filter isnotnull(dateint#51)
>  +- Relation[val#50,dateint#51] parquet
> == Physical Plan ==
> *(2) Project [dateint#45, val#44, val#50]
> +- *(2) BroadcastHashJoin [dateint#45], [dateint#51], Inner, BuildRight
>:- *(2) Project [val#44, dateint#45]
>:  +- *(2) Filter isnotnull(dateint#45)
>: +- *(2) FileScan parquet jzhuge.parquet_no_part[val#44,dateint#45] 
> Batched: true, Format: Parquet, Location: InMemoryFileIndex[...], 
> PartitionFilters: [], PushedFilters: [IsNotNull(

[jira] [Comment Edited] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-11 Thread Dongjoon Hyun (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734934#comment-16734934
 ] 

Dongjoon Hyun edited comment on SPARK-25692 at 1/11/19 5:16 PM:


Hi, [~zsxwing] and [~tgraves]. 

While looking at other failures, I noticed that this failure still happens 
frequently. 

The failure is always `fetchBothChunks`. The `amp-jenkins-worker-05` machine might 
be related.

- [master 
5856|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5856/]
 (amp-jenkins-worker-05)
- [master 
5837|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5837/testReport]
 (amp-jenkins-worker-05)
- [master 
5835|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5835/testReport]
 (amp-jenkins-worker-05)
- [master 
5829|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5829/testReport]
 (amp-jenkins-worker-05)
- [master 
5828|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5828/testReport]
 (amp-jenkins-worker-05)
- [master 
5822|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5822/testReport]
 (amp-jenkins-worker-05)
- [master 
5814|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5814/testReport]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100784|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100784/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100785|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100785/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100787|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100787/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100788|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100788/consoleFull]
 (amp-jenkins-worker-05)


was (Author: dongjoon):
Hi, [~zsxwing] and [~tgraves]. 

While looking other failures, I notice that this failure still happens 
frequently. 

The failure is always `fetchBothChunks`. `amp-jenkins-worker-05` machine might 
be related.

- [master 
5856|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5856/]
 (amp-jenkins-worker-05)
- [master 
5835|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5835/testReport]
 (amp-jenkins-worker-05)
- [master 
5829|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5829/testReport]
 (amp-jenkins-worker-05)
- [master 
5828|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5828/testReport]
 (amp-jenkins-worker-05)
- [master 
5822|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5822/testReport]
 (amp-jenkins-worker-05)
- [master 
5814|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5814/testReport]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100784|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100784/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100785|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100785/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100787|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100787/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100788|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100788/consoleFull]
 (amp-jenkins-worker-05)

> Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks
> --
>
> Key: SPARK-25692
> URL: https://issues.apache.org/jira/browse/SPARK-25692
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Shixiong Zhu
>Priority: Blocker
> Attachments: Screen Shot 2018-10-22 at 4.12.41 PM.png, Screen Shot 
> 2018-11-01 at 10.17.16 AM.png
>
>
> Looks like the whole test suite is pretty flaky. See: 
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.6/5490/testReport/junit/org.apache.spark.network/ChunkFetchIntegrationSuite/history/
> This may be a regression in 3.0 as this didn't happen in 2.4 branch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-11 Thread Dongjoon Hyun (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734934#comment-16734934
 ] 

Dongjoon Hyun edited comment on SPARK-25692 at 1/11/19 5:08 PM:


Hi, [~zsxwing] and [~tgraves]. 

While looking other failures, I notice that this failure still happens 
frequently. 

The failure is always `fetchBothChunks`. `amp-jenkins-worker-05` machine might 
be related.

- [master 
5856|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5856/]
- [master 
5835|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5835/testReport]
 (amp-jenkins-worker-05)
- [master 
5829|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5829/testReport]
 (amp-jenkins-worker-05)
- [master 
5828|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5828/testReport]
 (amp-jenkins-worker-05)
- [master 
5822|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5822/testReport]
 (amp-jenkins-worker-05)
- [master 
5814|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5814/testReport]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100784|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100784/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100785|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100785/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100787|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100787/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100788|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100788/consoleFull]
 (amp-jenkins-worker-05)


was (Author: dongjoon):
Hi, [~zsxwing] and [~tgraves]. 

While looking other failures, I notice that this failure still happens 
frequently. 

The failure is always `fetchBothChunks`. `amp-jenkins-worker-05` machine might 
be related.

- [master 
5835|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5835/testReport]
 (amp-jenkins-worker-05)
- [master 
5829|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5829/testReport]
 (amp-jenkins-worker-05)
- [master 
5828|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5828/testReport]
 (amp-jenkins-worker-05)
- [master 
5822|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5822/testReport]
 (amp-jenkins-worker-05)
- [master 
5814|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5814/testReport]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100784|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100784/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100785|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100785/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100787|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100787/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100788|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100788/consoleFull]
 (amp-jenkins-worker-05)

> Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks
> --
>
> Key: SPARK-25692
> URL: https://issues.apache.org/jira/browse/SPARK-25692
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Shixiong Zhu
>Priority: Blocker
> Attachments: Screen Shot 2018-10-22 at 4.12.41 PM.png, Screen Shot 
> 2018-11-01 at 10.17.16 AM.png
>
>
> Looks like the whole test suite is pretty flaky. See: 
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.6/5490/testReport/junit/org.apache.spark.network/ChunkFetchIntegrationSuite/history/
> This may be a regression in 3.0 as this didn't happen in 2.4 branch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-25692) Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks

2019-01-11 Thread Dongjoon Hyun (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734934#comment-16734934
 ] 

Dongjoon Hyun edited comment on SPARK-25692 at 1/11/19 5:08 PM:


Hi, [~zsxwing] and [~tgraves]. 

While looking at other failures, I noticed that this failure still happens 
frequently. 

The failure is always `fetchBothChunks`. The `amp-jenkins-worker-05` machine 
might be related.

- [master 
5856|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5856/]
 (amp-jenkins-worker-05)
- [master 
5835|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5835/testReport]
 (amp-jenkins-worker-05)
- [master 
5829|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5829/testReport]
 (amp-jenkins-worker-05)
- [master 
5828|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5828/testReport]
 (amp-jenkins-worker-05)
- [master 
5822|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5822/testReport]
 (amp-jenkins-worker-05)
- [master 
5814|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5814/testReport]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100784|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100784/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100785|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100785/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100787|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100787/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100788|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100788/consoleFull]
 (amp-jenkins-worker-05)


was (Author: dongjoon):
Hi, [~zsxwing] and [~tgraves]. 

While looking at other failures, I noticed that this failure still happens 
frequently. 

The failure is always `fetchBothChunks`. The `amp-jenkins-worker-05` machine 
might be related.

- [master 
5856|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5856/]
- [master 
5835|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5835/testReport]
 (amp-jenkins-worker-05)
- [master 
5829|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5829/testReport]
 (amp-jenkins-worker-05)
- [master 
5828|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5828/testReport]
 (amp-jenkins-worker-05)
- [master 
5822|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5822/testReport]
 (amp-jenkins-worker-05)
- [master 
5814|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5814/testReport]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100784|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100784/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100785|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100785/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100787|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100787/consoleFull]
 (amp-jenkins-worker-05)

- [SparkPullRequestBuilder 
100788|https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/100788/consoleFull]
 (amp-jenkins-worker-05)

> Flaky test: ChunkFetchIntegrationSuite.fetchBothChunks
> --
>
> Key: SPARK-25692
> URL: https://issues.apache.org/jira/browse/SPARK-25692
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Shixiong Zhu
>Priority: Blocker
> Attachments: Screen Shot 2018-10-22 at 4.12.41 PM.png, Screen Shot 
> 2018-11-01 at 10.17.16 AM.png
>
>
> Looks like the whole test suite is pretty flaky. See: 
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.6/5490/testReport/junit/org.apache.spark.network/ChunkFetchIntegrationSuite/history/
> This may be a regression in 3.0 as this didn't happen in 2.4 branch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-18165) Kinesis support in Structured Streaming

2019-01-11 Thread Aman Mundra (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740555#comment-16740555
 ] 

Aman Mundra commented on SPARK-18165:
-

 Hi [~itsvikramagr], any idea when this jar would be coming out as a 
production feature in Spark 2.2 or later?

> Kinesis support in Structured Streaming
> ---
>
> Key: SPARK-18165
> URL: https://issues.apache.org/jira/browse/SPARK-18165
> Project: Spark
>  Issue Type: New Feature
>  Components: Structured Streaming
>Reporter: Lauren Moos
>Priority: Major
>
> Implement Kinesis based sources and sinks for Structured Streaming



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26603) Update minikube backend in K8s integration tests

2019-01-11 Thread Stavros Kontopoulos (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stavros Kontopoulos updated SPARK-26603:

Priority: Major  (was: Minor)

> Update minikube backend in K8s integration tests
> 
>
> Key: SPARK-26603
> URL: https://issues.apache.org/jira/browse/SPARK-26603
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.4.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> The minikube status command's output has changed 
> ([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
>  in the latest releases (>0.30).
> Old output:
> {quote}minikube status
>  There is a newer version of minikube available (v0.31.0). Download it here:
>  [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]
> To disable this notification, run the following:
>  minikube config set WantUpdateNotification false
>  minikube: 
>  cluster: 
>  kubectl: 
> {quote}
> new output:
> {quote}minikube status
>  host: Running
>  kubelet: Running
>  apiserver: Running
>  kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
> {quote}
> That means users with the latest version of minikube will not be able to run 
> the integration tests.
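
The integration tests presumably decide whether the cluster is up by parsing 
this output. A self-contained sketch of a parser that would accept both 
formats (names are illustrative, not Spark's actual test backend code):

{code:java}
// Sketch only: accept both the old (<=0.30) and new (>0.30) status formats.
// Old output keys readiness on "minikube: Running"; new output on "host: Running".
def minikubeIsRunning(statusOutput: Seq[String]): Boolean = {
  val fields = statusOutput.flatMap { line =>
    line.split(":", 2) match {
      case Array(k, v) => Some(k.trim -> v.trim)
      case _           => None
    }
  }.toMap
  fields.get("host").orElse(fields.get("minikube")).contains("Running")
}

// e.g. minikubeIsRunning(Seq("host: Running", "kubelet: Running")) == true
{code}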



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26604) Register channel for stream request

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26604:


Assignee: (was: Apache Spark)

> Register channel for stream request
> ---
>
> Key: SPARK-26604
> URL: https://issues.apache.org/jira/browse/SPARK-26604
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Liang-Chi Hsieh
>Priority: Minor
>
> Now in {{TransportRequestHandler.processStreamRequest}}, when a stream 
> request is processed, the stream id is not registered with the current 
> channel in the stream manager. It should be, so that if the channel gets 
> terminated, the streams associated with it can be removed as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26604) Register channel for stream request

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26604:


Assignee: Apache Spark

> Register channel for stream request
> ---
>
> Key: SPARK-26604
> URL: https://issues.apache.org/jira/browse/SPARK-26604
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Liang-Chi Hsieh
>Assignee: Apache Spark
>Priority: Minor
>
> Now in {{TransportRequestHandler.processStreamRequest}}, when a stream 
> request is processed, the stream id is not registered with the current 
> channel in the stream manager. It should be, so that if the channel gets 
> terminated, the streams associated with it can be removed as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26604) Register channel for stream request

2019-01-11 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-26604:
---

 Summary: Register channel for stream request
 Key: SPARK-26604
 URL: https://issues.apache.org/jira/browse/SPARK-26604
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Liang-Chi Hsieh


Now in {{TransportRequestHandler.processStreamRequest}}, when a stream request 
is processed, the stream id is not registered with the current channel in the 
stream manager. It should be, so that if the channel gets terminated, the 
streams associated with it can be removed as well.
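
A self-contained sketch of the intended bookkeeping, with illustrative names 
rather than Spark's actual classes: the manager remembers which channel opened 
each stream, so terminating the channel can release those streams too.

{code:java}
import scala.collection.mutable

// Sketch only: a stream manager that ties stream ids to the channel that
// requested them, mirroring what processStreamRequest should do.
class SketchStreamManager {
  private val streamsByChannel = mutable.Map.empty[String, mutable.Set[Long]]

  // Call this while serving a stream request: associate the stream with its channel.
  def registerChannel(channelId: String, streamId: Long): Unit =
    streamsByChannel.getOrElseUpdate(channelId, mutable.Set.empty[Long]) += streamId

  // Call this when the channel terminates: release every stream it had open.
  def connectionTerminated(channelId: String): Unit =
    streamsByChannel.remove(channelId).foreach(_.foreach(releaseStream))

  private def releaseStream(streamId: Long): Unit =
    println(s"releasing stream $streamId")
}
{code}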



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-26503) Get rid of spark.sql.legacy.timeParser.enabled

2019-01-11 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-26503.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

Resolved by https://github.com/apache/spark/pull/23495

> Get rid of spark.sql.legacy.timeParser.enabled
> --
>
> Key: SPARK-26503
> URL: https://issues.apache.org/jira/browse/SPARK-26503
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Sean Owen
>Priority: Minor
> Fix For: 3.0.0
>
>
> The flag is used in the CSV/JSON datasources, as well as in time-related 
> functions, to control parsing/formatting of dates/timestamps. By default, 
> DateTimeFormat is used for this purpose, but the flag allows switching back 
> to SimpleDateFormat and some fallback behavior. In the major release 3.0, the 
> flag should be removed, and the new formatters 
> DateFormatter/TimestampFormatter should be used by default.
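
A self-contained sketch of the flag-switch pattern being removed (illustrative 
names; not the actual DateFormatter/TimestampFormatter implementation):

{code:java}
import java.text.SimpleDateFormat
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Sketch only: a legacy flag choosing between SimpleDateFormat parsing and the
// newer java.time-based parsing; removing the flag keeps only the second path.
def parseDateToEpochDay(s: String, pattern: String, legacyEnabled: Boolean): Long = {
  if (legacyEnabled) {
    // Legacy path: SimpleDateFormat (timezone-sensitive, lenient parsing).
    new SimpleDateFormat(pattern).parse(s).getTime / (24L * 3600L * 1000L)
  } else {
    // New path: strict java.time formatter.
    LocalDate.parse(s, DateTimeFormatter.ofPattern(pattern)).toEpochDay
  }
}

// e.g. parseDateToEpochDay("2019-01-11", "yyyy-MM-dd", legacyEnabled = false)
{code}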



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26503) Get rid of spark.sql.legacy.timeParser.enabled

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26503:


Assignee: Apache Spark  (was: Sean Owen)

> Get rid of spark.sql.legacy.timeParser.enabled
> --
>
> Key: SPARK-26503
> URL: https://issues.apache.org/jira/browse/SPARK-26503
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Apache Spark
>Priority: Minor
>
> The flag is used in the CSV/JSON datasources, as well as in time-related 
> functions, to control parsing/formatting of dates/timestamps. By default, 
> DateTimeFormat is used for this purpose, but the flag allows switching back 
> to SimpleDateFormat and some fallback behavior. In the major release 3.0, the 
> flag should be removed, and the new formatters 
> DateFormatter/TimestampFormatter should be used by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26503) Get rid of spark.sql.legacy.timeParser.enabled

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26503:


Assignee: Sean Owen  (was: Apache Spark)

> Get rid of spark.sql.legacy.timeParser.enabled
> --
>
> Key: SPARK-26503
> URL: https://issues.apache.org/jira/browse/SPARK-26503
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Sean Owen
>Priority: Minor
>
> The flag is used in the CSV/JSON datasources, as well as in time-related 
> functions, to control parsing/formatting of dates/timestamps. By default, 
> DateTimeFormat is used for this purpose, but the flag allows switching back 
> to SimpleDateFormat and some fallback behavior. In the major release 3.0, the 
> flag should be removed, and the new formatters 
> DateFormatter/TimestampFormatter should be used by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-26503) Get rid of spark.sql.legacy.timeParser.enabled

2019-01-11 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen reopened SPARK-26503:
---

> Get rid of spark.sql.legacy.timeParser.enabled
> --
>
> Key: SPARK-26503
> URL: https://issues.apache.org/jira/browse/SPARK-26503
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Sean Owen
>Priority: Minor
>
> The flag is used in the CSV/JSON datasources, as well as in time-related 
> functions, to control parsing/formatting of dates/timestamps. By default, 
> DateTimeFormat is used for this purpose, but the flag allows switching back 
> to SimpleDateFormat and some fallback behavior. In the major release 3.0, the 
> flag should be removed, and the new formatters 
> DateFormatter/TimestampFormatter should be used by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26503) Get rid of spark.sql.legacy.timeParser.enabled

2019-01-11 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen reassigned SPARK-26503:
-

Assignee: Sean Owen

> Get rid of spark.sql.legacy.timeParser.enabled
> --
>
> Key: SPARK-26503
> URL: https://issues.apache.org/jira/browse/SPARK-26503
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Sean Owen
>Priority: Minor
>
> The flag is used in the CSV/JSON datasources, as well as in time-related 
> functions, to control parsing/formatting of dates/timestamps. By default, 
> DateTimeFormat is used for this purpose, but the flag allows switching back 
> to SimpleDateFormat and some fallback behavior. In the major release 3.0, the 
> flag should be removed, and the new formatters 
> DateFormatter/TimestampFormatter should be used by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26602) Once creating and querying a udf with an incorrect path, followed by querying tables or functions registered with a correct path, gives a runtime exception within the same session

2019-01-11 Thread Haripriya (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haripriya updated SPARK-26602:
--
Description: 
In SQL:

1. Query the existing udf (say myFunc1).

2. Create and select the udf registered with an incorrect path (say myFunc2).

3. Now query the existing udf again in the same session - it will throw an 
exception stating that it couldn't read the resource at myFunc2's path.

4. Even basic operations like insert and select will fail with the same 
error.

Result: 

java.lang.RuntimeException: Failed to read external resource 
hdfs:///tmp/hari_notexists1/two_udfs.jar
 at 
org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1288)
 at 
org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1242)
 at 
org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
 at 
org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
 at 
org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:737)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:706)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:706)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:696)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.addJar(HiveClientImpl.scala:841)
 at 
org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:112)

  was:
In SQL:

1. Query the existing udf (say myFunc1).

2. Create and select the udf registered with an incorrect path (say myFunc2).

3. Now query the existing udf again in the same session - it will throw an 
exception stating that it couldn't read the resource at myFunc2's path.

Result: 

java.lang.RuntimeException: Failed to read external resource 
hdfs:///tmp/hari_notexists1/two_udfs.jar
 at 
org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1288)
 at 
org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1242)
 at 
org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
 at 
org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
 at 
org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:737)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:706)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:706)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:696)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.addJar(HiveClientImpl.scala:841)
 at 
org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:112)

Summary: Once creating and querying a udf with an incorrect path, followed by 
querying tables or functions registered with a correct path, gives a runtime 
exception within the same session  (was: Once creating and querying a udf with 
an incorrect path, even the functions registered with a correct path follow the 
same incorrect path in that session)

> Once creating and querying a udf with an incorrect path, followed by querying 
> tables or functions registered with a correct path, gives a runtime exception 
> within the same session
> ---
>
> Key: SPARK-26602
> URL: https://issues.apache.org/jira/browse/SPARK-26602
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Haripriya
>Priority: Major
>
> In sql,
> 1. Query the existing udf (say myFunc1).
> 2. Create and select the udf registered with an incorrect path (say my

[jira] [Assigned] (SPARK-26603) Update minikube backend in K8s integration tests

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26603:


Assignee: (was: Apache Spark)

> Update minikube backend in K8s integration tests
> 
>
> Key: SPARK-26603
> URL: https://issues.apache.org/jira/browse/SPARK-26603
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.4.0
>Reporter: Stavros Kontopoulos
>Priority: Minor
>
> The minikube status command's output has changed 
> ([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
>  in the latest releases (>0.30).
> Old output:
> {quote}minikube status
>  There is a newer version of minikube available (v0.31.0). Download it here:
>  [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]
> To disable this notification, run the following:
>  minikube config set WantUpdateNotification false
>  minikube: 
>  cluster: 
>  kubectl: 
> {quote}
> new output:
> {quote}minikube status
>  host: Running
>  kubelet: Running
>  apiserver: Running
>  kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
> {quote}
> That means users with the latest version of minikube will not be able to run 
> the integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26603) Update minikube backend in K8s integration tests

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26603:


Assignee: Apache Spark

> Update minikube backend in K8s integration tests
> 
>
> Key: SPARK-26603
> URL: https://issues.apache.org/jira/browse/SPARK-26603
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.4.0
>Reporter: Stavros Kontopoulos
>Assignee: Apache Spark
>Priority: Minor
>
> The minikube status command's output has changed 
> ([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
>  in the latest releases (>0.30).
> Old output:
> {quote}minikube status
>  There is a newer version of minikube available (v0.31.0). Download it here:
>  [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]
> To disable this notification, run the following:
>  minikube config set WantUpdateNotification false
>  minikube: 
>  cluster: 
>  kubectl: 
> {quote}
> new output:
> {quote}minikube status
>  host: Running
>  kubelet: Running
>  apiserver: Running
>  kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
> {quote}
> That means users with the latest version of minikube will not be able to run 
> the integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26603) Update minikube backend in K8s integration tests

2019-01-11 Thread Stavros Kontopoulos (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stavros Kontopoulos updated SPARK-26603:

Description: 
Minikube status command has changed 
(https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788)
 its output in the latest releases >0.30.

Old output:
```minikube status
There is a newer version of minikube available (v0.31.0). Download it here:
https://github.com/kubernetes/minikube/releases/tag/v0.31.0

To disable this notification, run the following:
minikube config set WantUpdateNotification false
minikube: 
cluster: 
kubectl:``` 
new output:
```minikube status
host: Running
kubelet: Running
apiserver: Running
kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77```

That means users with the latest version of minikube will not be able to run 
the integration tests.

  was:
Minikube status command has changed its output in the latest releases >0.30.

That means users with the latest version of minikube will not be able to run 
the integration tests.


> Update minikube backend in K8s integration tests
> 
>
> Key: SPARK-26603
> URL: https://issues.apache.org/jira/browse/SPARK-26603
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.4.0
>Reporter: Stavros Kontopoulos
>Priority: Minor
>
> Minikube status command has changed 
> (https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788)
>  its output in the latest releases >0.30.
> Old output:
> ```minikube status
> There is a newer version of minikube available (v0.31.0). Download it here:
> https://github.com/kubernetes/minikube/releases/tag/v0.31.0
> To disable this notification, run the following:
> minikube config set WantUpdateNotification false
> minikube: 
> cluster: 
> kubectl:``` 
> new output:
> ```minikube status
> host: Running
> kubelet: Running
> apiserver: Running
> kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77```
> That means users with the latest version of minikube will not be able to run 
> the integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26603) Update minikube backend in K8s integration tests

2019-01-11 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-26603:
---

 Summary: Update minikube backend in K8s integration tests
 Key: SPARK-26603
 URL: https://issues.apache.org/jira/browse/SPARK-26603
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 2.4.0
Reporter: Stavros Kontopoulos


The minikube status command has changed its output in the latest releases 
(>0.30).

That means users with the latest version of minikube will not be able to run 
the integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26603) Update minikube backend in K8s integration tests

2019-01-11 Thread Stavros Kontopoulos (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stavros Kontopoulos updated SPARK-26603:

Description: 
Minikube status command has changed 
([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
 its output in the latest releases >0.30.

Old output:
{quote}
minikube status
 There is a newer version of minikube available (v0.31.0). Download it here:
 [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]

To disable this notification, run the following:
 minikube config set WantUpdateNotification false
 minikube: 
 cluster: 
 kubectl: 
{quote}

 new output:
{quote}
minikube status
 host: Running
 kubelet: Running
 apiserver: Running
 kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
{quote}
That means users with the latest version of minikube will not be able to run 
the integration tests.

  was:
Minikube status command has changed 
(https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788)
 its output in the latest releases >0.30.

Old output:
```minikube status
There is a newer version of minikube available (v0.31.0). Download it here:
https://github.com/kubernetes/minikube/releases/tag/v0.31.0

To disable this notification, run the following:
minikube config set WantUpdateNotification false
minikube: 
cluster: 
kubectl:``` 
new output:
```minikube status
host: Running
kubelet: Running
apiserver: Running
kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77```

That means users with the latest version of minikube will not be able to run 
the integration tests.


> Update minikube backend in K8s integration tests
> 
>
> Key: SPARK-26603
> URL: https://issues.apache.org/jira/browse/SPARK-26603
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.4.0
>Reporter: Stavros Kontopoulos
>Priority: Minor
>
> Minikube status command has changed 
> ([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
>  its output in the latest releases >0.30.
> Old output:
> {quote}
> minikube status
>  There is a newer version of minikube available (v0.31.0). Download it here:
>  [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]
> To disable this notification, run the following:
>  minikube config set WantUpdateNotification false
>  minikube: 
>  cluster: 
>  kubectl: 
> {quote}
>  new output:
> {quote}
> minikube status
>  host: Running
>  kubelet: Running
>  apiserver: Running
>  kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
> {quote}
> That means users with the latest version of minikube will not be able to run 
> the integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26603) Update minikube backend in K8s integration tests

2019-01-11 Thread Stavros Kontopoulos (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stavros Kontopoulos updated SPARK-26603:

Description: 
The minikube status command's output has changed 
([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
 in the latest releases (>0.30).

Old output:
{quote}minikube status
 There is a newer version of minikube available (v0.31.0). Download it here:
 [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]

To disable this notification, run the following:
 minikube config set WantUpdateNotification false
 minikube: 
 cluster: 
 kubectl: 
{quote}
new output:
{quote}minikube status
 host: Running
 kubelet: Running
 apiserver: Running
 kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
{quote}
That means users with the latest version of minikube will not be able to run 
the integration tests.

  was:
Minikube status command has changed 
([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
 its output in the latest releases >0.30.

Old output:
{quote}
minikube status
 There is a newer version of minikube available (v0.31.0). Download it here:
 [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]

To disable this notification, run the following:
 minikube config set WantUpdateNotification false
 minikube: 
 cluster: 
 kubectl: 
{quote}

 new output:
{quote}
minikube status
 host: Running
 kubelet: Running
 apiserver: Running
 kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
{quote}
That means users with the latest version of minikube will not be able to run 
the integration tests.


> Update minikube backend in K8s integration tests
> 
>
> Key: SPARK-26603
> URL: https://issues.apache.org/jira/browse/SPARK-26603
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.4.0
>Reporter: Stavros Kontopoulos
>Priority: Minor
>
> The minikube status command's output has changed 
> ([https://github.com/kubernetes/minikube/commit/cb3624dd089e7ab0c03fbfb81f20c2bde43a60f3#diff-bd0534bbb0703b4170d467d074373788])
>  in the latest releases (>0.30).
> Old output:
> {quote}minikube status
>  There is a newer version of minikube available (v0.31.0). Download it here:
>  [https://github.com/kubernetes/minikube/releases/tag/v0.31.0]
> To disable this notification, run the following:
>  minikube config set WantUpdateNotification false
>  minikube: 
>  cluster: 
>  kubectl: 
> {quote}
> new output:
> {quote}minikube status
>  host: Running
>  kubelet: Running
>  apiserver: Running
>  kubectl: Correctly Configured: pointing to minikube-vm at 172.31.34.77
> {quote}
> That means users with the latest version of minikube will not be able to run 
> the integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26602) Once creating and querying a udf with an incorrect path, even the functions registered with a correct path follow the same incorrect path in that session

2019-01-11 Thread Haripriya (JIRA)
Haripriya created SPARK-26602:
-

 Summary: Once creating and querying a udf with an incorrect path, even 
the functions registered with a correct path follow the same incorrect path in 
that session
 Key: SPARK-26602
 URL: https://issues.apache.org/jira/browse/SPARK-26602
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.0
Reporter: Haripriya


In SQL:

1. Query the existing udf (say myFunc1).

2. Create and select the udf registered with an incorrect path (say myFunc2).

3. Now query the existing udf again in the same session - it will throw an 
exception stating that it couldn't read the resource at myFunc2's path.

Result: 

java.lang.RuntimeException: Failed to read external resource 
hdfs:///tmp/hari_notexists1/two_udfs.jar
 at 
org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1288)
 at 
org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1242)
 at 
org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
 at 
org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
 at 
org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:737)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:706)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:706)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:696)
 at 
org.apache.spark.sql.hive.client.HiveClientImpl.addJar(HiveClientImpl.scala:841)
 at 
org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:112)
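
A hypothetical spark-shell repro of the reported sequence (the paths, class 
names, and the table are illustrative):

{code:java}
// Hypothetical repro; assumes a table `t` with a column `c` already exists.
spark.sql("CREATE FUNCTION myFunc1 AS 'com.example.Func1' USING JAR 'hdfs:///tmp/exists/udf1.jar'")
spark.sql("SELECT myFunc1(c) FROM t").show()   // step 1: existing udf works

spark.sql("CREATE FUNCTION myFunc2 AS 'com.example.Func2' USING JAR 'hdfs:///tmp/not_exists/udf2.jar'")
spark.sql("SELECT myFunc2(c) FROM t").show()   // step 2: fails, the jar path does not exist

spark.sql("SELECT myFunc1(c) FROM t").show()   // step 3: per this report, now fails with the same
                                               //         "Failed to read external resource" error
{code}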



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26601:


Assignee: (was: Apache Spark)

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, 
> and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = 
> ExecutionContext.fromExecutorService(
> ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
>  /**
>* Create a cached thread pool whose max number of threads is 
> `maxThreadNumber`. Thread names
>* are formatted as prefix-ID, where ID is a unique, sequentially assigned 
> integer.
>*/
>   def newDaemonCachedThreadPool(
>   prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): 
> ThreadPoolExecutor = {
> val threadFactory = namedThreadFactory(prefix)
> val threadPool = new ThreadPoolExecutor(
>   maxThreadNumber, // corePoolSize: the max number of threads to create 
> before queuing the tasks
>   maxThreadNumber, // maximumPoolSize: because we use 
> LinkedBlockingDeque, this one is not used
>   keepAliveSeconds,
>   TimeUnit.SECONDS,
>   new LinkedBlockingQueue[Runnable],
>   threadFactory)
> threadPool.allowCoreThreadTimeOut(true)
> threadPool
>   }
> {code}
> But sometimes, if the Thread objects are not GC'd quickly, this may cause a 
> server (driver) OOM. In such cases, we need to make this thread pool 
> configurable.
> Below is an example:
>  !26601-occupy.png! 
>  !26601-largeobject.png! 
>  !26601-path2gcroot.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26601:


Assignee: Apache Spark

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Assignee: Apache Spark
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, 
> and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = 
> ExecutionContext.fromExecutorService(
> ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
>  /**
>* Create a cached thread pool whose max number of threads is 
> `maxThreadNumber`. Thread names
>* are formatted as prefix-ID, where ID is a unique, sequentially assigned 
> integer.
>*/
>   def newDaemonCachedThreadPool(
>   prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): 
> ThreadPoolExecutor = {
> val threadFactory = namedThreadFactory(prefix)
> val threadPool = new ThreadPoolExecutor(
>   maxThreadNumber, // corePoolSize: the max number of threads to create 
> before queuing the tasks
>   maxThreadNumber, // maximumPoolSize: because we use 
> LinkedBlockingDeque, this one is not used
>   keepAliveSeconds,
>   TimeUnit.SECONDS,
>   new LinkedBlockingQueue[Runnable],
>   threadFactory)
> threadPool.allowCoreThreadTimeOut(true)
> threadPool
>   }
> {code}
> But sometimes, if the Thread objects are not GC'd quickly, this may cause a 
> server (driver) OOM. In such cases, we need to make this thread pool 
> configurable.
> Below is an example:
>  !26601-occupy.png! 
>  !26601-largeobject.png! 
>  !26601-path2gcroot.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Summary: Make broadcast-exchange thread pool keepalivetime and 
maxThreadNumber configurable  (was: Make broadcast-exchange thread pool 
keepalivetime configurable)

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 选区_001.png, 选区_002 (1).png, 选区_002.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, 
> and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = 
> ExecutionContext.fromExecutorService(
> ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
>  /**
>* Create a cached thread pool whose max number of threads is 
> `maxThreadNumber`. Thread names
>* are formatted as prefix-ID, where ID is a unique, sequentially assigned 
> integer.
>*/
>   def newDaemonCachedThreadPool(
>   prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): 
> ThreadPoolExecutor = {
> val threadFactory = namedThreadFactory(prefix)
> val threadPool = new ThreadPoolExecutor(
>   maxThreadNumber, // corePoolSize: the max number of threads to create 
> before queuing the tasks
>   maxThreadNumber, // maximumPoolSize: because we use 
> LinkedBlockingDeque, this one is not used
>   keepAliveSeconds,
>   TimeUnit.SECONDS,
>   new LinkedBlockingQueue[Runnable],
>   threadFactory)
> threadPool.allowCoreThreadTimeOut(true)
> threadPool
>   }
> {code}
> But sometimes, if the Thread objects are not GC'd quickly, this may cause a 
> server (driver) OOM.
> Below is an example:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26600) Update spark-submit usage message

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26600:


Assignee: (was: Apache Spark)

> Update spark-submit usage message
> -
>
> Key: SPARK-26600
> URL: https://issues.apache.org/jira/browse/SPARK-26600
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Luca Canali
>Priority: Minor
> Fix For: 3.0.0
>
>
> The spark-submit usage message should be brought in sync with recent changes, 
> in particular regarding K8S support. These are the proposed changes to the 
> usage message:
> --executor-cores NUM -> can be used for Spark on YARN and K8S 
> --principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN 
> and K8S
> --total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S 
> In addition this PR proposes to remove certain implementation details from 
> the --keytab argument description, as these details vary between YARN and 
> K8S, for example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26600) Update spark-submit usage message

2019-01-11 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26600:


Assignee: Apache Spark

> Update spark-submit usage message
> -
>
> Key: SPARK-26600
> URL: https://issues.apache.org/jira/browse/SPARK-26600
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Luca Canali
>Assignee: Apache Spark
>Priority: Minor
> Fix For: 3.0.0
>
>
> The spark-submit usage message should be brought in sync with recent changes, 
> in particular regarding K8S support. These are the proposed changes to the 
> usage message:
> --executor-cores NUM -> can be used for Spark on YARN and K8S 
> --principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN 
> and K8S
> --total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S 
> In addition this PR proposes to remove certain implementation details from 
> the --keytab argument description, as these details vary between YARN and 
> K8S, for example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Attachment: 选区_002 (1).png
选区_002.png
选区_001.png

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 选区_001.png, 选区_002 (1).png, 选区_002.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, 
> and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = 
> ExecutionContext.fromExecutorService(
> ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
>  /**
>* Create a cached thread pool whose max number of threads is 
> `maxThreadNumber`. Thread names
>* are formatted as prefix-ID, where ID is a unique, sequentially assigned 
> integer.
>*/
>   def newDaemonCachedThreadPool(
>   prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): 
> ThreadPoolExecutor = {
> val threadFactory = namedThreadFactory(prefix)
> val threadPool = new ThreadPoolExecutor(
>   maxThreadNumber, // corePoolSize: the max number of threads to create 
> before queuing the tasks
>   maxThreadNumber, // maximumPoolSize: because we use 
> LinkedBlockingDeque, this one is not used
>   keepAliveSeconds,
>   TimeUnit.SECONDS,
>   new LinkedBlockingQueue[Runnable],
>   threadFactory)
> threadPool.allowCoreThreadTimeOut(true)
> threadPool
>   }
> {code}
> But sometimes, if the Thread objects are not GC'd quickly, this may cause a 
> server (driver) OOM.
> Below is an example:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26600) Update spark-submit usage message

2019-01-11 Thread Luca Canali (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Canali updated SPARK-26600:

Description: 
The spark-submit usage message should be brought in sync with recent changes, 
in particular regarding K8S support. These are the proposed changes to the 
usage message:

--executor-cores NUM -> can be used for Spark on YARN and K8S 

--principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN and 
K8S

--total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S 

In addition this PR proposes to remove certain implementation details from the 
--keytab argument description, as these details vary between YARN and K8S, for 
example.

  was:
The spark-submit usage message should be brought in sync with recent changes, 
in particular regarding K8S support. These are the proposed changes to the 
usage message:

--executor-cores NUM -> can be used for Spark on YARN and K8S 

--principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN and 
K8S

--total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S 


> Update spark-submit usage message
> -
>
> Key: SPARK-26600
> URL: https://issues.apache.org/jira/browse/SPARK-26600
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Luca Canali
>Priority: Minor
> Fix For: 3.0.0
>
>
> The spark-submit usage message should be brought in sync with recent changes, 
> in particular regarding K8S support. These are the proposed changes to the 
> usage message:
> --executor-cores NUM -> can be used for Spark on YARN and K8S 
> --principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN 
> and K8S
> --total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S 
> In addition this PR proposes to remove certain implementation details from 
> the --keytab argument description, as these details vary between YARN and 
> K8S, for example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Description: 
Currently, the thread number of the broadcast-exchange thread pool is fixed, 
and keepAliveSeconds is also fixed at 60s.

{code:java}
object BroadcastExchangeExec {
  private[execution] val executionContext = 
ExecutionContext.fromExecutorService(
ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
}

 /**
   * Create a cached thread pool whose max number of threads is 
`maxThreadNumber`. Thread names
   * are formatted as prefix-ID, where ID is a unique, sequentially assigned 
integer.
   */
  def newDaemonCachedThreadPool(
  prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): 
ThreadPoolExecutor = {
val threadFactory = namedThreadFactory(prefix)
val threadPool = new ThreadPoolExecutor(
  maxThreadNumber, // corePoolSize: the max number of threads to create 
before queuing the tasks
  maxThreadNumber, // maximumPoolSize: because we use LinkedBlockingDeque, 
this one is not used
  keepAliveSeconds,
  TimeUnit.SECONDS,
  new LinkedBlockingQueue[Runnable],
  threadFactory)
threadPool.allowCoreThreadTimeOut(true)
threadPool
  }
{code}
But sometimes, if the Thread objects are not GC'd quickly, this may cause a 
server (driver) OOM. In such cases, we need to make this thread pool 
configurable.
Below is an example:
 !26601-occupy.png! 
 !26601-largeobject.png! 
 !26601-path2gcroot.png! 

  was:
Currently, the thread number of the broadcast-exchange thread pool is fixed, 
and keepAliveSeconds is also fixed at 60s.

{code:java}
object BroadcastExchangeExec {
  private[execution] val executionContext = 
ExecutionContext.fromExecutorService(
ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
}

 /**
   * Create a cached thread pool whose max number of threads is 
`maxThreadNumber`. Thread names
   * are formatted as prefix-ID, where ID is a unique, sequentially assigned 
integer.
   */
  def newDaemonCachedThreadPool(
  prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): 
ThreadPoolExecutor = {
val threadFactory = namedThreadFactory(prefix)
val threadPool = new ThreadPoolExecutor(
  maxThreadNumber, // corePoolSize: the max number of threads to create 
before queuing the tasks
  maxThreadNumber, // maximumPoolSize: because we use LinkedBlockingDeque, 
this one is not used
  keepAliveSeconds,
  TimeUnit.SECONDS,
  new LinkedBlockingQueue[Runnable],
  threadFactory)
threadPool.allowCoreThreadTimeOut(true)
threadPool
  }
{code}
But sometimes, if the Thread objects are not GC'd quickly, this may cause a 
server (driver) OOM.
Below is an example:
 !26601-occupy.png! 
 !26601-largeobject.png! 
 !26601-path2gcroot.png! 


> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, 
> and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = 
> ExecutionContext.fromExecutorService(
> ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
>  /**
>* Create a cached thread pool whose max number of threads is 
> `maxThreadNumber`. Thread names
>* are formatted as prefix-ID, where ID is a unique, sequentially assigned 
> integer.
>*/
>   def newDaemonCachedThreadPool(
>   prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): 
> ThreadPoolExecutor = {
> val threadFactory = namedThreadFactory(prefix)
> val threadPool = new ThreadPoolExecutor(
>   maxThreadNumber, // corePoolSize: the max number of threads to create 
> before queuing the tasks
>   maxThreadNumber, // maximumPoolSize: because we use 
> LinkedBlockingDeque, this one is not used
>   keepAliveSeconds,
>   TimeUnit.SECONDS,
>   new LinkedBlockingQueue[Runnable],
>   threadFactory)
> threadPool.allowCoreThreadTimeOut(true)
> threadPool
>   }
> {code}
> But sometimes, if the Thread objects are not GC'd quickly, this may cause a 
> server (driver) OOM. In such cases, we need to make this thread pool 
> configurable.
> Below is an example:
>  !26601-occupy.png! 
>  !26601-largeobject.png! 
>  !26601-path2gcroot.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Description: 
Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.

{code:java}
object BroadcastExchangeExec {
  private[execution] val executionContext = ExecutionContext.fromExecutorService(
    ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
}

/**
 * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
 * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
 */
def newDaemonCachedThreadPool(
    prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
  val threadFactory = namedThreadFactory(prefix)
  val threadPool = new ThreadPoolExecutor(
    maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
    maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
    keepAliveSeconds,
    TimeUnit.SECONDS,
    new LinkedBlockingQueue[Runnable],
    threadFactory)
  threadPool.allowCoreThreadTimeOut(true)
  threadPool
}
{code}
But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
Below is an example:
 !26601-occupy.png! 
 !26601-largeobject.png! 
 !26601-path2gcroot.png! 

  was:
Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.

{code:java}
object BroadcastExchangeExec {
  private[execution] val executionContext = ExecutionContext.fromExecutorService(
    ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
}

/**
 * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
 * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
 */
def newDaemonCachedThreadPool(
    prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
  val threadFactory = namedThreadFactory(prefix)
  val threadPool = new ThreadPoolExecutor(
    maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
    maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
    keepAliveSeconds,
    TimeUnit.SECONDS,
    new LinkedBlockingQueue[Runnable],
    threadFactory)
  threadPool.allowCoreThreadTimeOut(true)
  threadPool
}
{code}
But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
Below is an example:



> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = ExecutionContext.fromExecutorService(
>     ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
> /**
>  * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
>  * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
>  */
> def newDaemonCachedThreadPool(
>     prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
>   val threadFactory = namedThreadFactory(prefix)
>   val threadPool = new ThreadPoolExecutor(
>     maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
>     maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
>     keepAliveSeconds,
>     TimeUnit.SECONDS,
>     new LinkedBlockingQueue[Runnable],
>     threadFactory)
>   threadPool.allowCoreThreadTimeOut(true)
>   threadPool
> }
> {code}
> But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
> Below is an example:
>  !26601-occupy.png! 
>  !26601-largeobject.png! 
>  !26601-path2gcroot.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Attachment: (was: 选区_002.png)

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = ExecutionContext.fromExecutorService(
>     ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
> /**
>  * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
>  * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
>  */
> def newDaemonCachedThreadPool(
>     prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
>   val threadFactory = namedThreadFactory(prefix)
>   val threadPool = new ThreadPoolExecutor(
>     maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
>     maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
>     keepAliveSeconds,
>     TimeUnit.SECONDS,
>     new LinkedBlockingQueue[Runnable],
>     threadFactory)
>   threadPool.allowCoreThreadTimeOut(true)
>   threadPool
> }
> {code}
> But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
> Below is an example:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Attachment: 26601-path2gcroot.png
26601-occupy.png
26601-largeobject.png

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = ExecutionContext.fromExecutorService(
>     ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
> /**
>  * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
>  * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
>  */
> def newDaemonCachedThreadPool(
>     prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
>   val threadFactory = namedThreadFactory(prefix)
>   val threadPool = new ThreadPoolExecutor(
>     maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
>     maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
>     keepAliveSeconds,
>     TimeUnit.SECONDS,
>     new LinkedBlockingQueue[Runnable],
>     threadFactory)
>   threadPool.allowCoreThreadTimeOut(true)
>   threadPool
> }
> {code}
> But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
> Below is an example:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Attachment: (was: 选区_001.png)

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = ExecutionContext.fromExecutorService(
>     ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
> /**
>  * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
>  * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
>  */
> def newDaemonCachedThreadPool(
>     prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
>   val threadFactory = namedThreadFactory(prefix)
>   val threadPool = new ThreadPoolExecutor(
>     maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
>     maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
>     keepAliveSeconds,
>     TimeUnit.SECONDS,
>     new LinkedBlockingQueue[Runnable],
>     threadFactory)
>   threadPool.allowCoreThreadTimeOut(true)
>   threadPool
> }
> {code}
> But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
> Below is an example:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime and maxThreadNumber configurable

2019-01-11 Thread zhoukang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-26601:
-
Attachment: (was: 选区_002 (1).png)

> Make broadcast-exchange thread pool keepalivetime and maxThreadNumber 
> configurable
> --
>
> Key: SPARK-26601
> URL: https://issues.apache.org/jira/browse/SPARK-26601
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: zhoukang
>Priority: Major
> Attachments: 26601-largeobject.png, 26601-occupy.png, 
> 26601-path2gcroot.png
>
>
> Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.
> {code:java}
> object BroadcastExchangeExec {
>   private[execution] val executionContext = ExecutionContext.fromExecutorService(
>     ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
> }
> /**
>  * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
>  * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
>  */
> def newDaemonCachedThreadPool(
>     prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
>   val threadFactory = namedThreadFactory(prefix)
>   val threadPool = new ThreadPoolExecutor(
>     maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
>     maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
>     keepAliveSeconds,
>     TimeUnit.SECONDS,
>     new LinkedBlockingQueue[Runnable],
>     threadFactory)
>   threadPool.allowCoreThreadTimeOut(true)
>   threadPool
> }
> {code}
> But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
> Below is an example:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26600) Update spark-submit usage message

2019-01-11 Thread Luca Canali (JIRA)
Luca Canali created SPARK-26600:
---

 Summary: Update spark-submit usage message
 Key: SPARK-26600
 URL: https://issues.apache.org/jira/browse/SPARK-26600
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.4.0
Reporter: Luca Canali
 Fix For: 3.0.0


Spark-submit usage message should be put in sync with recent changes, in particular regarding K8S support. These are the proposed changes to the usage message:

--executor-cores NUM -> can be used for Spark on YARN and K8S

--principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN and K8S

--total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S
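
For illustration, a cluster-mode submission against K8S where the flags above apply; the API server address, keytab path, and example jar are made-up placeholders:

{code}
spark-submit \
  --master k8s://https://k8s-apiserver.example.com:6443 \
  --deploy-mode cluster \
  --executor-cores 2 \
  --principal user@EXAMPLE.COM \
  --keytab /path/to/user.keytab \
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples_2.12-2.4.0.jar
{code}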



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26601) Make broadcast-exchange thread pool keepalivetime configurable

2019-01-11 Thread zhoukang (JIRA)
zhoukang created SPARK-26601:


 Summary: Make broadcast-exchange thread pool keepalivetime 
configurable
 Key: SPARK-26601
 URL: https://issues.apache.org/jira/browse/SPARK-26601
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.4.0
Reporter: zhoukang


Currently, the thread number of the broadcast-exchange thread pool is fixed, and keepAliveSeconds is also fixed at 60s.

{code:java}
object BroadcastExchangeExec {
  private[execution] val executionContext = ExecutionContext.fromExecutorService(
    ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128))
}

/**
 * Create a cached thread pool whose max number of threads is `maxThreadNumber`. Thread names
 * are formatted as prefix-ID, where ID is a unique, sequentially assigned integer.
 */
def newDaemonCachedThreadPool(
    prefix: String, maxThreadNumber: Int, keepAliveSeconds: Int = 60): ThreadPoolExecutor = {
  val threadFactory = namedThreadFactory(prefix)
  val threadPool = new ThreadPoolExecutor(
    maxThreadNumber, // corePoolSize: the max number of threads to create before queuing the tasks
    maxThreadNumber, // maximumPoolSize: because we use an unbounded LinkedBlockingQueue, this one is not used
    keepAliveSeconds,
    TimeUnit.SECONDS,
    new LinkedBlockingQueue[Runnable],
    threadFactory)
  threadPool.allowCoreThreadTimeOut(true)
  threadPool
}
{code}
But sometimes, if the Thread objects are not garbage-collected quickly, this may cause a server (driver) OOM.
Below is an example:




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26600) Update spark-submit usage message

2019-01-11 Thread Luca Canali (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Canali updated SPARK-26600:

Description: 
Spark-submit usage message should be put in sync with recent changes, in particular regarding K8S support. These are the proposed changes to the usage message:

--executor-cores NUM -> can be used for Spark on YARN and K8S

--principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN and K8S

--total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S

  was:
Spark-submit usage message should be put in sync with recent changes, in particular regarding K8S support. These are the proposed changes to the usage message:

--executor-cores NUM -> can be used for Spark on YARN and K8S

--principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN and K8S

--total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S


> Update spark-submit usage message
> -
>
> Key: SPARK-26600
> URL: https://issues.apache.org/jira/browse/SPARK-26600
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Luca Canali
>Priority: Minor
> Fix For: 3.0.0
>
>
> Spark-submit usage message should be put in sync with recent changes, in particular regarding K8S support. These are the proposed changes to the usage message:
> --executor-cores NUM -> can be used for Spark on YARN and K8S
> --principal PRINCIPAL and --keytab KEYTAB -> can be used for Spark on YARN and K8S
> --total-executor-cores NUM -> can be used for Spark standalone, YARN and K8S



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org