[jira] [Created] (SPARK-36042) [Dynamic allocation] Executor grace period (ExecutorIdleTimeout) ignored due to null startTime for pods in pending state

2021-07-08 Thread Alexandre CLEMENT (Jira)
Alexandre CLEMENT created SPARK-36042:
-

 Summary: [Dynamic allocation] Executor grace period 
(ExecutorIdleTimeout) ignored due to null startTime for pods in pending state
 Key: SPARK-36042
 URL: https://issues.apache.org/jira/browse/SPARK-36042
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 3.1.1
 Environment: AWS EKS with dynamic allocation 
Reporter: Alexandre CLEMENT


Pending executors are always timed out because startTime is null and the
function returns true when parsing startTime throws an exception.

{code:scala}
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  try {
    val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
    currentTime - startTime > executorIdleTimeout
  } catch {
    case _: Exception =>
      logDebug(s"Cannot get startTime of pod ${state.pod}")
      true
  }
}
{code}
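
For illustration, here is a minimal null-safe variant (a sketch only, not the committed fix; it assumes the surrounding allocator class provides executorIdleTimeout and logDebug):

{code:scala}
import java.time.Instant

// Hypothetical sketch: guard against a null startTime so pods that are still
// Pending are not treated as idle-timed-out before they ever start.
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  val rawStartTime = state.pod.getStatus.getStartTime
  if (rawStartTime == null) {
    false // the pod has not started yet, so no idle timeout applies here
  } else {
    try {
      currentTime - Instant.parse(rawStartTime).toEpochMilli() > executorIdleTimeout
    } catch {
      case _: Exception =>
        logDebug(s"Cannot get startTime of pod ${state.pod}")
        true
    }
  }
}
{code}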






[jira] [Updated] (SPARK-36042) [Dynamic allocation] Executor grace period (ExecutorIdleTimeout) ignored due to null startTime for pods in pending state

2021-07-08 Thread Alexandre CLEMENT (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre CLEMENT updated SPARK-36042:
--
Description: 
Pending executors are always timed out because startTime is null and the
function returns true when parsing startTime throws an exception.

 

In class ExecutorPodsAllocator:

{code:scala}
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  try {
    val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
    currentTime - startTime > executorIdleTimeout
  } catch {
    case _: Exception =>
      logDebug(s"Cannot get startTime of pod ${state.pod}")
      true
  }
}
{code}

  was:
Pending executors are always timed out because startTime is null and the
function returns true when parsing startTime throws an exception.

 

In class ExecutorPodsAllocator:

{code:scala}
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  try {
    val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
    currentTime - startTime > executorIdleTimeout
  } catch {
    case _: Exception =>
      logDebug(s"Cannot get startTime of pod ${state.pod}")
      true
  }
}
{code}


> [Dynamic allocation] Executor grace period (ExecutorIdleTimeout) ignored due 
> to null startTime for pods in pending state
> -
>
> Key: SPARK-36042
> URL: https://issues.apache.org/jira/browse/SPARK-36042
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.1.1
> Environment: AWS EKS with dynamic allocation 
>Reporter: Alexandre CLEMENT
>Priority: Major
>
> Pending executors are always timed out because startTime is null and the
> function returns true when parsing startTime throws an exception.
>  
> In class ExecutorPodsAllocator:
> {code:scala}
> private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
>   try {
>     val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
>     currentTime - startTime > executorIdleTimeout
>   } catch {
>     case _: Exception =>
>       logDebug(s"Cannot get startTime of pod ${state.pod}")
>       true
>   }
> }
> {code}






[jira] [Updated] (SPARK-36042) [Dynamic allocation] Executor grace period (ExecutorIdleTimeout) ignored due to null startTime for pods in pending state

2021-07-08 Thread Alexandre CLEMENT (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre CLEMENT updated SPARK-36042:
--
Description: 
Pending executors are always timed out because startTime is null and the
function returns true when parsing startTime throws an exception.

 

In class ExecutorPodsAllocator:

{code:scala}
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  try {
    val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
    currentTime - startTime > executorIdleTimeout
  } catch {
    case _: Exception =>
      logDebug(s"Cannot get startTime of pod ${state.pod}")
      true
  }
}
{code}

  was:
Pending executors are always timed out because startTime is null and the
function returns true when parsing startTime throws an exception.

{code:scala}
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  try {
    val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
    currentTime - startTime > executorIdleTimeout
  } catch {
    case _: Exception =>
      logDebug(s"Cannot get startTime of pod ${state.pod}")
      true
  }
}
{code}


> [Dynamic allocation] Executor grace period (ExecutorIdleTimeout) ignored due 
> to null startTime for pods in pending state
> -
>
> Key: SPARK-36042
> URL: https://issues.apache.org/jira/browse/SPARK-36042
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.1.1
> Environment: AWS EKS with dynamic allocation 
>Reporter: Alexandre CLEMENT
>Priority: Major
>
> Pending executors are always timed out because startTime is null and the
> function returns true when parsing startTime throws an exception.
>  
> In class ExecutorPodsAllocator:
> {code:scala}
> private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
>   try {
>     val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
>     currentTime - startTime > executorIdleTimeout
>   } catch {
>     case _: Exception =>
>       logDebug(s"Cannot get startTime of pod ${state.pod}")
>       true
>   }
> }
> {code}






[jira] [Updated] (SPARK-36042) [Dynamic allocation] Executor grace period (ExecutorIdleTimeout) ignored due to null startTime for pods in pending state

2021-07-08 Thread Alexandre CLEMENT (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre CLEMENT updated SPARK-36042:
--
Description: 
Pending executors are always timed out because startTime is null and the
function returns true when parsing startTime throws an exception.

 

In class ExecutorPodsAllocator:

{code:scala}
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  try {
    val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
    currentTime - startTime > executorIdleTimeout
  } catch {
    case _: Exception =>
      logDebug(s"Cannot get startTime of pod ${state.pod}")
      true
  }
}
{code}

  was:
Pending executors are always timed out because startTime is null and the
function returns true when parsing startTime throws an exception.

 

In class ExecutorPodsAllocator:

{code:scala}
private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
  try {
    val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
    currentTime - startTime > executorIdleTimeout
  } catch {
    case _: Exception =>
      logDebug(s"Cannot get startTime of pod ${state.pod}")
      true
  }
}
{code}


> [Dynamic allocation] Executor grace period (ExecutorIdleTimeout) ignored due 
> to null startTime for pods in pending state
> -
>
> Key: SPARK-36042
> URL: https://issues.apache.org/jira/browse/SPARK-36042
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.1.1
> Environment: AWS EKS with dynamic allocation 
>Reporter: Alexandre CLEMENT
>Priority: Major
>
> Pending executors are always timed out because startTime is null and the
> function returns true when parsing startTime throws an exception.
>  
> In class ExecutorPodsAllocator:
> {code:scala}
> private def isExecutorIdleTimedOut(state: ExecutorPodState, currentTime: Long): Boolean = {
>   try {
>     val startTime = Instant.parse(state.pod.getStatus.getStartTime).toEpochMilli()
>     currentTime - startTime > executorIdleTimeout
>   } catch {
>     case _: Exception =>
>       logDebug(s"Cannot get startTime of pod ${state.pod}")
>       true
>   }
> }
> {code}






[jira] [Created] (SPARK-36043) Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ

2021-07-08 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-36043:
--

 Summary: Add end-to-end tests with default timestamp type as 
TIMESTAMP_NTZ
 Key: SPARK-36043
 URL: https://issues.apache.org/jira/browse/SPARK-36043
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 3.2.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang


Run end-to-end tests with default timestamp type as TIMESTAMP_NTZ to increase 
test coverage. 
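
For context, a sketch of how such a test might flip the session default (it assumes the spark.sql.timestampType conf introduced by the TIMESTAMP_NTZ feature work; the conf name is not taken from this ticket):

{code:scala}
// Sketch: switch the default timestamp type and check that a plain timestamp
// literal now resolves to TIMESTAMP_NTZ.
spark.conf.set("spark.sql.timestampType", "TIMESTAMP_NTZ")
spark.sql("SELECT typeof(TIMESTAMP'2021-07-08 00:00:00')").show()
{code}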






[jira] [Commented] (SPARK-36037) Support new function localtimestamp()

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377182#comment-17377182
 ] 

Apache Spark commented on SPARK-36037:
--

User 'beliefer' has created a pull request for this issue:
https://github.com/apache/spark/pull/33258

> Support new function localtimestamp()
> -
>
> Key: SPARK-36037
> URL: https://issues.apache.org/jira/browse/SPARK-36037
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Priority: Major
>
> Returns the current timestamp at the start of query evaluation as TIMESTAMP
> WITHOUT TIME ZONE. This is similar to "current_timestamp()".
> Note: we need to update the optimization rule ComputeCurrentTime so that
> Spark returns the same result within a single query if the function is called
> multiple times. (See the sketch below.)
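
As an illustration of the intended semantics, a hedged sketch (not the actual test):

{code:scala}
// Both columns should carry the identical value within one query, because
// ComputeCurrentTime is expected to replace every localtimestamp() call with
// the same literal during optimization.
spark.sql("SELECT localtimestamp() AS t1, localtimestamp() AS t2").show(truncate = false)
{code}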






[jira] [Commented] (SPARK-36037) Support new function localtimestamp()

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377183#comment-17377183
 ] 

Apache Spark commented on SPARK-36037:
--

User 'beliefer' has created a pull request for this issue:
https://github.com/apache/spark/pull/33258

> Support new function localtimestamp()
> -
>
> Key: SPARK-36037
> URL: https://issues.apache.org/jira/browse/SPARK-36037
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Priority: Major
>
> Returns the current timestamp at the start of query evaluation as TIMESTAMP
> WITHOUT TIME ZONE. This is similar to "current_timestamp()".
> Note: we need to update the optimization rule ComputeCurrentTime so that
> Spark returns the same result within a single query if the function is called
> multiple times.






[jira] [Assigned] (SPARK-36037) Support new function localtimestamp()

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36037:


Assignee: Apache Spark

> Support new function localtimestamp()
> -
>
> Key: SPARK-36037
> URL: https://issues.apache.org/jira/browse/SPARK-36037
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>
> Returns the current timestamp at the start of query evaluation as TIMESTAMP
> WITHOUT TIME ZONE. This is similar to "current_timestamp()".
> Note: we need to update the optimization rule ComputeCurrentTime so that
> Spark returns the same result within a single query if the function is called
> multiple times.






[jira] [Assigned] (SPARK-36037) Support new function localtimestamp()

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36037:


Assignee: (was: Apache Spark)

> Support new function localtimestamp()
> -
>
> Key: SPARK-36037
> URL: https://issues.apache.org/jira/browse/SPARK-36037
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Priority: Major
>
> Returns the current timestamp at the start of query evaluation as TIMESTAMP
> WITHOUT TIME ZONE. This is similar to "current_timestamp()".
> Note: we need to update the optimization rule ComputeCurrentTime so that
> Spark returns the same result within a single query if the function is called
> multiple times.






[jira] [Assigned] (SPARK-36043) Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36043:


Assignee: Gengliang Wang  (was: Apache Spark)

> Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ
> -
>
> Key: SPARK-36043
> URL: https://issues.apache.org/jira/browse/SPARK-36043
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Run end-to-end tests with default timestamp type as TIMESTAMP_NTZ to increase 
> test coverage. 






[jira] [Assigned] (SPARK-36043) Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36043:


Assignee: Apache Spark  (was: Gengliang Wang)

> Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ
> -
>
> Key: SPARK-36043
> URL: https://issues.apache.org/jira/browse/SPARK-36043
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>
> Run end-to-end tests with default timestamp type as TIMESTAMP_NTZ to increase 
> test coverage. 






[jira] [Commented] (SPARK-36043) Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377189#comment-17377189
 ] 

Apache Spark commented on SPARK-36043:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33259

> Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ
> -
>
> Key: SPARK-36043
> URL: https://issues.apache.org/jira/browse/SPARK-36043
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Run end-to-end tests with default timestamp type as TIMESTAMP_NTZ to increase 
> test coverage. 






[jira] [Created] (SPARK-36044) Support TimestampNTZ in functions unix_timestamp/to_unix_timestamp

2021-07-08 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-36044:
--

 Summary: Support TimestampNTZ in functions 
unix_timestamp/to_unix_timestamp
 Key: SPARK-36044
 URL: https://issues.apache.org/jira/browse/SPARK-36044
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Gengliang Wang


The functions unix_timestamp/to_unix_timestamp should be able to accept input 
of TimestampNTZ type.
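
A sketch of the expected usage once this is supported (the TIMESTAMP_NTZ literal syntax is assumed from the related TIMESTAMP_NTZ tickets):

{code:scala}
// Sketch: unix_timestamp should accept a TIMESTAMP_NTZ input and return the
// corresponding number of seconds since the epoch.
spark.sql("SELECT unix_timestamp(TIMESTAMP_NTZ'2021-07-08 12:00:00')").show()
{code}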






[jira] [Created] (SPARK-36045) TO_UTC_TIMESTAMP: return different result based on the default timestamp type

2021-07-08 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-36045:
--

 Summary: TO_UTC_TIMESTAMP: return different result based on the 
default timestamp type
 Key: SPARK-36045
 URL: https://issues.apache.org/jira/browse/SPARK-36045
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Gengliang Wang


The function `TO_UTC_TIMESTAMP` should return a different timestamp type based
on the default timestamp type.
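
A hedged sketch of the intended behavior (it assumes the spark.sql.timestampType conf from the TIMESTAMP_NTZ feature work):

{code:scala}
// Sketch: the result type of to_utc_timestamp should follow the session's
// default timestamp type instead of always being the local-zone timestamp.
spark.conf.set("spark.sql.timestampType", "TIMESTAMP_NTZ")
spark.sql("SELECT typeof(to_utc_timestamp('2021-07-08 12:00:00', 'UTC'))").show()
{code}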






[jira] [Created] (SPARK-36046) Support new function make_timestamp_ntz

2021-07-08 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-36046:
--

 Summary: Support new function make_timestamp_ntz
 Key: SPARK-36046
 URL: https://issues.apache.org/jira/browse/SPARK-36046
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Gengliang Wang


Syntax:
make_timestamp_ntz(year, month, day, hour, min, sec): creates a local timestamp
from the year, month, day, hour, min, and sec fields (unlike make_timestamp,
there is no time zone field).
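
A sketch of the expected usage for the proposed function (illustrative values only):

{code:scala}
// Sketch: build a local (time-zone-less) timestamp from its fields; note the
// absence of a time zone argument, unlike make_timestamp.
spark.sql("SELECT make_timestamp_ntz(2021, 7, 8, 14, 30, 45.887)").show()
{code}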






[jira] [Created] (SPARK-36047) Replace the handwritten compare methods with static compare methods in Java code

2021-07-08 Thread Yang Jie (Jira)
Yang Jie created SPARK-36047:


 Summary: Replace the handwritten compare methods with static 
compare methods in Java code
 Key: SPARK-36047
 URL: https://issues.apache.org/jira/browse/SPARK-36047
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.3.0
Reporter: Yang Jie


There are some handwritten compare methods, like
`ShuffleInMemorySorter.SortComparator`:
{code:java}
private static final class SortComparator implements Comparator<PackedRecordPointer> {
  @Override
  public int compare(PackedRecordPointer left, PackedRecordPointer right) {
int leftId = left.getPartitionId();
int rightId = right.getPartitionId();
return Integer.compare(leftId, rightId);
  }
}
{code}
These handwritten compare methods can be replaced with `Integer.compare()` and
similar static methods available since Java 1.7.






[jira] [Created] (SPARK-36048) Fix HealthTrackerSuite.allExecutorAndHostIds

2021-07-08 Thread wuyi (Jira)
wuyi created SPARK-36048:


 Summary: Fix HealthTrackerSuite.allExecutorAndHostIds
 Key: SPARK-36048
 URL: https://issues.apache.org/jira/browse/SPARK-36048
 Project: Spark
  Issue Type: Test
  Components: Spark Core
Affects Versions: 3.1.2, 3.0.3, 3.2.0
Reporter: wuyi


`HealthTrackerSuite.allExecutorAndHostIds` is mistakenly declared, so the
executor exclusion isn't correctly tested.






[jira] [Updated] (SPARK-36048) Wrong HealthTrackerSuite.allExecutorAndHostIds

2021-07-08 Thread wuyi (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuyi updated SPARK-36048:
-
Summary: Wrong HealthTrackerSuite.allExecutorAndHostIds  (was: Fix 
HealthTrackerSuite.allExecutorAndHostIds)

> Wrong HealthTrackerSuite.allExecutorAndHostIds
> --
>
> Key: SPARK-36048
> URL: https://issues.apache.org/jira/browse/SPARK-36048
> Project: Spark
>  Issue Type: Test
>  Components: Spark Core
>Affects Versions: 3.0.3, 3.1.2, 3.2.0
>Reporter: wuyi
>Priority: Major
>
> `HealthTrackerSuite.allExecutorAndHostIds` is mistakenly declared, so the
> executor exclusion isn't correctly tested.






[jira] [Assigned] (SPARK-36047) Replace the handwritten compare methods with static compare methods in Java code

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36047:


Assignee: (was: Apache Spark)

> Replace the handwritten compare methods with static compare methods in Java
> code
> 
>
> Key: SPARK-36047
> URL: https://issues.apache.org/jira/browse/SPARK-36047
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Trivial
>
> There are some handwritten compare methods, like
> `ShuffleInMemorySorter.SortComparator`:
> {code:java}
> private static final class SortComparator implements Comparator<PackedRecordPointer> {
>   @Override
>   public int compare(PackedRecordPointer left, PackedRecordPointer right) {
> int leftId = left.getPartitionId();
> int rightId = right.getPartitionId();
> return Integer.compare(leftId, rightId);
>   }
> }
> {code}
> These handwritten compare methods can be replaced with `Integer.compare()`
> and similar static methods available since Java 1.7.






[jira] [Commented] (SPARK-36047) Replace the handwritten compare methods with static compare methods in Java code

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377241#comment-17377241
 ] 

Apache Spark commented on SPARK-36047:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33260

> Replace the handwritten compare methods with static compare methods in Java
> code
> 
>
> Key: SPARK-36047
> URL: https://issues.apache.org/jira/browse/SPARK-36047
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Trivial
>
> There are some handwritten compare methods, like
> `ShuffleInMemorySorter.SortComparator`:
> {code:java}
> private static final class SortComparator implements Comparator<PackedRecordPointer> {
>   @Override
>   public int compare(PackedRecordPointer left, PackedRecordPointer right) {
> int leftId = left.getPartitionId();
> int rightId = right.getPartitionId();
> return Integer.compare(leftId, rightId);
>   }
> }
> {code}
> These handwritten compare methods can be replaced with `Integer.compare()`
> and similar static methods available since Java 1.7.






[jira] [Assigned] (SPARK-36047) Replace the handwritten compare methods with static compare methods in Java code

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36047:


Assignee: Apache Spark

> Replace the handwritten compare methods with static compare methods in Java
> code
> 
>
> Key: SPARK-36047
> URL: https://issues.apache.org/jira/browse/SPARK-36047
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Trivial
>
> There are some handwritten compare methods, like
> `ShuffleInMemorySorter.SortComparator`:
> {code:java}
> private static final class SortComparator implements Comparator<PackedRecordPointer> {
>   @Override
>   public int compare(PackedRecordPointer left, PackedRecordPointer right) {
> int leftId = left.getPartitionId();
> int rightId = right.getPartitionId();
> return Integer.compare(leftId, rightId);
>   }
> }
> {code}
> These handwritten compare methods can be replaced with `Integer.compare()`
> and similar static methods available since Java 1.7.






[jira] [Commented] (SPARK-35334) Spark should be more resilient to intermittent K8s flakiness

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377243#comment-17377243
 ] 

Apache Spark commented on SPARK-35334:
--

User 'attilapiros' has created a pull request for this issue:
https://github.com/apache/spark/pull/33261

> Spark should be more resilient to intermittent K8s flakiness
> 
>
> Key: SPARK-35334
> URL: https://issues.apache.org/jira/browse/SPARK-35334
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Attila Zsolt Piros
>Assignee: Attila Zsolt Piros
>Priority: Major
>
> Internal K8s errors, such as an etcdserver leader election, are propagated to
> the API client and can cause serious issues in Spark, for example:
> {noformat}
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: GET at:
> https://kubernetes.default.svc/api/v1/namespaces/dex-app-bl24w4z9/pods/sparkpi-10-fcd3f6781a874212-driver.
>  Message: etcdserver: 
> leader changed. Received status: Status(apiVersion=v1, code=500, 
> details=null, kind=Status, message=etcdserver: leader changed, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), reason=null, 
> status=Failure, additionalProperties={}).
> {noformat}
> First I will try to fix this in kubernetes-client by adding retries with
> exponential backoff:
> https://github.com/fabric8io/kubernetes-client/issues/3087
> If that works out, the Spark side could be just a version update plus some new
> configs. (See the sketch below.)
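
As an illustration of the proposed approach, a generic retry-with-exponential-backoff wrapper (a sketch with hypothetical names; not the kubernetes-client or Spark implementation):

{code:scala}
// Sketch: retry an action on transient failures, doubling the wait each time.
// maxAttempts and initialBackoffMs are illustrative names, not real configs.
def withRetries[T](maxAttempts: Int, initialBackoffMs: Long)(action: => T): T = {
  var attempt = 1
  var backoffMs = initialBackoffMs
  while (true) {
    try {
      return action
    } catch {
      case _: Exception if attempt < maxAttempts =>
        // e.g. a transient "etcdserver: leader changed" failure: wait, retry.
        Thread.sleep(backoffMs)
        attempt += 1
        backoffMs *= 2
    }
  }
  throw new IllegalStateException("unreachable")
}

// Hypothetical usage around a flaky API call:
// val pod = withRetries(maxAttempts = 3, initialBackoffMs = 200) { getDriverPod() }
{code}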






[jira] [Assigned] (SPARK-35334) Spark should be more resilient to intermittent K8s flakiness

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-35334:


Assignee: Apache Spark  (was: Attila Zsolt Piros)

> Spark should be more resilient to intermittent K8s flakiness
> 
>
> Key: SPARK-35334
> URL: https://issues.apache.org/jira/browse/SPARK-35334
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Attila Zsolt Piros
>Assignee: Apache Spark
>Priority: Major
>
> Internal K8s errors, such as an etcdserver leader election, are propagated to
> the API client and can cause serious issues in Spark, for example:
> {noformat}
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: GET at:
> https://kubernetes.default.svc/api/v1/namespaces/dex-app-bl24w4z9/pods/sparkpi-10-fcd3f6781a874212-driver.
>  Message: etcdserver: 
> leader changed. Received status: Status(apiVersion=v1, code=500, 
> details=null, kind=Status, message=etcdserver: leader changed, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), reason=null, 
> status=Failure, additionalProperties={}).
> {noformat}
> First I will try to fix this in kubernetes-client by adding retries with
> exponential backoff:
> https://github.com/fabric8io/kubernetes-client/issues/3087
> If that works out, the Spark side could be just a version update plus some new
> configs.






[jira] [Assigned] (SPARK-35334) Spark should be more resilient to intermittent K8s flakiness

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-35334:


Assignee: Attila Zsolt Piros  (was: Apache Spark)

> Spark should be more resilient to intermittent K8s flakiness
> 
>
> Key: SPARK-35334
> URL: https://issues.apache.org/jira/browse/SPARK-35334
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Attila Zsolt Piros
>Assignee: Attila Zsolt Piros
>Priority: Major
>
> Internal K8s errors, such as an etcdserver leader election, are propagated to
> the API client and can cause serious issues in Spark, for example:
> {noformat}
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: GET at:
> https://kubernetes.default.svc/api/v1/namespaces/dex-app-bl24w4z9/pods/sparkpi-10-fcd3f6781a874212-driver.
>  Message: etcdserver: 
> leader changed. Received status: Status(apiVersion=v1, code=500, 
> details=null, kind=Status, message=etcdserver: leader changed, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), reason=null, 
> status=Failure, additionalProperties={}).
> {noformat}
> First I will try to fix this in kubernetes-client by adding retries with
> exponential backoff:
> https://github.com/fabric8io/kubernetes-client/issues/3087
> If that works out, the Spark side could be just a version update plus some new
> configs.






[jira] [Assigned] (SPARK-36048) Wrong HealthTrackerSuite.allExecutorAndHostIds

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36048:


Assignee: Apache Spark

> Wrong HealthTrackerSuite.allExecutorAndHostIds
> --
>
> Key: SPARK-36048
> URL: https://issues.apache.org/jira/browse/SPARK-36048
> Project: Spark
>  Issue Type: Test
>  Components: Spark Core
>Affects Versions: 3.0.3, 3.1.2, 3.2.0
>Reporter: wuyi
>Assignee: Apache Spark
>Priority: Major
>
> `HealthTrackerSuite.allExecutorAndHostIds` is mistakenly declared, so the
> executor exclusion isn't correctly tested.






[jira] [Commented] (SPARK-35334) Spark should be more resilient to intermittent K8s flakiness

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377244#comment-17377244
 ] 

Apache Spark commented on SPARK-35334:
--

User 'attilapiros' has created a pull request for this issue:
https://github.com/apache/spark/pull/33261

> Spark should be more resilient to intermittent K8s flakiness
> 
>
> Key: SPARK-35334
> URL: https://issues.apache.org/jira/browse/SPARK-35334
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Attila Zsolt Piros
>Assignee: Attila Zsolt Piros
>Priority: Major
>
> Internal K8s errors, such as an etcdserver leader election, are propagated to
> the API client and can cause serious issues in Spark, for example:
> {noformat}
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: GET at:
> https://kubernetes.default.svc/api/v1/namespaces/dex-app-bl24w4z9/pods/sparkpi-10-fcd3f6781a874212-driver.
>  Message: etcdserver: 
> leader changed. Received status: Status(apiVersion=v1, code=500, 
> details=null, kind=Status, message=etcdserver: leader changed, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), reason=null, 
> status=Failure, additionalProperties={}).
> {noformat}
> First I will try to fix this in kubernetes-client by adding retries with
> exponential backoff:
> https://github.com/fabric8io/kubernetes-client/issues/3087
> If that works out, the Spark side could be just a version update plus some new
> configs.






[jira] [Commented] (SPARK-36048) Wrong HealthTrackerSuite.allExecutorAndHostIds

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377246#comment-17377246
 ] 

Apache Spark commented on SPARK-36048:
--

User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/33262

> Wrong HealthTrackerSuite.allExecutorAndHostIds
> --
>
> Key: SPARK-36048
> URL: https://issues.apache.org/jira/browse/SPARK-36048
> Project: Spark
>  Issue Type: Test
>  Components: Spark Core
>Affects Versions: 3.0.3, 3.1.2, 3.2.0
>Reporter: wuyi
>Priority: Major
>
> `HealthTrackerSuite.allExecutorAndHostIds` is mistakenly declared, so the
> executor exclusion isn't correctly tested.






[jira] [Assigned] (SPARK-36048) Wrong HealthTrackerSuite.allExecutorAndHostIds

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36048:


Assignee: (was: Apache Spark)

> Wrong HealthTrackerSuite.allExecutorAndHostIds
> --
>
> Key: SPARK-36048
> URL: https://issues.apache.org/jira/browse/SPARK-36048
> Project: Spark
>  Issue Type: Test
>  Components: Spark Core
>Affects Versions: 3.0.3, 3.1.2, 3.2.0
>Reporter: wuyi
>Priority: Major
>
> `HealthTrackerSuite.allExecutorAndHostIds` is mistakenly declared, so the
> executor exclusion isn't correctly tested.






[jira] [Commented] (SPARK-35027) Close the inputStream in FileAppender when writing the logs failure

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377257#comment-17377257
 ] 

Apache Spark commented on SPARK-35027:
--

User 'jhu-chang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33263

> Close the inputStream in FileAppender when writing the logs failure
> ---
>
> Key: SPARK-35027
> URL: https://issues.apache.org/jira/browse/SPARK-35027
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.1
>Reporter: Jack Hu
>Priority: Major
>
> In a Spark cluster, the ExecutorRunner uses FileAppender to redirect the
> stdout/stderr of executors to files. When writing fails for some reason (e.g.
> the disk is full), the FileAppender only closes the stream to the file but
> leaves the pipe's stdout/stderr open, so subsequent write operations on the
> executor side may hang.
> Do we need to close the inputStream in FileAppender? (See the sketch below.)
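
A minimal sketch of the suggested defensive cleanup (hypothetical shape; the real FileAppender internals differ):

{code:scala}
import java.io.{InputStream, OutputStream}

// Sketch: close both ends of the copy loop even when writing fails (e.g. on a
// full disk), so the process writing into the pipe does not block forever.
def appendStream(in: InputStream, out: OutputStream): Unit = {
  try {
    val buf = new Array[Byte](8192)
    var n = in.read(buf)
    while (n != -1) {
      out.write(buf, 0, n) // may throw, e.g. when the disk is full
      n = in.read(buf)
    }
  } finally {
    // Swallow close errors so one failing close does not skip the other.
    try in.close() catch { case _: Exception => }
    try out.close() catch { case _: Exception => }
  }
}
{code}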






[jira] [Commented] (SPARK-35027) Close the inputStream in FileAppender when writing the logs failure

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377261#comment-17377261
 ] 

Apache Spark commented on SPARK-35027:
--

User 'jhu-chang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33263

> Close the inputStream in FileAppender when writing the logs failure
> ---
>
> Key: SPARK-35027
> URL: https://issues.apache.org/jira/browse/SPARK-35027
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.1
>Reporter: Jack Hu
>Priority: Major
>
> In a Spark cluster, the ExecutorRunner uses FileAppender to redirect the
> stdout/stderr of executors to files. When writing fails for some reason (e.g.
> the disk is full), the FileAppender only closes the stream to the file but
> leaves the pipe's stdout/stderr open, so subsequent write operations on the
> executor side may hang.
> Do we need to close the inputStream in FileAppender?






[jira] [Created] (SPARK-36049) Remove IntervalUnit in code

2021-07-08 Thread angerszhu (Jira)
angerszhu created SPARK-36049:
-

 Summary: Remove IntervalUnit in code
 Key: SPARK-36049
 URL: https://issues.apache.org/jira/browse/SPARK-36049
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: angerszhu


According to https://github.com/apache/spark/pull/33252#issuecomment-876280183






[jira] [Created] (SPARK-36050) Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC

2021-07-08 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-36050:
--

 Summary: Spark doesn’t support reading/writing TIMESTAMP_NTZ with 
ORC
 Key: SPARK-36050
 URL: https://issues.apache.org/jira/browse/SPARK-36050
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang


The OrcTimestamp of the MapReduce library is an instance of java.sql.Timestamp.
As it is the only timestamp type that ORC supports, Spark doesn’t support
reading/writing TIMESTAMP_NTZ with ORC.






[jira] [Commented] (SPARK-36050) Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377269#comment-17377269
 ] 

Apache Spark commented on SPARK-36050:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33264

> Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC
> 
>
> Key: SPARK-36050
> URL: https://issues.apache.org/jira/browse/SPARK-36050
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> The OrcTimestamp of the MapReduce library is an instance of
> java.sql.Timestamp. As it is the only timestamp type that ORC supports, Spark
> doesn’t support reading/writing TIMESTAMP_NTZ with ORC.






[jira] [Commented] (SPARK-36050) Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377270#comment-17377270
 ] 

Apache Spark commented on SPARK-36050:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33264

> Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC
> 
>
> Key: SPARK-36050
> URL: https://issues.apache.org/jira/browse/SPARK-36050
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> The OrcTimestamp of the MapReduce library is an instance of
> java.sql.Timestamp. As it is the only timestamp type that ORC supports, Spark
> doesn’t support reading/writing TIMESTAMP_NTZ with ORC.






[jira] [Assigned] (SPARK-36050) Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36050:


Assignee: Apache Spark  (was: Gengliang Wang)

> Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC
> 
>
> Key: SPARK-36050
> URL: https://issues.apache.org/jira/browse/SPARK-36050
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>
> The OrcTimestamp of the MapReduce library is an instance of
> java.sql.Timestamp. As it is the only timestamp type that ORC supports, Spark
> doesn’t support reading/writing TIMESTAMP_NTZ with ORC.






[jira] [Assigned] (SPARK-36050) Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36050:


Assignee: Gengliang Wang  (was: Apache Spark)

> Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC
> 
>
> Key: SPARK-36050
> URL: https://issues.apache.org/jira/browse/SPARK-36050
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> The OrcTimestamp of the MapReduce library is an instance of
> java.sql.Timestamp. As it is the only timestamp type that ORC supports, Spark
> doesn’t support reading/writing TIMESTAMP_NTZ with ORC.






[jira] [Updated] (SPARK-36051) Add the detection of .rst files into documentation build guides

2021-07-08 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-36051:
-
Priority: Trivial  (was: Major)

> Add the detection of .rst files into documentation build guides
> ---
>
> Key: SPARK-36051
> URL: https://issues.apache.org/jira/browse/SPARK-36051
> Project: Spark
>  Issue Type: Improvement
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Hyukjin Kwon
>Priority: Trivial
>
> At 
> https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
>  
> {code}
> cd "$SPARK_HOME/python/docs"
> find .. -type f -name '*.py' \
> | entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
> {code}
> we should also check .rst files






[jira] [Created] (SPARK-36051) Add the detection of .rst files into documentation build guides

2021-07-08 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-36051:


 Summary: Add the detection of .rst files into documentation build 
guides
 Key: SPARK-36051
 URL: https://issues.apache.org/jira/browse/SPARK-36051
 Project: Spark
  Issue Type: Improvement
  Components: docs, PySpark
Affects Versions: 3.2.0
Reporter: Hyukjin Kwon


At 
https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
 

{code}
cd "$SPARK_HOME/python/docs"
find .. -type f -name '*.py' \
| entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
{code}

we should also check .rst files







[jira] [Assigned] (SPARK-36049) Remove IntervalUnit in code

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36049:


Assignee: Apache Spark

> Remove IntervalUnit in code
> ---
>
> Key: SPARK-36049
> URL: https://issues.apache.org/jira/browse/SPARK-36049
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: Apache Spark
>Priority: Major
>
> According to https://github.com/apache/spark/pull/33252#issuecomment-876280183






[jira] [Commented] (SPARK-36049) Remove IntervalUnit in code

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377289#comment-17377289
 ] 

Apache Spark commented on SPARK-36049:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33265

> Remove IntervalUnit in code
> ---
>
> Key: SPARK-36049
> URL: https://issues.apache.org/jira/browse/SPARK-36049
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Priority: Major
>
> According to https://github.com/apache/spark/pull/33252#issuecomment-876280183






[jira] [Assigned] (SPARK-36049) Remove IntervalUnit in code

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36049:


Assignee: (was: Apache Spark)

> Remove IntervalUnit in code
> ---
>
> Key: SPARK-36049
> URL: https://issues.apache.org/jira/browse/SPARK-36049
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Priority: Major
>
> According to https://github.com/apache/spark/pull/33252#issuecomment-876280183






[jira] [Commented] (SPARK-36049) Remove IntervalUnit in code

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377287#comment-17377287
 ] 

Apache Spark commented on SPARK-36049:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33265

> Remove IntervalUnit in code
> ---
>
> Key: SPARK-36049
> URL: https://issues.apache.org/jira/browse/SPARK-36049
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Priority: Major
>
> According to https://github.com/apache/spark/pull/33252#issuecomment-876280183






[jira] [Updated] (SPARK-36051) Remove automatic update of documentation build in the guides

2021-07-08 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-36051:
-
Summary: Remove automatic update of documentation build in the guides  
(was: Add the detection of .rst files into documentation build guides)

> Remove automatic update of documentation build in the guides
> 
>
> Key: SPARK-36051
> URL: https://issues.apache.org/jira/browse/SPARK-36051
> Project: Spark
>  Issue Type: Improvement
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Hyukjin Kwon
>Priority: Trivial
>
> At 
> https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
>  
> {code}
> cd "$SPARK_HOME/python/docs"
> find .. -type f -name '*.py' \
> | entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
> {code}
> we should also check .rst files






[jira] [Updated] (SPARK-36051) Remove automatic update of documentation build in the guides

2021-07-08 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-36051:
-
Description: 
At 
https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
 

{code}
cd "$SPARK_HOME/python/docs"
find .. -type f -name '*.py' \
| entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
{code}

is documented. However, this doesn't work very well:

1. It doesn't detect changes in .rst files. But PySpark internally generates
.rst files, so we can't simply include them in the detection; otherwise it goes
into an infinite loop.
2. During PySpark documentation generation, it now launches some jobs to
generate plot images. This is broken with the {{entr}} command, and the job
fails; it seems related to how {{entr}} creates the process internally.

  was:
At 
https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
 

{code}
cd "$SPARK_HOME/python/docs"
find .. -type f -name '*.py' \
| entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
{code}

we should also check .rst files



> Remove automatic update of documentation build in the guides
> 
>
> Key: SPARK-36051
> URL: https://issues.apache.org/jira/browse/SPARK-36051
> Project: Spark
>  Issue Type: Improvement
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Hyukjin Kwon
>Priority: Trivial
>
> At 
> https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
>  
> {code}
> cd "$SPARK_HOME/python/docs"
> find .. -type f -name '*.py' \
> | entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
> {code}
> is documented. However, this doesn't work very well:
> 1. It doesn't detect changes in .rst files. But PySpark internally generates
> .rst files, so we can't simply include them in the detection; otherwise it
> goes into an infinite loop.
> 2. During PySpark documentation generation, it now launches some jobs to
> generate plot images. This is broken with the {{entr}} command, and the job
> fails; it seems related to how {{entr}} creates the process internally.






[jira] [Assigned] (SPARK-36051) Remove automatic update of documentation build in the guides

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36051:


Assignee: (was: Apache Spark)

> Remove automatic update of documentation build in the guides
> 
>
> Key: SPARK-36051
> URL: https://issues.apache.org/jira/browse/SPARK-36051
> Project: Spark
>  Issue Type: Improvement
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Hyukjin Kwon
>Priority: Trivial
>
> At 
> https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
>  
> {code}
> cd "$SPARK_HOME/python/docs"
> find .. -type f -name '*.py' \
> | entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
> {code}
> is documented. However, this doesn't work very well:
> 1. It doesn't detect changes in .rst files. But PySpark internally generates
> .rst files, so we can't simply include them in the detection; otherwise it
> goes into an infinite loop.
> 2. During PySpark documentation generation, it now launches some jobs to
> generate plot images. This is broken with the {{entr}} command, and the job
> fails; it seems related to how {{entr}} creates the process internally.






[jira] [Commented] (SPARK-36051) Remove automatic update of documentation build in the guides

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377316#comment-17377316
 ] 

Apache Spark commented on SPARK-36051:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/33266

> Remove automatic update of documentation build in the guides
> 
>
> Key: SPARK-36051
> URL: https://issues.apache.org/jira/browse/SPARK-36051
> Project: Spark
>  Issue Type: Improvement
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Hyukjin Kwon
>Priority: Trivial
>
> At 
> https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
>  
> {code}
> cd "$SPARK_HOME/python/docs"
> find .. -type f -name '*.py' \
> | entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
> {code}
> is documented. However, this doesn't work very well:
> 1. It doesn't detect changes in .rst files. However, PySpark internally 
> generates .rst files, so we can't simply include them in the detection; 
> otherwise, it goes into an infinite loop.
> 2. During PySpark documentation generation, it now launches some jobs to 
> generate plot images. This is broken with the {{entr}} command, and the job 
> fails; it seems related to how {{entr}} creates the process internally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36051) Remove automatic update of documentation build in the guides

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36051:


Assignee: Apache Spark

> Remove automatic update of documentation build in the guides
> 
>
> Key: SPARK-36051
> URL: https://issues.apache.org/jira/browse/SPARK-36051
> Project: Spark
>  Issue Type: Improvement
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Hyukjin Kwon
>Assignee: Apache Spark
>Priority: Trivial
>
> At 
> https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
>  
> {code}
> cd "$SPARK_HOME/python/docs"
> find .. -type f -name '*.py' \
> | entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
> {code}
> is documented. However, this doesn't work very well:
> 1. It doesn't detect changes in .rst files. However, PySpark internally 
> generates .rst files, so we can't simply include them in the detection; 
> otherwise, it goes into an infinite loop.
> 2. During PySpark documentation generation, it now launches some jobs to 
> generate plot images. This is broken with the {{entr}} command, and the job 
> fails; it seems related to how {{entr}} creates the process internally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36051) Remove automatic update of documentation build in the guides

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377318#comment-17377318
 ] 

Apache Spark commented on SPARK-36051:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/33266

> Remove automatic update of documentation build in the guides
> 
>
> Key: SPARK-36051
> URL: https://issues.apache.org/jira/browse/SPARK-36051
> Project: Spark
>  Issue Type: Improvement
>  Components: docs, PySpark
>Affects Versions: 3.2.0
>Reporter: Hyukjin Kwon
>Priority: Trivial
>
> At 
> https://github.com/apache/spark/tree/master/docs#automatically-rebuilding-api-docs,
>  
> {code}
> cd "$SPARK_HOME/python/docs"
> find .. -type f -name '*.py' \
> | entr -s 'make html && cp -r _build/html/. ../../docs/api/python'
> {code}
> is documented. However, this doesn't work very well:
> 1. It doesn't detect changes in .rst files. However, PySpark internally 
> generates .rst files, so we can't simply include them in the detection; 
> otherwise, it goes into an infinite loop.
> 2. During PySpark documentation generation, it now launches some jobs to 
> generate plot images. This is broken with the {{entr}} command, and the job 
> fails; it seems related to how {{entr}} creates the process internally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36043) Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ

2021-07-08 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang resolved SPARK-36043.

Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 33259
[https://github.com/apache/spark/pull/33259]

> Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ
> -
>
> Key: SPARK-36043
> URL: https://issues.apache.org/jira/browse/SPARK-36043
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.2.0
>
>
> Run end-to-end tests with default timestamp type as TIMESTAMP_NTZ to increase 
> test coverage. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36052) Introduce pending pod limit for Spark on K8s

2021-07-08 Thread Attila Zsolt Piros (Jira)
Attila Zsolt Piros created SPARK-36052:
--

 Summary: Introduce pending pod limit for Spark on K8s
 Key: SPARK-36052
 URL: https://issues.apache.org/jira/browse/SPARK-36052
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 3.3.0
Reporter: Attila Zsolt Piros
Assignee: Attila Zsolt Piros






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-36052) Introduce pending pod limit for Spark on K8s

2021-07-08 Thread Attila Zsolt Piros (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated SPARK-36052:
---
Description: Introduce a new configuration to limit the number of pending 
pods for Spark on K8s, as the K8s scheduler could be overloaded with requests, 
which slows down resource allocation (especially in the case of dynamic 
allocation).  (was: Introduce a pending pod limit for Spark on K8s, as the K8s 
scheduler could be overloaded with requests, which slows down resource 
allocation (especially in the case of dynamic allocation).)

> Introduce pending pod limit for Spark on K8s
> 
>
> Key: SPARK-36052
> URL: https://issues.apache.org/jira/browse/SPARK-36052
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Attila Zsolt Piros
>Assignee: Attila Zsolt Piros
>Priority: Major
>
> Introduce a new configuration to limit the number of pending pods for Spark 
> on K8s, as the K8s scheduler could be overloaded with requests, which slows 
> down resource allocation (especially in the case of dynamic allocation).
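
For a rough sense of the idea only (a sketch, not the actual patch; the names 
below, including {{maxPendingPods}}, are assumptions):

{code:scala}
object PendingPodLimit {
  // Hypothetical sketch: never let (pending + newly requested) pods exceed a
  // configured cap, so the K8s scheduler is not flooded with requests.
  def podsToRequest(needed: Int, pendingPods: Int, maxPendingPods: Int): Int =
    math.max(0, math.min(needed, maxPendingPods - pendingPods))
}
{code}

With a cap of 50 and 45 pods already pending, a demand for 20 more executors 
would be trimmed to 5 new requests until the scheduler catches up.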



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-36052) Introduce pending pod limit for Spark on K8s

2021-07-08 Thread Attila Zsolt Piros (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated SPARK-36052:
---
Description: Introduce a pending pod limit for Spark on K8s, as the K8s 
scheduler could be overloaded with requests, which slows down resource 
allocation (especially in the case of dynamic allocation).

> Introduce pending pod limit for Spark on K8s
> 
>
> Key: SPARK-36052
> URL: https://issues.apache.org/jira/browse/SPARK-36052
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Attila Zsolt Piros
>Assignee: Attila Zsolt Piros
>Priority: Major
>
> Introduce a pending pod limit for Spark on K8s, as the K8s scheduler could 
> be overloaded with requests, which slows down resource allocation 
> (especially in the case of dynamic allocation).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-35988) The implementation for RocksDBStateStoreProvider

2021-07-08 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim reassigned SPARK-35988:


Assignee: Yuanjian Li

> The implementation for RocksDBStateStoreProvider
> 
>
> Key: SPARK-35988
> URL: https://issues.apache.org/jira/browse/SPARK-35988
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 3.2.0
>Reporter: Yuanjian Li
>Assignee: Yuanjian Li
>Priority: Major
>
> Add the implementation for the RocksDBStateStoreProvider. It's a subclass 
> of StateStoreProvider that leverages all the functionality implemented in 
> the RocksDB instance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36053) Unify write exception and delete abnormal disk block object file process

2021-07-08 Thread Yang Jie (Jira)
Yang Jie created SPARK-36053:


 Summary: Unify write exception and delete abnormal disk block 
object file process
 Key: SPARK-36053
 URL: https://issues.apache.org/jira/browse/SPARK-36053
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.3.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-35988) The implementation for RocksDBStateStoreProvider

2021-07-08 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-35988.
--
Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 33187
[https://github.com/apache/spark/pull/33187]

> The implementation for RocksDBStateStoreProvider
> 
>
> Key: SPARK-35988
> URL: https://issues.apache.org/jira/browse/SPARK-35988
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 3.2.0
>Reporter: Yuanjian Li
>Assignee: Yuanjian Li
>Priority: Major
> Fix For: 3.2.0
>
>
> Add the implementation for the RocksDBStateStoreProvider. It's a subclass 
> of StateStoreProvider that leverages all the functionality implemented in 
> the RocksDB instance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-36053) Unify write exception and delete abnormal disk block object file process

2021-07-08 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-36053:
-
Description: 
There is some duplicated code for cleaning up failed files after 
DiskBlockObjectWriter writes data abnormally in `BypassMergeSortShuffleWriter`, 
`ExternalAppendOnlyMap`, and `ExternalSorter`; the duplicated code is as follows:
{code:java}
writer.revertPartialWritesAndClose()
if (file.exists()) {
  if (!file.delete()) {
logWarning(s"Error deleting ${file}")
  }
}
{code}
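
For illustration, one way to unify this could be a shared helper along the 
following lines (a minimal sketch; {{Writer}} stands in for 
DiskBlockObjectWriter, the helper name is made up, and the actual refactoring 
may differ):

{code:scala}
import java.io.File

// Stand-in for DiskBlockObjectWriter, just enough for the sketch to compile.
trait Writer {
  def revertPartialWritesAndClose(): Unit
}

object FileCleanup {
  def revertAndDeleteFile(writer: Writer, file: File): Unit = {
    // Roll back any partially written data first...
    writer.revertPartialWritesAndClose()
    // ...then remove the leftover file, warning on failure (the real code
    // would use logWarning instead of stderr).
    if (file.exists() && !file.delete()) {
      Console.err.println(s"Error deleting ${file}")
    }
  }
}
{code}

Each of the three call sites could then collapse to a single call to the helper.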

> Unify write exception and delete abnormal disk block object file process
> 
>
> Key: SPARK-36053
> URL: https://issues.apache.org/jira/browse/SPARK-36053
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Minor
>
> There is some duplicated code for cleaning up failed files after 
> DiskBlockObjectWriter writes data abnormally in 
> `BypassMergeSortShuffleWriter`, `ExternalAppendOnlyMap`, and `ExternalSorter`; 
> the duplicated code is as follows:
> {code:java}
> writer.revertPartialWritesAndClose()
> if (file.exists()) {
>   if (!file.delete()) {
> logWarning(s"Error deleting ${file}")
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36053) Unify write exception and delete abnormal disk block object file process

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36053:


Assignee: Apache Spark

> Unify write exception and delete abnormal disk block object file process
> 
>
> Key: SPARK-36053
> URL: https://issues.apache.org/jira/browse/SPARK-36053
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>
> There is some duplicated code for cleaning up failed files after 
> DiskBlockObjectWriter writes data abnormally in 
> `BypassMergeSortShuffleWriter`, `ExternalAppendOnlyMap`, and `ExternalSorter`; 
> the duplicated code is as follows:
> {code:java}
> writer.revertPartialWritesAndClose()
> if (file.exists()) {
>   if (!file.delete()) {
> logWarning(s"Error deleting ${file}")
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36053) Unify write exception and delete abnormal disk block object file process

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377359#comment-17377359
 ] 

Apache Spark commented on SPARK-36053:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33267

> Unify write exception and delete abnormal disk block object file process
> 
>
> Key: SPARK-36053
> URL: https://issues.apache.org/jira/browse/SPARK-36053
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Minor
>
> There is some duplicated code for cleaning up failed files after 
> DiskBlockObjectWriter writes data abnormally in 
> `BypassMergeSortShuffleWriter`, `ExternalAppendOnlyMap`, and `ExternalSorter`; 
> the duplicated code is as follows:
> {code:java}
> writer.revertPartialWritesAndClose()
> if (file.exists()) {
>   if (!file.delete()) {
> logWarning(s"Error deleting ${file}")
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36053) Unify write exception and delete abnormal disk block object file process

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36053:


Assignee: (was: Apache Spark)

> Unify write exception and delete abnormal disk block object file process
> 
>
> Key: SPARK-36053
> URL: https://issues.apache.org/jira/browse/SPARK-36053
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Minor
>
> There is some duplicated code for cleaning up failed files after 
> DiskBlockObjectWriter writes data abnormally in 
> `BypassMergeSortShuffleWriter`, `ExternalAppendOnlyMap`, and `ExternalSorter`; 
> the duplicated code is as follows:
> {code:java}
> writer.revertPartialWritesAndClose()
> if (file.exists()) {
>   if (!file.delete()) {
> logWarning(s"Error deleting ${file}")
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36054) Support group by TimestampNTZ column

2021-07-08 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-36054:
--

 Summary: Support group by TimestampNTZ column
 Key: SPARK-36054
 URL: https://issues.apache.org/jira/browse/SPARK-36054
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang


Support group by TimestampNTZ column



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36054) Support group by TimestampNTZ column

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36054:


Assignee: Gengliang Wang  (was: Apache Spark)

> Support group by TimestampNTZ column
> 
>
> Key: SPARK-36054
> URL: https://issues.apache.org/jira/browse/SPARK-36054
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Support group by TimestampNTZ column



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36054) Support group by TimestampNTZ column

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36054:


Assignee: Apache Spark  (was: Gengliang Wang)

> Support group by TimestampNTZ column
> 
>
> Key: SPARK-36054
> URL: https://issues.apache.org/jira/browse/SPARK-36054
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>
> Support group by TimestampNTZ column



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36054) Support group by TimestampNTZ column

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377399#comment-17377399
 ] 

Apache Spark commented on SPARK-36054:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33268

> Support group by TimestampNTZ column
> 
>
> Key: SPARK-36054
> URL: https://issues.apache.org/jira/browse/SPARK-36054
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Support group by TimestampNTZ column



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-35958) Refactor SparkError.scala to SparkThrowable.java

2021-07-08 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-35958:
---

Assignee: Karen Feng

> Refactor SparkError.scala to SparkThrowable.java
> 
>
> Key: SPARK-35958
> URL: https://issues.apache.org/jira/browse/SPARK-35958
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.2.0
>Reporter: Karen Feng
>Assignee: Karen Feng
>Priority: Major
>
> Following up from SPARK-34920:
> Error has a special meaning in Java, and SparkError should encompass all 
> Throwables. It'd be more correct to rename SparkError to SparkThrowable.
> In addition, some Throwables come from Java, so to maximize usability, we 
> should migrate the base trait from Scala to Java.
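
For a rough sense of the shape being discussed (a sketch only; the actual 
definition is a Java interface in the Spark source, and its members may differ):

{code:scala}
// Sketch: a base type that any Throwable thrown by Spark can mix in.
// The member name is illustrative, not necessarily the real API.
trait SparkThrowable { self: Throwable =>
  def getErrorClass: String
}
{code}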



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-35958) Refactor SparkError.scala to SparkThrowable.java

2021-07-08 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-35958.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 33164
[https://github.com/apache/spark/pull/33164]

> Refactor SparkError.scala to SparkThrowable.java
> 
>
> Key: SPARK-35958
> URL: https://issues.apache.org/jira/browse/SPARK-35958
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.2.0
>Reporter: Karen Feng
>Assignee: Karen Feng
>Priority: Major
> Fix For: 3.2.0
>
>
> Following up from SPARK-34920:
> Error has a special meaning in Java; SparkError should encompass all 
> Throwables. It'd be more correct to rename SparkError to SparkThrowable.
> In addition, some Throwables come from Java, so to maximize usability, we 
> should migrate the base trait from Scala to Java.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-35874) AQE Shuffle should wait for its subqueries to finish before materializing

2021-07-08 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-35874.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 33058
[https://github.com/apache/spark/pull/33058]

> AQE Shuffle should wait for its subqueries to finish before materializing
> -
>
> Key: SPARK-35874
> URL: https://issues.apache.org/jira/browse/SPARK-35874
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-35874) AQE Shuffle should wait for its subqueries to finish before materializing

2021-07-08 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-35874:
---

Assignee: Wenchen Fan

> AQE Shuffle should wait for its subqueries to finish before materializing
> -
>
> Key: SPARK-35874
> URL: https://issues.apache.org/jira/browse/SPARK-35874
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36055) Assign pretty SQL string to TimestampNTZ literals

2021-07-08 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-36055:
--

 Summary: Assign pretty SQL string to TimestampNTZ literals
 Key: SPARK-36055
 URL: https://issues.apache.org/jira/browse/SPARK-36055
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang


Currently, TimestampNTZ literals show only the long value instead of the 
timestamp string in their SQL string and toString results.
Before the changes (with the default timestamp type as TIMESTAMP_NTZ):
```
 -- !query
 select timestamp '2019-01-01\t'
 -- !query schema
struct<15463008:timestamp_ntz>
```

After the changes:
```
 -- !query
 select timestamp '2019-01-01\t'
 -- !query schema
struct
```
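
As an illustration of the formatting step only (a sketch with made-up names; 
the actual implementation in the literal's sql/toString methods will differ in 
details such as the exact timestamp format):

{code:scala}
import java.time.{LocalDateTime, ZoneOffset}

object PrettyNtzLiteral {
  // Hypothetical sketch: render an NTZ literal's microseconds-since-epoch
  // value as a timestamp string rather than as the raw long.
  def sqlString(micros: Long): String = {
    val secs  = Math.floorDiv(micros, 1000000L)
    val nanos = (Math.floorMod(micros, 1000000L) * 1000L).toInt
    s"TIMESTAMP_NTZ '${LocalDateTime.ofEpochSecond(secs, nanos, ZoneOffset.UTC)}'"
  }
}
{code}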



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36056) Combine readBatch and readIntegers in VectorizedRleValuesReader

2021-07-08 Thread Chao Sun (Jira)
Chao Sun created SPARK-36056:


 Summary: Combine readBatch and readIntegers in 
VectorizedRleValuesReader
 Key: SPARK-36056
 URL: https://issues.apache.org/jira/browse/SPARK-36056
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: Chao Sun


{{readBatch}} and {{readIntegers}} share a similar code path, and this Jira 
aims to combine them into one method for easier maintenance.
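
As a language-neutral illustration of the pattern only (the real class is 
Java, and the details differ), merging two near-duplicate read loops into one 
method parameterized by the per-value step looks roughly like:

{code:scala}
object CombinedRead {
  // Hypothetical sketch: the shared loop is written once, and the only part
  // that differed between the two methods is passed in as a function.
  def readValues(input: Array[Int], output: Array[Int], convert: Int => Int): Unit = {
    var i = 0
    while (i < input.length) {
      output(i) = convert(input(i)) // a readIntegers-style caller passes identity
      i += 1
    }
  }
}
{code}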



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36055) Assign pretty SQL string to TimestampNTZ literals

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36055:


Assignee: Gengliang Wang  (was: Apache Spark)

> Assign pretty SQL string to TimestampNTZ literals
> -
>
> Key: SPARK-36055
> URL: https://issues.apache.org/jira/browse/SPARK-36055
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Currently, TimestampNTZ literals show only the long value instead of the 
> timestamp string in their SQL string and toString results.
> Before the changes (with the default timestamp type as TIMESTAMP_NTZ):
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct<15463008:timestamp_ntz>
> ```
> After the changes:
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36055) Assign pretty SQL string to TimestampNTZ literals

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377507#comment-17377507
 ] 

Apache Spark commented on SPARK-36055:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/33269

> Assign pretty SQL string to TimestampNTZ literals
> -
>
> Key: SPARK-36055
> URL: https://issues.apache.org/jira/browse/SPARK-36055
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> Currently, TimestampNTZ literals show only the long value instead of the 
> timestamp string in their SQL string and toString results.
> Before the changes (with the default timestamp type as TIMESTAMP_NTZ):
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct<15463008:timestamp_ntz>
> ```
> After the changes:
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36055) Assign pretty SQL string to TimestampNTZ literals

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36055:


Assignee: Apache Spark  (was: Gengliang Wang)

> Assign pretty SQL string to TimestampNTZ literals
> -
>
> Key: SPARK-36055
> URL: https://issues.apache.org/jira/browse/SPARK-36055
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>
> Currently, TimestampNTZ literals show only the long value instead of the 
> timestamp string in their SQL string and toString results.
> Before the changes (with the default timestamp type as TIMESTAMP_NTZ):
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct<15463008:timestamp_ntz>
> ```
> After the changes:
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36012) Lost the null flag info when show create table

2021-07-08 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-36012:
---

Assignee: PengLei

> Lost the null flag info when show create table
> --
>
> Key: SPARK-36012
> URL: https://issues.apache.org/jira/browse/SPARK-36012
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: PengLei
>Assignee: PengLei
>Priority: Major
> Fix For: 3.2.0
>
>
> When executing a `SHOW CREATE TABLE XXX` command, the DDL info loses the 
> null flag.
> {code:java}
> // def toDDL: String = s"${quoteIdentifier(name)} 
> ${dataType.sql}$getDDLComment"
> {code}
>  
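
A minimal sketch of one possible direction for a fix, using a stand-in column 
type (field names mirror the snippet above, but the actual patch may differ):

{code:scala}
// Stand-in for the column metadata; just enough to show the idea.
case class ColumnSpec(name: String, dataTypeSql: String, nullable: Boolean, ddlComment: String) {
  private def quoteIdentifier(n: String): String = s"`$n`"

  // Append NOT NULL so the null flag survives a SHOW CREATE TABLE round-trip.
  def toDDL: String = {
    val notNull = if (nullable) "" else " NOT NULL"
    s"${quoteIdentifier(name)} $dataTypeSql$notNull$ddlComment"
  }
}
{code}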



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36012) Lost the null flag info when show create table

2021-07-08 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-36012.
-
Resolution: Fixed

Issue resolved by pull request 33219
[https://github.com/apache/spark/pull/33219]

> Lost the null flag info when show create table
> --
>
> Key: SPARK-36012
> URL: https://issues.apache.org/jira/browse/SPARK-36012
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: PengLei
>Assignee: PengLei
>Priority: Major
> Fix For: 3.2.0
>
>
> When executing a `SHOW CREATE TABLE XXX` command, the DDL info loses the 
> null flag.
> {code:java}
> // def toDDL: String = s"${quoteIdentifier(name)} 
> ${dataType.sql}$getDDLComment"
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-35956) Support auto-assigning labels to less important pods (e.g. decommissioning pods)

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377527#comment-17377527
 ] 

Apache Spark commented on SPARK-35956:
--

User 'holdenk' has created a pull request for this issue:
https://github.com/apache/spark/pull/33270

> Support auto-assigning labels to less important pods (e.g. decommissioning 
> pods)
> 
>
> Key: SPARK-35956
> URL: https://issues.apache.org/jira/browse/SPARK-35956
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Major
>
> To allow folks to use pod disruption budgets or replicasets, we should 
> indicate which pods Spark cares about "the least"; those would be pods that 
> are otherwise exiting soon.
>  
> With PDBs, the user would create a PDB matching the label of 
> decommissioning executors, and this PDB could allow a higher number of 
> unavailable pods than the PDB for the "regular" executors. For people using 
> replicasets on 1.21, we could also set a 
> "controller.kubernetes.io/pod-deletion-cost" label (see 
> [https://github.com/kubernetes/kubernetes/pull/99163] ) to hint to the 
> controller that a pod is less important to us.
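
As a rough sketch with the fabric8 builder API (the label key and value below 
are assumptions, not necessarily what the eventual change uses), tagging a 
decommissioning executor pod might look like:

{code:scala}
import io.fabric8.kubernetes.api.model.{Pod, PodBuilder}

object DecommissionLabel {
  // Hypothetical sketch: label the pod so a PDB (or a deletion-cost
  // mechanism) can treat it as the preferred victim.
  def markAsDecommissioning(pod: Pod): Pod =
    new PodBuilder(pod)
      .editOrNewMetadata()
        .addToLabels("spark-executor-state", "decommissioning")
      .endMetadata()
      .build()
}
{code}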



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-35956) Support auto-assigning labels to less important pods (e.g. decommissioning pods)

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-35956:


Assignee: Apache Spark  (was: Holden Karau)

> Support auto-assigning labels to less important pods (e.g. decommissioning 
> pods)
> 
>
> Key: SPARK-35956
> URL: https://issues.apache.org/jira/browse/SPARK-35956
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Holden Karau
>Assignee: Apache Spark
>Priority: Major
>
> To allow folks to use pod disruption budgets or replicasets, we should 
> indicate which pods Spark cares about "the least"; those would be pods that 
> are otherwise exiting soon.
>  
> With PDBs, the user would create a PDB matching the label of 
> decommissioning executors, and this PDB could allow a higher number of 
> unavailable pods than the PDB for the "regular" executors. For people using 
> replicasets on 1.21, we could also set a 
> "controller.kubernetes.io/pod-deletion-cost" label (see 
> [https://github.com/kubernetes/kubernetes/pull/99163] ) to hint to the 
> controller that a pod is less important to us.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-35956) Support auto-assigning labels to less important pods (e.g. decommissioning pods)

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-35956:


Assignee: Holden Karau  (was: Apache Spark)

> Support auto-assigning labels to less important pods (e.g. decommissioning 
> pods)
> 
>
> Key: SPARK-35956
> URL: https://issues.apache.org/jira/browse/SPARK-35956
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Major
>
> To allow folks to use pod disruption budgets or replicasets, we should 
> indicate which pods Spark cares about "the least"; those would be pods that 
> are otherwise exiting soon.
>  
> With PDBs, the user would create a PDB matching the label of 
> decommissioning executors, and this PDB could allow a higher number of 
> unavailable pods than the PDB for the "regular" executors. For people using 
> replicasets on 1.21, we could also set a 
> "controller.kubernetes.io/pod-deletion-cost" label (see 
> [https://github.com/kubernetes/kubernetes/pull/99163] ) to hint to the 
> controller that a pod is less important to us.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-35956) Support auto-assigning labels to less important pods (e.g. decommissioning pods)

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377528#comment-17377528
 ] 

Apache Spark commented on SPARK-35956:
--

User 'holdenk' has created a pull request for this issue:
https://github.com/apache/spark/pull/33270

> Support auto-assigning labels to less important pods (e.g. decommissioning 
> pods)
> 
>
> Key: SPARK-35956
> URL: https://issues.apache.org/jira/browse/SPARK-35956
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.2.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Major
>
> To allow folks to use pod disruption budgets or replicasets, we should 
> indicate which pods Spark cares about "the least"; those would be pods that 
> are otherwise exiting soon.
>  
> With PDBs, the user would create a PDB matching the label of 
> decommissioning executors, and this PDB could allow a higher number of 
> unavailable pods than the PDB for the "regular" executors. For people using 
> replicasets on 1.21, we could also set a 
> "controller.kubernetes.io/pod-deletion-cost" label (see 
> [https://github.com/kubernetes/kubernetes/pull/99163] ) to hint to the 
> controller that a pod is less important to us.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36057) Support volcano/alternative schedulers

2021-07-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-36057:


 Summary: Support volcano/alternative schedulers
 Key: SPARK-36057
 URL: https://issues.apache.org/jira/browse/SPARK-36057
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.2.0
Reporter: Holden Karau


This is an umbrella issue for tracking the work to support Volcano & 
Yunikorn on Kubernetes. These schedulers provide more YARN-like features (such 
as queues and minimum resources before scheduling jobs) that many folks want on 
Kubernetes.

 

Yunikorn is an ASF project & Volcano is a CNCF project (sig-batch).

 

They've taken slightly different approaches to solving the same problem, but 
from Spark's point of view we should be able to share much of the code.

 

See the initial brainstorming discussion in SPARK-35623.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36058) Support replicasets/job API

2021-07-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-36058:


 Summary: Support replicasets/job API
 Key: SPARK-36058
 URL: https://issues.apache.org/jira/browse/SPARK-36058
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: 3.2.0
Reporter: Holden Karau


Volcano & Yunikorn both support scheduling individual pods, but they also 
support higher-level abstractions, similar to vanilla Kubernetes replicasets, 
which we can use to improve scheduling performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36059) Add the ability to specify a scheduler & queue

2021-07-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-36059:


 Summary: Add the ability to specify a scheduler & queue
 Key: SPARK-36059
 URL: https://issues.apache.org/jira/browse/SPARK-36059
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: 3.2.0
Reporter: Holden Karau






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36060) Support backing off dynamic allocation increases if resources are "stuck"

2021-07-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-36060:


 Summary: Support backing off dynamic allocation increases if 
resources are "stuck"
 Key: SPARK-36060
 URL: https://issues.apache.org/jira/browse/SPARK-36060
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: 3.2.0
Reporter: Holden Karau


In an over-subscribed environment we may enter a situation where our requests 
for more pods are not going to be fulfilled. Adding requests for even more pods 
is not going to help and may slow down the scheduler. We should detect this 
situation and hold off on increasing pod requests until the scheduler allocates 
more pods to us. We have a limited version of this in the Kube scheduler itself, 
but it would be better to plumb this all the way through to the dynamic 
allocation logic.
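
A minimal sketch of the "stuck" detection (all names below are hypothetical 
placeholders, not Spark's actual fields):

{code:scala}
object AllocationBackoff {
  // Hold off on new pod requests while requests appear stuck: something is
  // pending, yet nothing has been allocated for longer than the threshold.
  def shouldHoldOff(
      pendingPods: Int,
      lastAllocationMs: Long,
      nowMs: Long,
      stuckTimeoutMs: Long): Boolean =
    pendingPods > 0 && nowMs - lastAllocationMs > stuckTimeoutMs
}
{code}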



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-35743) Improve Parquet vectorized reader

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-35743:


Assignee: Apache Spark

> Improve Parquet vectorized reader
> -
>
> Key: SPARK-35743
> URL: https://issues.apache.org/jira/browse/SPARK-35743
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Chao Sun
>Assignee: Apache Spark
>Priority: Major
>
> This umbrella JIRA tracks efforts to improve vectorized Parquet reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-35743) Improve Parquet vectorized reader

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-35743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377536#comment-17377536
 ] 

Apache Spark commented on SPARK-35743:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/33271

> Improve Parquet vectorized reader
> -
>
> Key: SPARK-35743
> URL: https://issues.apache.org/jira/browse/SPARK-35743
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Chao Sun
>Priority: Major
>
> This umbrella JIRA tracks efforts to improve vectorized Parquet reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-35743) Improve Parquet vectorized reader

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-35743:


Assignee: (was: Apache Spark)

> Improve Parquet vectorized reader
> -
>
> Key: SPARK-35743
> URL: https://issues.apache.org/jira/browse/SPARK-35743
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Chao Sun
>Priority: Major
>
> This umbrella JIRA tracks efforts to improve vectorized Parquet reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36061) Create a PodGroup with user specified minimum resources required

2021-07-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-36061:


 Summary: Create a PodGroup with user specified minimum resources 
required
 Key: SPARK-36061
 URL: https://issues.apache.org/jira/browse/SPARK-36061
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: 3.2.0
Reporter: Holden Karau






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36056) Combine readBatch and readIntegers in VectorizedRleValuesReader

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36056:


Assignee: Apache Spark

> Combine readBatch and readIntegers in VectorizedRleValuesReader
> ---
>
> Key: SPARK-36056
> URL: https://issues.apache.org/jira/browse/SPARK-36056
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Chao Sun
>Assignee: Apache Spark
>Priority: Minor
>
> {{readBatch}} and {{readIntegers}} share a similar code path, and this Jira 
> aims to combine them into one method for easier maintenance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36056) Combine readBatch and readIntegers in VectorizedRleValuesReader

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36056:


Assignee: (was: Apache Spark)

> Combine readBatch and readIntegers in VectorizedRleValuesReader
> ---
>
> Key: SPARK-36056
> URL: https://issues.apache.org/jira/browse/SPARK-36056
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Chao Sun
>Priority: Minor
>
> {{readBatch}} and {{readIntegers}} share a similar code path, and this Jira 
> aims to combine them into one method for easier maintenance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36056) Combine readBatch and readIntegers in VectorizedRleValuesReader

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377544#comment-17377544
 ] 

Apache Spark commented on SPARK-36056:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/33271

> Combine readBatch and readIntegers in VectorizedRleValuesReader
> ---
>
> Key: SPARK-36056
> URL: https://issues.apache.org/jira/browse/SPARK-36056
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Chao Sun
>Priority: Minor
>
> {{readBatch}} and {{readIntegers}} share a similar code path, and this Jira 
> aims to combine them into one method for easier maintenance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36056) Combine readBatch and readIntegers in VectorizedRleValuesReader

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377545#comment-17377545
 ] 

Apache Spark commented on SPARK-36056:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/33271

> Combine readBatch and readIntegers in VectorizedRleValuesReader
> ---
>
> Key: SPARK-36056
> URL: https://issues.apache.org/jira/browse/SPARK-36056
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Chao Sun
>Priority: Minor
>
> {{readBatch}} and {{readIntegers}} share a similar code path, and this Jira 
> aims to combine them into one method for easier maintenance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36050) Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC

2021-07-08 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang resolved SPARK-36050.

Resolution: Not A Problem

> Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC
> 
>
> Key: SPARK-36050
> URL: https://issues.apache.org/jira/browse/SPARK-36050
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>
> The OrcTimestamp of the MapReduce library is an instance of 
> java.sql.Timestamp. As it is the only supported timestamp type of ORC, Spark 
> doesn't support reading/writing TIMESTAMP_NTZ with ORC.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36055) Assign pretty SQL string to TimestampNTZ literals

2021-07-08 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-36055.
--
Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 33269
[https://github.com/apache/spark/pull/33269]

> Assign pretty SQL string to TimestampNTZ literals
> -
>
> Key: SPARK-36055
> URL: https://issues.apache.org/jira/browse/SPARK-36055
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.2.0
>
>
> Currently, TimestampNTZ literals show only the long value instead of the 
> timestamp string in their SQL string and toString results.
> Before the changes (with the default timestamp type as TIMESTAMP_NTZ):
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct<15463008:timestamp_ntz>
> ```
> After the changes:
> ```
>  -- !query
>  select timestamp '2019-01-01\t'
>  -- !query schema
> struct
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-35340) Standardize TypeError messages for unsupported basic operations

2021-07-08 Thread Takuya Ueshin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin resolved SPARK-35340.
---
Fix Version/s: 3.2.0
 Assignee: Xinrong Meng
   Resolution: Fixed

Issue resolved by pull request 33237
https://github.com/apache/spark/pull/33237

> Standardize TypeError messages for unsupported basic operations
> ---
>
> Key: SPARK-35340
> URL: https://issues.apache.org/jira/browse/SPARK-35340
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Xinrong Meng
>Assignee: Xinrong Meng
>Priority: Major
> Fix For: 3.2.0
>
>
> Inconsistent TypeError messages are shown for unsupported data-type-based 
> basic operations.
> Take addition's TypeError messages for example: 
> {code:java}
> addition can not be applied to given types.
> string addition can only be applied to string series or literals.
> {code}
> Standardizing TypeError messages would improve user experience and reduce 
> maintenance costs.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36054) Support group by TimestampNTZ column

2021-07-08 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-36054.
--
Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 33268
[https://github.com/apache/spark/pull/33268]

> Support group by TimestampNTZ column
> 
>
> Key: SPARK-36054
> URL: https://issues.apache.org/jira/browse/SPARK-36054
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.2.0
>
>
> Support group by TimestampNTZ column



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36049) Remove IntervalUnit in code

2021-07-08 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-36049:


Assignee: angerszhu

> Remove IntervalUnit in code
> ---
>
> Key: SPARK-36049
> URL: https://issues.apache.org/jira/browse/SPARK-36049
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>
> According to https://github.com/apache/spark/pull/33252#issuecomment-876280183



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-36049) Remove IntervalUnit in code

2021-07-08 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-36049.
--
Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 33265
[https://github.com/apache/spark/pull/33265]

> Remove IntervalUnit in code
> ---
>
> Key: SPARK-36049
> URL: https://issues.apache.org/jira/browse/SPARK-36049
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
> Fix For: 3.2.0
>
>
> According to https://github.com/apache/spark/pull/33252#issuecomment-876280183



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36035) Adjust `test_astype`, `test_neg` for old pandas versions

2021-07-08 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377590#comment-17377590
 ] 

Apache Spark commented on SPARK-36035:
--

User 'xinrong-databricks' has created a pull request for this issue:
https://github.com/apache/spark/pull/33272

> Adjust `test_astype`, `test_neg` for old pandas versions
> 
>
> Key: SPARK-36035
> URL: https://issues.apache.org/jira/browse/SPARK-36035
> Project: Spark
>  Issue Type: Test
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Xinrong Meng
>Priority: Major
>
> * test_astype
> For pandas < 1.1.0, declaring or converting to StringDtype was in general 
> only possible if the data was already only str or nan-like (GH31204).
> In pandas 1.1.0, the problem was addressed; see 
> [https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.1.0.html#all-dtypes-can-now-be-converted-to-stringdtype].
> That should be considered in `test_astype`; otherwise, the current tests will 
> fail with pandas < 1.1.0.
>  * test_neg
> {code:python}
> import pandas as pd
>
> dtypes = [
>   "Int8",
>   "Int16",
>   "Int32",
>   "Int64",
> ]
> psers = []
> for dtype in dtypes:
>   psers.append(pd.Series([1, 2, 3, None], dtype=dtype))
>
> for pser in psers:
>   print((-pser).dtype)
> {code}
> The result of (-pser).dtype by pandas version:
>  ~1.0.5: object dtype
>  1.1.0~1.1.2: TypeError: bad operand type for unary -: 'IntegerArray'
>  1.1.3~: the correct respective dtype



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36062) Try to capture faulthandler when a Python worker crashes.

2021-07-08 Thread Takuya Ueshin (Jira)
Takuya Ueshin created SPARK-36062:
-

 Summary: Try to capture faulthandler when a Python worker crashes.
 Key: SPARK-36062
 URL: https://issues.apache.org/jira/browse/SPARK-36062
 Project: Spark
  Issue Type: Improvement
  Components: PySpark
Affects Versions: 3.2.0
Reporter: Takuya Ueshin


Currently, we just see an error message saying "exited unexpectedly (crashed)" 
when a UDF causes the Python worker to crash, e.g., by a segmentation fault.
 We should take advantage of 
{{[faulthandler|https://docs.python.org/3/library/faulthandler.html]}} and try 
to capture the error message from {{faulthandler}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36062) Try to capture faulthandler when a Python worker crashes.

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36062:


Assignee: Apache Spark

> Try to capture faulthandler when a Python worker crashes.
> -
>
> Key: SPARK-36062
> URL: https://issues.apache.org/jira/browse/SPARK-36062
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Takuya Ueshin
>Assignee: Apache Spark
>Priority: Major
>
> Currently, we just see an error message saying "exited unexpectedly 
> (crashed)" when a UDF causes the Python worker to crash, e.g., by a 
> segmentation fault.
>  We should take advantage of 
> {{[faulthandler|https://docs.python.org/3/library/faulthandler.html]}} and 
> try to capture the error message from {{faulthandler}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-36062) Try to capture faulthandler when a Python worker crashes.

2021-07-08 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-36062:


Assignee: (was: Apache Spark)

> Try to capture faulthandler when a Python worker crashes.
> -
>
> Key: SPARK-36062
> URL: https://issues.apache.org/jira/browse/SPARK-36062
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Takuya Ueshin
>Priority: Major
>
> Currently, we just see an error message saying "exited unexpectedly 
> (crashed)" when a UDF causes the Python worker to crash, e.g., by a 
> segmentation fault.
>  We should take advantage of 
> {{[faulthandler|https://docs.python.org/3/library/faulthandler.html]}} and 
> try to capture the error message from {{faulthandler}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


