[jira] [Commented] (SPARK-47223) Update usage of deprecated Thread.getId() to Thread.threadId()

2024-02-29 Thread Neil Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-47223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1788#comment-1788
 ] 

Neil Gupta commented on SPARK-47223:


Update: Realized that the minimum Java version here is Java 17, so we cannot 
use the method, since Thread.threadId() only exists in Java 19+.
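For anyone hitting the same constraint, a reflective shim would let code prefer 
threadId() where it exists while staying Java 17 compatible; a minimal Scala 
sketch (illustrative only, not part of this ticket, and the name ThreadIdShim 
is made up):

{code:scala}
object ThreadIdShim {
  // Look up Thread.threadId() once; it is only present on Java 19+.
  private val threadIdMethod: Option[java.lang.reflect.Method] =
    try Some(classOf[Thread].getMethod("threadId"))
    catch { case _: NoSuchMethodException => None }

  def id(t: Thread): Long = threadIdMethod match {
    case Some(m) => m.invoke(t).asInstanceOf[Long]
    case None    => t.getId // deprecated since Java 19; the only option on 17/18
  }
}

// Usage: ThreadIdShim.id(Thread.currentThread())
{code}

Whether the indirection is worth it for a trivial accessor is debatable; 
leaving the deprecated call in place until the minimum JDK moves to 19+ is the 
simpler option.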

> Update usage of deprecated Thread.getId() to Thread.threadId()
> --
>
> Key: SPARK-47223
> URL: https://issues.apache.org/jira/browse/SPARK-47223
> Project: Spark
>  Issue Type: Request
>  Components: Spark Core, SQL
>Affects Versions: 3.5.1
>Reporter: Neil Gupta
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.5.1
>
>
> Update usage of deprecated Thread.getId() to Thread.threadId().
>  
> Currently in Spark, there are multiple references to the deprecated method 
> [Thread.getId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#getId()], 
> given that the current version uses Java 21. The JDK documentation asks that 
> all usage be switched to the 
> [Thread.threadId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#threadId()] 
> method instead.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-47223) Update usage of deprecated Thread.getId() to Thread.threadId()

2024-02-28 Thread Neil Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-47223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821925#comment-17821925
 ] 

Neil Gupta commented on SPARK-47223:


I can take a stab at this one myself.

> Update usage of deprecated Thread.getId() to Thread.threadId()
> --
>
> Key: SPARK-47223
> URL: https://issues.apache.org/jira/browse/SPARK-47223
> Project: Spark
>  Issue Type: Request
>  Components: Spark Core, SQL
>Affects Versions: 3.5.1
>Reporter: Neil Gupta
>Priority: Trivial
> Fix For: 3.5.1
>
>
> Update usage of deprecated Thread.getId() to Thread.threadId().
>  
> Currently in Spark, there are multiple references to the deprecated method 
> [Thread.getId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#getId()], 
> given that the current version uses Java 21. The JDK documentation asks that 
> all usage be switched to the 
> [Thread.threadId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#threadId()] 
> method instead.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47223) Update usage of deprecated Thread.getId() to Thread.threadId()

2024-02-28 Thread Neil Gupta (Jira)
Neil Gupta created SPARK-47223:
--

 Summary: Update usage of deprecated Thread.getId() to 
Thread.threadId()
 Key: SPARK-47223
 URL: https://issues.apache.org/jira/browse/SPARK-47223
 Project: Spark
  Issue Type: Request
  Components: Spark Core, SQL
Affects Versions: 3.5.1
Reporter: Neil Gupta
 Fix For: 3.5.1


Update usage of deprecated Thread.getId() to Thread.threadId().

 

Currently in Spark, there are multiple references to the deprecated method 
[`Thread.getId()`|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#getId()], 
given that the current version uses Java 21. The JDK documentation asks that 
all usage be switched to the 
[`Thread.threadId()`|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#threadId()] 
method instead.
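For illustration, the mechanical change at each call site is a one-token 
rename (hypothetical snippet; compiling the second line requires a Java 19+ 
JDK):

{code:scala}
// Before: Thread.getId() is deprecated as of Java 19.
val oldId: Long = Thread.currentThread().getId
// After: Thread.threadId() is the replacement accessor, added in Java 19.
val newId: Long = Thread.currentThread().threadId()
{code}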



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47223) Update usage of deprecated Thread.getId() to Thread.threadId()

2024-02-28 Thread Neil Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Gupta updated SPARK-47223:
---
Description: 
Update usage of deprecated Thread.getId() to Thread.threadId().

 

Currently in Spark, there are multiple references to the deprecated method 
[Thread.getId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#getId()], 
given that the current version uses Java 21. The JDK documentation asks that 
all usage be switched to the 
[Thread.threadId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#threadId()] 
method instead.

  was:
Update usage of deprecated Thread.getId() to Thread.threadId().

 

Currently in Spark, there are multiple references to the deprecated method 
[`Thread.getId()`|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#getId()], 
given that the current version uses Java 21. The JDK documentation asks that 
all usage be switched to the 
[`Thread.threadId()`|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#threadId()] 
method instead.


> Update usage of deprecated Thread.getId() to Thread.threadId()
> --
>
> Key: SPARK-47223
> URL: https://issues.apache.org/jira/browse/SPARK-47223
> Project: Spark
>  Issue Type: Request
>  Components: Spark Core, SQL
>Affects Versions: 3.5.1
>Reporter: Neil Gupta
>Priority: Trivial
> Fix For: 3.5.1
>
>
> Update usage of deprecated Thread.getId() to Thread.threadId().
>  
> Currently in Spark, there are multiple references to the deprecated method 
> [Thread.getId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#getId()], 
> given that the current version uses Java 21. The JDK documentation asks that 
> all usage be switched to the 
> [Thread.threadId()|https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#threadId()] 
> method instead.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39104) Null Pointer Exception on unpersist call

2022-05-08 Thread Neil Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533546#comment-17533546
 ] 

Neil Gupta commented on SPARK-39104:


Do you have reproduction steps? 

> Null Pointer Exception on unpersist call
> ---
>
> Key: SPARK-39104
> URL: https://issues.apache.org/jira/browse/SPARK-39104
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.2.1
>Reporter: Denis
>Priority: Major
>
> DataFrame.unpersist call fails with an NPE
>  
> {code:java}
> java.lang.NullPointerException
>     at 
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder.isCachedRDDLoaded(InMemoryRelation.scala:247)
>     at 
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder.isCachedColumnBuffersLoaded(InMemoryRelation.scala:241)
>     at 
> org.apache.spark.sql.execution.CacheManager.$anonfun$uncacheQuery$8(CacheManager.scala:189)
>     at 
> org.apache.spark.sql.execution.CacheManager.$anonfun$uncacheQuery$8$adapted(CacheManager.scala:176)
>     at 
> scala.collection.TraversableLike.$anonfun$filterImpl$1(TraversableLike.scala:304)
>     at scala.collection.Iterator.foreach(Iterator.scala:943)
>     at scala.collection.Iterator.foreach$(Iterator.scala:943)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>     at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>     at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>     at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
>     at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:303)
>     at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:297)
>     at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
>     at scala.collection.TraversableLike.filter(TraversableLike.scala:395)
>     at scala.collection.TraversableLike.filter$(TraversableLike.scala:395)
>     at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
>     at 
> org.apache.spark.sql.execution.CacheManager.recacheByCondition(CacheManager.scala:219)
>     at 
> org.apache.spark.sql.execution.CacheManager.uncacheQuery(CacheManager.scala:176)
>     at org.apache.spark.sql.Dataset.unpersist(Dataset.scala:3220)
>     at org.apache.spark.sql.Dataset.unpersist(Dataset.scala:3231){code}
> Looks like synchronization is required for 
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder#isCachedColumnBuffersLoaded
>  
> {code:scala}
> def isCachedColumnBuffersLoaded: Boolean = {
>   _cachedColumnBuffers != null && isCachedRDDLoaded
> }
>
> def isCachedRDDLoaded: Boolean = {
>   _cachedColumnBuffersAreLoaded || {
>     val bmMaster = SparkEnv.get.blockManager.master
>     val rddLoaded = _cachedColumnBuffers.partitions.forall { partition =>
>       bmMaster
>         .getBlockStatus(RDDBlockId(_cachedColumnBuffers.id, partition.index), false)
>         .exists { case (_, blockStatus) => blockStatus.isCached }
>     }
>     if (rddLoaded) {
>       _cachedColumnBuffersAreLoaded = rddLoaded
>     }
>     rddLoaded
>   }
> }{code}
> isCachedRDDLoaded relies on the _cachedColumnBuffers != null check, while 
> the field can be changed concurrently from another thread.
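For what it's worth, the racy shape can be reproduced outside Spark, and the 
usual repair is to read the mutable field once into a local; a self-contained 
sketch with hypothetical names (not the committed patch):

{code:scala}
// The real field lives in CachedRDDBuilder and holds an RDD[CachedBatch];
// a Vector stands in for it here.
class CachedBuilderSketch {
  @volatile private var _cachedColumnBuffers: Vector[String] = Vector("block-0")

  def clear(): Unit = { _cachedColumnBuffers = null }

  // Racy: the volatile field is read twice, so another thread can null it
  // between the null check and the second read, raising NullPointerException.
  def isLoadedRacy: Boolean =
    _cachedColumnBuffers != null && _cachedColumnBuffers.nonEmpty

  // Safer: capture the field once, then operate only on the local copy.
  def isLoadedSafe: Boolean = {
    val buffers = _cachedColumnBuffers
    buffers != null && buffers.nonEmpty
  }
}
{code}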



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-39104) Null Pointer Exception on unpersist call

2022-05-08 Thread Neil Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533546#comment-17533546
 ] 

Neil Gupta edited comment on SPARK-39104 at 5/9/22 12:06 AM:
-

Hi Denis, do you have reproduction steps? 


was (Author: neilagupta):
Do you have reproduction steps? 

> Null Pointer Exception on unpersist call
> ---
>
> Key: SPARK-39104
> URL: https://issues.apache.org/jira/browse/SPARK-39104
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.2.1
>Reporter: Denis
>Priority: Major
>
> DataFrame.unpersist call fails with an NPE
>  
> {code:java}
> java.lang.NullPointerException
>     at 
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder.isCachedRDDLoaded(InMemoryRelation.scala:247)
>     at 
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder.isCachedColumnBuffersLoaded(InMemoryRelation.scala:241)
>     at 
> org.apache.spark.sql.execution.CacheManager.$anonfun$uncacheQuery$8(CacheManager.scala:189)
>     at 
> org.apache.spark.sql.execution.CacheManager.$anonfun$uncacheQuery$8$adapted(CacheManager.scala:176)
>     at 
> scala.collection.TraversableLike.$anonfun$filterImpl$1(TraversableLike.scala:304)
>     at scala.collection.Iterator.foreach(Iterator.scala:943)
>     at scala.collection.Iterator.foreach$(Iterator.scala:943)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>     at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>     at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>     at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
>     at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:303)
>     at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:297)
>     at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
>     at scala.collection.TraversableLike.filter(TraversableLike.scala:395)
>     at scala.collection.TraversableLike.filter$(TraversableLike.scala:395)
>     at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
>     at 
> org.apache.spark.sql.execution.CacheManager.recacheByCondition(CacheManager.scala:219)
>     at 
> org.apache.spark.sql.execution.CacheManager.uncacheQuery(CacheManager.scala:176)
>     at org.apache.spark.sql.Dataset.unpersist(Dataset.scala:3220)
>     at org.apache.spark.sql.Dataset.unpersist(Dataset.scala:3231){code}
> Looks like synchronization is required for 
> org.apache.spark.sql.execution.columnar.CachedRDDBuilder#isCachedColumnBuffersLoaded
>  
> {code:scala}
> def isCachedColumnBuffersLoaded: Boolean = {
>   _cachedColumnBuffers != null && isCachedRDDLoaded
> }
>
> def isCachedRDDLoaded: Boolean = {
>   _cachedColumnBuffersAreLoaded || {
>     val bmMaster = SparkEnv.get.blockManager.master
>     val rddLoaded = _cachedColumnBuffers.partitions.forall { partition =>
>       bmMaster
>         .getBlockStatus(RDDBlockId(_cachedColumnBuffers.id, partition.index), false)
>         .exists { case (_, blockStatus) => blockStatus.isCached }
>     }
>     if (rddLoaded) {
>       _cachedColumnBuffersAreLoaded = rddLoaded
>     }
>     rddLoaded
>   }
> }{code}
> isCachedRDDLoaded relies on the _cachedColumnBuffers != null check, while 
> the field can be changed concurrently from another thread.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39091) SQL Expression traits don't compose due to nodePatterns being final.

2022-05-03 Thread Neil Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531282#comment-17531282
 ] 

Neil Gupta commented on SPARK-39091:


Tried to implement a fix: [https://github.com/apache/spark/pull/36441]. It 
might need to be expanded a little.

> SQL Expression traits don't compose due to nodePatterns being final.
> 
>
> Key: SPARK-39091
> URL: https://issues.apache.org/jira/browse/SPARK-39091
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0, 3.2.1
>Reporter: Huw
>Priority: Major
>
> In Spark 3.1 I have an expression which contains these parts:
> {code:scala}
> case class MyExploder(
>     arrays: Expression,    // Array[AnyDataType]
>     asOfDate: Expression,  // LambdaFunction[AnyDataType -> TimestampType]
>     extractor: Expression, // TimestampType
> ) extends HigherOrderFunction with Generator with TimeZoneAwareExpression {
>
>   override def arguments: Seq[Expression] =
>     Seq(arrays, asOfDate)
>
>   override def argumentTypes: Seq[AbstractDataType] =
>     Seq(ArrayType, TimestampType)
>
>   override def functions: Seq[Expression] =
>     Seq(extractor)
>
>   override def functionTypes =
>     Seq(TimestampType)
> }{code}
>  
> This is a grossly simplified example. The extractor is a lambda that gathers 
> information from a nested array and explodes based on some business logic.
> When upgrading to Spark 3.2, however, this no longer works, because the 
> traits declare conflicting final values for nodePatterns.
> {code:scala}
> trait HigherOrderFunction extends Expression with ExpectsInputTypes {
>   final override val nodePatterns: Seq[TreePattern] = Seq(HIGH_ORDER_FUNCTION)
> }{code}
>  
> We get this error.
> {noformat}
> value nodePatterns in trait TimeZoneAwareExpression of type 
> Seq[org.apache.spark.sql.catalyst.trees.TreePattern.TreePattern] cannot 
> override final member{noformat}
>  
> This blocks us from upgrading. What's doubly annoying is that the actual 
> value of the member appears to be the same.
>  
> Thank you for your time.
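The conflict is easy to reproduce outside Spark. A self-contained sketch with 
shortened, made-up trait names; the relaxed variant at the end shows one 
possible direction for a fix, not necessarily what the linked PR does:

{code:scala}
sealed trait TreePattern
case object HIGH_ORDER_FUNCTION extends TreePattern
case object TIME_ZONE_AWARE extends TreePattern

trait Expr { val nodePatterns: Seq[TreePattern] = Seq.empty }

trait HigherOrder extends Expr {
  // A final override: nothing downstream may redefine nodePatterns.
  final override val nodePatterns: Seq[TreePattern] = Seq(HIGH_ORDER_FUNCTION)
}

trait TimeZoneAware extends Expr {
  override val nodePatterns: Seq[TreePattern] = Seq(TIME_ZONE_AWARE)
}

// Fails to compile ("cannot override final member"), mirroring the report:
// class Exploder extends HigherOrder with TimeZoneAware

// One possible fix: drop `final` so the composing class can merge the
// patterns from both traits explicitly.
trait HigherOrderRelaxed extends Expr {
  override val nodePatterns: Seq[TreePattern] = Seq(HIGH_ORDER_FUNCTION)
}

class Exploder extends HigherOrderRelaxed with TimeZoneAware {
  override val nodePatterns: Seq[TreePattern] =
    Seq(HIGH_ORDER_FUNCTION, TIME_ZONE_AWARE)
}
{code}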



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-39091) SQL Expression traits don't compose due to nodePatterns being final.

2022-05-03 Thread Neil Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531282#comment-17531282
 ] 

Neil Gupta edited comment on SPARK-39091 at 5/3/22 4:28 PM:


Tried to implement a fix: [https://github.com/apache/spark/pull/36441]. It 
might need to be expanded to other traits; for now it only covers a few, 
especially those used in your example.


was (Author: neilagupta):
Tried to implement a fix: [https://github.com/apache/spark/pull/36441]. It 
might need to be expanded a little.

> SQL Expression traits don't compose due to nodePatterns being final.
> 
>
> Key: SPARK-39091
> URL: https://issues.apache.org/jira/browse/SPARK-39091
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0, 3.2.1
>Reporter: Huw
>Priority: Major
>
> In Spark 3.1 I have an expression which contains these parts:
> {code:scala}
> case class MyExploder(
>     arrays: Expression,    // Array[AnyDataType]
>     asOfDate: Expression,  // LambdaFunction[AnyDataType -> TimestampType]
>     extractor: Expression, // TimestampType
> ) extends HigherOrderFunction with Generator with TimeZoneAwareExpression {
>
>   override def arguments: Seq[Expression] =
>     Seq(arrays, asOfDate)
>
>   override def argumentTypes: Seq[AbstractDataType] =
>     Seq(ArrayType, TimestampType)
>
>   override def functions: Seq[Expression] =
>     Seq(extractor)
>
>   override def functionTypes =
>     Seq(TimestampType)
> }{code}
>  
> This is a grossly simplified example. The extractor is a lambda that gathers 
> information from a nested array and explodes based on some business logic.
> When upgrading to Spark 3.2, however, this no longer works, because the 
> traits declare conflicting final values for nodePatterns.
> {code:scala}
> trait HigherOrderFunction extends Expression with ExpectsInputTypes {
>   final override val nodePatterns: Seq[TreePattern] = Seq(HIGH_ORDER_FUNCTION)
> }{code}
>  
> We get this error.
> {noformat}
> value nodePatterns in trait TimeZoneAwareExpression of type 
> Seq[org.apache.spark.sql.catalyst.trees.TreePattern.TreePattern] cannot 
> override final member{noformat}
>  
> This blocks us from upgrading. What's doubly annoying is that the actual 
> value of the member appears to be the same.
>  
> Thank you for your time.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org