[jira] [Commented] (SPARK-42834) Divided by zero occurs in PushBasedFetchHelper.createChunkBlockInfosFromMetaResponse

2023-03-17 Thread Li Ying (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702050#comment-17702050
 ] 

Li Ying commented on SPARK-42834:
-

[~csingh] Thanks for the help. I will take this fix :)

> Divided by zero occurs in PushBasedFetchHelper.createChunkBlockInfosFromMetaResponse
>
>                 Key: SPARK-42834
>                 URL: https://issues.apache.org/jira/browse/SPARK-42834
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 3.2.0
>            Reporter: Li Ying
>            Priority: Major
>
> Sometimes, when running a SQL job with push-based shuffle, the exception
> below occurs. It seems that the bitmaps array, which stores the merged chunk
> metadata, contains no elements.
> Is the bug that we should not call createChunkBlockInfosFromMetaResponse when
> bitmaps is empty, or should bitmaps never be empty here?
>  
> {code:java}
> Caused by: java.lang.ArithmeticException: / by zero
>     at org.apache.spark.storage.PushBasedFetchHelper.createChunkBlockInfosFromMetaResponse(PushBasedFetchHelper.scala:117)
>     at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:980)
>     at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:84)
>  {code}
> related code:
> {code:java}
> def createChunkBlockInfosFromMetaResponse(
>     shuffleId: Int,
>     shuffleMergeId: Int,
>     reduceId: Int,
>     blockSize: Long,
>     bitmaps: Array[RoaringBitmap]): ArrayBuffer[(BlockId, Long, Int)] = {
>   val approxChunkSize = blockSize / bitmaps.length
>   val blocksToFetch = new ArrayBuffer[(BlockId, Long, Int)]()
>   for (i <- bitmaps.indices) {
>     val blockChunkId = ShuffleBlockChunkId(shuffleId, shuffleMergeId, reduceId, i)
>     chunksMetaMap.put(blockChunkId, bitmaps(i))
>     logDebug(s"adding block chunk $blockChunkId of size $approxChunkSize")
>     blocksToFetch += ((blockChunkId, approxChunkSize, SHUFFLE_PUSH_MAP_ID))
>   }
>   blocksToFetch
> } {code}
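
For reference, the failing line can be reproduced in isolation. The sketch below is an illustration only: ApproxChunkSizeDemo and approxChunkSize are hypothetical names, and the bitmaps array is replaced by a plain chunk count, since only its length takes part in the division.

```scala
import scala.util.Try

object ApproxChunkSizeDemo {
  // Mirrors only the division from the quoted method. numChunks stands in
  // for bitmaps.length (an assumption made to keep the sketch dependency-free).
  def approxChunkSize(blockSize: Long, numChunks: Int): Long =
    blockSize / numChunks

  def main(args: Array[String]): Unit = {
    // Four chunks: integer division works as expected.
    println(approxChunkSize(1024L, 4)) // prints 256

    // Zero chunks: integer division throws java.lang.ArithmeticException,
    // which Try captures as a Failure instead of crashing the demo.
    println(Try(approxChunkSize(1024L, 0)))
  }
}
```

Because blockSize is a Long and the length is an Int, this is integer division, so a zero length throws rather than producing Infinity as floating-point division would.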



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42834) Divided by zero occurs in PushBasedFetchHelper.createChunkBlockInfosFromMetaResponse

2023-03-17 Thread Chandni Singh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17701908#comment-17701908
 ] 

Chandni Singh commented on SPARK-42834:
---

We don't expect `numChunks` to be zero or `bitmaps` to be empty. There was
a bug in 3.2.0 that was fixed in
https://issues.apache.org/jira/browse/SPARK-37675
Could you please check whether your build includes this fix?
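
If the expectation above is ever violated, one defensive option is to check for an empty array before dividing and report the problem explicitly. This is only a sketch under stated assumptions, not the change made in SPARK-37675: ChunkInfoSketch, ChunkId, and createChunkInfos are hypothetical names, the bitmap type is left generic to avoid a RoaringBitmap dependency, and the chunksMetaMap bookkeeping and map ID from the real method are omitted.

```scala
import scala.collection.mutable.ArrayBuffer

object ChunkInfoSketch {
  // Hypothetical stand-in for Spark's ShuffleBlockChunkId; not the real class.
  final case class ChunkId(shuffleId: Int, shuffleMergeId: Int, reduceId: Int, chunkId: Int)

  // Returns Left with a diagnostic message when there are no chunks, so the
  // caller has to decide how to react instead of hitting "/ by zero".
  def createChunkInfos[B](
      shuffleId: Int,
      shuffleMergeId: Int,
      reduceId: Int,
      blockSize: Long,
      bitmaps: Array[B]): Either[String, ArrayBuffer[(ChunkId, Long)]] = {
    if (bitmaps.isEmpty) {
      Left(s"no chunk bitmaps for shuffle $shuffleId, merge $shuffleMergeId, reduce $reduceId")
    } else {
      // Safe: bitmaps.length is at least 1 on this branch.
      val approxChunkSize = blockSize / bitmaps.length
      val blocksToFetch = new ArrayBuffer[(ChunkId, Long)]()
      for (i <- bitmaps.indices) {
        blocksToFetch += ((ChunkId(shuffleId, shuffleMergeId, reduceId, i), approxChunkSize))
      }
      Right(blocksToFetch)
    }
  }
}
```

Returning an Either rather than an empty buffer is a deliberate choice in this sketch: silently returning nothing could hide missing shuffle data, whereas a Left makes the unexpected state visible to the caller.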




[jira] [Commented] (SPARK-42834) Divided by zero occurs in PushBasedFetchHelper.createChunkBlockInfosFromMetaResponse

2023-03-17 Thread Li Ying (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17701610#comment-17701610
 ] 

Li Ying commented on SPARK-42834:
-

[~csingh] Could you please help confirm this?
