[
https://issues.apache.org/jira/browse/KAFKA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17926083#comment-17926083
]
Hasil Sharma commented on KAFKA-18753:
--------------------------------------
> Increasing memory heap size to allocate more indexes, and increasing the
> remote index cache size to reduce the rate of cache removal may help here. In
> general tiered storage requires a bit more room (more so on the fetching
> side) to operate efficiently.
Thanks for this insight. I will increase the heap size and check whether that
helps.
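Concretely, a sketch of what we plan to try (the heap values are placeholders we still need to tune for our workload, not recommendations):
{code}
# Environment for kafka-server-start.sh: raise the broker heap
# (8 GiB here is an illustrative placeholder, not a measured value)
export KAFKA_HEAP_OPTS="-Xms8g -Xmx8g"
{code}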
> Could you confirm there are consumers reading from beginning? (e.g. check
> [RemoteFetchBytesPerSec|https://kafka.apache.org/documentation/#tiered_storage_monitoring]
> metric)
We have not changed the local retention on the topics, so all consumer fetches
are served from the brokers' local disks rather than from S3. We did find,
however, that to serve a ListOffsets request the broker fetches the index/chunk
from S3.
> Is the remote index cache size tuned or default (1GB)?
The remote index cache size is set to 512 MB. We began with the default 1 GB,
but that resulted in too many open files and Kafka started to run out of file
descriptors.
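The file-descriptor pressure follows from the cache size: each cached remote segment keeps its index files on local disk, so the open-file count scales with the cache. A rough sketch of that arithmetic (the per-segment index size and files-per-segment count are illustrative assumptions, not measured values):

```python
def estimate_cached_index_files(cache_bytes: int,
                                index_bytes_per_segment: int,
                                files_per_segment: int = 3) -> int:
    """Rough estimate of files held by the remote index cache.

    Assumes each cached segment contributes a fixed number of index
    files (e.g. offset, time, and transaction indexes) and that each
    segment's index data is roughly the same size -- both assumptions
    are illustrative, not taken from Kafka internals.
    """
    segments = cache_bytes // index_bytes_per_segment
    return segments * files_per_segment

# Default 1 GiB cache vs. our 512 MiB setting, assuming ~1 MiB of
# index data per cached segment: halving the cache roughly halves
# the number of files the cache keeps on disk.
print(estimate_cached_index_files(1 << 30, 1 << 20))    # 3072
print(estimate_cached_index_files(512 << 20, 1 << 20))  # 1536
```

Under these (assumed) numbers the default cache would hold roughly twice as many files as our 512 MB setting, which matches the file-descriptor exhaustion we observed.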
Additional context: we are using the
[https://github.com/Aiven-Open/tiered-storage-for-apache-kafka] implementation
of the RemoteStorageManager.
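For completeness, the broker-side wiring looks roughly like the following (property names are Kafka's tiered-storage configs and the class name is as documented by the Aiven plugin; the class path and cache size shown are our values, which may not suit other deployments):
{code}
remote.log.storage.system.enable=true
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.storage.manager.class.path=/path/to/tiered-storage-plugin/*
# Our non-default remote index cache size in bytes (512 MB); the default is 1 GiB
remote.log.index.file.cache.total.size.bytes=536870912
{code}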
> Enabling S3 Tiered Storage Causes: A fatal error has been detected by the
> Java Runtime Environment
> --------------------------------------------------------------------------------------------------
>
> Key: KAFKA-18753
> URL: https://issues.apache.org/jira/browse/KAFKA-18753
> Project: Kafka
> Issue Type: Bug
> Components: Tiered-Storage
> Affects Versions: 3.8.1
> Environment: Current:
> Linux 6.8.0-1021-aws #23-Ubuntu SMP Mon Dec 9 23:59:34 UTC 2024 x86_64
> x86_64 x86_64 GNU/Linux
> OpenJDK Runtime Environment Corretto-17.0.14.7.1 (17.0.14+7) (build
> 17.0.14+7-LTS)
> Reporter: Hasil Sharma
> Priority: Major
> Attachments: hs_err_pid2775295 - redacted full.log
>
>
> Allowing brokers to upload to S3 as part of the S3 tiered storage rollout
> occasionally results in fatal errors of the following shape:
> {code:java}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x000075a38ea42564, pid=2775295, tid=2901446
> #
> # JRE version: OpenJDK Runtime Environment Corretto-17.0.14.7.1 (17.0.14+7)
> (build 17.0.14+7-LTS)
> # Java VM: OpenJDK 64-Bit Server VM Corretto-17.0.14.7.1 (17.0.14+7-LTS,
> mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc,
> linux-amd64)
> # Problematic frame:
> # J 26432 c2
> org.apache.kafka.storage.internals.log.AbstractIndex.binarySearch(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;Lorg/apache/kafka/storage/internals/log/AbstractIndex$SearchResultType;II)I
> (161 bytes) @ 0x000075a38ea42564 [0x000075a38ea421c0+0x00000000000003a4]
> #
> # Core dump will be written. Default location: Core dumps may be processed
> with "/usr/local/bin/crash-handler -b '%e' -m 1 -d /pay/crash -p '%u.%p.%t'
> -P '%P'" (or dumping to
> /pay/deploy/kafka-brokers-kafkapub-northwest-green/deploy-1737677684489251978/core.2775295)
> #
> # If you would like to submit a bug report, please visit:
> # https://github.com/corretto/corretto-17/issues/
> # {code}
>
> We ran into a similar error with JDK 11 and upgraded to JDK 17, but the
> error has not stopped.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)