[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959407#comment-16959407 ]
Li Cheng commented on HDDS-2356:
--------------------------------

Quick update. I tried giving Ozone more handlers (roughly 10x more) and stopped seeing this error. See the properties below. However, writing now fails because no more blocks can be allocated. I guess my cluster cannot keep up with the write load.

<property>
  <name>ozone.scm.handler.count.key</name>
  <value>128</value>
  <tag>OZONE, MANAGEMENT, PERFORMANCE</tag>
  <description>
    The number of RPC handler threads for each SCM service endpoint.
    The default is appropriate for small clusters (tens of nodes).
    Set a value that is appropriate for the cluster size. Generally,
    HDFS recommends that the RPC handler count be set to
    20 * log2(cluster size) with an upper limit of 200. However, SCM
    will not have the same amount of traffic as the Namenode, so a
    much smaller value will work well too.
  </description>
</property>

<property>
  <name>ozone.om.handler.count.key</name>
  <value>256</value>
  <tag>OM, PERFORMANCE</tag>
  <description>
    The number of RPC handler threads for OM service endpoints.
  </description>
</property>

<property>
  <name>dfs.container.ratis.num.container.op.executors</name>
  <value>128</value>
  <tag>OZONE, RATIS, PERFORMANCE</tag>
  <description>
    Number of executors that will be used by Ratis to execute
    container ops (10 by default).
  </description>
</property>

<property>
  <name>dfs.container.ratis.num.write.chunk.threads</name>
  <value>512</value>
  <tag>OZONE, RATIS, PERFORMANCE</tag>
  <description>
    Maximum number of threads in the thread pool that Ratis will use
    for writing chunks (60 by default).
  </description>
</property>
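As a rough sanity check on the sizing rule quoted in the SCM handler description above, here is a small Java sketch of the 20 * log2(cluster size) heuristic with the 200-thread cap. This is not Ozone code; the class, the method name, and the floor for tiny clusters are hypothetical, for illustration only.

import static java.lang.Math.log;
import static java.lang.Math.min;
import static java.lang.Math.round;

public class HandlerCountEstimate {

    // HDFS rule of thumb from the property description above:
    // handlers = 20 * log2(clusterSize), capped at 200.
    static int recommendedHandlerCount(int clusterSize) {
        if (clusterSize < 2) {
            return 20; // floor for a single-node cluster (my assumption)
        }
        double log2 = log(clusterSize) / log(2);
        return (int) min(200, round(20 * log2));
    }

    public static void main(String[] args) {
        // For the 4-VM cluster in this report: 20 * log2(4) = 40,
        // so the 128 configured above is already well past the guideline.
        System.out.println(recommendedHandlerCount(4));    // 40
        System.out.println(recommendedHandlerCount(100));  // 133
        System.out.println(recommendedHandlerCount(5000)); // capped at 200
    }
}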
> Multipart upload report errors while writing to ozone Ratis pipeline
> --------------------------------------------------------------------
>
>                 Key: HDDS-2356
>                 URL: https://issues.apache.org/jira/browse/HDDS-2356
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Manager
>    Affects Versions: 0.4.1
>         Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM
>            Reporter: Li Cheng
>            Assignee: Bharat Viswanadham
>            Priority: Blocker
>             Fix For: 0.5.0
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call it VM0.
>
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone on a path on VM0, reading data from VM0's local disk and writing to the mount path. The dataset contains ~50,000 files of various sizes, from 0 bytes up to GB-scale.
>
> Writing is slow (~10 minutes per GB) and stops after around 4 GB. Looking at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors related to multipart upload. This error eventually causes the writing to terminate and OM to shut down.
>
> 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with exit status 2: OMDoubleBuffer flush thread OMDoubleBufferFlushThread encountered Throwable error
> java.util.ConcurrentModificationException
>     at java.util.TreeMap.forEach(TreeMap.java:1004)
>     at org.apache.hadoop.ozone.om.helpers.OmMultipartKeyInfo.getProto(OmMultipartKeyInfo.java:111)
>     at org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:38)
>     at org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:31)
>     at org.apache.hadoop.hdds.utils.db.CodecRegistry.asRawData(CodecRegistry.java:68)
>     at org.apache.hadoop.hdds.utils.db.TypedTable.putWithBatch(TypedTable.java:125)
>     at org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCommitPartResponse.addToDBBatch(S3MultipartUploadCommitPartResponse.java:112)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$flushTransactions$0(OzoneManagerDoubleBuffer.java:137)
>     at java.util.Iterator.forEachRemaining(Iterator.java:116)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:135)
>     at java.lang.Thread.run(Thread.java:745)
> 2019-10-24 16:01:59,629 [shutdown-hook-0] INFO - SHUTDOWN_MSG:
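For what it's worth, the stack trace points at TreeMap.forEach detecting a structural change mid-iteration: the OMDoubleBufferFlushThread is serializing an OmMultipartKeyInfo (whose part list is backed by a TreeMap) while, presumably, another thread is still adding parts to the same object. Below is a minimal standalone sketch of that mechanism; the class name is hypothetical, and it triggers the modification from inside the lambda so the failure is deterministic, whereas in OM the competing writer is a separate thread and the crash is intermittent.

import java.util.TreeMap;

public class TreeMapCmeDemo {
    public static void main(String[] args) {
        TreeMap<Integer, String> parts = new TreeMap<>();
        parts.put(1, "part-1");
        parts.put(2, "part-2");

        // TreeMap.forEach records modCount up front and re-checks it after
        // visiting each entry; any structural modification during the
        // traversal throws ConcurrentModificationException. In OM the
        // modification comes from another thread, so the same exception
        // surfaces only under concurrent multipart commits.
        parts.forEach((k, v) -> parts.put(k + 10, v)); // throws CME
    }
}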