Hi,
I'm running a PoC OrangeFS setup and have read through all of the
previous posts on performance. We are running 2.9.6 on Red Hat 7.4. The
clients are using the kernel interface.
I'm currently running on one Power 750 server as host (with 8 dual
meta/data servers running). The clients are a mix of Intel and PPC64
systems, all interconnected by InfiniBand DDR cards in connected mode.
The storage backend is a 4G FC-attached SSD chassis holding 20 250 GB SSD
cards (not regular drives); 8 are assigned to meta and 8 are assigned to
data.
The network tests good: 7ish Gb/s with no retries or errors. We are using
bmi_tcp.
I can get great performance for large files, as expected, but small-file
operations perform significantly worse.
For example, I can untar linux-4.13.3.tar.xz on the local filesystem
in 14 seconds, while on OrangeFS it takes 10 minutes.
I can see the performance difference when playing with stripe sizes etc.
when copying monolithic files, but there seems to be a wall that gets hit
when there is a lot of metadata activity.
I can see how the network back-and-forth could impact performance,
but is it reasonable to see a 42x performance drop in such cases?
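For reference, the arithmetic behind that figure can be sketched as below.
The two timings are the ones quoted above; `file_count` is a hypothetical
placeholder (I have not counted the entries in the tarball) and is only
there to show how a rough per-file overhead estimate would fall out.

```python
def slowdown(local_seconds, remote_seconds):
    """Ratio of remote to local wall-clock time."""
    return remote_seconds / local_seconds


def per_file_overhead_ms(local_seconds, remote_seconds, file_count):
    """Extra wall-clock time per file, in milliseconds, spread evenly."""
    return (remote_seconds - local_seconds) / file_count * 1000.0


local_s = 14.0          # untar on the local filesystem (from the post)
orangefs_s = 10 * 60.0  # untar on OrangeFS, "10 minutes" (from the post)
file_count = 60_000     # ASSUMPTION: placeholder, not a measured value

print(f"slowdown: {slowdown(local_s, orangefs_s):.1f}x")
print(f"extra time per file: "
      f"{per_file_overhead_ms(local_s, orangefs_s, file_count):.1f} ms")
```

With the quoted timings the ratio comes out a little under 43x, so the
42x above is the same number rounded down.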
Also, if I then try to compile the kernel on OrangeFS, it takes well
over 2 hours, and most of the time is spent in I/O wait. Compiling locally
takes about 20 minutes on the same server.
I've tried running over multiple hosts, running some meta-only and some
data-only servers, and running with just 1 meta and 1 data server.
I've applied all the system-level optimizations I've found and just cannot
reasonably speed up the untar operation.
Maybe it's a client thing? There's not much I can see that's configurable
about the client, though.
It seems to me that I should be able to get better performance, even on
small-file operations, but I'm kinda stumped.
Am I just chasing unicorns, or is it possible to get usable performance for
this sort of file activity (untarring, compiling, etc.)?
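To take tar and the compiler out of the picture, a small-file
micro-benchmark along these lines could be run once against a local
directory and once against the OrangeFS mount. This is only a sketch: the
function name, the default file count and size, and the `BENCH_DIR`
environment variable are all made up for illustration, and it uses nothing
OrangeFS-specific, just ordinary file operations through the kernel
interface.

```python
import os
import tempfile
import time


def small_file_benchmark(directory, count=1000, size=4096):
    """Create, stat, and remove `count` files of `size` bytes.

    Returns a dict mapping phase name to elapsed seconds.
    """
    payload = b"x" * size
    paths = [os.path.join(directory, f"f{i:06d}") for i in range(count)]
    timings = {}

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as fh:
            fh.write(payload)
    timings["create"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.stat(path)
    timings["stat"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.remove(path)
    timings["remove"] = time.perf_counter() - start

    return timings


if __name__ == "__main__":
    # BENCH_DIR is a hypothetical variable: point it at a directory on
    # the mount under test. Without it, a local temp directory is used.
    target = os.environ.get("BENCH_DIR")
    if target:
        results = small_file_benchmark(target)
    else:
        with tempfile.TemporaryDirectory() as scratch:
            results = small_file_benchmark(scratch)
    for phase, seconds in results.items():
        print(f"{phase:>6}: {seconds:.3f} s")
```

Comparing the per-phase numbers between the two runs should show whether
the time goes into creates, lookups, or removes rather than into moving
data.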
Config below:
<Defaults>
UnexpectedRequests 256
EventLogging none
EnableTracing no
LogStamp datetime
BMIModules bmi_tcp
FlowModules flowproto_multiqueue
PerfUpdateInterval 10000000
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 60
ClientJobBMITimeoutSecs 30
ClientJobFlowTimeoutSecs 600
ClientRetryLimit 5
ClientRetryDelayMilliSecs 100
PrecreateBatchSize 0,1024,1024,1024,32,1024,0
PrecreateLowThreshold 0,256,256,256,16,256,0
TroveMaxConcurrentIO 16
<Security>
TurnOffTimeouts yes
</Security>
</Defaults>
<Aliases>
Alias server01 tcp://server01:3334
Alias server02 tcp://server01:3335
Alias server03 tcp://server01:3336
Alias server04 tcp://server01:3337
Alias server05 tcp://server01:3338
Alias server06 tcp://server01:3339
Alias server07 tcp://server01:3340
Alias server08 tcp://server01:3341
</Aliases>
<ServerOptions>
Server server01
DataStorageSpace /usr/local/storage/server01/data
MetadataStorageSpace /usr/local/storage/server01/meta
LogFile /var/log/orangefs-server-server01.log
</ServerOptions>
<ServerOptions>
Server server02
DataStorageSpace /usr/local/storage/server02/data
MetadataStorageSpace /usr/local/storage/server02/meta
LogFile /var/log/orangefs-server-server02.log
</ServerOptions>
<ServerOptions>
Server server03
DataStorageSpace /usr/local/storage/server03/data
MetadataStorageSpace /usr/local/storage/server03/meta
LogFile /var/log/orangefs-server-server03.log
</ServerOptions>
<ServerOptions>
Server server04
DataStorageSpace /usr/local/storage/server04/data
MetadataStorageSpace /usr/local/storage/server04/meta
LogFile /var/log/orangefs-server-server04.log
</ServerOptions>
<ServerOptions>
Server server05
DataStorageSpace /usr/local/storage/server05/data
MetadataStorageSpace /usr/local/storage/server05/meta
LogFile /var/log/orangefs-server-server05.log
</ServerOptions>
<ServerOptions>
Server server06
DataStorageSpace /usr/local/storage/server06/data
MetadataStorageSpace /usr/local/storage/server06/meta
LogFile /var/log/orangefs-server-server06.log
</ServerOptions>
<ServerOptions>
Server server07
DataStorageSpace /usr/local/storage/server07/data
MetadataStorageSpace /usr/local/storage/server07/meta
LogFile /var/log/orangefs-server-server07.log
</ServerOptions>
<ServerOptions>
Server server08
DataStorageSpace /usr/local/storage/server08/data
MetadataStorageSpace /usr/local/storage/server08/meta
LogFile /var/log/orangefs-server-server08.log
</ServerOptions>
<Filesystem>
Name orangefs
ID 146181131
RootHandle 1048576
FileStuffing yes
FlowBufferSizeBytes 1048576
FlowBuffersPerFlow 8
DistrDirServersInitial 1
DistrDirServersMax 1
DistrDirSplitSize 100
TreeThreshold 16
<Distribution>
Name simple_stripe
Param strip_size
Value 1048576
</Distribution>
<MetaHandleRanges>
Range server01 3-576460752303423489
Range server02 576460752303423490-1152921504606846976
Range server03 1152921504606846977-1729382256910270463
Range server04 1729382256910270464-2305843009213693950
Range server05 2305843009213693951-2882303761517117437
Range server06 2882303761517117438-3458764513820540924
Range server07 3458764513820540925-4035225266123964411
Range server08 4035225266123964412-4611686018427387898
</MetaHandleRanges>
<DataHandleRanges>
Range server01 4611686018427387899-5188146770730811385
Range server02 5188146770730811386-5764607523034234872
Range server03 5764607523034234873-6341068275337658359
Range server04 6341068275337658360-6917529027641081846
Range server05 6917529027641081847-7493989779944505333
Range server06 7493989779944505334-8070450532247928820
Range server07 8070450532247928821-8646911284551352307
Range server08 8646911284551352308-9223372036854775794
</DataHandleRanges>
<StorageHints>
TroveSyncMeta yes
TroveSyncData no
TroveMethod alt-aio
#DirectIOThreadNum 120
#DirectIOOpsPerQueue 200
#DBCacheSizeBytes 17179869184
#AttrCacheSize 10037
#AttrCacheMaxNumElems 2048
</StorageHints>
</Filesystem>
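As a sanity check on the handle ranges above, a short script like the
following confirms that consecutive ranges are contiguous and equally
sized. It is a sketch only: the parser just matches `Range <alias> <a>-<b>`
lines as they appear in this config, the function names are mine, and I am
not claiming that the server itself requires equal sizes. Only the meta
ranges are pasted in to keep it short.

```python
import re

RANGE_RE = re.compile(r"^\s*Range\s+(\S+)\s+(\d+)-(\d+)\s*$")

META_RANGES = """
Range server01 3-576460752303423489
Range server02 576460752303423490-1152921504606846976
Range server03 1152921504606846977-1729382256910270463
Range server04 1729382256910270464-2305843009213693950
Range server05 2305843009213693951-2882303761517117437
Range server06 2882303761517117438-3458764513820540924
Range server07 3458764513820540925-4035225266123964411
Range server08 4035225266123964412-4611686018427387898
"""


def parse_ranges(text):
    """Return a list of (alias, start, end) tuples from 'Range' lines."""
    ranges = []
    for line in text.splitlines():
        match = RANGE_RE.match(line)
        if match:
            alias, start, end = match.groups()
            ranges.append((alias, int(start), int(end)))
    return ranges


def check_ranges(ranges):
    """Return a list of human-readable problems; empty if none found."""
    problems = []
    sizes = {end - start + 1 for _, start, end in ranges}
    if len(sizes) > 1:
        problems.append(f"range sizes differ: {sorted(sizes)}")
    for (a1, _, e1), (a2, s2, _) in zip(ranges, ranges[1:]):
        if s2 != e1 + 1:
            problems.append(
                f"{a1} -> {a2}: expected start {e1 + 1}, found {s2}")
    return problems


if __name__ == "__main__":
    issues = check_ranges(parse_ranges(META_RANGES))
    print("ok" if not issues else "\n".join(issues))
```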