Hi,

I'm running a PoC OrangeFS setup and have read through all of the 
previous posts on performance. We are running 2.9.6 on Red Hat 7.4. The 
clients are using the kernel interface.

I'm currently running on one Power 750 server as the host (with 8 dual 
meta/data servers running). The clients are a mix of Intel and PPC64 
systems, all interconnected by InfiniBand DDR cards in connected mode.

The storage backend is a 4G FC-attached SSD chassis holding 20 250 GB SSD 
cards (not regular drives), of which 8 are assigned to meta and 8 are 
assigned to data.

The network tests well: 7ish Gb/s with no retries or errors. We are using 
bmi_tcp.

I can get great performance for large files, as expected, but when 
performing small-file operations the performance is significantly poorer.

For example, I can untar linux-4.13.3.tar.xz on the local filesystem 
in 14 seconds, while on OrangeFS it takes 10 minutes.

I can see the performance difference when playing with stripe sizes etc 
when copying monolithic files, but there seems to be a wall that gets hit 
when there is a lot of metadata activity.

I can see how the need for network back-and-forth could impact performance, 
but is it reasonable to see a roughly 43x performance drop (10 minutes 
versus 14 seconds) in such cases?
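
To put the untar timings in per-operation terms, here is a rough back-of-the-envelope sketch. The two timings come from the post; the entry count is an assumption (the post does not say how many files the tarball contains), so measure it with something like `tar -tf linux-4.13.3.tar.xz | wc -l` before trusting the per-entry figures.

```python
# Back-of-the-envelope numbers from the untar timings quoted in the post.
LOCAL_SECONDS = 14           # local untar time (from the post)
ORANGEFS_SECONDS = 10 * 60   # OrangeFS untar time (from the post)

# ASSUMPTION: approximate count of files and directories in the kernel tree.
# This figure is NOT from the post; replace it with a measured value.
ASSUMED_ENTRIES = 60_000


def slowdown(remote_seconds, local_seconds):
    """Ratio of remote to local wall-clock time."""
    return remote_seconds / local_seconds


def per_entry_ms(total_seconds, entries):
    """Average wall-clock milliseconds spent per extracted entry."""
    return 1000.0 * total_seconds / entries


if __name__ == "__main__":
    print(f"slowdown: {slowdown(ORANGEFS_SECONDS, LOCAL_SECONDS):.1f}x")
    print(f"OrangeFS: {per_entry_ms(ORANGEFS_SECONDS, ASSUMED_ENTRIES):.1f} ms per entry")
    print(f"local:    {per_entry_ms(LOCAL_SECONDS, ASSUMED_ENTRIES):.2f} ms per entry")
```

Under that assumed entry count, the OrangeFS run averages on the order of 10 ms per entry. Each extracted file involves several operations (create, write, set attributes, close), so the average cost per operation would be a fraction of that.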

Also, if I then try to compile the kernel on OrangeFS, it takes well 
over 2 hours, and most of the time is spent in I/O wait. Compiling locally 
takes about 20 minutes on the same server.

I've tried running over multiple hosts, running some meta-only and some 
data-only servers, and I've tried running with just 1 meta and 1 data server. 
I've applied all the system-level optimizations I've found and just cannot 
reasonably speed up the untar operation.

Maybe it's a client thing? There's not much I can see that's configurable 
about the client, though.

It seems to me that I should be able to get better performance, even on 
small-file operations, but I'm kinda stumped.

Am I just chasing unicorns, or is it possible to get usable performance for 
this sort of file activity (untarring, compiling, etc.)?
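
An untar mixes xz decompression with file creation, so a separate small-file microbenchmark may help isolate the metadata cost. The sketch below is a hypothetical helper, not an OrangeFS tool: it times create+write, stat, and unlink over many small files in a directory you choose. The file count and size are arbitrary defaults.

```python
import os
import sys
import tempfile
import time


def small_file_benchmark(directory, count=1000, size=4096):
    """Create, stat, and delete `count` files of `size` bytes in `directory`.

    Returns a dict mapping phase name to elapsed seconds.
    """
    payload = b"x" * size
    paths = [os.path.join(directory, f"f{i:06d}") for i in range(count)]
    timings = {}

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as fh:
            fh.write(payload)
    timings["create+write"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.stat(path)
    timings["stat"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.unlink(path)
    timings["unlink"] = time.perf_counter() - start

    return timings


if __name__ == "__main__":
    # Optional argument: a directory on the mount under test.
    # Without it, the system temp directory is used.
    base = sys.argv[1] if len(sys.argv) > 1 else None
    count = 1000
    with tempfile.TemporaryDirectory(dir=base) as workdir:
        for phase, seconds in small_file_benchmark(workdir, count).items():
            rate = count / max(seconds, 1e-9)
            print(f"{phase:>13}: {seconds:.3f} s ({rate:.0f} ops/s)")
```

Running it once against a local directory and once against a directory on the OrangeFS mount gives per-phase rates to compare, which should show whether creates, stats, or deletes dominate the gap.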

Config below

<Defaults>
        UnexpectedRequests 256
        EventLogging none
        EnableTracing no
        LogStamp datetime
        BMIModules bmi_tcp
        FlowModules flowproto_multiqueue
        PerfUpdateInterval 10000000
        ServerJobBMITimeoutSecs 30
        ServerJobFlowTimeoutSecs 60
        ClientJobBMITimeoutSecs 30
        ClientJobFlowTimeoutSecs 600
        ClientRetryLimit 5
        ClientRetryDelayMilliSecs 100
        PrecreateBatchSize 0,1024,1024,1024,32,1024,0
        PrecreateLowThreshold 0,256,256,256,16,256,0
        TroveMaxConcurrentIO 16
        <Security>
                TurnOffTimeouts yes
        </Security>
</Defaults>

<Aliases>
        Alias server01 tcp://server01:3334
        Alias server02 tcp://server01:3335
        Alias server03 tcp://server01:3336
        Alias server04 tcp://server01:3337
        Alias server05 tcp://server01:3338
        Alias server06 tcp://server01:3339
        Alias server07 tcp://server01:3340
        Alias server08 tcp://server01:3341
</Aliases>

<ServerOptions>
     Server server01
     DataStorageSpace /usr/local/storage/server01/data
     MetadataStorageSpace /usr/local/storage/server01/meta
     LogFile /var/log/orangefs-server-server01.log
</ServerOptions>

<ServerOptions>
     Server server02
     DataStorageSpace /usr/local/storage/server02/data
     MetadataStorageSpace /usr/local/storage/server02/meta
     LogFile /var/log/orangefs-server-server02.log
</ServerOptions>

<ServerOptions>
     Server server03
     DataStorageSpace /usr/local/storage/server03/data
     MetadataStorageSpace /usr/local/storage/server03/meta
     LogFile /var/log/orangefs-server-server03.log
</ServerOptions>
<ServerOptions>
     Server server04
     DataStorageSpace /usr/local/storage/server04/data
     MetadataStorageSpace /usr/local/storage/server04/meta
     LogFile /var/log/orangefs-server-server04.log
</ServerOptions>
<ServerOptions>
     Server server05
     DataStorageSpace /usr/local/storage/server05/data
     MetadataStorageSpace /usr/local/storage/server05/meta
     LogFile /var/log/orangefs-server-server05.log
</ServerOptions>
<ServerOptions>
     Server server06
     DataStorageSpace /usr/local/storage/server06/data
     MetadataStorageSpace /usr/local/storage/server06/meta
     LogFile /var/log/orangefs-server-server06.log
</ServerOptions>
<ServerOptions>
     Server server07
     DataStorageSpace /usr/local/storage/server07/data
     MetadataStorageSpace /usr/local/storage/server07/meta
     LogFile /var/log/orangefs-server-server07.log
</ServerOptions>
<ServerOptions>
     Server server08
     DataStorageSpace /usr/local/storage/server08/data
     MetadataStorageSpace /usr/local/storage/server08/meta
     LogFile /var/log/orangefs-server-server08.log
</ServerOptions>

<Filesystem>
        Name orangefs
        ID 146181131
        RootHandle 1048576
        FileStuffing yes
        FlowBufferSizeBytes 1048576
        FlowBuffersPerFlow  8
        DistrDirServersInitial 1
        DistrDirServersMax 1
        DistrDirSplitSize 100
        TreeThreshold 16
        <Distribution>
                Name simple_stripe
                Param strip_size
                Value 1048576
        </Distribution>
        <MetaHandleRanges>
                Range server01 3-576460752303423489
                Range server02 576460752303423490-1152921504606846976
                Range server03 1152921504606846977-1729382256910270463
                Range server04 1729382256910270464-2305843009213693950
                Range server05 2305843009213693951-2882303761517117437
                Range server06 2882303761517117438-3458764513820540924
                Range server07 3458764513820540925-4035225266123964411
                Range server08 4035225266123964412-4611686018427387898
        </MetaHandleRanges>
        <DataHandleRanges>
                Range server01 4611686018427387899-5188146770730811385
                Range server02 5188146770730811386-5764607523034234872
                Range server03 5764607523034234873-6341068275337658359
                Range server04 6341068275337658360-6917529027641081846
                Range server05 6917529027641081847-7493989779944505333
                Range server06 7493989779944505334-8070450532247928820
                Range server07 8070450532247928821-8646911284551352307
                Range server08 8646911284551352308-9223372036854775794
        </DataHandleRanges>
        <StorageHints>
                TroveSyncMeta yes
                TroveSyncData no
                TroveMethod alt-aio
                #DirectIOThreadNum 120
                #DirectIOOpsPerQueue 200
                #DBCacheSizeBytes 17179869184
                #AttrCacheSize 10037
                #AttrCacheMaxNumElems 2048
        </StorageHints>
</Filesystem>
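
When handle ranges are edited by hand, a quick sanity check can catch overlaps between servers. The following is a minimal sketch, assuming only the `Range <alias> <low>-<high>` line format shown in the config above; the function names are hypothetical. It reports gaps as well, but whether a gap matters to OrangeFS is not something this sketch asserts.

```python
import re

# Matches lines such as: "Range server01 3-576460752303423489"
RANGE_RE = re.compile(r"^\s*Range\s+(\S+)\s+(\d+)-(\d+)\s*$")


def parse_ranges(text):
    """Extract (alias, low, high) tuples from config text."""
    ranges = []
    for line in text.splitlines():
        match = RANGE_RE.match(line)
        if match:
            ranges.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return ranges


def check_ranges(ranges):
    """Return a list of human-readable findings (empty if none)."""
    findings = []
    for alias, low, high in ranges:
        if low > high:
            findings.append(f"{alias}: low {low} exceeds high {high}")
    ordered = sorted(ranges, key=lambda r: r[1])
    for (a1, _, high1), (a2, low2, _) in zip(ordered, ordered[1:]):
        if low2 <= high1:
            findings.append(f"overlap between {a1} and {a2}")
        elif low2 != high1 + 1:
            findings.append(f"gap between {a1} and {a2}")
    return findings


if __name__ == "__main__":
    # Two meta ranges copied from the config above; paste the rest to check all.
    sample = """
        Range server01 3-576460752303423489
        Range server02 576460752303423490-1152921504606846976
    """
    print(check_ranges(parse_ranges(sample)) or "no overlaps or gaps found")
```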





_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
