Hi Niu,

This is a good review of the scalability of connections, but there are some questions. I have now cc'd lustre-devel to get the discussion in the open.
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Niu YaWei
> Sent: Wednesday, November 15, 2006 8:29 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: [Arch] scalability study: single client _CONNECTS_ to a very
> large number of OSS servers
>
> Review form:
>
> 1. Use case identifier: single client _CONNECTS_ to a very large
>    number of OSS servers
>
> 2. Link to architectural information: None
>
> 3. HLD available: YES
>
> 4. Patterns of basic operations:
>    a. RPCs:
>       - One OST_CONNECT RPC for each OST.
>    b. fs/obd other methods:
>       - obd_connect.
>    c. cache: None.
>    d. Lustre & Linux locks: No suspect locks.
>    e. lists, arrays, queues @runtime:
>       - obd array obd_devs: the maximum device count is 8k, so the
>         OSC count must be less than 8k. I think it's enough.

Nope - we want this to be far more scalable than 8K OSCs. Last week we heard that Evan Felix ran with 4000 OSCs (getting a whopping 130GB/sec read from Lustre!). The array needs to go away; a sketch of a hash-based replacement is in the P.P.S. below. I think Nathan is already working on this for the load simulator, btw.

Hmm, I don't see a server-side consideration of this problem. Am I missing something?

>       - qos_add_tgt() will search and maintain the lq_oss_list; this
>         list grows as the OSS count grows.
>       - Need to search for the connection in imp_conn_list, but this
>         list is quite small and will never grow.
>    f. startup data: None.
>
> 5. Scalable use pattern of basic operations:
>    - One client performs a mount.
>    - MDS setup.

Is connect also used against the management server?

> 6. Scalability measures:
>    - The number of OST_CONNECT RPCs is N (the OST count); since the
>      RPCs are sent asynchronously, this runs in O(1) time.
>    - Unless we are going to build a cluster with more than 8k OSTs,
>      we can't run out of obd_devs.
>    - The time complexity of qos_add_tgt() is O(N), and it should only
>      happen when the MDS connects to an OSS, so there is no need to
>      improve it.

On the server side, isn't there a scan of the list of existing connections to see whether the UUID of a connection is already in the list? Isn't that list O(N) long? If so, isn't the total scanning cost O(N^2)? (See the P.S. below for a toy model of this cost.)

Eeb - can you confirm one more time that connection setup, which is likely to happen at this point in LNET, has no linear scans?

> 7. Experiment description and findings:
>    - No test for it.

Nathan - will the load simulator do this? I think it could even be used over the net?

> 8. Recommendations for improvements:
>    - No recommendation on implementation improvements.

A. Kill the array (P2)
B. Fix the searching on the server (P1)

> 9. Non scalable issues encountered, not identified by this process:
>    - The qos lists are useless for the client's lov, but we have to
>      set them up since the MDS and the client use the same lov driver;
>      this needless list maintenance work burdens each client mount,
>      and we should avoid it.

Hmm. But this will change in the future when the client has a full WB cache, so let's leave them in. Does the setup scale well?

- Peter -
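
P.S. To make the O(N^2) worry above concrete, here is a toy C model of a connect path that scans a linear connection list for a known UUID. The names (svc_conn, svc_find_conn, svc_connect) are illustrative only, not the actual Lustre/LNET symbols; the point is only the cost of the scan:

/*
 * Toy model of the server-side connect path questioned above.
 * Each connect scans the list of existing connections to check
 * whether the UUID is already known, so accepting N connections
 * costs 1 + 2 + ... + (N-1) comparisons, i.e. O(N^2) overall.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct svc_conn {
        char             uuid[40];
        struct svc_conn *next;
};

static struct svc_conn *conn_list;      /* grows with every client */
static unsigned long    scan_steps;     /* total comparisons done  */

/* O(length of conn_list): the linear scan in question. */
static struct svc_conn *svc_find_conn(const char *uuid)
{
        struct svc_conn *c;

        for (c = conn_list; c != NULL; c = c->next) {
                scan_steps++;
                if (strcmp(c->uuid, uuid) == 0)
                        return c;
        }
        return NULL;
}

static void svc_connect(const char *uuid)
{
        if (svc_find_conn(uuid) != NULL)
                return;                 /* reconnect: already known */

        struct svc_conn *c = malloc(sizeof(*c));
        snprintf(c->uuid, sizeof(c->uuid), "%s", uuid);
        c->next = conn_list;
        conn_list = c;
}

int main(void)
{
        char uuid[40];
        int n = 8192;                   /* pretend 8k clients connect */

        for (int i = 0; i < n; i++) {
                snprintf(uuid, sizeof(uuid), "client-%08d_UUID", i);
                svc_connect(uuid);
        }
        /* n*(n-1)/2 comparisons: ~33.5M for 8k connects. */
        printf("%d connects cost %lu list-scan steps\n", n, scan_steps);
        return 0;
}

Running this prints about 33.5 million scan steps for 8k connects; at 64k clients it is over 2 billion. If the connections are instead kept in a hash keyed by UUID, each connect drops to O(1) expected time and the quadratic total disappears.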
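
P.P.S. In the same spirit, a sketch of what recommendation A could look like: the fixed 8k obd_devs[] array replaced by a chained hash keyed by device name. Again, obd_dev, obd_find, obd_add and friends are made-up names, not the real Lustre interfaces, and real code would also need locking and a resize policy; this only shows the shape of the data-structure change:

/*
 * Sketch of recommendation A: replace the fixed obd_devs[] array
 * (and its bounded-size linear scans) with a chained hash table
 * keyed by device name, so there is no hard 8k device cap and
 * lookup is O(1) expected instead of O(array size).
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define OBD_HASH_BUCKETS 4096           /* buckets, not a device cap */

struct obd_dev {
        char            name[64];
        struct obd_dev *hash_next;
};

static struct obd_dev *obd_hash[OBD_HASH_BUCKETS];

static unsigned int obd_hash_fn(const char *name)
{
        unsigned int h = 5381;          /* djb2 string hash */

        while (*name)
                h = h * 33 + (unsigned char)*name++;
        return h % OBD_HASH_BUCKETS;
}

/* O(1) expected, vs. scanning a fixed 8k-entry array. */
static struct obd_dev *obd_find(const char *name)
{
        struct obd_dev *d = obd_hash[obd_hash_fn(name)];

        for (; d != NULL; d = d->hash_next)
                if (strcmp(d->name, name) == 0)
                        return d;
        return NULL;
}

static struct obd_dev *obd_add(const char *name)
{
        unsigned int b = obd_hash_fn(name);
        struct obd_dev *d = calloc(1, sizeof(*d));

        snprintf(d->name, sizeof(d->name), "%s", name);
        d->hash_next = obd_hash[b];
        obd_hash[b] = d;
        return d;
}

int main(void)
{
        char name[64];

        /* 64k OSCs: impossible with a fixed 8k array, cheap here. */
        for (int i = 0; i < 65536; i++) {
                snprintf(name, sizeof(name), "OSC_srv_%05d", i);
                obd_add(name);
        }
        printf("OSC_srv_12345 %s\n",
               obd_find("OSC_srv_12345") ? "found" : "missing");
        return 0;
}

The same structure would serve the server-side connection-by-UUID lookup in recommendation B.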
