Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the 
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=11013



(From update of attachment 9386)
>+lustre_hash_delitem_nolock(struct lustre_class_hash_body *hash_body, 
>+                           int hashent, struct hlist_node * hash_item)
>+{
>+        struct lustre_hash_operations *hop = hash_body->lchb_hash_operations;
>+
>+        if (hash_item == NULL) {
>+                RETURN(-ENOENT);
>+        }

No need for braces here.

>+
>+        hlist_del(hash_item);
>+
>+        hop->lustre_hash_object_refcount_put(hash_item);
>+
>+#ifdef LUSTRE_HASH_DEBUG
>+        hash_body->lchb_hash_tables[hashent]->lhb_item_count--;
>+        CDEBUG(D_INFO, "hashname[%s] bucket[%d] has [%d] hashitem\n", 
>+                        hash_body->hashname, hashent, 
>+                        hash_body->lchb_hash_tables[hashent]->lhb_item_count);
>+#endif

You are referencing a struct that may already be freed by the _put (if that was
the last reference).

\>@@ -655,41 +654,38 @@ int target_handle_connect(struct ptlrpc_
>                 goto dont_check_exports;
> 
>         spin_lock(&target->obd_dev_lock);
>+        export = lustre_hash_get_object_by_key(target->obd_uuid_hash_body, 
>&cluuid);
> 
>+        if (export != NULL && export->exp_connecting) { /* bug 9635, et. al. 
>*/
>+        } else if (export != NULL) {
>+        } else {
>                 export = NULL;
>         }

Isn't export == NULL in this case already?

>+int lustre_hash_additem_unique(struct lustre_class_hash_body *hash_body, 
>+                               void *key, struct hlist_node *actual_hnode)
>+{
>+        int hashent;
>+        struct lustre_hash_bucket *bucket = NULL;
>+        struct lustre_hash_operations *hop = hash_body->lchb_hash_operations;
>+        ENTRY;
>+
>+        hashent = hop->lustre_hashfn(hash_body, key);
>+
>+        /* get the hash-bucket and lock it */
>+        bucket = &hash_body->lchb_hash_tables[hashent];
>+        spin_lock(&bucket->lhb_lock);
>+
>+        if ( (lustre_hash_getitem_in_bucket_nolock(hash_body, hashent, key)) 
>!= NULL) {
>+                /* the added-item exist in hashtables, so cannot add it again 
>*/
>+                spin_unlock(&bucket->lhb_lock);
>+
>+                CWARN("Object exist\n");

This isn't a useful error message.  Either there are already error messages in
the caller (that may already be true), or this needs to be made more useful
like "Already found {key} in {hashname}, not adding duplicate".

>+void * lustre_hash_get_object_by_key(struct lustre_class_hash_body *hash_body,
>+                                     void *key)
>+{
>+        if (hash_item_hnode == NULL) {
>+                spin_unlock(&bucket->lhb_lock); /* lock the bucket */
>+                CERROR("Object don't exist\n");

Again, not a very useful error message.  How often do we expect this to be hit?
 If very often at all, then it probably shouldn't go to the console.

>+}
>+EXPORT_SYMBOL(lustre_hash_get_object_by_key);

What happens if this is called when there are multiple of the same keys in the
hash, as is described for the NID hash above?

>+int nid_hash_key_compare(void *key, struct hlist_node *compared_hnode)
>+{
>+        if (strcmp(obd_export_nid2str(export), libcfs_nid2str(*nid_key)) == 0)
>+                 retval = 1;

How often do we expect to call obd_export_nid2str()?  If this is called a LOT
then it may make sense to just save the formatted NID into the export and have
obd_export_nid2str() just return that pointer always.

> int obd_export_evict_by_nid(struct obd_device *obd, char *nid)
>+        lnet_nid_t nid_key = libcfs_str2nid(nid);
>+
>+        for (; ;) {

I think it is preferred to use do { } while (1)

> int obd_export_evict_by_uuid(struct obd_device *obd, char *uuid)
> {
>         struct obd_export *doomed_exp = NULL;
>         struct obd_uuid doomed;
>         int exports_evicted = 0;
> 
>         obd_str2uuid(&doomed, uuid);
> 
>+        doomed_exp = lustre_hash_get_object_by_key(obd->obd_uuid_hash_body, 
>+                                                   &doomed);

Not directly a scalability (but rather a usability) issue, can you have this
code skip a leading "uuid:" if present, so that it can be used similar to the
"nid:" tag for evict-by-nid.

>@@ -426,6 +442,16 @@ int class_cleanup(struct obd_device *obd
>+        if (obd->obd_uuid_hash_body != NULL) {
>+                lustre_hash_exit(obd->obd_uuid_hash_body);
>+        }
>+
>+        if (obd->obd_nid_hash_body != NULL) {
>+                lustre_hash_exit(obd->obd_nid_hash_body);
>+        }

No need for braces here.

>@@ -143,14 +145,27 @@ int ptlrpc_put_connection(struct ptlrpc_
>+                if (!hlist_unhashed(&c->c_hash)) {
>+                        lustre_hash_delitem(conn_hash_body, &peer, 
>&c->c_hash);
>+                }

No need for braces.

I haven't done a full inspection, but this is generally good looking code here.

What I don't see is real unit tests for this.  You need to write a test that
verifies the scalability of this code.  One option is to use (and possibly
modify) Nathan's loadgen program so that it only does the connection part
without the IO generation.  Then, time the performance of "evict-by-nid" or
similar.

_______________________________________________
Lustre-devel mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-devel

Reply via email to