Shawn Walker wrote:
Michal Pryc wrote:
Shawn:

I asked whether we are calling load_catalogs, and we are, as shown below from api.py. This takes a huge amount of time if you have a lot of repositories with lots of packages. On your system it probably takes 2-3 seconds, but that is still not efficient. If possible, I would remove the load_catalogs call from that function, since other functions already do that when necessary.

def get_publisher_last_update_time(self, prefix=None, alias=None):
        """Returns a datetime object representing the last time the
        catalog for a publisher was modified or None."""
        if alias:
                prefix = self.get_publisher(alias=alias).prefix
        dt = None
        self.__activity_lock.acquire()
        try:
                self.__set_can_be_canceled(True)
                try:
####################BELOW LINE######################################
                        self.img.load_catalogs(self.progresstracker)
######################################################################
                        dt = self.img.get_publisher_last_update_time(
                            prefix)
                except api_errors.CanceledException:
                        self.__reset_unlock()
                        raise
                except Exception:
                        self.__reset_unlock()
                        raise
        finally:
                self.__activity_lock.release()
        return dt

From an API perspective, there is no guarantee that the catalogs have been loaded yet, and I need them loaded to get the last_modified() information, although I've found a workaround -- more on that in a moment.

In short, when using the API functions, you should never have to call load_catalogs() on your own. The API should do that itself if it needs access to the information. As such, I've changed load_catalogs() by:

* adding 'pubs' -- an optional list of publisher prefixes (names) to load catalogs for (defaults to None, which means all publishers)

* adding 'when_needed' (defaults to False for compatibility) -- a boolean that, when True, indicates that catalogs should only be loaded for publishers whose catalogs have not already been loaded

I've also changed the api to call load_catalogs internally, but with 'when_needed=True' and with only the particular publisher to refresh (where applicable). This should have the bonus of speeding up several operations for the GUI (including info).

Finally, I've changed image:get_publisher_last_update_time() to load just the catalog file and retrieve its last_modified() timestamp if the catalog hasn't already been loaded. This reduced the time on my system from 2.5 seconds to 0.000566005706787 seconds :-)
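
The idea behind that shortcut, roughly: read only the small piece of metadata that holds the timestamp instead of parsing every package entry. The sketch below assumes a hypothetical "attrs" file containing a "Last-Modified: <ISO timestamp>" line; the file name, line format, and read_last_modified() helper are assumptions for illustration, not the real catalog layout.

```python
import datetime
import os
import tempfile


def read_last_modified(attrs_path):
    """Return the Last-Modified value from a small attributes file as a
    datetime object, or None if the file or the entry is missing."""
    try:
        with open(attrs_path) as f:
            for line in f:
                # Split on the first colon only; the timestamp itself
                # contains colons.
                key, sep, value = line.partition(":")
                if sep and key.strip() == "Last-Modified":
                    return datetime.datetime.strptime(
                        value.strip(), "%Y-%m-%dT%H:%M:%S")
    except IOError:
        # No attributes file means no catalog has been retrieved yet.
        return None
    return None


# Build a throwaway attributes file to exercise the helper.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "attrs")
with open(path, "w") as f:
    f.write("Last-Modified: 2008-10-01T12:30:00\n")

stamp = read_last_modified(path)
```

Because only a few lines are read, the cost no longer depends on how many packages the publisher's catalog lists.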

I'll post a new webrev after make test finishes (next 15 minutes).

Shawn,
Great, thanks. I will make the changes needed to support the new load_catalogs() (doing so will let me remove a few calls to api.img from packagemanager).

I will probably not finish those changes today, but I will work on them a little tomorrow to get it all sorted.

best
Michal
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
