[Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist
The proposal to merge lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist has been updated.

Status: Needs review => Merged

For more details, see:
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
--
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
Your team Zeitgeist Framework Team is subscribed to branch lp:zeitgeist.

___
Mailing list: https://launchpad.net/~zeitgeist
Post to     : zeitgeist@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zeitgeist
More help   : https://help.launchpad.net/ListHelp
Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist
So, how about now?
Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist
I wouldn't really like to change the semantics of the Search() method; somehow, magically, it approximates the results quite OK (plus it'd be weird if the _SUBJECTS groupings worked perfectly and the others just OK-ish).
Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist
I'm starting to think that doing the secondary sorting in FTS isn't a good idea. We're sending the relevancies to the client, so we should keep the full Zeitgeist sorting; since the client has the relevancies, it can do this kind of sort itself (or not).
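To make the client-side option concrete, here is a minimal sketch of what such a sort could look like. It assumes the client has copied each returned event into a small struct together with its relevancy; the Hit struct and sort_by_relevancy function are hypothetical names for illustration, not part of any Zeitgeist API. A stable sort on relevancy alone keeps whatever order the daemon returned among hits with equal relevancy.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical client-side record for one search hit (not a Zeitgeist type).
struct Hit {
  unsigned event_id;
  std::int64_t timestamp;  // event timestamp as delivered by the daemon
  double relevancy;        // relevancy delivered alongside the event
};

// Reorders hits so that higher relevancy comes first. std::stable_sort
// preserves the incoming order (assumed to be the daemon's ResultType
// order) for hits whose relevancy compares equal.
inline void sort_by_relevancy(std::vector<Hit>& hits) {
  std::stable_sort(hits.begin(), hits.end(),
                   [](const Hit& a, const Hit& b) {
                     return a.relevancy > b.relevancy;
                   });
}
```

A client that prefers the plain Zeitgeist ordering would simply skip the call. Comparing relevancies exactly, rather than with a tolerance, is a choice made for this sketch so that the comparator stays a strict weak ordering.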
Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist
As discussed on IRC, I don't really like how this is ending up. We should look into re-architecting FTS at some point, starting from the assumption that it's only for searching current documents (so it may change from storing all events to storing one entry per subject plus event information, or whatever). But since this stuff is supposed to be working in Precise, I guess it's fine to go with the workaround for now.

The main problem I see with the MP right now is that it only looks at the URI when re-requesting the events, so if the request was for something particular (especially event interpretation, event manifestation or actor, since the subject data could be seen as somewhat more constant), it's likely to end up returning the wrong sort of event.

A possible way of fixing this would be merging the URI templates with the request templates (the ones used in CompileEventFilterQuery). The trivial implementation would go something like:

    tmpls = []
    for template in templates:
        for uri in uris:
            tmpl = copy(template)
            tmpl.subject_uri = ...
            tmpls.append(tmpl)

However, it may end up generating really big SQL queries (e.g. consider just two templates for subject_interpretation={Music,Video} and a limit of 100 events; that becomes 200 templates with subject_interpretation and subject_manifestation, which is 400 conditions in the generated SQL).
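The growth described above can be checked with a small standalone sketch. The Template struct below is a hypothetical, heavily simplified stand-in for an event template (two string fields only), and assigning the URI directly to subject_uri is an assumption, since the pseudocode leaves that step elided. Counting non-empty fields is used as a rough proxy for the number of conditions in the generated SQL.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an event template (not the Zeitgeist type).
struct Template {
  std::string subject_interpretation;
  std::string subject_uri;
};

// Crosses every request template with every URI returned by the index,
// yielding templates.size() * uris.size() merged templates.
inline std::vector<Template> merge_templates(
    const std::vector<Template>& templates,
    const std::vector<std::string>& uris) {
  std::vector<Template> merged;
  merged.reserve(templates.size() * uris.size());
  for (const Template& tmpl : templates) {
    for (const std::string& uri : uris) {
      Template copy = tmpl;
      copy.subject_uri = uri;  // assumption: the URI goes straight into the copy
      merged.push_back(copy);
    }
  }
  return merged;
}

// Counts the non-empty fields across all templates, as a rough proxy for
// the number of conditions a query compiler would emit.
inline std::size_t count_conditions(const std::vector<Template>& templates) {
  std::size_t conditions = 0;
  for (const Template& tmpl : templates) {
    if (!tmpl.subject_interpretation.empty()) ++conditions;
    if (!tmpl.subject_uri.empty()) ++conditions;
  }
  return conditions;
}
```

With two request templates (Music and Video) and 100 URIs, merge_templates returns 200 templates and count_conditions reports 400, which matches the estimate in the message.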
[Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist
Michal Hruby has proposed merging lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)

For more details, see:
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479

Implements secondary sorting based on ResultType to SearchWithRelevancies method.

=== modified file 'extensions/fts++/indexer.cpp'
--- extensions/fts++/indexer.cpp 2012-03-07 16:08:26 +
+++ extensions/fts++/indexer.cpp 2012-03-07 22:37:19 +
@@ -23,6 +23,7 @@
 #include <xapian.h>
 #include <queue>
 #include <vector>
+#include <cmath>
 
 #include <gio/gio.h>
 #include <gio/gdesktopappinfo.h>
@@ -804,7 +805,6 @@
 
     if (event_templates->len > 0)
     {
-      ZeitgeistTimeRange *time_range = zeitgeist_time_range_new_anytime ();
       results = zeitgeist_db_reader_find_events (zg_reader,
                                                  time_range,
                                                  event_templates,
@@ -813,8 +813,6 @@
                                                  result_type,
                                                  NULL,
                                                  error);
-
-      g_object_unref (time_range);
     }
     else
     {
@@ -841,6 +839,34 @@
   return results;
 }
 
+static gint
+sort_events_by_relevance (gconstpointer a, gconstpointer b, gpointer user_data)
+{
+  gdouble rel1 = 0.0;
+  gdouble rel2 = 0.0;
+  std::map<unsigned, gdouble>::const_iterator it;
+  ZeitgeistEvent **e1 = (ZeitgeistEvent**) a;
+  ZeitgeistEvent **e2 = (ZeitgeistEvent**) b;
+  std::map<unsigned, gdouble> const relevancy_map =
+    *(static_cast<std::map<unsigned, gdouble>*> (user_data));
+
+  it = relevancy_map.find (zeitgeist_event_get_id (*e1));
+  if (it != relevancy_map.end ()) rel1 = it->second;
+
+  it = relevancy_map.find (zeitgeist_event_get_id (*e2));
+  if (it != relevancy_map.end ()) rel2 = it->second;
+
+  gdouble delta = rel1 - rel2;
+  if (fabs (delta) < 0.1)
+  {
+    // relevancy of both items is the same, let's make use of stable sort
+    return e1 > e2 ? 1 : -1;
+  }
+
+  // we want the higher ranked events first
+  return (delta < 0) ? 1 : -1;
+}
+
 GPtrArray* Indexer::SearchWithRelevancies (const gchar *search,
                                            ZeitgeistTimeRange *time_range,
                                            GPtrArray *templates,
@@ -860,24 +886,51 @@
 
     guint maxhits = count;
 
-    if (result_type == RELEVANCY_RESULT_TYPE)
-    {
-      enquire->set_sort_by_relevance ();
-    }
-    else
-    {
-      enquire->set_sort_by_value (VALUE_TIMESTAMP, true);
-    }
-
     if (storage_state != ZEITGEIST_STORAGE_STATE_ANY)
     {
       g_set_error_literal (error,
                            ZEITGEIST_ENGINE_ERROR,
                            ZEITGEIST_ENGINE_ERROR_INVALID_ARGUMENT,
-                           "Only ANY stogate state is supported");
+                           "Only ANY storage state is supported");
       return NULL;
     }
 
+    if (result_type == RELEVANCY_RESULT_TYPE)
+    {
+      enquire->set_sort_by_relevance ();
+    }
+    else if (result_type == ZEITGEIST_RESULT_TYPE_MOST_RECENT_EVENTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_RECENT_EVENTS)
+    {
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_EVENT_ID);
+    }
+    else if (result_type == ZEITGEIST_RESULT_TYPE_MOST_RECENT_SUBJECTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_RECENT_SUBJECTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_MOST_POPULAR_SUBJECTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_POPULAR_SUBJECTS)
+    {
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_URI_HASH);
+    }
+    else if (result_type == ZEITGEIST_RESULT_TYPE_MOST_RECENT_ORIGIN ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_RECENT_ORIGIN ||
+             result_type == ZEITGEIST_RESULT_TYPE_MOST_POPULAR_ORIGIN ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_POPULAR_ORIGIN)
+    {
+      // FIXME: not really correct but close :)
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_URI_HASH);
+      maxhits *= 3;
+    }
+    else
+    {
+      // throw an error for these?
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_EVENT_ID);
+      maxhits *= 3;
+    }
+
     Xapian::Query q(query_parser->parse_query (query_string, QUERY_PARSER_FLAGS));
     enquire->set_query (q);
     Xapian::MSet hits (enquire->get_mset (offset, maxhits));
@@ -906,6 +959,8 @@
 NULL