On Sunday, March 09, 2014 10:23:18 PM Denis Steckelmacher wrote:
> Hi,
>
> I'm happy and excited to announce that I have ported my Nepomuk query
> parser to Baloo. The port itself and a small preliminary commit live in
> kde:baloo, branch "queryparser".
>

+1

I thought we were going to use Term::setUserData instead of
setPosition/Length. Or was it the other way around?

> All the parsing passes work, so the new parser is able to understand
> numbers, file sizes, float values, property names (and their aliases),
> type hints ("emails from John", "archives tagged as secret", etc) and
> date-times ("yesterday", "last Sunday", "May 18, 2011").
>
> Working on Baloo has been a great pleasure. The parser heavily relies on
> Term, and Baloo::Term is way nicer than Nepomuk2::Term (and its
> subclasses that all have a slightly different API for the same thing).
> The porting operation mainly consisted of removing all the boilerplate
> needed by Nepomuk. I congratulate all the Baloo developers for their
> hard work!
>

Thanks. I'd love for you to get more involved in Baloo, as you have tons of
experience from Nepomuk. And you're a pretty good coder!

Have a look at http://community.kde.org/Baloo/Tasks

> The first commit in my branch adds the methods Term::setPosition,
> Term::setLength, Term::position and Term::length. I finally chose to use
> these methods instead of a more general setUserData and getUserData,
> because I think that exposing a clear API is better than saying "to get
> the position of a term, pass _k_term_position to getUserData".
>

Hmm. Alright. Ignore my previous comment then.

Though who would be the potential consumers of this API? Just the query
parser?

> During my work, I encountered three problems that prevent some features
> of the parser to work as they did in the Nepomuk days:
>
> * How do I do a regular expression matching? I currently build a Term
>   whose operation is "Contains" and value is a QRegExp object, but the
>   JSon serializer does not support regular expressions and is unable to
>   serialize queries using them (it returns an empty byte array)

Well, this is primarily because the backends do not support regular
expressions that easily.
We can do a regexp match in the file search store, but it will involve
matching every single file URL.

> * Date-times are handled by Query, using setDateFilter. The parser
>   properly lowers date-time comparisons to date filters (see
>   QueryParser::tuneTerm for all the lowering that takes place), but there
>   is nothing I can do with times or more advanced comparisons (before a
>   date-time, between two date-times that are not aligned on a
>   year/month/day boundary, etc).

The API does support it right now, correct? I'm thinking of something along
the lines of Term("modified", date, LessThan). The only problem is the
backend not supporting it.

Do you want to add support for that? It should be fairly trivial. Have a
look at how the email search store does it.

> * Subqueries are gone, as there is no more "related to" property. I have
>   ported the code, but it is currently #ifdef'ed out.
>

Yes. Let's put a hold on that for now. We currently do not have any
"related" data, hence the lack of a way of representing that.

> My branch also contains a test suite that is not very extensive but
> tests all the parsing passes (and allowed me to find plenty of bugs, as
> the semantics of Term have slightly changed between Nepomuk and Baloo).
>

What about the widget side of this?

Also, this brings us to the question of how and when we want to ship this.
I'm not very comfortable with shipping it with this release because it would
mean more APIs which we have to stick with.

One option is that we could ship it as a separate library, and not as part
of baloocore. This way we can potentially not maintain source/library
compatibility on that library. Otherwise this can be shipped separately on
top of Baloo.

What are your thoughts on this? Where would you like to see this go?

--
Vishesh Handa