Hello all,

I have been reading through many emails on this list and have learnt a lot about how BaseX works and how others use it. A month or so ago I sent an email to this list myself concerning caching. I still have some questions about that, but I will leave those for another time. Today I am concerned with retrieving results from BaseX in chunks. (The question is also on StackOverflow, with a nice bounty! :) http://stackoverflow.com/questions/36675388/efficient-and-user-friendly-way-to-present-slow-loading-results)

Case at hand: we use BaseX to query a 50-million-token corpus, and we also make it available to other users through a website. The thing is that this is slow. For our own projects that is no problem: we dive straight into the back end, run a search command from the terminal and let the query take all the time it needs. For users, however, a quick response is paramount, and at the moment it is taking too long. I don't blame BaseX. We love BaseX and are astounded by its efficiency and optimisations! However, we want to deliver the best possible experience to our users.

Currently we open a new session from PHP, wait to receive the results, do some post-processing and then load the result page. As said, this takes too much time.

We've been looking into some solutions. The best one that I think should be possible is returning the results in chunks. You know those websites that show you only, say, 20 results per page? I think something similar is appropriate here. When a user has searched for a pattern, we show only the first 20 or so results, so they can get an idea of what they would find. Then, when they click a button, we query for the next twenty results, which are appended to the list (a JavaScript solution, I guess), and so on until all results have been found. Additionally, I will provide a button with which users can download all results as a text file. That is allowed to take longer.
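
To make the paging idea concrete, here is a minimal sketch of the arithmetic I have in mind. It is written in Python purely for brevity (our real front end is PHP), the helper names page_window and paged_query are made up for this email, and the example pattern is hypothetical. It only wraps a query in the standard XQuery function subsequence(), whose positions are 1-based; it says nothing about whether BaseX can evaluate such a window without searching the whole corpus first, which is exactly what I am unsure about.

```python
from typing import Tuple


def page_window(page: int, page_size: int = 20) -> Tuple[int, int]:
    """Return (start, length) for a 0-based page number.

    XQuery sequence positions are 1-based, so page 0 starts at position 1.
    """
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    return page * page_size + 1, page_size


def paged_query(xpath: str, page: int, page_size: int = 20) -> str:
    """Wrap an XPath/XQuery expression so only one page of hits is returned."""
    start, length = page_window(page, page_size)
    # subsequence($seq, $start, $length) is a standard XQuery function.
    return f"subsequence(({xpath}), {start}, {length})"


if __name__ == "__main__":
    # Hypothetical search pattern; page 6 is the page after the first 120 hits.
    print(paged_query('//node[@cat="np"]', 6))
```

Each click on the "more results" button would increase the page number by one and send the resulting query string to the server.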

The main thing is that users should get early feedback and results for their query. Now the question is whether something like this is possible in an efficient manner in BaseX. Can you write a query that finds only the first 20 results, then the following 20, and so on, and is this faster than searching for everything at once? In other words, when I am asking for results 121-140 (after having pushed the button a couple of times), is BaseX smart enough to skip the search space it has already covered to find the 120 previous hits? If so, that would be great. Could you help me on my way with some suitable PHP/XQuery code?

I also highly encourage you to participate on StackOverflow. As I said, I am offering a 200-point bounty, for those who are interested in Internet fame. :)

Thank you for your time.

Kind regards,
Bram Vanroy
http://bramvanroy.be