Re: really poor search performance

Thomas Mueller Thu, 29 Jan 2015 23:01:05 -0800

Hi,

Which version of Jackrabbit do you use?


> select * from [nt:resource]
> With a limit of 20.

Is this the query that is slow? How slow is it?
What is the source code you have used?

> Since there is no "count(*)" functionality in jackrabbit,
> I have to run the query a second time without the limit to count the
> total size.

What are possible results for this? 349'244 for example? Counting is
expensive, as all nodes need to be loaded and access rights need to be
checked. I suggest displaying only "page 1 of many" in the UI instead of
"page 1 of 30593". So, don't count if it's more than 100 or so.
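
A minimal sketch of that capped count, assuming the results are consumed
through a plain java.util.Iterator (the JCR NodeIterator and RowIterator
both extend it). The class name, method names and the cap of 100 are
illustrative only and are not part of Jackrabbit. The loop reads at most
cap + 1 items, so the cost stays bounded no matter how many nodes match:

```java
import java.util.Iterator;

// Hypothetical helper, not a Jackrabbit API: counts results only up to a cap.
final class BoundedCount {

    /**
     * Consumes at most cap + 1 items from the iterator.
     * A return value of cap + 1 means "more than cap"; the rest is never read.
     */
    static int countUpTo(Iterator<?> it, int cap) {
        int seen = 0;
        while (seen <= cap && it.hasNext()) {
            it.next();
            seen++;
        }
        return seen;
    }

    /** Builds the pager label, falling back to "of many" above the cap. */
    static String pageLabel(int page, int pageSize, int seen, int cap) {
        if (seen > cap) {
            return "page " + page + " of many";
        }
        // Round up, and show at least one page even for an empty result.
        int pages = Math.max(1, (seen + pageSize - 1) / pageSize);
        return "page " + page + " of " + pages;
    }
}
```

With a JCR query, calling Query.setLimit(cap + 1) on the counting query
before executing it should keep the repository from loading more rows
than this loop will read.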

Regards,
Thomas







On 29/01/15 10:25, "cfalletta" <[email protected]> wrote:

>Hi,
>
>I'm using jackrabbit in a production environment where there are, for the
>moment, 350.000 documents, mostly PDFs.
>I don't understand why it takes a few minutes to execute a search query for
>so few documents... As of next month, I'll need to handle a flow
>of 1 million documents a month, so I'd better improve the performance
>already...
>
>I'm using SQL2 to query the repository, and my request is really simple:
>select * from [nt:resource]. With a limit of 20.
>Since there is no "count(*)" functionality in jackrabbit, I have to run the
>query a second time without the limit to count the total size. Weirdly, it
>seems to take less time than the first query (maybe because of the caching
>mechanism).
>
>I can't optimize the query obviously, so either there is something wrong
>with my configuration or the way I store the documents and index them. I
>can't imagine Jackrabbit is not capable of handling so few documents.
>
>Of course, I've been looking into it for quite some time, and here is some
>information that may help solve the problem:
>- I'm storing up to 200 resource nodes (files) in a folder. Thus, my path
>looks like /ATTACHMENT/2014/01/01/{someFolderUUID}/{theFile}. The reason
>for this was that the database size was growing exponentially when storing
>thousands of files in the same folder...
>- I'm using MIX_REFERENCEABLE and MIX_VERSIONABLE when I store my documents,
>and the backend opens/closes a session after each operation. However, for
>the moment not so many people use it.
>- Not indexing the content (disabling tika parsers) doesn't seem to change
>the performance
>- I'm using a PostgreSQL database, and a LocalFileStore
>- It's not the process that takes time but jackrabbit (I saw the query
>execution time in the jackrabbit log)
>
>Do you have any idea why it is so slow, or any lead on this?
>
>Thanks!
>Cédric
>
>
>
>
>--
>View this message in context:
>http://jackrabbit.510166.n4.nabble.com/really-poor-search-performance-tp46
>61920.html
>Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
