Hi,

Which version of Jackrabbit do you use?

> select * from [nt:resource]
> With a limit of 20.

Is this the query that is slow? How slow is it? What is the source code you have used?

> Since there is no "count(*)" functionality in jackrabbit,
> I have to run the query a second time without the limit to count the
> total size.

What are possible results for this? 349'244 for example? Counting is expensive, as all nodes need to be loaded and access rights need to be checked. I suggest displaying only "page 1 of many" in the UI instead of "page 1 of 30593". So, don't count if it's more than 100 or so.

Regards,
Thomas

On 29/01/15 10:25, "cfalletta" <[email protected]> wrote:

>Hi,
>
>I'm using jackrabbit in a production environment where there are, for the
>moment, 350.000 documents, mostly pdf's.
>I don't understand why it takes a few minutes to execute a search query for
>so little documents... As of the following month, I'll need to handle a flow
>of 1Million documents a month, so I'd better improve the performance
>already...
>
>I'm using SQL2 to query the repository, and my request is really simple :
>select * from [nt:resource]. With a limit of 20.
>Since there is no "count(*)" functionality in jackrabbit, I have to run the
>query a second time without the limit to count the total size. Weirdly, it
>seems to take less time than the first query (maybe because of the caching
>mechanism).
>
>I can't optimize the query obviously, so either there is something wrong
>with my configuration or the way I store the documents and index them. I
>can't imagine Jackrabbit is not capable of handling that little documents.
>
>Of course, I'm looking on it for quite some time, and here are a few
>information that may help solve the problem :
>- I'm storing up to 200 resource nodes (files) on a folder. Thus, my path
>looks like /ATTACHMENT/2014/01/01/{someFolderUUID}/{theFile}. The reason of
>this was that the database size was growing exponentially when storing
>thousands of files in the same folder...
>- I'm using MIX_REFERENCEABLE and MIX_VERSIONABLE when I store my documents,
>and the backend opens/closes a session after each operation. However, for
>the moment not so many people use it.
>- Not indexing the content (disabling tika parsers) doesn't seem to change
>the performance
>- I'm using a postgre database, and a LocalFileStore
>- It's not the process that takes time but jackrabbit (I saw the query
>execution time on jackrabbit logging)
>
>Do you have an idea of why it is so slow or any lead on this ?
>
>Thanks !
>Cédric
>
>--
>View this message in context:
>http://jackrabbit.510166.n4.nabble.com/really-poor-search-performance-tp4661920.html
>Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
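
A minimal sketch of the "don't count if it's more than 100 or so" suggestion from the reply, in plain Java. The class and method names (CappedCount, countUpTo, pageLabel) are hypothetical and not part of Jackrabbit. The helper accepts any java.util.Iterator; the JCR result iterators (NodeIterator, RowIterator) extend that interface, so the same loop could be applied to a query result, but that wiring is an assumption and is not shown here. The idea is to stop iterating after cap + 1 items, so the cost of counting is bounded no matter how large the result is.

```java
import java.util.Arrays;
import java.util.Iterator;

public class CappedCount {

    /**
     * Counts items from the iterator but stops after cap + 1 of them.
     * A return value greater than cap therefore means "more than cap";
     * the real total is deliberately left unknown.
     */
    public static long countUpTo(Iterator<?> results, long cap) {
        long seen = 0;
        while (seen <= cap && results.hasNext()) {
            results.next();
            seen++;
        }
        return seen;
    }

    /** Builds the pager text: an exact page count only when it was cheap to get. */
    public static String pageLabel(int page, long counted, long cap, int pageSize) {
        if (counted > cap) {
            return "page " + page + " of many";
        }
        // Round up, and show at least one page even for an empty result.
        long pages = Math.max(1, (counted + pageSize - 1) / pageSize);
        return "page " + page + " of " + pages;
    }

    public static void main(String[] args) {
        // Stand-in for a query result; a real caller would pass the
        // iterator obtained from the query result instead (assumption).
        Iterator<Integer> results = Arrays.asList(1, 2, 3, 4, 5).iterator();
        long cap = 3;
        long counted = countUpTo(results, cap);
        System.out.println(pageLabel(1, counted, cap, 2));
    }
}
```

With a page size of 20 and a cap of 100, the UI would show exact page counts up to "page 1 of 5" and fall back to "page 1 of many" beyond that.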
